FS#2574 - busybox ntpd: NTP + DNSSEC chicken-and-egg problem at boot #7703

Closed
openwrt-bot opened this issue Oct 31, 2019 · 3 comments

@openwrt-bot

Aditza2015:

environment:

system type : Qualcomm Atheros QCA9558 ver 1 rev 0
machine : TP-LINK TL-WR1043ND v2
cpu model : MIPS 74Kc V5.0

cat /etc/openwrt_release

DISTRIB_ID='SuperWRT'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r11362-4bf9bec361'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='SuperWRT SNAPSHOT r11362-4bf9bec361'
DISTRIB_TAINTS='no-all'

uname -a

Linux MyRouter-v2 4.14.150 #0 Wed Oct 30 10:16:25 2019 mips GNU/Linux

opkg list-installed | grep -Ei "busy|dns"

busybox - 1.31.0-1
ddns-scripts - 2.7.8-12
dnsmasq-full - 2.80-14
luci-app-ddns - 2.4.9-7
rpcd-mod-rrdns - 20170710

Problem description:
Most routers do not have a built-in hardware clock and try to obtain the time through NTP at boot time.

If DNSSEC is also configured and NSEC/NSEC3 validation is enforced, this becomes impossible: the router cannot obtain a usable time reference because it cannot do proper secure DNS resolution for the time servers' hostnames, and DNSSEC validation in turn needs a roughly correct clock.

The relevant bits of /etc/config/dhcp that turn busybox ntpd into a dead duck at boot, which in turn makes all DNSSEC resolution fail because of the time difference, are:

config dnsmasq
	option dnssec '1'
	option dnsseccheckunsigned '1'

Steps to reproduce:

  • Enable DNSSEC and DNSSEC validation of unsigned zones.
  • Save the configuration.
  • Make sure the running firmware image was built more than 24 hours ago and the most recent file under /etc/ is more than 24 hours old
    (preferably more than 30 days old, to account for long-lived RRSIG DNS signatures).
  • Reboot the router.
  • Check the router's date and time after reboot. It should now be reset to some time in the past.
  • After reboot, discover that DNS resolution is broken for the entire router because DNSSEC cannot validate the time-sensitive signatures. If you have request logging enabled, you'll see a lot of "BOGUS" error messages in syslog. This is because dnsmasq now considers every DNS reply it receives fake because of the signature time mismatch.
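
To confirm the last step, the BOGUS results can be counted instead of eyeballed. This is a minimal sketch: count_bogus is a hypothetical helper name, and it only assumes that the affected log lines contain the literal string "BOGUS", as described above.

```shell
# Count log lines that mention a BOGUS validation result.
# Reads log text on stdin and prints the number of matching lines.
# (count_bogus is an illustrative name, not an existing OpenWrt tool.)
count_bogus() {
    grep -c 'BOGUS'
}

# Example use on the router (not executed here):
#   logread | count_bogus
```

A count that keeps growing after boot while the system date is still in the past points at this problem rather than at an upstream resolver failure.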

Tentative solution:
Would it be possible to adjust the startup script of Busybox NTPD so that it first tries to obtain a rough time sync reference from somewhere, without relying on the time server hostnames configured in /etc/config/system?
/etc/init.d/sysfixtime is useless when the router doesn't have a built-in hardware clock and the most recent file timestamp from /etc/ could be from weeks ago.

Maybe query, a couple of times at boot, one time server that has a static IP address, to obtain a usable time reference so that DNSSEC validation can be bootstrapped later on?

These could probably be used here:
Google time servers https://time.google.com
Cloudflare time servers https://time.cloudflare.com

These servers are members of the NTP pool project and they have fixed IP addresses published in DNS for worldwide use:
Google:
216.239.35.0
216.239.35.4
216.239.35.8
216.239.35.12
2001:4860:4806:0:0:0:0:0
2001:4860:4806:4:0:0:0:0
2001:4860:4806:8:0:0:0:0
2001:4860:4806:c:0:0:0:0

Cloudflare:
162.159.200.1
162.159.200.123
2606:4700:f1:0:0:0:0:1
2606:4700:f1:0:0:0:0:123
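
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing OpenWrt script: bootstrap_time and ntp_oneshot are hypothetical names, the server list is a subset of the addresses above (which may change over time), and the one-shot query assumes busybox ntpd's -n (foreground), -q (quit after the clock is set) and -p (peer) options.

```shell
# Literal-IP servers to try, taken from the list in this report.
BOOTSTRAP_SERVERS="216.239.35.0 216.239.35.4 162.159.200.1 162.159.200.123"

# Hypothetical wrapper around a one-shot busybox ntpd query.
# No DNS lookup is needed because the peer is a literal IP address.
ntp_oneshot() {
    ntpd -n -q -p "$1"
}

# Try each server in turn. Print the first one that set the clock and
# return 0, or return 1 if none of them answered.
bootstrap_time() {
    for ip in $BOOTSTRAP_SERVERS; do
        if ntp_oneshot "$ip"; then
            echo "$ip"
            return 0
        fi
    done
    return 1
}
```

A real boot script would also need a time limit per attempt, since a one-shot query against an unreachable server may not return on its own.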

Note: this bug is related to openwrt/packages#10409.
That bug was opened for the standalone ntpd package, while this one is for the base-system busybox ntpd.

@openwrt-bot

ldir:

There's an existing mechanism that uses busybox ntpd's '-S' parameter to run a hotplug script on each NTP time update. The script signals dnsmasq with SIGINT to tell it to start checking timestamps; until then, zone timestamps are not checked.
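
For readers unfamiliar with that mechanism, its general shape is sketched below. This is not the script OpenWrt ships. handle_ntp_event and notify_dnsmasq are hypothetical names, and the sketch assumes that the '-S' program receives the event name (such as "step", "stratum" or "periodic") as its first argument and that dnsmasq was started with --dnssec-no-timecheck.

```shell
# Hypothetical helper: tell every running dnsmasq to start checking
# signature timestamps by sending it SIGINT.
notify_dnsmasq() {
    for pid in $(pidof dnsmasq); do
        kill -INT "$pid"
    done
}

# Handler in the style of an "ntpd -S" script: react only to events that
# indicate the clock was actually set, and ignore the rest.
handle_ntp_event() {
    case "$1" in
        step|stratum) notify_dnsmasq ;;
        *) return 0 ;;
    esac
}
```

If the signal is sent before dnsmasq is running, or is never sent, dnsmasq either never starts checking timestamps or checks them against a wrong clock, which is the kind of corner case discussed in the comments that follow.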

This used to just work.

@openwrt-bot

hmh:

It is supposed to work like that, yes. But I have seen it fail several times with OpenWrt 19.07.

So, you can count this as a confirmation the bug does exist.

Unfortunately, each time it happened I was not in a position to fully debug the system state and had to "fix it now!" (which is easily done by disabling dnsmasq DNSSEC).

I did try quite hard to reproduce it, but so far no luck, even with sysfixtime disabled (which means the system clock stays somewhere near 1970-01-01 until ntp kicks in).

The basic functionality is working. But some corner case is basically utterly broken.

Right now, you are advised to NOT use dnsmasq+dnssec unless you have raw IP addresses on your list of ntp servers, or a proper RTC.
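
The raw-IP workaround could be applied along these lines. This is a sketch under assumptions: set_ntp_servers is a hypothetical helper, and it assumes the stock timeserver section named 'ntp' in /etc/config/system with its 'server' list.

```shell
# Replace the configured NTP server hostnames with literal IP addresses
# so that time sync at boot does not depend on DNS resolution.
# Assumes the default 'ntp' section in /etc/config/system.
set_ntp_servers() {
    # Drop the existing list; ignore the error if there is none.
    uci -q delete system.ntp.server || true
    for ip in "$@"; do
        uci add_list system.ntp.server="$ip" || return 1
    done
    uci commit system
}

# Example use on the router (not executed here), with addresses from
# the original report:
#   set_ntp_servers 216.239.35.0 162.159.200.1
```

The NTP service presumably has to be restarted afterwards for the new list to take effect.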

@openwrt-bot

hmh:

Ehh, yes, it IS broken. Hideously. Since basically forever.

The attached patch to /etc/init.d/dnsmasq fixes it.
