OpenWrt/LEDE Project

  • Status Closed
  • Percent Complete
    100%
  • Task Type Bug Report
  • Category Base system
  • Assigned To
    Hans Dedecker
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Adi - 31.10.2019
Last edited by Hans Dedecker - 25.03.2020

FS#2574 - busybox ntpd: NTP + DNSSEC chicken-and-egg problem at boot

environment:

system type             : Qualcomm Atheros QCA9558 ver 1 rev 0
machine                 : TP-LINK TL-WR1043ND v2
cpu model               : MIPS 74Kc V5.0

# cat /etc/openwrt_release
DISTRIB_ID='SuperWRT'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r11362-4bf9bec361'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='SuperWRT SNAPSHOT r11362-4bf9bec361'
DISTRIB_TAINTS='no-all'

# uname -a
Linux MyRouter-v2 4.14.150 #0 Wed Oct 30 10:16:25 2019 mips GNU/Linux

# opkg list-installed | grep -i "busy\|dns"
busybox - 1.31.0-1
ddns-scripts - 2.7.8-12
dnsmasq-full - 2.80-14
luci-app-ddns - 2.4.9-7
rpcd-mod-rrdns - 20170710

Problem description: Most routers do not have a built-in hardware clock and try to obtain the time through NTP at boot time.

If DNSSEC is also configured and NSEC/NSEC3 validation is enforced, this becomes impossible: the router cannot obtain a usable time reference because it cannot securely resolve the time servers’ hostnames.

The relevant bits from /etc/config/dhcp are shown below. They turn busybox ntpd into a dead duck at boot, which in turn causes all DNSSEC resolution to fail because of the time difference:

config dnsmasq
	option dnssec '1'
	option dnsseccheckunsigned '1'

Steps to reproduce:
- enable DNSSEC and DNSSEC validation of unsigned zones.
- save the configuration
- make sure the running firmware image was built more than 24 hours ago and the most recent file under /etc/ is from more than 24 hours ago.
(preferably more than 30 days ago, to account for long-lived RRSIG DNS signatures)
- reboot the router
- check the router’s date and time after reboot. It should now be reset to sometime in the past.
- after reboot, discover that DNS resolution is broken for the entire router because DNSSEC cannot validate the time-sensitive signatures. If you have request logging enabled, you’ll see a lot of “BOGUS” error messages in syslog, because dnsmasq now treats every DNS reply it receives as fake due to the signature time mismatch.
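
The time dependence in the last step comes from the validity window that every RRSIG carries (an inception and an expiration timestamp). The following is a minimal sketch of that check, not dnsmasq's actual code; the function name rrsig_time_ok and the epoch values are made up for illustration, and real validators use 32-bit serial number arithmetic rather than a plain comparison.

```shell
#!/bin/sh
# rrsig_time_ok NOW INCEPTION EXPIRATION   (hypothetical helper, epoch seconds)
# Succeeds only when NOW falls inside the signature validity window.
rrsig_time_ok() {
	now=$1 inception=$2 expiration=$3
	[ "$now" -ge "$inception" ] && [ "$now" -le "$expiration" ]
}

# Illustrative use, with a window of 30 days and a clock 30 days behind:
#   rrsig_time_ok 1572480000 1572000000 1574592000   # correct clock: accepted
#   rrsig_time_ok 1569888000 1572000000 1574592000   # stale clock: rejected
```

A clock that sits before the inception time fails the check even though the signature itself is fine, which matches the symptom described above.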

Tentative solution: Would it be possible to adjust the startup script of Busybox NTPD so that it first tries to obtain a rough time sync reference from somewhere, without relying on the time server hostnames configured in /etc/config/system?
/etc/init.d/sysfixtime is useless when the router doesn’t have a built-in hardware clock and the most recent file timestamp from /etc/ could be from weeks ago.

Maybe query a time server that has a static IP address a couple of times at boot, to obtain a usable time reference so that DNSSEC validation can be bootstrapped later on?

The following could probably be used here:
Google time servers: https://time.google.com
Cloudflare time servers: https://time.cloudflare.com

These servers are members of the NTP pool project and they have fixed IP addresses published in DNS for worldwide use:
Google:
216.239.35.0
216.239.35.4
216.239.35.8
216.239.35.12
2001:4860:4806:0:0:0:0:0
2001:4860:4806:4:0:0:0:0
2001:4860:4806:8:0:0:0:0
2001:4860:4806:c:0:0:0:0

Cloudflare:
162.159.200.1
162.159.200.123
2606:4700:f1:0:0:0:0:1
2606:4700:f1:0:0:0:0:123
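
A rough sketch of the bootstrap idea, assuming BusyBox ntpd is available and using two of the addresses listed above. The names BOOTSTRAP_SERVERS, NTPD_CMD and bootstrap_time are hypothetical and exist only for this illustration; this is not the fix that was eventually committed.

```shell
#!/bin/sh
# Hypothetical bootstrap list, taken from the addresses in this report.
BOOTSTRAP_SERVERS="216.239.35.0 162.159.200.1"

# NTPD_CMD is an assumption so the loop can be dry-run with a stub;
# on the router it would simply be BusyBox ntpd.
: "${NTPD_CMD:=ntpd}"

# Try each raw IP in turn until one of them sets the clock.
bootstrap_time() {
	for ip in $BOOTSTRAP_SERVERS; do
		# -n: stay in foreground, -q: quit once the clock is set, -p: peer
		if "$NTPD_CMD" -n -q -p "$ip" >/dev/null 2>&1; then
			echo "clock set from $ip"
			return 0
		fi
	done
	echo "no bootstrap server answered" >&2
	return 1
}
```

Because no hostname is involved, this step does not depend on DNS at all. Note that ntpd -q may wait a long time for an unreachable peer, so a real script would probably need some form of timeout around each attempt.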

Note: this bug is related to https://github.com/openwrt/packages/issues/10409. That bug was opened for the standalone NTPD package, while this one is for the base-system busybox ntpd package.

Closed by  Hans Dedecker
25.03.2020 20:46
Reason for closing:  Fixed
Additional comments about closing:  

Fixed in commit https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=556b8581a15c855b2de0efbea6b625ab16cc9daf

Project Manager
Kevin 'ldir' Darbyshire-Bryant commented on 07.11.2019 00:37

There's an existing mechanism that takes advantage of busybox' ntpd '-S' parameter to run a hotplug script on ntp time update. This signals dnsmasq with SIGINT to let dnsmasq know to check timestamps... until then zone timestamps are not checked.
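
For readers unfamiliar with that mechanism, here is a simplified sketch of what such a handler could look like. It is not the script shipped with OpenWrt. It assumes dnsmasq was started with --dnssec-no-timecheck, relies on BusyBox ntpd passing the event name (step, stratum, periodic or unsync) as the first argument to the -S program, and uses the hypothetical names SIGNAL_CMD and handle_ntp_event.

```shell
#!/bin/sh
# SIGNAL_CMD is a hypothetical hook so the handler can be tried off-router;
# by default it sends SIGINT to every running dnsmasq process.
: "${SIGNAL_CMD:=killall -INT dnsmasq}"

# handle_ntp_event EVENT   (EVENT is $1 as passed by "ntpd -S PROG")
handle_ntp_event() {
	case "$1" in
		step|stratum)
			# The clock has been set or synced: ask dnsmasq to start
			# checking signature timestamps from now on.
			$SIGNAL_CMD && echo "dnsmasq signalled after $1"
			;;
		*)
			# periodic and unsync events need no action in this sketch
			echo "ignored $1"
			;;
	esac
}
```

Which events should trigger the signal is a design choice made here for illustration only; a script used as the -S program would call handle_ntp_event "$1".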

This used to just work.

Henrique de Moraes Holschuh commented on 24.02.2020 18:30

It *is* supposed to work like that, yes. But I have seen it fail several times with openwrt 19.07.

So, you can count this as a confirmation the bug does exist.

Unfortunately, each of the several times it happened, I was not in a position to fully debug system state and had to "fix it now!" (which is easily done by disabling dnsmasq dnssec).

I did try quite hard to reproduce it, but thus far no such luck, even if I disable sysfixtime (which means the system clock will be somewhere near 1970-01-01 until ntp kicks in).

The basic functionality *is* working. But some corner case is basically utterly broken.

Right now, you are advised to NOT use dnsmasq+dnssec unless you have *raw IP addresses* on your list of ntp servers, or a proper RTC.
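
A quick way to audit a server list against that advice is sketched below. The helpers is_raw_ip and hostname_servers are hypothetical, and the pattern match is deliberately loose (it does not validate octet ranges or IPv6 grouping).

```shell
#!/bin/sh
# is_raw_ip SERVER: succeed when SERVER looks like an IPv4 or IPv6 literal.
is_raw_ip() {
	case "$1" in
		"") return 1 ;;
		*:*)
			# Has a colon: accept if it holds only hex digits, colons, dots
			case "$1" in
				*[!0-9A-Fa-f:.]*) return 1 ;;
				*) return 0 ;;
			esac
			;;
		*[!0-9.]*) return 1 ;;   # letters or other characters: a hostname
		*.*.*.*) return 0 ;;     # digits and at least three dots
		*) return 1 ;;
	esac
}

# hostname_servers SERVER...: print every entry that still needs DNS.
hostname_servers() {
	for s in "$@"; do
		is_raw_ip "$s" || echo "$s"
	done
}
```

On a router the arguments might come from something like `uci -q get system.ntp.server`, assuming the default timeserver section is named ntp. Any name this prints is a server that cannot be reached until DNS resolution works.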

Henrique de Moraes Holschuh commented on 01.03.2020 01:28

Ehh, yes, it *IS* broken. Hideously. Since basically forever.

The attached patch to /etc/init.d/dnsmasq fixes it.
