OpenWrt/LEDE Project

  • Status Closed
  • Percent Complete 100%
  • Task Type Bug Report
  • Category Base system
  • Assigned To Hans Dedecker
  • Operating System All
  • Severity Medium
  • Priority Very Low
  • Reported Version openwrt-18.06
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Marc - 27.04.2020
Last edited by Hans Dedecker - 13.05.2020

FS#3056 - odhcpd: "on-link" Route Information Options pollution in Router Advertisements

Device problem occurs on

Reproduced on Archer C7 v5 but not hardware specific at all.

Software versions of OpenWrt/LEDE release, packages, etc.

openwrt-18.06 branch (git-20.029.49294-41e2258)
odhcpd-ipv6only 1.15-3

Steps to reproduce

The function send_router_advert() in src/router.c doesn’t seem to know the purpose of Route Information Options. It systematically pollutes its Router Advertisements with bogus RFC4191 off-link Route Information Options (RIOs) that duplicate the on-link Prefix Information Options (PIOs) in the same RA packet.

This makes most Linux hosts unreachable with IPv6 on the same LAN.

This can be reproduced very easily with odhcpd-ipv6only 1.15-3 and either ndptool or tcpdump. From a Linux workstation:

$ ndptool monitor -i wlan0 &
$ ndptool send -t rs -i wlan0

Prefix: 2601:2c0:7a00:724::/64, valid_time: 239810s, preferred_time: 239810s, on_link: yes, autonomous_addr_conf: yes, router_addr: no
Route:  2601:2c0:7a00:724::/64, lifetime: 239810s, preference: medium

$ tcpdump -v icmp6

Prefix Info Option (3), length 32 (4):, 2601:2c0:7a00:724::/64 Flags [onlink, auto], valid time 159616s, pref. time 159616s
Route Info Option (24), length 24 (3):  2601:2c0:7a00:724::/64, pref=medium, lifetime=159616s

These wrong RIOs are harmless for the (many?) RFC4191 Type A or B operating systems that ignore RIOs. However, they are harmful for some Type C hosts such as NetworkManager 1.20.10-1.fc31 or NetworkManager 1.22.10-1ubuntu1, which get confused by them. Most Linux distributions use NetworkManager by default.

RFC4191 RIOs are meant for advertising OFF-LINK routes. They are not meant for duplicating (and contradicting) information already found in ON-LINK prefixes. Quoting https://tools.ietf.org/html/rfc4191

In some network topologies where the host has multiple routers on its Default Router List, the choice of router for an off-link destination is important.
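
As a rough illustration of the pollution described above, the following Python sketch scans ndptool-style output for RIOs whose prefix exactly matches an on-link PIO. The regular expressions are assumptions based only on the two output lines quoted in this report, and the function name find_duplicate_rios is made up for this example. It is not part of ndptool or odhcpd.

```python
import ipaddress
import re

# Line shapes assumed from the ndptool output quoted in this report.
PREFIX_RE = re.compile(r"Prefix:\s*(\S+),.*?on_link:\s*(yes|no)")
ROUTE_RE = re.compile(r"Route:\s*(\S+),")


def find_duplicate_rios(ndptool_output):
    """Return RIO prefixes that exactly match an on-link PIO prefix."""
    on_link = set()
    routes = []
    for line in ndptool_output.splitlines():
        m = PREFIX_RE.search(line)
        if m:
            if m.group(2) == "yes":
                on_link.add(ipaddress.ip_network(m.group(1)))
            continue
        m = ROUTE_RE.search(line)
        if m:
            routes.append(ipaddress.ip_network(m.group(1)))
    return [str(r) for r in routes if r in on_link]


sample = """\
Prefix: 2601:2c0:7a00:724::/64, valid_time: 239810s, preferred_time: 239810s, on_link: yes, autonomous_addr_conf: yes, router_addr: no
Route:  2601:2c0:7a00:724::/64, lifetime: 239810s, preference: medium
"""
print(find_duplicate_rios(sample))
```

With the sample taken from this report, the returned list contains the single prefix 2601:2c0:7a00:724::/64, which is advertised both as an on-link prefix and as a route.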

Consequence

⇒ Some RFC4191 Type C hosts like NetworkManager are unreachable on the LAN ⇐

For some Type C hosts like NetworkManager, the (wrong) RIO takes precedence over the (correct) PIO in the routing table. Let’s call these “Type C-” hosts.

When trying to reach a Type C- host (with ping6 or anything else), packets are sent directly on-link to the Type C- host. The Type C- host then obeys the bogus RIO and replies _indirectly_ through the router. Finally, the following ip6tables rule in OpenWrt’s firewall configuration drops the unexpected reply:

FORWARD
  DROP   all      anywhere   anywhere       ctstate INVALID /* !fw3 */

Confusingly, Type C- hosts _can_ reach other Type C- hosts, through the router. Other hosts can reach each other directly. Only some communications are broken, which made this very difficult to debug.
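
The reachability pattern above can be reproduced with a small model. This is only a sketch under assumptions: the two host addresses are invented, the rio_wins flag stands for the "Type C-" behaviour where the RIO takes precedence over the PIO, and a path counts as broken when the forward and reply directions use different next hops (the case where the router only sees the reply).

```python
import ipaddress

# Prefix taken from the RA quoted in this report; host addresses are made up.
PREFIX = ipaddress.ip_network("2601:2c0:7a00:724::/64")
HOST_A = ipaddress.ip_address("2601:2c0:7a00:724::a")
HOST_B = ipaddress.ip_address("2601:2c0:7a00:724::b")


def next_hop(dst, rio_wins):
    """Next hop chosen by a host that received both the PIO and the duplicate RIO."""
    if dst not in PREFIX:
        return "router"  # off-link destination: default route
    # On a "Type C-" host the RIO overrides the on-link prefix.
    return "router" if rio_wins else "direct"


def path_is_symmetric(a_rio_wins, b_rio_wins):
    """True when A->B and B->A use the same kind of next hop."""
    forward = next_hop(HOST_B, a_rio_wins)  # A -> B
    reply = next_hop(HOST_A, b_rio_wins)    # B -> A
    return forward == reply


for a, b in [(False, False), (False, True), (True, True)]:
    print(a, b, "symmetric" if path_is_symmetric(a, b) else "asymmetric")
```

Only the mixed pair (one ordinary host, one Type C- host) comes out asymmetric, which matches the observation that Type C- hosts can still talk to each other.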


Closed by  Hans Dedecker
13.05.2020 05:58
Reason for closing:  Fixed
Additional comments about closing:  

Fixed in commit https://git.openwrt.org/?p=project/odhcpd.git;a=commit;h=5ce077026b991f49d96464587386f93d92f56385

Marc commented on 27.04.2020 20:01

Thanks thaller for sharing the pointer to the relevant NetworkManager code:
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/dec1678fecadf/src/ndisc/nm-ndisc.c#L554

Some (less user friendly) alternatives to NetworkManager avoid the issue:

  • sysctl -w net.ipv6.conf.wlan0.accept_ra=1 ignores RIOs.
  • systemd-networkd does take this RIO pollution into account, but unlike NetworkManager it populates the routing table with BOTH the PIO and the bogus RIO, and the on-link prefix information seems to take precedence for some reason. Windows 10 behaves the same.
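
To check how the kernel itself is configured on a given Linux host, a small helper like the one below can read the relevant sysctls. This is a sketch under assumptions: the helper name read_ra_sysctls is made up, and accept_ra_rt_info_max_plen is, as far as I know, the kernel setting that limits which RIO prefix lengths are accepted. It may be absent on kernels built without Route Information support, in which case None is returned.

```python
from pathlib import Path


def read_ra_sysctls(iface, base="/proc/sys/net/ipv6/conf"):
    """Read RA-related sysctls for iface; a missing or unreadable entry gives None."""
    names = ("accept_ra", "accept_ra_rt_info_max_plen")
    values = {}
    for name in names:
        path = Path(base) / iface / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


# Interface name taken from the commands in this report.
print(read_ra_sysctls("wlan0"))
```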

Brian J. Murrell commented on 28.04.2020 13:03

Interesting. Pity the Priority is Very Low even though the Severity is Medium.

Marc commented on 28.04.2020 15:47

I didn't see any Severity guideline at https://openwrt.org/bugs. While this is quite "critical" for IPv6, I set the initial Severity to Medium, assuming most people can fall back on IPv4.

"Very Low" is probably just the initial priority, especially for "Unconfirmed" bugs. You don't get to pick a priority when you file a bug, and I don't see that anyone changed it in the history tab.

While Severity could be somewhat objective given some guidelines, Priority is by definition totally relative and subjective :-)

Speaking of "Unconfirmed", it wouldn't hurt if a couple of people could reproduce and confirm here.

Project Manager
Hans Dedecker commented on 04.05.2020 19:49

odhcpd sends RIO RA options to be in line with RFC7084, in particular requirement L-3:

An IPv6 CE router MUST advertise itself as a router for the
delegated prefix(es) (and ULA prefix if configured to provide
ULA addressing) using the "Route Information Option" specified
in Section 2.3 of [RFC4191]. This advertisement is
independent of having or not having IPv6 connectivity on the
WAN interface.

But I agree this can cause issues on Type C hosts if the delegated prefix length is equal to the downstream delegated prefix length.
Therefore I've made a change to exclude the RIO RA option for a given prefix if the delegated prefix length is equal to the downstream delegated prefix length (https://git.openwrt.org/?p=project/odhcpd.git;a=commit;h=5ce077026b991f49d96464587386f93d92f56385).
Can you check if this fixes the issue for you?
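
The rule described in this comment can be paraphrased as a one-line predicate. The Python below is only an illustration of that rule, not the actual odhcpd code (which is C). The function name should_advertise_rio is made up, and the downstream length of 64 is an assumption taken from a later comment in this thread.

```python
import ipaddress

# Assumption: the downstream (LAN) prefix length is 64.
DOWNSTREAM_PREFIX_LEN = 64


def should_advertise_rio(delegated_prefix, downstream_len=DOWNSTREAM_PREFIX_LEN):
    """Skip the RIO when it would merely duplicate the on-link prefix."""
    net = ipaddress.ip_network(delegated_prefix)
    return net.prefixlen != downstream_len


print(should_advertise_rio("2601:2c0:7a00:724::/64"))  # /64: RIO suppressed
print(should_advertise_rio("fc00::/48"))               # /48: RIO still sent
```

Under this rule a /64 delegated prefix no longer produces a RIO that shadows its own PIO, while a shorter prefix such as a /48 still gets the RIO that RFC7084 asks for.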

Marc commented on 13.05.2020 01:09

Thanks Hans for the quick fix, really appreciated.

Therefore I've made a change to exclude the RIO RA option for a given prefix if the delegated prefix length is equal to the downstream delegated prefix length...

... the latter being always equal to 64 (for those looking at the code change).

I did indeed observe that the problem happens with the fc00:: ULA _only_ when its prefix is /64; it didn't happen with fc00::/48. I left this out while trying to shorten the relatively long bug description because I didn't think it was very important. Looks like it is :-)

(You obviously can't change the /64 prefix when it's given by your Internet Provider)

https://git.openwrt.org/?p=project/odhcpd.git;a=commit;h=5ce077026b991f4
Can you check if this fixes the issue for you ?

For a while I was wondering whether manually downloading and installing https://downloads.openwrt.org/snapshots/packages/mipsel_24kc/base/odhcpd-ipv6only_2020-05-04-5ce07702-3_mipsel_24kc.ipk would work. To my amazement, this dead simple command was enough to do the job instead:

# opkg upgrade odhcpd-ipv6only
Upgrading odhcpd-ipv6only on root from 2019-12-16-e53fec89-3 to 2020-05-03-49e4949c-3...
Downloading http://downloads.openwrt.org/releases/19.07.2/packages/mips_24kc/base/odhcpd-ipv6only_2020-05-03-49e4949c-3_mips_24kc.ipk

I initially got confused by commits e53fec89 and 49e4949c but then I found them in this other branch: https://git.openwrt.org/?p=project/odhcpd.git;a=shortlog;h=refs/heads/openwrt-19.07

Problem fixed, NetworkManager is back! Thanks for rewarding all the time I spent debugging this that quickly.

(Besides having a friendlier graphical UI, one of the smarter things NetworkManager does is setting a higher default metric on wifi when both wifi and wired are connected)
