FS#964 - odhcpd: IPv6 node (Win 10) stops sending Router Solicitation but it still receives RAs from LEDE #5949
Comments
dedeckeh: The attached logs don't reveal any issue.
jogo:
This is expected behavior that conforms to RFC 4861 (https://tools.ietf.org/html/rfc4861#section-6.3.7): RSs are only sent on a link-up event, until an RA is received. RAs are sent at regular intervals by the router, so RSs are not actually required; they just allow faster router discovery. The RFC even forbids sending RSs except in response to link-up / interface-becomes-available events. So the question is probably why Windows 10 does not like the RAs. Does it ever update the router lifetime? Do you have any logs from the Windows machine (even just regular output)?
jogo:
Windows stops sending RSs because the standard says you stop sending RSs once you have received an RA. And as I said, try using netsh or similar to find out how the lifetimes of routes etc. behave on the Windows side (especially whether RAs change anything or not).
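The RS behavior described above can be sketched as a small model. This is only an illustration, not Windows' or any other stack's actual code: the constants are the host defaults from RFC 4861, the initial random delay is ignored, and the function name `solicitation_times` is made up for this example.

```python
# Host constants from RFC 4861 (section 10).
MAX_RTR_SOLICITATIONS = 3       # at most three RSs per link-up event
RTR_SOLICITATION_INTERVAL = 4   # seconds between RSs

def solicitation_times(link_up_at, first_ra_at=None):
    """Return the times (in seconds) at which a host would send RSs.

    Simplified model: the initial random delay is ignored, and the host
    stops soliciting as soon as a valid RA has been received.
    """
    times = []
    for i in range(MAX_RTR_SOLICITATIONS):
        t = link_up_at + i * RTR_SOLICITATION_INTERVAL
        if first_ra_at is not None and first_ra_at <= t:
            break  # an RA already arrived: no further RSs until the next link-up
        times.append(t)
    return times

print(solicitation_times(0))                 # no RA seen: [0, 4, 8]
print(solicitation_times(0, first_ra_at=5))  # RA after 5 s: [0, 4]
```

In this model a capture taken hours after link-up shows RAs but no RSs, which matches what the reporter saw and is not by itself a fault.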
freefor: Sorry, I misunderstood you and the RFC. Thank you for pointing it out.
freefor: I attached the UCI odhcpd debug log and a pcap (icmpv6 filter). Log timestamps are in UTC (my timezone is UTC+2). This is what I did: My Windows machine still has an IPv6 address and IPv6 DNS but no default gateway. I also checked the lifetime on my Windows machine using netsh, and it gets refreshed every once in a while (it usually has a value between 3 and 5 minutes). Thank you for your time.
dedeckeh: The odhcpd traces indicate that the default IPv6 route on the router is expiring:
Fri Aug 18 14:29:38 2017 daemon.info odhcpd[835]: Raising SIGUSR1 due to default route change
This results in an RA being sent to the LAN with router lifetime 0.
Fri Aug 18 14:29:54 2017 daemon.info odhcpd[835]: Raising SIGUSR1 due to default route change
This behavior repeats continuously and is very unusual, as the lifetime of the default IPv6 route is determined by the RA route lifetime received on the WAN. Are you able to sniff IPv6 packets on the WAN (e.g. via tcpdump)? I'm particularly interested in the contents of the RA and DHCPv6 messages received on the PPP link.
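To see how often the default route flaps, the spacing between these log events can be measured. The following is a rough sketch that assumes the syslog line format shown in the two lines quoted above; the helper name `route_change_gaps` is hypothetical.

```python
import re
from datetime import datetime

EVENT = "Raising SIGUSR1 due to default route change"
# Timestamp prefix as it appears in the quoted log lines,
# e.g. "Fri Aug 18 14:29:38 2017".
STAMP = re.compile(r"^(\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) ")

def route_change_gaps(lines):
    """Return the seconds between consecutive default-route-change events."""
    times = []
    for line in lines:
        m = STAMP.match(line)
        if m and EVENT in line:
            times.append(datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

log = [
    "Fri Aug 18 14:29:38 2017 daemon.info odhcpd[835]: Raising SIGUSR1 due to default route change",
    "Fri Aug 18 14:29:54 2017 daemon.info odhcpd[835]: Raising SIGUSR1 due to default route change",
]
print(route_change_gaps(log))  # [16.0]
```

A gap of 16 seconds between route changes is far shorter than the lifetimes normally seen on a default route.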
freefor: Some changes from last time: I attached pcaps from the WAN and from the Windows 10 machine. There is also an ipconfig output from when IPv6 is working fine. UTC 19:23 - Started capturing. Unfortunately I forgot to increase the system.log max file size, so I am missing a large amount of odhcp debug output (PART1 UTC 19:25 ~ 19:38 and PART2 UTC 21:28 ~ 21:30). I won't be able to do other tests until 27th August, but let me know what you want me to do. Thanks.
dedeckeh: The RA transmission rate on the WAN PPP interface is very chatty; the interval varies between 10 and 23 seconds. On top of that, the router lifetime is set to 40 seconds, resulting in a similar lifetime for the default IPv6 route.
EricLuehrsen: Maybe this is an ISP problem... Any lifetime parameter under one minute is out of line with the implied best practices in RFC 4861 and later. A default router lifetime that is sent as 0, or that counts down to 0, removes the router from the host's router list. A lifetime of 40 seconds is not robust against local congestion or missed RA messages. Similarly, unsolicited RAs sent more often than once a minute are a potential source of congestion (depending on connection type or speed). While the odhcp6c limit of 30 seconds may be arbitrary, it may also be a well-considered throttle on the burden from excessive RA events.
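The interaction between a 40-second router lifetime and a 30-second acceptance limit can be illustrated with a toy simulation. This is a deliberately simplified model and an assumption about how the throttle behaves (an RA arriving sooner than the minimum interval after the last accepted RA is simply ignored); it is not taken from the odhcp6c source, and `default_route_gaps` is a made-up name.

```python
def default_route_gaps(ra_times, router_lifetime, min_accept_interval):
    """Return (start, end) periods during which no default route is held.

    Assumed model: an RA is ignored if it arrives less than
    min_accept_interval seconds after the last accepted RA; each accepted
    RA restarts the default route lifetime.
    """
    gaps = []
    last_accepted = None
    expires_at = None
    for t in ra_times:
        if last_accepted is not None and t - last_accepted < min_accept_interval:
            continue  # too soon after the last accepted RA: ignored
        if expires_at is not None and t > expires_at:
            gaps.append((expires_at, t))  # route expired before this refresh
        last_accepted = t
        expires_at = t + router_lifetime
    return gaps

# RAs every 23 seconds (the slow end of the 10-23 s range seen on the WAN).
ra_times = list(range(0, 200, 23))

print(default_route_gaps(ra_times, 40, 30))  # [(40, 46), (86, 92), (132, 138), (178, 184)]
print(default_route_gaps(ra_times, 40, 10))  # []
```

Under these assumptions every second RA is dropped, so the route expires a few seconds before each refresh; with a 10-second minimum every RA is accepted and the route never lapses.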
freefor: Great to see you have some ideas. As I said, I am available from the 27th August for testing. Thank you.
jogo:
Luckily you can change that as a workaround (although not via UCI) with the -m switch. If you modify /lib/netifd/proto/dhcpv6.sh and add e.g. -m 10 (for a 10-second minimum) to the arguments passed to odhcp6c, it should help in the meantime.
dedeckeh: @ForFree Can you add -m 10, as suggested by Jonas Gorski, to the proto_run_command in /lib/netifd/proto/dhcpv6.sh? This will instruct odhcp6c to accept RA messages spaced with a minimum interval of 10 seconds.
dedeckeh: @eric Luehrsen From experience I know TIM uses network config settings which differ a lot from other ISPs and as such are not seen as best practices.
Neither do I find any requirement in RFC7084
EricLuehrsen: @hans Dedecker, you have hit the nail on the head. There is no single specification limit, but rather a set of parametric relationships in a system. There are also simple matters of reality. RAs sent too fast represent congestion or link waste. It is not possible to robustly maintain a link if the information for the link is not refreshed at least 3 times in its lifetime (an RA may go missing or be garbled). At 20-40 odd seconds, it would be even more ridiculous for a client to send an RS as a fall-back measure after the router's half-life and get things more in a knot. This is why I said "implied best practices," because the system of relationships just doesn't work well any other way.
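The "refreshed at least 3 times" rule of thumb is easy to check numerically. The sketch below compares the values reported on this WAN link with the RFC 4861 router defaults (MaxRtrAdvInterval of 600 seconds and AdvDefaultLifetime of 3 * MaxRtrAdvInterval); the function name and the threshold constant are only illustrative.

```python
MIN_REFRESHES = 3  # rule of thumb from the comment above, not a hard RFC limit

def refreshes_per_lifetime(router_lifetime, ra_interval):
    """How many RAs arrive, on average, within one router lifetime."""
    return router_lifetime / ra_interval

cases = {
    "RFC 4861 defaults (1800 s / 600 s)": (1800, 600),
    "this WAN, fast RAs (40 s / 10 s)": (40, 10),
    "this WAN, slow RAs (40 s / 23 s)": (40, 23),
}
for name, (lifetime, interval) in cases.items():
    n = refreshes_per_lifetime(lifetime, interval)
    verdict = "ok" if n >= MIN_REFRESHES else "fragile"
    print(f"{name}: {n:.2f} refreshes per lifetime -> {verdict}")
```

With RAs 23 seconds apart, fewer than two refreshes fit into the 40-second lifetime, so a single lost or ignored RA is enough for the route to expire.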
freefor: @hans Dedecker I edited dhcpv6.sh like this: saved, then rebooted both the router and my Windows machine.
dedeckeh: @eric Luehrsen I don't think "implied best practices" will be a strong enough argument to convince TIM; after all, it's an interpretation of the RFC and not an explicitly written-down requirement.
dedeckeh: Change dhcpv6.sh as follows: This will start odhcp6c in verbose mode and will instruct odhcp6c to accept RA messages spaced with a minimum interval of 10 seconds.
freefor: @hans Dedecker Now my IPv6 default gateway is stable!! I will continue monitoring the situation but it seems OK. EDIT: 1 hour and 30 minutes after I wrote this comment, my Windows machine no longer has any GUA, only a ULA address. EDIT2: It seems that the PPPoE session went down completely; that's why I didn't have any GUA address. That is normal.
freefor: Over 6 hours and my IPv6 default gateway is stable. Thank you!
dedeckeh: @ForFree Can you provide the odhcp6c and odhcpd traces from the current configuration when possible?
freefor: @hans Dedecker I was able to set up remote syslog and also save some pcapng files. As I wrote in my previous post, today my IPv6 was stable for over 6 hours.
UTC 13:32 - Started Wireshark on the Windows machine.
UTC 14:20 - I checked with ipconfig and my PC didn't have any GUA address, only ULA.
UTC 14:31:11 - I manually shut down the wan2 interface; after reactivating it, pppoe-wan2 came up without any IPv6.
UTC 14:48:30~ - I started wan2 again, and here I made a mistake: I used the same filename (file pppoewan2-part2.pcap), overwriting everything from 14:31:11 to 14:48:10.
UTC 17:50 - I checked with ipconfig and my PC didn't have any GUA address, only ULA. pppoe-wan2 went down again. I don't remember whether the PPPoE session resumed on its own or I shut/unshut the wan2 interface, and I started a new tcpdump (file pppoewan2-part3.pcap).
UTC 19:49 - My IPv6 was stable and still is as I write this comment (UTC 20:50).
I noticed that the RA/RS problem is solved, but sometimes my pppoe-wan2 gets disconnected. I guess this is a coincidence and could be my ISP's PPPoE not being stable.
freefor: 5 days and my IPv6 default gateway is stable. Some PPPoE disconnections, but that's the ISP.
dedeckeh: @ForFree A patch has been pushed (https://git.lede-project.org/?p=project/odhcp6c.git;a=commit;h=51733a6d3bfe0fb9e8c53aea22231e5b8a1f64c3) which aligns odhcp6c RA behavior with RFC4861; by default an RA update is accepted with a minimum interval of 3 seconds.
freefor: @hans Dedecker Compiled a new build with your patch included.
freefor: @hans Dedecker After 3 days of testing r4786-05c3647d35, I can confirm that my IPv6 default gateway is stable without changing the minimum interval. Thank you!
dedeckeh: @ForFree Thanks for testing and the feedback.
freefor:
Router: Buffalo wbmr-hp-g300h
Lede version: Reboot (SNAPSHOT, r4696-df3295f50e)
ISP: TIM Italy - ADSL2+
NIC: Realtek PCIe GBE - Driver 10.19.627.2017 (27/06/2017 - Latest)
I have set up a new PPPoE session to get IPv6 on all my mobile devices (see attachment). Without a dedicated PPPoE session,
my Android phone could not reach IPv4-only hosts.
Note: My ISP delegates a single dynamic /64 via DHCPv6-PD.
Problem:
Everything worked fine until I noticed that my Windows 10 (Build 15063.540) laptop loses its IPv6 default gateway after roughly 3 hours.
If I disable and re-enable the (wired) NIC, the default gateway returns and I can use IPv6.
I did a Wireshark capture on my laptop and I can see it receiving RAs, but it stops sending RSs to the router.
I captured traffic on the router at the same time, and indeed there are no RSs from my laptop.
I attach here the output before and after I have a default gateway for IPv6.
You can also find my /etc/config/network, /etc/config/firewall and /etc/config/dhcp settings.
Let me know if you need more logs.
I can easily reproduce this problem (I just have to wait 3 hours).
Forum discussion: https://forum.lede-project.org/t/problem-with-ipv6-and-mobile-device/241