OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Ion Vasile - 09.10.2020

FS#3373 - IPV6 flow offload broken

Enabling flow offloading makes IPv6 connections unstable.

Example: connecting to #openwrt-devel from HexChat (Ubuntu) results in constant disconnects.
Disabling flow offload makes the connection stable.


netprince commented on 14.01.2021 15:51

I can confirm, using -j FLOWOFFLOAD from ip6tables causes ipv6 connections to stall after a few minutes.

netprince commented on 25.01.2021 15:52

Just another datapoint, using the exact same configuration for each build. This was tested using a Xiaomi Mi Router 3G. This occurs with and without hardware offload enabled.

With a build of OpenWrt 19.07-SNAPSHOT, r11285-11f4918ebb (dated 20210125), offloading works great.

With a build of OpenWrt SNAPSHOT, r15599-37752336bd (dated 20210125), IPv6 connections stall after being idle for a while (switching back and forth between apps on a phone, for example).

When the connection stalls, everything in the app (on my phone) just hangs for about 8-10 seconds before finally coming back to life.

hacc1225 commented on 31.01.2021 13:46

Tested on ath79 (TP-Link Archer C7 v2) with the same issues.

hacc1225 commented on 01.02.2021 12:39

When -j FLOWOFFLOAD is enabled, the IPv6 packet loss rate is as high as 50%.
The image was built from commit ddab795b370da986149f8c8e6b3455bf9c1066fe.
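
For anyone wanting to put a number on the loss with and without offloading, here is a minimal sketch that pulls the loss percentage out of a ping summary. It assumes the summary line contains a field of the form "50% packet loss", which is what BusyBox and iputils ping print; the helper name parse_loss and the target host are illustrative and not from this report.

```shell
# Read ping output on stdin and print only the packet-loss percentage.
# Assumes a summary field like "50% packet loss".
parse_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical usage from a LAN client (host chosen only as an example):
#   ping -6 -c 100 ipv6.google.com | parse_loss
```

Running it once with offloading enabled and once with it disabled, against the same host, should give comparable figures.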

joda commented on 28.04.2021 12:21

I'm still seeing this issue in OpenWrt 21.02.0-rc1 r16046-59980f7aaf (netgear r7800). I'd be happy to help diagnose this if given some assistance in how to best track it down.

gtx commented on 13.05.2021 04:20

I can confirm this exact issue is still happening on r16708-e7249669d2 on an R7800.

the7thstranger commented on 30.06.2021 09:10

I also have this issue on a WDR4300 with 21.02.0-rc3. Flow offloading was working fine before, with 19.07.7. Connections break completely after a while, all kinds of elements on web pages are not loaded, and so on.

Andy Botting commented on 03.07.2021 11:59

It took me a week to discover this issue, and I found this bug report just after adding a new topic to the forum: https://forum.openwrt.org/t/21-02-0-rc3-losing-ipv6-packets-when-flow-offloading-1/100575

In the meantime, I've just added this line to /etc/firewall.user:

ip6tables -D FORWARD -m comment --comment "!fw3: Traffic offloading" -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD --hw
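
A slightly more defensive variant of the same workaround could look like the sketch below. It checks with -C whether the rule is present before deleting it with -D, so a missing rule does not produce an error. The function name remove_v6_offload and the IP6T override variable are hypothetical helpers for illustration, not part of fw3.

```shell
# Sketch for /etc/firewall.user: delete the fw3-generated IPv6 offload rule
# only if it is currently present.
# IP6T is a hypothetical override (not part of fw3) that allows dry runs.
IP6T="${IP6T:-ip6tables}"

remove_v6_offload() {
    # Extra arguments (e.g. --hw) are appended to the rule specification.
    # -C tests whether the rule exists; -D deletes it.
    if "$IP6T" -C FORWARD -m comment --comment "!fw3: Traffic offloading" \
        -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD "$@" 2>/dev/null
    then
        "$IP6T" -D FORWARD -m comment --comment "!fw3: Traffic offloading" \
            -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD "$@"
    fi
}

# With hardware offloading enabled:   remove_v6_offload --hw
# With software offloading only:      remove_v6_offload
```

The function only defines the behaviour; one of the two calls shown in the trailing comments has to be added for it to take effect.
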

Greg Dietsche commented on 03.08.2021 03:03

I think I may have a similar issue with IPv4 and flow offloading turned on. Idle ssh connections drop relatively quickly when offloading is enabled. When offloading is turned off, they stay connected indefinitely. Please see issue #3759 where I have documented the problems that I have had.

Greg Dietsche commented on 03.08.2021 03:04

I meant to write please see issue #3759. Thanks!

bohdan-s commented on 16.08.2021 05:18

Hi, I can confirm that my bug FS#3973 is linked to this bug. Disabling either offloading or IPv6 resolves the issue, but when both are enabled this bug is observed.

Shine- commented on 02.09.2021 21:09

OK, seriously. You released 21.02 today. With this bug. I updated. It - well - STOPPED WORKING. You know, most major websites, like, Google. Facebook. Ebay. Are using IPv6. And you are releasing a new major release of OpenWRT that stops supporting IPv6. Wow. Hello? I mean, yes, offloading is disabled by default. But every 19.07 user has it turned on, because OpenWRT is painfully slow without. And now you break it? And nevertheless release the broken version as a new major release? Again, wow. Wow.

Josh Stone commented on 03.09.2021 23:16

The 21.02 release has not been announced yet, but I expect this will be a known issue in the release notes, just as it was in rc4. The developers decided not to let this be an indefinite blocker, as the release is already very late. It sounds like this will still be investigated for a point release though.

fda commented on 05.09.2021 20:21

I have multiple VLANs and separate ULAs on every interface. odhcpd announces every subnet to every VLAN!
I have now noticed that this bug is also gone with software offloading disabled.

Mykola commented on 06.09.2021 13:09

Archer C7-V2. Problem confirmed

Cheddoleum commented on 06.09.2021 19:40

I'm puzzled at the lack of a substantive description of this bug. There is no detail, no steps to reproduce, not even any anecdotes except a handful contributed in the comments.

For example, does this affect native IPv6 connections only? Tunneled connections such as 6rd or 6in4 use IPv4 for ingress and egress, but then are routed internally to the destination LAN subnet as v6, so I'm guessing that the v6 flow tables, and therefore this bug, would be in effect. But that is just a guess; as this is in general release, it would be helpful to know whether mitigation is needed without doing a bunch of ad-hoc, nearly-uninformed testing.

stephen commented on 07.09.2021 00:41

Details: IPv6 is broken when offloading is enabled.

Steps to reproduce: have IPv6 on and turn on software or hardware offloading. This affects visiting IPv6-primary sites like Google, YouTube, Facebook, etc.; it affects all connections to IPv6 sites. How is IPv6 broken? When you go to an IPv6 website it will sometimes load instantly, like IPv4, but every other load or reload will take several seconds or time out.

The fix is to either disable IPv6 or offloading, or to do the following:
turn on software offloading
install ipset

For software offloading (I haven't confirmed that this works, but I would think it does), go to the custom rules of your firewall and add this:

ip6tables -D FORWARD -m comment --comment "!fw3: Traffic offloading" -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD

Then SSH into the device and run:

uci set firewall.@include[0].reload="1"
uci commit firewall
service firewall restart
fw3 flush
fw3 restart

After that you should have both offloading and IPv6 working.

For hardware offloading, do this:
turn on software and hardware offloading
go to the custom rules of your firewall and add this:

ip6tables -D FORWARD -m comment --comment "!fw3: Traffic offloading" -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD --hw

Then SSH into your router and run:

uci set firewall.@include[0].reload="1"
uci commit firewall
service firewall restart
fw3 flush
fw3 restart

I haven't tried this with software offloading, because it works with hardware offloading for me.
I am using OpenWrt SNAPSHOT r17482-6f2044c2d7
on a MikroTik RouterBOARD 760iGS.
There appears to be another bug in the current snapshot causing speeds to drop off after an initial peak.

fda commented on 07.09.2021 11:06

I have native IPv6 but with NAT6 enabled. An Unbound instance behind an OpenWrt router cannot use UDP connections when software offloading is enabled! TCP connections are working, but from time to time there are errors in syslog.
Could the new DSA be related?

Cheddoleum commented on 07.09.2021 19:24

This thread suggests that the change that caused the problem is known and revertible, though I see no confirmation yet:

http://lists.openwrt.org/pipermail/openwrt-devel/2021-August/036223.html

Meanwhile, to perhaps answer my own earlier question, I'm running _tunneled_ IPv6 (Hurricane Electric 6in4) in 21.02.0 and there is as yet no indication of this problem with either HW or SW flow offloading.

stephen commented on 08.09.2021 06:38

jacko neill commented on 08.09.2021 14:20

Another affected user here, Xiaomi mi3g. Disabling the rule via ip6tables as indicated above works for me as a workaround. I'd still like to keep flow offload enabled for IPv4, as this router isn't powerful enough to handle the 300 Mbps connection my ISP provides.

I can reproduce this pretty consistently on an iPhone when scrolling through Facebook/Instagram and switching apps every minute or so. I'm not sure, though, what sort of logs I could grab to help confirm the bug.
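
One thing that might be worth capturing while the stall is happening is how many IPv6 flows are actually marked as offloaded. The sketch below assumes the kernel exposes /proc/net/nf_conntrack and tags offloaded flows with [OFFLOAD] (or [HW_OFFLOAD] on newer kernels); the helper name count_v6_offloaded is made up for illustration.

```shell
# Read conntrack entries (in /proc/net/nf_conntrack format) on stdin and
# print how many IPv6 entries carry an offload tag.
count_v6_offloaded() {
    # Keep only IPv6 entries, then count lines tagged [OFFLOAD] or [HW_OFFLOAD].
    grep '^ipv6' | grep -c 'OFFLOAD\]'
}

# Hypothetical usage on the router:
#   count_v6_offloaded < /proc/net/nf_conntrack
```

Saving the raw matching lines at the moment of a stall, rather than just the count, would probably be more useful to a developer.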

Shine- commented on 08.09.2021 21:29

If this is really caused by kernel commit e97d940 (Edit: it's not, see below!) as suggested in the mailing list then the attached patch should fix it. It effectively reverts e97d940 and additionally picks all necessary changes from 4592ee7f, skipping stuff that kernel 5.4 doesn't have yet (like configurable offload timeouts).

It compiles, but is otherwise completely untested, since I went back to 19.07 for now and won't be able to perform another upgrade attempt for at least the next 2 weeks. Try at your own risk, reports welcome.

Edit: So I did dig up a spare box to test, and no, it's not working, even with this patch. That would be strange anyway: if that commit were the cause, then both IPv4 and IPv6 would be affected by the bug, not only IPv6, and the packet loss would only happen after 30 or 120 seconds (which are the hardcoded timeouts). However, it occurs pretty much immediately.

Feel free to confirm for yourself, I can't delete the attachment anyway. Maybe a mod can delete it, then I'd re-attach it with proper name and description, for reference. It might be the fix for FS#3759, after all.

J Str commented on 09.09.2021 07:55

Affected, too. I can confirm that after I updated my Archer C2600 from 19.07 to 21.02 with software flow offloading on, IPv6 connections broke immediately.
Switching it off fixes the issue; however, WAN-to-LAN routing throughput in 21.02 is then cut in half compared to 19.07.

supersebbo commented on 10.09.2021 09:22

I am also intrigued by the lack of definition and priority of this bug.

I'm not currently seeing any issues, as I am only using IPv6 on one internal network, so nothing traverses the router. However, it's concerning to think that if I were to start using IPv6 across multiple networks, this would likely be a hard-to-diagnose issue on which I would probably waste hours trying to debug my IPv6 config. I know it's in the release notes... but who looks at the release notes after they've upgraded?

Feels like a big miss for a major release.

Happy to set up some test networks and hosts if that would help... what is needed?

Pepe commented on 10.09.2021 09:41

Another affected user here. TP-Link Archer C7 v5.
After upgrading to 21.02 I was forced to remove the firewall rule to disable offloading as suggested above, and IPv6 throughput is now cut to less than half.
