FS#1541 - Invalid Kernel logspam : sit: non-ECT from <various IPs, Invalid IPs> #6716

Closed
openwrt-bot opened this issue May 10, 2018 · 16 comments

@openwrt-bot

plntyk:

The invalid IP addresses suggest that something is wrong.

Logs:

dmesg | grep sit | sed 's/^\[ *[0-9]*\.[0-9]*\] *//' | sort | uniq

sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
sit: non-ECT from 0.14.42.24 with TOS=0x2
sit: non-ECT from 0.14.42.24 with TOS=0x6
sit: non-ECT from 0.14.42.24 with TOS=0xb
sit: non-ECT from 0.14.42.24 with TOS=0xe
sit: non-ECT from 0.14.42.24 with TOS=0xf
sit: non-ECT from 0.206.0.0 with TOS=0x1
sit: non-ECT from 0.206.0.0 with TOS=0x9
sit: non-ECT from 0.206.0.0 with TOS=0xa
sit: non-ECT from 1.30.2.141 with TOS=0x5
sit: non-ECT from 1.30.2.141 with TOS=0x9
sit: non-ECT from 1.30.2.141 with TOS=0xb
sit: non-ECT from 1.30.2.141 with TOS=0xf
sit: non-ECT from 1.30.2.143 with TOS=0x3
sit: non-ECT from 1.30.2.143 with TOS=0x5
sit: non-ECT from 1.30.2.143 with TOS=0x9
sit: non-ECT from 1.30.2.143 with TOS=0xe
sit: non-ECT from 20.22.0.242 with TOS=0x5
sit: non-ECT from 20.22.0.242 with TOS=0x9
sit: non-ECT from 20.22.0.242 with TOS=0xb
sit: non-ECT from 32.0.0.1 with TOS=0x6
sit: non-ECT from 32.0.0.1 with TOS=0x9
sit: non-ECT from 32.72.0.1 with TOS=0x6
sit: non-ECT from 64.1.8.27 with TOS=0x2
sit: non-ECT from 64.1.8.27 with TOS=0xb
sit: non-ECT from 64.1.8.27 with TOS=0xd
sit: non-ECT from 64.1.8.27 with TOS=0xf
sit: non-ECT from 64.14.8.11 with TOS=0xe
sit: non-ECT from 64.14.8.5 with TOS=0x5
sit: non-ECT from 64.14.8.5 with TOS=0xd
sit: non-ECT from 64.14.8.7 with TOS=0x1
sit: non-ECT from 64.14.8.7 with TOS=0x2
sit: non-ECT from 64.14.8.7 with TOS=0x3
sit: non-ECT from 64.14.8.7 with TOS=0x6
sit: non-ECT from 64.14.8.7 with TOS=0xa
sit: non-ECT from 64.14.8.7 with TOS=0xd
sit: non-ECT from 64.14.8.9 with TOS=0xd
sit: non-ECT from 72.96.0.0 with TOS=0x1
sit: non-ECT from 72.96.0.0 with TOS=0x2
sit: non-ECT from 72.96.0.0 with TOS=0x3
sit: non-ECT from 72.96.0.0 with TOS=0x5
sit: non-ECT from 72.96.0.0 with TOS=0x6
sit: non-ECT from 72.96.0.0 with TOS=0x7
sit: non-ECT from 72.96.0.0 with TOS=0x9
sit: non-ECT from 72.96.0.0 with TOS=0xa
sit: non-ECT from 72.96.0.0 with TOS=0xb
sit: non-ECT from 72.96.0.0 with TOS=0xe
sit: non-ECT from 72.96.0.0 with TOS=0xf

Supply the following if possible:

  • Device problem occurs on

TP-Link WDR3600

  • Software versions of OpenWrt/LEDE release, packages, etc.

root@OpenWrt:~# cat /etc/openwrt_version
r6867-ce4d2fb5cc
root@OpenWrt:~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r6867-ce4d2fb5cc'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r6867-ce4d2fb5cc'
DISTRIB_TAINTS='no-all'

  • Steps to reproduce

Updated the router (WDR3600) to a recent trunk build and kept the existing config.

Guessing:
The log spam might involve the tunnelbroker tunnel or the OpenVPN server.

Some issues with either comp_lzo or openvpn-mbedtls after updating
were resolved with "compress lzo" and openvpn-openssl on the server side.

The VPN client is LEDE r5099 (OpenVPN 2.4.4 mips-openwrt-linux-gnu [SSL (mbed TLS)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD],
library versions: mbed TLS 2.6.0, LZO 2.10).

@openwrt-bot

plntyk:

Found https://forum.freifunk.net/t/kernel-ip-tunnel-non-ect-from-185-66-194-0-with-tos-0x/6149/5

There, however, the problem vanished after some random updates.

@openwrt-bot

arjendekorte:

FWIW, I see this too. It started about two weeks ago. Fortunately, I still have a build available that is not affected, so I'll dig in next week to find the culprit. I guess it is something related to OpenVPN/routing, since I can no longer tunnel IPv6 over an IPv4 OpenVPN connection.

@openwrt-bot

mennozon:

I ran into this as well. The workaround for me is to run 'echo N > /sys/module/sit/parameters/log_ecn_error'.

My guess is that the default changed at some point, but I have not yet looked further into this.

@openwrt-bot

arjendekorte:

I changed the contents of '/etc/modules.d/32-sit' from

sit

to
sit log_ecn_error=0

This will suppress the errors too after rebooting. It doesn't fix the underlying problem though.

@openwrt-bot

Kufat:

I also have this issue on a recent custom build of the 17.01 branch. I have a tunnelbroker IPv6 tunnel but don't use a VPN, if that helps narrow things down.

@openwrt-bot

sqter:

Try 'sit log_ecn_error=N' as the contents of '/etc/modules.d/32-sit'.

@openwrt-bot

rmounce:

I am seeing this with kernel 4.9.102 on mpc85xx with a simple 6in4 tunnel configured.

I don't know when the bug was introduced, but the problem is that the inner IPv6 header of inbound packets (from the tunnel broker) is being interpreted as an IPv4 header when looking for the ToS/ECN bits. The debug message is also looking in the IPv4 location for the source address, and is instead printing the second quarter of the v6 source address formatted as a v4 address.
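
To illustrate the misreading described above, here is a minimal Python sketch (not kernel code). It packs an IPv6 fixed header and then reads the bytes at the offsets where an IPv4 header keeps its TOS (byte 1) and source address (bytes 12 to 15). The addresses, the flow label and the helper names are hypothetical, chosen only so that the result resembles one of the logged lines.

```python
import ipaddress
import struct


def build_ipv6_header(src, dst, flow_label, traffic_class=0,
                      payload_len=0, next_header=58, hop_limit=64):
    """Pack a 40-byte IPv6 fixed header (hypothetical helper for illustration)."""
    # First 32 bits: version (4 bits), traffic class (8 bits), flow label (20 bits).
    first_word = (6 << 28) | ((traffic_class & 0xFF) << 20) | (flow_label & 0xFFFFF)
    return (struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def misread_as_ipv4(header):
    """Read the fields at the offsets an IPv4 header would use."""
    tos = header[1]                               # IPv4 TOS is byte 1
    saddr = ipaddress.IPv4Address(header[12:16])  # IPv4 source is bytes 12..15
    return tos, str(saddr)


# Hypothetical packet: documentation-prefix addresses, arbitrary flow label.
hdr = build_ipv6_header("2001:db8:e:2a18::2", "2001:db8:ffff::1", flow_label=0x6ABCD)
tos, saddr = misread_as_ipv4(hdr)
print(f"sit: non-ECT from {saddr} with TOS={tos:#x}")
```

With these inputs, bytes 12 to 15 of the header are the second quarter of the IPv6 source address (000e:2a18), which prints as 0.14.42.24, and byte 1 carries the top four bits of the flow label.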

Ubuntu has also identified the issue; a fix will hopefully be backported to 4.9.x upstream.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1772775

Edit: found the commit that backported the bug to 4.9, even after it had been reverted in mainline:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/net/ipv6/sit.c?h=linux-4.9.y&id=92b86857ed7e8be6843d216fefe2178b172dbf63

torvalds/linux@f4eb17e1efe5

@openwrt-bot

Pilot6:

I've sent a pull request against the 17.01 branch.
The same fix is needed on 18.06 for the 4.9 kernel.

@openwrt-bot

p3x-robot:

Hello guys!

Do you have any solution? DNS still does not work reliably for me. I added 'compress lzo yes' to OpenVPN, but I cannot find how to set 'compress lzo' on the tunnelbroker tunnel.

This is on a Linksys WRT. At least it works now, but my DNS server (a home DNS server reached via a server) sometimes drops packets; when I retry in the browser 2-3 times, it works. This never happened in 17.01.4.

@openwrt-bot

patrakov:

This bug is much more serious than just log spam. Each logged line is actually one packet dropped for no good reason. The bug is actually about misinterpreting the first byte of the IPv6 Flow Label field as an IPv4 TOS field and then (mistakenly) applying logic from RFC 6040 to drop packets under the supposed explicit congestion.

Turris equivalent: https://forum.turris.cz/t/performance-issue-with-ipv6-6rd-on-turris-omnia/7505

@openwrt-bot

patrakov:

In the #openwrt channel I was asked how to reproduce this in lab conditions.

  1. Install release 17.01.5 on WRT1200ACv2.
  2. Set up a PPPoE WAN.
  3. Register with tunnelbroker.net, create a tunnel, tune down the MTU as required for PPPoE.
  4. Set up a 6in4 tunnel in OpenWrt.
  5. Modify firewall so that pings from WAN to LAN over IPv6 are accepted
  6. Run tcpdump on pppoe-wan (ideally with a filter like "host 1.2.3.4" where 1.2.3.4 is the tunnel server) and on 6in4-wan6, separately
  7. Ping some host in LAN from some Linux host in WAN

Important: the host in WAN must have a non-zero flow label on its ping packets. That's why "Linux".

  8. See the IPv6 packets captured by tcpdump on pppoe-wan but not on 6in4-wan6, and also see the TOS log spam with invalid IP addresses.

  9. If the pings do come through, wait 20 minutes and try again (so that ping chooses a different flow label), or try from a different location. For me, pinging from a 6to4 host running the latest Arch Linux was reliably triggering the issue.

  10. Explicitly pass "-F 0" to ping and see that the packets come through.
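
A rough Python sketch of why the flow label matters in this reproduction. Byte 1 of an IPv6 header holds the low four bits of the traffic class followed by the top four bits of the flow label. The sketch assumes that the message fires whenever the two ECN bits (the lowest two bits) of that misread byte are non-zero; that is an assumption made for illustration, not a quote from the kernel source, although it agrees with the TOS values in the log at the top of this report. The function names are hypothetical.

```python
def misread_tos(flow_label, traffic_class=0):
    """Byte 1 of an IPv6 header: low nibble of traffic class, top nibble of flow label."""
    return ((traffic_class & 0x0F) << 4) | ((flow_label >> 16) & 0x0F)


def would_log(flow_label, traffic_class=0):
    """Assumption: the message appears when the misread ECN bits are non-zero."""
    return (misread_tos(flow_label, traffic_class) & 0x03) != 0


# Which top nibbles of the flow label would trigger the message?
logging_nibbles = [n for n in range(16) if would_log(n << 16)]
print([hex(n) for n in logging_nibbles])

# A zero flow label (what "ping -F 0" requests) never triggers it.
print(would_log(0))
```

Under this assumption, the nibbles that trigger the message are exactly the twelve TOS values seen in the log (0x1, 0x2, 0x3, 0x5, 0x6, 0x7, 0x9, 0xa, 0xb, 0xd, 0xe, 0xf), while 0x0, 0x4, 0x8 and 0xc never appear.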

@openwrt-bot

MauritsVB:

If this bug has been fixed upstream in kernel 4.9.112 (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/net/ipv6?h=v4.9.112&id=5b8fcc075714b86fb8fe194bb463fed2998a8e85) and the OpenWrt 18.06 codebase was just updated to kernel 4.9.118 (https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=f7036a34ace38b701243e9357d7f509f8a66f0b1), that should mean that OpenWrt 18.06.1 will fix this bug.

Correct?

@openwrt-bot

Pilot6:

@alexander PPPoE is not needed for that.

@maurits You are correct. If you build from the 18.06 branch, you'll get a fix.

@openwrt-bot

patrakov:

I confirm that on mvebu (WRT1200ACv2) on OpenWRT 18.06.1 the bug no longer exists. It would still be nice to fix the regression in the LEDE 17.01.x release.

@openwrt-bot

danielg4:

Supposedly, this bug has also been fixed upstream in kernel 4.4.143 (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/net/ipv6?h=v4.4.143&id=0b12830665116f90ed795a1a54e2fced083c1a59), so 17.01.6 will have it too, since the 17.01 codebase is currently using kernel 4.4.151 (https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=8a72a868fd8000b826d5337fc413a51b01f39b5d), which includes that fix.
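
The version reasoning used in this thread can be sketched as a small check: a build should carry the fix when its kernel is in the same stable series as the fixed release and is at least as new. The helper names are hypothetical, and the sketch assumes plain "major.minor.patch" version strings; it says nothing about vendor patches that add or revert commits.

```python
def kernel_version(text):
    """Parse 'major.minor.patch' into a tuple of ints (assumes no suffixes)."""
    return tuple(int(part) for part in text.split("."))


def includes_fix(shipped, fixed_in):
    """True when both versions share a stable series and shipped >= fixed_in."""
    s, f = kernel_version(shipped), kernel_version(fixed_in)
    return s[:2] == f[:2] and s >= f


print(includes_fix("4.4.151", "4.4.143"))  # 17.01 branch numbers from this thread
print(includes_fix("4.9.118", "4.9.112"))  # 18.06 branch numbers from this thread
print(includes_fix("4.9.102", "4.9.112"))  # the 4.9.102 kernel reported earlier
```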
