FS#1501 - DS-Lite is broken on snapshot with some models #6556

Closed
openwrt-bot opened this issue Apr 17, 2018 · 35 comments

@openwrt-bot

TimB87:

Hey everybody,

my provider forces me to use ds-lite to get connection to the v4 internet.

I first tried OpenWrt on an old Linksys E4200, which worked with any version, but the router ran at around 50% CPU utilization at all times and was very slow.
After upgrading to a Linksys WRT 3200 ACM, 17.01.4 works great!
The newest trunk, though, doesn't.

It also does not work on WRT 1900 ACS v2 and Netgear R7800 according to various forum posts.

I already gathered some information about it on the openwrt Forum -> https://forum.openwrt.org/viewtopic.php?id=73755

What else can I provide to make future versions work?

To me, it seems like there is a problem with odhcp6c and maybe dnsmasq? Plus, the script always produces "command not found" errors (even on the E4200), so maybe the script does not work flawlessly?

Thanks for your help!
Best regards,
Tim

@openwrt-bot

dedeckeh:

It would be helpful if you added the network config you're using.
The traces in the forum thread indicate that netifd fails to create the dynamically created wan6_4 interface, which is not related to the odhcp6c and/or dnsmasq packages.

@openwrt-bot

TimB87:

Hey,

this is my /etc/config/network
root@OpenWrt:~# cat /etc/config/network

config interface 'loopback'
    option ifname 'lo'
    option proto 'static'
    option ipaddr '127.0.0.1'
    option netmask '255.0.0.0'

config globals 'globals'
    option ula_prefix 'fd73:0f87:3075::/48'

config interface 'lan'
    option type 'bridge'
    option ifname 'eth0.1'
    option proto 'static'
    option ipaddr '192.168.1.1'
    option netmask '255.255.255.0'
    option ip6assign '60'

config interface 'wan'
    option ifname 'eth1.2'
    option proto 'dhcp'

config interface 'wan6'
    option ifname 'eth1.2'
    option proto 'dhcpv6'

config switch
    option name 'switch0'
    option reset '1'
    option enable_vlan '1'

config switch_vlan
    option device 'switch0'
    option vlan '1'
    option ports '0 1 2 3 5t'

config switch_vlan
    option device 'switch0'
    option vlan '2'
    option ports '4 6t'

I haven't configured my network, just installed dslite package via opkg and rebooted (which works fine on the same router with 17.01.4).

odhcp6c and dnsmasq was a rough guess. I really have no idea, I can just say that it looks like it is having a connection..

@openwrt-bot

dedeckeh:

I'm unable to reproduce the issue on a TP-Link TL-WDR4300.
I've attached a dslite.sh file with extra trace info for further troubleshooting.
Can you copy this file to /lib/netifd/proto and then run /etc/init.d/network restart?
Can you check the output of logread when dslite fails to come up?
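
A minimal sketch of the steps requested above, assuming the attached dslite.sh was first copied to /tmp on the router (that path and the helper name install_proto_script are illustrative, not from the thread). The helper also marks the handler executable, since a replaced script loses its mode bits when copied from another machine:

```sh
#!/bin/sh
# install_proto_script SRC DEST_DIR
# Copy a netifd protocol handler into DEST_DIR and mark it executable.
install_proto_script() {
    src=$1
    dest_dir=$2
    cp "$src" "$dest_dir/" || return 1
    chmod +x "$dest_dir/$(basename "$src")"
}

# On the router (assumed source path, adjust as needed):
#   install_proto_script /tmp/dslite.sh /lib/netifd/proto
#   /etc/init.d/network restart
#   logread | grep -e netifd -e wan6_4
```

The commented lines at the end are the on-device usage; they are left as comments so the block can be sourced safely on a machine without netifd.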

@openwrt-bot

TimB87:

Hey Hans,

many thanks for looking into it!

I did a fresh flash of the newest snapshot, installed dslite and luci and then rebooted, same behavior - then replaced the script with yours - again the same behavior.

I installed tmux to follow the logread when doing a network restart. I don't see much, but:

Wed Apr 25 03:03:19 2018 daemon.err odhcp6c[1944]: Failed to send RS (Address not available)
..
Wed Apr 25 03:03:19 2018 daemon.err odhcp6c[1944]: Failed to send DHCPV6 message to ff02::1:2 (Address not available)
..
Wed Apr 25 03:03:21 2018 daemon.notice netifd: Interface 'wan6_4' is now up
..
Wed Apr 25 03:03:22 2018 daemon.warn odhcpd[1813]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!

Edit: in Luci it now says that dslite-wan is disconnected. Plus I'm stupid. Forgot to chmod +x the script after replacing it.
Now Luci reports dslite-wan is connected, still no outside v4 connectivity.

It still says:
Wed Apr 25 03:10:15 2018 daemon.warn odhcpd[1814]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!

Edit #2:
those are the logs from a fresh boot with those changes:
System Log https://pastebin.com/iubxkbq2
Kernel Log https://pastebin.com/TF6dM6JA

It is fetching the correct AFTR server by itself, like it's supposed to.

@openwrt-bot

dedeckeh:

Hi,

Thanks for the logs, can you add the output of ifstatus wan6 ?

@openwrt-bot

TimB87:

Sure,

here you go https://pastebin.com/nkzy4QKV

@openwrt-bot

dedeckeh:

Can you repeat the test but make odhcpd more verbose by setting the loglevel to 7 (uci set dhcp.odhcpd.loglevel=7) ?
IPv6 wise the behavior seems ok as the traces indicate the dslite interface is coming up; there's a public IPv6 prefix assigned to the lan.

Can you also add the output of ip route show and ip addr show ?
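
For reference, uci set only stages a change; it reaches /etc/config/dhcp after uci commit (here, uci commit dhcp), and odhcpd picks it up after a restart or reboot. A sketch of what the committed section should then contain, assuming the default section name odhcpd and omitting the other default options in that section:

```
config odhcpd 'odhcpd'
	option loglevel '7'
```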

@openwrt-bot

TimB87:

Good morning :)

Sure. I just flashed the latest snapshot (built yesterday) and installed ds-lite (+luci), disabled masquerading of the firewall and set the loglevel to 7 with uci like you said, then rebooted.

Those are the logs:
Kernel https://pastebin.com/GRSSr2Du
System https://pastebin.com/2ehzdvga
ifstatus and ip output https://pastebin.com/DcQd8M7D

Thank you and have a nice Sunday!

@openwrt-bot

TimB87:

Hey Hans,

I am sorry, I didn't commit the changes with uci.
This is the syslog with debug info: https://pastebin.com/qBZfeiTw

Best regards,
Tim

@openwrt-bot

jpdribbler:

I just tried dslite again using the latest build from https://forum.lede-project.org/t/build-for-netgear-r7800/316 and, besides dslite not working, I was not able to get an IPv6 address at all (DHCPv6). Probably not helpful without logs, but FYI :)

Thanks Hans for checking.

@openwrt-bot

dedeckeh:

The logs don't show any obvious issue as the wan6_4 dslite interface is coming up and the required IPv4/6 routes are in place.
There's no reason to disable masquerading as it should work even with masquerading in place.

To troubleshoot it further, I would like you to launch an IPv4 ping to 8.8.8.8 from a LAN device.
While the ping is ongoing, I would like to see the output of

  • ifconfig

  • ifstatus wan6_4

  • iptables -t filter -L -vn

  • iptables -t nat -L -vn

  • ip6tables -t filter -L -vn
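
The commands listed above could be captured in one pass while the ping is running. This is only a sketch: the helper names run_diag and collect_dslite_diag and the output path are hypothetical, and the last command uses the ip6tables spelling.

```sh
#!/bin/sh
# run_diag CMD...: print a header naming the command, then its output
# (stdout and stderr), then a blank separator line.
run_diag() {
    printf '### %s\n' "$*"
    "$@" 2>&1
    printf '\n'
}

# Run the five requested commands in order.
collect_dslite_diag() {
    run_diag ifconfig
    run_diag ifstatus wan6_4
    run_diag iptables -t filter -L -vn
    run_diag iptables -t nat -L -vn
    run_diag ip6tables -t filter -L -vn
}

# On the router, while the LAN client is pinging 8.8.8.8:
#   collect_dslite_diag > /tmp/dslite-diag.txt
```

Collecting everything into a single file keeps the counters from all tables close together in time, which makes the packet counts easier to compare.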

@openwrt-bot

TimB87:

Hey Hans,

first off, on my Windows 10 client, I ran ping -t 8.8.8.8.
This is then taken from a fresh snapshot (03.05.), firewall untouched, odhcpd loglevel 7 and your edited dslite.sh. I set WAN(4) to unmanaged and rebooted:

Kernel Log https://pastebin.com/0W3SMqqA
System Log https://pastebin.com/fjwuxjVK
Commands https://pastebin.com/2HFxhVsY

Of course, the ping is not able to reach anything saying 'Request timed out.'

On a side note, for what it's worth, my provider switched me to Dual Stack yesterday (I have no idea why they are suddenly able to, but I won't protest). While DHCPv4 and v6 now work flawlessly, it seems that I can still use an AFTR to get another v4 address - I tested that with 17.01.4 - so I will still be able to deliver logs.

Thanks for taking your time and have a great weekend,

Tim

@openwrt-bot

dedeckeh:

Thanks for providing the requested traces. Unfortunately, I can't find any indication of what causes the observed problem: the iptables packet counters show the expected values and the interface status is as expected.
Would you be able to run tcpdump on eth1.2 (tcpdump -i eth1.2 -vvv)? I would like to check if we see IPv6 packets containing the ping request on the wan device.

@openwrt-bot

TimB87:

Sure, this is again with the most current snapshot, WAN unmanaged and dslite configured.
On one of my clients I ran ping on 8.8.8.8.

root@OpenWrt:~# ifconfig ds-wan6_4
ds-wan6_4 Link encap:UNSPEC HWaddr 2A-02-09-08-13-00-00-0C-00-00-00-00-00-00-00-00
inet addr:192.0.0.2 P-t-P:192.0.0.1 Mask:255.255.255.255
inet6 addr: fe80::943a:ecff:fe51:15b2/64 Scope:Link
UP POINTOPOINT RUNNING NOARP MTU:1280 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:183 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:200 (200.0 B) TX bytes:10912 (10.6 KiB)
root@OpenWrt:~# tcpdump -i eth1.2 -vvv -w ping-t8.8.8.8.output
tcpdump: listening on eth1.2, link-type EN10MB (Ethernet), capture size 262144 bytes
^C1393 packets captured
1470 packets received by filter
0 packets dropped by kernel
root@OpenWrt:~#

Please see the attached capture. Thanks :)

@openwrt-bot

dedeckeh:

The pcap capture shows the IPv4 ping is encapsulated as an IPv6 packet and sent on the wire.
However the remote end returns an ICMPv6 indicating an unrecognized next header type; I suspect the IPv6 destination option could be the problem.
Are you able to run the same test on a lede-17.01.4 build ? It would be interesting to see if the IPv6 destination option is included as well in the IPv6 packet containing the IPv4 ping

@openwrt-bot

TimB87:

Hey Hans,

please see the attached pcap capture, as well as some commands on console & Process/Kernel logs.
I was using your modified dslite.sh version on 17.01.4 for this. Hope this helps!

Best regards,
Tim

P.S.: I had to use eth1, I think 17.01.4 uses other vlan settings?

@openwrt-bot

dedeckeh:

The 17.01.4-ping.output traces confirm that the problem is triggered by the extra IPv6 Destination Option header which is present in the IPv6 packet containing the IPv4 ping packet.
This is a change in behavior due to the kernel upgrade to 4.14. As the remote end rejects IPv6 packets containing the IPv6 Destination Option header, is there a way you can raise this issue with your ISP?

@openwrt-bot

TimB87:

Hey Hans,

well, I could at least try, but I am not sure if they will reply to this.
So there is no way to change this behavior on my end?

What exactly can I tell them? "The Linux kernel changed its IPv6 behavior and now sends a destination option header which gets rejected by the AFTR, causing ds-lite to fail"?

@openwrt-bot

dedeckeh:

Hi,

At the moment there's no way to change this behavior on the device.

You could ask the ISP why the AFTR is unable to process IPv6 packets that have a destination option header containing the Tunnel Encapsulation Limit (RFC2473). This results in ds-lite connectivity issues, as ICMPv6 packets are returned indicating an unrecognized next header type.
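
To make the header in question concrete, here is a rough sketch of how it can be spotted in a raw packet. The constants come from RFC 2473 and the IANA registries rather than from this thread: Next Header 60 (0x3c) marks a Destination Options header, and option type 4 is the Tunnel Encapsulation Limit. The function name has_encap_limit is hypothetical, and the input is assumed to be a contiguous lowercase hex string that starts at the IPv6 header (no Ethernet header).

```sh
#!/bin/sh
# has_encap_limit HEX
# Succeeds if the IPv6 packet in HEX starts with a Destination Options
# header whose first option is the Tunnel Encapsulation Limit.
has_encap_limit() {
    hex=$1
    # Byte 6 of the fixed 40-byte header (hex chars 13-14) is Next Header.
    next_header=$(printf '%s\n' "$hex" | cut -c13-14)
    [ "$next_header" = "3c" ] || return 1
    # The extension header follows the fixed header; its third byte
    # (packet byte 42, hex chars 85-86) is the first option type.
    opt_type=$(printf '%s\n' "$hex" | cut -c85-86)
    [ "$opt_type" = "04" ]
}
```

In a packet without the option, byte 6 is 04 (IPv4 encapsulated directly in IPv6) and the function fails at the first check.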

@openwrt-bot

TimB87:

Hi Hans,

I have just sent an email to my provider (Unity Media) and am now waiting for their response. In my experience they respond rather slowly to emails, but I'll update this bug report as soon as I get a response.

Thanks a lot for the big help!

Best regards,
Tim

@openwrt-bot

LinuxfarmerHH:

MeToo

LEDE Reboot SNAPSHOT r5264-94491a1571 / LuCI Master (git-17.309.31241-676ce11)
TP-Link Archer C7 v2
ds-lite 7.2
odhcp6c 2017-09-05-51733a6d-4

/etc/config/network
config interface 'wan'
option proto 'dslite'
option peeraddr '2a02:2028:ff00::1:0:3b'

config interface 'wan6'
option ifname 'eth0.2'
option proto 'dhcpv6'
option reqaddress 'try'
option reqprefix 'auto'

config switch
option name 'switch0'
option reset '1'
option enable_vlan '1'

Heavy CPU load on the router; the connection has lost 2/3 of its throughput and pings take around 150 ms, but IPv4 and IPv6 are running.

Sat Jun 2 09:34:19 2018 daemon.notice netifd: wan6_4 (12540): Command failed: Unknown error
Sat Jun 2 09:34:19 2018 daemon.notice netifd: Interface 'wan6_4' is now down
Sat Jun 2 09:34:19 2018 daemon.notice netifd: Interface 'wan6_4' is setting up now
Sat Jun 2 09:34:20 2018 daemon.notice netifd: wan6_4 (12621): Command failed: Unknown error
Sat Jun 2 09:34:20 2018 daemon.notice netifd: Interface 'wan6_4' is now down
Sat Jun 2 09:34:20 2018 daemon.notice netifd: Interface 'wan6_4' is setting up now
Sat Jun 2 09:34:21 2018 daemon.notice netifd: wan6_4 (12682): Command failed: Unknown error
Sat Jun 2 09:34:21 2018 daemon.notice netifd: Interface 'wan6_4' is now down
Sat Jun 2 09:34:21 2018 daemon.notice netifd: Interface 'wan6_4' is setting up now
Sat Jun 2 09:34:22 2018 daemon.notice netifd: wan6_4 (12756): Command failed: Unknown error

Is there any kind of older ds-lite version that works better?

@openwrt-bot

TimB87:

Hi Eric,

this seems very much unrelated to this bug as your v4 connection is established.
Please make your own bug report about the problem you are facing.

On topic: My provider has not written back so far, and I think they will try to ignore the problem as long as nobody yells too loud. I'll give them a call and try to get somebody talking :)

Best regards,
Tim

@openwrt-bot

dedeckeh:

Hi Tim,

Meanwhile I'm working on a solution to make the inclusion of the Tunnel Encapsulation Limit configurable (https://git.openwrt.org/?p=openwrt/staging/dedeckeh.git;a=commit;h=45b0ad3bad35eac55d6436861dc82f55e7786def) to fix your connectivity issue; I will let you know when I've pushed the commits into master.

Hans

@openwrt-bot

TimB87:

Hey Hans,

thanks a lot!

master is the most current build, right? OK, give me a ping as soon as it's in, or I could set up a build environment for OpenWrt if you want me to test patches!

Have a great weekend, best regards
Tim

@openwrt-bot

dedeckeh:

Hi Tim,

I've pushed the patches to master to make the inclusion of the Tunnel Encapsulation Limit configurable. For your case, encaplimit_dslite needs to be set to ignore in the wan6 network section (uci set network.wan6.encaplimit_dslite=ignore; uci commit).
As a result, the destination option header should no longer be included in the dslite packets.
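
As a sketch of the result, the wan6 section in /etc/config/network would look roughly like this after the commit, assuming the same ifname and proto as in the network config posted earlier in this report:

```
config interface 'wan6'
	option ifname 'eth1.2'
	option proto 'dhcpv6'
	option encaplimit_dslite 'ignore'
```

A network restart (/etc/init.d/network restart) or a reboot is presumably needed before the wan6_4 tunnel is recreated with the new setting.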

Hans

@openwrt-bot

TimB87:

Good morning Hans,

sorry I didn't find the time to test during the week.
I can confirm that I get a working v4 connection with your changes!

Things that I noticed:

  1. The speed is borked somehow. v4 4.31 Mbit/s against v6 120 Mbit/s.
  2. Upon changing to dslite protocol the firewall settings are erased for that connection and (with luci) I can not set it again to the wan zone - maybe the root of the speed problem?

My provider so far has only written me a letter saying they tried to contact me (I don't see anything..).

Thanks a lot and best regards,

Tim

@openwrt-bot

dedeckeh:

Hi Tim,

It's not clear to me what you mean by the firewall settings being erased for that connection when changing to the dslite protocol.
The automatically created dslite interface inherits the zone of the wan6 interface unless zone_dslite is set in the network wan6 interface config, so there's no real need to set the firewall zone of the automatic dslite interface.

Hans

@openwrt-bot

tahiro86j:

Good morning Hans, from Japan...
I think I've seen your name somewhere else, and here you are again...
I personally support the inclusion of encapsulation limit option to ds-lite package, and given your hint for improvement, will also personally work on the same improvement for grev4 & grev6 packages as I am having issues with those protocols too (that need immediate correction).

@openwrt-bot

TimB87:

Hi Hans,

please see those two attached screenshots. I made them just now with the newest snapshot, speedtest is from http://ipv6-test.com/speedtest/

WAN is not in the 'WAN Firewall Zone' anymore, and I cannot change it (via LuCI); the corresponding fields are completely blank.
I re-added wan in /etc/config/firewall directly and rebooted. WAN is then back in the firewall zone, but the speed is still the same.

Best regards,
Tim

@openwrt-bot

dedeckeh:

Hi Tim,

Can you attach your /etc/config/network ?
I'm confused as I thought you were using the automatic wan6_4 interface while the attached screenshot seems to indicate this is not the case.

@openwrt-bot

TimB87:

Hi Hans,

you are right, I did it differently this way, sorry. I got my mind around things.
This is /etc/config/network according to the screenshots from one post prior:
http://dpaste.com/1VB3ZDF

I reflashed the latest snapshot and tried with setting wan to unmanaged:
This is the /etc/config/network:
http://dpaste.com/3S94Q4Z

wan is still not in the firewall wan zone. I attached two new screenshots.

Best regards,
Tim

Edit: I added another screenshot to be clear about LuCI: there is no longer an option to assign it to the corresponding firewall zone (for both the unmanaged and the manual ds-lite setup).

@openwrt-bot

dedeckeh:

Hi Tim,

The network config with wan set to proto dslite cannot work; you cannot have an automatic wan6_4 dslite interface together with a wan interface that has proto dslite (see also https://bugs.openwrt.org/index.php?do=details&task_id=1574).
I'm not a LuCI expert, but why are you still keeping the wan interface when it has no purpose in a dslite scenario?
As mentioned earlier, the wan6_4 interface will inherit the zone of the wan6 interface; this can be checked with the command ifstatus wan6_4.

@openwrt-bot

TimB87:

Hi Hans,

actually I had wan set to unmanaged and not ds-lite so it would automatically set itself up in my second run yesterday.
I tried deleting it but it made no difference.

Best regards,
Tim

@openwrt-bot

dedeckeh:

Hi Tim

I propose to close this bug report and log the dslite performance issue in a separate record. I assume you have been testing this on a Linksys WRT 3200 ACM ?
Unfortunately I cannot reproduce the performance issue on my setup

@openwrt-bot

TimB87:

Hi Hans,

I agree on moving to another bug report, since the initial problem is solved, and I am willing to do some more testing for it - maybe it's another provider specific specialty?

I'll open a new report after this post.

So far, thank you very much for your work!

Have a nice weekend,
Tim

P.S.: OK, change of plans, some script just ate my post on committing; I'll add a bug report tomorrow O:)
