FS#1259 - PPPoE disconnects under high upstream load #5826

openwrt-bot · 2018-01-06T17:48:42Z

llucz:

Hello,

I'm using an OpenWrt (Barrier Breaker r39582) TP-Link router to connect to the Internet through a DSL modem, via PPPoE.

Everything is fine except under high upstream load, i.e., when I saturate my upstream bandwidth. In these conditions, the OpenWrt router is unable to receive the modem's replies to PPPoE LCP echo-requests and terminates the PPP connection. This behavior occurs only under heavy upstream traffic and is pretty deterministic.

This defect is also witnessed by the following log excerpt:
Sat Jan 6 17:47:13 2018 daemon.info dnsmasq[13213]: using nameserver 8.8.4.4#53
Sat Jan 6 17:47:13 2018 daemon.info dnsmasq[13213]: using nameserver 8.8.8.8#53
Sat Jan 6 17:47:13 2018 daemon.info dnsmasq[13213]: using local addresses only for domain lan
Sat Jan 6 17:50:01 2018 daemon.info pppd[12884]: No response to 6 echo-requests
Sat Jan 6 17:50:01 2018 daemon.notice pppd[12884]: Serial link appears to be disconnected.
Sat Jan 6 17:50:01 2018 daemon.info pppd[12884]: Connect time 2.9 minutes.
Sat Jan 6 17:50:01 2018 daemon.info pppd[12884]: Sent 53819998 bytes, received 1176989 bytes

I've tried adjusting the pppd lcp-echo-interval and lcp-echo-failure parameters, but I was unable to solve the problem through these settings.

I guess the correct solution would be to "prioritize" (QoS?) PPP LCP echoes, but maybe there is a simpler way to do it.

Thanks!


openwrt-bot · 2018-01-06T18:55:47Z

moeller0:

So a stop-gap measure would be to install sqm-scripts and luci-app-sqm and instantiate a shaper on pppoe-wan with a bandwidth slightly below the real available bandwidth. The shaper on pppoe-wan will never actually see the LCP packets and hence will not be able to drop them, and by setting the bandwidth slightly below the available rate there should always be enough PPPoE bandwidth to accommodate the LCP packets.
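
To make "slightly below" concrete, here is a minimal Python sketch that derives a shaper rate from a measured upstream rate. The helper name `shaper_rate_kbit` and the 5% margin are assumptions for illustration only, not values taken from sqm-scripts; the right margin depends on the link.

```python
def shaper_rate_kbit(measured_kbit: int, margin_percent: int = 5) -> int:
    """Hypothetical helper: upstream shaper rate that leaves some headroom.

    measured_kbit  -- measured real upstream rate in kbit/s
    margin_percent -- headroom kept free for traffic the shaper never sees
                      (such as LCP echoes); 5 is an illustrative guess
    """
    if not 0 < margin_percent < 100:
        raise ValueError("margin_percent must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding surprises.
    return measured_kbit * (100 - margin_percent) // 100


print(shaper_rate_kbit(10000))  # 9500
```

The result would then be entered as the upload rate of the shaper instance on pppoe-wan.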

openwrt-bot · 2018-01-14T11:36:53Z

weedy:

This isn't a bug, it's "by design".

Since you're flooding your upload buffers on the router and modem, the maintenance packets are taking too long and pppd thinks you have a dead link. LEDE borrowed some patches from Debian (I think) that will skip this issue for you.

You should still use QoS for interactivity reasons.

# Adaptive/Smart keepalive
option keepalive_adaptive 1 
## lcp_failure,lcp_interval
option keepalive "7,3"

openwrt-bot · 2018-03-11T08:41:53Z

nbd:

The relevant option is already set by default since some time around 2014.

openwrt-bot · 2018-08-25T13:59:09Z

cbz:

I'm not sure the option is always set. The pppd option it maps to is lcp-echo-adaptive. This isn't a default option for pppd, and it's trivial to see that OpenWrt launches pppd without this option set (check the ps output or run strings on /proc/pppd-id/cmdline).
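
For anyone who wants to script that check, a small Python sketch follows. It assumes the usual Linux layout in which the cmdline file under /proc holds the arguments separated by NUL bytes (which is why strings is needed to read it). The helper name and the sample command line are made up for illustration.

```python
def cmdline_has_option(raw_cmdline: bytes, option: str) -> bool:
    """Return True if `option` is one of the NUL-separated arguments."""
    args = [a.decode(errors="replace") for a in raw_cmdline.split(b"\x00") if a]
    return option in args


# Hypothetical pppd command line, encoded the way /proc presents it.
sample = b"\x00".join(
    [b"/usr/sbin/pppd", b"nodetach", b"lcp-echo-failure", b"5"]
) + b"\x00"

print(cmdline_has_option(sample, "lcp-echo-failure"))   # True
print(cmdline_has_option(sample, "lcp-echo-adaptive"))  # False
```

On a router, the bytes would come from reading the cmdline file of the running pppd process in binary mode.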

openwrt-bot · 2018-08-25T14:17:12Z

cbz:

This also seems to be what the logic here does:

https://github.com/openwrt/openwrt/blob/master/package/network/services/ppp/files/ppp.sh#L123
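
The linked script is not reproduced here. As a rough Python sketch of the kind of mapping being discussed, the function below turns the UCI-style keepalive settings into pppd options. The following are assumptions, not confirmed by this thread: the keepalive string is split on a comma or a space, a failure threshold below 1 means no LCP echo options are passed at all (so lcp-echo-adaptive is dropped too), and keepalive_adaptive counts as enabled unless set to 0. The function name is hypothetical. For context, the pppd manual describes lcp-echo-adaptive as sending echo-requests only when no traffic was received from the peer since the last echo-request.

```python
import re


def pppd_keepalive_args(keepalive=None, keepalive_adaptive=1):
    """Hypothetical helper: map UCI-style keepalive settings to pppd options."""
    failure, interval = 0, 5  # threshold 0 / interval 5, the defaults shown in LuCI
    parts = [p for p in re.split(r"[,\s]+", keepalive or "") if p]
    if parts:
        failure = int(parts[0])
    if len(parts) > 1:
        interval = int(parts[1])
    if failure < 1:
        # Assumption: without a positive threshold, nothing is passed to pppd,
        # so the adaptive flag never appears on the command line either.
        return []
    args = ["lcp-echo-interval", str(interval), "lcp-echo-failure", str(failure)]
    if keepalive_adaptive >= 1:
        args.append("lcp-echo-adaptive")
    return args


print(pppd_keepalive_args("7,3"))
print(pppd_keepalive_args())  # no keepalive configured
```

Under these assumptions, an interface without a keepalive option would start pppd with no lcp-echo-adaptive argument, which would be consistent with the ps observation above.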

openwrt-bot · 2018-08-25T18:17:56Z

bill888:

FWIW, I witnessed this problem when I started using a BT Home Hub 5A on a 55/10 Mbps VDSL2 connection a year ago. Whenever the 10 Mbps upstream was saturated, the DSL connection would disconnect and 'No response to 5 echo-requests' would be reported in the system log.

In LuCI->Network->Interfaces->WAN->Advanced Settings, there are two default settings:

LCP echo failure threshold 0
LCP echo interval 5

The 'default' LCP echo failure threshold of zero implies that all LCP echo failures would be ignored (see attached image), but I found this to be incorrect/misleading.

The solution was to specify a non-zero value in LuCI, e.g.:
LCP echo failure threshold 5
LCP echo interval 5

A non-zero value ADDS the option keepalive to /etc/config/network:


config interface 'wan'
       option keepalive '5 5'

This increased the timeout to 25 seconds and resolved my DSL disconnection issues.
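
The 25 seconds is simply the failure threshold multiplied by the echo interval. A tiny Python sketch of that arithmetic (the function name is made up), which also covers the "7,3" setting suggested earlier in the thread. It gives the time without any echo reply before pppd gives up and ignores the effect of lcp-echo-adaptive.

```python
def keepalive_timeout_s(failure: int, interval_s: int) -> int:
    """Seconds of unanswered echo-requests before pppd drops the link."""
    return failure * interval_s


print(keepalive_timeout_s(5, 5))  # 25  (option keepalive '5 5')
print(keepalive_timeout_s(7, 3))  # 21  (option keepalive "7,3")
```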

LuCI doesn't appear to be able to manage this keepalive_adaptive option.

openwrt-bot closed this as completed Mar 11, 2018
