OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Low
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by damocles-git - 26.07.2019

FS#2410 - TD-W8980: Kernel panic - not syncing: Fatal exception in interrupt

A few times a day the ADSL2+ connection fails to reconnect because something in the driver seems to stop working. I have an older router (Cisco), and it reconnects without any trouble.

Restarting the interface with the br2684ctl init script causes a kernel panic.

DEVICE="TP-Link TD-W8980"   
VERSION="18.06.4"
BUILD_ID="r7808-ef686b7292"
LEDE_BOARD="lantiq/xrx200"
LEDE_ARCH="mips_24kc"
Fri Jul 26 19:53:54 2019 daemon.notice pppd[32180]: pppd 2.4.7 started by root, uid 0
Fri Jul 26 19:53:54 2019 daemon.debug pppd[32180]: Send PPPOE Discovery V1T1 PADI session 0x0 length 4
Fri Jul 26 19:53:54 2019 daemon.debug pppd[32180]:  dst ff:ff:ff:ff:ff:ff  src 30:b5:c2:9d:c4:7e
Fri Jul 26 19:53:54 2019 daemon.debug pppd[32180]:  [service-name]
Fri Jul 26 19:53:59 2019 daemon.debug pppd[32180]: Send PPPOE Discovery V1T1 PADI session 0x0 length 4
Fri Jul 26 19:53:59 2019 daemon.debug pppd[32180]:  dst ff:ff:ff:ff:ff:ff  src 30:b5:c2:9d:c4:7e
Fri Jul 26 19:53:59 2019 daemon.debug pppd[32180]:  [service-name]
Fri Jul 26 19:54:04 2019 daemon.debug pppd[32180]: Send PPPOE Discovery V1T1 PADI session 0x0 length 4
Fri Jul 26 19:54:04 2019 daemon.debug pppd[32180]:  dst ff:ff:ff:ff:ff:ff  src 30:b5:c2:9d:c4:7e
Fri Jul 26 19:54:04 2019 daemon.debug pppd[32180]:  [service-name]
Fri Jul 26 19:54:09 2019 daemon.warn pppd[32180]: Timeout waiting for PADO packets
Fri Jul 26 19:54:09 2019 daemon.err pppd[32180]: Unable to complete PPPoE Discovery
Fri Jul 26 19:54:09 2019 daemon.info pppd[32180]: Exit.
root@OpenWrt:~# /etc/init.d/br2684ctl
[...]
[  888.844371] CPU 0 Unable to handle kernel paging request at virtual address 0000000c, epc == 82cce5b0, ra == 819e07a0
[  888.853829] Oops[#1]:
[  888.856086] CPU: 0 PID: 3338 Comm: sh Not tainted 4.9.184 #0
[  888.861738] task: 83a88bc0 task.stack: 81534000
[  888.866255] $ 0   : 00000000 7f8bfef0 80a08b40 82cce5a0
[  888.871476] $ 4   : 8271dc00 80a08b40 0000f6be 00000001
[  888.876697] $ 8   : 83806000 00001ff0 8003256c 7f8bfe64
[  888.881920] $12   : 7f8c0058 00000000 00000000 774b42c0
[  888.887145] $16   : 00000000 00000001 819e6b88 00000000
[  888.892365] $20   : 00000001 819e6b9c 0000000f 819e0000
[  888.897587] $24   : 0045bbe4 00000000                  
[  888.902809] $28   : 81534000 83807eb0 00000001 819e07a0
[  888.908034] Hi    : 00000018
[  888.910903] Lo    : 00000002
[  888.913857] epc   : 82cce5b0 0x82cce5b0 [br2684@82cce000+0x19f0]
[  888.919803] ra    : 819e07a0 0x819e07a0 [ltq_atm_vr9@819e0000+0x6e00]
[  888.926227] Status: 1100ff02 KERNEL EXL 
[  888.930145] Cause : 00800008 (ExcCode 02)
[  888.934147] BadVA : 0000000c
[  888.937023] PrId  : 00019556 (MIPS 34Kc)
[  888.940933] Modules linked in: ltq_atm_vr9 ath9k ath9k_common ath9k_hw ath pppoe nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_DSCP xt_CT xt_CLASSIFY pppox ppp_async owl_loader nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack_netlink ltq_deu_vr9 iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat fuse sch_cake act_connmark nf_conntrack act_skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_tbf sch_htb sch_hfsc sch_ingress drv_dsl_cpe_api ledtrig_usbport drv_mei_cpe xt_set ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables nfsv4 nfsv3 pppoatm ppp_generic slhc
 nfsd nfs ifb rpcsec_gss_krb5 auth_rpcgss oid_registry tun lockd sunrpc grace autofs4 dns_resolver br2684 atm exportfs drv_ifxos sha1_generic md5 hmac ecb des_generic cts cbc usb_storage uhci_hcd ohci_platform ohci_hcd sd_mod scsi_mod ext4 jbd2 mbcache crc32c_generic dwc2 gpio_button_hotplug
Process sh (pid: 3338, threadinfo=81534000, task=83a88bc0, tls=774b5dc0)
[  889.090467] Stack : 00000000 805288a0 00000000 80075714 00000011 819e07a0 8383a354 8108fb80
[  889.098820]         8383a354 00000000 00020000 819e1d30 8052f3f8 00000000 00000001 00407b39
[  889.107178]         80524d40 800738a4 805331c0 00000000 819e39d4 819e39d0 8052f3f8 8052f0a0
[  889.115532]         8057b200 00010000 00000002 00000000 00000100 80033600 774aa000 00000000
[  889.123888]         00000000 800739d8 81468000 00000007 80524058 00000040 806841e0 00000006
[  889.132244]         ...
[  889.134680] Call Trace:
[  889.136961] [<80075714>] 0x80075714
[  889.140450] [<819e07a0>] 0x819e07a0 [ltq_atm_vr9@819e0000+0x6e00]
[  889.146612] [<819e1d30>] 0x819e1d30 [ltq_atm_vr9@819e0000+0x6e00]
[  889.152627] [<800738a4>] 0x800738a4
[  889.156111] [<80033600>] 0x80033600
[  889.159589] [<800739d8>] 0x800739d8
[  889.163072] [<80032b74>] 0x80032b74
[  889.166553] [<80032f54>] 0x80032f54
[  889.170034] [<80072d8c>] 0x80072d8c
[  889.173515] [<8000307c>] 0x8000307c
[  889.176996] [<8000c4ec>] 0x8000c4ec
[  889.180479] [<8000a9e8>] 0x8000a9e8
[  889.183950] 
[  889.185425] Code: afbf0014  afb00010  8c9002f8 <8e02000c> 0040f809  00000000  3c028053  8c42f0ac  7c420400 
[  889.195176] 
[  889.196816] ---[ end trace 8e8def78ee4b4969 ]---
[  889.207641] Kernel panic - not syncing: Fatal exception in interrupt
[  889.217636] Rebooting in 3 seconds..
ATU-C Vendor ID:                          FF,B5,47,53,50,4E,00,10
ATU-C System Vendor ID:                   00,00,30,30,30,30,00,00
Chipset:                                  Lantiq-VRX200
Firmware Version:                         5.8.0.11.1.1
API Version:                              4.17.18.6
XTSE Capabilities:                        0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0
Annex:                                    A
Line Mode:                                G.992.5 (ADSL2+)
Profile:                                  
Line State:                               UP [0x801: showtime_tc_sync]
Forward Error Correction Seconds (FECS):  Near: 0 / Far: 0
Errored seconds (ES):                     Near: 2 / Far: 0
Severely Errored Seconds (SES):           Near: 0 / Far: 0
Loss of Signal Seconds (LOSS):            Near: 0 / Far: 0
Unavailable Seconds (UAS):                Near: 34 / Far: 34
Header Error Code Errors (HEC):           Near: 0 / Far: 0
Non Pre-emtive CRC errors (CRC_P):        Near: 0 / Far: 0
Pre-emtive CRC errors (CRCP_P):           Near: 0 / Far: 0
Power Management Mode:                    L0 - Synchronized
Latency [Interleave Delay]:               0.25 ms [Fast]   0.25 ms [Fast]
Data Rate:                                Down: 12.287 Mb/s / Up: 1.023 Mb/s
Line Attenuation (LATN):                  Down: 18.4 dB / Up: 7.3 dB
Signal Attenuation (SATN):                Down: 16.8 dB / Up: 0.0 dB
Noise Margin (SNR):                       Down: 19.1 dB / Up: 10.3 dB
Aggregate Transmit Power (ACTATP):        Down: 18.4 dB / Up: 10.4 dB
Max. Attainable Data Rate (ATTNDR):       Down: 20.132 Mb/s / Up: 1.056 Mb/s
Line Uptime Seconds:                      1273
Line Uptime:                              21m 13s
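
The `epc` and `ra` lines of the trace above carry bracketed annotations such as `[br2684@82cce000+0x19f0]`. Assuming these mean module name, load base, and module size (an interpretation, not something stated in the report), the offset of each address inside its module can be worked out even without kernel symbols. The following Python sketch does that; the regular expression and the function name `module_offsets` are illustrative only.

```python
import re

# Assumed annotation layout: [module@load_base+0xsize], all values hexadecimal.
ANNOTATED = re.compile(
    r"\b(?P<reg>epc|ra)\s*:\s*(?P<addr>[0-9a-f]{8})\s+0x[0-9a-f]+\s+"
    r"\[(?P<mod>\w+)@(?P<base>[0-9a-f]+)\+0x(?P<size>[0-9a-f]+)\]"
)

def module_offsets(oops_text):
    """Map 'epc'/'ra' to (module, offset) for every annotated register line."""
    result = {}
    for m in ANNOTATED.finditer(oops_text):
        addr = int(m["addr"], 16)
        base = int(m["base"], 16)
        size = int(m["size"], 16)
        offset = addr - base
        # Only accept addresses that really fall inside the module.
        if 0 <= offset < size:
            result[m["reg"]] = (m["mod"], offset)
    return result

# The two relevant lines, copied from the 4.9.184 trace in this report.
SAMPLE = """\
[  888.913857] epc   : 82cce5b0 0x82cce5b0 [br2684@82cce000+0x19f0]
[  888.919803] ra    : 819e07a0 0x819e07a0 [ltq_atm_vr9@819e0000+0x6e00]
"""

offsets = module_offsets(SAMPLE)
for reg, (mod, off) in offsets.items():
    print(f"{reg}: {mod}+{off:#x}")
```

Under that assumption the fault lies at `br2684+0x5b0`, reached from `ltq_atm_vr9+0x7a0`. Those offsets could then be compared against a disassembly of the same module builds.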
Project Manager
Hauke Mehrtens commented on 07.08.2019 18:59

Could you please try this with master? This log lacks the kernel symbols, which makes it hard to decode. It looks like you are using an R8 DSL FW (version 5.8.X); this needs a special setting in the driver, because the FW message API is not compatible with the R7 API.
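
Even without symbols, the faulting instruction marked with `<...>` in the `Code:` line of the trace can be decoded by hand. The Python sketch below decodes only the MIPS32 `lw` (load word) encoding, which is enough for the two words of interest here; the helper names `decode_lw` and `describe` are illustrative, and the register value is copied from the `$16` entry of the dump.

```python
# Register names in MIPS o32 ABI order; the list index is the register number.
REGS = ("zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 "
        "s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra").split()

LW_OPCODE = 0x23  # MIPS32 "load word" major opcode

def decode_lw(word):
    """Decode a 32-bit word as `lw rt, imm(rs)`; return None for anything else."""
    if word >> 26 != LW_OPCODE:
        return None
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return REGS[rt], imm, REGS[rs]

def describe(word):
    decoded = decode_lw(word)
    if decoded is None:
        return f"{word:08x}: not a lw instruction"
    rt, imm, rs = decoded
    return f"{word:08x}: lw {rt}, {imm}({rs})"

# Word before the fault and the faulting word, from the 4.9.184 "Code:" line.
print(describe(0x8C9002F8))   # 8c9002f8: lw s0, 760(a0)
print(describe(0x8E02000C))   # 8e02000c: lw v0, 12(s0)

# $16 (s0) is 00000000 in the register dump, so the load address is 0 + 12.
s0_from_dump = 0x00000000
fault_address = (s0_from_dump + decode_lw(0x8E02000C)[1]) & 0xFFFFFFFF
print(f"computed fault address: {fault_address:08x}")   # 0000000c
```

The computed address matches the reported `BadVA : 0000000c`, so the crash is a load through a NULL pointer held in `s0` at offset 12. The next word, `0040f809`, encodes `jalr v0`, which suggests (but does not prove) a call through a function pointer read from a structure whose pointer was NULL.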

damocles-git commented on 26.08.2019 18:39

It also happens with the more recent snapshot. Are there any instructions I can follow to debug this?

# uname -a
Linux OpenWrt 4.19.66 #0 SMP Sat Aug 24 00:55:33 2019 mips GNU/Linux
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r10850-921675a2d1'
DISTRIB_TARGET='lantiq/xrx200'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r10850-921675a2d1'
DISTRIB_TAINTS=''
ATU-C Vendor ID:                          FF,B5,47,53,50,4E,00,10
ATU-C System Vendor ID:                   00,00,30,30,30,30,00,00
Chipset:                                  Lantiq-VRX200
Firmware Version:                         5.8.0.11.1.1
API Version:                              4.17.18.6
XTSE Capabilities:                        0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0
Annex:                                    A
Line Mode:                                G.992.5 (ADSL2+)
Profile:                                  
Line State:                               UP [0x801: showtime_tc_sync]
Forward Error Correction Seconds (FECS):  Near: 0 / Far: 3817664192
Errored seconds (ES):                     Near: 7 / Far: 4
Severely Errored Seconds (SES):           Near: 0 / Far: 0
Loss of Signal Seconds (LOSS):            Near: 0 / Far: 0
Unavailable Seconds (UAS):                Near: 76 / Far: 76
Header Error Code Errors (HEC):           Near: 38 / Far: 16
Non Pre-emtive CRC errors (CRC_P):        Near: 0 / Far: 0
Pre-emtive CRC errors (CRCP_P):           Near: 0 / Far: 0
Power Management Mode:                    L0 - Synchronized
Latency [Interleave Delay]:               0.25 ms [Fast]   0.25 ms [Fast]
Data Rate:                                Down: 12.287 Mb/s / Up: 1.015 Mb/s
Line Attenuation (LATN):                  Down: 18.6 dB / Up: 7.6 dB
Signal Attenuation (SATN):                Down: 17.0 dB / Up: 7.6 dB
Noise Margin (SNR):                       Down: 17.7 dB / Up: 10.0 dB
Aggregate Transmit Power (ACTATP):        Down: 18.4 dB / Up: 1.5 dB
Max. Attainable Data Rate (ATTNDR):       Down: 19.048 Mb/s / Up: 1.048 Mb/s
Line Uptime Seconds:                      1204
Line Uptime:                              20m 4s
# /etc/init.d/br2684ctl restart
root@OpenWrt:/# [184148.220316] CPU 0 Unable to handle kernel paging request at virtual address 0000000c, epc == 8337e678, ra == 82aa1d50
[184148.229601] Oops[#1]:
[184148.231945] CPU: 0 PID: 1012 Comm: netifd Not tainted 4.19.66 #0
[184148.238022] $ 0   : 00000000 00000001 8337e668 82ad79c0
[184148.243331] $ 4   : 82fa0800 82ad79c0 00000040 00000002
[184148.248639] $ 8   : 8380c000 00001ff0 77f870b0 77f87140
[184148.253949] $12   : 7fa3f2f8 77ee6030 77f08ca0 80002001
[184148.259258] $16   : 00000000 82aa69dc 00000001 00000001
[184148.264567] $20   : 82aa6bbc 82aa0000 00000001 00000000
[184148.269877] $24   : 00426ff4 00000000                  
[184148.275186] $28   : 82fe8000 8380dea8 0000000a 82aa1d50
[184148.280497] Hi    : 00000000
[184148.283457] Lo    : 02dc4000
[184148.286456] epc   : 8337e678 0x8337e678
[184148.290344] ra    : 82aa1d50 0x82aa1d50
[184148.294249] Status: 1100ff02	KERNEL EXL 
[184148.298255] Cause : 00800008 (ExcCode 02)
[184148.302344] BadVA : 0000000c
[184148.305306] PrId  : 00019556 (MIPS 34Kc)
[184148.309304] Modules linked in: nf_conntrack_netlink nfnetlink ltq_atm_vr9 ath9k ath9k_common iptable_nat ipt_MASQUERADE ath9k_hw ath xt_state xt_nat xt_conntrack xt_REDIRECT xt_FLOWOFFLOAD xt_CT pppoe nf_nat_ipv4 nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG pppox ppp_async owl_loader nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 ltq_deu_vr9 iptable_mangle iptable_filter ip_tables crc_ccitt compat drv_dsl_cpe_api ledtrig_usbport drv_mei_cpe nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 pppoatm ppp_generic slhc br2684 atm drv_ifxos dwc2 gpio_button_hotplug
[184148.376785] Process netifd (pid: 1012, threadinfo=3bd76b73, task=2694addf, tls=77f90ee8)
[184148.384942] Stack : c0000000 ffffffc0 808539a0 00000000 00000009 82aa1d50 8063f748 80640000
[184148.393384]         00000001 806b54a0 f0000000 80850000 c0000000 80850000 77f86000 8302a1e0
[184148.401826]         8302a21c 800807ac 8380df18 803e0ca4 82aa37e8 00000002 00000000 8109c234
[184148.410269]         806bdfe4 807158a0 00000040 807102a0 0000000a 80039754 8380df20 8380df20
[184148.418712]         806b004c 806b54a0 806b0058 00000040 00000007 00000006 00000100 806bdfe4
[184148.427155]         ...
[184148.429690] Call Trace:
[184148.432227] [<8337e678>] 0x8337e678
[184148.435792] Code: afbf0014  afb00010  8c900320 <8e02000c> 0040f809  00000000  c2020034  24430001  e2030034 
[184148.445608] 
[184148.447358] ---[ end trace e8b15f6988b1366c ]---
[184148.453057] Kernel panic - not syncing: Fatal exception in interrupt
[184148.458951] Rebooting in 3 seconds..
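
The two traces in this report (4.9.184 and 4.19.66) can be compared by their fault signature: the `BadVA` value and the instruction word marked `<...>` in the `Code:` line. The following Python sketch extracts both; the function name `oops_signature` is illustrative, and the sample text is trimmed to the two relevant lines of each trace.

```python
import re

def oops_signature(text):
    """Return (bad_va, faulting_word) from a MIPS oops dump as lowercase hex."""
    bad_va = re.search(r"BadVA\s*:\s*([0-9a-fA-F]{8})", text)
    fault = re.search(r"Code:.*?<([0-9a-fA-F]{8})>", text)
    if not bad_va or not fault:
        raise ValueError("not a recognisable oops dump")
    return bad_va.group(1).lower(), fault.group(1).lower()

# Lines copied from the 18.06.4 (kernel 4.9.184) trace.
OOPS_4_9 = """\
[  888.934147] BadVA : 0000000c
[  889.185425] Code: afbf0014  afb00010  8c9002f8 <8e02000c> 0040f809  00000000  3c028053  8c42f0ac  7c420400
"""

# Lines copied from the snapshot (kernel 4.19.66) trace.
OOPS_4_19 = """\
[184148.302344] BadVA : 0000000c
[184148.435792] Code: afbf0014  afb00010  8c900320 <8e02000c> 0040f809  00000000  c2020034  24430001  e2030034
"""

sig_old = oops_signature(OOPS_4_9)
sig_new = oops_signature(OOPS_4_19)
print(sig_old, sig_new, sig_old == sig_new)
```

Both traces give the same signature, so the snapshot appears to hit the same fault as 18.06.4. The word just before the fault differs only in its low 16 bits (`8c9002f8` versus `8c900320`), which would be consistent with a structure offset having shifted between kernel versions, though that reading is an assumption.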
Project Manager
Hauke Mehrtens commented on 22.09.2019 21:29

Could you please post your network configuration, mainly /etc/config/network without the passwords. Is this reproducible without a working ADSL connection?

damocles-git commented on 27.10.2019 10:27

Thank you for taking the time to look at my case.

This is the relevant configuration:

config interface 'wan'
        option ifname 'dsl0'
        option proto 'pppoe'
        option auto '1'
        option username '<user>'
        option password '<password>'
        option delegate '0'
        option ipv6 '0'

Unfortunately, the panic seems network related and I am unable to reproduce it without the ADSL line connected.

It works fine for some hours, and manually restarting the interface during that time doesn't trigger the panic. Later, when the router loses communication with the DSLAM and the interface is restarted, it panics.

My current Cisco router, which I'm trying to replace with OpenWrt, just restarts the connection process and works fine for months:

Cisco877 uptime is 3 days, 4 hours, 2 minutes
ATM0 is up, line protocol is up 
  Hardware is MPC ATMSAR (with Alcatel ADSL Module)
  MTU 4470 bytes, sub MTU 4470, BW 12000 Kbit/sec, DLY 360 usec, 
     reliability 255/255, txload 4/255, rxload 208/255
  Encapsulation ATM, loopback not set
  Encapsulation(s): AAL5  AAL2, PVC mode
  10 maximum active VCs, 1024 VCs per VP, 1 current VCCs
  VC Auto Creation Disabled.
  VC idle disconnect time: 300 seconds
  Last input 00:00:26, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/545/0 (size/max/drops/flushes); Total output drops: 16327530
  Queueing strategy: Per VC Queueing
  5 minute input rate 9799000 bits/sec, 879 packets/sec
  5 minute output rate 208000 bits/sec, 281 packets/sec
     240550040 packets input, 3066694825 bytes, 626 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 29 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     111084352 packets output, 3609541017 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out

Is there anything else I can try?
