Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FS#737 - Kernel panic on Buffalo WZR600DHP when LAN plugged at boot time, unaligned access #6080

Closed
openwrt-bot opened this issue Apr 25, 2017 · 7 comments
Labels

Comments

@openwrt-bot
Copy link

russell:

Supply the following if possible:

  • Device problem occurs on

Buffalo WZR600DHP (multiple different devices)

  • Software versions of LEDE release, packages, etc.

All recent versions tested, including "Reboot (SNAPSHOT, r3939+1-de5b8e5d2f)" ... currently building HEAD to try there

  • Steps to reproduce
  • Remove power (warm boots seem more likely to boot successfully)
  • Plug a powered-on ethernet-connected device into one of the LAN ports (I used port 2)
  • Apply power to the Buffalo, watch as the red diagnostic light goes on steadily, blinks fast, blinks slower, then goes off without the WLAN leds turning on normally, then repeat indefinitely. Eventually, one of the reboots may succeed.

A panic looks like this:

[ 20.142060] eth0: link up (1000Mbps/Full duplex)
[ 20.148486] Unhandled kernel unaligned access[#1]:
[ 20.153285] CPU: 0 PID: 960 Comm: basename Not tainted 4.4.59 #0
[ 20.159280] task: 87cf1e30 ti: 86c9c000 task.ti: 86c9c000
[ 20.164661] $ 0 : 00000000 00000001 00000000 00000001
[ 20.169907] $ 4 : 871193ff 87311068 800f3538 87311068
[ 20.175154] $ 8 : 87311068 80065058 7701b000 00002000
[ 20.180402] $12 : 00000000 00003000 fffffffc 00000000
[ 20.185649] $16 : 87119980 871193ff 87cf1e30 76f870d0
[ 20.190896] $20 : 000a1f10 7701bf6f 7701cdcc 86c6d1e0
[ 20.196143] $24 : 00000000 800f3538
[ 20.201391] $28 : 86c9c000 86c9dda8 86c9ddf0 8010ba2c
[ 20.206638] Hi : 00000000
[ 20.209505] Lo : 00000008
[ 20.212393] epc : 80099ad8 commit_creds+0x3c/0x20c
[ 20.217356] ra : 8010ba2c install_exec_creds+0x18/0x38
[ 20.222735] Status: 1000fc03 KERNEL EXL IE
[ 20.226936] Cause : 00800010 (ExcCode 04)
[ 20.230935] BadVA : 871193ff
[ 20.233802] PrId : 00019374 (MIPS 24Kc)
[ 20.237714] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[ 20.319746] Process basename (pid: 960, threadinfo=86c9c000, task=87cf1e30, tls=778c5d48)
[ 20.327896] Stack : 86c8d634 7701bf6f 7701cdcc 86c6d1e0 86c9ddf0 800746d0 86c9ddf0 800f5f14
86c6c400 87119a80 86c6c400 8010ba2c fffffff2 87119a80 86c6c400 76f870d0
00000000 8014596c 00000000 800f6d60 00000000 800763c0 00002012 00000091
55555554 80410000 0045a000 00000008 0045a535 0045a000 004497c0 00400000
76f78000 00400000 86c6c500 87119f00 86c6c300 00000000 00000003 00000003
...
[ 20.363640] Call Trace:
[ 20.366082] [<80099ad8>] commit_creds+0x3c/0x20c
[ 20.370700] [<8010ba2c>] install_exec_creds+0x18/0x38
[ 20.375752] [<8014596c>] load_elf_binary+0xae0/0x112c
[ 20.380794] [<8010bad8>] search_binary_handler+0x8c/0x1cc
[ 20.386189] [<8010d09c>] do_execveat_common+0x400/0x5ac
[ 20.391400]
[ 20.392880]
Code: 8c820000 28420001 00020336 24420001 e2220000 1040fffc 00000000 8e260014
[ 20.402936] ---[ end trace f860d77d4ae98434 ]---
[ 20.409602] Fatal exception: panic in 5 seconds
[ 20.414551] CPU 0 Unable to handle kernel paging request at virtual address 00000200, epc == 87fe0254, ra == 87fe1ce8
[ 20.425372] Oops[#2]:
[ 20.427647] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G D 4.4.59 #0
[ 20.434942] task: 87c2ca10 ti: 87c44000 task.ti: 87c44000
[ 20.440323] $ 0 : 00000000 80480000 80000001 00000200
[ 20.445570] $ 4 : 86c9e000 00000001 00000000 dcbc842b
[ 20.450817] $ 8 : 86c9e058 86c9e058 00000000 00000000
[ 20.456064] $12 : 00440002 ffffffff 00000000 ffffffff
[ 20.461311] $16 : 86c9e000 000006ec 00000377 804275d8
[ 20.466559] $20 : 000006ec 00000004 804275d8 00000000
[ 20.471805] $24 : 00000000 87fe7890
[ 20.477053] $28 : 87c44000 87c45b20 86c9e040 87fe1ce8
[ 20.482300] Hi : 00000377
[ 20.485167] Lo : 3456f000
[ 20.488079] epc : 87fe0254 nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[ 20.495383] ra : 87fe1ce8 __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[ 20.502493] Status: 1000fc03 KERNEL EXL IE
[ 20.506694] Cause : 0080000c (ExcCode 03)
[ 20.510693] BadVA : 00000200
[ 20.513560] PrId : 00019374 (MIPS 24Kc)
[ 20.517472] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[ 20.599503] Process ksoftirqd/0 (pid: 3, threadinfo=87c44000, task=87c2ca10, tls=00000000)
[ 20.607740] Stack : 00000002 87fd97c0 00000001 02000000 87c45b9c 87c45bd4 804279ec 87118cc0
00000001 00000004 fffffffd 00000000 02080020 80294f60 00000300 71a45669
00000000 00000000 87118cc0 87118cc0 87118cc0 87c45bd4 fffffffb 00000001
00000003 80295004 804279e4 71a45669 87c45c04 87c45c3c 00000000 8741d284
87118cc0 80420000 00000001 87118cc0 02080020 00000800 8720b05c 8029b5f0
...
[ 20.643491] Call Trace:
[ 20.645944] [<87fe0254>] nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[ 20.652919] [<87fe1ce8>] __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[ 20.659709] [<80294f60>] nf_iterate+0x7c/0xc8
[ 20.664055] [<80295004>] nf_hook_slow+0x58/0xf8
[ 20.668582] [<8029b5f0>] ip_local_deliver+0xb8/0xe8
[ 20.673451] [<8029b8e8>] ip_rcv+0x2c8/0x334
[ 20.677629] [<80268c5c>] __netif_receive_skb_core+0x614/0x6d0
[ 20.683373] [<8032f74c>] br_pass_frame_up+0x94/0x10c
[ 20.688328] [<8032ffd0>] br_handle_frame+0x2cc/0x418
[ 20.693279] [<802689dc>] __netif_receive_skb_core+0x394/0x6d0
[ 20.699028] [<8023703c>] ag71xx_poll+0x430/0x5c0
[ 20.703637] [<80269ab4>] net_rx_action+0x10c/0x2c8
[ 20.708420] [<80084134>] __do_softirq+0x250/0x298
[ 20.713116] [<800841a4>] run_ksoftirqd+0x28/0x60
[ 20.717725] [<8009abf8>] smpboot_thread_fn+0x158/0x188
[ 20.722851] [<80098628>] kthread+0xd8/0xec
[ 20.726936] [<80060878>] ret_from_kernel_thread+0x14/0x1c
[ 20.732315]
[ 20.733795]
Code: 00020336 8c820008 30450001 14a00002 00000000 ac430004 24020200 ac82000c
[ 20.743784] ---[ end trace f860d77d4ae98435 ]---
[ 20.749725] Kernel panic - not syncing: Fatal exception in interrupt
[ 20.757339] Rebooting in 3 seconds..

Or another one:

[ 20.121647] eth0: link up (1000Mbps/Full duplex) [ 20.131943] Unhandled kernel unaligned access[#1]: [ 20.136738] CPU: 0 PID: 937 Comm: snmpd Not tainted 4.4.59 #0 [ 20.142474] task: 87d92840 ti: 86c7c000 task.ti: 86c7c000 [ 20.147854] $ 0 : 00000000 7fe40058 8010de54 00000000 [ 20.153101] $ 4 : 873b557f 86c7de7c 86c7de74 00000001 [ 20.158348] $ 8 : 00000020 80065050 80107538 00000000 [ 20.163595] $12 : 7fe40040 00000000 00000000 77d382c0 [ 20.168842] $16 : 00000000 873b557f 86c7df08 00000080 [ 20.174090] $20 : 00000080 77d2e000 86c831e0 77d32d8c [ 20.179336] $24 : 00000fa3 77c9d000 [ 20.184584] $28 : 86c7c000 86c7de08 00000000 8010de9c [ 20.189832] Hi : 00000004 [ 20.192707] Lo : 00000000 [ 20.195596] epc : 800685a8 mutex_lock+0x0/0x30 [ 20.200203] ra : 8010de9c pipe_read+0x48/0x284 [ 20.204889] Status: 1000fc03 KERNEL EXL IE [ 20.209090] Cause : 00800010 (ExcCode 04) [ 20.213089] BadVA : 873b557f [ 20.215957] PrId : 00019374 (MIPS 24Kc) [ 20.219869] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common [ 20.301899] Process snmpd (pid: 937, threadinfo=86c7c000, task=87d92840, tls=77d39d48) [ 20.309790] Stack : 8730d004 068b46ad 86c84e70 00000012 00000000 00000000 00000000 873b0780 87cf2efc 86c7de7c 00000000 86c831e0 86c7df08 00000080 77d2e000 77d2e000 77d31518 77d32d8c 00000000 8010647c 0045acd4 80076b6c 80410000 8008a31c 00000080 00000000 77d31518 7fe4015c 00000080 00000000 00000000 00000080 86c7de74 00000001 86c831e0 00000000 00000000 00000000 00000000 00000000 ... [ 20.345535] Call Trace: [ 20.347973] [<800685a8>] mutex_lock+0x0/0x30 [ 20.352235] [<8010de9c>] pipe_read+0x48/0x284 [ 20.356586] [<8010647c>] __vfs_read+0xc4/0x104 [ 20.361023] [<80106bfc>] vfs_read+0x84/0x130 [ 20.365284] [<80107590>] SyS_read+0x58/0xc4 [ 20.369462] [<80062c2c>] syscall_common+0x30/0x54 [ 20.374153] [ 20.375640] Code: 8fb00024 03e00008 27bd0040 2443ffff e0830000 1060fffc 00000000 2442ffff [ 20.385691] ---[ end trace 7089734b36069ee3 ]--- [ 20.392349] Fatal exception: panic in 5 seconds [ 20.400076] CPU 0 Unable to handle kernel paging request at virtual address 00000200, epc == 87ff0254, ra == 87ff1ce8 [ 20.410891] Oops[#2]: [ 20.413160] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G D 4.4.59 #0 [ 20.420456] task: 87c2ca10 ti: 87c44000 task.ti: 87c44000 [ 20.425836] $ 0 : 00000000 80480000 80000001 00000200 [ 20.431083] $ 4 : 8730e000 00000001 00000000 409e9f5a [ 20.436331] $ 8 : 8730e058 8730e058 00000000 00000000 [ 20.441578] $12 : 00440002 ffffffff 00000000 ffffffff [ 20.446825] $16 : 8730e000 000002e0 00000655 804275d8 [ 20.452073] $20 : 000002e0 00000004 804275d8 00000000 [ 20.457320] $24 : 00000000 87ff7890 [ 20.462567] $28 : 87c44000 87c45b20 8730e040 87ff1ce8 [ 20.467814] Hi : 00000655 [ 20.470680] Lo : 17458800 [ 20.473587] epc : 87ff0254 nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack] [ 20.480887] ra : 87ff1ce8 __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack] [ 20.487999] Status: 1000fc03 KERNEL EXL IE [ 20.492199] Cause : 0080000c (ExcCode 03) [ 20.496199] BadVA : 00000200 [ 20.499075] PrId : 00019374 (MIPS 24Kc) [ 20.502986] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common [ 20.585017] Process ksoftirqd/0 (pid: 3, threadinfo=87c44000, task=87c2ca10, tls=00000000) [ 20.593254] Stack : 00000002 874187c0 00000001 02000000 87c45b9c 87c45bd4 804279ec 87312180 00000001 00000004 fffffffd 00000000 02080020 80294f60 00000300 d41771e8 00000000 00000000 87312180 87312180 87312180 87c45bd4 fffffffb 00000001 00000003 80295004 804279e4 d41771e8 87c45c04 87c45c3c 00000000 87413604 87312180 80420000 00000001 87312180 02080020 00000800 8721105c 8029b5f0 ... [ 20.629004] Call Trace: [ 20.631457] [<87ff0254>] nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack] [ 20.638422] [<87ff1ce8>] __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack] [ 20.645214] [<80294f60>] nf_iterate+0x7c/0xc8 [ 20.649561] [<80295004>] nf_hook_slow+0x58/0xf8 [ 20.654087] [<8029b5f0>] ip_local_deliver+0xb8/0xe8 [ 20.658957] [<8029b8e8>] ip_rcv+0x2c8/0x334 [ 20.663134] [<80268c5c>] __netif_receive_skb_core+0x614/0x6d0 [ 20.668878] [<8032f74c>] br_pass_frame_up+0x94/0x10c [ 20.673833] [<8032ffd0>] br_handle_frame+0x2cc/0x418 [ 20.678785] [<802689dc>] __netif_receive_skb_core+0x394/0x6d0 [ 20.684534] [<8023703c>] ag71xx_poll+0x430/0x5c0 [ 20.689142] [<80269ab4>] net_rx_action+0x10c/0x2c8 [ 20.693926] [<80084134>] __do_softirq+0x250/0x298 [ 20.698622] [<800841a4>] run_ksoftirqd+0x28/0x60 [ 20.703231] [<8009abf8>] smpboot_thread_fn+0x158/0x188 [ 20.708365] [<80098628>] kthread+0xd8/0xec [ 20.712450] [<80060878>] ret_from_kernel_thread+0x14/0x1c [ 20.717829] [ 20.719309] Code: 00020336 8c820008 30450001 14a00002 00000000 ac430004 24020200 ac82000c [ 20.729304] ---[ end trace 7089734b36069ee4 ]--- [ 20.735253] Kernel panic - not syncing: Fatal exception in interrupt [ 20.742878] Rebooting in 3 seconds..

I have more panics, available on request.

@openwrt-bot
Copy link
Author

russell:

Duplicated on r4027-725238f698 (current HEAD). A series of full bootlogs with panics are attached.

@openwrt-bot
Copy link
Author

russell:

bisection is not generating clear results, will update when clarity achieved.

@openwrt-bot
Copy link
Author

russell:

It appears the problem manifests if br_netfilter (from kmod-br-netfilter) or xt_physdev (from kmod-ipt-extra) are loaded at boot time and a link is present on the LAN ports (eth0).

@openwrt-bot
Copy link
Author

russell:

I am bisecting with the following .config stub:

CONFIG_TARGET_ar71xx=y CONFIG_TARGET_ar71xx_generic=y CONFIG_TARGET_ar71xx_generic_DEVICE_WZR600DHP=y CONFIG_DEVEL=y CONFIG_BUILD_LOG=y # CONFIG_PACKAGE_firewall is not set CONFIG_PACKAGE_kmod-br-netfilter=y CONFIG_PACKAGE_kmod-ipt-extra=y # CONFIG_PACKAGE_uboot-envtools is not set CONFIG_PACKAGE_kmod-ipt-nat=y CONFIG_PACKAGE_kmod-nf-nat=y

@openwrt-bot
Copy link
Author

russell:

reboot-1238-ga5c32a1f19 = no panic
reboot-1239-gfa85ee1d4e = panic

In the panic case, lsmod shows both the problematic modules: br_netfilter and xt_physdev, the no-panic case, those modules don't appear.

@openwrt-bot
Copy link
Author

russell:

I tried to replicate the problem on a Netgear WNDR3800, but didn't see the panic there.

@openwrt-bot
Copy link
Author

russell:

Problem is gone as of reboot-5244-gf0c37f6ceb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant