FS#263 - WRT3200ACM WAN connection failure in under 24 hours #6350

Closed
openwrt-bot opened this issue Nov 2, 2016 · 9 comments

@openwrt-bot

mrm:

Symptom

WAN connectivity is lost until the device is rebooted; the failure coincides with the stack trace (attached). I've observed three failures of this type, all at around 18 hours of device uptime. A workaround (not yet tested) is to schedule a device reboot every 12 hours.

Device problem occurs on

  • Linksys WRT3200ACM (version 1)
  • LEDE configuration is near-stock, with the only modifications being PPPoE configuration and WiFi setup/enable. Issue does not appear to be similar to FS#227 (PPPoE).

Software versions of LEDE release, packages, etc.

Release:

[0.000000] Linux version 4.4.28 (buildbot@builds) (gcc version 5.4.0 (LEDE GCC 5.4.0 r2062) ) #0 SMP Mon Oct 31 16:13:37 2016
[0.000000] CPU: ARMv7 Processor [414fc091] revision 1 (ARMv7), cr=10c5387d
[0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[0.000000] Machine model: Linksys WRT3200ACM

Packages:
Default packages + luci-ssl

Possibly helpful info (80211 related output in log):
<<Marvell 802.11ac Wireless Network Driver version 10.3.2.0-20161011>>
pci 0000:00:01.0: enabling device (0140 -> 0142)
ieee80211 phy0: priv->iobase0 = e0e00000
ieee80211 phy0: priv->iobase1 = e1080000
ieee80211 phy0: priv->pcmd_buf = de24800 priv->pphys_cmd_buf = 1e248000
ieee80211 phy0: fw download start
ieee80211 phy0: FwSize = 207660 downloaded Size = 207660 curr_iteration 65522
ieee80211 phy0: fw download complete
ieee80211 phy0: pcmd = de24800
ieee80211 phy0: firmware version: 0x7080004
ieee80211 phy0: firmware region code: 10
ieee80211 phy0: 2G disabled, 5G enabled
ieee80211 phy0: 4 TX antennas, 4 RX antennas
pci 0000:00:02.0: enabling device (0140 -> 0142)
ieee80211 phy1: priv->iobase0 = e1200000
ieee80211 phy1: priv->iobase1 = e1480000
ieee80211 phy1: priv->pcmd_buf = dda20000 priv->pphys_cmd_buf = 1da20000
ieee80211 phy1: fw download start
ieee80211 phy1: FwSize = 207660 downloaded Size = 207660 curr_iteration 65527
ieee80211 phy1: fw download complete
ieee80211 phy1: pcmd = dda20000
ieee80211 phy1: firmware version: 0x7080004
ieee80211 phy1: firmware region code: 10
ieee80211 phy1: 2G enabled, 5G disabled
ieee80211 phy1: 4 TX antennas, 4 RX antennas

Steps to reproduce

Device was idle before the stack trace, but logs show some DHCP-related activity within 1 minute prior to the fault.

Stack Trace
Wed Nov 2 17:10:41 2016 kern.err kernel: [74541.045379] ieee80211 phy1: create ba result error 1
Wed Nov 2 17:10:41 2016 kern.err kernel: [74541.059396] ieee80211 phy1: ampdu operation error code: -22
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.085673] ------------[ cut here ]------------
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.090364] WARNING: CPU: 0 PID: 5305 at compat-wireless-2016-10-08/net/mac80211/agg-tx.c:398 ___ieee80211_stop_tx_ba_session+0x1e8/0x1f8 mac80211
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.103914] Modules linked in: pppoe ppp_async iptable_nat pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_id xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_raw iptable_mangle iptable_f
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.162765] CPU: 0 PID: 5305 Comm: kworker/u4:2 Not tainted 4.4.28 #0
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.169230] Hardware name: Marvell Armada 380/385 (Device Tree)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.175192] Workqueue: phy1 ieee80211_iface_work [mac80211]
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.180804] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.188584] [] (show_stack) from [] (dump_stack+0x8c/0xa0)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.195838] [] (dump_stack) from [] (warn_slowpath_common+0x94/0xb0)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.203962] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.212799] [] (warn_slowpath_null) from [] (___ieee80211_stop_tx_ba_session+0x1e8/0x1f8 [mac80211])
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.223749] [] (___ieee80211_stop_tx_ba_session [mac80211]) from [] (__ieee80211_stop_tx_ba_session+0x2c/0x40 [mac80211])
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.236527] [] (__ieee80211_stop_tx_ba_session [mac80211]) from [] (ieee80211_iface_work+0x1c0/0x610 [mac80211])
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.248510] [] (ieee80211_iface_work [mac80211]) from [] (process_one_work+0x228/0x3bc)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.258292] [] (process_one_work) from [] (worker_thread+0x310/0x504)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.266505] [] (worker_thread) from [] (kthread+0xf0/0xf8)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.273758] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
Wed Nov 2 17:10:42 2016 kern.warn kernel: [74542.281021] ---[ end trace 09d229b863156d80 ]---
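
Each kernel line in the trace carries the uptime in seconds inside square brackets, so the uptime at the moment of failure can be read directly from the log. The following is a minimal sketch, not part of the original report, that converts that stamp to hours; the function name `uptime_hours` and the regular expression are illustrative assumptions about the syslog layout shown above.

```python
import re

# Assumed layout: "... kernel: [<seconds>.<fraction>] message", as in the trace above.
UPTIME_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def uptime_hours(line):
    """Return the kernel uptime in hours for one log line, or None if no stamp is found."""
    match = UPTIME_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)) / 3600.0


# First line of the stack trace from this report.
sample = (
    "Wed Nov 2 17:10:41 2016 kern.err kernel: [74541.045379] "
    "ieee80211 phy1: create ba result error 1"
)
print(round(uptime_hours(sample), 1))  # prints 20.7
```

For the trace above this gives roughly 20.7 hours of uptime at the first error line.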

@openwrt-bot

mrm:

After investigating the stack trace in the description, I think it's probably just a symptom of an issue in the driver and is not the root cause.

Some similar (possibly related) discussion can be found here: https://github.com/kaloz/mwlwifi/issues/41

@openwrt-bot

mrm:

For info: the "firmware update apply" process is broken on this release of LEDE for the WRT3200ACM. To update firmware for experimenting with OpenWRT code, I needed to do the "Backup Firmware Recovery" (i.e., three short power cycles) to get back to factory firmware.

@openwrt-bot

nbd:

Please try the latest version

@openwrt-bot

kevinjos:

I am running Reboot (17.01.0-rc1, r3042-ec095b5) and have observed similar behavior on this hardware. I have yet to isolate a stack trace. For me, I am losing the 5 GHz antenna after around 24 hours of uptime. The WAN, LAN, and 2.4 GHz WiFi remain operational. The issue is fixed by a reboot.

@openwrt-bot

kevinjos:

Here's the system logging from the beginning of the error onward:

Fri Feb 3 00:40:57 2017 kern.err kernel: [38159.651041] ieee80211 phy0: cmd 0x9125=BAStream timed out
Fri Feb 3 00:40:57 2017 kern.err kernel: [38159.656471] ieee80211 phy0: return code: 0x1125
Fri Feb 3 00:40:57 2017 kern.err kernel: [38159.661020] ieee80211 phy0: timeout: 0x1125
Fri Feb 3 00:40:57 2017 kern.err kernel: [38159.665223] ieee80211 phy0: destroy ba failed execution
Fri Feb 3 00:41:28 2017 kern.err kernel: [38189.998895] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out
Fri Feb 3 00:41:28 2017 kern.err kernel: [38190.004769] ieee80211 phy0: return code: 0x001d
Fri Feb 3 00:41:28 2017 kern.err kernel: [38190.009319] ieee80211 phy0: timeout: 0x001d
Fri Feb 3 00:41:28 2017 kern.err kernel: [38190.013527] ieee80211 phy0: failed execution
Fri Feb 3 00:41:32 2017 kern.err kernel: [38194.016866] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out
Fri Feb 3 00:41:32 2017 kern.err kernel: [38194.022746] ieee80211 phy0: return code: 0x001d
Fri Feb 3 00:41:32 2017 kern.err kernel: [38194.027312] ieee80211 phy0: timeout: 0x001d
Fri Feb 3 00:41:32 2017 kern.err kernel: [38194.031513] ieee80211 phy0: failed execution
Fri Feb 3 00:41:36 2017 kern.err kernel: [38198.034840] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out
Fri Feb 3 00:41:36 2017 kern.err kernel: [38198.040701] ieee80211 phy0: return code: 0x001d
Fri Feb 3 00:41:36 2017 kern.err kernel: [38198.045259] ieee80211 phy0: timeout: 0x001d
Fri Feb 3 00:41:36 2017 kern.err kernel: [38198.049473] ieee80211 phy0: failed execution
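
To see at a glance which firmware commands are timing out and on which radio, the log can be tallied per phy and command name. This is a minimal sketch under the assumption that the timeout lines always follow the `cmd 0x....=Name timed out` pattern seen above; `count_timeouts` and the regular expression are illustrative and are not part of mwlwifi or LEDE.

```python
import re
from collections import Counter

# Assumed pattern, taken from the log lines above:
#   "ieee80211 phy0: cmd 0x9125=BAStream timed out"
TIMEOUT_RE = re.compile(r"ieee80211 (phy\d+): cmd 0x[0-9a-fA-F]+=(\w+) timed out")


def count_timeouts(lines):
    """Count timed-out firmware commands, keyed by (phy, command name)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


# The four "timed out" lines from the log excerpt above.
log = [
    "Fri Feb 3 00:40:57 2017 kern.err kernel: [38159.651041] ieee80211 phy0: cmd 0x9125=BAStream timed out",
    "Fri Feb 3 00:41:28 2017 kern.err kernel: [38189.998895] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out",
    "Fri Feb 3 00:41:32 2017 kern.err kernel: [38194.016866] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out",
    "Fri Feb 3 00:41:36 2017 kern.err kernel: [38198.034840] ieee80211 phy0: cmd 0x801d=MEMAddrAccess timed out",
]

for (phy, name), n in sorted(count_timeouts(log).items()):
    print(phy, name, n)
```

On the excerpt above this reports one `BAStream` timeout followed by three `MEMAddrAccess` timeouts, all on phy0.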

@openwrt-bot

kevinjos:

My report above appears to be a known issue unrelated to the WAN failure on the WRT3200ACM:

kaloz/mwlwifi#126

@openwrt-bot

mrm:

I can confirm that the issue I originally reported in this defect is awaiting the fix for mwlwifi issue #126. At this time, I do not know whether any associated fixes are needed in LEDE. I'm anxiously awaiting the mwlwifi fix, and will verify and close this defect once it is available.

@openwrt-bot

mkresin:

What is the status of your issue? The mentioned mwlwifi issue on GitHub was closed quite some time ago.

@openwrt-bot

mrm:

The mwlwifi issue mentioned in this defect has been resolved. I will request this defect be closed.
