OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Critical
  • Priority Very Low
  • Reported Version openwrt-21.02
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Vincent Riou - 05.05.2021

FS#3783 - MT7621 WiFi driver crash

Summary

The WiFi driver crashes on an MT7621 router running OpenWrt 21.02-rc1 (to be exact, 5 commits past the tag, at 6f053e5b).
The router used is an InvizBox Go (WiFi only), i.e. MT7621 + MT7603E (not supported by OpenWrt just yet; I’m working on getting there). I don’t believe the issue is specific to that hardware, though.

The crash seems to happen in AP/STA mode when an AP is added to the configuration while the STA is connected (I’ll add more information if I come across more scenarios). I’m not able to reproduce it consistently at the moment.
Once it has crashed, the WiFi stack is unusable until a reboot. After a reboot, things are back to normal.

The crash stack is as follows:

[  726.587920] ------------[ cut here ]------------
[  726.597570] WARNING: CPU: 0 PID: 1767 at backports-5.10.16-1/net/mac80211/ieee80211_i.h:1468 sta_info_alloc+0x5c4/0x5fc [mac80211]
[  726.621113] Modules linked in: xt_connlimit pppoe ppp_async nf_conncount iptable_nat xt_state xt_nat xt_helper xt_conntrack xt_connmark xt_connbytes xt_REDIRECT xt_MASQUERADEr
[  726.621458]  algif_skcipher algif_rng algif_hash algif_aead af_alg sha256_generic libsha256 sha1_generic jitterentropy_rng drbg md5 hmac echainiv des_generic libdes cbc authec
[  726.859235] CPU: 0 PID: 1767 Comm: hostapd Not tainted 5.4.111 #0
[  726.871379] Stack : 8e375fb4 8007ce5c 80660000 80660d78 806c0000 80660d40 8065fe94 8e257a34
[  726.888032]         80800000 8dc34188 806aa6e3 805f993c 00000000 00000001 8e2579d8 72c69b9f
[  726.904659]         00000000 00000000 80830000 00000000 30232031 0000014b 2e352064 31312e34
[  726.921286]         00000000 00000024 00000000 000d1c63 80000000 806c0000 00000000 8e3075b8
[  726.937913]         00000009 8e146480 00000005 00000002 00000000 8034dfbc 00000000 80800000
[  726.954538]         ...
[  726.959397] Call Trace:
[  726.964284] [<8000b64c>] show_stack+0x30/0x100
[  726.973153] [<80542370>] dump_stack+0xa4/0xdc
[  726.981832] [<8002bee0>] __warn+0xc0/0x10c
[  726.989981] [<8002bf88>] warn_slowpath_fmt+0x5c/0xac
[  727.000018] [<8e3075b8>] sta_info_alloc+0x5c4/0x5fc [mac80211]
[  727.011712] [<8e3259b0>] ieee80211_nan_func_match+0x3d88/0x4410 [mac80211]
[  727.025846] ---[ end trace bc705bf94f0c5c24 ]---

Steps to reproduce

The following steps do not necessarily lead to the crash. I expect them to be what leads to it but am still unsure. The stack may show the way to a better reproduction scenario...

On an MT7621 router set up as AP/STA, add a second AP and possibly a third one.
Call /etc/init.d/network reload after each change.
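
The steps above could be scripted roughly as follows. This is only a sketch under assumptions that are not in the report: the radio is called radio0, the extra APs are open networks attached to lan, the section names and SSIDs (repro_apN, repro-ap-N) are made up, and the 30 second pause is arbitrary. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical reproduction helper. radio0, the section names, the SSIDs and
# the 30 s pause are assumptions for illustration, not taken from the report.
RADIO="${RADIO:-radio0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add one extra AP interface and reload the network.
# $1 = index of the AP (2 for the second AP, 3 for the third).
add_ap() {
    sec="repro_ap$1"
    run uci set "wireless.$sec=wifi-iface"
    run uci set "wireless.$sec.device=$RADIO"
    run uci set "wireless.$sec.mode=ap"
    run uci set "wireless.$sec.network=lan"
    run uci set "wireless.$sec.ssid=repro-ap-$1"
    run uci set "wireless.$sec.encryption=none"
    run uci commit wireless
    # The report reloads the network after each change.
    run /etc/init.d/network reload
}

# Add a second and then a third AP, pausing between the changes.
repro() {
    for i in 2 3; do
        add_ap "$i"
        run sleep 30
    done
}
```

After sourcing the file, calling repro prints the command sequence; setting DRY_RUN=0 first would run it on the router. Since the report says these steps do not always trigger the crash, several rounds may be needed.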

Current behaviour

The crash stack is visible in the console/dmesg and the STA fails to reconnect (which also makes all APs inaccessible, as expected).
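
To check for the warning after each reload, something like the following could be used. It is a minimal sketch: the function names count_sta_warnings and trace_symbols are made up for illustration, and the patterns assume the log format shown in the traces in this report.

```shell
#!/bin/sh
# Both helpers read a kernel log on stdin, e.g. "dmesg | count_sta_warnings".
# The function names are hypothetical; the patterns assume the log format
# shown in this report.

# Count WARNING lines that mention sta_info_alloc.
count_sta_warnings() {
    grep -c 'WARNING: .*sta_info_alloc'
}

# Print only the symbol+offset part of call-trace lines such as
# "[  727.000018] [<8e3075b8>] sta_info_alloc+0x5c4/0x5fc [mac80211]".
trace_symbols() {
    sed -n 's/^\[ *[0-9.]*\] \[<[0-9a-f]*>\] \([^ ]*\).*/\1/p'
}
```

Running dmesg | trace_symbols on two builds gives a list of symbols and offsets that can be compared line by line, which may help tell whether two traces are the same warning.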

Expected behaviour

No crash

Notes

This crash was also observed on builds before the rc1 tag. I don’t know the conditions leading to these crashes.

I had saved one older crash stack (truncated by console unfortunately) as a reference (early 21.02 branch when I started testing in preparation for release):

[ 1939.972549] ------------[ cut here ]------------
[ 1939.982051] WARNING: CPU: 2 PID: 2079 at backports-5.10.16-1/net/mac80211/ieee80211_i.h:1468 sta_info_alloc+0x5c4/0x]
[ 1940.005526] Modules linked in: xt_connlimit pppoe ppp_async nf_conncount iptable_nat xt_state xt_nat xt_helper xt_cor
[ 1940.005842]  algif_skcipher algif_rng algif_hash algif_aead af_alg sha256_generic libsha256 sha1_generic jitterentroc
[ 1940.243641] CPU: 2 PID: 2079 Comm: hostapd Not tainted 5.4.111 #0
[ 1940.255784] Stack : 8df75fb4 8007ce5c 80660000 80660d78 806c0000 80660d40 8065fe94 8dd01a34
[ 1940.272451]         80800000 8fe51fc8 806aa6e3 805f993c 00000002 00000001 8dd019d8 91468ee6
[ 1940.289110]         00000000 00000000 80830000 00000000 30232031 0000012f 2e352064 31312e34
[ 1940.305752]         00000000 00000060 00000000 0003b7b9 80000000 806c0000 00000000 8df075b8
[ 1940.322392]         00000009 8fed4480 00000005 00000002 00000000 8034dfbc 00000008 80800008
[ 1940.339021]         ...
[ 1940.343883] Call Trace:
[ 1940.348778] [<8000b64c>] show_stack+0x30/0x100
[ 1940.357652] [<80542370>] dump_stack+0xa4/0xdc
[ 1940.366336] [<8002bee0>] __warn+0xc0/0x10c
[ 1940.374487] [<8002bf88>] warn_slowpath_fmt+0x5c/0xac
[ 1940.384532] [<8df075b8>] sta_info_alloc+0x5c4/0x5fc [mac80211]
[ 1940.396280] [<8df259b0>] ieee80211_nan_func_match+0x3d88/0x4410 [mac80211]
[ 1940.410460] ---[ end trace dda71e821ee728c4 ]---

Vincent Riou commented on 05.05.2021 18:09

Built the latest today (commit d1a056f62052a251789d0a26c025f319bf497f2f) and got a crash with a simple AP/STA setup. This time the WiFi stayed up, though, which seems different from the earlier crashes.

Here is the console log (with stack):

[  855.095094] device wlan0-1 left promiscuous mode
[  855.104719] br-lan: port 1(wlan0-1) entered disabled state
[  855.642152] wlan0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (Reason: 3=DEAUTH_LEAVING)
[  857.952818] br-lan: port 1(wlan0-1) entered blocking state
[  857.964772] br-lan: port 1(wlan0-1) entered disabled state
[  857.977242] device wlan0-1 entered promiscuous mode
[  858.619312] ------------[ cut here ]------------
[  858.628821] WARNING: CPU: 3 PID: 2069 at backports-5.10.34-1/net/mac80211/ieee80211_i.h:1468 sta_info_alloc+0x5c4/0x]
[  858.652422] Modules linked in: xt_connlimit pppoe ppp_async nf_conncount iptable_nat xt_state xt_nat xt_helper xt_cor
[  858.652727]  algif_skcipher algif_rng algif_hash algif_aead af_alg sha256_generic libsha256 sha1_generic jitterentroc
[  858.890371] CPU: 3 PID: 2069 Comm: hostapd Not tainted 5.4.114 #0
[  858.902511] Stack : 8e276094 8007ce5c 80660000 80661d78 806c0000 80661d40 80660e94 8eaefa34
[  858.919176]         80800000 8ff62b08 806aa6e3 805fa924 00000003 00000001 8eaef9d8 a480919e
[  858.935826]         00000000 00000000 80830000 00000000 30232034 00000122 2e352064 31312e34
[  858.952462]         00000000 0000002a 00000000 000d9603 80000000 806c0000 00000000 8e207608
[  858.969091]         00000009 8feae480 00000005 00000002 00000000 8034e07c 0000000c 8080000c
[  858.985722]         ...
[  858.990581] Call Trace:
[  858.995467] [<8000b64c>] show_stack+0x30/0x100
[  859.004335] [<805425a4>] dump_stack+0xa4/0xdc
[  859.013017] [<8002bee0>] __warn+0xc0/0x10c
[  859.021165] [<8002bf88>] warn_slowpath_fmt+0x5c/0xac
[  859.031200] [<8e207608>] sta_info_alloc+0x5c4/0x5fc [mac80211]
[  859.042899] [<8e225a78>] ieee80211_nan_func_match+0x3d94/0x441c [mac80211]
[  859.056962] ---[ end trace b8a32ba13a9858c3 ]---
[  865.020490] wlan0: authenticate with xx:xx:xx:xx:xx:xx
[  865.239685] wlan0: send auth to xx:xx:xx:xx:xx:xx (try 1/3)
[  865.252449] wlan0: authenticated
[  865.259933] wlan0: associating with AP with corrupt probe response
[  865.272425] wlan0: associate with xx:xx:xx:xx:xx:xx (try 1/3)
[  865.297509] wlan0: RX AssocResp from xx:xx:xx:xx:xx:xx (capab=0x411 status=0 aid=1)
[  865.313161] wlan0: associated
[  865.568143] br-lan: port 1(wlan0-1) entered blocking state
[  865.580011] br-lan: port 1(wlan0-1) entered forwarding state
