OpenWrt/LEDE Project

  • Status Assigned
  • Percent Complete
    0%
  • Task Type Bug Report
  • Category Kernel
  • Assigned To
    Felix Fietkau
  • Operating System All
  • Severity Medium
  • Priority Medium
  • Reported Version lede-17.01
  • Due in Version Undecided
  • Due Date Undecided
  • Votes 3
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Matthias Schiffer - 22.06.2017

FS#863 - ath9k failure with AP+11s VIFs ("unsupported hw bitrate detected")

A Gluon user reported the following issue, observed on a TL-WR841N v9 (QCA953x). LEDE version is 17.01.2.

Gluon uses AP+11s VIFs, with multicast_rate set to 12000.

The WLAN became unusable (no connection possible, WLAN LED permantently on) after a few hours of operation. Both VIFs were affected. The system recovered on its own after an hour in this state. Unfortunately, I have no information if beacons were still running in the broken state.

After the recovery, the large number of messages like the following were found in the kernel log:

[40085.953696] ath: phy0: unsupported hw bitrate detected 0x73 using 1 Mbit
[40200.844082] ath: phy0: unsupported hw bitrate detected 0x48 using 1 Mbit
[40281.298039] ------------[ cut here ]------------
[40281.303081] WARNING: CPU: 0 PID: 0 at compat-wireless-2017-01-31/net/mac80211/rx.c:4214 ieee80211_rx_napi+0x130/0x8b0 [mac80211]()
[40281.315230] Rate marked as an HT rate but passed status->rate_idx is not an MCS index [0-76]: 100 (0x64)
[40281.325023] Modules linked in: iptable_nat ath9k nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 ipt_REJECT ipt_MASQUERADE ebtable_nat ebtable_filter ebtable_broute ath9k_common xt_time xt_tcpudp xt_state xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_LOG xt_CT nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack mac80211 iptable_mangle iptable_filter ip_tables ebtables ebt_vlan ebt_stp ebt_snat ebt_redirect ebt_pkttype ebt_mark_m ebt_mark ebt_limit ebt_ip6 ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebt_802_3 br_netfilter ath9k_hw ath em_nbyte em_cmp sch_pie em_meta sch_gred sch_teql act_ipt sch_prio em_text sch_codel sch_red sch_fq sch_sfq sch_dsmark act_skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_htb sch_hfsc batman_adv libcrc32c cfg80211 compat ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables dummy veth tun act_police cls_basic sch_tbf sch_ingress gpio_button_hotplug crc16 crc32c_generic crypto_hash
[40281.433128] CPU: 0 PID: 0 Comm: swapper Tainted: G        W       4.4.71 #0
[40281.440330] Stack : 803e32a4 00000000 00000001 80440000 8042d1dc 8042ce63 803c48ac 00000000
[40281.440330]    804a378c 00000004 00000000 8180e918 00000000 800a7664 803c9fa8 80430000
[40281.440330]    00000003 00000004 803c83b4 81809d5c 00000000 800a55e0 00000002 00000000
[40281.440330]    8042bf70 801f5300 00000000 00000000 00000000 00000000 00000000 00000000
[40281.440330]    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[40281.440330]    ...
[40281.477351] Call Trace:
[40281.479951] [<800a7664>] vprintk_default+0x24/0x30
[40281.484917] [<800a55e0>] printk+0x2c/0x38
[40281.489077] [<801f5300>] serial8250_get_mctrl+0x30/0x54
[40281.494512] [<80081c60>] warn_slowpath_common+0xa0/0xd0
[40281.499949] [<801a772c>] dump_stack+0x14/0x28
[40281.504473] [<80071ea8>] show_stack+0x50/0x84
[40281.509003] [<80081c60>] warn_slowpath_common+0xa0/0xd0
[40281.514543] [<80e25cf0>] ieee80211_rx_napi+0x130/0x8b0 [mac80211]
[40281.520918] [<80081cbc>] warn_slowpath_fmt+0x2c/0x38
[40281.526158] [<80e25cf0>] ieee80211_rx_napi+0x130/0x8b0 [mac80211]
[40281.532495] [<802623ec>] __build_skb+0x48/0xc8
[40281.537148] [<802625d0>] __netdev_alloc_skb+0xd0/0x148
[40281.542477] [<80060bf4>] handle_int+0x134/0x140
[40281.547181] [<80de8388>] ath9k_cmn_rx_skb_postprocess+0x50/0x168 [ath9k_common]
[40281.554852] [<80e88374>] ath_rx_tasklet+0xee8/0xf4c [ath9k]
[40281.560635] [<8006503c>] __bzero+0xa0/0x164
[40281.565012] [<80e88340>] ath_rx_tasklet+0xeb4/0xf4c [ath9k]
[40281.570828] [<80e851b8>] ath9k_tasklet+0x200/0x28c [ath9k]
[40281.576512] [<80084820>] tasklet_action+0x80/0xc8
[40281.581390] [<800841dc>] __do_softirq+0x250/0x298
[40281.586284] [<800a82bc>] generic_handle_irq+0x24/0x3c
[40281.591540] [<8006a9e0>] plat_irq_dispatch+0xd4/0x10c
[40281.596760] [<80060bf4>] handle_int+0x134/0x140
[40281.601447]
[40281.602992] ---[ end trace f41ea26ed5101c02 ]---

Both the numbers in the “unsupported hw bitrate detected” messages and the status→rate_idx values in the WARNINGs are seemingly random.

Marcus Scharf commented on 26.06.2017 20:02

https://pastebin.com/n8zgqB7E

"iw dev client0 scan" bring the node back despite this error

Software: Gluon v2016.2.5
Hardware: WR841v10

aversario commented on 06.10.2017 17:02

I have those also (and some others):
[ 1027.252127] ath: phy0: unsupported hw bitrate detected 0x3b using 1 Mbit
[ 1027.265778] ath: phy0: unsupported hw bitrate detected 0x3b using 1 Mbit
[82325.137549] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42100020 DMADBG_7=0x000084c0
[852191.562711] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42100020 DMADBG_7=0x000084c0
[935075.485058] IPv6: ADDRCONF(NETDEV_UP): wi11: link is not ready
[935076.293142] IPv6: ADDRCONF(NETDEV_CHANGE): wi11: link becomes ready
[1508304.347528] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42100020 DMADBG_7=0x000084c0
[1637309.420586] ath: phy0: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22

MikroTik RouterBOARD 411/A/AH
LEDE Reboot SNAPSHOT r3968-e20965811d / LuCI Master (git-17.103.24187-a82ca22)
Kernel Version 4.4.59

FreifunkUFO commented on 16.10.2017 21:19

its also happening with ath9k on AP+11s VIFs (f.i. on freifunk meshes)
i think its THE well-known problem of the last months, mostly called "wifi will break until a wifi scan is started".

2 weeks ago i saw that problem on a 5ghz (!) device (ubnt-loco5m)
[135091.754940] ath: phy0: unsupported hw bitrate detected 0x1b using 1 Mbit

there were many thoughts about it. sometimes it seemed to be broken/mad wifi-client-devices (at one location the mistake was only on weekends) or when people were using many windows mobile phones.
other ideas blaming ANI (please try to disable and enable ANI) or HCS. Or pathing ath9k (0xdeadbeef).

meanwhile many mesh-communities (like freifunk, libremesh..) have very many bad or very bad workarounds about that. (so even neoraiders wifi will recover after an hour)

see comments/links on https://plan.leipzig.freifunk.net/issues/252

FreifunkUFO commented on 16.10.2017 23:01

here some freifunk people with 11s and same problem:
https://github.com/freifunk-gluon/gluon/issues/1148

komar commented on 18.10.2017 11:07

Same on Ubiquti Bullet M2, Reboot (17.01-SNAPSHOT, r3535+34-ee32de4426).

WPA2/PSK connection

[68667.891551] WARNING: CPU: 0 PID: 0 at compat-wireless-2017-01-31/net/mac80211/rx.c:4214 ieee80211_rx_napi+0x130/0x8b0 [mac80211]()
[68667.903350] Rate marked as an HT rate but passed status->rate_idx is not an MCS index [0-76]: 123 (0x7b)
[68667.912861] Modules linked in: iptable_nat ath9k nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 ipt_REJECT ipt_MASQUERADE ebtable_nat ebtable_filter ebtable_broute ath9k_common xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_DSCP xt_CT xt_CLASSIFY nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack mac80211 iptable_mangle iptable_filter ipt_ECN ip_tables ebtables ebt_vlan ebt_stp ebt_snat ebt_redirect ebt_pkttype ebt_mark_m ebt_mark ebt_limit ebt_ip6 ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebt_802_3 br_netfilter ath9k_hw ath batman_adv libcrc32c cfg80211 compat xt_set ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables dummy veth ohci_platform ohci_hcd ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common crc16 crc32c_generic crypto_hash
[68668.031742] CPU: 0 PID: 0 Comm: swapper Tainted: G        W       4.4.89 #0
[68668.038732] Stack : 803e3444 00000000 00000001 80440000 8042d1dc 8042ce63 803c4a38 00000000
[68668.038732] 	  804a378c 00000000 00000000 00000030 00000000 800a7844 803ca150 80430000
[68668.038732] 	  00000003 00000000 803c855c 81809d5c 00000000 800a57c0 00000002 00000000
[68668.038732] 	  8042bf70 801f5a00 00000000 00000000 00000000 00000000 00000000 00000000
[68668.038732] 	  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[68668.038732] 	  ...
[68668.074666] Call Trace:
[68668.077148] [<80071ee4>] show_stack+0x54/0x88
[68668.081539] [<80081ce4>] warn_slowpath_common+0xa0/0xd0
[68668.086807] [<80081d40>] warn_slowpath_fmt+0x2c/0x38
[68668.091946] [<80ea5cf0>] ieee80211_rx_napi+0x130/0x8b0 [mac80211]
[68668.098206] [<80de8374>] ath_rx_tasklet+0xee8/0xf4c [ath9k]
[68668.103828] [<80de51b8>] ath9k_tasklet+0x200/0x28c [ath9k]
[68668.109376] [<800848a4>] tasklet_action+0x80/0xc8
[68668.114103] [<80084260>] __do_softirq+0x250/0x298
[68668.118863] [<8006aa20>] plat_irq_dispatch+0xd4/0x10c
[68668.123927] 
[68668.125428] ---[ end trace ad42afcfef337e45 ]---

I have AP and 3 STA clients connected to AP. All 4 devices are the same Ubiquti Bullet M2 hardware and same firmware.

This in on AP device:

$ iw dev wlan-ap station dump | grep 'bitra\|Station'
Station 68:72:51:24:bc:9b (on wlan-ap)
	tx bitrate:	1.0 MBit/s          <=== !!!
	rx bitrate:	52.0 MBit/s MCS 5
Station 9a:b8:52:e5:07:23 (on wlan-ap)
	tx bitrate:	52.0 MBit/s MCS 5
	rx bitrate:	39.0 MBit/s MCS 4
Station 68:72:51:24:bc:21 (on wlan-ap)
	tx bitrate:	52.0 MBit/s MCS 5
	rx bitrate:	65.0 MBit/s MCS 7

Marked 'tx bitrate: 1.0 MBit/s' show which wifi sta connection frozen.

On client side I have symmetry situation:

$ iw dev wlan-sta station dump | grep 'bitra\|Station'
Station 68:72:51:68:49:9e (on wlan-sta)
        tx bitrate:     52.0 MBit/s MCS 5
        rx bitrate:     1.0 MBit/s

If I delete this connection on AP, STA recover connection in ~10 min.

$ iw dev wlan-ap station del 68:72:51:24:bc:9b

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing