OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by tobiaswaldvogel - 10.10.2020

FS#3376 - Memory leak in cfg80211/mac80211 in trunk on TP Archer c7 v2

- Device TP Archer c7 v2
- Software OpenWrt SNAPSHOT r14651+16-23c7fb7600
- How to reproduce: Just wait a few hours (usually about 6 hours)
- I tried both ath10k drivers (ath10k and ath10k-ct) with the same result

After compiling the current trunk for the TP Archer c7 v2, there seems to be a memory leak in the kernel.
Memory usage creeps up over a few hours until it triggers an OOM and a reboot.
No user space process is consuming the memory, and an analysis with kmemleak always showed the following entries:

unreferenced object 0x84639900 (size 176):
  comm "hostapd", pid 1622, jiffies 5733 (age 112.956s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 85 5f f0 00 00 00 00 00  ........._......
    00 00 00 00 00 00 00 00 40 22 00 1b 00 07 00 00  ........@"......
  backtrace:
    [<c100ebc5>] kmem_cache_alloc+0xf4/0x23c
    [<bba668db>] __build_skb+0x34/0xd0
    [<968af6d0>] __netdev_alloc_skb+0xf8/0x1ac
    [<3bb1be69>] ieee80211_mgmt_tx+0x120/0x5a0 [mac80211]
    [<bb1a7ed1>] nl80211_parse_chandef+0x1bb8/0x2be8 [cfg80211]

gdb resolved this to

ieee80211_mgmt_tx (backports-5.8-1/net/mac80211/offchannel.c:890).
885			ret = -EBUSY;
886			goto out_unlock;
887		}
888	
889		skb = dev_alloc_skb(local->hw.extra_tx_headroom + params->len);
890		if (!skb) {
891			ret = -ENOMEM;
892			goto out_unlock;
893		}
894		skb_reserve(skb, local->hw.extra_tx_headroom);
nl80211_tx_mgmt (backports-5.8-1/net/wireless/nl80211.c:10990).
10985			}
10986		}
10987	
10988		params.chan = chandef.chan;
10989		err = cfg80211_mlme_mgmt_tx(rdev, wdev, &params, &cookie);
10990		if (err)
10991			goto free_msg;
10992	
10993		if (msg) {
10994			if (nla_put_u64_64bit(msg, NL80211_ATTR_COOKIE, cookie,

As far as I understand, this belongs to the remain_on_channel functionality.
Unfortunately, I was unable to spot why the skb is not released.
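
For readers unfamiliar with this class of bug, a buffer is often lost when it is allocated, handed to a second function, and that function fails without anyone freeing the buffer. The sketch below is a user-space analogy only. It is not the mac80211 code and does not claim to show where the skb is actually lost. All names (buf_alloc, queue_tx, send_frame_leaky, send_frame_fixed) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for an skb; live_bufs counts buffers not yet freed. */
struct buf {
    size_t len;
    unsigned char *data;
};

static int live_bufs;

static struct buf *buf_alloc(size_t len)
{
    struct buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->data = malloc(len ? len : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->len = len;
    live_bufs++;
    return b;
}

static void buf_free(struct buf *b)
{
    assert(b);
    free(b->data);
    free(b);
    live_bufs--;
}

/* Hypothetical hand-off: takes ownership of b only on success.
 * On failure (busy != 0) the caller still owns b. */
static int queue_tx(struct buf *b, int busy)
{
    if (busy)
        return -1;
    buf_free(b);    /* consumer releases the buffer after "sending" */
    return 0;
}

/* Leaky variant: the error path returns without freeing b. */
static int send_frame_leaky(size_t len, int busy)
{
    struct buf *b = buf_alloc(len);

    if (!b)
        return -1;
    if (queue_tx(b, busy))
        return -1;  /* BUG: b is still owned here and is lost */
    return 0;
}

/* Fixed variant: the error path frees what it still owns. */
static int send_frame_fixed(size_t len, int busy)
{
    struct buf *b = buf_alloc(len);
    int ret;

    if (!b)
        return -1;
    ret = queue_tx(b, busy);
    if (ret)
        buf_free(b);
    return ret;
}
```

In the leaky variant every failed hand-off leaves one buffer behind, which matches the slow, steady growth described in this report. Whether the real leak has this shape is an assumption.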

17.10.2020 : A task closure has been requested. Reason for request: Fixed by https://git.openwrt.org/?p=openwrt/staging/nbd.git;a=commitdiff;h=30b73fa5d8ab69514184f72e25a866e62e6cbbcb
Rod Egan commented on 15.10.2020 01:18

Anecdotal confirmation.
C7V5 and wr842v1 both suffer from OOM on latest snapshot.
The time until OOM ranges anywhere from 3 to 24 hours.
I use imagebuilder, so I don't have kmemleak available, but there aren't any obvious user mode processes using memory.
I've reverted to 19.07.4 build to maintain connectivity.

tobiaswaldvogel commented on 17.10.2020 10:00

In the meantime I identified the exact place where the skb is lost and made a PR with a simple suggested fix. Felix addressed the issue in his rework of the wireless framework, so this PR will not be required anymore. I'm going to test this today, and most likely this issue can be closed afterwards.
