OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete
    0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Medium
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Dermot Mc Donnell - 19.01.2021

FS#3579 - Netifd OOMs in trunk

Supply the following if possible:
- Device problem occurs on BT HOME HUB 5A - LANTIQ XRX200 SoC
- Software versions of OpenWrt/LEDE release, packages, etc.
Linux OpenWrt 5.4.89 #0 SMP Fri Jan 15 19:49:01 2021 mips GNU/Linux
- Steps to reproduce JUST WAIT

 

Yesterday I built trunk for my bthh5a.

netifd OOMs 10.5 hours after the last LAN client disassociated. 30 mins after boot, with 3 or 4 LAN clients connected, there is 20M free ram. Looks like there may be a leak somewhere.

Tue Jan 19 00:56:47 2021 daemon.info hostapd: wlan1: STA d8:9c:67:ac:xx:xx IEEE 802.11: disassociated
Tue Jan 19 00:56:48 2021 daemon.info hostapd: wlan1: STA d8:9c:67:ac:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Tue Jan 19 00:58:20 2021 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 8c:f5:a3:a5:xx:xx
Tue Jan 19 00:58:20 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: disassociated
Tue Jan 19 00:58:21 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Tue Jan 19 01:09:21 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: authenticated
Tue Jan 19 01:09:21 2021 daemon.notice hostapd: wlan0: STA-OPMODE-N_SS-CHANGED 8c:f5:a3:a5:xx:xx 2
Tue Jan 19 01:09:21 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: associated (aid 1)
Tue Jan 19 01:09:21 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 8c:f5:a3:a5:xx:xx
Tue Jan 19 01:09:21 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx WPA: pairwise key handshake completed (RSN)
Tue Jan 19 01:09:21 2021 daemon.info dnsmasq-dhcp[5331]: DHCPDISCOVER(br-lan) 8c:f5:a3:a5:xx:xx
Tue Jan 19 01:09:21 2021 daemon.info dnsmasq-dhcp[5331]: DHCPOFFER(br-lan) 192.168.9.117 8c:f5:a3:a5:xx:xx
Tue Jan 19 01:09:21 2021 daemon.info dnsmasq-dhcp[5331]: DHCPREQUEST(br-lan) 192.168.9.117 8c:f5:a3:a5:xx:xx
Tue Jan 19 01:09:21 2021 daemon.info dnsmasq-dhcp[5331]: DHCPACK(br-lan) 192.168.9.117 8c:f5:a3:a5:xx:xx Galaxy-S7
Tue Jan 19 01:17:44 2021 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 8c:f5:a3:a5:xx:xx
Tue Jan 19 01:17:44 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: disassociated due to inactivity
Tue Jan 19 01:17:45 2021 daemon.info hostapd: wlan0: STA 8c:f5:a3:a5:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Tue Jan 19 01:18:40 2021 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED a4:71:74:e9:90:33
Tue Jan 19 01:18:40 2021 daemon.info hostapd: wlan1: STA a4:71:74:e9:90:33 IEEE 802.11: disassociated due to inactivity
Tue Jan 19 01:18:41 2021 daemon.info hostapd: wlan1: STA a4:71:74:e9:90:33 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.668452] netifd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.675518] CPU: 1 PID: 2103 Comm: netifd Not tainted 5.4.89 #0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.681285] Stack : 00000001 00000000 800a4598 00000000 809119f8 854cdc1c 00000000 00000000
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.689637]         80b00000 87fa422c 8096876f 80892f9c 00000001 00000001 854cdbc0 920c484d
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.697994]         00000000 00000000 81140000 00000000 00000030 00000000 20352e34 2e383920
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.706348]         00000197 2c800000 00000000 000a4ebe 80000000 00000000 00000001 809a0000
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.714705]         809a8848 00000000 000002e6 00000000 00000002 00000010 00000004 80b00004
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.723062]         ...
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.725501] Call Trace:
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.727957] [<800107a0>] show_stack+0x30/0x100
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.732404] [<806ac0c8>] dump_stack+0xe4/0x158
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.736856] [<80170a98>] dump_header+0x60/0x334
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.741367] [<801719c0>] oom_kill_process+0x298/0x2a0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.746417] [<801724f8>] out_of_memory+0x284/0x43c
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.751211] [<801c3ae8>] __alloc_pages_nodemask+0xabc/0x144c
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.756858] [<801c4494>] __get_free_pages+0x1c/0x64
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.761747] [<80278d38>] proc_pid_readlink+0x80/0x194
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.766792] [<801fd098>] vfs_readlink+0x144/0x154
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.771497] [<801f57f0>] do_readlinkat+0xd0/0x178
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.776198] [<8001a150>] syscall_common+0x34/0x58
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.781277] Mem-Info:
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227] active_anon:939 inactive_anon:311 isolated_anon:0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227]  active_file:353 inactive_file:386 isolated_file:0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227]  unevictable:0 dirty:0 writeback:0 unstable:0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227]  slab_reclaimable:1224 slab_unreclaimable:6850
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227]  mapped:412 shmem:426 pagetables:84 bounce:0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.783227]  free:4908 free_pcp:0 free_cma:0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.815311] Node 0 active_anon:3756kB inactive_anon:1244kB active_file:1452kB inactive_file:1524kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:1648kB dirty:0kB writeback:0kB shmem:1704kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.838038] Normal free:19600kB min:20480kB low:24576kB high:28672kB active_anon:3756kB inactive_anon:1244kB active_file:1452kB inactive_file:1524kB unevictable:0kB writepending:0kB present:131072kB managed:113320kB mlocked:0kB kernel_stack:584kB pagetables:336kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.865727] lowmem_reserve[]: 0 0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.869063] Normal: 397*4kB (UMEH) 483*8kB (UMEH) 313*16kB (UMEH) 110*32kB (UMEH) 42*64kB (UMH) 17*128kB (UMH) 3*256kB (UH) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 19612kB
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.884169] 1178 total pagecache pages
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.887894] 0 pages in swap cache
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.891167] Swap cache stats: add 0, delete 0, find 0/0
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.896384] Free swap  = 0kB
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.899267] Total swap = 0kB
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.902130] 32768 pages RAM
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.904848] 0 pages HighMem/MovableOnly
Tue Jan 19 11:49:52 2021 kern.warn kernel: [74374.908745] 4438 pages reserved
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.911898] Tasks state (memory values in pages):
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.916575] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.925238] [    669]    81   669      317       22    20480        0             0 ubusd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.933414] [    682]     0   682      235        9    16384        0             0 askfirst
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.941860] [    731]     0   731      261       12    12288        0             0 urngd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.950087] [   1693]   514  1693      326       99    20480        0             0 logd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.958196] [   1745]     0  1745      537       43    16384        0             0 rpcd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.966227] [   1942]     0  1942      287       11    16384        0             0 dropbear
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.974682] [   2040]     0  2040      898      306    20480        0             0 hostapd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.983024] [   2041]     0  2041      880      163    24576        0             0 wpa_supplicant
Tue Jan 19 11:49:52 2021 kern.info kernel: [74374.991985] [   2103]     0  2103      445      137    16384        0             0 netifd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.000256] [   2486]     0  2486      372      149    20480        0             0 odhcpd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.008530] [   2729]     0  2729      314      128    16384        0             0 crond
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.016709] [   2914]     0  2914      883      104    24576        0             0 uhttpd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.025040] [   2997]     0  2997      387       20    20480        0             0 dbus-daemon
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.033694] [   3043]     0  3043      372       68    20480        0             0 blockd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.041952] [   3727]     0  3727      314       12    12288        0             0 ntpd
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.050094] [   3835]     0  3835      267       94    16384        0             0 odhcp6c
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.058405] [   3859]     0  3859      313       10    12288        0             0 udhcpc
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.066688] [   5331]   453  5331      665      340    16384        0             0 dnsmasq
Tue Jan 19 11:49:52 2021 kern.info kernel: [74375.075022] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=dnsmasq,pid=5331,uid=453
Tue Jan 19 11:49:52 2021 kern.err kernel: [74375.087794] Out of memory: Killed process 5331 (dnsmasq) total-vm:2660kB, anon-rss:1352kB, file-rss:4kB, shmem-rss:4kB, UID:453 pgtables:16kB oom_score_adj:0
Tue Jan 19 11:49:58 2021 daemon.info dnsmasq[13540]: Connected to system UBus
Tue Jan 19 11:49:58 2021 daemon.info dnsmasq[13540]: started, version 2.82 cachesize 150
Tue Jan 19 11:49:58 2021 daemon.info dnsmasq[13540]: DNS service limited to local subnets
Tue Jan 19 11:49:58 2021 daemon.info dnsmasq[13540]: compile time options: IPv6 GNU-getopt no-DBus UBus no-i18n no-IDN DHCP no-DHCPv6 no-Lua TFTP no-conntrack no-ipset no-auth no-DNSSEC no-ID loop-detect inotify dumpfile
Tue Jan 19 11:49:58 2021 daemon.info dnsmasq[13540]: UBus support enabled: connected to system bus
Dermot Mc Donnell commented on 20.01.2021 00:17

netifd - 2021-01-09-c00c8335-1
dnsmasq - 2.82-10

I will rebuild tomorrow with dnsmasq - 2.83 and report back.

Dermot Mc Donnell commented on 21.01.2021 14:20

I did a build yesterday and flashed it, keeping the existing config.

It has been up now for 24 hours.

I will post again in a few day.

Dermot Mc Donnell commented on 24.01.2021 18:07

Now up 100 hours without a problem. I reduced free RAM by adding sources to adblock and the bthh5a remains stable.

It would appear dnsmasq 2.83 fixes the issue. Of course, the new dnsmasq spams syslog. A fix has just been pushed for that. I will rebuild and flash to make sure the OOM does not reappear.

Will post again in a few days.

Project Manager
Baptiste Jonglez commented on 24.01.2021 21:01

What makes you think netifd is causing the issue? In your log, no process seems to have high memory usage (RSS column).

I would say it looks more like a kernel memory leak.

Project Manager
Baptiste Jonglez commented on 24.01.2021 21:06

If this happens again, please provide the content of /proc/meminfo after the OOM.

Dermot Mc Donnell commented on 28.01.2021 12:43

The OOM has not reoccurred since I flashed the build dated 24 Jan despite my reducing the amount of free RAM by 50%.

I leave it to wise to decide if this issue is now resolved.

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing