OpenWrt/LEDE Project

  • Status Unconfirmed   Reopened
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Medium
  • Priority Very Low
  • Reported Version openwrt-19.07
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Juliusz Chroboczek - 14.01.2020
Last edited by Jo-Philipp Wich - 15.01.2020

FS#2734 - Opkg update fails although router has enough memory

19.07.0 running on WNDR3700v2, 64MB of memory. The machine is certainly not short on memory:

                total        used        free      shared  buff/cache   available
  Mem:          59560       26912       24108         956        8540       16564
  Swap:             0           0           0

Running “opkg update” a first time works fine. However, if I run “opkg update” a second time, it reports:

  Collected errors:
   * pkg_hash_add_from_file: Failed to open /var/opkg-lists/openwrt_routing: Out of memory.

After doing “rm /var/opkg-lists/openwrt_*”, everything works fine again.
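
The workaround above can be wrapped in a small helper. This is only a sketch: the function name `clear_lists_then` is made up for illustration, and it assumes the cached feed lists all match `openwrt_*` as in the command above.

```shell
# Hypothetical helper: delete the cached openwrt_* feed lists in a directory,
# then run whatever command follows the directory argument.
clear_lists_then() {
    lists_dir=$1
    shift
    # -f keeps rm quiet when no list has been downloaded yet
    rm -f "$lists_dir"/openwrt_*
    "$@"
}

# Intended use on the router (not run here):
#   clear_lists_then /var/opkg-lists opkg update
```

Files in the directory that do not match `openwrt_*` are left alone, which mirrors the manual command.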

Juliusz Chroboczek commented on 14.01.2020 14:38

Strace indicates that it's `fork` that fails:

  pipe([4, 5])                            = 4
  fork()                                  = -1 ENOMEM (Out of memory)

Juliusz Chroboczek commented on 14.01.2020 14:41

Enabling memory overcommit (which I'm not recommending should be the default) successfully works around the issue:

  sysctl -w vm.overcommit_memory=1
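
Before changing the setting, it can help to see which mode is active and how much memory the kernel has already committed. A minimal sketch, assuming a Linux /proc filesystem; the helper name `overcommit_mode_name` is hypothetical, and the three values are the ones described in the kernel's sysctl documentation.

```shell
# Hypothetical helper: map a vm.overcommit_memory value to its meaning.
overcommit_mode_name() {
    case "$1" in
        0) echo "heuristic" ;;   # kernel refuses only "obvious" overcommits
        1) echo "always" ;;      # every allocation request is granted
        2) echo "never" ;;       # strict accounting against CommitLimit
        *) echo "unknown" ;;
    esac
}

# Report the current mode and the commit accounting, if /proc is available.
if [ -r /proc/sys/vm/overcommit_memory ]; then
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "vm.overcommit_memory=$mode ($(overcommit_mode_name "$mode"))"
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo || true
fi
```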
Admin
Jo-Philipp Wich commented on 14.01.2020 16:34

I don't see a solution to this. Opkg was already heavily patched and modified to use less RAM, even introducing a rather slow multi-pass list parsing algorithm to avoid keeping the entire dependency graph in memory. The lists are just big and keep growing, so devices with 64MB RAM and lower are nearing the end of their usefulness.

Donk commented on 14.01.2020 16:45

Same issue here with an x86-64 image in VirtualBox with 128MB RAM.

Juliusz Chroboczek commented on 14.01.2020 17:48

Jo-Philipp, may I suggest that you reopen this report? Just because we don't see a solution yet doesn't mean the bug should be closed.

There are two points I'd like to add.

First of all, "opkg install" and "opkg list-installable" work while "opkg update" fails. Is opkg perhaps keeping two copies (the old and the new one) in memory at the same time?

Second, the failure is not due to lack of memory – it is due to fork failing due to the large amount of writeable pages. To me, this implies that either opkg should avoid forking (by linking with libz instead of running gzip – see gzip_fdopen in libbb/gzip.c), or fork before it allocates the data structures (and keep the pipe around for later usage), or use a custom memory allocator that marks the pages as unwriteable using mprotect.
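
To get a rough idea of how much private writable memory a process is carrying when it forks, one can sum the Private_Dirty fields of /proc/PID/smaps. This is only a proxy: the kernel's commit accounting is based on writable private mappings, not just on pages that have already been dirtied. The function name below is hypothetical, and smaps is assumed to be enabled in the kernel.

```shell
# Hypothetical helper: sum the Private_Dirty fields (in kB) of
# smaps-formatted text read from stdin.
sum_private_dirty_kb() {
    awk '/^Private_Dirty:/ { total += $2 } END { print total + 0 }'
}

# Example: inspect the current shell, if the kernel exposes smaps.
if [ -r "/proc/$$/smaps" ]; then
    echo "private dirty: $(sum_private_dirty_kb < "/proc/$$/smaps") kB"
fi
```

Running it against the PID of an opkg process just before the failing fork would show how much memory the child would nominally have to duplicate.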

Admin
Jo-Philipp Wich commented on 15.01.2020 06:58

Sure, will reopen it if you prefer.

Juliusz Chroboczek commented on 16.01.2020 13:36

There might be a memory leak.

A first run of opkg update indicates all memory has been freed; running it a second time, though, indicates a leak of almost 4MB:

  ==6835==
  ==6835== HEAP SUMMARY:
  ==6835==     in use at exit: 3,674,125 bytes in 71,915 blocks
  ==6835==   total heap usage: 164,305 allocs, 92,390 frees, 27,717,132 bytes allocated
  ==6835==
  ==6835== 920 (144 direct, 776 indirect) bytes in 9 blocks are definitely lost in loss record 5 of 19
  ==6835==    at 0x483577F: malloc (vg_replace_malloc.c:309)
  ==6835==    by 0x122C91: xmalloc (xfuncs.c:30)
  ==6835==    by 0x11C0BA: parse_alternatives (pkg_parse.c:185)
  ==6835==    by 0x11C0BA: pkg_parse_line (pkg_parse.c:212)
  ==6835==    by 0x11E4BE: parse_from_stream_nomalloc (parse_util.c:162)
  ==6835==    by 0x11B86C: pkg_hash_add_from_file (pkg_hash.c:126)
  ==6835==    by 0x11BAC9: pkg_hash_load_feeds (pkg_hash.c:196)
  ==6835==    by 0x10FDBF: main (opkg-cl.c:433)
  ==6835==
  ==6835== 1,144 bytes in 47 blocks are definitely lost in loss record 6 of 19
  ==6835==    at 0x4837D7B: realloc (vg_replace_malloc.c:836)
  ==6835==    by 0x119D7F: parse_providelist (pkg_depends.c:629)
  ==6835==    by 0x11C4CD: pkg_parse_line (pkg_parse.c:296)
  ==6835==    by 0x11E4BE: parse_from_stream_nomalloc (parse_util.c:162)
  ==6835==    by 0x11B86C: pkg_hash_add_from_file (pkg_hash.c:126)
  ==6835==    by 0x11BAC9: pkg_hash_load_feeds (pkg_hash.c:196)
  ==6835==    by 0x10FDBF: main (opkg-cl.c:433)
  ==6835==
  ==6835== 31,056 (15,264 direct, 15,792 indirect) bytes in 477 blocks are definitely lost in loss record 11 of 19
  ==6835==    at 0x48356AF: malloc (vg_replace_malloc.c:308)
  ==6835==    by 0x4837DE7: realloc (vg_replace_malloc.c:836)
  ==6835==    by 0x11A069: parse_deplist (pkg_depends.c:816)
  ==6835==    by 0x11C705: pkg_parse_line (pkg_parse.c:321)
  ==6835==    by 0x11E4BE: parse_from_stream_nomalloc (parse_util.c:162)
  ==6835==    by 0x11B86C: pkg_hash_add_from_file (pkg_hash.c:126)
  ==6835==    by 0x11BAC9: pkg_hash_load_feeds (pkg_hash.c:196)
  ==6835==    by 0x10FDBF: main (opkg-cl.c:433)
  ==6835==
  ==6835== 792,670 (306,096 direct, 486,574 indirect) bytes in 4,057 blocks are definitely lost in loss record 17 of 19
  ==6835==    at 0x4837D7B: realloc (vg_replace_malloc.c:836)
  ==6835==    by 0x11A069: parse_deplist (pkg_depends.c:816)
  ==6835==    by 0x11C705: pkg_parse_line (pkg_parse.c:321)
  ==6835==    by 0x11E4BE: parse_from_stream_nomalloc (parse_util.c:162)
  ==6835==    by 0x11B86C: pkg_hash_add_from_file (pkg_hash.c:126)
  ==6835==    by 0x11BAC9: pkg_hash_load_feeds (pkg_hash.c:196)
  ==6835==    by 0x10FDBF: main (opkg-cl.c:433)
  ==6835==
  ==6835== 2,848,303 (599,456 direct, 2,248,847 indirect) bytes in 6,812 blocks are definitely lost in loss record 19 of 19
  ==6835==    at 0x4837B65: calloc (vg_replace_malloc.c:762)
  ==6835==    by 0x122D20: xcalloc (xfuncs.c:46)
  ==6835==    by 0x116B77: pkg_new (pkg.c:95)
  ==6835==    by 0x11B81E: pkg_hash_add_from_file (pkg_hash.c:121)
  ==6835==    by 0x11BAC9: pkg_hash_load_feeds (pkg_hash.c:196)
  ==6835==    by 0x10FDBF: main (opkg-cl.c:433)
  ==6835==
  ==6835== LEAK SUMMARY:
  ==6835==    definitely lost: 922,104 bytes in 11,402 blocks
  ==6835==    indirectly lost: 2,751,989 bytes in 60,511 blocks
  ==6835==      possibly lost: 0 bytes in 0 blocks
  ==6835==    still reachable: 32 bytes in 2 blocks
  ==6835==         suppressed: 0 bytes in 0 blocks

Timoty commented on 02.02.2020 10:52

I have the same problem, but only on the Asus RT-N11P and only with 19.07.x. 18.06 is not affected.

fnarfbargle commented on 08.03.2020 13:58

Can confirm this on a WNDR3700v1 with 19.07.1

root@Baravan:~# free
              total        used        free      shared  buff/cache   available
Mem:          59372       25364       26264        1196        7744       18100
Swap:             0           0           0
root@Baravan:~# opkg update
Collected errors:
 * pkg_hash_add_from_file: Failed to open /var/opkg-lists/openwrt_routing: Out of memory.

So 26M free. Not like it's out of memory. The decompressed contents of /var/opkg-lists total some 3.6M:

root@Baravan:/tmp/opkg-lists# rm *.sig ; for i in * ; do cat $i | gzip -d > $i.raw ; done
root@Baravan:/tmp/opkg-lists# du -hcs *.raw
200.0K	openwrt_base.raw
352.0K	openwrt_core.raw
804.0K	openwrt_luci.raw
1.9M	openwrt_packages.raw
48.0K	openwrt_routing.raw
328.0K	openwrt_telephony.raw
3.6M	total
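
The same total can be obtained without writing the .raw files into /tmp (which lives in RAM on these devices) by piping each list through `wc -c`. A sketch, with a made-up function name; it assumes the lists are plain gzip files and skips anything that is not, such as the .sig files.

```shell
# Hypothetical helper: print the total decompressed size, in bytes, of all
# gzip-compressed files in a directory. Non-gzip files are skipped.
total_decompressed_bytes() {
    dir=$1
    total=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        gzip -t "$f" 2>/dev/null || continue
        size=$(gzip -dc "$f" | wc -c)
        total=$((total + size))
    done
    echo "$total"
}

# Intended use on the router (not run here):
#   total_decompressed_bytes /var/opkg-lists
```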

darkpenguin commented on 24.04.2020 23:01

If you are running "opkg update" for the first time, or if you delete /tmp/opkg-*, then it runs fine with only 2 MB (or even less; I did not try beyond that). If lists have already been downloaded, then apparently no amount of memory is enough (10M is certainly not enough). (We could just make it ignore previously downloaded lists, and that would fix it...)

"opkg list" should only list the names and not do anything more complex. It also runs fine when there are no downloaded lists. That could be attributed to having less text to display, but it seems unlikely that 10M is not enough to display some text without any checking.

Confirmed on:
Mikrotik RB941-2nD (Atheros QCA9531), both 19.07.2 and 18.06.8 (except on 18.06.8, "opkg update" worked multiple times, but not "opkg list")
TP-Link TL-WR940N v6 (Atheros TP9343), 18.06.8

Nelson commented on 19.05.2020 22:48

19.07.3 does not fix this problem on the Asus RT-N11P. The bug is still there: "opkg list-upgradable" still fails with "Out of memory". Could whoever maintains MT7620N please take a look?

Benjamin Réveillé commented on 07.06.2020 17:11

19.07.3 has brought progress on my TP-Link re450v1: no more "Out of memory" with opkg update.

However, as stated in the previous post, opkg list-upgradable fails with "Out of memory":

root@re450:~# opkg list-upgradable
Collected errors:
 * pkg_hash_add_from_file: Failed to open /var/opkg-lists/openwrt_routing: Out of memory.

strace shows :

--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14609, si_uid=0, si_status=0, si_utime=19, si_stime=49} ---
wait4(14609, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 14609
rt_sigaction(SIGPIPE, {sa_handler=SIG_DFL, sa_mask=[RT_68 RT_70 RT_71 RT_73 RT_74 RT_75 RT_77 RT_79 RT_81 RT_82 RT_83 RT_84 RT_86 RT_87 RT_88 RT_89 RT_90 RT_91 RT_93 RT_94 RT_95], sa_flags=SA_RESTORER, sa_restorer=NULL}, NULL, 16) = 0
stat64("/var/opkg-lists/openwrt_routing", {st_mode=S_IFREG|0644, st_size=11492, ...}) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[RT_68 RT_70 RT_71 RT_73 RT_74 RT_75 RT_77 RT_79 RT_81 RT_82 RT_83 RT_84 RT_86 RT_87 RT_88 RT_89 RT_90 RT_91 RT_93 RT_94 RT_95], sa_flags=SA_RESTORER, sa_restorer=NULL}, {sa_handler=SIG_DFL, sa_mask=[RT_72 RT_73 RT_74 RT_77 RT_79 RT_80 RT_83 RT_85 RT_86 RT_87 RT_88 RT_89 RT_90 RT_91 RT_93 RT_94 RT_95], sa_flags=SA_RESTORER, sa_restorer=NULL}, 16) = 0
pipe([4, 5])                            = 4
fork()                                  = -1 ENOMEM (Out of memory)
open("/tmp/opkg-BKcjeL", O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_DIRECTORY) = 6
fcntl64(6, F_SETFD, FD_CLOEXEC)         = 0
fchdir(6)                               = 0
getdents64(6, /* 2 entries */, 2048)    = 48
getdents64(6, /* 0 entries */, 2048)    = 0
chdir("..")                             = 0
rmdir("/tmp/opkg-BKcjeL")               = 0
close(6)                                = 0
fcntl64(3, F_SETLK64, {l_type=F_UNLCK, l_whence=SEEK_CUR, l_start=0, l_len=0}) = 0
close(3)                                = 0
unlink("/var/lock/opkg.lock")           = 0
writev(2, [{iov_base="", iov_len=0}, {iov_base="Collected errors:\n", iov_len=18}], 2Collected errors:
) = 18
writev(2, [{iov_base=" * ", iov_len=3}, {iov_base="pkg_hash_add_from_file: Failed t"..., iov_len=87}], 2 * pkg_hash_add_from_file: Failed to open /var/opkg-lists/openwrt_routing: Out of memory.
) = 90
writev(2, [{iov_base="", iov_len=0}, {iov_base=NULL, iov_len=0}], 2) = 0
exit_group(-1)                          = ?
+++ exited with 255 +++

whereas

root@re450:~# ls -lrt /var/opkg-lists/openwrt_routing
-rw-r--r--    1 root     root         11492 Jun  7 19:03 /var/opkg-lists/openwrt_routing

and

root@re450:~# free
              total        used        free      shared  buff/cache   available
Mem:          59640       32064       23696        1104        3880       13232
Swap:             0           0           0

Andrey commented on 10.09.2020 00:18

19.07.4 and ASUS RT-N11P.
"Out of memory" after opkg list-upgradable in terminal.

Project Manager
Baptiste Jonglez commented on 13.09.2020 12:10

To get an idea of memory usage, run opkg with "time -v". Then look at "Maximum resident set size". That being said, I don't know if it accounts for forked processes.

Here is memory usage with 19.07.4 on a TL-WDR4300 v1 (ath79):

First opkg update takes 3.76 MB

Second opkg update takes 3.76 MB

opkg list takes 14.6 MB

opkg list-upgradable takes 56.3 MB

opkg list-installed takes 3.76 MB

So, jow's optimizations are quite good at reducing memory usage for update and list. Since list-upgradable has not been changed recently, it's not surprising it still takes a large amount of memory.
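
For anyone who wants to repeat these measurements, here is a sketch of how the figure can be pulled out of the "time -v" output and compared with total RAM. Both function names are hypothetical, and the parsing assumes the "Maximum resident set size (kbytes): N" line format printed by GNU time.

```shell
# Hypothetical helper: extract the peak RSS in kB from `time -v` output on stdin.
max_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { print $2 }'
}

# Hypothetical helper: express a size in kB as a whole-number percentage
# of total memory in kB (integer division, rounds down).
percent_of_total() {
    echo $(( $1 * 100 / $2 ))
}

# Intended use on the router (not run here); `time -v` writes to stderr:
#   time -v opkg list-upgradable 2>&1 >/dev/null | max_rss_kb
```

On a device with roughly 59 MB of usable RAM, like the ones reported earlier in this thread, a 56.3 MB peak would be well over 90% of total memory, which fits with fork() being refused at that point.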

Karlito commented on 15.09.2020 10:57

It's already been nine months since 19.07.0 was released. You have not done ANYTHING to fix the problem. Moreover, you have already started to drop support for 18.06, where this problem does not exist. When you completely stop supporting 18.06, what do we do then? Throw out fully working routers, or become part of some botnet, without security updates? Do you understand that most people use budget models of routers with OpenWRT? Then how are you better than those technology companies that produce these routers and drop their support in 1.5-2 years? In some countries, people have wages of $100-200 per month (sometimes less) and they do not have money for good routers.

PS: Have you ever wondered why "free" operating systems have a market share within one percent? That's because of these things. Due to the large number of small and huge bugs in these OS. You have a huge list of supported devices. But how many routers on this list actually work without any bugs?
