OpenWrt/LEDE Project

  • Status Closed
  • Percent Complete 100%
  • Task Type Bug Report
  • Category Base system
  • Assigned To Stijn Tintel
  • Operating System All
  • Severity High
  • Priority Medium
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Stijn Tintel - 20.07.2021
Last edited by Stijn Tintel - 04.11.2021

FS#3943 - ujail breaks after some uptime

On a device with procd-ujail installed that has been running for a while (I hit this today at 28 days of uptime), restarting dnsmasq leaves dnsmasq not running: only the ujail process is present. No errors appear on stdout/stderr during the restart, nor in syslog.

root@ar0:~# /etc/init.d/dnsmasq restart
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
Tue Jul 20 15:17:15 2021 user.notice dnsmasq: DNS rebinding protection is active, will discard upstream RFC1918 responses!
Tue Jul 20 15:17:15 2021 user.notice dnsmasq: Allowing 127.0.0.0/8 responses
Tue Jul 20 15:17:15 2021 user.notice dnsmasq: Allowing RFC1918 responses for domain plex.direct
root@ar0:~# ps aux | grep dnsmasq
root     21289  0.0  0.0   2088   872 ?        S    15:17   0:00 /sbin/ujail -n dnsmasq -u -l -r /dev/null -r /dev/urandom -r /etc/TZ -r /etc/dnsmasq.conf -r /etc/ethers -r /etc/group -r /etc/hosts -r /etc/passwd -r /sbin/hotplug-call -r /tftpboot -r /tmp/dnsmasq.d -r /tmp/etc/dnsmasq.conf.main -r /tmp/hosts/dhcp.main -r /usr/lib/dnsmasq/dhcp-script.sh -r /usr/share/dnsmasq/dhcpbogushostname.conf -r /usr/share/dnsmasq/rfc6761.conf -r /usr/share/dnsmasq/trust-anchors.conf -w /var/lib/dhcp.leases -w /var/run/dnsmasq/ -- /usr/sbin/dnsmasq -C /tmp/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
root     21455  0.0  0.0   1132   468 pts/1    S+   15:19   0:00 grep dnsmasq
root@ar0:~# ss -anput | grep dnsmasq
root@ar0:~#
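
The symptom in the listing above (a ujail wrapper for dnsmasq with no dnsmasq process beside it) could be checked for with a small filter. This is only a sketch: jail_without_daemon is a hypothetical helper name, it assumes `ps aux`-style output with the command path in column 11 as shown above, and it assumes that a healthy jailed dnsmasq shows up as its own /usr/sbin/dnsmasq process next to the ujail wrapper.

```shell
#!/bin/sh
# jail_without_daemon is a hypothetical helper for illustration only.
# It reads `ps aux`-style output on stdin (command path in column 11, as in
# the listing above) and succeeds only when a ujail wrapper named "dnsmasq"
# is present but no /usr/sbin/dnsmasq process is.
# Assumption: a healthy jailed dnsmasq appears as a separate process.
jail_without_daemon() {
    awk '
        $11 == "/sbin/ujail" && / -n dnsmasq / { jail = 1 }
        $11 == "/usr/sbin/dnsmasq"             { daemon = 1 }
        END { exit !(jail && !daemon) }
    '
}

# Example:
#   ps aux | jail_without_daemon && echo "dnsmasq jail is up, dnsmasq is not"
```

The trailing `grep dnsmasq` line from the listing is ignored because its command column is `grep`, not one of the two paths being matched.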

Commenting out the lines in the init script starting with procd_add_jail and then restarting the service solves the problem. The problem also does not occur when dnsmasq is started during boot.

I’ve seen this problem before and mentioned it a few times on IRC. The first time was in October 2020, before 21.02 was branched, so the problem very likely exists there as well.

I haven’t rebooted the system where I’m currently experiencing this; I commented out the procd_add_jail lines instead. Uncommenting those lines brings the problem back, so further investigation is possible.

This seems to be a general problem with ujail, as even a simple echo refuses to start:

# ujail -d1 -n blah -r /tmp -- /bin/echo test
jail: adding mount /tmp /tmp bind(1) ro(1) err(0)
jail: Using namespaces(0x28020000), capabilities(0), seccomp(0)
jail: adding mount /bin/echo /bin/echo bind(1) ro(1) err(1)
jail: adding mount /lib/ld-musl-x86_64.so.1 /lib/ld-musl-x86_64.so.1 bind(1) ro(1) err(1)
jail: adding library /lib/libgcc_s.so.1 (libgcc_s.so.1)
jail: adding library /lib/libc.so (libc.so)

The process hangs here until killed with kill -9.
Running in strace, the process hangs on epoll_pwait.
Backtrace with gdbserver:

#0  epoll_pwait (fd=3, ev=ev@entry=0x7ffff7f802c0 <events>, cnt=cnt@entry=10, to=1834444156, sigs=sigs@entry=0x0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007ffff7fa7ada in epoll_wait (fd=<optimized out>, ev=ev@entry=0x7ffff7f802c0 <events>, cnt=cnt@entry=10, to=<optimized out>)
    at src/linux/epoll.c:36
#2  0x00007ffff7f7805f in uloop_fetch_events (timeout=<optimized out>)
    at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop-epoll.c:73
#3  uloop_run_events (timeout=<optimized out>)
    at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop.c:170
#4  uloop_run_timeout (timeout=-1) at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop.c:555
#5  0x000055555555b915 in ?? ()
#6  0x00007ffff7f69bb8 in ?? ()
#7  0xffffffff01203ff2 in ?? ()
#8  0x00007ffff7ffd880 in ?? () from /home/stijn/Development/OpenWrt/openwrt/scripts/../staging_dir/target-x86_64_musl/root-x86/lib/ld-musl-x86_64.so.1
#9  0x00007ffff7fe30a0 in do_init_fini (queue=<optimized out>) at ldso/dynlink.c:1545
#10 0x00007ffff7ff80e0 in ?? () from /home/stijn/Development/OpenWrt/openwrt/scripts/../staging_dir/target-x86_64_musl/root-x86/lib/ld-musl-x86_64.so.1
#11 0x00007fffffffecb8 in ?? ()
#12 0x00007ffff7fa5d3c in libc_start_main_stage2 (main=0x6b77814f93720b1f, argc=1431674893, argv=0x55555555a00d) at src/env/__libc_start_main.c:94
#13 0x000055555555ba5b in ?? ()
#14 0x0000000000000008 in ?? ()
#15 0x00007fffffffeeca in ?? ()
#16 0x00007fffffffeed6 in ?? ()
#17 0x00007fffffffeed9 in ?? ()
#18 0x00007fffffffeede in ?? ()
#19 0x00007fffffffeee1 in ?? ()
#20 0x00007fffffffeee6 in ?? ()
#21 0x00007fffffffeee9 in ?? ()
#22 0x00007fffffffeef3 in ?? ()
#23 0x0000000000000000 in ?? ()
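
Since the hung ujail process only goes away with kill -9, a reproduction attempt can be wrapped in a deadline so that it does not block the shell. The following is a minimal sketch in plain POSIX shell: run_with_deadline is a hypothetical helper, not part of procd or ujail, and the 10 second limit in the example is an arbitrary choice.

```shell
#!/bin/sh
# run_with_deadline is a hypothetical helper, not part of procd or ujail.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    secs=$1
    shift

    # Start the command under test in the background.
    "$@" &
    pid=$!

    # Watchdog: after $secs seconds send SIGKILL, since the report notes
    # that the hung process only goes away with kill -9.
    ( sleep "$secs"; kill -9 "$pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog=$!

    # Collect the exit status; a SIGKILLed child yields 137 (128 + 9).
    wait "$pid"
    status=$?

    # Stop the watchdog if the command finished on its own.
    kill "$watchdog" 2>/dev/null
    wait "$watchdog" 2>/dev/null

    return "$status"
}

# Example, using the command from the report:
#   run_with_deadline 10 ujail -d1 -n blah -r /tmp -- /bin/echo test
#   [ "$?" -eq 137 ] && echo "ujail hung and was killed"
```

A status of 137 here means the deadline expired and the command was killed; a command that itself exits with 137 would be indistinguishable, which is acceptable for a quick reproduction check.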
Closed by Stijn Tintel
04.11.2021 01:27
Reason for closing: Fixed
Additional comments about closing:

https://git.openwrt.org/8802b21dffbc4b9834b23057e8f38c845a803ca7

wulfy23 commented on 08.10.2021 21:26

similar experience reported: https://forum.openwrt.org/t/least-privilege-runtime/99564/13?u=wulfy23

was also running dnsmasq-full when it failed to restart...
