FS#2666 - procd: reboot inside lxc container causes shutdown #7568
Comments
pgwipeout: The "Add a link" button has a maximum character limit, which truncated the reference link. Reference link: |
ynezz: Hi, I've just tried the following:
docker run --rm -it openwrtorg/rootfs:x86-64-19.07.0 reboot
and it works as expected: it doesn't hang, so lxd/lxc probably does it differently. I don't have any prior experience with lxc/lxd; can you provide the exact commands needed to reproduce it? |
ynezz: In other words, the following commands:
docker run --env DBGLVL=5 --rm -it openwrtorg/rootfs:x86-64-19.07.0
root@4c37be61eef4:/# logread -f &
provide the following debug log output:
Mon Jan 20 07:28:50 2020 daemon.notice procd: Triggering reboot
Mon Jan 20 07:28:50 2020 daemon.notice procd: Shutting down system with event 1234567
Mon Jan 20 07:28:50 2020 daemon.info procd: - shutdown -
Mon Jan 20 07:28:50 2020 daemon.notice procd: running /etc/rc.d/K* shutdown
Mon Jan 20 07:28:50 2020 daemon.notice procd: start /etc/rc.d/K10gpio_switch shutdown
Mon Jan 20 07:28:50 2020 daemon.notice procd: stop /etc/rc.d/K10gpio_switch shutdown - took 0.030491852s
Mon Jan 20 07:28:50 2020 daemon.notice procd: start /etc/rc.d/K50dropbear shutdown
Mon Jan 20 07:28:50 2020 daemon.notice procd: stop /etc/rc.d/K50dropbear shutdown - took 0.021862346s
Mon Jan 20 07:28:50 2020 authpriv.info dropbear[389]: Early exit: Terminated by signal
Mon Jan 20 07:28:50 2020 daemon.notice procd: start /etc/rc.d/K85odhcpd shutdown
Mon Jan 20 07:28:50 2020 daemon.notice procd: Instance dropbear::instance1 exit with error code 256 after 10 seconds
Mon Jan 20 07:28:50 2020 daemon.notice procd: Stop instance odhcpd::instance1
Mon Jan 20 07:28:50 2020 daemon.notice procd: Instance odhcpd::instance1 exit with error code 0 after 10 seconds
Mon Jan 20 07:28:50 2020 daemon.notice procd: stop /etc/rc.d/K85odhcpd shutdown - took 0.030896977s
Mon Jan 20 07:28:50 2020 daemon.notice procd: start /etc/rc.d/K89log shutdown
Mon Jan 20 07:28:50 2020 daemon.notice procd: Stop instance log::instance1
I would like to get the same logs from LXC container running under LXD. |
bjonglez: Are you running an OpenWrt LXC container on an OpenWrt host, or on a host running another distribution (Debian, Fedora...)? I can test an OpenWrt LXC container on an x86_64 host running Debian, but I want to make sure it's really the same setup as yours :) Also, how do you prepare your image? Usually I just dump openwrt-x86-64-rootfs-ext4.img to an LVM volume. |
bjonglez: To create a container:
cd /tmp/
wget http://downloads.openwrt.org/releases/19.07.0/targets/x86/64/openwrt-19.07.0-x86-64-generic-rootfs.tar.gz
Edit
# LXC 2: use "lxc.rootfs" instead
lxc.rootfs.path = /var/lib/lxc/openwrt/rootfs
Start container:
lxc-start -n openwrt
To debug if it doesn't work, add -F. To connect to the container:
lxc-attach -n openwrt --clear-env /bin/ash
To look at the current container state:
lxc-ls -f
When testing here, on an x86_64 host running Arch Linux and LXC 3.2.1, a reboot does indeed cause the container to shut down. |
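The manual check above (reboot inside the container, then look at its state) could be scripted roughly as follows. This is only a sketch: the function names, the `/sbin/reboot` path and the 10-second default wait are assumptions rather than anything from this thread, and it relies on `lxc-info -s` printing a line of the form `State:          RUNNING`.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Pull the state word (e.g. RUNNING, STOPPED) out of `lxc-info -s` style
# output read from stdin.
parse_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Map the state observed after a reboot attempt to a verdict.
reboot_outcome() {
    case "$1" in
        RUNNING) echo "rebooted" ;;
        STOPPED) echo "halted" ;;
        *)       echo "unknown" ;;
    esac
}

# Trigger a reboot inside container $1, wait $2 seconds (default 10, an
# arbitrary guess), then report what state the container ended up in.
check_reboot() {
    name="$1"
    # The attach session may be cut off by the reboot, so ignore its status.
    lxc-attach -n "$name" --clear-env -- /sbin/reboot || true
    sleep "${2:-10}"
    reboot_outcome "$(lxc-info -n "$name" -s | parse_state)"
}
```

With the container from the steps above, `check_reboot openwrt` would be expected to print `halted` when the bug is present. A `rebooted` result only means the container was running when checked, so the wait may need tuning.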
bjonglez: Hmm, couldn't get any useful log:
# lxc-attach -n openwrt --clear-env --set-var DBGLVL=5 /bin/ash
|
bjonglez: OK, DBGLVL=5 should go in /var/lib/lxc/openwrt/config:
...
lxc.environment = DBGLVL=5
Then I get some debug output:
# lxc-attach -n openwrt --clear-env /bin/ash
...
~ # logread -f &
~ # reboot
~ # Mon Jan 20 09:56:00 2020 daemon.notice procd: Triggering reboot
Mon Jan 20 09:56:00 2020 daemon.notice procd: Shutting down system with event 1234567
Mon Jan 20 09:56:00 2020 daemon.info procd: - shutdown -
Mon Jan 20 09:56:00 2020 daemon.notice procd: running /etc/rc.d/K* shutdown
Mon Jan 20 09:56:00 2020 daemon.notice procd: start /etc/rc.d/K10gpio_switch shutdown
Mon Jan 20 09:56:00 2020 daemon.notice procd: stop /etc/rc.d/K10gpio_switch shutdown - took 0.024061621s
Mon Jan 20 09:56:00 2020 daemon.notice procd: start /etc/rc.d/K50dropbear shutdown
Mon Jan 20 09:56:00 2020 authpriv.info dropbear[287]: Early exit: Terminated by signal
Mon Jan 20 09:56:00 2020 daemon.notice procd: stop /etc/rc.d/K50dropbear shutdown - took 0.015703745s
Mon Jan 20 09:56:00 2020 daemon.notice procd: Instance dropbear::instance1 exit with error code 256 after 10 seconds
Mon Jan 20 09:56:00 2020 daemon.notice procd: start /etc/rc.d/K85odhcpd shutdown
Mon Jan 20 09:56:00 2020 daemon.notice procd: Stop instance odhcpd::instance1
Mon Jan 20 09:56:00 2020 daemon.notice procd: Instance odhcpd::instance1 exit with error code 0 after 10 seconds
Mon Jan 20 09:56:00 2020 daemon.notice procd: stop /etc/rc.d/K85odhcpd shutdown - took 0.022317393s
Mon Jan 20 09:56:00 2020 daemon.notice procd: start /etc/rc.d/K89log shutdown
Mon Jan 20 09:56:00 2020 daemon.notice procd: Stop instance log::instance1
Failed to find log object: Not found
Failed to find log object: Not found
Failed to find log object: Not found
Failed to find log object: Invalid argument
|
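For repeated test runs, the config change described above could be applied with a small helper that only appends the line when it is missing. This is a sketch: `set_debug_level` is a hypothetical name, and it assumes a line-based config file like the one shown in this comment.

```shell
#!/bin/sh
# Sketch only: set_debug_level is a hypothetical helper.
# Append "lxc.environment = DBGLVL=<level>" to an LXC config file unless a
# DBGLVL environment line is already present.
#   $1: path to the container config file
#   $2: debug level to set
set_debug_level() {
    config="$1"
    level="$2"
    if ! grep -q '^lxc\.environment = DBGLVL=' "$config"; then
        printf 'lxc.environment = DBGLVL=%s\n' "$level" >> "$config"
    fi
}
```

For the setup in this thread that would be `set_debug_level /var/lib/lxc/openwrt/config 5`, followed by a container restart. An existing DBGLVL line is left untouched, so changing the level still means editing the file by hand.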
bjonglez: Petr, your patch https://gitlab.com/ynezz/openwrt-procd/commit/aa8689ccdff0124f3d477f86433496aeb2c49d24 fixes the issue; you can add my Tested-by. I've tested with both LXC 2.0.7 on Debian and LXC 3.2.1 on Arch Linux. |
pgwipeout: Confirmed your patch. Tested-by: Peter Geis <pgwipeout@gmail.com> |
pgwipeout:
armvirt64
openwrt-19.07-rc2
Create an LXC package from the openwrt-19.07-rc2 rootfs package.
Start an LXC container using the OpenWrt LXC package.
Initiate a reboot command from inside the container.
Expected result: Container reboots.
Actual result: Container halts.
Current version: openwrt-19.07-rc2.
When OpenWrt runs in an LXC container, initiating a reboot from inside OpenWrt shuts the container down instead of rebooting it.
This appears to have been caused by commit <832369078d818d19ab64051fdc8da9e06c90ad88> state: fix shutdown when running in a container (FS#2425).
Instead of calling reboot(reboot_event), procd detects that it is inside a container and exits PID 1, which halts the container.
There is a patch at [0] to resolve this, but it would likely break the original intention of the fix above.
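The reported behaviour can be summarised as a toy decision function. This is a shell model for illustration only, not procd's C source: `halt_action` and the labels it prints are hypothetical, and it covers only the behaviour described in this report, not the patch.

```shell
#!/bin/sh
# Toy model only: this is NOT procd's source code.
# halt_action is a hypothetical helper; the labels it prints are made up.
#   $1: "yes" if PID 1 believes it runs inside a container, otherwise "no"
halt_action() {
    if [ "$1" = "yes" ]; then
        # Reported behaviour: reboot() is skipped and PID 1 exits, so the
        # container manager sees the container stop.
        echo "exit-pid1"
    else
        # Outside a container, the reboot(reboot_event) path is taken.
        echo "call-reboot"
    fi
}
```

In this model `halt_action yes` prints `exit-pid1` whatever kind of event was requested, which is why a reboot request ends up looking like a shutdown from the host's point of view.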