FS#2723 - After libubus/ubox: procd - failed to open/remove pidfile and/or crashing #7567

Closed
openwrt-bot opened this issue Jan 10, 2020 · 4 comments

Pepe:

The problem occurs on Turris Omnia (mvebu) and Turris MOX (aarch64),
running OpenWrt 19.07 and the latest OpenWrt master branch.

How to reproduce it?

Master:
# opkg list-installed | grep ubus
libubus - 2020-01-05-d35df8ad-1.0
libubus-lua - 2020-01-05-d35df8ad-1.0
python3-ubus - 0.1-3.8-1.7
ubus - 2020-01-05-d35df8ad-1.0
ubusd - 2020-01-05-d35df8ad-1.0

# opkg list-installed | grep libubox
libubox - 2019-12-28-cd75136b-1.0

Find any package whose init script sets a hard-coded pidfile path with:
procd_set_param pidfile
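
One way to locate candidate packages is to search the installed init scripts for that call. The following is a minimal sketch, not part of the original report: the function name `find_pidfile_scripts` is hypothetical, and it assumes the init scripts live in /etc/init.d unless another directory is passed.

```shell
#!/bin/sh
# Hypothetical helper: list init scripts that set a pidfile through procd.
# $1 is the directory to scan; it defaults to /etc/init.d (an assumption).
find_pidfile_scripts() {
    init_dir="${1:-/etc/init.d}"
    # -l prints only the names of matching files.
    # The leading anchor skips lines where the call is commented out.
    grep -lE '^[[:space:]]*procd_set_param[[:space:]]+pidfile' "$init_dir"/* 2>/dev/null
}
```

Calling `find_pidfile_scripts` with no argument prints one path per matching script; any of them should be a candidate for the start/stop test described here.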

E.g. install the i2pd package and run: /etc/init.d/i2pd start
Then stop it: /etc/init.d/i2pd stop

And here you go, the router crashes:
[ 587.264257] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 587.264257]
[ 587.273419] CPU1: stopping
[ 587.276135] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.93 #0
[ 587.282153] Hardware name: Marvell Armada 380/385 (Device Tree)
[ 587.288098] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[ 587.295862] [] (show_stack) from [] (dump_stack+0x94/0xa8)
[ 587.303103] [] (dump_stack) from [] (handle_IPI+0xf0/0x19c)
[ 587.310435] [] (handle_IPI) from [] (gic_handle_irq+0x8c/0x90)
[ 587.318024] [] (gic_handle_irq) from [] (__irq_svc+0x6c/0x90)
[ 587.325523] Exception stack(0xef09bf60 to 0xef09bfa8)
[ 587.330587] bf60: 00000000 001bda5c ef6e12c4 c0114fc0 ffffe000 c0d03c6c c0d03cac 00000002
[ 587.338783] bf80: 00000000 c0d03c48 c0c4d628 00000000 2ea95000 ef09bfb0 c0108644 c0108648
[ 587.346977] bfa0: 60000013 ffffffff
[ 587.350478] [] (__irq_svc) from [] (arch_cpu_idle+0x34/0x38)
[ 587.357898] [] (arch_cpu_idle) from [] (do_idle+0xf4/0x1f0)
[ 587.365226] [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c)
[ 587.372814] [] (cpu_startup_entry) from [<001023ec>] (0x1023ec)
[ 587.380295] Rebooting in 3 seconds..

While investigating this, we were able to reproduce a second issue with the fosquitto package, which is not available in the OpenWrt feeds. After commenting out procd_set_param pidfile in the init script and restarting the process, we see these lines in the system log:

Jan 10 17:32:51 turris procd: Failed to removed pidfile: ������t.d/fosquitto: No such file or directory
Jan 10 17:32:51 turris procd: failed to open pidfile for writing: ������t.d/fosquitto: No such file or directory

I'm sure that this can be reproduced with other packages from the packages feed as well.

This happens after the latest changes in libubox and ubus. Anyway, it was not a good idea to include the latest changes in libubox and ubus 2 days before tagging OpenWrt 19.07. Reverting libubox to 2019-10-29 and ubus to 2018-10-06 helps for both issues.
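
To check whether a given device runs versions newer than the ones reverted to here, the date prefix of the opkg version string can be compared. This is only a sketch under assumptions: the function names `version_date` and `is_newer_than` are hypothetical, and it assumes version strings start with YYYY-MM-DD as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print the leading YYYY-MM-DD of a version string
# such as "2020-01-05-d35df8ad-1.0" as YYYYMMDD; prints nothing if absent.
version_date() {
    echo "$1" | sed -n 's/^\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\).*/\1\2\3/p'
}

# Hypothetical helper: succeed if version $1 is dated later than version $2.
is_newer_than() {
    installed=$(version_date "$1")
    reference=$(version_date "$2")
    # Both dates must parse before the numeric comparison is attempted.
    [ -n "$installed" ] && [ -n "$reference" ] && [ "$installed" -gt "$reference" ]
}
```

For example, `is_newer_than "$(opkg list-installed | awk '$1 == "libubox" {print $3}')" 2019-10-29` would succeed on the master build listed above. A newer date only indicates that the package postdates the reverted version, not that the bug is present.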

Pepe:

I forgot to mention that the correct pidfile for fosquitto should be /var/run/fosquitto.pid. To reproduce the "No such file or directory" error, it helps to comment out procd_set_param stdout and procd_set_param stderr.

I cannot edit my own task; I don't have permission.

ynezz:

> Anyway, it was not a good idea to include the latest changes in libubox and ubus 2 days before tagging the OpenWrt 19.07.

Ok, noted, I'll try to find and fix the bugs faster next time :-)

> Reverting to libubox to 2019-10-29 and ubus to 2018-10-06 helps for both issues.

Really?

ynezz:

Fix sent for review https://patchwork.ozlabs.org/patch/1223292/

maddie:

I'm having the same issue here. I have stubby installed from the repo and v2ray from a third party, which has a hard-coded pidfile path. Both crash 9 out of 10 times on restart with the same kernel panic and the same "unable to remove pidfile" log line with a garbled pidfile path.

Is the fix going to be released as an opkg-upgradable package?

EDIT: This bug sometimes creates garbled files under the root directory /, and once it corrupted the filesystem and bricked the router; I had to re-flash the factory image to make it work again. So it might cause serious issues for some users.

My device is an R7800 running the latest 19.07.0 release.
