====System:====
* Device: Linksys WRT3200ACM //(Note: problem is most likely not device-specific)//
* OpenWrt/LEDE: lede-17.01 and Git commit 359273d (both tested)
* Kernel: Custom built from https://github.com/openwrt/openwrt.git, both 4.9 and 4.14 versions
====Problem:====
When deleting a kernel network namespace, a kworker thread hangs indefinitely waiting for the loopback device inside the namespace to be released. This prevents the creation of any additional network namespaces until the system is rebooted. It affects any software that uses network namespaces, such as LXC (which I was experimenting with when I first encountered this problem). LXC containers could only be started once: restarting a container, or starting another one after a container had been stopped, would hang.
====Steps to reproduce:====
- Build kernel with namespace support including network namespaces:
CONFIG_KERNEL_NAMESPACES=y
CONFIG_KERNEL_UTS_NS=y
CONFIG_KERNEL_IPC_NS=y
CONFIG_KERNEL_USER_NS=y
CONFIG_KERNEL_PID_NS=y
CONFIG_KERNEL_NET_NS=y
- Build BusyBox with the unshare utility (Linux System Utilities):
CONFIG_BUSYBOX_CONFIG_UNSHARE=y
- Boot into the system, then create and immediately delete a network namespace with:
root@box:~# unshare -n true
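Before flashing, it may help to confirm that the build configuration really contains the options listed above. The following is a minimal sketch, assuming the options end up in a plain-text config file such as the buildroot's `.config`; the function name `missing_options` and the variable `REQUIRED_OPTIONS` are hypothetical helpers, not part of OpenWrt.

```shell
#!/bin/sh
# Options named in the reproduction steps above.
REQUIRED_OPTIONS="CONFIG_KERNEL_NAMESPACES CONFIG_KERNEL_UTS_NS CONFIG_KERNEL_IPC_NS CONFIG_KERNEL_USER_NS CONFIG_KERNEL_PID_NS CONFIG_KERNEL_NET_NS CONFIG_BUSYBOX_CONFIG_UNSHARE"

# missing_options FILE
# Hypothetical helper: prints every required option that is NOT set to "y"
# in FILE, one per line. Prints nothing if all options are enabled.
missing_options() {
    config_file="$1"
    for opt in $REQUIRED_OPTIONS; do
        # -x: match the whole line, -F: fixed string, -q: no output
        grep -qxF "${opt}=y" "$config_file" || echo "$opt"
    done
}
```

Calling `missing_options .config` from the build directory (path assumed) should print nothing when everything needed for the reproduction is enabled.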
====Observed symptoms:====
root@box:~# ps | grep kworker
shows a kworker thread lingering in D state:
450 root 0 DW [kworker/u4:3]
After a while,
root@box:~# dmesg
shows:
[ 114.596437] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 124.728977] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 134.881391] unregister_netdevice: waiting for lo to become free. Usage count = 1
The message is repeated indefinitely every 10 seconds.
No additional network namespaces can be created; another unshare -n will hang indefinitely.
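The two symptoms above can also be checked mechanically. This is a rough sketch that only filters text on standard input; the function names `count_lo_waits` and `blocked_kworkers` are hypothetical, and the second one assumes the BusyBox `ps` column layout (PID, USER, VSZ, STAT, COMMAND) seen in the output above.

```shell
#!/bin/sh
# count_lo_waits: reads kernel log text on stdin and prints how many
# "waiting for lo to become free" messages it contains.
count_lo_waits() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c 'unregister_netdevice: waiting for lo to become free' || true
}

# blocked_kworkers: reads `ps` output on stdin and prints the PID of every
# kworker thread whose state starts with D (uninterruptible sleep).
# Assumes STAT is the 4th column and COMMAND the 5th, as in BusyBox ps.
blocked_kworkers() {
    awk '$4 ~ /^D/ && $5 ~ /^\[kworker/ { print $1 }'
}
```

On an affected box, `dmesg | count_lo_waits` should keep growing over time, and `ps | blocked_kworkers` should list the stuck thread.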
====Possible cause:====
After poking in the dark for quite some time, [[https://forum.turris.cz/t/turris-os-3-9-1-is-out-in-rc-with-a-number-of-fixes/5918/25|I found this post]] by //HomerSp// in the Turris OS forum. After the kernel developers [[https://bugzilla.kernel.org/show_bug.cgi?id=198189|implied that the problem was due to a patch in OpenWrt]] during his first assessment, he finally tracked it down to a missing in6_dev_put() in [[https://github.com/openwrt/openwrt/blob/master/target/linux/generic/pending-4.14/670-ipv6-allow-rejecting-with-source-address-failed-policy.patch|670-ipv6-allow-rejecting-with-source-address-failed-policy.patch]].
====Solution:====
This is my patch (derived from [[https://gist.github.com/HomerSp/8ed5d5b7dcd4175a2fa3351577416a1b|HomerSp's complete modified patch]]), which I applied to my kernel code after all other patches:
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3860,6 +3860,7 @@ static int ip6_route_dev_notify(struct notifier_block *this,
in6_dev_put_clear(&net->ipv6.ip6_null_entry->rt6i_idev);
#ifdef CONFIG_IPV6_MULTIPLE_TABLES
in6_dev_put_clear(&net->ipv6.ip6_prohibit_entry->rt6i_idev);
#endif
}
//Note: the line numbers probably do not match, since my repo also contains some other recent upstream patches I tested while trying to solve the problem. You will probably want to fix the patch file itself instead of "patching after the patch" like I did.//
Since the solution doesn't seem to have made it back into the OpenWrt code, I took the liberty of reporting it here to bring the problem to your attention. Although I can't say anything about its correctness, it solves the problem, at least for me.