FS#1426 - Probable inet6_dev refcount leak introduced by OpenWrt-specific patch #6536

Closed
openwrt-bot opened this issue Mar 10, 2018 · 1 comment
@openwrt-bot

Gwani:

====System:====
* Device: Linksys WRT3200ACM //(Note: problem is most likely not device-specific)//
* OpenWrt/LEDE: lede-17.01 and Git commit 359273d (both tested)
* Kernel: Custom built from https://github.com/openwrt/openwrt.git both 4.9 and 4.14 versions
====Problem:====
When deleting a kernel network namespace, a kworker thread hangs indefinitely waiting for the loopback device inside the namespace to be released. This prevents the creation of any additional network namespaces until the system is rebooted. It affects any software that uses network namespaces, such as LXC (which I was experimenting with when I first encountered this problem). LXC containers could only be started once: restarting a container, or starting another one after a container had been stopped, would hang.
====Steps to reproduce:====
- Build kernel with namespace support including network namespaces:
  CONFIG_KERNEL_NAMESPACES=y
  CONFIG_KERNEL_UTS_NS=y
  CONFIG_KERNEL_IPC_NS=y
  CONFIG_KERNEL_USER_NS=y
  CONFIG_KERNEL_PID_NS=y
  CONFIG_KERNEL_NET_NS=y

- Build BusyBox with the unshare utility (Linux System Utilities):
  CONFIG_BUSYBOX_CONFIG_UNSHARE=y
- Boot into system, create and immediately delete a network namespace with:
  root@box:~# unshare -n true
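The reproduction steps above could be wrapped in a small check that does not itself get stuck. This is only a sketch: `netns_recreate_check` and the `NETNS_CMD` variable are hypothetical names introduced here (they are not part of OpenWrt or BusyBox), and the 5 second default grace period is an arbitrary assumption. The second namespace is created in the background, so a hang shows up as a process that is still alive after the grace period instead of blocking the shell.

```shell
#!/bin/sh
# Sketch only: NETNS_CMD and netns_recreate_check are hypothetical helpers.
# With the default command this must run as root on the box under test.
NETNS_CMD="${NETNS_CMD:-unshare -n true}"

# Exit status:
#   0  a second namespace could still be created
#   1  the second attempt was still running after the grace period
#   2  even the first attempt failed (no root, no NET_NS support, ...)
netns_recreate_check() {
    grace="${1:-5}"

    # First namespace: created and torn down right away.
    $NETNS_CMD || return 2

    # Second attempt runs in the background so that a hang cannot
    # block this script. On an affected system it is left behind.
    $NETNS_CMD &
    pid=$!
    sleep "$grace"

    if kill -0 "$pid" 2>/dev/null; then
        echo "second attempt (pid $pid) still blocked after ${grace}s"
        return 1
    fi
    echo "namespace creation still works"
    return 0
}
```

On the device this would be called as `netns_recreate_check 15`. A status of 1 matches the behaviour described in this report, but it is not proof of the refcount leak by itself, so the kernel log should still be checked.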
====Observed symptoms:====

  • root@box:~# ps | grep kworker shows a kworker thread lingering in D state:
    450 root 0 DW [kworker/u4:3]
  • After a while, root@box:~# dmesg shows:
    [ 114.596437] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 124.728977] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 134.881391] unregister_netdevice: waiting for lo to become free. Usage count = 1
    The message is repeated indefinitely every 10 seconds.
  • No additional network namespaces can be created; another unshare -n will hang indefinitely.
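The first two symptoms can be checked mechanically. The following is a minimal sketch with two hypothetical helper functions (`count_lo_waits` and `dstate_kworkers` are names made up for this example). It assumes the BusyBox `ps` column layout seen above (PID, USER, VSZ, STAT, COMMAND) and the exact kernel message wording quoted in this report.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Count "waiting for lo to become free" messages in a kernel log
# read from stdin. Prints 0 if there are none.
count_lo_waits() {
    grep -c 'unregister_netdevice: waiting for lo to become free' || :
}

# Print the PIDs of kworker threads in uninterruptible sleep (state D).
# Expects BusyBox-style `ps` output on stdin: PID USER VSZ STAT COMMAND.
dstate_kworkers() {
    awk '$4 ~ /^D/ && $5 ~ /kworker/ { print $1 }'
}
```

Typical use on the device would be `dmesg | count_lo_waits` and `ps | dstate_kworkers`. A count that keeps growing, together with at least one PID from the second command, corresponds to the symptoms listed above.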
====Possible cause:====
After poking in the dark for quite some time, [[https://forum.turris.cz/t/turris-os-3-9-1-is-out-in-rc-with-a-number-of-fixes/5918/25|I found this post]] by //HomerSp// on the Turris OS forum. After the kernel developers [[https://bugzilla.kernel.org/show_bug.cgi?id=198189|implied that the problem was due to a patch in OpenWrt]] during his first assessment, he finally tracked it down to a missing in6_dev_put() in [[https://github.com/openwrt/openwrt/blob/master/target/linux/generic/pending-4.14/670-ipv6-allow-rejecting-with-source-address-failed-policy.patch|670-ipv6-allow-rejecting-with-source-address-failed-policy.patch]].

====Solution:====
This is my patch, derived from [[https://gist.github.com/HomerSp/8ed5d5b7dcd4175a2fa3351577416a1b|HomerSp's complete modified patch]], which I applied to my kernel code after all other patches:
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3860,6 +3860,7 @@ static int ip6_route_dev_notify(struct notifier_block *this,
 		in6_dev_put_clear(&net->ipv6.ip6_null_entry->rt6i_idev);
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 		in6_dev_put_clear(&net->ipv6.ip6_prohibit_entry->rt6i_idev);
+		in6_dev_put(net->ipv6.ip6_policy_failed_entry->rt6i_idev);
 		in6_dev_put_clear(&net->ipv6.ip6_blk_hole_entry->rt6i_idev);
 #endif
 	}

//Note: the line numbers probably do not match, since my repo also contains some other recent upstream patches I tested while trying to solve the problem. You will probably want to fix the patch file itself instead of "patching after the patch" like I did.//

Since the solution doesn't seem to have made it back into the OpenWrt code, I took the liberty of reporting it here to bring the problem to your attention. Although I can't say anything about its correctness, it solves the problem at least for me.

@openwrt-bot

jow-:

This was fixed in master with https://git.openwrt.org/58f7b5b96c ("kernel: add missing in6_dev_put_clear call to an ipv6 network patch")
