FS#2814 - netns cleanup_net race condition #7612

Open
openwrt-bot opened this issue Feb 5, 2020 · 0 comments
Labels
flyspray kernel pull request/issue with Linux kernel related changes

jmarcet:

I'm seeing this issue on master built for x86-64. It has been happening since I started using this x86-64 machine around December.

[ 73.300163] =============================================================================
[ 73.309803] BUG kmalloc-32 (Not tainted): Object already free
[ 73.316694] -----------------------------------------------------------------------------

[ 73.328675] Disabling lock debugging due to kernel taint
[ 73.335167] INFO: Allocated in ops_init+0x6d/0x100 age=147 cpu=0 pid=11053
[ 73.343242] INFO: Freed in 0xffffffffa09c94ca age=36 cpu=2 pid=50
[ 73.350557] INFO: Slab 0x00000000ed33138b objects=36 used=11 fp=0x0000000017541cc8 flags=0xf700000000101
[ 73.361281] INFO: Object 0x0000000041bc5c4c @offset=1576 fp=0x000000007d23c440

[ 73.372490] Redzone 000000008c59cd8b: bb bb bb bb bb bb bb bb ........
[ 73.382449] Object 0000000041bc5c4c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 73.393029] Object 0000000073cabfc7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 73.403599] Redzone 000000009196639c: bb bb bb bb bb bb bb bb ........
[ 73.413574] Padding 000000009337e5e1: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
[ 73.423549] CPU: 2 PID: 50 Comm: kworker/u16:1 Tainted: G B 4.19.101 #0
[ 73.432698] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro4-M, BIOS P2.10 03/13/2018
[ 73.443617] Workqueue: netns cleanup_net
[ 73.448798] Call Trace:
[ 73.452495] dump_stack+0x57/0x7a
[ 73.457083] print_trailer+0x203/0x210
[ 73.462068] object_err+0x2f/0x36
[ 73.466613] free_debug_processing.cold.107+0x2c/0x11f
[ 73.472985] ? ops_free_list.part.14+0x4d/0x60
[ 73.478663] __slab_free+0x1bd/0x330
[ 73.483482] ? kmem_cache_free+0x1b5/0x1e0
[ 73.488778] kfree+0x11c/0x140
[ 73.493009] ops_free_list.part.14+0x4d/0x60
[ 73.498460] cleanup_net+0x1c4/0x280
[ 73.503215] process_one_work+0x1a8/0x340
[ 73.508384] worker_thread+0x2f/0x3a0
[ 73.513215] kthread+0x10b/0x130
[ 73.517619] ? process_one_work+0x340/0x340
[ 73.522967] ? kthread_create_worker_on_cpu+0x60/0x60
[ 73.529193] ret_from_fork+0x35/0x40
[ 73.534067] FIX kmalloc-32: Object at 0x0000000041bc5c4c not freed
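For anyone comparing several of these reports, here is a minimal Python sketch that pulls the key fields out of a SLUB debug report like the one above: the cache name, the allocation and free sites, and which poison patterns appear in the hex dumps. The byte values are the ones defined in the kernel's include/linux/poison.h (0x6b POISON_FREE, 0xa5 POISON_END, 0x5a POISON_INUSE, 0xbb SLUB_RED_INACTIVE, 0xcc SLUB_RED_ACTIVE). The function name `parse_slub_report` and the regular expressions are my own illustration, not part of any existing tool, and they only assume the line layout seen in this log.

```python
import re

# Poison/redzone byte values as defined in include/linux/poison.h.
POISON_NAMES = {
    0x5A: "POISON_INUSE",
    0x6B: "POISON_FREE",
    0xA5: "POISON_END",
    0xBB: "SLUB_RED_INACTIVE",
    0xCC: "SLUB_RED_ACTIVE",
}

# Hypothetical patterns, written against the layout of the report above.
BUG_RE = re.compile(r"BUG (\S+) \(([^)]*)\): (.+)")
SITE_RE = re.compile(r"INFO: (Allocated|Freed) in (\S+) age=(\d+) cpu=(\d+) pid=(\d+)")
DUMP_RE = re.compile(r"(Redzone|Object|Padding) [0-9a-f]+: ((?:[0-9a-f]{2} )+)")

SAMPLE = """\
[ 73.309803] BUG kmalloc-32 (Not tainted): Object already free
[ 73.335167] INFO: Allocated in ops_init+0x6d/0x100 age=147 cpu=0 pid=11053
[ 73.343242] INFO: Freed in 0xffffffffa09c94ca age=36 cpu=2 pid=50
[ 73.372490] Redzone 000000008c59cd8b: bb bb bb bb bb bb bb bb ........
[ 73.382449] Object 0000000041bc5c4c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 73.393029] Object 0000000073cabfc7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 73.403599] Redzone 000000009196639c: bb bb bb bb bb bb bb bb ........
[ 73.413574] Padding 000000009337e5e1: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
"""


def parse_slub_report(log):
    """Extract cache name, alloc/free sites and poison patterns from a report."""
    report = {"cache": None, "message": None, "sites": {}, "poison": {}}
    for line in log.splitlines():
        m = BUG_RE.search(line)
        if m:
            report["cache"] = m.group(1)
            report["message"] = m.group(3).strip()
            continue
        m = SITE_RE.search(line)
        if m:
            report["sites"][m.group(1)] = {
                "symbol": m.group(2),
                "cpu": int(m.group(4)),
                "pid": int(m.group(5)),
            }
            continue
        m = DUMP_RE.search(line)
        if m:
            # Name each dumped byte; anything that is not a known
            # poison value is reported as plain "data".
            names = report["poison"].setdefault(m.group(1), set())
            for byte in m.group(2).split():
                names.add(POISON_NAMES.get(int(byte, 16), "data"))
    return report


if __name__ == "__main__":
    print(parse_slub_report(SAMPLE))
```

On the sample, the object bytes are entirely POISON_FREE followed by POISON_END and the redzones are SLUB_RED_INACTIVE, which is what one would expect for an object that had already been freed when `kfree` was called on it again.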

There are old closed reports about it [[https://bugs.openwrt.org/index.php?do=details&task_id=2353&string=netns+cleanup_net&advancedsearch=on&type%5B0%5D=&sev%5B0%5D=&pri%5B0%5D=&due%5B0%5D=&reported%5B0%5D=&cat%5B0%5D=&status%5B0%5D=&percent%5B0%5D=&opened=&dev=&closed=&duedatefrom=&duedateto=&changedfrom=&changedto=&openedfrom=&openedto=&closedfrom=&closedto=|2353]] [[https://bugs.openwrt.org/index.php?do=details&task_id=2354&string=netns+cleanup_net&advancedsearch=on&type%5B0%5D=&sev%5B0%5D=&pri%5B0%5D=&due%5B0%5D=&reported%5B0%5D=&cat%5B0%5D=&status%5B0%5D=&percent%5B0%5D=&opened=&dev=&closed=&duedatefrom=&duedateto=&changedfrom=&changedto=&openedfrom=&openedto=&closedfrom=&closedto=|2354]]

I have tried kernel 4.14 and kernel 4.19 from the 19 branch, and I have also tried removing the nf_conntrack_rtcache module, but the result is the same.

Sometimes I see no BUG at all; other times it appears once or several times. I run several Docker containers, and depending on how the race plays out, sometimes everything launches well and afterwards (if I don't start new Docker instances) the system is 100% stable, whereas other times I have to reboot 3 or even 5 times before everything works reliably.
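Since the trace runs on the `netns` workqueue in `cleanup_net`, one way to try exercising that path without Docker might be to create and delete network namespaces in a burst. The Python sketch below only builds and optionally runs plain `ip netns add` / `ip netns delete` commands. It is an untested assumption that this triggers the same race; the namespace prefix `fs2814-test` and the helper names are made up for illustration. It prints the commands by default and only executes them (as root) when called with `--run`.

```python
import subprocess
import sys


def netns_commands(count, prefix="fs2814-test"):
    """Build ip(8) commands that create `count` namespaces, then delete them."""
    names = [f"{prefix}-{i}" for i in range(count)]
    cmds = [["ip", "netns", "add", name] for name in names]
    # Deleting a namespace is what queues cleanup_net on the netns workqueue.
    cmds += [["ip", "netns", "delete", name] for name in names]
    return cmds


def run(cmds, runner=subprocess.run):
    """Execute each command in order, stopping at the first failure."""
    for cmd in cmds:
        runner(cmd, check=True)


if __name__ == "__main__":
    commands = netns_commands(8)
    if "--run" in sys.argv[1:]:
        run(commands)  # requires root
    else:
        # Dry run: show what would be executed.
        for cmd in commands:
            print(" ".join(cmd))
```

Checking dmesg for `BUG kmalloc-32` after each burst would show whether the report can be reproduced this way.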

[[https://pastebin.com/CASGAfff|dmesg showing the race, but with the system running stably]]

[[https://pastebin.com/Nyfp8jk5|partial dmesg that ends in a crash, captured from another machine]]

aparcar added the kernel (pull request/issue with Linux kernel related changes) label on Feb 22, 2022