OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Kernel
  • Assigned To No-one
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Javier Marcet - 05.02.2020

FS#2814 - netns cleanup_net race condition

I’m seeing this issue on master built for x86-64. It has been happening since I started using this x86-64 machine around December.

[   73.300163] =============================================================================
[   73.309803] BUG kmalloc-32 (Not tainted): Object already free
[   73.316694] -----------------------------------------------------------------------------

[   73.328675] Disabling lock debugging due to kernel taint
[   73.335167] INFO: Allocated in ops_init+0x6d/0x100 age=147 cpu=0 pid=11053
[   73.343242] INFO: Freed in 0xffffffffa09c94ca age=36 cpu=2 pid=50
[   73.350557] INFO: Slab 0x00000000ed33138b objects=36 used=11 fp=0x0000000017541cc8 flags=0xf700000000101
[   73.361281] INFO: Object 0x0000000041bc5c4c @offset=1576 fp=0x000000007d23c440

[   73.372490] Redzone 000000008c59cd8b: bb bb bb bb bb bb bb bb                          ........
[   73.382449] Object 0000000041bc5c4c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   73.393029] Object 0000000073cabfc7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
[   73.403599] Redzone 000000009196639c: bb bb bb bb bb bb bb bb                          ........
[   73.413574] Padding 000000009337e5e1: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
[   73.423549] CPU: 2 PID: 50 Comm: kworker/u16:1 Tainted: G    B             4.19.101 #0
[   73.432698] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro4-M, BIOS P2.10 03/13/2018
[   73.443617] Workqueue: netns cleanup_net
[   73.448798] Call Trace:
[   73.452495]  dump_stack+0x57/0x7a
[   73.457083]  print_trailer+0x203/0x210
[   73.462068]  object_err+0x2f/0x36
[   73.466613]  free_debug_processing.cold.107+0x2c/0x11f
[   73.472985]  ? ops_free_list.part.14+0x4d/0x60
[   73.478663]  __slab_free+0x1bd/0x330
[   73.483482]  ? kmem_cache_free+0x1b5/0x1e0
[   73.488778]  kfree+0x11c/0x140
[   73.493009]  ops_free_list.part.14+0x4d/0x60
[   73.498460]  cleanup_net+0x1c4/0x280
[   73.503215]  process_one_work+0x1a8/0x340
[   73.508384]  worker_thread+0x2f/0x3a0
[   73.513215]  kthread+0x10b/0x130
[   73.517619]  ? process_one_work+0x340/0x340
[   73.522967]  ? kthread_create_worker_on_cpu+0x60/0x60
[   73.529193]  ret_from_fork+0x35/0x40
[   73.534067] FIX kmalloc-32: Object at 0x0000000041bc5c4c not freed

There are old, closed reports about this issue: FS#2353 and FS#2354.

I have tried kernel 4.14 and kernel 4.19 from the 19 branch, and I have also tried removing the nf_conntrack_rtcache module, but the result is the same.

Sometimes I see no BUG at all; other times it appears once or several times. I run several Docker containers, and depending on how the races play out, sometimes everything launches cleanly and the system is then 100% stable (as long as I don’t start new Docker instances), whereas other times I have to reboot up to 3 or even 5 times before everything works reliably.
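Since the number of BUG reports varies from boot to boot, it can help to count them per boot when comparing kernels or module sets. The following is only a minimal sketch, assuming the SLUB debug header format shown in the log above; the function name find_slub_bugs and the regular expression are illustrative helpers written for this report, not part of any existing tool.

```python
import re

# Matches SLUB debug report headers as they appear in the log above, e.g.:
# [   73.309803] BUG kmalloc-32 (Not tainted): Object already free
# Assumption: other kernel versions may format this line differently.
BUG_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+BUG\s+(?P<cache>\S+)\s+\([^)]*\):\s+(?P<msg>.*?)\s*$"
)


def find_slub_bugs(dmesg_text):
    """Return one (timestamp, cache, message) tuple per SLUB BUG header."""
    reports = []
    for line in dmesg_text.splitlines():
        match = BUG_RE.match(line)
        if match:
            reports.append(
                (float(match.group("ts")), match.group("cache"), match.group("msg"))
            )
    return reports


# Two lines taken from the log in this report; only the first is a BUG header.
sample = (
    "[   73.309803] BUG kmalloc-32 (Not tainted): Object already free\n"
    "[   73.534067] FIX kmalloc-32: Object at 0x0000000041bc5c4c not freed\n"
)
for ts, cache, msg in find_slub_bugs(sample):
    print(ts, cache, msg)
```

Passing the full dmesg text of each boot to find_slub_bugs and comparing the lengths of the returned lists gives a rough per-boot count.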

dmesg showing some races, but with the system running stably

Partial dmesg that ends in a crash, captured from another machine
