OpenWrt/LEDE Project

  • Status Unconfirmed
  • Task Type Bug Report
  • Category Kernel
  • Assigned To No-one
  • Operating System All
  • Severity Critical
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: OpenWrt/LEDE Project
Opened by Alexander Lochmann - 15.05.2020

FS#3099 - TP-Link C2600 Kernel 5.4 crashes while accessing invalid memory address

The Linux kernel oopses from time to time.
- Device: TP-Link Archer C2600
- Kernel Version: 5.4.40
- OpenWrt Version: OpenWrt SNAPSHOT, r13224+16-2308644b0c
- Steps to reproduce: Simply wait. After an unknown amount of time the kernel crashes, leading to a reboot.


John T commented on 12.07.2020 19:16

Thanks for linking my bug. It does seem similar. How did you get the full kernel call stack?

Alexander Lochmann commented on 12.07.2020 19:36
John T commented on 12.07.2020 20:22

Thanks! Have you tried reverting to kernel 4.19 and trying to catch this again?

I checked clk-krait.c, the file where "krait_mux_set_parent" is defined, both as introduced by the patch in 4.19 and as already present in 5.4, and the two versions are pretty similar.

FWIW I'm on 5.4.48 and still seeing this issue.

Alexander Lochmann commented on 13.07.2020 08:50

No, I haven't tried this yet.
How did you revert to 4.19?

Which patch are you talking about?
Can you please point me to the location?

John T commented on 13.07.2020 16:22

I haven't, but I'm trying different workarounds in the CPU scaling area. I have 2 identical routers, so I can experiment with different settings. I'll post if I find anything that works; the issue is that it takes days or weeks to crash.

The patch I was referring to is in openwrt\target\linux\ipq806x\patches-4.19:

And it looks like it is already merged into 5.4.

John T commented on 17.07.2020 18:18

It might be too soon, but one of the routers has survived for over 7 days now, with no reboots.
I basically disabled CPU scaling on both cores. It might be something to try; please post any updates. I hope I get to 2 weeks with no power outages.

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
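For anyone wanting to try the same workaround, here is a minimal sketch of pinning every core to one governor through the sysfs files read by the command above. The function name `set_governor` and its optional second argument (an alternative root directory, handy for a dry run against a scratch tree) are illustrative assumptions, not something taken from this report.

```shell
#!/bin/sh
# Sketch: write one governor name to every CPU's scaling_governor file.
# The second argument is a hypothetical override of the sysfs root so the
# function can be tried against a scratch directory first.
set_governor() {
    governor="$1"
    root="${2:-/sys/devices/system/cpu}"
    written=0
    for f in "$root"/cpu*/cpufreq/scaling_governor; do
        # Skip the literal pattern (no match) and entries without the file.
        [ -e "$f" ] || continue
        echo "$governor" > "$f" || return 1
        written=$((written + 1))
    done
    # Succeed only if at least one CPU was actually updated.
    [ "$written" -gt 0 ]
}
```

Run as root, `set_governor performance` would apply the setting to all cores, and `set_governor ondemand` would switch back. The change does not survive a reboot unless it is reapplied at boot.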

John T commented on 23.07.2020 21:59

Update: 2 weeks uptime now and no reboots with CPU scaling disabled.

Alexander Lochmann commented on 26.07.2020 18:19

Nice. I applied your settings as well. So far no reboot for two and a half days.
I'll keep you posted.

Alexander Lochmann commented on 03.08.2020 09:30

Update: Uptime went up to ~9 days. I switched back to the ondemand governor and experienced a crash within 12 hours.

Filip Matijević commented on 03.08.2020 19:47

I had no crashes to report for a while. They typically occur after a week or so for me, and since I had one power failure and a couple of user-initiated reboots, I couldn't catch any logs. I have finally caught one to report (files attached).

I'll try the performance setting for both cores and report back if I experience any crashes.

John T commented on 03.08.2020 19:58

Filip, your call stacks look a little different, so it might be a different issue.

I'm almost at 30 days uptime with the governor set to "performance".

Filip Matijević commented on 04.08.2020 07:53

I've just had another reboot and it seems that crashlog doesn't get overwritten on subsequent crashes - I'll do a hard reboot and wait.
I'm not able to say if it's the same problem here, as my crash logs are not consistent. In any case, changing the governor did the opposite for stability for me. I'm wondering if making the CPU run at 100% frequency causes an overheating issue, making it more unstable in my case. I'll try with the same governor once more to see if that's the case.
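Since the crashlog apparently is not overwritten by later crashes, one way to tell crashes apart is to save a timestamped copy after each boot. The sketch below assumes the log is exposed at /sys/kernel/debug/crashlog and uses /root/crashlogs as the destination; both paths and the function name `save_crashlog` are assumptions for illustration, not details confirmed in this report.

```shell
#!/bin/sh
# Sketch: keep a copy of the crash log under its own timestamped name.
# Both default paths are assumptions; pass other paths as arguments.
save_crashlog() {
    src="${1:-/sys/kernel/debug/crashlog}"
    dest_dir="${2:-/root/crashlogs}"
    # Nothing to do when no crash log is available.
    [ -r "$src" ] || return 0
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/crashlog-$(date +%Y%m%d-%H%M%S).txt"
    cat "$src" > "$dest"
}
```

Calling `save_crashlog` once at boot would leave one file per boot that had a log available, which makes it easier to compare call stacks from different crashes.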

Alexander Lochmann commented on 03.09.2020 20:43

@John T: Disabling cpufreq did the trick. :-/ Uptime is currently 32 days.

John T commented on 03.09.2020 21:41

Thanks for confirming.

I had some power failures, then had to flash a new image to get the stuff that I needed, so I haven't been able to test it for that long!

I'm on kernel 5.4.60 now with cpufreq disabled from kernel_menuconfig.

tmo26 commented on 15.10.2020 21:24

@Alexander Lochmann @John T
Any news regarding uptime? Did disabling cpufreq solve your reboot problems?

John T commented on 15.10.2020 21:40

Last uptime for me was over 30 days, before we had a power blip due to a wind storm. But I think it would've gone even longer. At this point I'm sure the cpufreq was causing the random reboots.

Alexander Lochmann commented on 16.10.2020 06:19

@tmo26: Yes, it seems that switching to the performance governor solved the issue. My reboots so far were all manually triggered. I observed uptimes of several days.
I once switched back to the ondemand governor, and my router crashed within hours.

John T commented on 16.10.2020 16:21

So, now we have a workaround, what about the real fix? Is anyone looking into this?

