OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Russell Senior - 25.03.2020

FS#2928 - TP-Link TL-WDR3600 v1 on kernel 5.4 boot-loops since change to GCC 8.4.0

- Device problem occurs on

TP-Link TL-WDR3600 v1

- Software versions of OpenWrt/LEDE release, packages, etc.

Since reboot-12646-gdb70077668 (“toolchain: Update GCC 8 to version 8.4.0”), the WDR3600 boot-loops on kernel 5.4 with the following message:

Starting kernel ...

[    0.000000] Linux version 5.4.24 (openwrt@hawg) (gcc version 8.4.0 (OpenWrt GCC 8.4.0 r12683-8c33debb52)) #0 Sat Mar 21 21:35:45 2020
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001974c (MIPS 74Kc)
[    0.000000] MIPS: machine is TP-Link TL-WDR3600 v1
[    0.000000] SoC: Atheros AR9344 rev 2
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 122384K/131072K available (4681K kernel code, 187K rwdata, 1080K rodata, 1212K init, 196K bss, 8688K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] NR_IRQS: 51
[    0.000000] random: get_random_bytes called from start_kernel+0x32c/0x51c with crng_init=0
[    0.000000] CPU clock: 560.000 MHz
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6825930166 ns
[    0.000009] sched_clock: 32 bits at 280MHz, resolution 3ns, wraps every 7669584382ns
[    0.008305] Calibrating delay loop... 278.93 BogoMIPS (lpj=1394688)
[    0.084927] pid_max: default: 32768 minimum: 301
[    0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.107070] Kernel panic - not syncing: Unexpected DSP exception
[    0.113470] Rebooting in 1 seconds..
realmicu commented on 27.03.2020 23:11

I'm experiencing the same issue on a Netgear WNDR4300 (SoC: AR9344). Images compiled with the previous GCC version 8.3.0 were OK, while 8.4.0 produces invalid code. What worked for me was switching GCC from version 8.4.0 to 9.3.0:

CONFIG_TARGET_ath79=y
CONFIG_TARGET_ath79_nand=y
CONFIG_TARGET_ath79_nand_DEVICE_netgear_wndr4300=y
CONFIG_DEVEL=y
CONFIG_TOOLCHAINOPTS=y
CONFIG_CCACHE=y
CONFIG_COLLECT_KERNEL_DEBUG=y
# CONFIG_GCC_USE_VERSION_8 is not set
CONFIG_GCC_USE_VERSION_9=y
CONFIG_GCC_VERSION="9.3.0"
CONFIG_GCC_VERSION_9=y
CONFIG_IMAGEOPT=y
CONFIG_LINUX_5_4=y
CONFIG_TESTING_KERNEL=y
Steve Brown commented on 29.03.2020 15:32

Reverting 7000f11c23e23cf11f96 (“toolchain: Update GCC 8 to version 8.4.0”) fixes the problem on my TP-Link Archer A7 v5.

Russell Senior commented on 29.03.2020 21:32

Fwiw, this is the .config stub I used while bisecting:

CONFIG_TARGET_ath79=y
CONFIG_TARGET_ath79_generic=y
CONFIG_TARGET_ath79_generic_DEVICE_tplink_tl-wdr3600-v1=y
CONFIG_DEVEL=y
CONFIG_BUILD_LOG=y
# CONFIG_BUSYBOX_CONFIG_BRCTL is not set
# CONFIG_BUSYBOX_CONFIG_FREE is not set
# CONFIG_BUSYBOX_CONFIG_PGREP is not set
# CONFIG_BUSYBOX_CONFIG_TOP is not set
# CONFIG_BUSYBOX_CONFIG_UPTIME is not set
# CONFIG_PACKAGE_6relayd is not set
# CONFIG_PACKAGE_firewall is not set
# CONFIG_PACKAGE_firewall3 is not set
CONFIG_PACKAGE_iptables-mod-ipopt=y
CONFIG_PACKAGE_iptables-mod-nat-extra=y
# CONFIG_PACKAGE_odhcp6c is not set
# CONFIG_PACKAGE_ppp is not set
# CONFIG_PACKAGE_ppp-mod-pppoe is not set
CONFIG_TESTING_KERNEL=y
Hauke Mehrtens (Project Manager) commented on 29.03.2020 22:34

I can reproduce it on a TP-Link TL-WDR4300 v1 with an AR9344.

The panic happens in the save_dsp() function:
https://elixir.bootlin.com/linux/v5.4.28/source/arch/mips/include/asm/dsp.h#L50
which is called by arch_dup_task_struct():
https://elixir.bootlin.com/linux/v5.4.28/source/arch/mips/kernel/process.c#L110

The AR9344 says it supports the DSP extension:

root@OpenWrt:/# cat /proc/cpuinfo 
system type             : Atheros AR9344 rev 2
machine                 : TP-Link TL-WDR4300 v1
processor               : 0
cpu model               : MIPS 74Kc V4.12
BogoMIPS                : 278.78
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 32
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa                     : mips1 mips2 mips32r1 mips32r2
ASEs implemented        : mips16 dsp dsp2
Options implemented     : tlb 4kex 4k_cache prefetch mcheck ejtag llsc dc_aliases perf_cntr_intr_bit nan_legacy nan_2008 perf
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

root@OpenWrt:/# 

To look at the generated code in isolation, I added this wrapper function in between:

void my_save_dsp(void)
{
	save_dsp(current);
}

The working assembly for kernel 4.19 looks like this:

80067b40 <my_save_dsp.part.8>:
80067b40:       8f830000        lw      v1,0(gp)
80067b44:       00202810        mfhi    a1,$ac1
80067b48:       00202012        mflo    a0,$ac1
80067b4c:       ac65057c        sw      a1,1404(v1)
80067b50:       8f830000        lw      v1,0(gp)
80067b54:       00403810        mfhi    a3,$ac2
80067b58:       00403012        mflo    a2,$ac2
80067b5c:       ac640580        sw      a0,1408(v1)
80067b60:       8f830000        lw      v1,0(gp)
80067b64:       00602810        mfhi    a1,$ac3
80067b68:       00602012        mflo    a0,$ac3
80067b6c:       ac670584        sw      a3,1412(v1)
80067b70:       8f830000        lw      v1,0(gp)
80067b74:       ac660588        sw      a2,1416(v1)
80067b78:       8f830000        lw      v1,0(gp)
80067b7c:       ac65058c        sw      a1,1420(v1)
80067b80:       8f830000        lw      v1,0(gp)
80067b84:       ac640590        sw      a0,1424(v1)
80067b88:       7c3f1cb8        rddsp   v1,0x3f
80067b8c:       8f820000        lw      v0,0(gp)
80067b90:       03e00008        jr      ra
80067b94:       ac430594        sw      v1,1428(v0)

The crashing assembly for kernel 5.4 looks like this:

80066db0 <my_save_dsp.part.7>:
80066db0:       8f830000        lw      v1,0(gp)
80066db4:       00202810        mfhi    a1,$ac1
80066db8:       00202012        mflo    a0,$ac1
80066dbc:       ac65048c        sw      a1,1164(v1)
80066dc0:       8f830000        lw      v1,0(gp)
80066dc4:       00403810        mfhi    a3,$ac2
80066dc8:       00403012        mflo    a2,$ac2
80066dcc:       ac640490        sw      a0,1168(v1)
80066dd0:       8f830000        lw      v1,0(gp)
80066dd4:       00602810        mfhi    a1,$ac3
80066dd8:       00602012        mflo    a0,$ac3
80066ddc:       ac670494        sw      a3,1172(v1)
80066de0:       8f830000        lw      v1,0(gp)
80066de4:       ac660498        sw      a2,1176(v1)
80066de8:       8f830000        lw      v1,0(gp)
80066dec:       ac65049c        sw      a1,1180(v1)
80066df0:       8f830000        lw      v1,0(gp)
80066df4:       ac6404a0        sw      a0,1184(v1)
80066df8:       7c3f1cb8        rddsp   v1,0x3f
80066dfc:       8f820000        lw      v0,0(gp)
80066e00:       03e00008        jr      ra
80066e04:       ac4304a4        sw      v1,1188(v0)

The two listings look very similar: the instruction sequence is the same, and only the addresses and the store offsets differ. Is some initialization needed for the DSP extension?
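The similarity of the two listings can be checked mechanically. The following is a minimal sketch in plain user-space C, not kernel code: it takes the seven `sw` instruction words copied from each listing, decodes the 16-bit store offset, and checks that the stores differ only by one constant offset. The names `sw_4_19`, `sw_5_4`, `sw_offset` and `same_stores_shifted` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Instruction words of the seven `sw` stores, copied from the two listings. */
static const uint32_t sw_4_19[7] = {
	0xac65057c, 0xac640580, 0xac670584, 0xac660588,
	0xac65058c, 0xac640590, 0xac430594,
};
static const uint32_t sw_5_4[7] = {
	0xac65048c, 0xac640490, 0xac670494, 0xac660498,
	0xac65049c, 0xac6404a0, 0xac4304a4,
};

/* MIPS I-type encoding: bits 15..0 hold a signed 16-bit immediate. */
static int sw_offset(uint32_t insn)
{
	return (int16_t)(insn & 0xffffu);
}

/*
 * Returns 1 and stores the shared offset difference in *delta if every pair
 * of stores has the same upper halfword (opcode, base register and source
 * register) and the same offset difference. Returns 0 otherwise.
 */
static int same_stores_shifted(const uint32_t *a, const uint32_t *b,
			       size_t n, int *delta)
{
	assert(n > 0);
	int d = sw_offset(a[0]) - sw_offset(b[0]);

	for (size_t i = 0; i < n; i++) {
		if ((a[i] >> 16) != (b[i] >> 16))
			return 0;
		if (sw_offset(a[i]) - sw_offset(b[i]) != d)
			return 0;
	}
	*delta = d;
	return 1;
}
```

Calling `same_stores_shifted(sw_4_19, sw_5_4, 7, &delta)` on these words succeeds with a uniform difference of 240 bytes (1404 versus 1164 for the first store). That would be consistent with the target structure's layout having moved between the two kernel versions rather than with a difference in the code generated for this function.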

This commit from Linux 4.20 looks interesting:
https://git.kernel.org/linus/edbb4233e7efc37dbebb10f7774b38c64080dd66

Hauke Mehrtens (Project Manager) commented on 30.03.2020 22:44

I ran a git bisect on the kernel, and the breakage starts with this commit:
http://git.kernel.org/linus/9012d011660ea5cf2a623e1de207a2bc0ca6936d

Since this commit changes some compiler optimizations, I assume the problem is related to a compiler bug.
