FS#3540 - ipq806x serial console is mute once kernel takes control (r15355-19d7e73ecc) #8587

Closed
openwrt-bot opened this issue Dec 27, 2020 · 7 comments
@openwrt-bot

hnyman:

Background: I noticed with ipq806x R7800 build r15355-19d7e73ecc that I am unable to sysupgrade back to r15254-1302bee12a. (I had just upgraded from r15254-1302bee12a to r15355-19d7e73ecc, so sysupgrade used to work a few days ago.)

More alarming: while trying to debug that sysupgrade, I noticed to my surprise that with r15355-19d7e73ecc, serial console access to the R7800 works only during the u-boot phase. Once control is passed to the Linux kernel, the serial console goes silent. Output stops at:

   Entry Point:  42208000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK
mtdparts variable not set, see 'help mtdparts'
no partitions defined

defaults:
mtdids  : nand0=msm_nand
mtdparts: none
info: "mtdparts" not set
Using machid 0x136c from environment

Starting kernel ...

The firmware runs ok and the SSH console works ok, but the serial cable does not produce any output.

Rebooting the device shows the normal u-boot messages, but the output stops when the kernel takes over.
That makes me suspect that something has changed in the kernel parameters.

Commit 98b8629 is pretty much the only one to touch the ipq806x config in the last few days.
(The other possibility is 77575d4, but that does not look likely.)

This commit 98b8629 made this change:

CONFIG_CMDLINE_OVERRIDE=y

I suspect that it affects the normal booting of the other ipq806x devices by changing the kernel cmdline, and that this somehow disables serial access.

I will debug further by turning that option off, but wanted to highlight my suspicions right away.

Ps. Debugging sysupgrade is now impossible as serial console is mute. :-(

Pps.
The option is also mentioned in a forum discussion starting here
https://forum.openwrt.org/t/ipq806x-nss-build-netgear-r7800-tp-link-c2600-linksys-ea8500/82525/44

@openwrt-bot

hnyman:

The faulty build r15355-19d7e73ecc sets kernel console to a non-existent tty:

Kernel command line: console=ttyHSL1,115200n8

In the previous working r15254-1302bee12a, the bootloader command line with that tty is ignored:

Bootloader command line (ignored): console=ttyHSL1,115200n8

More context:
From non-working r15355-19d7e73ecc:

[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 5.4.85 (perus@ub2010) (gcc version 8.4.0 (OpenWrt GCC 8.4.0 r15237-fca0eb2d92)) #0 SMP Sun Dec 27 19:31:27 2020
[ 0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[ 0.000000] CPU: div instructions available: patching division code
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[ 0.000000] OF: fdt: Machine model: Netgear Nighthawk X4S R7800
[ 0.000000] OF: fdt: Ignoring memory range 0x41500000 - 0x42000000
[ 0.000000] Memory policy: Data cache writealloc
[ 0.000000] On node 0 totalpages: 122880
[ 0.000000] Normal zone: 1080 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 122880 pages, LIFO batch:31
[ 0.000000] percpu: Embedded 15 pages/cpu s30156 r8192 d23092 u61440
[ 0.000000] pcpu-alloc: s30156 r8192 d23092 u61440 alloc=15*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 121800
[ 0.000000] Kernel command line: console=ttyHSL1,115200n8

From working r15254-1302bee12a:

[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 5.4.83 (perus@ub2010) (gcc version 8.4.0 (OpenWrt GCC 8.4.0 r15237-fca0eb2d92)) #0 SMP Tue Dec 22 16:08:44 2020
[ 0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[ 0.000000] CPU: div instructions available: patching division code
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[ 0.000000] OF: fdt: Machine model: Netgear Nighthawk X4S R7800
[ 0.000000] Memory policy: Data cache writealloc
[ 0.000000] percpu: Embedded 15 pages/cpu s30156 r8192 d23092 u61440
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 121800
[ 0.000000] Kernel command line:
[ 0.000000] Bootloader command line (ignored): console=ttyHSL1,115200n8
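To compare the two logs above without reading them line by line, here is a minimal Python sketch that pulls out the command-line entries. The function names `extract_cmdlines` and `bootloader_console_in_effect` are made up for illustration, and the sketch assumes the log uses exactly the two message formats quoted in this report.

```python
import re

# Matches the two message types quoted in this report. The "[ 0.000000]"
# timestamp prefix is optional so that untimestamped logs also work.
_CMDLINE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"(Kernel command line|Bootloader command line \(ignored\)):\s*(.*)$"
)


def extract_cmdlines(boot_log):
    """Return a dict mapping each command-line message label to its value."""
    found = {}
    for raw in boot_log.splitlines():
        match = _CMDLINE_RE.match(raw.strip())
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found


def bootloader_console_in_effect(boot_log, console="console=ttyHSL1"):
    """True if the given console argument ended up on the kernel command line."""
    kernel_cmdline = extract_cmdlines(boot_log).get("Kernel command line", "")
    return console in kernel_cmdline.split() or any(
        arg.startswith(console + ",") for arg in kernel_cmdline.split()
    )


if __name__ == "__main__":
    faulty = "[ 0.000000] Kernel command line: console=ttyHSL1,115200n8"
    print(extract_cmdlines(faulty))
    print(bootloader_console_in_effect(faulty))
```

Run against a saved `dmesg` output, this should flag the faulty build (console argument taken from the bootloader) and not the working one (bootloader arguments ignored).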

@openwrt-bot

hnyman:

Compiling r15355-19d7e73ecc with CONFIG_CMDLINE_OVERRIDE=y reverted fixes the R7800:

  • sysupgrade works again. Downgrade to r15254 worked.
  • serial console works again.

Fixed bootlog:

[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 5.4.83 (perus@ub2010) (gcc version 8.4.0 (OpenWrt GCC 8.4.0 r15237-fca0eb2d92)) #0 SMP Tue Dec 22 16:08:44 2020
[ 0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[ 0.000000] CPU: div instructions available: patching division code
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[ 0.000000] OF: fdt: Machine model: Netgear Nighthawk X4S R7800
[ 0.000000] Memory policy: Data cache writealloc
[ 0.000000] percpu: Embedded 15 pages/cpu s30156 r8192 d23092 u61440
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 121800
[ 0.000000] Kernel command line:
[ 0.000000] Bootloader command line (ignored): console=ttyHSL1,115200n8
[ 0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 474836K/491520K available (6042K kernel code, 193K rwdata, 1528K rodata, 1024K init, 231K bss, 16684K reserved, 0K cma-reserved, 0K highmem)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1

Note also that these lines, which appeared on the faulty build, have disappeared:

[ 0.000000] OF: fdt: Ignoring memory range 0x41500000 - 0x42000000
[ 0.000000] Memory policy: Data cache writealloc
[ 0.000000] On node 0 totalpages: 122880
[ 0.000000] Normal zone: 1080 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 122880 pages, LIFO batch:31

Fix applied:

--- a/target/linux/ipq806x/config-5.4
+++ b/target/linux/ipq806x/config-5.4
@@ -78,7 +78,7 @@ CONFIG_CC_HAS_KASAN_GENERIC=y
 CONFIG_CLKDEV_LOOKUP=y
 CONFIG_CLKSRC_QCOM=y
 CONFIG_CLONE_BACKWARDS=y
-CONFIG_CMDLINE_OVERRIDE=y
+# CONFIG_CMDLINE_OVERRIDE is not set
 CONFIG_COMMON_CLK=y
 CONFIG_COMMON_CLK_QCOM=y
 CONFIG_COMPAT_32BIT_TIME=y
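To verify that a given config file really carries the reverted setting before building, something like the following Python sketch may help. `symbol_state` is a hypothetical helper, not part of the OpenWrt build system, and it only understands the two standard kernel config line forms (`CONFIG_X=value` and `# CONFIG_X is not set`).

```python
import re


def symbol_state(config_text, symbol):
    """Return the state of CONFIG_<symbol> in a kernel config fragment.

    Returns the assigned value (for example "y"), "n" for the
    "# CONFIG_<symbol> is not set" form, or None if the symbol is absent.
    If the symbol appears several times, the last occurrence wins; that is
    an assumption of this sketch, not a statement about the build system.
    """
    name = "CONFIG_" + symbol
    set_re = re.compile(r"^" + re.escape(name) + r"=(.*)$")
    unset_re = re.compile(r"^#\s*" + re.escape(name) + r" is not set$")
    state = None
    for raw in config_text.splitlines():
        line = raw.strip()
        match = set_re.match(line)
        if match:
            state = match.group(1)
        elif unset_re.match(line):
            state = "n"
    return state


if __name__ == "__main__":
    with open("target/linux/ipq806x/config-5.4") as handle:
        print(symbol_state(handle.read(), "CMDLINE_OVERRIDE"))
```

The path in the usage part is the one from the diff above; run the script from the top of an OpenWrt source tree.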

@openwrt-bot

ricsc:

Additional Info. Linksys EA8500 is also affected by this bug.

@openwrt-bot

rbpp:

I can confirm that the Archer C2600 is also affected, and that using

# CONFIG_CMDLINE_OVERRIDE is not set

fixes things.

@openwrt-bot

hnyman:

I did some further debugging and noticed something.

I compared the kernel build log of the working code (without CONFIG_CMDLINE_OVERRIDE=y) with the log of the faulty code, and there is only one small difference.

In the faulty version, there is a one-line warning:
warning: override: CMDLINE_OVERRIDE changes choice state

context:

HOSTCC scripts/kconfig/symbol.o
HOSTLD scripts/kconfig/conf
scripts/kconfig/conf --syncconfig Kconfig
net/sched/Kconfig:45: warning: menuconfig statement without prompt
.config:987:warning: override: CMDLINE_OVERRIDE changes choice state
HOSTCC scripts/dtc/dtc.o

So, the new option changes something else.
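Since this warning is easy to miss in a long build log, here is a small Python sketch for scanning a log for it. The function name `find_choice_state_warnings` is hypothetical, and the pattern is written only against the warning format shown in the excerpt above.

```python
import re

# Pattern for the kconfig warning quoted above, for example:
#   .config:987:warning: override: CMDLINE_OVERRIDE changes choice state
_CHOICE_WARNING_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):\s*warning: override: "
    r"(?P<symbol>\w+) changes choice state\s*$"
)


def find_choice_state_warnings(build_log):
    """Return a list of (file, line number, symbol) for each matching warning."""
    hits = []
    for raw in build_log.splitlines():
        match = _CHOICE_WARNING_RE.match(raw.strip())
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("symbol"))
            )
    return hits
```

An empty result on a build log does not prove the configuration is fine; it only means this particular warning was not printed.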

I looked at build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/linux-5.4.85/arch/arm/Kconfig and noticed that the new definition of CONFIG_CMDLINE_OVERRIDE has been placed at the end of a multi-option choice block. Enabling this option may therefore change which of the ARM_ATAG_DTB_COMPAT_CMDLINE_... options is selected.

choice
	prompt "Kernel command line type" if ARM_ATAG_DTB_COMPAT
	default ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER

config ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER
	bool "Use bootloader kernel arguments if available"
	help
	  Uses the command-line options passed by the boot loader instead of
	  the device tree bootargs property. If the boot loader doesn't provide
	  any, the device tree bootargs property will be used.

config ARM_ATAG_DTB_COMPAT_CMDLINE_EXTEND
	bool "Extend with bootloader kernel arguments"
	help
	  The command-line arguments provided by the boot loader will be
	  appended to the the device tree bootargs property.

config ARM_ATAG_DTB_COMPAT_CMDLINE_MANGLE
	bool "Append rootblock parsing bootloader's kernel arguments"
	help
	  The command-line arguments provided by the boot loader will be
	  appended to a new device tree property: bootloader-args.
	  If there is a property "append-rootblock" in DT under /chosen
	  and a root= option in bootloaders command line it will be parsed
	  and added to DT bootargs with the form: XX.
	  Only command line ATAG will be processed, the rest of the ATAGs
	  sent by bootloader will be ignored.

config CMDLINE_OVERRIDE
	bool "Use alternative cmdline from device tree"
	help
	  Some bootloaders may have uneditable bootargs. While CMDLINE_FORCE can
	  be used, this is not a good option for kernels that are shared across
	  devices. This setting enables using "chosen/cmdline-override" as the
	  cmdline if it exists in the device tree.

endchoice

config CMDLINE
	string "Default kernel command string"
	default ""
	help
	  On some architectures (EBSA110 and CATS), there is currently no way

Apparently, selecting the new option in that block changes which of the existing options is selected, causing bugs on other devices that depend on the previous selection. The Asrock router is apparently not affected by that change.
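The effect can be illustrated with a toy Python model of a Kconfig bool choice, in which only one member can be enabled at a time. This is a deliberately simplified sketch: the `Choice` class is invented for illustration, it is not how the kconfig tool is implemented, and which option ipq806x selected before the change is not stated in this report, so the default is used here as a stand-in.

```python
class Choice:
    """Toy model of a Kconfig bool choice: exactly one member is enabled."""

    def __init__(self, members, default):
        if default not in members:
            raise ValueError("default must be one of the members")
        self.members = list(members)
        self.selected = default

    def enable(self, symbol):
        """Enable one member, which implicitly disables the previous one."""
        if symbol not in self.members:
            raise KeyError(symbol)
        self.selected = symbol

    def values(self):
        """Return a dict of member name to True (enabled) or False."""
        return {name: name == self.selected for name in self.members}


# Members as they appear in the patched arch/arm/Kconfig quoted above.
cmdline_type = Choice(
    members=[
        "ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER",
        "ARM_ATAG_DTB_COMPAT_CMDLINE_EXTEND",
        "ARM_ATAG_DTB_COMPAT_CMDLINE_MANGLE",
        "CMDLINE_OVERRIDE",
    ],
    default="ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER",
)

# Setting CONFIG_CMDLINE_OVERRIDE=y silently turns the other member off.
cmdline_type.enable("CMDLINE_OVERRIDE")
```

In this model, enabling CMDLINE_OVERRIDE leaves every ARM_ATAG_DTB_COMPAT_CMDLINE_... member disabled, which would match the "changes choice state" warning in the build log.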

The new code should likely come after the "endchoice" line, so that the ARM_ATAG_DTB_COMPAT_CMDLINE_... selection does not change.

Or possibly it should be inside the next choice block, the one about CMDLINE:

config CMDLINE
	string "Default kernel command string"
	default ""
	help
	  On some architectures (EBSA110 and CATS), there is currently no way
	  for the boot loader to pass arguments to the kernel. For these
	  architectures, you should supply some command-line options at build
	  time by entering them here. As a minimum, you should specify the
	  memory size and the root device (e.g., mem=64M root=/dev/nfs).

choice
	prompt "Kernel command line type" if CMDLINE != ""
	default CMDLINE_FROM_BOOTLOADER
	depends on ATAGS

config CMDLINE_FROM_BOOTLOADER
	bool "Use bootloader kernel arguments if available"
	help
	  Uses the command-line options passed by the boot loader. If
	  the boot loader doesn't provide any, the default kernel command
	  string provided in CMDLINE will be used.

config CMDLINE_EXTEND
	bool "Extend bootloader kernel arguments"
	help
	  The command-line arguments provided by the boot loader will be
	  appended to the default kernel command string.

config CMDLINE_FORCE
	bool "Always use the default kernel command string"
	help
	  Always use the default kernel command string, even if the boot
	  loader passes other arguments to the kernel.
	  This is useful if you cannot or don't want to change the
	  command-line options your boot loader passes to the kernel.
endchoice

Kernel and bootloader option specialists may want to take a closer look, but I think this might well be the reason.

@openwrt-bot

hnyman:

One more observation: the commit message of the faulty Asrock commit says:

  • 900-arm-add-cmdline-override.patch was copied from 102-powerpc-add-cmdline-override.patch from powerpc target.

But looking at

https://github.com/openwrt/openwrt/blob/master/target/linux/mpc85xx/patches-5.4/102-powerpc-add-cmdline-override.patch

shows that the code there is apparently placed outside any choice block, just before "config EXTRA_TARGETS".

So it looks like the author accidentally placed the new definition inside a choice block, and it happened to work for him.

@openwrt-bot

CHKDSK88:

Anyone affected by this problem, please test my PR with the fix:

#3740
