FS#462 - Lede 17.01.0-rc1 - ramips/rt3883/rt-n56u bootloop #5672

Closed
openwrt-bot opened this issue Feb 4, 2017 · 17 comments

@openwrt-bot

giannello:

Installing Lede 17.01.0-rc1 on an Asus N56U using the factory image results in the device constantly rebooting.

Upgrading with the sysupgrade image from a known working installation - even without preserving the configuration - works fine.

@openwrt-bot

mkresin:

Does the [[https://downloads.openwrt.org/chaos_calmer/15.05.1/ramips/rt3883/openwrt-15.05.1-ramips-rt3883-rt-n56u-squashfs-factory.bin|OpenWrt 15.05.1 factory image]] work for you? The easiest approach would be to find a working OpenWrt image, to see what has changed in between.

How do you install the factory image? Do you have a serial console attached to the rt-n56u?

@openwrt-bot

giannello:

No, it doesn't.

So far, I've only been able to get a working device in the following ways:

  • factory OpenWrt 14.07 -> sysupgrade to OpenWrt 15.05.1 keeping the configuration (doing a config reset after the upgrade bricks the device)
  • factory OpenWrt 14.07 -> sysupgrade to Lede 17.01.0-rc1 (config reset after upgrade works fine)

I still haven't tried the sysupgrade to Lede 17 without preserving the config, might give it a try tomorrow.

All the factory installs are done via TFTP, sadly I don't have any ttl->rs232->usb converters around to get debug info. I'll try to get one soon (and maybe another device so I can have connectivity in the meantime).

@openwrt-bot

aczlan:

I just picked up a couple of these and I ran into the same issue with lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin as was seen with RC1.

I was able to get it working by going from factory FW_RT_N56U_30043804180 → OpenWrt 14.07 ( https://downloads.openwrt.org/barrier_breaker/14.07/ramips/rt3883/openwrt-ramips-rt3883-rt-n56u-squashfs-factory.bin ) → sysupgrade (via LuCI) to Lede 17.01.0 ( lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin ).

I did NOT save settings when upgrading from 14.07 to 17.01.0 and it worked fine.

To recover from the bootloop, I followed the "I think this should work" directions from https://wiki.openwrt.org/toh/asus/rt-n56u#recovery (Using the Asus "Firmware Restoration" tool from http://dlcdnet.asus.com/pub/ASUS/wireless/RT-N56U_B1/Rescue_2000.zip to flash the factory FW_RT_N56U_30043804180 image on).

I have also used that tool to flash OpenWrt 14.07 on when it was stuck in the bootloop with no issues.

@openwrt-bot

aczlan:

Let me know if there is something else that is needed to troubleshoot this. I should have ttl access by late this week, or early next week.
When it is in the bootloop, at the point where it reboots, it shuts down the interface, so Windows says that the network cable is disconnected. It then comes back up for 3.5 pings (one really long timeout ping, then three 1ms pings, then the interface goes down).
The external LEDs (power and link) mirror this behavior.

Aaron Z

@openwrt-bot

aczlan:

I was able to connect via serial after flashing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin image. Attached is the output of it booting several times (FS#462 - Lede 17.01.0 RT-n56u bootloop.txt ).
It seems to have a problem with the VFS as the following appears right before it reboots:
[ 4.557843] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
[ 4.572823] Please append a correct "root=" boot option; here are the available partitions:
[ 4.589488] 1f00 192 mtdblock0 (driver?)
[ 4.599566] 1f01 64 mtdblock1 (driver?)
[ 4.609637] 1f02 64 mtdblock2 (driver?)
[ 4.619714] 1f03 7872 mtdblock3 (driver?)
[ 4.629784] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 4.647344] Rebooting in 1 seconds..

Let me know if there is more that is needed to diagnose this.

Aaron Z

@openwrt-bot

mkresin:

So far I couldn't spot anything that would explain the issue. Would you please upload a bootlog of a working LEDE image?

Would you please build and run an initramfs image and paste the first few bytes of the factory partition:

hexdump -C /dev/mtd3 | head

@openwrt-bot

aczlan:

I am not sure what you mean by an initramfs image (and I am not currently set up to build my own images), so I don't know if I can help with that.
Attached are three bootlogs:

  • FS#462 - OpenWrt14.07 RT-n56u bootloop.txt - Bootlog from the second bootup after installing the https://downloads.openwrt.org/barrier_breaker/14.07/ramips/rt3883/openwrt-ramips-rt3883-rt-n56u-squashfs-factory.bin image
  • FS#462 - LEDE17.01SysupgradeFirstBoot RT-n56u bootloop.txt - Bootlog from the first bootup (the auto-reboot) after installing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin image from LuCI (no saving settings)
  • FS#462 - LEDE17.01SysupgradeSecondBoot RT-n56u bootloop.txt - Bootlog from the second bootup (done from power off) after installing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin image from LuCI (no saving settings).

LuCI was accessible and the system worked as expected at the end of each of these bootups.

Let me know if you need anything else

Aaron Z

@openwrt-bot

aczlan:

A few more datapoints when installing from the recovery tool (which is a glorified TFTP utility with a GUI):

  • The LEDE 17.01 Release Sysupgrade image WORKS AS EXPECTED (install/firstboot bootlog attached as: RecoveryInstallOfLEDE17.01Sysupgrade.txt).
  • The LEDE 17.01 Release Factory image //fails and ends in a bootloop// (install/bootloop bootlog attached as: RecoveryInstallOfLEDE17.01Factory.txt).
  • The LEDE 23 Mar Snapshot Factory image //fails and ends in a bootloop// (install/bootloop bootlog attached as: RecoveryInstallOfLEDETrunk23Mar17Factory.txt).
  • The LEDE 23 Mar Snapshot Sysupgrade image //fails and ends in a bootloop// (install/bootloop bootlog attached as: RecoveryInstallOfLEDETrunk23Mar17Sysupgrade.txt).

Aaron Z

@openwrt-bot

mkresin:

What I can see from the logs:

  • installing lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin via tftp recovery: works
  • installing lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin via tftp recovery: fails

The OpenWrt wiki article seems to be outdated; the sysupgrade.bin is the image that should be used for tftp recovery. The factory.bin is most likely intended to be flashed from the Asus firmware web interface.

Installing the trunk sysupgrade.bin via tftp recovery works as well, but it fails later with a completely different error (the rt2800pci driver has issues getting the clock), which is unrelated to this bug report. I'm quite sure you will see the same crash when updating from 17.01.0 to the latest trunk version as well.

Conclusion: At least in terms of this bugreport, there is no issue. It is as simple as the wrong image was used.

Regarding the rt2800pci related crash in trunk, you should create a new bugreport with a bootlog attached which shows the kernel crash.

Are you fine with closing this ticket?

@openwrt-bot

aczlan:

Let me try flashing back to the stock image, then get the bootlog for installing the factory image via the web interface before you close this one.
I will open a new ticket for the rt2800pci related crash in trunk.

Aaron Z

@openwrt-bot

aczlan:

When upgrading from stock firmware (FW_RT_N56U_30043804180) to LEDE 17.01 Factory, it still fails. Bootlog for the install and bootloop is attached (FactoryUpgradeFromWebsiteToLEDE17.01Bootlog.txt).

Aaron Z

@openwrt-bot

aczlan:

When upgrading from stock firmware (FW_RT_N56U_30043804180) to LEDE Trunk 23 Mar 17 Factory, it still fails. Bootlog for the install and bootloop is attached (FactoryUpgradeFromWebsiteToLEDETrunk23Mar17Bootlog.txt).

Aaron Z

@openwrt-bot

mkresin:

Have you tried using the -sysupgrade.bin for upgrading from stock firmware as well? Does it work?

@openwrt-bot

aczlan:

I tried that (with both the release and the 23 Mar snapshot sysupgrade files) and the firmware checker on the stock page throws an error saying that it is not a valid upgrade file.
I think I will have more time tonight. I can try upgrading via the serial port in the stock firmware and see if that works with the sysupgrade file, if that datapoint would be useful.

Aaron Z

@openwrt-bot

mkresin:

I finally found the problem, but it needs a bit of explanation of how the partition thingy works in LEDE/OpenWrt.

The kernel, rootfs and rootfs_data partitions are created on the fly during boot. The concatenated kernel+rootfs is stored in the firmware partition.

The rt-n56u kernel has a so-called uImage header at the beginning. The uImage header has a size field, which normally matches the size of the kernel.

During boot, the firmware splitter checks for the uImage header at the beginning of the firmware partition. If the header is found, the number of bytes given in the header's size field is skipped. If the next bytes are a known filesystem header, the firmware partition is split into a kernel partition and a rootfs partition.

Now the problem. The size field of the uImage header of the factory image covers not only the kernel; it covers kernel+rootfs. The "original" uImage header, which covers only the kernel, is appended to the end of the image. The firmware splitter kicks in during boot, finds the uImage header with the size set to kernel+rootfs, skips the kernel+rootfs and cannot find a valid filesystem => error.
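The failure described above can be modelled with a small Python sketch. This is only an illustration, not the actual LEDE splitter code: the helper names (make_uimage_header, split_firmware), the dummy payload sizes, and the choice to accept only a squashfs magic as a "known filesystem header" are assumptions made for the example. The 64-byte big-endian layout and the magic 0x27051956 are those of the legacy U-Boot uImage header.

```python
import struct

UIMAGE_MAGIC = 0x27051956
# Legacy uImage header: 7 big-endian uint32 fields, 4 uint8 fields, 32-byte name.
UIMAGE_HEADER = struct.Struct(">7I4B32s")  # 64 bytes in total
SQUASHFS_MAGIC = b"hsqs"  # assumption: squashfs is the only rootfs type checked


def make_uimage_header(payload_size):
    """Build a dummy header; only the magic and size fields are meaningful here."""
    return UIMAGE_HEADER.pack(
        UIMAGE_MAGIC, 0, 0, payload_size, 0, 0, 0,  # magic, hcrc, time, size, load, ep, dcrc
        0, 0, 0, 0,                                  # os, arch, type, comp
        b"demo",                                     # name (padded with NUL bytes)
    )


def split_firmware(firmware):
    """Return the rootfs offset, or None if no filesystem follows the uImage."""
    if len(firmware) < UIMAGE_HEADER.size:
        return None
    magic, _hcrc, _time, size = struct.unpack_from(">4I", firmware, 0)
    if magic != UIMAGE_MAGIC:
        return None
    # Skip the header plus the number of bytes announced in the size field.
    rootfs_offset = UIMAGE_HEADER.size + size
    if firmware[rootfs_offset:rootfs_offset + 4] != SQUASHFS_MAGIC:
        return None
    return rootfs_offset


kernel = b"K" * 1000
rootfs = SQUASHFS_MAGIC + b"R" * 2000

# sysupgrade-style layout: the size field covers the kernel only.
sysupgrade = make_uimage_header(len(kernel)) + kernel + rootfs

# factory-style layout: the size field covers kernel+rootfs, and the
# "original" kernel-only header is appended at the end of the image.
factory = (
    make_uimage_header(len(kernel) + len(rootfs))
    + kernel
    + rootfs
    + make_uimage_header(len(kernel))
)

sysupgrade_split = split_firmware(sysupgrade)
factory_split = split_firmware(factory)
```

With these dummy images, the sysupgrade-style layout yields a rootfs offset right after the kernel, while the factory-style layout yields None because the splitter lands on the appended header instead of a filesystem.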

It most likely works with the Barrier Breaker image due to a bug in the BB firmware splitter. It seems to me the BB firmware splitter tries to find the next uImage header on the firmware partition if the first one didn't result in a usable rootfs.

The BB firmware splitter behaviour is really error-prone, since we have more than just the uImage-based firmware splitter. Assuming the uImage-based splitter always runs first and searches the whole flash for uImage headers, it might find a byte sequence which looks like a valid uImage header but isn't. This will cause issues for images which are not using the uImage header.
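To illustrate why scanning the whole flash for uImage headers is risky, here is a minimal Python sketch. It is a simplified model and not the Barrier Breaker code: the function name find_uimage_candidates and the sample bytes are made up for the example, and it matches on the 4-byte magic alone (a real splitter could validate more, since uImage headers also carry a header CRC).

```python
import struct

# The legacy uImage magic, as it appears on flash (big-endian).
UIMAGE_MAGIC_BYTES = struct.pack(">I", 0x27051956)


def find_uimage_candidates(flash):
    """Return every offset at which the 4-byte uImage magic occurs."""
    offsets = []
    pos = flash.find(UIMAGE_MAGIC_BYTES)
    while pos != -1:
        offsets.append(pos)
        pos = flash.find(UIMAGE_MAGIC_BYTES, pos + 1)
    return offsets


# Hypothetical image that does not use a uImage header at all, but whose
# payload happens to contain the same four bytes somewhere in the middle.
other_image = b"\x00" * 100 + UIMAGE_MAGIC_BYTES + b"\xff" * 50

candidates = find_uimage_candidates(other_image)
```

Here the scan reports offset 100 as a candidate even though the image contains no uImage header, which is the kind of false positive that checking only the start of the firmware partition avoids.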

Long story short: It only worked due to a bug in BB and was the wrong approach from the beginning.

I would propose simply removing the -factory.bin image, since the -sysupgrade.bin image works fine with the tftp recovery and allows doing the initial LEDE installation.

Is everybody fine with the "solution"? Would one of you take care of updating the OpenWrt Wiki article, since OpenWrt 15.05 is affected by the same issue?

@openwrt-bot

aczlan:

That "solution" works for me. I can update the LEDE wiki this week with this information if needed.
I do not yet have an account for the OpenWrt wiki, but I can sign up for one to update this.

Aaron Z

@openwrt-bot

aczlan:

Apparently, I did have an account for the OpenWrt wiki. Both wikis (wikii, wikia?) are updated now.

Aaron Z
