OpenWrt/LEDE Project

  • Status Closed
  • Percent Complete 100%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Low
  • Priority Very Low
  • Reported Version lede-17.01
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Giuseppe Iannello - 04.02.2017
Last edited by Mathias Kresin - 08.04.2017

FS#462 - Lede 17.01.0-rc1 - ramips/rt3883/rt-n56u bootloop

Installing Lede 17.01.0-rc1 on an Asus N56U using the factory image results in the device constantly rebooting.

Upgrading with the sysupgrade image from a known working installation - even without preserving the configuration - works fine.

Closed by  Mathias Kresin
08.04.2017 11:41
Reason for closing:  Won't fix
Additional comments about closing: Won't fix due to a suitable workaround and possible regressions from restoring the old behaviour.

Project Manager
Mathias Kresin commented on 04.02.2017 18:05

Does the OpenWrt 15.05.1 factory image work for you? The easiest approach would be to find a working OpenWrt image, to see what has changed in between.

How do you install the factory image? Do you have a serial console attached to the rt-n56u?

Giuseppe Iannello commented on 04.02.2017 22:40

No, it doesn't.

So far, I've only been able to get a working device in the following ways:
* factory OpenWrt 14.07 → sysupgrade to OpenWrt 15.05.1 keeping the configuration (doing a config reset after the upgrade bricks the device)
* factory OpenWrt 14.07 → sysupgrade to Lede 17.01.0-rc1 (config reset after upgrade works fine)

I still haven't tried the sysupgrade to Lede 17 _without_ preserving the config, might give it a try tomorrow.

All the factory installs are done via TFTP. Sadly, I don't have any ttl→rs232→usb converters around to get debug info. I'll try to get one soon (and maybe another device so I can have connectivity in the meantime).

Aaron Z commented on 13.03.2017 02:27

I just picked up a couple of these and I ran into the same issue with lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin as was seen with RC1.

I was able to get it working by going from factory FW_RT_N56U_30043804180 → OpenWrt 14.07 ( https://downloads.openwrt.org/barrier_breaker/14.07/ramips/rt3883/openwrt-ramips-rt3883-rt-n56u-squashfs-factory.bin ) → sysupgrade (via LUCI) to Lede 17.01.0 ( lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin ).

I did NOT save settings when upgrading from 14.07 to 17.01.0 and it worked fine.

To recover from the bootloop, I followed the "I think this should work" directions from https://wiki.openwrt.org/toh/asus/rt-n56u#recovery (Using the Asus "Firmware Restoration" tool from http://dlcdnet.asus.com/pub/ASUS/wireless/RT-N56U_B1/Rescue_2000.zip to flash the factory FW_RT_N56U_30043804180 image on).

I have also used that tool to flash OpenWrt 14.07 on when it was stuck in the bootloop with no issues.

Aaron Z commented on 14.03.2017 23:25

Let me know if there is something else that is needed to troubleshoot this. I should have ttl access by late this week, or early next week.
When it is in the bootloop, the interface is shut down at the point where it reboots, so Windows says that the network cable is disconnected. It then comes back up for 3.5 pings (one really long timeout ping, then three 1 ms pings) before the interface goes down again.
The external LEDs (power and link) mirror this behavior.

Aaron Z

Aaron Z commented on 17.03.2017 21:37

I was able to connect via serial after flashing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin image. Attached is the output of it booting several times ( FS#462  - Lede 17.01.0 RT-n56u bootloop.txt ).
It seems to have a problem with the VFS as the following appears right before it reboots:
[ 4.557843] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
[ 4.572823] Please append a correct "root=" boot option; here are the available partitions:
[ 4.589488] 1f00 192 mtdblock0 (driver?)
[ 4.599566] 1f01 64 mtdblock1 (driver?)
[ 4.609637] 1f02 64 mtdblock2 (driver?)
[ 4.619714] 1f03 7872 mtdblock3 (driver?)
[ 4.629784] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 4.647344] Rebooting in 1 seconds..

Let me know if there is more that is needed to diagnose this.

Aaron Z

Project Manager
Mathias Kresin commented on 23.03.2017 18:11

So far I couldn't spot anything that would explain the issue. Would you please upload a bootlog of a working LEDE image?

Would you please build and run an initramfs image and paste the first few bytes of the factory partition:

hexdump -C /dev/mtd3 | head

Aaron Z commented on 23.03.2017 22:00

I am not sure what you mean by an initramfs image (and I am not currently set up to build my own images), so I don't know if I can help with that.
Attached are three bootlogs:

  1.  FS#462  - OpenWrt14.07 RT-n56u bootloop.txt - Bootlog from the second bootup after installing the https://downloads.openwrt.org/barrier_breaker/14.07/ramips/rt3883/openwrt-ramips-rt3883-rt-n56u-squashfs-factory.bin image
  2.  FS#462  - LEDE17.01SysupgradeFirstBoot RT-n56u bootloop.txt - Bootlog from the first bootup (the auto-reboot) after installing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin image from Luci (no saving settings)
  3.  FS#462  - LEDE17.01SysupgradeSecondBoot RT-n56u bootloop.txt - Bootlog from the second bootup (done from power off) after installing the lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin image from Luci (no saving settings).

Luci was accessible and the system worked as expected at the end of each of these bootups.

Let me know if you need anything else

Aaron Z

Aaron Z commented on 23.03.2017 23:20

A few more datapoints when installing from the recovery tool (which is a glorified TFTP utility with a GUI):

  • The LEDE 17.01 Release Sysupgrade image WORKS AS EXPECTED (install/firstboot bootlog attached as: RecoveryInstallOfLEDE17.01Sysupgrade.txt).
  • The LEDE 17.01 Release Factory image fails and ends in a bootloop (install/bootloop bootlog attached as: RecoveryInstallOfLEDE17.01Factory.txt).
  • The LEDE 23 Mar Snapshot Factory image fails and ends in a bootloop (install/bootloop bootlog attached as: RecoveryInstallOfLEDETrunk23Mar17Factory.txt).
  • The LEDE 23 Mar Snapshot Sysupgrade image fails and ends in a bootloop (install/bootloop bootlog attached as: RecoveryInstallOfLEDETrunk23Mar17Sysupgrade.txt).

Aaron Z

Project Manager
Mathias Kresin commented on 23.03.2017 23:52

What I can see from the logs:

  • installing lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-sysupgrade.bin via tftp recovery: works
  • installing lede-17.01.0-r3205-59508e3-ramips-rt3883-rt-n56u-squashfs-factory.bin via tftp recovery: fails

The OpenWrt wiki article seems to be outdated and the sysupgrade.bin is the image that should be used for tftp recovery. The factory.bin is most likely intended to be used for a flash from the Asus firmware web interface.

Installing the trunk sysupgrade.bin via tftp recovery does work as well. But it fails later with a completely different error (the rt2800pci driver has issues with getting the clock) which is unrelated to this bug report. I'm quite sure you will see the same crash when updating from 17.01.0 to the latest trunk version as well.

Conclusion: At least in terms of this bug report, there is no issue. It is as simple as the wrong image having been used.

Regarding the rt2800pci related crash in trunk, you should create a new bug report with a bootlog attached which shows the kernel crash.

Are you fine with closing this ticket?

Aaron Z commented on 23.03.2017 23:56

Let me try flashing back to the stock image, then get the bootlog for installing the factory image via the web interface before you close this one.
I will open a new ticket for the rt2800pci related crash in trunk.

Aaron Z

Aaron Z commented on 24.03.2017 00:23

When upgrading from stock firmware (FW_RT_N56U_30043804180) to LEDE 17.01 Factory, it still fails. Bootlog for the install and bootloop is attached (FactoryUpgradeFromWebsiteToLEDE17.01Bootlog.txt).

Aaron Z

Aaron Z commented on 24.03.2017 01:05

When upgrading from stock firmware (FW_RT_N56U_30043804180) to LEDE Trunk 23 Mar 17 Factory, it still fails. Bootlog for the install and bootloop is attached (FactoryUpgradeFromWebsiteToLEDETrunk23Mar17Bootlog.txt).

Aaron Z

Project Manager
Mathias Kresin commented on 24.03.2017 07:16

Have you tried to use the -sysupgrade.bin for upgrading from stock firmware as well? Does it work?

Aaron Z commented on 24.03.2017 10:01

I tried that (with both the release and the 23 Mar snapshot sysupgrade files) and the firmware checker on the stock page throws an error saying that it is not a valid upgrade file.
I think I will have more time tonight. If that datapoint would be useful, I can try upgrading with the sysupgrade file via the serial port in the stock firmware and see if that works.

Aaron Z

Project Manager
Mathias Kresin commented on 27.03.2017 16:29

I finally found the problem, but it needs a bit of explanation of how the partition handling works in LEDE/OpenWrt.

The kernel, rootfs and rootfs_data partitions are created on the fly during boot. The concatenated kernel+rootfs is stored in the firmware partition.

The rt-n56u kernel has a so-called uImage header at the beginning. The uImage header has a size field which normally matches the size of the kernel.

During boot the firmware splitter checks for the uImage header at the beginning of the firmware partition. If the header is found, the number of bytes given in the header's size field is skipped. If the next bytes are a known filesystem header, the firmware partition is split into a kernel partition and a rootfs partition.

Now the problem: the size field of the uImage header of the factory image covers not only the kernel but kernel+rootfs. The "original" uImage header, which covers only the kernel, is appended to the end of the image. The firmware splitter kicks in during boot, finds the uImage header with the size set to kernel+rootfs, skips the kernel+rootfs and cannot find a valid filesystem ⇒ error.
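The check described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual LEDE splitter code: the helper names are hypothetical, the image contents are dummy bytes, and it assumes the rootfs starts directly after the bytes covered by the size field, with no padding or alignment search. The header layout (64 bytes, big-endian, magic 0x27051956, size field at offset 12) is that of the legacy U-Boot image header, and "hsqs" is the little-endian squashfs magic.

```python
import struct

UIMAGE_MAGIC = 0x27051956      # legacy U-Boot image magic
UIMAGE_HEADER_LEN = 64         # size of the legacy uImage header
SQUASHFS_MAGIC = b"hsqs"       # little-endian squashfs superblock magic


def make_uimage_header(data_size):
    """Build a dummy 64-byte header; only magic and size are meaningful here."""
    # Fields: magic, hcrc, time, size, load, ep, dcrc, os, arch, type, comp, name
    return struct.pack(">7I4B32s", UIMAGE_MAGIC, 0, 0, data_size, 0, 0, 0,
                       0, 0, 0, 0, b"sketch")


def find_rootfs_offset(firmware):
    """Return the rootfs offset, or None if the partition cannot be split."""
    if len(firmware) < UIMAGE_HEADER_LEN:
        return None
    # Read the magic (offset 0) and the size field (offset 12).
    magic, size = struct.unpack_from(">I8xI", firmware, 0)
    if magic != UIMAGE_MAGIC:
        return None
    offset = UIMAGE_HEADER_LEN + size
    if firmware[offset:offset + 4] == SQUASHFS_MAGIC:
        return offset
    return None


kernel = b"K" * 100                      # dummy kernel payload
rootfs = SQUASHFS_MAGIC + b"R" * 60      # dummy rootfs starting with the magic

# sysupgrade-style layout: the size field covers only the kernel
sysupgrade = make_uimage_header(len(kernel)) + kernel + rootfs

# factory-style layout: the size field covers kernel+rootfs and the
# kernel-only header is appended at the end of the image
factory = (make_uimage_header(len(kernel) + len(rootfs)) + kernel + rootfs
           + make_uimage_header(len(kernel)))

print(find_rootfs_offset(sysupgrade))    # 164
print(find_rootfs_offset(factory))       # None
```

With the factory-style layout the bytes after the skipped region are the appended header rather than a filesystem magic, so no rootfs is found, which matches the "Cannot open root device" panic in the bootlog.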

It most likely works with the Barrier Breaker image due to a bug in the BB firmware splitter. It seems to me the BB firmware splitter tries to find the next uImage header on the firmware partition if the first one didn't result in a usable rootfs.

The BB firmware splitter behaviour is really error prone, since we have more than just the uImage based firmware splitter. Assuming the uImage based splitter always runs first and searches the whole flash for uImage headers, it might find a byte sequence which looks like a valid uImage header but isn't. This would cause issues for images which do not use the uImage header.
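To illustrate why scanning for headers is fragile, here is a minimal Python sketch. It is not the Barrier Breaker code; the helper name and the payload are made up for illustration. It only shows that a search for the four uImage magic bytes also matches arbitrary data that happens to contain them.

```python
import struct

# The legacy uImage magic as it appears on flash (big-endian).
UIMAGE_MAGIC_BYTES = struct.pack(">I", 0x27051956)


def scan_for_uimage_headers(partition):
    """Return every offset where the 4 magic bytes occur (hypothetical helper)."""
    offsets = []
    pos = partition.find(UIMAGE_MAGIC_BYTES)
    while pos != -1:
        offsets.append(pos)
        pos = partition.find(UIMAGE_MAGIC_BYTES, pos + 1)
    return offsets


# Dummy image that does not use a uImage header at all, but whose payload
# happens to contain the magic byte sequence at offset 40.
payload = b"\x00" * 40 + UIMAGE_MAGIC_BYTES + b"\xff" * 100

print(scan_for_uimage_headers(payload))  # [40]
```

A real implementation could reduce false positives by also verifying the header CRC, but the match on the magic alone is enough to show the risk described above.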

Long story short: It only worked due to a bug in BB and was the wrong approach from the beginning.

I would propose to simply remove the -factory.bin image, since the -sysupgrade.bin image works fine with the tftp recovery and can be used for the initial LEDE installation.

Is everybody fine with the "solution"? Would one of you take care of updating the OpenWrt Wiki article, since OpenWrt 15.05 is affected by the same issue?

Aaron Z commented on 27.03.2017 18:18

That "solution" works for me. I can update the LEDE wiki this week with this information if needed.
I do not yet have an account for the OpenWrt wiki, but I can sign up for one to update this.

Aaron Z

Aaron Z commented on 28.03.2017 00:06

Apparently, I did have an account for the OpenWrt wiki. Both wikis (wikii, wikia?) are updated now.

Aaron Z
