FS#3887 - MTD partition offset not correctly mapped when bad eraseblocks present #8885

Closed
openwrt-bot opened this issue Jun 19, 2021 · 3 comments
Labels
flyspray kernel pull request/issue with Linux kernel related changes

Comments

@openwrt-bot

csharper2005:

  • Device problem occurs on:
      • Beeline Smartbox GIGA with a bad eraseblock on NAND flash
      • Possibly other NAND devices with bad eraseblocks on flash
  • Software versions of OpenWrt/LEDE release, packages, etc.:
    OpenWrt SNAPSHOT r16952-677813c776
  • Steps to reproduce:
  1. Write the OpenWrt kernel (mtd4) and UBI (mtd6) partitions from the stock firmware on a device with bad block(s).
  2. The device ends up in a boot loop:
    [ 4.040886] mt7530 mdio-bus:1f: Link is Up - 1Gbps/Full - flow control off
    [ 4.042039] UBI error: no valid UBI magic found inside mtd6
    [ 4.065746] hctosys: unable to open rtc device (rtc0)
    [ 4.076584] /dev/root: Can't open blockdev
    [ 4.084755] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6

Root cause:

  1. Both U-Boot and the stock firmware detect the bad eraseblock and shift all following offsets by one block (0x20000). Our mtd4 and mtd6 are therefore actually written at 0x420000 and 0x1020000.
    ********************************************
    Flash Map Information

    Partition         Logic_Offs  Logic_size  Real_Offs  Real_Size
    u-boot            0           100000      0          100000
    part_map          100000      100000      100000     100000
    factory-data      200000      100000      200000     120000
    dual-flag         300000      100000      320000     100000
    uImage1           400000      600000      420000     600000
    uImage2           a00000      600000      a20000     600000
    rootfs1           1000000     1800000     1020000    1800000
    rootfs2           2800000     1800000     2820000    1800000
    config/log        4000000     800000      4020000    800000
    app-tmp           4800000     c00000      4820000    c00000
    free-space        5400000     2800000     5420000    2800000
    badblock-reserve  7c00000     400000      7c20000    360000

2. U-Boot still expects the kernel at 0x420000, which is correct.
3. OpenWrt detects the bad eraseblock, but does NOT shift the following offsets by one block:
[ 1.198092] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[ 1.210738] nand: Macronix MX30LF1G18AC
[ 1.218369] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[ 1.233435] mt7621-nand 1e003000.nand: ECC strength adjusted to 4 bits
[ 1.246486] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[ 1.261038] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[ 1.275588] Scanning device for bad blocks
[ 1.312129] Bad eraseblock 22 at 0x0000002c0000
[ 2.583817] 8 fixed-partitions partitions found on MTD device mt7621-nand
[ 2.597331] Creating 8 MTD partitions on "mt7621-nand":
[ 2.607736] 0x000000000000-0x000000100000 : "u-boot"
[ 2.618828] 0x000000100000-0x000000200000 : "dynamic partition map"
[ 2.632380] 0x000000200000-0x000000300000 : "Factory"
[ 2.643531] 0x000000300000-0x000000400000 : "Boot Flag"
[ 2.655139] 0x000000400000-0x000000a00000 : "kernel"
[ 2.666172] 0x000000a00000-0x000001000000 : "Kernel 2"
[ 2.677543] 0x000001000000-0x000007c00000 : "ubi"
[ 2.688598] 0x000007c20000-0x000007fa0000 : "bad block reserved"

4. As a result, OpenWrt expects to find UBI at 0x1000000, but UBI is actually at 0x1020000. This causes the boot loop:

[ 4.025957] pci 0000:00:01.0: bridge window [mem 0x60400000-0x604fffff pref]
[ 4.040886] mt7530 mdio-bus:1f: Link is Up - 1Gbps/Full - flow control off
[ 4.042039] UBI error: no valid UBI magic found inside mtd6
[ 4.065746] hctosys: unable to open rtc device (rtc0)
[ 4.076584] /dev/root: Can't open blockdev
[ 4.084755] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
[ 4.099639] Please append a correct "root=" boot option; here are the available partitions:
[ 4.116269] 1f00 1024 mtdblock0
[ 4.116274] (driver?)
[ 4.129299] 1f01 1024 mtdblock1
[ 4.129302] (driver?)
[ 4.142303] 1f02 1024 mtdblock2
[ 4.142305] (driver?)
[ 4.155316] 1f03 1024 mtdblock3
[ 4.155319] (driver?)
[ 4.168323] 1f04 6144 mtdblock4
[ 4.168326] (driver?)
[ 4.181324] 1f05 6144 mtdblock5
[ 4.181327] (driver?)
[ 4.194323] 1f06 110592 mtdblock6
[ 4.194326] (driver?)
[ 4.207336] 1f07 3584 mtdblock7
[ 4.207339] (driver?)
[ 4.220336] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 4.236804] Rebooting in 1 seconds..
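
The Real_Offs and Real_Size columns of the flash map above can be reproduced with a small calculation. The following is only a sketch of the shifting behaviour as it appears from the table, not the actual U-Boot or stock firmware code: it assumes partitions are laid out back to back and that each partition is extended over any bad eraseblock it contains. The function name `real_layout` is made up for illustration, and the truncated size of the last partition (badblock-reserve) is not modelled.

```python
ERASE_SIZE = 0x20000        # 128 KiB erase size, from the kernel log
BAD_BLOCKS = {0x2c0000}     # "Bad eraseblock 22 at 0x0000002c0000"

# (name, logical offset, logical size), copied from the first rows of the flash map
PARTITIONS = [
    ("u-boot",       0x000000,  0x100000),
    ("part_map",     0x100000,  0x100000),
    ("factory-data", 0x200000,  0x100000),
    ("dual-flag",    0x300000,  0x100000),
    ("uImage1",      0x400000,  0x600000),
    ("uImage2",      0xa00000,  0x600000),
    ("rootfs1",      0x1000000, 0x1800000),
]

def real_layout(partitions, bad_blocks, erase_size):
    """Return {name: (real_offset, real_size)}, skipping bad eraseblocks.

    Assumption: every partition needs logic_size worth of *good* blocks,
    so a bad block inside it enlarges the partition and moves all later ones.
    """
    layout = {}
    real_offs = 0
    for name, _logic_offs, logic_size in partitions:
        good_needed = logic_size // erase_size
        pos = real_offs
        while good_needed > 0:
            if pos not in bad_blocks:
                good_needed -= 1
            pos += erase_size
        layout[name] = (real_offs, pos - real_offs)
        real_offs = pos
    return layout

layout = real_layout(PARTITIONS, BAD_BLOCKS, ERASE_SIZE)
for name, (offs, size) in layout.items():
    print(f"{name:13s} real_offs={offs:#x} real_size={size:#x}")
```

With the single bad block at 0x2c0000, factory-data grows by one eraseblock and every later partition moves up by 0x20000, which is the shift that the fixed-partitions layout in the OpenWrt log does not apply.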

@openwrt-bot
Author

csharper2005:

  1. Other Sercomm NAND devices are also affected by this issue. For example:
  • [[https://forum.openwrt.org/t/fixed-position-partition-splitting-and-bad-blocks-on-nand-flash/94305/|ZyXEL NR7101]] - Sercomm ?
  • [[https://bugs.openwrt.org/index.php?do=details&task_id=3582|Netgear devices]]
  • [[https://forum.openwrt.org/t/add-support-for-beeline-smartbox-giga-on-stock-u-boot/99390|Beeline SmartBox GIGA]]
  • [[https://forum.openwrt.org/t/add-support-for-beeline-smartbox-turbo/99635|Beeline SmartBox TURBO+]]
  2. Affected software versions: the 21.02 branch is also affected. OpenWrt does not shift the offsets after the bad eraseblock at 0x2c0000:
    [ 14.116745] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
    [ 14.131565] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
    [ 14.144220] nand: Macronix MX30LF1G18AC
    [ 14.151841] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
    [ 14.166910] mt7621-nand 1e003000.nand: ECC strength adjusted to 4 bits
    [ 14.179933] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
    [ 14.194485] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
    [ 14.209046] Scanning device for bad blocks
    [ 14.245594] Bad eraseblock 22 at 0x0000002c0000
    [ 15.517155] 8 fixed-partitions partitions found on MTD device mt7621-nand
    [ 15.530664] Creating 8 MTD partitions on "mt7621-nand":
    [ 15.541071] 0x000000000000-0x000000100000 : "u-boot"
    [ 15.552186] 0x000000100000-0x000000200000 : "dynamic partition map"
    [ 15.565827] 0x000000200000-0x000000300000 : "Factory"
    [ 15.576979] 0x000000300000-0x000000400000 : "Boot Flag"
    [ 15.588499] 0x000000400000-0x000000a00000 : "kernel"
    [ 15.599511] 0x000000a00000-0x000001000000 : "Kernel 2"
    [ 15.610909] 0x000001000000-0x000007c00000 : "ubi"
    [ 15.622086] 0x000007c00000-0x000007f80000 : "bad block reserved"
  3. [[http://patchwork.ozlabs.org/project/openwrt/patch/20200628232747.1367531-1-jan@3e8.eu/|Possible Workaround / Solution]]
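
To check where the UBI image really starts on an affected device, one option is to look for the UBI erase-counter header magic (`UBI#`) at each eraseblock boundary of a flash dump. The sketch below is an assumption-laden illustration, not part of the linked patch: the helper name `first_ubi_block` is hypothetical, and it assumes a dump without OOB data so that file offsets equal flash offsets. The demo uses a tiny synthetic buffer instead of a real dump.

```python
UBI_EC_MAGIC = b"UBI#"   # magic at the start of a UBI erase-counter header

def first_ubi_block(dump, start, end, erase_size):
    """Return the offset of the first eraseblock in [start, end) that
    begins with the UBI magic, or None if there is none."""
    for offs in range(start, min(end, len(dump)), erase_size):
        if dump[offs:offs + len(UBI_EC_MAGIC)] == UBI_EC_MAGIC:
            return offs
    return None

# Synthetic demo: four 16-byte "eraseblocks", UBI starts in the second one.
demo_erase = 16
demo = bytearray(b"\xff" * (demo_erase * 4))
demo[demo_erase:demo_erase + 4] = UBI_EC_MAGIC
found = first_ubi_block(bytes(demo), 0, len(demo), demo_erase)
print(found)
```

On a real dump of this device one would call it with `start=0x1000000`, `end=0x7c00000` and `erase_size=0x20000`; a result of 0x1020000 rather than 0x1000000 would be consistent with the flash map in the first comment.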

@openwrt-bot
Author

csharper2005:

Solution - 23874c6

@aparcar added the kernel pull request/issue with Linux kernel related changes label Feb 22, 2022
@Djfe
Contributor

Djfe commented Feb 20, 2023

@hauke this was fixed by #10038 and can be closed

@hauke closed this as completed Feb 26, 2023