OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Kernel
  • Assigned To No-one
  • Operating System All
  • Severity Low
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: OpenWrt/LEDE Project
Opened by LeonPoon - 02.02.2019

FS#2097 - mt7621 nand mtd slave fail to read same page twice

For my setup, only the ‘firmware’ partition is specified in the .dts (no ‘kernel’ or ‘rootfs’) so that mtdsplit can do its job of finding the correct offset for the ubi partition.

mtdsplit found the ubi slave partition, but attaching the ubi failed, so the kernel panicked for lack of a root fs.

I narrowed the issue down to all-zero data being read during ubi_auto_attach().

In the output below, from a printk I added to nand_do_read_ops(), 0x580000 is read once at 2.028127s and again at 3.682984s, but the data differ:

[    1.166183] 7 fixed-partitions partitions found on MTD device MT7621-NAND
[    1.172957] Creating 7 MTD partitions on "MT7621-NAND": ***
[    1.178513] 0x000000000000-0x000000080000 : "uboot"
[    1.207245] 0x000000080000-0x0000000c0000 : "uboot_env"
[    1.237246] 0x0000000c0000-0x000000100000 : "factory"
[    1.266565] 0x000000100000-0x000000140000 : "s_env"
[    1.295219] 0x000000140000-0x000000180000 : "devinfo"
[    1.324541] 0x000000180000-0x000002980000 : "firmware"
[    1.902188] nand: read 4bytes 34f983(3471747): @8fc35b60=0 (retlen=4)
[    1.909212] nand: read 4bytes 360000(3538944): @8fc35b60=0 (retlen=4)
[    1.916224] nand: read 4bytes 380000(3670016): @8fc35b60=0 (retlen=4)
[    1.923224] nand: read 4bytes 3a0000(3801088): @8fc35b60=0 (retlen=4)
[    1.930201] nand: read 4bytes 3c0000(3932160): @8fc35b60=0 (retlen=4)
[    1.937204] nand: read 4bytes 3e0000(4063232): @8fc35b60=0 (retlen=4)
[    1.944203] nand: read 4bytes 400000(4194304): @8fc35b60=0 (retlen=4)
[    1.951180] nand: read 4bytes 420000(4325376): @8fc35b60=0 (retlen=4)
[    1.958183] nand: read 4bytes 440000(4456448): @8fc35b60=0 (retlen=4)
[    1.965183] nand: read 4bytes 460000(4587520): @8fc35b60=0 (retlen=4)
[    1.972160] nand: read 4bytes 480000(4718592): @8fc35b60=0 (retlen=4)
[    1.979163] nand: read 4bytes 4a0000(4849664): @8fc35b60=0 (retlen=4)
[    1.986162] nand: read 4bytes 4c0000(4980736): @8fc35b60=0 (retlen=4)
[    1.993172] nand: read 4bytes 4e0000(5111808): @8fc35b60=0 (retlen=4)
[    2.000150] nand: read 4bytes 500000(5242880): @8fc35b60=0 (retlen=4)
[    2.007147] nand: read 4bytes 520000(5373952): @8fc35b60=0 (retlen=4)
[    2.014152] nand: read 4bytes 540000(5505024): @8fc35b60=0 (retlen=4)
[    2.021130] nand: read 4bytes 560000(5636096): @8fc35b60=0 (retlen=4)
[    2.028127] nand: read 4bytes 580000(5767168): @8fc35b60=23494255 (retlen=4)
[    2.035166] mtdsplit: mtd_check_rootfs_magic(firmware, offset=4194304) got UBI_EC_MAGIC (magic=23494255)
[    2.044625] 2 uimage-fw partitions found on MTD device firmware
[    2.050514] run_parsers_by_type(firmware, MTD_PARSER_TYPE_FIRMWARE) found 2 parts
[    2.057997] 0x000000180000-0x000000580000 : "kernel"
[    2.087096] 0x000000580000-0x000002980000 : "ubi"
[    2.115345] 0x000002980000-0x000005180000 : "alt_firmware"
[    2.146730] [mtk_nand] probe successfully!
[    2.151499] Signature matched and data read!
[    2.155764] load_fact_bbt success 1023
[    2.160202] libphy: Fixed MDIO Bus: probed
[    2.234009] mtk_soc_eth 1e100000.ethernet: generated random MAC address 3e:90:49:dc:0e:f4
[    2.242383] libphy: mdio: probed
[    3.645190] mtk_soc_eth 1e100000.ethernet: loaded mt7530 driver
[    3.651851] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 20
[    3.662579] NET: Registered protocol family 10
[    3.668281] Segment Routing with IPv6
[    3.672014] NET: Registered protocol family 17
[    3.676546] 8021q: 802.1Q VLAN Support v1.8
[    3.682984] nand: read 4bytes 580000(5767168): @8fc35e1c=0 (retlen=4)
[    3.689426] UBI mtd_read(ubi, offset=0, 4 bytes) got magic@8fc35e1c bytes: 0
[    3.696477] UBI error: no valid UBI magic found inside mtd7
[    3.702054] hctosys: unable to open rtc device (rtc0)
[    3.707847] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6

I added ops.oobbuf = 1 in nand_read() and the issue went away, seemingly because this disables reading from the chip buffer (near line 1903 in nand_base.c). The printk output below shows that reading the same address now returns the same data both times (3.188661s and 4.854364s):

[    3.153685] nand: read 4bytes 4e0000(5111808): @8fc35b60=0 (retlen=4)
[    3.160662] nand: read 4bytes 500000(5242880): @8fc35b60=0 (retlen=4)
[    3.167676] nand: read 4bytes 520000(5373952): @8fc35b60=0 (retlen=4)
[    3.174678] nand: read 4bytes 540000(5505024): @8fc35b60=0 (retlen=4)
[    3.181655] nand: read 4bytes 560000(5636096): @8fc35b60=0 (retlen=4)
[    3.188661] nand: read 4bytes 580000(5767168): @8fc35b60=23494255 (retlen=4)
[    3.195698] mtdsplit: mtd_check_rootfs_magic(firmware, offset=4194304) got UBI_EC_MAGIC (magic=23494255)
[    3.205161] 2 uimage-fw partitions found on MTD device firmware
[    3.211051] run_parsers_by_type(firmware, MTD_PARSER_TYPE_FIRMWARE) found 2 parts
[    3.218530] 0x000000180000-0x000000580000 : "kernel"
[    3.247625] 0x000000580000-0x000002980000 : "ubi"
[    3.275885] 0x000002980000-0x000005180000 : "alt_firmware"
[    3.307291] [mtk_nand] probe successfully!
[    3.312061] Signature matched and data read!
[    3.316326] load_fact_bbt success 1023
[    3.320776] libphy: Fixed MDIO Bus: probed
[    3.393926] mtk_soc_eth 1e100000.ethernet: generated random MAC address d6:1b:2e:2e:17:74
[    3.402252] libphy: mdio: probed
[    4.815958] mtk_soc_eth 1e100000.ethernet: loaded mt7530 driver
[    4.822595] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 20
[    4.833311] NET: Registered protocol family 10
[    4.839073] Segment Routing with IPv6
[    4.842892] NET: Registered protocol family 17
[    4.847360] 8021q: 802.1Q VLAN Support v1.8
[    4.854364] nand: read 4bytes 580000(5767168): @8fc35e1c=23494255 (retlen=4)
[    4.861399] UBI mtd_read(ubi, offset=0, 4 bytes) got magic@8fc35e1c bytes: 23494255
[    4.869062] UBI: auto-attach mtd7
[    4.872414] ubi0: attaching mtd7
[    4.892381] UBI: EOF marker found, PEBs from 14 will be erased
[    4.898547] ubi0: scanning is finished
[    4.940681] ubi0: volume 1 ("rootfs_data") re-sized from 9 to 252 LEBs
[    4.948006] ubi0: attached mtd7 (name "ubi", size 36 MiB)
[    4.953428] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    4.960270] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[    4.967044] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
[    4.973995] ubi0: good PEBs: 287, bad PEBs: 1, corrupted PEBs: 0
[    4.979971] ubi0: user volume: 2, internal volumes: 1, max. volumes count: 128
[    4.987175] ubi0: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 1003525433
[    4.996288] ubi0: available PEBs: 0, total reserved PEBs: 287, PEBs reserved for bad PEB handling: 19
[    5.005507] ubi0: background thread "ubi_bgt0d" started, PID 354
[    5.006121] nand: read 4bytes 5c1000(6033408): @8fc35d8c=73717368 (retlen=4)
[    5.019595] block ubiblock0_0: created from ubi0:0(rootfs)
[    5.025122] ubiblock: device ubiblock0_0 (rootfs) set to be root filesystem
[    5.032069] hctosys: unable to open rtc device (rtc0)

I don’t know what oobbuf does or what its wider impact is, so what is the real problem and the proper fix?

Thanks.

Let me know what other information is needed.

LeonPoon commented on 02.02.2019 16:57

By the way the (re-)read works correctly if the ubi partition is directly defined in the .dts, so it seems that the issue hits only when ubi is a slave of firmware.

And if I boot into initramfs-kernel.bin (with oobbuf==0) and run `head -c 32 /dev/mtd7|hexdump -C`, it correctly shows the UBI# signature even when I run the command multiple times, so this probably rules out the chip->buffers->databuf mechanism in nand_base.c as the cause?
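
For reference, the "UBI#" signature seen in hexdump and the value 23494255 in the printk lines are the same four bytes. The sketch below assumes the printk prints the 32-bit word in hex as loaded on a little-endian CPU; the helper name word_to_bytes_le is made up for illustration and is not part of any kernel API.

```c
#include <stdint.h>

/* Hypothetical helper: turn a 32-bit word, as loaded from memory on a
 * little-endian CPU, back into the four bytes that sat on flash
 * (lowest address first), NUL-terminated for printing. */
static void word_to_bytes_le(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((word >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Calling word_to_bytes_le(0x23494255u, sig) fills sig with the bytes 0x55 0x42 0x49 0x23, which is the ASCII string "UBI#".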

LeonPoon commented on 03.02.2019 05:18

Forget about what I said above about /dev/mtd7 giving correct data when running hexdump. Please look at attached log instead.

Running the hexdump command gives different results.

Something must be filling chip->buffers->databuf with junk between the time the ubi was found and the time ubi tried to auto-attach.

   p.txt (246.3 KiB)
LeonPoon commented on 03.02.2019 07:30

It appears that at the end of mtk_nand_probe() the driver reads the factory bad block table into chip->buffers->databuf without resetting the pagebuf number.

This patch fixes the problem for me (written so as not to change the number of lines):

diff --git a/target/linux/ramips/patches-4.14/0039-mtd-add-mt7621-nand-support.patch b/target/linux/ramips/patches-4.14/0039-mtd-add-mt7621-nand-support.patch
index d50e689110..5af384c342 100644
--- a/target/linux/ramips/patches-4.14/0039-mtd-add-mt7621-nand-support.patch
+++ b/target/linux/ramips/patches-4.14/0039-mtd-add-mt7621-nand-support.patch
@@ -3297,13 +3297,13 @@ Signed-off-by: John Crispin <blogic@openwrt.org>
 +                      printk("compare signature failed %x\n", page);
 +                      return -1;
 +              }
-+              if (mtk_nand_exec_read_page(mtd, page, mtd->writesize, chip->buffers->databuf, chip->oob_poi))
++              if (mtk_nand_exec_read_page(mtd, chip->pagebuf = page, mtd->writesize, chip->buffers->databuf, chip->oob_poi))
 +              {
 +                      printk("Signature matched and data read!\n");
 +                      memcpy(fact_bbt, chip->buffers->databuf, (bbt_size <= mtd->writesize)? bbt_size:mtd->writesize);
 +                      return 0;
-+              }
-+
++              } else
++                      chip->pagebuf = -1;
 +      }
 +      printk("failed at page %x\n", page);
 +      return -1;
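
To illustrate the failure mode described in this report, here is a small userspace model of a one-page read cache with a page tag. It is only a sketch. The struct and function names (fake_chip, cached_read, driver_scratch_read) are hypothetical. The hit test (serve from the buffer when the page matches the tag and no OOB is requested) is my reading of nand_do_read_ops() and should be taken as an assumption, not as the kernel code.

```c
#include <stdint.h>
#include <string.h>

#define FAKE_PAGE_SIZE 2048  /* matches the 2048-byte I/O unit in the UBI log */
#define FAKE_NUM_PAGES 4     /* tiny fake flash, for illustration only */

/* Hypothetical stand-in for the NAND chip state: one cached page and its tag. */
struct fake_chip {
    uint8_t flash[FAKE_NUM_PAGES][FAKE_PAGE_SIZE];
    uint8_t databuf[FAKE_PAGE_SIZE];  /* plays the role of chip->buffers->databuf */
    int pagebuf;                      /* page held in databuf, -1 = none */
};

static void chip_init(struct fake_chip *c)
{
    memset(c, 0, sizeof(*c));
    c->pagebuf = -1;
}

/* Cached read: go to the flash only if the tag misses or OOB is requested.
 * An OOB read also leaves the cache invalidated afterwards. */
static void cached_read(struct fake_chip *c, int page, int want_oob,
                        uint8_t *out, size_t len)
{
    if (len > FAKE_PAGE_SIZE)
        len = FAKE_PAGE_SIZE;
    if (page != c->pagebuf || want_oob) {
        memcpy(c->databuf, c->flash[page], FAKE_PAGE_SIZE);
        c->pagebuf = want_oob ? -1 : page;
    }
    memcpy(out, c->databuf, len);
}

/* Driver-private read that borrows databuf as scratch space.
 * fix_tag == 0 models the behaviour before the patch (tag left stale),
 * fix_tag == 1 models the patched behaviour (tag follows the buffer). */
static void driver_scratch_read(struct fake_chip *c, int page, int fix_tag)
{
    memcpy(c->databuf, c->flash[page], FAKE_PAGE_SIZE);
    if (fix_tag)
        c->pagebuf = page;
}
```

In this model, reading page 2, then calling driver_scratch_read() on another page with fix_tag == 0, then reading page 2 again returns the other page's bytes, because the tag still claims page 2. Passing want_oob = 1 forces a fresh read, which would explain why setting oobbuf hid the problem, and fix_tag == 1 removes the stale hit without bypassing the cache.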
