FS#3241 - temporary flash failure on ipq40xx device (wpj428) #6403

Open
openwrt-bot opened this issue Jul 21, 2020 · 3 comments

@openwrt-bot

yogo1212:

Hello :-)

My employer has noticed a small fraction of devices failing with a trunk-based software image (OpenWrt SNAPSHOT, r13134+521-f57230c4e6) on the WPJ428 platform (ipq40xx).

Messages like these appear in syslog:

Tue Jul 21 13:16:39 2020 daemon.err node-comm[27021]: Error loading shared library libevent_openssl-2.1.so.7: I/O error (needed by /usr/bin/node-comm-mqtt)
Tue Jul 21 13:16:39 2020 kern.err kernel: [523126.625066] SQUASHFS error: Unable to read fragment cache entry [3c56aa]
Tue Jul 21 13:16:39 2020 kern.err kernel: [523126.625114] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c

After a reboot, the problem goes away (probably because it is very unlikely to appear twice in a row).

The problem occurs with various flash chip revisions, so we believe it is a driver issue.

On an (un-)lucky day, the error occurred on my device and I created two dumps of /dev/mtd8ro (the whole 32M of flash), one while the error was occurring and another after the reboot.
In the error state, 1290 consecutive bytes are read as FF (reliably so when running dd multiple times).

The diff from before and after the reboot looks like this (cmp -l output converted to hex, xx for redacted bytes):

01BB36BD FF xx
01BB36BE FF xx
01BB36BF FF xx
01BB36C0 FF xx
...
01BB3BC7 FF xx
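
For anyone who wants to reproduce the conversion, here is a minimal sketch of how `cmp -l` output could be turned into a hex listing like the one above and grouped into consecutive runs. It is not the script used for this report; the function names and the sample input are made up for illustration. It relies on `cmp -l` printing the byte number in decimal (counting from 1) and the two differing byte values in octal.

```python
def parse_cmp_l(text):
    """Parse `cmp -l` lines: decimal byte number, then two octal byte values."""
    diffs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        diffs.append((int(parts[0]), int(parts[1], 8), int(parts[2], 8)))
    return diffs


def to_hex_lines(diffs):
    """Render each difference as 'OFFSET B1 B2' in upper-case hex."""
    return ["%08X %02X %02X" % d for d in diffs]


def group_runs(diffs):
    """Group consecutive byte numbers into (first, last, length) tuples."""
    runs = []
    for pos, _, _ in diffs:
        if runs and pos == runs[-1][1] + 1:
            runs[-1][1] = pos  # extend the current run
        else:
            runs.append([pos, pos])  # start a new run
    return [(first, last, last - first + 1) for first, last in runs]


# Hypothetical sample input, not taken from the real dumps.
sample = """\
 5 377 101
 6 377 102
 7 377 103
12 377 104
"""

diffs = parse_cmp_l(sample)
print("\n".join(to_hex_lines(diffs)))
print(group_runs(diffs))
```

Grouping into runs makes it easy to check whether the FF bytes really form a single contiguous region or several smaller ones.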

The syslog from above belongs to the same occurrence as the diff.
It's worth noting that the file that couldn't be read is in the ROM portion of the flash, while the offset of the diff is near the end.

I've reached the limits of my knowledge. If there's anything else that would be interesting to know from the error state, let me know and I'll see what I can do.

@openwrt-bot

yogo1212:

Here is the information about the partition and mtd sizes:

$ df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                 5.8M      5.8M         0 100% /rom
tmpfs                   121.9M    124.0K    121.8M   0% /tmp
/dev/mtdblock11          21.3M    692.0K     20.6M   3% /overlay
overlayfs:/overlay       21.3M    692.0K     20.6M   3% /
tmpfs                   512.0K         0    512.0K   0% /dev
$ cat /proc/mtd
dev: size erasesize name
mtd0: 00040000 00010000 "0:SBL1"
mtd1: 00020000 00010000 "0:MIBIB"
mtd2: 00060000 00010000 "0:QSEE"
mtd3: 00010000 00010000 "0:CDT"
mtd4: 00010000 00010000 "0:DDRPARAMS"
mtd5: 00010000 00010000 "0:APPSBLENV"
mtd6: 00080000 00010000 "0:APPSBL"
mtd7: 00010000 00010000 "0:ART"
mtd8: 01e80000 00010000 "firmware"
mtd9: 00390000 00010000 "kernel"
mtd10: 01af6bdc 00010000 "rootfs"
mtd11: 01540000 00010000 "rootfs_data"
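
A rough sketch of how these sizes could be used to place an offset from the mtd8 dump in a partition. It rests on two assumptions that are not confirmed in this report: that rootfs and rootfs_data are both split off the end of the firmware partition, and that the block value in the SQUASHFS message is a byte offset from the start of the squashfs image. The `locate` helper is hypothetical.

```python
# Sizes copied from /proc/mtd above.
FIRMWARE_SIZE = 0x01E80000     # mtd8
ROOTFS_SIZE = 0x01AF6BDC       # mtd10
ROOTFS_DATA_SIZE = 0x01540000  # mtd11

# Assumption: both sub-partitions run to the end of "firmware",
# so their start offsets follow from the sizes alone.
rootfs_start = FIRMWARE_SIZE - ROOTFS_SIZE
rootfs_data_start = FIRMWARE_SIZE - ROOTFS_DATA_SIZE


def locate(offset):
    """Name the region of the mtd8 dump that contains the given offset."""
    if not 0 <= offset < FIRMWARE_SIZE:
        return "outside firmware"
    if offset >= rootfs_data_start:
        return "rootfs_data"
    if offset >= rootfs_start:
        return "rootfs"
    return "kernel"


diff_offset = 0x01BB36BD                  # first differing byte in the dumps
squashfs_block = rootfs_start + 0x3C56AA  # assumed position of the failing block

print(hex(rootfs_start), hex(rootfs_data_start))
print(locate(diff_offset))
print(locate(squashfs_block))
```

Under those assumptions the diff offset lands in rootfs_data, while block 3c56aa lands in the squashfs part of rootfs.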

@openwrt-bot

yogo1212:

On two other routers, the same block is affected:

Tue Jul 21 15:43:29 2020 kern.err kernel: [1257728.664909] SQUASHFS error: Unable to read fragment cache entry [3c56aa]
Tue Jul 21 15:43:29 2020 kern.err kernel: [1257728.664949] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c

Tue Jul 21 15:40:49 2020 kern.err kernel: [1770438.768700] SQUASHFS error: Unable to read fragment cache entry [3c56aa]
Tue Jul 21 15:40:49 2020 kern.err kernel: [1770438.768741] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c
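
To compare devices more systematically, the block and size could be pulled out of collected syslogs with something like the sketch below. The regular expression is written against the message format seen in the lines above, and the helper name is made up; other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Matches the "Unable to read page" message as it appears in the logs above.
PAGE_RE = re.compile(
    r"SQUASHFS error: Unable to read page, block ([0-9a-f]+), size ([0-9a-f]+)"
)


def squashfs_failures(log_text):
    """Return (block, size) pairs for every failed page read in the log."""
    return [(int(m.group(1), 16), int(m.group(2), 16))
            for m in PAGE_RE.finditer(log_text)]


# The two "Unable to read page" lines from the other routers.
log = """\
Tue Jul 21 15:43:29 2020 kern.err kernel: [1257728.664949] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c
Tue Jul 21 15:40:49 2020 kern.err kernel: [1770438.768741] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c
"""

# Count how often each (block, size) pair shows up across devices.
for (block, size), count in Counter(squashfs_failures(log)).items():
    print("block %x size %x seen %d time(s)" % (block, size, count))
```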

@openwrt-bot

yogo1212:

Compiled another firmware. Another device of the same type. Same error, different block:

Fri Aug 21 08:33:23 2020 daemon.err node-comm[4034]: Error loading shared library libevmqtt.so: I/O error (needed by /usr/bin/node-comm-mqtt)
Fri Aug 21 08:33:23 2020 kern.err kernel: [451294.748866] SQUASHFS error: Unable to read fragment cache entry [3edbca]
Fri Aug 21 08:33:23 2020 kern.err kernel: [451294.755269] SQUASHFS error: Unable to read page, block 3edbca, size 1522c

3066 bytes:
00E29AAD FF xx
00E29AAE FF xx
00E29AAF FF xx
...
00E2A6B4 FF xx
00E2A6B5 FF xx
