FS#2643 - EspressoBin: Internal error: synchronous parity or ECC error: 86000018 [#1] SMP #7465

Open
openwrt-bot opened this issue Nov 30, 2019 · 6 comments
@openwrt-bot

russell:

Supply the following if possible:

  • Device problem occurs on

Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)

  • Software versions of OpenWrt/LEDE release, packages, etc.

11595-g5d2a900163

  • Steps to reproduce

Boot and wait about an hour. Here is a series of kernel panics:

[ 4228.773060] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[ 4228.777653] Modules linked in: ath9k ath9k_common xt_connlimit nf_conncount iptable_nat ipt_MASQUERADE ath9k_hw ath xt_state xt_nat xt_helper xt_conntrack xt_connmark xt_connbytes xt_REDIRECT xt_NETMAP xt_FLOWOFFLOAD xt_CT nf_nat_ipv4 nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_recent xt_quota xt_pkttype xt_owner xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_condition xt_comment xt_bpf xt_addrtype xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables
[ 4228.850958] nf_reject_ipv6 tun gpio_button_hotplug
[ 4228.855992] Process kworker/1:2 (pid: 594, stack limit = 0x00000000a141aece)
[ 4228.863262] CPU: 1 PID: 594 Comm: kworker/1:2 Not tainted 4.19.85 #0
[ 4228.869776] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[ 4228.876450] Workqueue: (null) (events)
[ 4228.881258] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[ 4228.886204] pc : kthread_data+0x8/0x40
[ 4228.890035] lr : wq_worker_sleeping+0xc/0xb0
[ 4228.894410] sp : ffffff8009463d60
[ 4228.897819] x29: ffffff8009463d60 x28: 0000000000000000
[ 4228.903281] x27: 0000000000000000 x26: ffffff80080b5318
[ 4228.908746] x25: 0000000000000000 x24: ffffffc03e479a18
[ 4228.914213] x23: ffffff8008868738 x22: ffffff800884e018
[ 4228.919678] x21: ffffff8008853500 x20: ffffffc03e479500
[ 4228.925146] x19: ffffffc03ffdd500 x18: 0000000000000000
[ 4228.930609] x17: 0000000000000000 x16: 0000000000000000
[ 4228.936076] x15: 0000000000000000 x14: 0000000000000000
[ 4228.941541] x13: 0000000000000002 x12: 0000000000000001
[ 4228.947006] x11: ffffffc03ffd94d0 x10: 00000000000007c0
[ 4228.952472] x9 : ffffffc03e13d668 x8 : ffffff80088a9b08
[ 4228.957937] x7 : 0000000000000000 x6 : 00000000050ebc0a
[ 4228.963404] x5 : 0000000000000004 x4 : 0000000000000002
[ 4228.968869] x3 : 00000000fffffff9 x2 : 0000000800000000
[ 4228.974335] x1 : 0000000000000001 x0 : ffffffc03e479500
[ 4228.979797] Call trace:
[ 4228.982331] kthread_data+0x8/0x40
[ 4228.985813] wq_worker_sleeping+0xc/0xb0
[ 4228.989859] __schedule+0x140/0x570
[ 4228.993436] schedule+0x58/0x80
[ 4228.996656] worker_thread+0x370/0x468
[ 4229.000513] kthread+0x110/0x120
[ 4229.003829] ret_from_fork+0x10/0x1c
[ 4229.007521] Code: d65f03c0 d503201f a9be7bfd 910003fd (f9000bf3)
[ 4229.013769] ---[ end trace 4f1be3f746ce5030 ]---
[ 4229.024169] Kernel panic - not syncing: Fatal exception
[ 4229.026728] SMP: stopping secondary CPUs
[ 4229.030757] Kernel Offset: disabled
[ 4229.034342] CPU features: 0x0,00002008
[ 4229.038186] Memory Limit: none
[ 4229.044024] Rebooting in 3 seconds..
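
For reference, the hex value on the "Internal error" line is the arm64 exception syndrome (ESR_EL1) that the kernel prints when it dies. Below is a minimal Python sketch of splitting it into fields. The function name `decode_esr` and the small lookup tables are illustrative only and cover just the codes seen in this report. The field layout (EC in bits 31:26, IL in bit 25, fault status code in the low 6 bits for aborts) follows the Arm architecture manual.

```python
# Illustrative decoder for an arm64 ESR value as printed in a kernel oops.
# The tables below are deliberately partial: only codes seen in this report.

EC_NAMES = {
    0x21: "instruction abort, same exception level",
    0x25: "data abort, same exception level",
    0x2F: "SError interrupt",
}

FSC_NAMES = {
    0x18: "synchronous parity or ECC error",  # wording used by the kernel
}

ABORT_CLASSES = (0x20, 0x21, 0x24, 0x25)  # instruction/data aborts


def decode_esr(esr):
    """Split a 32-bit ESR value into exception class, IL bit and ISS."""
    ec = (esr >> 26) & 0x3F       # exception class, bits 31:26
    il = (esr >> 25) & 0x1        # instruction length bit, bit 25
    iss = esr & 0x1FFFFFF         # instruction specific syndrome, bits 24:0
    # The low 6 bits of the ISS are a fault status code only for aborts.
    fsc = iss & 0x3F if ec in ABORT_CLASSES else None
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this sketch's table"),
        "il": il,
        "iss": iss,
        "fsc": fsc,
        "fsc_name": FSC_NAMES.get(fsc) if fsc is not None else None,
    }


if __name__ == "__main__":
    info = decode_esr(0x86000018)
    print(hex(info["ec"]), info["ec_name"])
    print(hex(info["fsc"]), info["fsc_name"])
```

With this layout, 86000018 decodes to exception class 0x21 (an instruction abort taken in kernel mode) with fault status 0x18, so the error was reported on an instruction fetch rather than on a data access.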

[ 3676.038431] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[ 3676.043028] Modules linked in: ath9k ath9k_common xt_connlimit nf_conncount iptable_nat ipt_MASQUERADE ath9k_hw ath xt_state xt_nat xt_helper xt_conntrack xt_connmark xt_connbytes xt_REDIRECT xt_NETMAP xt_FLOWOFFLOAD xt_CT nf_nat_ipv4 nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_recent xt_quota xt_pkttype xt_owner xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_condition xt_comment xt_bpf xt_addrtype xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables
[ 3676.116317] nf_reject_ipv6 tun gpio_button_hotplug
[ 3676.121338] Process kworker/0:1 (pid: 41, stack limit = 0x000000008b9ea8ea)
[ 3676.128509] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 4.19.85 #0
[ 3676.134953] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[ 3676.141596] Workqueue: (null) (events)
[ 3676.146426] pstate: 60400085 (nZCv daIf +PAN -UAO)
[ 3676.151361] pc : update_cfs_group+0x0/0xb0
[ 3676.155565] lr : dequeue_task_fair+0x440/0x960
[ 3676.160130] sp : ffffff8008a23cf0
[ 3676.163535] x29: ffffff8008a23cf0 x28: ffffffc03ffcd500
[ 3676.169001] x27: ffffffc0043a5e80 x26: ffffff80088523f0
[ 3676.174466] x25: afb504000afb5041 x24: ffffff80088c85c0
[ 3676.179932] x23: 00000357e4f6aec0 x22: ffffff800884e018
[ 3676.185397] x21: 0000000000000009 x20: ffffffc0043a5f40
[ 3676.190863] x19: ffffffc03ffcd580 x18: 0000000000000000
[ 3676.196329] x17: 0000000000000000 x16: 0000000000000000
[ 3676.201793] x15: 0000000000000000 x14: 0000000000000000
[ 3676.207259] x13: 0000000000000000 x12: 0000000000000000
[ 3676.212726] x11: 0000000000000000 x10: 00000000000007c0
[ 3676.218191] x9 : ffffffc0042eea68 x8 : ffffff80088a9b08
[ 3676.223657] x7 : 0000000000000001 x6 : 00000000050e6eea
[ 3676.229122] x5 : 00000000000000e8 x4 : ffffffc0043a3ff0
[ 3676.234587] x3 : ffffffc03ffcdef8 x2 : ffffffc0043a5f70
[ 3676.240053] x1 : 0000000000000001 x0 : ffffffc0043a5f40
[ 3676.245519] Call trace:
[ 3676.248032] update_cfs_group+0x0/0xb0
[ 3676.251887] deactivate_task+0x6c/0x80
[ 3676.255740] __schedule+0x10c/0x570
[ 3676.259321] schedule+0x58/0x80
[ 3676.262546] worker_thread+0x370/0x468
[ 3676.266400] kthread+0x110/0x120
[ 3676.269715] ret_from_fork+0x10/0x1c
[ 3676.273393] Code: a9425bf5 a8c37bfd d65f03c0 d503201f (f9404407)
[ 3676.279660] ---[ end trace fa914749a4dc6031 ]---
[ 3676.285581] Kernel panic - not syncing: Fatal exception
[ 3676.289785] SMP: stopping secondary CPUs
[ 3676.293816] Kernel Offset: disabled
[ 3676.297399] CPU features: 0x0,00002008
[ 3676.301249] Memory Limit: none
[ 3676.304922] Rebooting in 3 seconds..

[12929.680240] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[12929.684836] Modules linked in: ath9k ath9k_common xt_connlimit nf_conncount iptable_nat ipt_MASQUERADE ath9k_hw ath xt_state xt_nat xt_helper xt_conntrack xt_connmark xt_connbytes xt_REDIRECT xt_NETMAP xt_FLOWOFFLOAD xt_CT nf_nat_ipv4 nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_recent xt_quota xt_pkttype xt_owner xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_condition xt_comment xt_bpf xt_addrtype xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables
[12929.758123] nf_reject_ipv6 tun gpio_button_hotplug
[12929.763145] Process kworker/0:0 (pid: 5, stack limit = 0x00000000f766a017)
[12929.770227] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 4.19.85 #0
[12929.776582] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[12929.783224] Workqueue: (null) (events)
[12929.788054] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[12929.792990] pc : __schedule+0x348/0x570
[12929.796923] lr : __schedule+0x2d0/0x570
[12929.800862] sp : ffffff800804bd90
[12929.804268] x29: ffffff800804bd90 x28: 0000000000000000
[12929.809733] x27: ffffffc03ffccef0 x26: ffffffc03ffccea0
[12929.815199] x25: 0000000000000000 x24: ffffffc00427af10
[12929.820664] x23: ffffff8008666460 x22: ffffff800884e018
[12929.826129] x21: ffffffc00427de80 x20: ffffffc00427aa00
[12929.831595] x19: ffffffc03ffcd500 x18: 0000000000000000
[12929.837061] x17: 0000000000000000 x16: 0000000000000000
[12929.842526] x15: 0000000000000000 x14: 0000000000000000
[12929.847992] x13: 0000000000000002 x12: 0000000000000000
[12929.853458] x11: ffffffc03ffc94d0 x10: 00000000000007c0
[12929.858924] x9 : ffffffc004230168 x8 : ffffff80088a9b08
[12929.864388] x7 : 0000000000000000 x6 : 000000199ea477c6
[12929.869855] x5 : 000000000000030f x4 : 00000bc26cff1000
[12929.875320] x3 : ffffffc03ffcdef8 x2 : ffffffc03ffcdef8
[12929.880785] x1 : ffffffc03ffcdef8 x0 : 0000000000000008
[12929.886251] Call trace:
[12929.888764] __schedule+0x348/0x570
[12929.892348] schedule+0x58/0x80
[12929.895575] worker_thread+0x370/0x468
[12929.899428] kthread+0x110/0x120
[12929.902742] ret_from_fork+0x10/0x1c
[12929.906418] Code: b5fffeb7 d4210000 f9800291 c85f7e80 (927ef800)
[12929.912686] ---[ end trace 5f4e2dd5c1da7acd ]---
[12929.918593] Kernel panic - not syncing: Fatal exception
[12929.922812] SMP: stopping secondary CPUs
[12929.926844] Kernel Offset: disabled
[12929.930426] CPU features: 0x0,00002008
[12929.934277] Memory Limit: none
[12929.937946] Rebooting in 3 seconds..
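
To compare panics like the three above side by side, the uptime, error code and faulting `pc` can be pulled out of a captured log. This is a rough Python sketch, not an existing tool: the function name `summarize_panics` and both regular expressions are assumptions written against the line formats shown in this report, and the sample holds only two lines copied from each trace.

```python
import re

# Patterns written against the oops lines shown in this report (assumption:
# other kernel versions may format these lines differently).
HEADER = re.compile(
    r"\[\s*(\d+\.\d+)\] Internal error: (.+?): ([0-9a-f]{8}) \[#\d+\]"
)
PC = re.compile(r"\[\s*\d+\.\d+\] pc : (\S+)")


def summarize_panics(log):
    """Return one dict per 'Internal error' block found in the log text."""
    panics = []
    for line in log.splitlines():
        m = HEADER.search(line)
        if m:
            panics.append({
                "uptime_s": float(m.group(1)),
                "reason": m.group(2),
                "esr": int(m.group(3), 16),
                "pc": None,
            })
            continue
        m = PC.search(line)
        # Attach the first pc line that follows each header.
        if m and panics and panics[-1]["pc"] is None:
            panics[-1]["pc"] = m.group(1)
    return panics


SAMPLE = """\
[ 4228.773060] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[ 4228.886204] pc : kthread_data+0x8/0x40
[ 3676.038431] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[ 3676.151361] pc : update_cfs_group+0x0/0xb0
[12929.680240] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[12929.792990] pc : __schedule+0x348/0x570
"""

if __name__ == "__main__":
    for p in summarize_panics(SAMPLE):
        print(f"{p['uptime_s'] / 3600:.1f} h  {p['esr']:#x}  {p['pc']}")
```

On the three traces above, this shows the same error code each time but a different faulting function (`kthread_data`, `update_cfs_group`, `__schedule`), all reached from a kworker going to sleep.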

@openwrt-bot
Author

russell:

9482-gb2bf3745ff (circa 2019-02-28) is not affected.

@openwrt-bot
Author

russell:

10099-gcf463159df (circa 2019-06-02) seems okay.

11167-g273a6cb562 (circa 2019-10-07) seems broken (similar panics as above).

@openwrt-bot
Author

russell:

Commit f342ffd (the switch to 4.19.x) seems to be affected. The immediately prior commit (9b8d0f1), which I am currently testing, does not seem to be: 24 hours later, still no freeze or panic. I am surprised no one else has noticed this.

@openwrt-bot
Author

russell:

I tried the snapshot config in my build tree:

DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r12121-2dc0a8c180'
DISTRIB_TARGET='mvebu/cortexa53'
DISTRIB_ARCH='aarch64_cortex-a53'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r12121-2dc0a8c180'
DISTRIB_TAINTS=''

uname -a:

Linux OpenWrt 4.19.98 #0 SMP Fri Jan 24 17:52:41 2020 aarch64 GNU/Linux

After 2+ days uptime:

[276126.053707] Internal error: synchronous parity or ECC error: 86000018 [#1] SMP
[276126.058388] Modules linked in: pppoe ppp_async iptable_nat ipt_MASQUERADE xt_state xt_nat xt_conntrack xt_REDIRECT xt_FLOWOFFLOAD xt_CT pppox ppp_generic nf_nat_ipv4 nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack ipt_REJECT xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG slhc nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables crc_ccitt nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 gpio_button_hotplug
[276126.109847] Process kworker/1:1 (pid: 28, stack limit = 0x000000002318eef4)
[276126.117111] CPU: 1 PID: 28 Comm: kworker/1:1 Not tainted 4.19.98 #0
[276126.123631] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[276126.130390] Workqueue: (null) (events)
[276126.135290] pstate: 60400085 (nZCv daIf +PAN -UAO)
[276126.140326] pc : __switch_to+0x0/0xf0
[276126.144168] lr : __schedule+0x508/0x570
[276126.148178] sp : ffffff8008a03d90
[276126.151673] x29: ffffff8008a03d90 x28: 0000000000000000
[276126.157228] x27: 0000000000000000 x26: ffffff80080b4990
[276126.162784] x25: 0000000000000000 x24: ffffffc03dc51500
[276126.168337] x23: 0000000000000000 x22: ffffff800887f018
[276126.173893] x21: ffffffc03e4dd400 x20: ffffffc004365400
[276126.179448] x19: ffffffc03ffdd540 x18: 0000000000000000
[276126.185004] x17: 0000000000000000 x16: 0000000000000000
[276126.190559] x15: 0000000000000000 x14: 0000000000000000
[276126.196114] x13: 0000000000000002 x12: 0000000000000001
[276126.201670] x11: ffffffc03ffd94d0 x10: 00000000000007c0
[276126.207225] x9 : ffffffc00431f268 x8 : ffffff80088dabd8
[276126.212778] x7 : 0000000000000000 x6 : 000003199d53cb09
[276126.218335] x5 : 0000000000000103 x4 : 0000fb229b57d000
[276126.223890] x3 : ffffffc03ffddf38 x2 : 0000000000000000
[276126.229447] x1 : ffffffc03e4dd400 x0 : ffffffc004365400
[276126.234999] Call trace:
[276126.237616] __switch_to+0x0/0xf0
[276126.241108] schedule+0x58/0x80
[276126.244423] worker_thread+0x370/0x468
[276126.248369] kthread+0x110/0x120
[276126.251769] ret_from_fork+0x10/0x1c
[276126.255549] Code: f9041801 d65f03c0 d65f03c0 d503201f (a9be7bfd)
[276126.261888] ---[ end trace 8a0f9451e47635d7 ]---
[276126.272178] Kernel panic - not syncing: Fatal exception
[276126.274828] SMP: stopping secondary CPUs
[276126.278946] Kernel Offset: disabled
[276126.282618] CPU features: 0x0,00002008
[276126.286553] Memory Limit: none
[276126.292489] Rebooting in 3 seconds..

@openwrt-bot
Author

russell:

Reproduced another panic with an OpenWrt snapshot from downloads.openwrt.org:

OpenWrt SNAPSHOT, r12190-71de48bd37
Linux OpenWrt 4.19.101 #0 SMP Wed Feb 5 20:56:02 2020 aarch64 GNU/Linux

Tue Feb 11 19:30:06 UTC 2020
 19:30:06 up 4 days,  8:36,  load average: 0.00, 0.00, 0.00
              total        used        free      shared  buff/cache   available
Mem:        1019624       40432      968244          60       10948      950988
Swap:             0           0           0
[376792.183870] SError Interrupt on CPU1, code 0xbf000001 -- SError
[376792.183875] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.101 #0
[376792.183877] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[376792.183880] pstate: 60400085 (nZCv daIf +PAN -UAO)
[376792.183881] pc : el1_da+0xc/0xb0
[376792.183883] lr : do_idle+0x98/0x1b0
[376792.183885] sp : ffffff8008953e10
[376792.183887] x29: ffffff8008953f40 x28: ffffffc004290000
[376792.183892] x27: 0000000000000000 x26: 0000000000000000
[376792.183896] x25: 0000000000000000 x24: 0000000000000025
[376792.183901] x23: 0000000020400085 x22: ffffff80080c6154
[376792.183905] x21: 000000000093f000 x20: 0000007fffffffff
[376792.183909] x19: ffffff8008898578 x18: 0000000000000000
[376792.183913] x17: 0000000000000000 x16: 0000000000000000
[376792.183917] x15: 0000000000000000 x14: 0000000000000000
[376792.183921] x13: 0000000000000002 x12: 0000000000000001
[376792.183925] x11: ffffffc03ffd94d0 x10: 00000000000007c0
[376792.183930] x9 : ffffff8008953ec0 x8 : ffffffc004290820
[376792.183934] x7 : 0000000000000000 x6 : 000004400338df17
[376792.183938] x5 : 0000000000000046 x4 : 0000000000000001
[376792.183942] x3 : ffffffc03ffdc5b0 x2 : 0000000000000080
[376792.183946] x1 : 0000000096000018 x0 : ffffff80088835b0
[376792.183951] Kernel panic - not syncing: Asynchronous SError Interrupt
[376792.183965] SMP: stopping secondary CPUs
[376792.183967] Kernel Offset: disabled
[376792.183969] CPU features: 0x0,00002008
[376792.183970] Memory Limit: none

These panics do not happen on 4.14.x. That includes 9+ days of uptime without a panic (before I gave up and tried a different image) on:

OpenWrt 19.07.0, r10860-a3ffeb413b
Linux OpenWrt 4.14.162 #0 SMP Mon Jan 6 16:47:09 2020 aarch64 GNU/Linux

@openwrt-bot
Author

Andre:

Sounds like DDR instability. Try a firmware build with CPU_800_DDR_800.
