FS#2321 - Kernel Panic after enabling hardware offloading - EdgeRouter X #7195

Closed
openwrt-bot opened this issue Jun 15, 2019 · 6 comments

@openwrt-bot

SirToffski:

  • Device: Ubiquiti EdgeRouter X (ramips mt7621)

  • Software: LuCI Master (f138fc93) / OpenWrt SNAPSHOT r10210-09c6885

  • Steps to reproduce: enable hardware flow offloading

Brief description:
Enabling hardware flow offloading causes a kernel oops followed by a panic, and the device reboots. The panic recurs on every boot, so the device stays in a boot loop.
A hard power cycle does not resolve the kernel panic.

Steps taken to resolve the issue:
Powering the device off, disconnecting all RJ45 cables, and then reconnecting a single cable makes it possible to disable hardware offloading. Once hardware offloading is disabled, the panic no longer occurs.

Crash logs:
https://bugs.openwrt.org/index.php?do=details&task_id=1837

Additional information:
The issue was not present prior to this snapshot. A packet capture on a device connected to the eth1 port did not reveal anything unusual. The router sent a broadcast message to UDP port 4919 stating "Please press button now to enter failsafe". Presumably this is a normal part of the boot process.

@openwrt-bot

SirToffski:

Apologies, the post above had the wrong link for the crash logs.

The correct link is: https://gist.github.com/SirToffski/41aa69bf4baca463bb04b9c1f746fad3

Attaching the logs here as well.

Jun 14 05:20:40 _gateway kernel: [ 246.730704] CPU 1 Unable to handle kernel paging request at virtual address 000000a4, epc == 803901c8, ra == 8efda024
Jun 14 05:20:40 _gateway kernel: [ 246.751874] Oops[#1]:
Jun 14 05:20:40 _gateway kernel: [ 246.756403] CPU: 1 PID: 137 Comm: kworker/1:1 Not tainted 4.14.125 #0
Jun 14 05:20:40 _gateway kernel: [ 246.769298] Workqueue: events_power_efficient nf_flow_dnat_port [nf_flow_table]
Jun 14 05:20:40 _gateway kernel: [ 246.783871] task: 8fe44e00 task.stack: 8feac000
Jun 14 05:20:40 _gateway kernel: [ 246.792887] $ 0 : 00000000 00000001 00000200 00000003
Jun 14 05:20:40 _gateway kernel: [ 246.803306] $ 4 : 00000000 00000003 8feadd98 00000001
Jun 14 05:20:40 _gateway kernel: [ 246.813720] $ 8 : 00000000 00000005 0c269a71 00000002
Jun 14 05:20:40 _gateway kernel: [ 246.824146] $12 : 00000000 00000253 7fafc8f4 7fafc8f4
Jun 14 05:20:40 _gateway kernel: [ 246.834572] $16 : 00000000 8feadd98 8da62804 8e9bba98
Jun 14 05:20:40 _gateway kernel: [ 246.844989] $20 : 8e9b9df8 8da6283c 80600000 00000000
Jun 14 05:20:40 _gateway kernel: [ 246.855398] $24 : 77ec1f88 8000cf94
Jun 14 05:20:40 _gateway kernel: [ 246.865807] $28 : 8feac000 8feadd68 8ef30000 8efda024
Jun 14 05:20:40 _gateway kernel: [ 246.876216] Hi : 00000099
Jun 14 05:20:40 _gateway kernel: [ 246.881934] Lo : 999999c0
Jun 14 05:20:40 _gateway kernel: [ 246.887682] epc : 803901c8 dev_get_by_index_rcu+0x4/0x5c
Jun 14 05:20:40 _gateway kernel: [ 246.898664] ra : 8efda024 0x8efda024
Jun 14 05:20:40 _gateway kernel: [ 246.906298] Status: 11007c03 KERNEL EXL IE
Jun 14 05:20:40 _gateway kernel: [ 246.914629] Cause : 40800008 (ExcCode 02)
Jun 14 05:20:40 _gateway kernel: [ 246.922595] BadVA : 000000a4
Jun 14 05:20:40 _gateway kernel: [ 246.928314] PrId : 0001992f (MIPS 1004Kc)
Jun 14 05:20:40 _gateway kernel: [ 246.936450] Modules linked in: pppoe ppp_async pppox ppp_generic nft_fib_inet nf_nat_pptp nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet nf_conntrack_pptp iptable_nat ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT wireguard ts_fsm ts_bm slhc sch_mqprio nft_set_rbtree nft_set_hash nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir_ipv6 nft_redir_ipv4 nft_redir nft_quota nft_numgen nft_nat nft_meta_bridge nft_meta nft_masq_ipv6 nft_masq_ipv4 nft_masq nft_log nft_limit nft_fwd_netdev nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_exthdr nft_dup_netdev nft_ct nft_counter nft_chain_route_ipv6 nft_chain_route_ipv4 nft_chain_nat_ipv6
Jun 14 05:20:40 _gateway kernel: [ 247.077963] nft_chain_nat_ipv4 nf_tables_netdev nf_tables_ipv6 nf_tables_ipv4 nf_tables_inet nf_tables_bridge nf_tables_arp nf_tables nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv6 nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_dup_netdev nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conntrack libcrc32c iptable_raw iptable_mangle iptable_filter ip_tables crc_ccitt sch_tbf sch_ingress sch_htb sch_hfsc
Jun 14 05:20:40 _gateway kernel: [ 247.219602] em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ip6_udp_tunnel udp_tunnel leds_gpio gpio_button_hotplug crc32c_generic
Jun 14 05:20:40 _gateway kernel: [ 247.323344] Process kworker/1:1 (pid: 137, threadinfo=8feac000, task=8fe44e00, tls=00000000)
Jun 14 05:20:40 _gateway kernel: [ 247.340122] Stack : 81263d30 8fc24880 615180f4 c0a8019c 00000000 8da62800 8e99f400 8efda1a4
Jun 14 05:20:40 _gateway kernel: [ 247.356769] 8fc04000 8004a714 6a8ac264 00000039 00000000 00000000 00000000 00000000
Jun 14 05:20:40 _gateway kernel: [ 247.373413] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Jun 14 05:20:40 _gateway kernel: [ 247.390056] 00000000 00000000 80600000 0000001e 8da62800 8efda3a4 8da2793c 81263980
Jun 14 05:20:40 _gateway kernel: [ 247.406701] 0000001e 8da62800 8e99f400 8ef31308 80600000 80600000 805fe0c4 00000009
Jun 14 05:20:40 _gateway kernel: [ 247.423346] ...
Jun 14 05:20:40 _gateway kernel: [ 247.428208] Call Trace:
Jun 14 05:20:40 _gateway kernel: [ 247.433085] [<803901c8>] dev_get_by_index_rcu+0x4/0x5c
Jun 14 05:20:40 _gateway kernel: [ 247.443340] [<8efda024>] 0x8efda024
Jun 14 05:20:40 _gateway kernel: [ 247.450278] Code: 03e00008 00000000 30a300ff <8c8200a4> 00031880 00431021 8c420000 1040000e 00000000
Jun 14 05:20:40 _gateway kernel: [ 247.469702]
Jun 14 05:20:40 _gateway kernel: [ 247.473148] ---[ end trace 6301c6566f43e61d ]---
Jun 14 05:20:40 _gateway kernel: [ 247.485047] Kernel panic - not syncing: Fatal exception in interrupt
Jun 14 05:22:22 _gateway logread[783]: Logread connected to 192.168.1.156:514
Jun 14 05:22:22 _gateway uhttpd[1144]: luci: accepted login on / for root from 192.168.1.151
Jun 14 05:22:23 _gateway kernel: [ 41.395361] CPU 3 Unable to handle kernel paging request at virtual address 000000a4, epc == 803901c8, ra == 8f0fc024
Jun 14 05:22:23 _gateway kernel: [ 41.416531] Oops[#1]:
Jun 14 05:22:23 _gateway kernel: [ 41.421061] CPU: 3 PID: 307 Comm: kworker/3:1 Not tainted 4.14.125 #0
Jun 14 05:22:23 _gateway kernel: [ 41.433962] Workqueue: events_power_efficient nf_flow_dnat_port [nf_flow_table]
Jun 14 05:22:23 _gateway kernel: [ 41.448550] task: 8fe96800 task.stack: 8ff80000
Jun 14 05:22:23 _gateway kernel: [ 41.457569] $ 0 : 00000000 00000001 00000200 00000003
Jun 14 05:22:23 _gateway kernel: [ 41.467989] $ 4 : 00000000 00000003 8ff81d98 00001102
Jun 14 05:22:23 _gateway kernel: [ 41.478413] $ 8 : 00000000 00000005 7e22e3bd 00000002
Jun 14 05:22:23 _gateway kernel: [ 41.488832] $12 : 00000000 0041a278 00000000 00000000
Jun 14 05:22:23 _gateway kernel: [ 41.499247] $16 : 00000000 8ff81d98 8f7b0b04 8fd6ca98
Jun 14 05:22:23 _gateway kernel: [ 41.509681] $20 : 8f7b01bc 8f7b0b3c 80600000 00000000
Jun 14 05:22:23 _gateway kernel: [ 41.520101] $24 : 00000000 8000cf94
Jun 14 05:22:23 _gateway kernel: [ 41.530525] $28 : 8ff80000 8ff81d68 8f170000 8f0fc024
Jun 14 05:22:23 _gateway kernel: [ 41.540964] Hi : 00000133
Jun 14 05:22:23 _gateway kernel: [ 41.546715] Lo : 33333380
Jun 14 05:22:23 _gateway kernel: [ 41.552472] epc : 803901c8 dev_get_by_index_rcu+0x4/0x5c
Jun 14 05:22:23 _gateway kernel: [ 41.563451] ra : 8f0fc024 0x8f0fc024
Jun 14 05:22:23 _gateway kernel: [ 41.571098] Status: 11007c03 KERNEL EXL IE
Jun 14 05:22:23 _gateway kernel: [ 41.579442] Cause : 40800008 (ExcCode 02)
Jun 14 05:22:23 _gateway kernel: [ 41.587428] BadVA : 000000a4
Jun 14 05:22:23 _gateway kernel: [ 41.593159] PrId : 0001992f (MIPS 1004Kc)
Jun 14 05:22:23 _gateway kernel: [ 41.601307] Modules linked in: pppoe ppp_async pppox ppp_generic nft_fib_inet nf_nat_pptp nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet nf_conntrack_pptp iptable_nat ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT wireguard ts_fsm ts_bm slhc sch_mqprio nft_set_rbtree nft_set_hash nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir_ipv6 nft_redir_ipv4 nft_redir nft_quota nft_numgen nft_nat nft_meta_bridge nft_meta nft_masq_ipv6 nft_masq_ipv4 nft_masq nft_log nft_limit nft_fwd_netdev nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_exthdr nft_dup_netdev nft_ct nft_counter nft_chain_route_ipv6 nft_chain_route_ipv4 nft_chain_nat_ipv6
Jun 14 05:22:23 _gateway kernel: [ 41.742857] nft_chain_nat_ipv4 nf_tables_netdev nf_tables_ipv6 nf_tables_ipv4 nf_tables_inet nf_tables_bridge nf_tables_arp nf_tables nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv6 nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_dup_netdev nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conntrack libcrc32c iptable_raw iptable_mangle iptable_filter ip_tables crc_ccitt sch_tbf sch_ingress sch_htb sch_hfsc
Jun 14 05:22:23 _gateway kernel: [ 41.884434] em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ip6_udp_tunnel udp_tunnel leds_gpio gpio_button_hotplug crc32c_generic
Jun 14 05:22:23 _gateway kernel: [ 41.988111] Process kworker/3:1 (pid: 307, threadinfo=8ff80000, task=8fe96800, tls=00000000)
Jun 14 05:22:23 _gateway kernel: [ 42.004887] Stack : 8127fd30 8fc24c80 00000000 00000000 00000000 8f7b0b00 8e132400 8f0fc1a4
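
The register dump and the "Code:" line above contain enough information to see what kind of access faulted. The following is a minimal sketch, not part of the original report, that decodes the Cause register and the faulting instruction word using the standard MIPS32 field layout. The function names are made up for illustration, and the constants are copied by hand from the first oops.

```python
def decode_cause(cause: int) -> int:
    """Return the ExcCode field (bits 6..2) of the MIPS CP0 Cause register."""
    return (cause >> 2) & 0x1F


def decode_itype(word: int) -> dict:
    """Split a 32-bit MIPS I-type instruction word into its fields."""
    imm = word & 0xFFFF
    if imm & 0x8000:  # sign-extend the 16-bit immediate
        imm -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,  # base register
        "rt": (word >> 16) & 0x1F,  # destination register
        "imm": imm,
    }


# Values copied from the first oops in the log above.
CAUSE = 0x40800008          # "Cause :" line
FAULTING_WORD = 0x8C8200A4  # the word in <...> on the "Code:" line
REG_4 = 0x00000000          # first value on the "$ 4 :" line (register $4, a0)
BAD_VA = 0x000000A4         # "BadVA :" line

exc_code = decode_cause(CAUSE)
fields = decode_itype(FAULTING_WORD)
effective_address = (REG_4 + fields["imm"]) & 0xFFFFFFFF

print(f"ExcCode = {exc_code}")
print(
    f"opcode = {fields['opcode']:#x}, base = ${fields['rs']}, "
    f"dest = ${fields['rt']}, offset = {fields['imm']:#x}"
)
print(f"effective address = {effective_address:#010x}")
```

Opcode 0x23 is `lw`, so the faulting instruction is a word load from offset 0xa4 relative to register $4, which holds 0 in the dump. The computed address matches BadVA, and ExcCode 2 is the TLB exception for a load. Since $4 carries the first argument under the usual MIPS calling convention, this is consistent with dev_get_by_index_rcu being entered with a NULL first argument, although the log alone does not show why that pointer was NULL.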

@openwrt-bot

ynezz:

"The issue was not present prior to this snapshot."

Do you know the last working version (git hash)?

@openwrt-bot

SirToffski:

Petr,

I have just tested using SNAPSHOT r10168-b8a72dfd28 / LuCI Master (f138fc93) / Kernel 4.14.123, generated on June 6th at 20:30 UTC. The issue is not present.

Enabling hw offloading and rebooting works as expected.

root@OpenWrt:/etc# cat openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r10168-b8a72dfd28'
DISTRIB_TARGET='ramips/mt7621'
DISTRIB_ARCH='mipsel_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r10168-b8a72dfd28'
DISTRIB_TAINTS=''

root@OpenWrt:~# cat /etc/config/firewall

config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option flow_offloading '1'
	option flow_offloading_hw '1'

@openwrt-bot

SirToffski:

I was able to replicate the issue with June 16th snapshot as well.

  • The sysupgrade image contained exactly the same packages as the last snapshot build known to be free of the issue (r10168-b8a72dfd28).

  • The sysupgrade was performed without preserving the configuration. The configuration was re-created manually, with HW offload enabled as the last step.

  • As soon as HW offload was enabled, the router went into a boot loop.

Appears to be a bug introduced in kernel 4.14.125.

Syslogs are saved on my server, I will be able to provide them soon.

Let me know if there is anything else I can do to assist in resolving the issue.

Thanks!
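
The two revision strings reported in this thread (r10168-b8a72dfd28 as working, r10210-09c6885 as failing) bracket the regression. As a rough sketch, and assuming the number after "r" is a monotonically increasing commit counter and the suffix is an abbreviated git hash, a small helper can turn them into a commit range to review. The function names are hypothetical, and the printed `git log` command is only a suggestion for inspecting the range in a local source checkout.

```python
import re
from typing import Tuple

# Matches strings such as "r10168-b8a72dfd28": counter, dash, short hash.
REVISION_RE = re.compile(r"^r(\d+)-([0-9a-f]+)$")


def parse_revision(revision: str) -> Tuple[int, str]:
    """Split a snapshot revision string into (counter, short git hash)."""
    match = REVISION_RE.match(revision)
    if match is None:
        raise ValueError(f"unrecognised revision string: {revision!r}")
    return int(match.group(1)), match.group(2)


def regression_range(good: str, bad: str) -> Tuple[int, str]:
    """Return (number of commits in the window, git log command for it)."""
    good_count, good_hash = parse_revision(good)
    bad_count, bad_hash = parse_revision(bad)
    if good_count >= bad_count:
        raise ValueError("the good revision must be older than the bad one")
    command = f"git log --oneline {good_hash}..{bad_hash}"
    return bad_count - good_count, command


# Revisions taken from the comments in this thread.
commits, command = regression_range("r10168-b8a72dfd28", "r10210-09c6885")
print(f"{commits} commits to review")
print(command)
```

Under that assumption the window is small enough to read through by hand or to bisect in a few builds.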

@openwrt-bot

ynezz:

If you're able to compile your own image, then #2153 could be worth a try.

@openwrt-bot

simontretter:

Kernel panics started after the inclusion of KERNEL_NET_NS=y.

This patch fixes it: https://github.com//pull/2153

Tested on mt7621 (ER-X):

DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r10307-629e6538a1'
DISTRIB_TARGET='ramips/mt7621'
DISTRIB_ARCH='mipsel_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r10307-629e6538a1'
