FS#3319 - erratic behaviour wireless network control on system with three wireless phy #8180

Open
openwrt-bot opened this issue Sep 5, 2020 · 1 comment

@openwrt-bot

wwortel:

Device: Ubiquiti Routerstation (ar71xx)
Software: OpenWrt SNAPSHOT, r14382+7-ad0f0df909 (local compilation after Sep 3 2020 git pull)
Symptoms: on a system with wlan0, wlan1, and wlan2 (phy0, phy1, phy2), commands like 'wifi up wlan1' make wlanN interfaces other than wlan1 disappear.
In addition, dmesg shows messages like:
[ n.n] do_page_fault(): sending SIGSEGV to hostapd for invalid write access to 00000000
[ n.n] epc = 77e79e6c in libc.so[77e4c000+9c000]
[ n.s] ra = 77e1e309 in libubus.so[77e1c000+13000]

Bug found through experimentation:
script: /lib/netifd/wireless/mac80211.sh
function: drv_mac80211_teardown()
line: json_select data
problem: 'data' does not produce phy, 'config' does.
corrected line: json_select config

While experimenting, I also found that there may be problems in
script: /lib/netifd/hostapd.sh
function: wpa_supplicant_run()
line: ubus call wpa_supplicant config_add
and the lines that follow with parameters and line continuations.
Replaced:
ubus call wpa_supplicant config_add "{ \
\"driver\": \"${_w_driver:-wext}\", \"ctrl\": \"$_rpath\", \
\"iface\": \"$ifname\", \"config\": \"$_config\" \
${network_bridge:+, \"bridge\": \"$network_bridge\"} \
${hostapd_ctrl:+, \"hostapd_ctrl\": \"$hostapd_ctrl\"} \
}"
By:
local br=${network_bridge:+',"bridge":"'$network_bridge'"'}
local hc=${hostapd_ctrl:+',"hostapd_ctrl":"'$hostapd_ctrl'"'}
local jsonstr='{"driver":"'${_w_driver:-wext}'","iface":"'${ifname}'"'${br}${hc}',"ctrl":"'${_rpath}'","config":"'${_config}'"}'
ubus call wpa_supplicant config_add "${jsonstr}"

because in a separate test script, written just to see how the original works, I could not get the original to work due to the nested parameter substitution.
I removed all the \" escapes, because " does not need to be escaped inside single quotes.
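
The idea behind the replacement, assembling the JSON payload as one string and appending the optional members only when they are set, can be sketched on its own. This is a minimal illustration, not the code from hostapd.sh: build_config_add_json is a hypothetical helper name, its argument order is an assumption, and it does no JSON escaping, so it assumes no value contains a double quote or backslash.

```shell
#!/bin/sh
# Hypothetical helper (not part of hostapd.sh): prints the config_add payload.
# Arguments: driver, ctrl path, iface, config path, [bridge], [hostapd ctrl].
# Assumption: no value contains a double quote or backslash (no escaping done).
build_config_add_json() {
	local driver ctrl iface config bridge hctrl json
	driver=${1:-wext}      # same default as ${_w_driver:-wext} in the report
	ctrl=${2:-}
	iface=${3:-}
	config=${4:-}
	bridge=${5:-}
	hctrl=${6:-}

	# Mandatory members first; printf keeps the quoting in one place.
	json=$(printf '{"driver":"%s","ctrl":"%s","iface":"%s","config":"%s"' \
		"$driver" "$ctrl" "$iface" "$config")

	# Optional members are appended only when the value is non-empty.
	if [ -n "$bridge" ]; then
		json="$json$(printf ',"bridge":"%s"' "$bridge")"
	fi
	if [ -n "$hctrl" ]; then
		json="$json$(printf ',"hostapd_ctrl":"%s"' "$hctrl")"
	fi

	printf '%s}\n' "$json"
}

# Possible use (the argument values are illustrative only):
# ubus call wpa_supplicant config_add \
#	"$(build_config_add_json nl80211 "$_rpath" "$ifname" "$_config" "$network_bridge" "$hostapd_ctrl")"
```

Passing the result in double quotes keeps the payload a single argument to ubus even if a value contains whitespace.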

@openwrt-bot

wwortel:

The hostapd crash is also easily reproducible by just 'wifi down' or 'wifi up'.
Initially I thought that drv_mac80211_teardown() in /lib/netifd/wireless/mac80211.sh chose the wrong 'data' config.
But a json_dump at that place showed that this is only the case for one of the three physical wireless interfaces that are acted upon simultaneously. The other two are served the phy data correctly.
The crash as such happens when drv_mac80211_setup() is called, either as setup or teardown action. But, again, only for one out of three wireless interfaces.
It occurs near the line 'local hostapd_pid=$(ubus call service list '{"name":"wpad"}' | jsonfilter -l 1 -e "@['wpad'].instances['hostapd'].pid")'
Even though the code before that line has been waiting for hostapd to be present on ubus (ubus wait_for hostapd), by the time communication occurs hostapd has already crashed, and in the data 'running' is reported as 'false'.
But the code has no test for that and just reports that the (empty) pid is not the correct one.
Again, this happens only for one of the three interfaces. After a crash, hostapd is started again automatically, unless too many hostapd crashes have been detected in a short time, in which case hostapd remains absent.
The missing interface is brought up later.
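
The missing check described above could look roughly like the following. This is a sketch under assumptions, not a patch against mac80211.sh: pid_is_usable is a hypothetical helper name, and the commented call site only reuses the ubus/jsonfilter line quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if $1 is a non-empty, numeric PID
# of a process that currently exists.
pid_is_usable() {
	case "${1:-}" in
		''|*[!0-9]*) return 1 ;;   # empty or non-numeric: no PID was reported
	esac
	# Signal 0 delivers nothing; it only checks that the process exists.
	# Note: this also fails for processes owned by another user when not root.
	kill -0 "$1" 2>/dev/null
}

# Sketch of a call site (inside a function, hence 'return'):
# hostapd_pid=$(ubus call service list '{"name":"wpad"}' | jsonfilter -l 1 -e "@['wpad'].instances['hostapd'].pid")
# if ! pid_is_usable "$hostapd_pid"; then
#	echo "hostapd is not running (crashed or not yet restarted)" >&2
#	return 1
# fi
```

With such a guard the script could report that hostapd is gone instead of reporting a PID mismatch for an empty value.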
My preliminary conclusion is that three identical mac80211 commands executing concurrently cause this instability. Apparently the ubus communication channel to hostapd does not properly interleave communications.
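
One way to probe this conclusion would be to force the hostapd-facing steps to run one at a time and see whether the crash disappears. The following is only a sketch of such a serialization: run_serialized and the LOCKDIR path are hypothetical, and where exactly it would have to be applied in the netifd scripts is untested.

```shell
#!/bin/sh
# Hypothetical wrapper: runs a command while holding a directory-based lock,
# so concurrent callers execute their commands one after another.
LOCKDIR=${LOCKDIR:-/tmp/wifi-serialize.lock}   # assumed path, illustration only

run_serialized() {
	local rc
	# mkdir is atomic: it succeeds for exactly one caller while the
	# directory does not exist; everyone else retries after a pause.
	until mkdir "$LOCKDIR" 2>/dev/null; do
		sleep 1
	done
	rc=0
	"$@" || rc=$?          # run the command, remember its exit status
	rmdir "$LOCKDIR"       # release the lock even if the command failed
	return "$rc"
}

# Possible use (the wrapped command is a placeholder):
# run_serialized some_hostapd_facing_step "$phy"
```

A stale lock directory left behind by a killed process would block all later callers, so this is suitable for an experiment rather than for production use.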

Another imperfection observed is that a command like 'wifi down wlan1' does not bring down just wlan1 when wlan0 and wlan2 are also present.
All three are brought down, and the two that were not named are restarted afterwards. It is very inconvenient that manipulating one radio interrupts the others.
Earlier OpenWrt versions did not have this problem.
