FS#506 - BT Home Hub 5: 5g WiFi jumps to channel 36 and stops working if stopped and restarted #5524

Open
openwrt-bot opened this issue Feb 13, 2017 · 22 comments

@openwrt-bot

ezplanet:

Supply the following if possible:

  • Device the problem occurs on: BT Home Hub 5
  • Software versions (LEDE release, packages): r3425-f28eef4 and previous versions
  • Steps to reproduce:

After a few hours, when the 5 GHz radio (radio0, Qualcomm Atheros QCA9880 802.11nac) is configured on a Band B channel, it jumps to channel 36 even though there is no apparent interference.
This appears to happen more frequently overnight, during long periods of inactivity. It does not happen while the device is in use.

After the jump, if I stop and restart the radio (using the LuCI disable/enable buttons), it stops but fails to restart. Attempts to change the channel and re-enable the radio do not help either.

It requires a reboot to restore service.

@openwrt-bot

ezplanet:

This happens also on a WNDR3700v4:

Configure Atheros AR9580 802.11an (radio1) to a Band B Channel (100-140)

Use a WiFi scanner like Android WiFi Analyzer to scan WiFi channels.

The router is initially on the configured channel. After a few minutes it jumps to channel 36 (Band A). /etc/config/wireless still contains the original channel; only the radio is on channel 36. After a reboot it takes the configured Band B channel (e.g. Ch 104) correctly, but it jumps to Ch 36 again after a few minutes.

The router accepts the configuration for any Band B channel, but the SSID will not appear at all on that channel. Only when configured on a Band A channel (Ch 36-64) does it appear on the air.

In this case too, if I disable and re-enable the device it will not work. A reboot is required to restore service.

@openwrt-bot

arjendekorte:

This could be due to radar activity and DFS backing out. Check whether

/sys/kernel/debug/ieee80211/phy1/ath9k/dfs_stats

shows any hints of what could be going on.

Note that Ch 104 works fine on a WNDR4300 I have here (r3426-4c09f99605).

@openwrt-bot

ezplanet:

DFS may be fine, if a bit too quick to change channel. The main issue here is that once this happens, if I disable and re-enable the interface, it no longer works and needs a reboot to get back online.

@openwrt-bot

arjendekorte:

That is not the idea behind DFS. If interference is detected, it is mandatory to back off from the channel and not retry it for a certain amount of time (half an hour, if memory serves).

iw phy phy1 info

should show how long channels will not be available for further scanning. Restarting the interface should not make a difference, as the timer must not be reset in that case.

You should really ask yourself whether the frequent dropping out of the selected channel means there actually is interference preventing your routers from using it. This whole DFS thing is there for a reason.
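
To make the `iw phy phy1 info` check above easier to read, here is a minimal Python sketch that pulls the per-channel DFS state out of the output. It assumes the text has been captured on the router and copied to a machine with Python, and that the lines look like the illustrative SAMPLE below; the exact layout varies between iw versions, so treat the patterns as assumptions. The function name `dfs_states` is made up for this example.

```python
import re

# Illustrative excerpt only (assumed layout); real text comes from `iw phy phy1 info`.
SAMPLE = """\
    * 5180 MHz [36] (20.0 dBm)
    * 5520 MHz [104] (20.0 dBm) (radar detection)
      DFS state: unavailable (for 412 sec)
      DFS CAC time: 60000 ms
"""

def dfs_states(iw_output):
    """Map channel number -> (DFS state, seconds spent in that state)."""
    states = {}
    channel = None
    for line in iw_output.splitlines():
        # A frequency line starts a new channel entry.
        chan = re.search(r"\*\s+[\d.]+ MHz \[(\d+)\]", line)
        if chan:
            channel = int(chan.group(1))
            continue
        # A "DFS state" line belongs to the most recent frequency line.
        state = re.search(r"DFS state:\s+(\w+)\s+\(for (\d+) sec\)", line)
        if state and channel is not None:
            states[channel] = (state.group(1), int(state.group(2)))
    return states

for chan, (state, secs) in dfs_states(SAMPLE).items():
    print(f"channel {chan}: {state} for {secs} s")
```

A channel reported as "unavailable" is one on which radar was detected and which is still inside its back-off period.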

@openwrt-bot

ezplanet:

Arjen,

I understand DFS, however when the channel jumps to 36:

  • the following leads me to believe that there is a problem with DFS implementation:
  1. it does not happen when the router is configured on Ch 40 to 56, which in my area have the most usage (other SSIDs are present on most of this channel range)
  2. it does happen only when the routers are configured on channels above 100 where there is no other local usage or interference
  3. whilst the router is in use (it has devices associated) and there is traffic, the channel does not jump
  4. jumps to ch 36 happen mostly overnight when the router is not in use and there are no devices associated
  • the following is a blocking bug:
  1. after the channel jumps to 36, if I disable and then re-enable the WiFi interface (using luci), WiFi does not get re-enabled (tried several times) and the SSID no longer appears, not even on Ch 36. It requires a reboot to get the WiFi interface back working again.

@openwrt-bot

arjendekorte:

  1. Irrelevant. DFS is not about congestion, it deals with radar pulses. If it doesn't find something that resembles a radar pattern, it won't switch.

  2. Ditto. If no radar is detected, DFS doesn't cause channels to switch.

  3. In my setup, the device only has a maximum of about two hours of 5 GHz traffic per day (and sometimes not even that for weeks on end). The last time I saw it switch from the configured channel was when I had it on Ch 120, just after the DFS patches went mainstream in OpenWRT (something like two years ago). After switching to Ch 104, it never switched channels again. I very much doubt that a lack of traffic will cause this.

  4. This may be purely coincidental. Assuming you're seeing radar pulses, the radar may only be active for a few hours during the night. What do the dfs_stats tell you? Is a radar detected?

cat /sys/kernel/debug/ieee80211/phy1/ath9k/dfs_stats

DFS support for macVersion = 0x1c0, macRev = 0x4: enabled
Pulse detector statistics:
pulse events reported : 1040
invalid pulse events : 3
DFS pulses detected : 627
Datalen discards : 0
RSSI discards : 410
BW info discards : 0
Primary channel pulses : 17
Secondary channel pulses : 579
Dual channel pulses : 441
Radar detector statistics (current DFS region: 2)
Pulse events processed : 1059
Radars detected : 0
Global Pool statistics:
Pool references : 14
Pulses allocated : 67
Pulses alloc error : 0
Pulses in use : 20
Seqs. allocated : 55
Seqs. alloc error : 0
Seqs. in use : 2

  1. Does it also block, if you change to a different channel in /etc/config/wireless before starting WiFi again?
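
As a rough aid for reading dumps like the one above, the following Python sketch turns the "name : value" lines of a dfs_stats dump into a dictionary so the "Radars detected" counter can be checked directly. It assumes the text has been copied off the router (Python is usually not installed on the device), and the function name `parse_dfs_stats` is invented for this example.

```python
import re

def parse_dfs_stats(text):
    """Collect 'name : integer' lines from a dfs_stats dump into a dict."""
    stats = {}
    for line in text.splitlines():
        # Section headers and the "DFS support ..." line carry no integer
        # value after the colon, so they are skipped.
        match = re.match(r"^\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats

# A few lines copied from the dump above.
SAMPLE = """\
Pulse detector statistics:
pulse events reported : 1040
DFS pulses detected : 627
Radar detector statistics (current DFS region: 2)
Pulse events processed : 1059
Radars detected : 0
"""

stats = parse_dfs_stats(SAMPLE)
radar_seen = stats.get("Radars detected", 0) > 0
print("DFS pulses:", stats["DFS pulses detected"], "radar seen:", radar_seen)
```

Pulses on their own are not radar detections; only a non-zero "Radars detected" value would explain a DFS-driven channel change.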

@openwrt-bot

ezplanet:

Here is what I get after a jump to ch36 from ch120 on BT Home Hub 5:

DFS support for macVersion = 0x180, macRev = 0x2: disabled
Pulse detector statistics:
pulse events reported : 0
invalid pulse events : 0
DFS pulses detected : 0
Datalen discards : 0
RSSI discards : 0
BW info discards : 0
Primary channel pulses : 0
Secondary channel pulses : 0
Dual channel pulses : 0
Radar detector statistics (current DFS region: 2)
Pulse events processed : 0
Radars detected : 0
Global Pool statistics:
Pool references : 7
Pulses allocated : 46
Pulses alloc error : 0
Pulses in use : 10
Seqs. allocated : 41
Seqs. alloc error : 0
Seqs. in use : 0

After this, if I disable, then re-enable the interface on BT Home Hub 5, the interface fails to work. A reboot is required to get it back working.

@openwrt-bot

arjendekorte:

The line

DFS support for macVersion = 0x180, macRev = 0x2: disabled

indicates that DFS is not available on your hardware with the current driver. I'm actually a bit surprised this is working at all. It should work on the WNDR3700v4, however, as that one uses a supported chip (AR9580).

@openwrt-bot

ezplanet:

I have just seen it jump from 56 to 36 and then lock.
It is a bug.

@openwrt-bot

arjendekorte:

DFS support is required on Ch56. If this is on your BT Home Hub 5 which doesn't support DFS with the existing ath10k driver (see above), you're trying to do something that is not supported and it is not a bug.
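
For reference, a small Python sketch of which 5 GHz channels are involved. The frequency formula (5000 + 5 × channel, in MHz) is standard; the set of DFS channels is an assumption based on ETSI-style rules (Ch 52-64 and 100-140) and differs between regulatory domains, so check the actual rules for your country. The names `channel_to_mhz` and `requires_dfs` are invented for this example.

```python
def channel_to_mhz(channel):
    """Centre frequency in MHz of a 5 GHz channel number."""
    return 5000 + 5 * channel

# Assumption: ETSI-style rules, 20 MHz channels. Other domains differ.
DFS_CHANNELS = set(range(52, 65, 4)) | set(range(100, 141, 4))

def requires_dfs(channel):
    """True if radar detection is assumed to be mandatory on this channel."""
    return channel in DFS_CHANNELS

for ch in (36, 56, 104, 120):
    print(ch, channel_to_mhz(ch), "DFS" if requires_dfs(ch) else "no DFS")
```

Under these assumptions Ch 36 to 48 are the only channels mentioned in this thread that do not need radar detection, which matches Ch 36 being the fallback.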

@openwrt-bot

mavcin:

I think this is a LuCI issue, already fixed in
openwrt/luci@07e01d0

@openwrt-bot

dlang:

Jumping from a DFS channel to channel 36 happens when the radar detection in the chipset detects what it thinks is radar on the channel you are on. The device is then required to switch off that channel, and OpenWRT/LEDE switches to channel 36.

@openwrt-bot

ezplanet:

@arjen,

I am not trying anything at all; this is the default. I do not even know if there is a parameter for DFS. Is there a way to turn it off? If I set the country code to 00 (World), does that turn it off?

@openwrt-bot

mavcin:

From my BT Home Hub 5
phy0 = 5 GHz
phy1 = 2.4 GHz

root@BTHomeHub5A:/# cat /sys/kernel/debug/ieee80211/phy0/ath10k/dfs_stats
Pulse detector statistics:
reported phy errors : 968
pulse events reported : 968
DFS pulses detected : 40
DFS pulses discarded : 928
Radars detected : 6
Global Pool statistics:
Pool references : 7
Pulses allocated : 43
Pulses alloc error : 0
Pulses in use : 42
Seqs. allocated : 8
Seqs. alloc error : 0
Seqs. in use : 5

@openwrt-bot

ezplanet:

I found the following documentation online that summarizes the DFS specifications:

DFS-enabled radios monitor the operating frequency for radar signals. If radar signals are detected on the channel, the wireless device takes these steps:

  1. Blocks new transmissions on the channel.
  2. Broadcasts an 802.11h channel-switch announcement.
  3. Disassociates remaining client devices.
  4. Access Point selects a different channel permitted within the regulatory domain.
  5. After the DFS non-occupancy period has been reached for the original DFS channel, if no clients are associated it will move back to the original DFS channel and scan for 60 seconds. If there are no radar signals on the new channel, the wireless device enables beacons and accepts client associations. The non-occupancy period is defined by the regulatory domain but in most cases is 30 minutes.
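
The five steps above can be sketched as a toy state machine in Python. This is only an illustration of the quoted description, not how the kernel or hostapd implements DFS: the class name, the fallback-channel parameter, and the fixed 30-minute period are assumptions, and the 60-second scan from step 5 is noted but not timed.

```python
NON_OCCUPANCY_S = 30 * 60  # "in most cases 30 minutes", per the quoted text
CAC_S = 60                 # 60-second scan from step 5 (not timed in this toy model)

class DfsAccessPoint:
    """Toy model of the quoted DFS steps (hypothetical, for illustration)."""

    def __init__(self, configured, fallback):
        self.configured = configured
        self.fallback = fallback
        self.channel = configured
        self.blocked_until = None  # end of the non-occupancy period, if any

    def on_radar(self, now):
        # Steps 1-4: stop using the channel and move to a permitted one.
        self.blocked_until = now + NON_OCCUPANCY_S
        self.channel = self.fallback

    def tick(self, now, clients):
        # Step 5: once the non-occupancy period is over and no clients are
        # associated, go back to the originally configured channel.
        if self.blocked_until is not None and now >= self.blocked_until:
            if clients == 0:
                self.channel = self.configured
                self.blocked_until = None
        return self.channel

ap = DfsAccessPoint(configured=104, fallback=36)
ap.on_radar(now=0)
print(ap.tick(now=600, clients=0))   # still inside the non-occupancy period
print(ap.tick(now=1800, clients=2))  # period over, but clients are associated
print(ap.tick(now=1900, clients=0))  # idle and allowed to return
```

In this model the access point stays on the fallback channel until both conditions of step 5 hold, which is the return behaviour the quoted description expects.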

From my observations, the LEDE implementation of DFS:

a) treats (4) as Ch 36 only, whilst it could be any other permitted frequency
b) does not implement (5), because once switched the router is stuck on Ch 36 (in my sample I get very few radar detections)

And finally, it crashes the WiFi device if I try to change the channel manually or disable and re-enable the device, requiring a reboot to resume operations.

Also on BT Home Hub 5 DFS is reported as disabled but it appears to be taking place anyway.

@openwrt-bot

arjendekorte:

You probably need to look at phy0

cat /sys/kernel/debug/ieee80211/phy0/ath10k/dfs_stats

to look for the DFS statistics. If the number of radars detected is not zero, the exact count is of no real interest: it means that the channel you configured is not usable in your location.

Note that if radar is detected, although a retry is allowed after some time (5), it is pointless to do so. Most (weather) radars are ground-based systems, so if you're in range of such an installation, you'll invariably detect them again.

Most (if not all) of the DFS stuff is built into the kernel and not LEDE specific. You're barking up the wrong tree here.

@openwrt-bot

sumpfralle:

If I understand the submitter correctly, we are discussing different issues here.
I would like to clarify this, since an (in my opinion) unrelated issue, #558 (https://bugs.lede-project.org/index.php?do=details&task_id=558), was just marked as a duplicate of this one.

The submitter described:

  • there was an automatic switch of channel
  • when trying to toggle the state of the wifi interface (via luci), it stopped operating at all

I do not think that the channel switch (due to DFS) is the submitter's problem. The real problem seems to be that he needs to reboot his router in order to get the wifi working again.

As a sidenote:
Here in our freifunk community we are working on improving DFS handling. We have to deal with a lot of channel switching and toggling of the wifi state, but we did not encounter a wifi device getting stuck. Thus it sounds like the above issue is not related to DFS, but to the BT Home Hub 5 (we are using mainly Ubiquiti Nanostations and TP-Link CPE).

Or do I misunderstand something?

@openwrt-bot

pietia:

I can confirm that on an Archer C7 v2 running 17.01.4 it went back to channel 36 from channel 104, even though there was another device on channel 36 and none on channel 104.
It is a bug.

@openwrt-bot

mkresin:

It is a bug

No it isn't as already discussed here:

Jumping from a DFS channel to channel 36 happens when the radar detection in the chipset detects what it thinks is radar on the channel you are on. The device is then required to switch off that channel, and OpenWRT/LEDE switches to channel 36.

In most countries channel 104 is a DFS channel, and as soon as a radar is detected (or what the chipset thinks is radar) the device must switch to a different channel.

Finally, this ticket is about a wireless hang after DFS related channel switch.

@mauro, do you still see this error? Till now I wasn't able to reproduce the issue on my HH5a.

@openwrt-bot

pietia:

Sure, it is DFS, but it shouldn't go to channel 36, which can be occupied by other routers, and in my case it was.
I haven't noticed a crash though.

"The access points automatically select frequency channels with low interference levels"

https://en.wikipedia.org/wiki/Channel_allocation_schemes#DFS

Mauro Mozzarelli explained the wrong behaviour precisely.
I have seen that this might be kernel related; perhaps a bug report to the kernel devs would be a good idea?

@openwrt-bot

samoz83:

I'm getting the same thing. My non-LEDE Home Hub doesn't change off DFS channels (channel 120 in my case), but the LEDE router does and goes back to 36.

I also seem to get the hang if I try to switch it back to any channel other than 36 once it has changed itself back.

@openwrt-bot

pietia:

@sam same hang here as well...
