OpenWrt/LEDE Project

  • Status Unconfirmed
  • Percent Complete 0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity Critical
  • Priority Very Low
  • Reported Version openwrt-18.06
  • Due in Version Undecided
  • Due Date Undecided
  • Private
Attached to Project: OpenWrt/LEDE Project
Opened by Tobias Leupold - 25.02.2019

FS#2145 - No IPv4 addresses offered on LAN of GL-AR150 if no LAN cable is plugged in during boot

This is 100 % reproducible with a fresh and unchanged installation of OpenWrt 18.06.2 on a GL.iNet GL-AR150, and it took me quite a while to find out what is happening here:

If a LAN cable is plugged in during the device’s boot process and a client sends a DHCP request once the router is ready, the router offers an IPv4 address (as expected) and can be reached at 192.168.1.1. The same holds for a DHCP request via WLAN, as soon as WLAN has been enabled.

If, however, the device is powered up without a LAN cable plugged in, no IPv4 address is offered, neither on a LAN nor on a WLAN connection. One can still connect to the router via the IPv6 address found in /etc/resolv.conf (on the client), but not via IPv4. This remains the case if a LAN cable is plugged in after booting.

Looking at /var/etc/dnsmasq.conf.cfg01411c, it does contain the line

  dhcp-range=set:lan,192.168.1.100,192.168.1.249,255.255.255.0,12h

if a LAN cable has been plugged during bootup, and no such entry if none has been plugged.
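
To check for this on the device, something like the following could be used. It is only a minimal sketch: the helper name has_dhcp_range is made up for this example, and the path is the one from this report (the cfg suffix may differ on other installations).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given dnsmasq config file
# contains at least one dhcp-range line, fails otherwise.
has_dhcp_range() {
    grep -q '^dhcp-range=' "$1"
}

# Example call on the router (path taken from this report):
#   has_dhcp_range /var/etc/dnsmasq.conf.cfg01411c && echo "DHCPv4 range is set"
```

If the helper fails right after a boot without a cable, the generated config is missing the range, which matches the behaviour described above.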

This makes the device completely unusable for my intended purpose ... as this happens with a completely unaltered, fresh installation, I suppose this is not a misconfiguration but a bug?

Tobias Leupold commented on 26.02.2019 10:44

Apparently, someone filed the same problem at the same time as me at https://bugs.openwrt.org/index.php?do=details&task_id=2144 ...

For now, I would be really happy if there was some workaround until this is fixed as I simply can't use the router at all ...

Tobias Leupold commented on 27.02.2019 11:44

Here are the "vanilla" logs after booting. The only difference is that, after

  daemon.info procd: - init complete -

if the LAN cable is plugged, dnsmasq is restarted, and a line

  dnsmasq-dhcp[1210]: DHCP, IP range 192.168.1.100 -- 192.168.1.249, lease time 12h

appears, so the dynamic IPv4 range is set. If no LAN cable is plugged, dnsmasq is not restarted and no IPv4 range is set.

Tobias Leupold commented on 27.02.2019 12:37

Restarting dnsmasq automatically also does not help. I tried to trigger the restart via /etc/rc.local, but still, no IPv4 range is set – as long as no LAN cable is plugged.

When I boot up without a cable, plug it in, request an IP address (getting no IPv4 address), log in via IPv6 and THEN restart dnsmasq, the IPv4 range is set.

Tobias Leupold commented on 02.03.2019 17:06

The problem happens/shows up inside the dhcp_check() function of the dnsmasq init script.

In both cases (plugged/unplugged), this function is called only once, with $ifname set to "br-lan" and $stamp set to "/var/run/dnsmasq.cfg01411c.br-lan.dhcp".

If the LAN cable is plugged, the function completes and returns 0 at the end, which then results in the dhcp-range line being added to the config by dhcp_add().

If no LAN cable is plugged, the function returns at

  # If there's no carrier yet, skip this interface.
  # The init script will be called again once the link is up
  case "$(devstatus "$ifname" | jsonfilter -e @.carrier)" in
      false) return 1;;
  esac

This causes dhcp_add() to return before the dhcp-range line is added.

When the device boots up without a LAN cable plugged and one sets an IP address manually (or uses the IPv6 DHCP address which can be obtained in this case), logs in and runs

  devstatus br-lan | jsonfilter -e @.carrier

it returns "true".

So I'm pretty sure that this is some hardware-specific timing problem or similar. I'm not deep enough into OpenWrt (yet), but it looks like the LAN bridge does not have a "carrier" without a LAN cable being plugged when it should have one.
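
To narrow down the timing, the kernel's own view of the carrier could be compared with what devstatus reports. The following is only a sketch and not what the init script does (the script asks netifd via devstatus). It assumes the standard Linux sysfs file /sys/class/net/<interface>/carrier; the names carrier_state and SYSFS_NET are made up for this example.

```shell
#!/bin/sh
# Sketch: report the kernel's view of an interface's carrier via sysfs.
# carrier_state and SYSFS_NET are hypothetical names for this example.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

carrier_state() {
    # The carrier file holds 1 (link) or 0 (no link). Reading it fails
    # while the interface is down or missing; that is reported as "unknown".
    case "$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null)" in
        1) echo true ;;
        0) echo false ;;
        *) echo unknown ;;
    esac
}

# Example call on the router:
#   carrier_state br-lan
```

Calling this once per second from /etc/rc.local and logging the result would show when the bridge starts reporting a carrier relative to "init complete".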

Tobias Leupold commented on 03.03.2019 06:50

This actually seems to be a timing problem. I added the following code before the above check:

  # Wait up to 20 seconds for the interface to report a carrier
  counter=1
  while [ "$(devstatus "$ifname" | jsonfilter -e @.carrier)" = 'false' ] && [ "$counter" -le 20 ]; do
      logger -t "test" "No carrier, waiting $counter secs"
      counter=$((counter + 1))
      sleep 1
  done

With this change, the script waits for 7 seconds and then starts up as expected, with working DHCPv4:

  ...
  Sun Feb 24 02:13:08 2019 daemon.info procd: - init complete -
  Sun Feb 24 02:13:09 2019 user.notice test: No carrier, waiting 1 secs
  Sun Feb 24 02:13:10 2019 user.notice test: No carrier, waiting 2 secs
  Sun Feb 24 02:13:11 2019 user.notice test: No carrier, waiting 3 secs
  Sun Feb 24 02:13:12 2019 user.notice test: No carrier, waiting 4 secs
  Sun Feb 24 02:13:14 2019 user.notice test: No carrier, waiting 5 secs
  Sun Feb 24 02:13:15 2019 user.notice test: No carrier, waiting 6 secs
  Sun Feb 24 02:13:16 2019 user.notice test: No carrier, waiting 7 secs
  Sun Feb 24 02:13:16 2019 kern.info kernel: [   36.119928] eth1: link up (1000Mbps/Full duplex)
  Sun Feb 24 02:13:16 2019 kern.info kernel: [   36.123166] br-lan: port 1(eth1) entered blocking state
  Sun Feb 24 02:13:16 2019 kern.info kernel: [   36.128318] br-lan: port 1(eth1) entered forwarding state
  Sun Feb 24 02:13:16 2019 daemon.notice netifd: Network device 'eth1' link is up
  Sun Feb 24 02:13:16 2019 kern.info kernel: [   36.136384] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
  Sun Feb 24 02:13:16 2019 daemon.notice netifd: bridge 'br-lan' link is up
  Sun Feb 24 02:13:16 2019 daemon.notice netifd: Interface 'lan' has link connectivity
  Sun Feb 24 02:13:21 2019 daemon.info dnsmasq[727]: exiting on receipt of SIGTERM
  Sun Feb 24 02:13:21 2019 daemon.info dnsmasq[1273]: started, version 2.80 cachesize 150
  Sun Feb 24 02:13:21 2019 daemon.info dnsmasq[1273]: DNS service limited to local subnets
  Sun Feb 24 02:13:21 2019 daemon.info dnsmasq[1273]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP no-DHCPv6 no-Lua TFTP no-conntrack no-ipset no-auth no-DNSSEC no-ID loop-detect inotify dumpfile
  Sun Feb 24 02:13:21 2019 daemon.info dnsmasq-dhcp[1273]: DHCP, IP range 192.168.1.100 -- 192.168.1.249, lease time 12h
  ...

Tobias Leupold commented on 04.03.2019 12:36

Another workaround is to add the "force" option to the "lan" section of /etc/config/dhcp.

The question is whether the default image simply lacks this option and should ship with it pre-set to get a working "vanilla" installation. Setting it causes the dhcp_check() function to be skipped, so the carrier problem no longer shows up (though it is possibly not solved, only hidden).
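
For reference, the option could be set from the command line roughly as follows. This is a sketch under assumptions: the function name set_dhcp_force is made up, and "lan" is assumed to be the name of the DHCP section, as in the default /etc/config/dhcp.

```shell
#!/bin/sh
# Sketch: set the "force" option on a DHCP section of /etc/config/dhcp.
# set_dhcp_force is a hypothetical name; the section defaults to "lan".
set_dhcp_force() {
    section="${1:-lan}"
    uci set "dhcp.${section}.force=1" &&
    uci commit dhcp
}

# On the router, followed by a dnsmasq restart to regenerate its config:
#   set_dhcp_force lan && /etc/init.d/dnsmasq restart
```

Editing /etc/config/dhcp by hand and adding option force '1' to the "lan" section should have the same effect.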

Tobias Leupold commented on 09.03.2019 13:20

I also filed a bug at GL.iNet: https://bugs.gl-inet.com/view.php?id=146 They say the problem is known and that they work around it by using the "force" option. According to the forum post, the dnsmasq init script should not check the device state at all if binding to 0.0.0.0, so perhaps the init script should be fixed?

asdf commented on 24.07.2019 10:01

Set the workaround and it's working well now. Still needs to be fixed.
