FS#251 - sending SIGSEGV to dnsmasq for invalid read access from 00000000 #5482

openwrt-bot · 2016-10-26T21:47:35Z

koa:

it's a message that shows up frequently on a fresh install of lede's 4.4.27 on wzr-hp-g300nh -> Linux robokoa 4.4.27 #0 Wed Oct 26 10:37:47 2016 mips GNU/Linux -- steps to reproduce are unknown, i'm unsure of the initial reason for this error

[28082.882471] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
[28082.891127] epc = 00439ff1 in busybox[400000+4a000]
[28082.896083] ra = 00439fe5 in busybox[400000+4a000]
[28082.901018]

openwrt-bot · 2016-11-06T19:08:14Z

IronicSven:

I've got the same message on my TP-Link TL-WR1043N/ND v1 with r2109.

openwrt-bot · 2016-11-13T08:52:45Z

pmalecka:

Same here on mikrotik 493g - r2155

The sigsegv also happens for:

Sun Nov 13 02:27:07 2016 kern.info kernel: [49320.120778]
Sun Nov 13 02:27:07 2016 kern.info kernel: [49320.120778] do_page_fault(): sending SIGSEGV to sysntpd for invalid read access from 00000000
Sun Nov 13 02:27:07 2016 kern.info kernel: [49320.223021] epc = 00439ff1 in busybox[400000+4a000]
Sun Nov 13 02:27:07 2016 kern.info kernel: [49320.281722] ra = 00439fe5 in busybox[400000+4a000]
Sun Nov 13 02:27:07 2016 kern.info kernel: [49320.340351]

Sun Nov 13 08:26:00 2016 kern.info kernel: [70852.962664]
Sun Nov 13 08:26:00 2016 kern.info kernel: [70852.962664] do_page_fault(): sending SIGSEGV to hotplug-call for invalid read access from 00000000
Sun Nov 13 08:26:00 2016 kern.info kernel: [70853.070137] epc = 00439ff1 in busybox[400000+4a000]
Sun Nov 13 08:26:00 2016 kern.info kernel: [70853.128724] ra = 00439fe5 in busybox[400000+4a000]
Sun Nov 13 08:26:00 2016 kern.info kernel: [70853.187376]

openwrt-bot · 2016-11-13T09:34:43Z

mkresin:

It seems to me that the SIGSEGV is related to busybox, since the return address (ra) always points to busybox and always to the same position in busybox.

Would any of you please compile an Image with the following extra option in menuconfig:


Base system --->
  <*> busybox --->
    [*] Customize busybox options --->
      Busybox Settings  --->
        Debugging Options --->
          [*] Build BusyBox with extra Debugging symbols

This **might** print the function which is called in busybox instead of the (not really helpful) position of the function in the binary.
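To compare reports without debug symbols, it can help to reduce each epc/ra value to an offset inside the mapped binary. The following is a minimal Python sketch based only on the log format shown in this ticket; the function name `fault_offsets` and the regular expression are assumptions made for illustration, not part of any LEDE tooling.

```python
import re

# Matches lines such as "epc = 00439ff1 in busybox[400000+4a000]".
# The pattern is an assumption derived from the kernel messages quoted in this ticket.
FAULT_RE = re.compile(
    r"\b(?P<reg>epc|ra)\s*=\s*(?P<addr>[0-9a-f]+) in "
    r"(?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offsets(log_text):
    """Return {register: (module, offset_into_mapping)} for one fault report."""
    result = {}
    for m in FAULT_RE.finditer(log_text):
        addr = int(m["addr"], 16)
        base = int(m["base"], 16)
        size = int(m["size"], 16)
        # Sanity check: the address must lie inside the reported mapping.
        if not base <= addr < base + size:
            raise ValueError(f"{m['reg']} is outside {m['module']}")
        result[m["reg"]] = (m["module"], addr - base)
    return result


sample = """\
[28082.882471] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
[28082.891127] epc = 00439ff1 in busybox[400000+4a000]
[28082.896083] ra = 00439fe5 in busybox[400000+4a000]
"""

offsets = fault_offsets(sample)
for reg, (module, offset) in offsets.items():
    print(f"{reg}: {module}+{offset:#x}")
```

Two reports that yield the same module and offsets for the same build are very likely the same crash, regardless of which process name appears in the first log line.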

openwrt-bot · 2016-11-16T13:28:10Z

mamarley:

I did a build with that option and it has been running for a day or so now (more than long enough to reproduce it in the past) and so far there are no segfaults at all. Stupid Heisenbug…

openwrt-bot · 2016-11-16T19:36:02Z

NeoRaider:

I'm seeing the same issue, unfortunately also without debug symbols. I haven't had a closer look yet, but here's some GDB output:


#0  0x00439ff1 in nonblock_immune_read ()
(gdb) bt
#0  0x00439ff1 in nonblock_immune_read ()
#1  0x0041bc23 in argstr ()
#2  0x0041bd89 in expandarg ()
#3  0x0041e7c3 in evalfor ()
#4  0x0041dc37 in evaltreenr ()
#5  0x0041dc37 in evaltreenr ()
#6  0x0041e117 in cmdloop ()
#7  0x0041f8e3 in ash_main ()
#8  0x00407879 in run_applet_no_and_exit ()
#9  0x004078f1 in main ()
(gdb) info registers
          zero       at       v0       v1       a0       a1       a2       a3
 R0   00000000 80480000 00000000 fffffffc 00000000 7fda119c 00000080 00000000
            t0       t1       t2       t3       t4       t5       t6       t7
 R8   00000000 80f7fa80 00000001 00000000 8104217c 00000024 804a0000 ffffff80
            s0       s1       s2       s3       s4       s5       s6       s7
 R16  00000000 00000003 00000003 0040789d 77292000 77292000 77294500 77295e94
            t8       t9       k0       k1       gp       sp       s8       ra
 R24  00000000 772144f8 00000000 00000000 7729b2b0 7fda1108 00000000 00439fe5
            sr       lo       hi      bad    cause       pc
      0000dc13 02400000 000f4537 00000000 00800008 00439ff1
           fsr      fir
      00000000 00000000
(gdb) disas
Dump of assembler code for function nonblock_immune_read:
   0x00439fd5 <+0>:     save    a0-a2,48,ra,s0-s1
   0x00439fd9 <+4>:     move    s1,a0
   0x00439fdb <+6>:     lw      a2,56(sp)
   0x00439fdd <+8>:     lw      a1,52(sp)
   0x00439fdf <+10>:    jal     0x4086c1 
   0x00439fe3 <+14>:    move    a0,s1
   0x00439fe5 <+16>:    slti    v0,0
   0x00439fe7 <+18>:    move    s0,v0
   0x00439fe9 <+20>:    bteqz   0x43a00d 
   0x00439feb <+22>:    jal     0x448ff1 <__errno_location@mips16plt>
   0x00439fef <+26>:    nop
=> 0x00439ff1 <+28>:    lw      v0,0(v0)
   0x00439ff3 <+30>:    cmpi    v0,11
   0x00439ff5 <+32>:    btnez   0x43a00d 
   0x00439ff7 <+34>:    li      v0,1
   0x00439ff9 <+36>:    li      a2,1
   0x00439ffb <+38>:    move    v1,sp
   0x00439ffd <+40>:    neg     a2
   0x00439fff <+42>:    li      a1,1
   0x0043a001 <+44>:    addiu   a0,sp,24
   0x0043a003 <+46>:    sw      s1,24(sp)
   0x0043a005 <+48>:    jal     0x43a671 
   0x0043a009 <+52>:    sh      v0,28(v1)
   0x0043a00b <+54>:    b       0x439fdb 
   0x0043a00d <+56>:    move    v0,s0
   0x0043a00f <+58>:    restore 48,ra,s0-s1
   0x0043a011 <+60>:    jrc     ra
End of assembler dump.

openwrt-bot · 2016-11-17T15:35:10Z

NeoRaider:

It is indeed a Heisenbug, any change to the code to add debug output makes it go away. More weirdness (if the information from the core dump I got is accurate):

The whole function is aligned to odd addresses; I've never seen this before. Is this even allowed? More weirdly, gdb dumps the addresses like this, while objdump shows the whole function shifted by one byte, so the addresses are even (it's MIPS16 code, so it is not aligned to 4 bytes)
If the program counter is accurate (which I'm not sure about), the only way a NULL dereference can happen here is if __errno_location() has returned NULL (or something even weirder like register corruption). This should not be possible.

I can reproduce the issue fairly easily on a TL-WR1043 v1 by calling "/etc/init.d/network restart" when dnsmasq is restarted by this, but I haven't seen it on a TL-WR841 v7. Either this is hardware-dependent, or something changed because I cleaned my tree when changing the models; I'll have to check again when I have both devices at the same place.

openwrt-bot · 2016-11-18T03:36:02Z

NeoRaider:

I'm not much closer to the root of this issue, but at least I'm a bit less confused.

I've found out that the lowest bit of the program counter enables MIPS16 mode, thus explaining the "odd addresses"
I've verified that the issue is indeed that __errno_location() returns NULL
/etc/init.d/dnsmasq reload will segfault on my TL-WR1043 v1 in about 1 out of 3 runs
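As a small illustration of the first point, the reported program counter can be split into the actual instruction address and the mode bit. This is a sketch only; the helper name `split_isa_mode` is made up for this example.

```python
def split_isa_mode(pc):
    """Split a reported MIPS PC value into (instruction_address, mips16_mode).

    Bit 0 of the PC is treated as the mode flag, so the real
    instruction address is the value with that bit cleared.
    """
    return pc & ~1, bool(pc & 1)


# epc value taken from the kernel messages in this ticket.
addr, mips16 = split_isa_mode(0x00439ff1)
print(hex(addr), mips16)
```

With the mode bit cleared, the address becomes even, which matches the observation that objdump shows the function shifted by one byte compared to gdb.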

I've been unable to test this command in gdb (it just hangs). When run in strace, the command doesn't ever segfault.

I'll check with the musl people if they have any idea what is happening.

openwrt-bot · 2016-11-21T14:23:21Z

NeoRaider:

Further increasing severity, as this doesn't only affect init scripts, but all shell scripts using shell expansion ($() or backticks). While testing, I've experienced several crashes of sysupgrade.

Further results of my investigation:

__errno_location() is not returning NULL after all; in fact, __errno_location() is not called at all. This seems correct; the branch calling __errno_location() is only taken when safe_read() fails, and it doesn't look like safe_read() fails before the crash. The value of the ra register still holds the return address from safe_read().
The whole thing is very fragile; adding a single "nop" instruction before the "jal __errno_location" makes the crash go away.
The program counter somehow ends up at 0x00439ff1; it is unclear how it gets there. The preceding instructions have not been executed. While a random jump after memory corruption could be a possible cause, the backtrace up to nonblock_immune_read() looks sane.
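For readers who don't want to decode the MIPS16 disassembly, the control flow of nonblock_immune_read() as described above (read; only on failure look at errno; on EAGAIN wait for the descriptor and retry) can be sketched in Python. This is an analogue for illustration, not the busybox source: in Python the EAGAIN case surfaces as BlockingIOError, and the pipe and timer in the demo are invented for the example.

```python
import os
import select
import threading


def nonblock_immune_read(fd, count):
    """Read from a possibly non-blocking fd, waiting instead of failing on EAGAIN."""
    while True:
        try:
            # Corresponds to the safe_read() call; on success we return
            # immediately and never look at errno.
            return os.read(fd, count)
        except BlockingIOError:
            # Corresponds to the errno == EAGAIN branch: wait until the
            # descriptor is readable, then loop and read again.
            poller = select.poll()
            poller.register(fd, select.POLLIN)
            poller.poll()  # no timeout: block until there is data


# Demo: a non-blocking pipe that only receives data after a short delay.
r, w = os.pipe()
os.set_blocking(r, False)
threading.Timer(0.1, os.write, args=(w, b"hello")).start()
data = nonblock_immune_read(r, 128)
print(data)
os.close(r)
os.close(w)
```

The point relevant to this ticket is that the errno lookup sits only on the failure path, so a crash at that instruction without a failed read suggests the program counter got there some other way.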

I'm currently looking into possible kernel-side causes for this issue.

openwrt-bot · 2016-11-22T11:35:31Z

None:

Adding this as a 'me too'.

[ 200.009789] do_page_fault(): sending SIGSEGV to odhcpd for invalid read access from 00000000
[ 200.018396] epc = 00407ec1 in odhcpd[400000+d000]
[ 200.023198] ra = 004063bf in odhcpd[400000+d000]

Archer c7 v2 - linux 4.4.34

Can't re-create at will, but occurs 2-3 times every reboot. Let me know how I can be of assistance with running tests etc.

openwrt-bot · 2016-11-22T14:15:53Z

None:

Don't know if this is of any help, but I got a 'strace':


epoll_pwait(3, [], 10, 2000, NULL, 16)  = 0
clock_gettime(CLOCK_MONOTONIC, {118, 496503244}) = 0
clock_gettime(CLOCK_MONOTONIC, {118, 496956153}) = 0
clock_gettime(CLOCK_MONOTONIC, {118, 497110062}) = 0
clock_gettime(CLOCK_MONOTONIC, {118, 497561396}) = 0
epoll_pwait(3, [{EPOLLIN, {u32=2002960268, u64=8602648846247395328}}], 10, 2000, NULL, 16) = 1
recvmsg(18, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\5\0\23\0\0\0\0\0\0\0P", iov_len=12}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 12
poll([{fd=18, events=POLLIN}], 1, -1)   = 1 ([{fd=18, revents=POLLIN}])
recvmsg(18, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\3\0\0\10B\353\226\255\4\0\0\24ubus.object.add\0\7\0\0000"..., iov_len=76}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 76
sendmsg(18, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\1\0\23\0\0\0\0", iov_len=8}, {iov_base="\0\0\0\24\1\0\0\10\0\0\0\0\3\0\0\10B\353\226\255", iov_len=20}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, 0) = 28
recvmsg(18, {msg_namelen=0}, 0)         = -1 EAGAIN (Resource temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {118, 667711369}) = 0
clock_gettime(CLOCK_MONOTONIC, {118, 667914580}) = 0
epoll_pwait(3, [{EPOLLIN, {u32=4313216, u64=18525121660583936}}], 10, 1830, NULL, 16) = 1
recvmsg(13, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000400}, msg_namelen=28->12, msg_iov=[{iov_base=[{{len=116, type=0x18 /* NLMSG_??? */, flags=0, seq=0, pid=0}, "\n\10\0\0\377\3\0\1\0\0\0\0\0\10\0\17\0\0\0\377\0\24\0\1\377\0\0\0\0\0\0\0"...}, {{len=0, type=0x62e3 /* NLMSG_??? */, flags=NLM_F_REQUEST|NLM_F_MULTI|NLM_F_ACK|NLM_F_ECHO|NLM_F_DUMP_INTR|NLM_F_DUMP_FILTERED|0x27c0, seq=4272922192, pid=0}}], iov_len=8192}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 116
recvmsg(13, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000400}, msg_namelen=28->12, msg_iov=[{iov_base=[{{len=116, type=0x18 /* NLMSG_??? */, flags=0, seq=0, pid=0}, "\n@\0\0\376\2\0\1\0\0\0\0\0\10\0\17\0\0\0\376\0\24\0\1\376\200\0\0\0\0\0\0"...}, {{len=0, type=0x62e3 /* NLMSG_??? */, flags=NLM_F_REQUEST|NLM_F_MULTI|NLM_F_ACK|NLM_F_ECHO|NLM_F_DUMP_INTR|NLM_F_DUMP_FILTERED|0x27c0, seq=4272922192, pid=0}}], iov_len=8192}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 116
recvmsg(13, {msg_namelen=28}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {118, 707666461}) = 0
clock_gettime(CLOCK_MONOTONIC, {118, 707849995}) = 0
epoll_pwait(3, [{EPOLLIN, {u32=4313216, u64=18525121660583936}}], 10, 1790, NULL, 16) = 1
recvmsg(13, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000100}, msg_namelen=28->12, msg_iov=[{iov_base=[{{len=72, type=0x14 /* NLMSG_??? */, flags=0, seq=0, pid=0}, "\n\200\0\0\0\0\0\n\0\24\0\1*\2\f\177\22 \277+\0\0\0\0\0\0\0\376\0\24\0\6"...}, {{len=2359308, type=0 /* NLMSG_??? */, flags=0, seq=0, pid=0}, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\24\0\0\0\0\0\0\0\0"...}], iov_len=8192}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 72
clock_gettime(CLOCK_MONOTONIC, {119, 188109722}) = 0
sendto(7, {{len=24, type=0x16 /* NLMSG_??? */, flags=NLM_F_REQUEST|0x300, seq=1, pid=0}, "\n\0\0\0\0\0\0\n"}, 24, 0, NULL, 0) = 24
recvfrom(7, [{{len=72, type=0x14 /* NLMSG_??? */, flags=NLM_F_MULTI, seq=1, pid=2601}, "\n\200\200\376\0\0\0\1\0\24\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\24\0\6"...}, {{len=72, type=0x14 /* NLMSG_??? */, flags=NLM_F_MULTI, seq=1, pid=2601}, "\n\200\0\0\0\0\0\n\0\24\0\1*\2\f\177\22 \277+\0\0\0\0\0\0\0\376\0\24\0\6"...}, {{len=72, type=0x14 /* NLMSG_??? */, flags=NLM_F_MULTI, seq=1, pid=2601}, "\n@\200\375\0\0\0\n\0\24\0\1\376\200\0\0\0\0\0\0\26\314 \377\376\276\2112\0\24\0\6"...}, {{len=72, type=0x14 /* NLMSG_??? */, flags=NLM_F_MULTI, seq=1, pid=2601}, "\n@\200\375\0\0\0\22\0\24\0\1\376\200\0\0\0\0\0\0\26\314 \377\376\276\2111\0\24\0\6"...}, {{len=72, type=0x14 /* NLMSG_??? */, flags=NLM_F_MULTI, seq=1, pid=2601}, "\n@\300\375\0\0\0\23\0\24\0\1\376\200\0\0\0\0\0\0\26\314 \377\376\276\2110\0\24\0\6"...}], 8192, 0, NULL, NULL) = 360
recvfrom(7, {{len=20, type=NLMSG_DONE, flags=NLM_F_MULTI, seq=1, pid=2601}, "\0\0\0\0"}, 8192, 0, NULL, NULL) = 20
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV +++

openwrt-bot · 2016-11-22T14:53:07Z

NeoRaider:

What exact options did you use for this strace? It contains lots of syscalls that are not from busybox.

openwrt-bot · 2016-11-22T16:45:36Z

None:

it was an 'strace -p' of odhcpd which is the thing that gets killed on a 'regular' basis. Oh hell, I've just noticed odhcpd was bumped recently.... this might be a red herring.

openwrt-bot · 2016-11-22T17:01:54Z

NeoRaider:

Most likely that is a different bug. All reports in this ticket are about busybox (ash) crashing while running shell scripts. Some strings like "dnsmasq" appear in the logs because those are the names of the scripts (e.g. /etc/init.d/dnsmasq).
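A quick way to tell the two kinds of report apart is to compare the process name in the first line with the binary named in the epc line. The sketch below does that for two reports quoted in this ticket; the function name `faulting_module` and the regular expressions are assumptions for illustration.

```python
import re

# Patterns derived from the kernel messages quoted in this ticket.
HEADER_RE = re.compile(r"sending SIGSEGV to (?P<comm>\S+) for invalid")
EPC_RE = re.compile(r"\bepc\s*=\s*[0-9a-f]+ in (?P<module>[^\[\s]+)\[")


def faulting_module(report):
    """Return (process_name, binary_containing_epc) for one fault report."""
    header = HEADER_RE.search(report)
    epc = EPC_RE.search(report)
    if header is None or epc is None:
        raise ValueError("not a do_page_fault report")
    return header["comm"], epc["module"]


busybox_report = """\
do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
epc = 00439ff1 in busybox[400000+4a000]
"""
odhcpd_report = """\
do_page_fault(): sending SIGSEGV to odhcpd for invalid read access from 00000000
epc = 00407ec1 in odhcpd[400000+d000]
"""

for report in (busybox_report, odhcpd_report):
    comm, module = faulting_module(report)
    kind = "crash inside the program itself" if comm == module else f"script run by {module}"
    print(f"{comm}: {kind}")
```

Only reports where the epc line names busybox belong to this ticket; a report where both names match points at the daemon itself.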

openwrt-bot · 2016-11-26T13:49:47Z

IronicSven:

I've been testing the latest trunk r0+2321 on my TP-Link TL-WR1043N/ND v1 for a few hours and I can't reproduce the SIGSEGV messages anymore :)

openwrt-bot · 2016-11-30T15:35:19Z

IronicSven:

I just flashed r0+2369 and the SIGSEGV messages are back.

openwrt-bot · 2016-12-07T01:35:11Z

fuzzle:

I built several LEDE images in the last few days, and at least on a TP-Link 841 I never see this (with some days of uptime).
I only see something like this on Gluon:
Sat Dec 3 14:46:38 2016 daemon.crit dnsmasq[2095]: unknown user or group: dnsmasq
Sat Dec 3 14:46:38 2016 daemon.crit dnsmasq[2095]: FAILED to start up

..
some dnsmasq is still running
1016 root 1116 S /usr/sbin/dnsmasq -x /var/run/gluon-wan-dnsmasq.pid -u root -i lo -p 54 -h -r /var/gluon/wan-dnsmasq/resolv.conf

..
logread
Sat Dec 3 14:46:34 2016 daemon.crit dnsmasq[1985]: unknown user or group: dnsmasq
Sat Dec 3 14:46:34 2016 daemon.crit dnsmasq[1985]: FAILED to start up
Sat Dec 3 14:46:35 2016 user.notice firewall: Reloading firewall due to ifup of wan6 (br-wan)
Sat Dec 3 14:46:35 2016 daemon.warn fastd[1472]: sendmsg: Operation not permitted
Sat Dec 3 14:46:35 2016 daemon.warn fastd[1472]: sendmsg: Operation not permitted
Sat Dec 3 14:46:37 2016 daemon.info dnsmasq[1016]: reading /var/gluon/wan-dnsmasq/resolv.conf
Sat Dec 3 14:46:37 2016 daemon.info dnsmasq[1016]: using nameserver fd00::a96:d7ff:fe5d:1026#53
Sat Dec 3 14:46:37 2016 daemon.info dnsmasq[1016]: using nameserver 192.168.0.1#53
Sat Dec 3 14:46:37 2016 daemon.crit dnsmasq[2040]: unknown user or group: dnsmasq
Sat Dec 3 14:46:37 2016 daemon.crit dnsmasq[2040]: FAILED to start up
Sat Dec 3 14:46:38 2016 daemon.crit dnsmasq[2095]: unknown user or group: dnsmasq
Sat Dec 3 14:46:38 2016 daemon.crit dnsmasq[2095]: FAILED to start up
Sat Dec 3 14:46:38 2016 daemon.info procd: Instance dnsmasq::cfg02411c s in a crash loop 6 crashes, 0 seconds since last crash

this particular node:
Linux version 4.4.32 (fffr@v32412.1blu.de) (gcc version 5.4.0 (LEDE GCC 5.4.0 r2187+6) ) #0 Tue Sep 27 01:55:55 2016
based on https://kau.toke.dk/git/lede/commit/?id=18726b0ed2be546d1d2503c903d7d069ae5522d5

openwrt-bot · 2016-12-07T01:57:06Z

NeoRaider:

fuzzle, that's not even close to the issues reported in this ticket. As mentioned in earlier comments, this ticket has nothing to do with dnsmasq, but is about a segfault in busybox.

Also, please don't report Gluon bugs in the LEDE tracker.

openwrt-bot · 2016-12-07T02:01:23Z

NeoRaider:

Small update:

While I mostly see this issue on a TL-WR1043 v1, I've also observed it on a TL-WR841 v9 at least once; so it seems the bug is not hardware-specific after all (at least not limited to specific SoCs).

Unfortunately, I've been busy with other things last week, so I haven't been able to continue debugging the issue.

openwrt-bot · 2016-12-27T20:05:32Z

nbd:

Please test the latest version

openwrt-bot · 2016-12-27T22:03:51Z

IronicSven:

I've tested a few versions since last weekend and couldn't reproduce this issue on my 1043nd v1 anymore. I think it's fixed.

openwrt-bot · 2016-12-27T22:26:48Z

NeoRaider:

Still reproducible with current master (r2695-c9c68c71776).

openwrt-bot · 2017-01-04T20:27:39Z

mjw99:

Just a "Me too". I am seeing this with a NETGEAR WNR2000v1 on r2449-7c47f43:

[30131.723691] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
[30131.732296] epc = 00439ff1 in busybox[400000+4a000]
[30131.737282] ra = 00439fe5 in busybox[400000+4a000]

openwrt-bot · 2017-01-25T19:12:07Z

IronicSven:

I haven't been able to reproduce this issue for weeks. I've been testing a TL-WR1043ND v1, TL-WR1043ND v2 and Archer C7 during this period.

Is it possible your images are selfbuilt and a make dirclean or make distclean might help?

openwrt-bot · 2017-01-29T13:11:25Z

nbd:

If you're still affected by this bug, please try the latest version

openwrt-bot · 2017-01-29T13:16:39Z

mamarley:

I'm not seeing this on my UAP-LR anymore.

openwrt-bot · 2017-01-31T23:52:04Z

mjw99:

I am no longer seeing this on a NETGEAR WNR2000v1 with 17.01-SNAPSHOT, r3045-e038c60.

openwrt-bot · 2017-05-06T07:26:48Z

guidosarducci:

I've just noticed seeing the following several times within the last day or so:

[1461327.495159] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
[1461327.504081] epc = 0040f28d in dnsmasq[400000+2c000]
[1461327.509252] ra = 0040f273 in dnsmasq[400000+2c000]

I'm running the latest LEDE stable, with all updates applied as of 2017-05-05:

LEDE Reboot 17.01.1 r3316-7eb58cf109
D-Link DIR-835 rev. A1
dnsmasq-full - 2.76-6

The most recent upgrade in the same time frame was to odhcpd-2017-04-28-9268ca65-1. And DNSSEC is enabled.

After a few restart attempts, dnsmasq has continued to run since then.

openwrt-bot · 2017-05-07T06:53:26Z

guidosarducci:

The SIGSEGV crashes continue to happen periodically, and I may have been missing them due to dnsmasq being restarted by procd.

To get a little more info, I rebuilt the stable LEDE and dnsmasq-full with a "-g" CFLAG option. After installing this package, I captured the following crash details:

[1562749.817613] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
[1562749.826522] epc = 0040f295 in dnsmasq[400000+2c000]
[1562749.831681] ra = 0040f27b in dnsmasq[400000+2c000]

Checking further with gdb yields:

(gdb) info line *0x0040f27b
Line 278 of "forward.c" starts at address 0x40f275 <forward_query+204> and ends at 0x40f281 <forward_query+216>.

(gdb) info line *0x0040f295
Line 281 of "forward.c" starts at address 0x40f295 <forward_query+236> and ends at 0x40f29b <forward_query+242>.

And the relevant source (forward.c) looks like:

275 blockdata_retrieve(forward->stash, forward->stash_len, (void *)header);
276 plen = forward->stash_len;
277
278 if (find_pseudoheader(header, plen, NULL, &pheader, &is_sign, NULL) && !is_sign)
279   PUTSHORT(SAFE_PKTSZ, pheader);
280
281 if (forward->sentto->addr.sa.sa_family == AF_INET)
282   log_query(F_NOEXTRA | F_DNSSEC | F_IPV4, "retry", (struct all_addr *)&forward->sentto->addr.in.sin_addr, "dnssec");
283 #ifdef HAVE_IPV6
284 else
285   log_query(F_NOEXTRA | F_DNSSEC | F_IPV6, "retry", (struct all_addr

Any similar reports from others? I'll keep monitoring in the meantime...

openwrt-bot · 2017-05-07T07:00:59Z

NeoRaider:

This ticket is specifically about a crash in busybox, often seen while running the dnsmasq init script (but also in other shell scripts). Your issue is a crash in dnsmasq itself, please open a new ticket.

openwrt-bot · 2017-05-07T08:37:12Z

guidosarducci:

Sure, new ticket created. I'd also like to suggest changing the unfortunately misleading title of this ticket if possible, since it matches my own issue.

openwrt-bot · 2017-08-21T08:28:46Z

ckujau:

For the record, this is still an issue with 17.01.2 and Dnsmasq version 2.77:


kernel: [ 2860.890789] 
kernel: [ 2860.890789] do_page_fault(): sending SIGSEGV to dnsmasq for invalid write access to 00552000
kernel: [ 2860.899402] epc = 77cd488c in libc.so[77c62000+92000]
kernel: [ 2860.904552] ra  = 00406c41 in dnsmasq[400000+21000]
kernel: [ 2860.909537]

I came across this one while playing around with //dnseval// from the [[https://github.com/farrokhi/dnsdiag|dnsdiag]] package. Simply calling //dnseval foo// was enough to make //dnsmasq// crash :-|

But, as this crashes the latest git checkout from //dnsmasq// too, I shall report this upstream, of course.

openwrt-bot · 2017-08-21T08:33:51Z

NeoRaider:

ckujau: please open a new ticket for dnsmasq, I believe your issue hasn't been reported yet.

As mentioned in an earlier comment, this ticket has nothing to do with dnsmasq at all; it is about a segfault in busybox that just happened to occur while running a shell script called "dnsmasq", leading to a somewhat confusing error message.

openwrt-bot · 2017-08-21T10:18:53Z

ckujau:

I think this has been reported in ~~[[https://bugs.lede-project.org/index.php?do=details&task_id=766|#766]]~~, sorry for the mixup.

openwrt-bot · 2017-08-21T10:22:45Z

NeoRaider:

#766 also looks like an independent issue. While both your crash and #766 are a segfault of dnsmasq, your crash happens in libc.so, while #766 is in dnsmasq itself.

openwrt-bot · 2017-09-01T14:52:12Z

marcin1j:

Christian

I reported the issue you mentioned as FS#994. What's your target and device the problem occurs on?

openwrt-bot · 2017-09-02T07:04:05Z

ckujau:

I was able to reproduce this on x86 too and [[http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2017q3/011704.html|bisected]] it to upstream commit 0xfa78573778, so it was not LEDE or architecture specific and I should have reported this upstream from the start. But yes, the fix mentioned in FS#994 is the same one [[http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2017q3/011714.html|posted]] to the //dnsmasq-discuss// list. I haven't had a chance to verify it yet (the previous band-aid patch worked); I will report back.

For completeness' sake: my target is ar71xx (a TP-Link AC1750 Wifi router).

Thanks.

openwrt-bot closed this as completed Jan 29, 2017
