| Field | Value |
|---|---|
| Summary | Fails to start if iface not up \| segfault on IgnoreMissing |
| Product | [Fedora] Fedora |
| Component | radvd |
| Version | 15 |
| Status | CLOSED CURRENTRELEASE |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Pete Zaitcev <zaitcev> |
| Assignee | Petr Pisar <ppisar> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | jskala |
| URL | http://lists.litech.org/pipermail/radvd-devel-l/2010-September/000491.html |
| Doc Type | Bug Fix |
| Last Closed | 2012-03-29 17:03:07 UTC |

Description
Pete Zaitcev
2011-07-15 17:46:33 UTC
Created attachment 513422: radvd.conf
Also, this may be obvious, but just to be clear: if I log in over ssh and say "systemctl start radvd.service", everything starts right up, because by that time hostapd has brought up wlanhome.

Before anyone asks: if I uncomment IgnoreIfMissing, the following happens:

Jul 15 11:58:44 elanor radvd[1367]: interface wlanhome seems to have come back up, trying to reinitialize
Jul 15 11:58:44 elanor radvd[1367]: attempting to reread config file
Jul 15 11:58:44 elanor radvd[1367]: resuming normal operation
Jul 15 11:58:44 elanor kernel: [ 40.608051] radvd[1367]: segfault at 50 ip 0080b14a sp bff5b3a0 error 6 in radvd[807000+63000]
Jul 15 11:58:44 elanor systemd[1]: radvd.service: main process exited, code=killed, status=11

(In reply to comment #3)
> Jul 15 11:58:44 elanor radvd[1367]: interface wlanhome seems to have come back up, trying to reinitialize

Hmm, this looks like radvd checks that the interface is available. I'm not able to reproduce it. Can you provide a backtrace?

Try watching the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=729183

Do you have any comments on comment #4?

A core seems to be impossible to catch, even with a shell wrapper; dunno what's up with that. But gdb -p says:
Program received signal SIGSEGV, Segmentation fault.
0x002c114a in alarm_handler (sig=14) at timer.c:152
152 tm->prev->next = tm->next;
(gdb) where
#0 0x002c114a in alarm_handler (sig=14) at timer.c:152
#1 <signal handler called>
#2 0x00447416 in __kernel_vsyscall ()
#3 0x008b7ccd in ___newselect_nocancel () from /lib/libc.so.6
#4 0x002bfc5e in recv_rs_ra (sock=4, msg=0xbfbecdc4 "\206", addr=0xbfbed3a0,
pkt_info=0xbfbecdc0, hoplimit=0xbfbecdbc) at recv.c:46
#5 0x002bf087 in main (argc=3, argv=0xbfbed484) at radvd.c:341
(gdb)
radvd-1.8.2-2.fc15 may be ok. I jerked the interface around and it survived. Waiting for a reboot to confirm.

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

I experienced this problem too (on Gentoo), and upgrading to 1.8.1 fixed it. I will dig through the sources to find a reference confirming it is indeed fixed. The 1.7 code is:
while (tm->next && tm->prev && tm->expires.tv_sec != LONG_MAX && check_time_diff(tm, tv))
{
tm->prev->next = tm->next;
The last line segfaults. It could segfault only if tm or tm->prev were NULL, but tm->prev is checked in the while condition, and a NULL tm would already segfault in the condition itself. The code is called in the SIGALRM handler, and tm is taken from a linked list. I guess there was a race between the check in the condition and the assignment that follows it.
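For illustration only: a race of the kind guessed at here is commonly avoided by blocking SIGALRM while non-handler code relinks the list, so the handler can never run against a half-updated node. The sketch below is not radvd code and not the actual change made in 1.8. `struct timer_node`, `timer_insert_after()` and `timer_unlink()` are hypothetical names made up for the example; only `sigemptyset`, `sigaddset` and `sigprocmask` are real POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical doubly linked timer node; NOT radvd's real structure. */
struct timer_node {
    struct timer_node *next;
    struct timer_node *prev;
    long expires_sec;
};

/* Block SIGALRM and save the previous signal mask in *old. */
static void block_alarm(sigset_t *old)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    sigprocmask(SIG_BLOCK, &set, old);
}

/* Restore the signal mask saved by block_alarm(). */
static void restore_mask(const sigset_t *old)
{
    sigprocmask(SIG_SETMASK, old, NULL);
}

/* Link node directly after pos; SIGALRM is blocked for the duration,
 * so a SIGALRM handler walking the list cannot see partial links. */
void timer_insert_after(struct timer_node *pos, struct timer_node *node)
{
    sigset_t old;
    assert(pos != NULL && node != NULL);
    block_alarm(&old);
    node->prev = pos;
    node->next = pos->next;
    if (pos->next != NULL)
        pos->next->prev = node;
    pos->next = node;
    restore_mask(&old);
}

/* Unlink node from its list; both neighbours are NULL-checked and the
 * whole update happens with SIGALRM blocked. */
void timer_unlink(struct timer_node *node)
{
    sigset_t old;
    assert(node != NULL);
    block_alarm(&old);
    if (node->prev != NULL)
        node->prev->next = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    node->next = NULL;
    node->prev = NULL;
    restore_mask(&old);
}
```

The handler itself needs no extra locking in this scheme, because by default SIGALRM is blocked while its own handler runs; the only unprotected window is in code outside the handler, which is what the two helpers close.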
Studying radvd-1.8 shows the code has changed significantly. The handler as well as recv_rs_ra() have changed. Also radvd-1.8 uses NETLINK now in more cases.
I did some tests with the version you reported and with the latest F15 version, and neither of them crashed (I played with two pairs of veth devices).
I believe the crash is fixed in 1.8.2.
Regarding the premature exit when IgnoreIfMissing is off: I think you need to enable this option or modify the systemd unit file yourself. The distribution cannot wait for all devices, because they can appear and disappear dynamically. Moreover, IgnoreIfMissing has been enabled by default since radvd-1.8.