Bug 580709
Field | Value
---|---
Summary | [abrt] crash in internal_getent()
Product | [Fedora] Fedora
Reporter | Michael J. Chudobiak <mjc>
Component | glibc
Assignee | Andreas Schwab <schwab>
Status | CLOSED ERRATA
QA Contact | Fedora Extras Quality Assurance <extras-qa>
Severity | medium
Priority | low
Version | 13
CC | anders.blomdell, ar145boxer, dcbw, dhollis, drjohnson1, fweimer, jakub, jskala, jyundt, k1mk, liko, schwab
Keywords | Reopened
Hardware | i686
OS | Linux
Whiteboard | abrt_hash:cba8d0fd0a351004aa99366073ab7e0a8e607583
Fixed In Version | glibc-2.12.1-2
Doc Type | Bug Fix
Last Closed | 2010-09-24 20:37:36 UTC
Description
Michael J. Chudobiak
2010-04-08 20:20:37 UTC
Created attachment 405398
File: backtrace
I've been watching #580514, which describes an update making things worse, with pppd crashing. I'd like to ask for more detail about the update:

1. Was the kernel updated too?
2. If so, try booting the older kernel to check whether the kernel introduced the described change.
3. Are you able to test the modem on another F13 system? (Complete behaviour, including hardware detection at startup.)

Thanks in advance for answering at least items #1 and/or #2.

Jiri

1. No, there was no kernel or ppp update. (There were lots of other updates, though, including most NM rpms.)
2. The only kernel on the system is the one provided by the preupgrade to F13 (kernel-2.6.33.1-19.fc13.i686).
3. No, this is my only F13 system. I tried the update to F13 to see if it would make my U760 modem work better on my Asus EEE 900.

Sorry this wasn't more helpful...

One possible clue: I find it odd in the log of bug 580514 comment 4 that the modem shows up on ttyUSB2. The earlier working log (bug 580514 comment 2) shows the modem on ttyUSB0. Might be significant, might not.

Wired and Wifi networking works great.

- Mike

(In reply to comment #3)
> 1. No, there was no kernel or ppp update. (There were lots of other updates,
> though, including most NM rpms).
>
> 2. The only kernel on the system is the one provided by the preupgrade to F13
> (kernel-2.6.33.1-19.fc13.i686).
>
> 3. No, this is my only F13 system. I tried the update to F13 to see if it would
> make my U760 modem work better on my Asus EEE 900.
>
> Sorry this wasn't more helpful...
>
> One possible clue: I find it odd in the log of bug 580514 comment 4 that the
> modem shows up on ttyUSB2. The earlier working log (bug 580514 comment 2)
> shows the modem on ttyUSB0. Might be significant, might not.
This sometimes happens in response to kernel problems where the driver fails and isn't cleaned up properly afterwards; then, if you pull the device and re-insert it, USB0 and USB1 still exist but aren't completely torn down, so the kernel will choose the next available device name, which will be USB2. So this behavior usually indicates a kernel bug.

Moving to glibc since almost the entire stack is in glibc underneath a call to getlogin().

Seems this issue may be sasl related? Are you still able to reproduce the issue?

Yes, the crash still happens on up-to-date F13.

    Apr 30 13:19:43 localhost NetworkManager: <info> Activation (ttyUSB1) Stage 4 of 5 (IP6 Configure Get) complete.
    Apr 30 13:19:43 localhost pppd[3452]: Plugin /usr/lib/pppd/2.4.5/nm-pppd-plugin.so loaded.
    Apr 30 13:19:43 localhost kernel: PPP generic driver version 2.4.2
    Apr 30 13:19:43 localhost kernel: pppd[3452]: segfault at bff19baf ip 00164698 sp bff0e2e4 error 6 in libnss_files-2.11.90.so[15e000+c000]
    Apr 30 13:19:44 localhost abrt[3455]: saved core dump of pid 3452 (/usr/sbin/pppd) to /var/cache/abrt/ccpp-1272647984-3452.new/coredump (937984 bytes)

I resolved this on my system by:

1. editing /etc/nsswitch.conf to remove everything except "files" and "dns"
2. yum remove nss_ldap

One of these two (probably #2) "fixed" it for me.

I suspect there is still a bug here, but since I have a workaround, I won't be doing further tests to find the root cause.

- Mike

--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Unfortunately the bug still exists in Fedora 13 (updated by yum at Aug 04 05:55:33 2010). The stack dump in comment 1 looks very similar to the one I get. The segfault seems to be related to the fact that the buffer passed to internal_getent is not as big as buflen indicates.
In my case buffer=0xbf890800 (ends up in data->linebuffer) and buflen=8192 (ends up in linebuflen), hence

    ((unsigned char *) data->linebuffer)[linebuflen - 1] = '\xff';

tries to access address 0xbf890800 + 8192 - 1 = 0xbf8927ff, but only memory up to 0xbf891fff is accessible (according to gdb on my coredump). The following data in the passed buffer:

    (gdb) x/b 0xbf890800 + 1023
    0xbf890bff: 0xff
    (gdb) x/b 0xbf890800 + 2047
    0xbf890fff: 0xff
    (gdb) x/b 0xbf890800 + 4095
    0xbf8917ff: 0xff

makes it reasonable to believe that the following loop in __getlogin_r_loginuid:

    size_t buflen = 1024;
    char *buf = alloca (buflen);
    bool use_malloc = false;
    struct passwd pwd;
    struct passwd *tpwd;
    int res;

    while ((res = __getpwuid_r (uid, &pwd, buf, buflen, &tpwd)) != 0)
      if (__libc_use_alloca (2 * buflen))
        extend_alloca (buf, buflen, 2 * buflen);
      else
        {
          buflen *= 2;
          char *newp = realloc (use_malloc ? buf : NULL, buflen);
          if (newp == NULL)
            {
            fail:
              if (use_malloc)
                free (buf);
              return 1;
            }
          buf = newp;
          use_malloc = true;
        }

is at its fourth iteration and that the buffer still lives on the stack, but the buffer extends past the mapped stack. I'm at a loss here...

glibc-2.12.1-1 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/glibc-2.12.1-1

OK, thanks. Works like a charm here (and fixes the bug I was able to find, namely adding [ buf = ] to the extend_alloca call; and then some...)

Thanks

glibc-2.12.1-1 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update glibc'

You can provide feedback for this update here: http://admin.fedoraproject.org/updates/glibc-2.12.1-1

(In reply to comment #7)
> I resolved this on my system by:
>
> 1. editing /etc/nsswitch.conf to remove everything except "files" and "dns"
>
> 2. yum remove nss_ldap
>
> One of these two (probably #2) "fixed" it for me.
>
> I suspect there is still a bug here, but since I have a workaround, I won't be
> doing further tests to find the root cause.
>
> - Mike

I had this problem too with pppd and the rp-pppoe scripts. I resolved it with workaround #1. Thank you very much, Michael, for this workaround; it was a big problem for me.

In case it is useful for bug hunting, this was my situation: I found this bug after updating from Fedora 12 to Fedora 13. pppd crashed at boot, but not when invoked from the console by pppoe-start. The error in the log at boot was:

    Aug 30 19:40:44 wwwipcop kernel: pppd[2581]: segfault at bf9f201f ip 0049b518 sp bf9ec764 error 6 in libnss_files-2.12.so[495000+c000]

glibc-2.12.1-2 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update glibc'

You can provide feedback for this update here: https://admin.fedoraproject.org/updates/glibc-2.12.1-2

glibc-2.12.1-2 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.

*** Bug 644434 has been marked as a duplicate of this bug. ***
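For readers following the analysis in this thread, here is a minimal sketch in C of the grow-and-retry pattern around getpwuid_r(). It is not the glibc code: it assumes a heap buffer (malloc/realloc) instead of the alloca-based one quoted above, and the helper name lookup_login_name and the 1 MiB cap are made up for illustration. The point it tries to show is the one a commenter identified as the fix: whenever the length grows, the buffer pointer has to be updated together with it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc): look up the login name for uid,
   doubling a heap buffer each time getpwuid_r() reports ERANGE.
   On success returns 0 and stores a malloc'd copy of the name in *name_out;
   otherwise returns an errno-style code and leaves *name_out NULL.  */
int lookup_login_name(uid_t uid, char **name_out)
{
    assert(name_out != NULL);
    *name_out = NULL;

    size_t buflen = 1024;               /* same starting size as in the thread */
    char *buf = malloc(buflen);
    if (buf == NULL)
        return ENOMEM;

    struct passwd pwd;
    struct passwd *result = NULL;
    int res;

    while ((res = getpwuid_r(uid, &pwd, buf, buflen, &result)) == ERANGE) {
        if (buflen >= ((size_t)1 << 20)) {  /* arbitrary illustrative cap */
            free(buf);
            return ERANGE;
        }
        size_t newlen = buflen * 2;
        char *newp = realloc(buf, newlen);
        if (newp == NULL) {
            free(buf);
            return ENOMEM;
        }
        /* Update pointer and length together, so the length passed to
           getpwuid_r() never exceeds what buf really points at.  */
        buf = newp;
        buflen = newlen;
    }

    if (res == 0 && result == NULL)
        res = ENOENT;                   /* lookup worked, but no such uid */

    if (res == 0) {
        *name_out = strdup(pwd.pw_name);
        if (*name_out == NULL)
            res = ENOMEM;
    }

    free(buf);
    return res;
}
```

A caller would pass something like getuid() and free the returned string when done. Unlike the loop quoted in the thread, this sketch retries only on ERANGE, so other lookup failures are returned to the caller instead of triggering another doubling.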