Bug 580709

Summary:

[abrt] crash in internal_getent()

Product:

[Fedora] Fedora

Reporter:

Michael J. Chudobiak <mjc>

Component:

glibc

Assignee:

Andreas Schwab <schwab>

Status:

CLOSED ERRATA

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

medium

Docs Contact:

Priority:

low

Version:

CC:

anders.blomdell, ar145boxer, dcbw, dhollis, drjohnson1, fweimer, jakub, jskala, jyundt, k1mk, liko, schwab

Target Milestone:

---

Keywords:

Reopened

Target Release:

---

Hardware:

i686

OS:

Linux

Whiteboard:

abrt_hash:cba8d0fd0a351004aa99366073ab7e0a8e607583

Fixed In Version:

glibc-2.12.1-2

Doc Type:

Bug Fix

Last Closed:

2010-09-24 20:37:36 UTC

Attachments:

Description	Flags
File: backtrace	none

Description Michael J. Chudobiak 2010-04-08 20:20:37 UTC

abrt 1.0.9 detected a crash.

architecture: i686
Attached file: backtrace
cmdline: /usr/sbin/pppd nodetach lock nodefaultroute user user ttyUSB1 noipdefault noauth usepeerdns lcp-echo-failure 0 lcp-echo-interval 0 ipparam /org/freedesktop/NetworkManager/PPP/0 plugin /usr/lib/pppd/2.4.5/nm-pppd-plugin.so
component: ppp
executable: /usr/sbin/pppd
global_uuid: cba8d0fd0a351004aa99366073ab7e0a8e607583
kernel: 2.6.33.1-19.fc13.i686
package: ppp-2.4.5-6.fc13
rating: 4
reason: Process /usr/sbin/pppd was killed by signal 11 (SIGSEGV)
release: Fedora release 13 (Goddard)
How to reproduce: 1. Inserted Novatel U760 cdma usb dongle. See bug 580514 comment 4.

Comment 1 Michael J. Chudobiak 2010-04-08 20:20:39 UTC

Created attachment 405398 [details]
File: backtrace

Comment 2 Jiri Skala 2010-04-09 06:03:27 UTC

I've been following bug 580514. It describes the behaviour getting worse after an update, with pppd crashing.
I'd like to ask for more detailed info about the update:

1. Was the kernel updated too?
2. If so, try booting the older kernel to determine whether the kernel introduced the described change.
3. Are you able to test the modem on another F13 system (complete behaviour, hardware detection at startup)?

Thanks in advance for answering at least items #1 and/or #2.

Jiri

Comment 3 Michael J. Chudobiak 2010-04-09 12:01:11 UTC

1. No, there was no kernel or ppp update. (There were lots of other updates, though, including most NM rpms).

2. The only kernel on the system is the one provided by the preupgrade to F13 (kernel-2.6.33.1-19.fc13.i686).

3. No, this is my only F13 system. I tried the update to F13 to see if it would make my U760 modem work better on my Asus EEE 900.

Sorry this wasn't more helpful...

One possible clue: I find it odd in the log of bug 580514 comment 4 that the modem shows up on ttyUSB2. The earlier working log (bug 580514 comment 2) shows the modem on ttyUSB0. Might be significant, might not.

Wired and Wifi networking works great.

- Mike

Comment 4 Dan Williams 2010-04-29 23:48:17 UTC

(In reply to comment #3)
> 1. No, there was no kernel or ppp update. (There were lots of other updates,
> though, including most NM rpms).
> 
> 2. The only kernel on the system is the one provided by the preupgrade to F13
> (kernel-2.6.33.1-19.fc13.i686).
> 
> 3. No, this is my only F13 system. I tried the update to F13 to see if it would
> make my U760 modem work better on my Asus EEE 900.
> 
> Sorry this wasn't more helpful...
> 
> One possible clue: I find it odd in the log of bug 580514
> comment 4 that the modem shows up on ttyUSB2. The earlier working log (bug
> 580514
> comment 2) shows the modem on ttyUSB0. Might be significant, might not.

This sometimes happens in response to kernel problems where the driver fails and isn't cleaned up properly; then, if you pull the device and re-insert it, USB0 and USB1 still exist but aren't completely torn down, so the kernel will choose the next available device name, which will be USB2.  So this behavior usually indicates a kernel bug.

Comment 5 Dan Williams 2010-04-29 23:55:29 UTC

Moving to glibc since almost the entire stack is in glibc underneath a call to getlogin().  Seems this issue may be sasl related?

Are you still able to reproduce the issue?

Comment 6 Michael J. Chudobiak 2010-04-30 17:25:37 UTC

Yes, the crash still happens on up-to-date F13.

Apr 30 13:19:43 localhost NetworkManager: <info> Activation (ttyUSB1) Stage 4 of 5 (IP6 Configure Get) complete.
Apr 30 13:19:43 localhost pppd[3452]: Plugin /usr/lib/pppd/2.4.5/nm-pppd-plugin.so loaded.
Apr 30 13:19:43 localhost kernel: PPP generic driver version 2.4.2
Apr 30 13:19:43 localhost kernel: pppd[3452]: segfault at bff19baf ip 00164698 sp bff0e2e4 error 6 in libnss_files-2.11.90.so[15e000+c000]
Apr 30 13:19:44 localhost abrt[3455]: saved core dump of pid 3452 (/usr/sbin/pppd) to /var/cache/abrt/ccpp-1272647984-3452.new/coredump (937984 bytes)

Comment 7 Michael J. Chudobiak 2010-06-01 16:45:52 UTC

I resolved this on my system by:

1. editing /etc/nsswitch.conf to remove everything except "files" and "dns"

2. yum remove nss_ldap

One of these two (probably #2) "fixed" it for me.

I suspect there is still a bug here, but since I have a workaround, I won't be doing further tests to find the root cause.

- Mike

Comment 8 d. johnson 2010-07-11 05:55:49 UTC


-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 9 Anders Blomdell 2010-08-16 13:09:26 UTC

Unfortunately the bug still exists in Fedora 13 (last updated via yum on Aug 04 05:55:33 2010). The stack dump in comment 1 looks very similar to the one I get. The segfault seems to be related to the fact that the buffer passed to internal_getent is not as big as buflen indicates.

In my case buffer=0xbf890800 (ends up in data->linebuffer) and buflen=8192 (ends up in linebuflen), hence

  ((unsigned char *) data->linebuffer)[linebuflen - 1] = '\xff';

tries to access address 0xbf890800 + 8192 - 1 = 0xbf8927ff, but only memory up to 0xbf891fff is accessible (according to gdb on my coredump).

The following data in the passed buffer:

(gdb)  x/b 0xbf890800 + 1023
0xbf890bff:     0xff
(gdb)  x/b 0xbf890800 + 2047
0xbf890fff:     0xff
(gdb)  x/b 0xbf890800 + 4095
0xbf8917ff:     0xff

Makes it reasonable to believe that the following loop in __getlogin_r_loginuid:

  size_t buflen = 1024;
  char *buf = alloca (buflen);
  bool use_malloc = false;
  struct passwd pwd;
  struct passwd *tpwd;
  int res;
  
  while ((res = __getpwuid_r (uid, &pwd, buf, buflen, &tpwd)) != 0)
    if (__libc_use_alloca (2 * buflen))
      extend_alloca (buf, buflen, 2 * buflen);
    else
      {
        buflen *= 2;
        char *newp = realloc (use_malloc ? buf : NULL, buflen);
        if (newp == NULL)
          {
          fail:

            if (use_malloc)
              free (buf);
            return 1;
          }
          buf = newp;
          use_malloc = true;
       }

is at its fourth iteration and that the buffer still lives on the stack, but extends past the mapped stack. I'm at a loss here...
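
The address arithmetic in this comment can be checked mechanically. The sketch below is an illustration only: marker_addr and first_unmapped_iteration are hypothetical helper names, not glibc functions. It assumes that each call writes its 0xff marker at offset buflen - 1 from the same base address while buflen doubles from 1024, which is what the gdb output above suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the 0xff marker byte written at buffer[buflen - 1].
   Addresses are 32-bit because the report is from an i686 system. */
static uint32_t marker_addr(uint32_t base, size_t buflen)
{
    assert(buflen > 0);
    return base + (uint32_t)buflen - 1u;
}

/* Return the 1-based iteration at which the marker write first lands
   beyond last_mapped, assuming buflen starts at 1024 and doubles on
   every iteration while base stays fixed (hypothetical model of the
   loop). Returns 0 if none of the first max_iter iterations does. */
static int first_unmapped_iteration(uint32_t base, uint32_t last_mapped,
                                    int max_iter)
{
    size_t buflen = 1024;
    for (int i = 1; i <= max_iter; i++, buflen *= 2) {
        if (marker_addr(base, buflen) > last_mapped)
            return i;
    }
    return 0;
}
```

With the values from the coredump, marker_addr(0xbf890800, 8192) is 0xbf8927ff, and first_unmapped_iteration(0xbf890800, 0xbf891fff, 8) evaluates to 4, which matches the fourth-iteration reading.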

Comment 10 Fedora Update System 2010-08-17 13:23:25 UTC

glibc-2.12.1-1 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/glibc-2.12.1-1

Comment 11 Anders Blomdell 2010-08-17 14:25:16 UTC

OK, thanks. Works like a charm here (it includes the fix for the bug I was able to find, namely adding [ buf = ] to the extend_alloca call, and then some...)

Thanks
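
Why a missing [ buf = ] matters can be shown with a small heap-based model. This is a sketch under stated assumptions, not the glibc code: grow is a hypothetical stand-in for extend_alloca, assumed to update the length in place and to yield the new buffer as its value (which is what the fix described here implies), and malloc replaces alloca so the example stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a grow helper: it updates *len to the new
   size and RETURNS the new block. The caller must store the result. */
static char *grow(size_t *len, size_t newlen)
{
    assert(newlen > *len);
    char *newbuf = malloc(newlen);
    if (newbuf == NULL)
        abort();
    *len = newlen;
    return newbuf;
}

/* Perform one growth step and report whether len still matches the
   real size of the block that buf points to. */
static bool lengths_agree(bool store_result)
{
    size_t capacity = 1024;       /* real size of the block behind buf */
    size_t len = capacity;        /* size the caller believes it has */
    char *buf = malloc(capacity);
    if (buf == NULL)
        abort();

    size_t newlen = 2 * len;
    char *newbuf = grow(&len, newlen);
    if (store_result) {
        free(buf);
        buf = newbuf;             /* the "buf =" assignment */
        capacity = newlen;
    } else {
        free(newbuf);             /* result discarded: buf is unchanged */
    }

    /* false means a write to buf[len - 1] would be out of bounds */
    bool ok = (len == capacity);
    free(buf);
    return ok;
}
```

In this model lengths_agree(true) holds, while lengths_agree(false) does not: the length has doubled but buf still points at the original 1024-byte block, so a write at buf[len - 1] lands past its end.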

Comment 12 Fedora Update System 2010-08-20 01:48:34 UTC

glibc-2.12.1-1 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/glibc-2.12.1-1

Comment 13 Fabio C. 2010-08-30 22:01:21 UTC

(In reply to comment #7)
> I resolved this on my system by:
> 
> 1. editing /etc/nsswitch.conf to remove everything except "files" and "dns"
> 
> 2. yum remove nss_ldap
> 
> One of these two (probably #2) "fixed" it for me.
> 
> I suspect there is still a bug here, but since I have a workaround, I won't be
> doing further tests to find the root cause.
> 
> - Mike

I had this problem too, with pppd and the rp-pppoe scripts.

I resolved it with workaround #1.

Thank you very much, Michael, for this workaround; it was a big problem for me.
In case it is useful for bug hunting, this was my situation:
I found this bug after updating from Fedora 12 to Fedora 13. pppd crashed at boot, but not when invoked from the console via pppoe-start.
The error logged at boot was:

Aug 30 19:40:44 wwwipcop kernel: pppd[2581]: segfault at bf9f201f ip 0049b518 sp bf9ec764 error 6 in libnss_files-2.12.so[495000+c000]

Comment 14 Fedora Update System 2010-09-15 05:40:56 UTC

glibc-2.12.1-2 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: https://admin.fedoraproject.org/updates/glibc-2.12.1-2

Comment 15 Fedora Update System 2010-09-24 20:37:25 UTC

glibc-2.12.1-2 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Andreas Schwab 2010-10-20 08:49:19 UTC

*** Bug 644434 has been marked as a duplicate of this bug. ***