Bug 1122593

Summary: segv in nmcli during system shutdown
Product: [Fedora] Fedora
Component: NetworkManager
Version: 20
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: John Sullivan <jsrhbz>
Assignee: Dan Williams <dcbw>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dcbw, jklimes, psimerda, shawn.bohrer
Fixed In Version: NetworkManager-0.9.9.0-42.git20131003.fc20
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-07-28 03:27:54 UTC

Description John Sullivan 2014-07-23 14:47:05 UTC
NetworkManager-0.9.9.0-41.git20131003.fc20.x86_64

Noticed a flurry of these on each reboot:

Jul 22 13:32:38 kernel: nmcli[22209]: segfault at 54 ip 00007f7a17176b40 sp 00007fff13f06b70 error 4 in libnm-glib.so.4.7.0[7f7a17165000+43000]
Jul 22 13:32:38 kernel: nmcli[22218]: segfault at 54 ip 00007f700c1bcb40 sp 00007fffe2e9a250 error 4 in libnm-glib.so.4.7.0[7f700c1ab000+43000]
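
Both log lines point at the same place in the library. A minimal sketch of the arithmetic, using only the ip and mapping base values from the two kernel messages above (the helper name fault_offset is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped library:
 * instruction pointer minus the base address of the mapping
 * (the first number inside the square brackets of the log line). */
static uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
    assert(ip >= map_base); /* the ip must lie inside the mapping */
    return ip - map_base;
}

/* Values copied from the two kernel log lines. */
static const uint64_t first_ip    = 0x00007f7a17176b40ULL;
static const uint64_t first_base  = 0x00007f7a17165000ULL;
static const uint64_t second_ip   = 0x00007f700c1bcb40ULL;
static const uint64_t second_base = 0x00007f700c1ab000ULL;
```

Calling fault_offset() on either pair gives 0x11b40, so the two crashes are the same instruction despite the different load addresses.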

The ip corresponds to _nm_object_ensure_inited:

   0x00007ffff793db2d <_nm_object_ensure_inited+29>:	call   0x7ffff793a0d0 <nm_object_get_type@plt>
   0x00007ffff793db32 <_nm_object_ensure_inited+34>:	mov    rdi,rbx
   0x00007ffff793db35 <_nm_object_ensure_inited+37>:	mov    rsi,rax
   0x00007ffff793db38 <_nm_object_ensure_inited+40>:	call   0x7ffff793a650 <g_type_instance_get_private@plt>
   0x00007ffff793db3d <_nm_object_ensure_inited+45>:	mov    rbp,rax
   0x00007ffff793db40 <_nm_object_ensure_inited+48>:	mov    eax,DWORD PTR [rax+0x54] <<<<< FAULTING INSTRUCTION

Which is in libnm-glib/nm-object.c:

_nm_object_ensure_inited (NMObject *object)
{
        NMObjectPrivate *priv = NM_OBJECT_GET_PRIVATE (object);
        GError *error = NULL;

        if (!priv->inited) {
[...]

So priv came back NULL.
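
The reasoning behind that: the kernel reported "segfault at 54", and the faulting instruction reads [rax+0x54], so rax (the private pointer) must have been 0. A small sketch of the same arithmetic, assuming a hypothetical stand-in struct whose padding is invented purely so that the flag lands at offset 0x54 (the real NMObjectPrivate layout is not shown in this report):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NMObjectPrivate: only the position of
 * `inited` matters here, the padding is made up for illustration. */
typedef struct {
    unsigned char earlier_fields[0x54]; /* assumed size of preceding members */
    int inited;
} FakeObjectPrivate;

static_assert(offsetof(FakeObjectPrivate, inited) == 0x54,
              "inited must sit at offset 0x54 for this illustration");

/* Address the CPU would access for `priv->inited`, computed
 * arithmetically so that no invalid pointer is ever dereferenced. */
static uintptr_t inited_address(uintptr_t priv)
{
    return priv + offsetof(FakeObjectPrivate, inited);
}
```

With priv equal to 0, inited_address() yields 0x54, which matches the fault address in the kernel messages.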

It looks like there have been bugs in other places with the same basic symptom, NM_OBJECT_GET_PRIVATE() returning NULL, in those cases because a method was called on an NM object after that object had already been freed. That suggests the fix is not to check priv for NULL here but to prevent the situation from arising in the first place.

Unfortunately I currently have neither the command line nor a coredump/backtrace, but I've just tried to enable coredumps for system services so if I can get one in the future I shall update this bug.

Comment 1 John Sullivan 2014-07-23 15:50:26 UTC
Ok:

The command is "nmcli -t --fields running general status". It is run by the function is_nm_running in /etc/sysconfig/network-scripts/network-functions, which is called from the function source_config, almost certainly via the ifdown script.

The backtrace is:

#0  _nm_object_ensure_inited (object=0x0) at nm-object.c:1259
#1  0x00007fa02e4007db in nm_client_new () at nm-client.c:1606
#2  0x00007fa02eafe51e in nmc_get_client (nmc=0x7fff00cee090) at nmcli.c:336
#3  0x00007fa02eae8746 in show_nm_status (nmc=nmc@entry=0x7fff00cee090, pretty_header_name=pretty_header_name@entry=0x0, print_flds=print_flds@entry=0x0)
    at network-manager.c:216
#4  0x00007fa02eae9234 in do_general (nmc=0x7fff00cee090, argc=1, argv=0x7fff00cee250) at network-manager.c:461
#5  0x00007fa02eafe025 in do_cmd (argv=0x7fff00cee248, argc=2, argv0=0x7fff00ceef56 "general", nmc=0x7fff00cee090) at nmcli.c:129
#6  parse_command_line (argv=0x7fff00cee240, argc=3, nmc=0x7fff00cee090) at nmcli.c:262
#7  start (data=0x7fff00cee070) at nmcli.c:399
#8  0x00007fa02bcad2a6 in g_main_dispatch (context=0x7fa02f329900) at gmain.c:3066
#9  g_main_context_dispatch (context=context@entry=0x7fa02f329900) at gmain.c:3642
#10 0x00007fa02bcad628 in g_main_context_iterate (context=0x7fa02f329900, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3713
#11 0x00007fa02bcada3a in g_main_loop_run (loop=0x7fa02f329a20) at gmain.c:3907
#12 0x00007fa02ead1b15 in main (argc=<optimized out>, argv=<optimized out>) at nmcli.c:434

nm_client_new says:

NMClient *
nm_client_new (void)
{
        NMClient *client;

        client = g_object_new (NM_TYPE_CLIENT, NM_OBJECT_DBUS_PATH, NM_DBUS_PATH, NULL);
        _nm_object_ensure_inited (NM_OBJECT (client));
        return client;
}

So g_object_new appears to have returned NULL. AFAICT libnm-glib/nm-object.c fails object creation unless it can connect to D-Bus, and /var/log/messages suggests that systemd shut down the "D-Bus System Message Bus" well before it got to shutting down the network interfaces.
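
To illustrate the failure mode, here is a self-contained sketch of a caller that tolerates a failed construction. Everything in it is a hypothetical stand-in (FakeClient, fake_client_alloc, fake_ensure_inited, fake_client_new); it is not libnm-glib code and not necessarily how the problem was actually fixed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in; this is not the libnm-glib NMClient type. */
typedef struct {
    bool inited;
} FakeClient;

/* Mimics a constructor that gives up when the bus is unreachable. */
static FakeClient *fake_client_alloc(bool bus_available)
{
    if (!bus_available)
        return NULL;
    return calloc(1, sizeof(FakeClient));
}

/* Mimics the init step; it requires a valid object, which is the
 * precondition that was violated in frame #0 of the backtrace. */
static void fake_ensure_inited(FakeClient *client)
{
    assert(client != NULL);
    client->inited = true;
}

/* Caller-side guard: only initialise an object that was actually created. */
static FakeClient *fake_client_new(bool bus_available)
{
    FakeClient *client = fake_client_alloc(bus_available);

    if (client != NULL)
        fake_ensure_inited(client);
    return client;
}
```

With the bus unavailable, fake_client_new() simply returns NULL and leaves it to the caller to report the error, instead of passing NULL on to the init step.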

Comment 2 Jirka Klimes 2014-07-24 16:03:59 UTC
John, thanks for reporting and the analysis. It helped me find out what is going on.

Fortunately, the problem has been fixed, both in upstream master and in RHEL7 - bug 1010288.
I'm going to push the fix for Fedora 20.

Comment 3 Jirka Klimes 2014-07-24 16:10:56 UTC
Fedora 20 build:
http://koji.fedoraproject.org/koji/buildinfo?buildID=547783

Comment 4 Fedora Update System 2014-07-25 07:02:58 UTC
NetworkManager-0.9.9.0-42.git20131003.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/NetworkManager-0.9.9.0-42.git20131003.fc20

Comment 5 Fedora Update System 2014-07-26 00:01:28 UTC
Package NetworkManager-0.9.9.0-42.git20131003.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.9.0-42.git20131003.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-8871/NetworkManager-0.9.9.0-42.git20131003.fc20
then log in and leave karma (feedback).

Comment 6 Jirka Klimes 2014-07-27 07:39:29 UTC
*** Bug 1123520 has been marked as a duplicate of this bug. ***

Comment 7 Fedora Update System 2014-07-28 03:27:54 UTC
NetworkManager-0.9.9.0-42.git20131003.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 8 John Sullivan 2014-07-28 17:38:11 UTC
I updated to NetworkManager*-0.9.9.0-42.git20131003.fc20 and rebooted several times. Segfaults no longer occur.

Thanks!