Bug 1063282
| Summary: | [abrt] NetworkManager: nm_device_remove_pending_action(): NetworkManager killed by SIGSEGV |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Component: | NetworkManager |
| Version: | 7.0 |
| Hardware: | x86_64 |
| OS: | Unspecified |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | high |
| Priority: | medium |
| Reporter: | Matěj Cepl <mcepl> |
| Assignee: | Jirka Klimes <jklimes> |
| QA Contact: | Desktop QE <desktop-qa-list> |
| CC: | danken, dcbw, jklimes, mcepl, osvoboda, rkhan |
| Target Milestone: | rc |
| Target Release: | --- |
| Keywords: | Reopened |
| Whiteboard: | abrt_hash:44c52c3fff9453a9375fda009a4f5fa96254f146 |
| Doc Type: | Bug Fix |
| Story Points: | --- |
| Last Closed: | 2015-02-18 20:02:29 UTC |
| Type: | --- |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |

Description
Matěj Cepl
2014-02-10 12:00:52 UTC
Created attachment 861356 (File: backtrace)
Created attachment 861357 (File: cgroup)
Created attachment 861358 (File: core_backtrace)
Created attachment 861359 (File: dso_list)
Created attachment 861360 (File: environ)
Created attachment 861361 (File: exploitable)
Created attachment 861362 (File: limits)
Created attachment 861363 (File: maps)
Created attachment 861364 (File: open_fds)
Created attachment 861365 (File: proc_pid_status)
Created attachment 861366 (File: var_log_messages)
Created attachment 861367 (File: sosreport.tar.xz)
Matej, do you know how to reproduce it?
#0 nm_device_remove_pending_action (device=device@entry=0x7fcbb63cc120, action=0x7fcbb5972938 "queued state change to disconnected") at devices/nm-device.c:7403
priv = 0x7fcbb597294f
iter = <optimized out>
__PRETTY_FUNCTION__ = "nm_device_remove_pending_action"
7403 for (iter = priv->pending_actions; iter; iter = iter->next) {
Hmm, device is probably not valid any more?
Jirka, it may be the changes we'd made for deleting a virtual interface on explicit disconnect, since /var/log/messages includes:
úno 10 11:42:20 wycliff.ceplovi.cz NetworkManager[11763]: <info> (virbr0-nic): device state change: ip-config -> deactivating (reason 'user-requested') [70 110 39]
úno 10 11:42:20 wycliff.ceplovi.cz NetworkManager[11763]: <info> (virbr0-nic): device state change: deactivating -> disconnected (reason 'user-requested') [110 30 39]
úno 10 11:42:20 wycliff.ceplovi.cz NetworkManager[11763]: <info> (virbr0-nic): deactivating device (reason 'user-requested') [39]
úno 10 11:42:20 wycliff.ceplovi.cz NetworkManager[11763]: nm_device_queue_state: assertion `NM_IS_DEVICE (self)' failed
Mind checking out whether that might cause the issue?
There is an assertion failure even earlier in the log:
úno 10 11:30:38 wycliff.ceplovi.cz NetworkManager[11763]: <info> (virbr0-nic): device state change: config -> ip-config (reason 'none') [50 70 0]
úno 10 11:30:38 wycliff.ceplovi.cz NetworkManager[11763]: nm_device_get_iface: assertion `self != NULL' failed
But I am not able to spot any problem. So I am inclined to close this bug as fixed in the current release if the reporter no longer sees any issues.
Go ahead.
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.
Closing as fixed per comment 17.
Hi :-)
I am hitting this bug with NetworkManager-0.9.9.1-26.git20140326.4dba720.el7_0.x86_64 when deleting a connection for a bond that I remove right before running nmcli conn delete.
How to reproduce:
cat > /etc/sysconfig/network-scripts/ifcfg-random-bond <<EOF
DEVICE=random-bond
NM_CONTROLLED=no
EOF
nmcli conn load /etc/sysconfig/network-scripts/ifcfg-random-bond
echo +random-bond > /sys/class/net/bonding_masters
# only to prove that the bond is indeed ignored by NM
echo 4 > /sys/class/net/random-bond/bonding/mode
echo -random-bond > /sys/class/net/bonding_masters
nmcli conn delete random-bond
Backtrace in GDB:
(NetworkManager:5593): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_object_notify: assertion `G_IS_OBJECT (object)' failed
(NetworkManager:5593): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_object_notify: assertion `G_IS_OBJECT (object)' failed
(NetworkManager:5593): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_object_notify: assertion `G_IS_OBJECT (object)' failed
(NetworkManager:5593): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_object_notify: assertion `G_IS_OBJECT (object)' failed
(NetworkManager:5593): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_object_notify: assertion `G_IS_OBJECT (object)' failed
(NetworkManager:5593): GLib-GObject-CRITICAL **: g_type_instance_get_private: assertion `instance != NULL && instance->g_class != NULL' failed
Program received signal SIGSEGV, Segmentation fault.
nm_device_remove_pending_action (device=device@entry=0x7f913541ba60, action=0x7f91344dcb38 "queued state change to disconnected",
assert_is_pending=assert_is_pending@entry=1) at devices/nm-device.c:7753
7753 for (iter = priv->pending_actions; iter; iter = iter->next) {
Missing separate debuginfos, use: debuginfo-install GConf2-3.2.6-8.el7.x86_64 ModemManager-glib-1.1.0-6.git20130913.el7.x86_64 libgcc-4.8.2-16.2.el7_0.x86_64 libgudev1-208-11.el7_0.2.x86_64 libndp-1.2-4.el7.x86_64 libnl3-3.2.21-6.el7.x86_64 libsoup-2.42.2-3.el7.x86_64 libxml2-2.9.1-5.el7.x86_64 nspr-4.10.6-1.el7_0.x86_64 nss-3.16.2-2.el7_0.x86_64 nss-softokn-3.16.2-1.el7_0.x86_64 nss-softokn-freebl-3.16.2-1.el7_0.x86_64 nss-util-3.16.2-1.el7_0.x86_64 polkit-0.112-5.el7.x86_64 sqlite-3.7.17-4.el7.x86_64 systemd-libs-208-11.el7_0.2.x86_64 teamd-1.9-15.el7.x86_64
(gdb) bt
#0 nm_device_remove_pending_action (device=device@entry=0x7f913541ba60,
action=0x7f91344dcb38 "queued state change to disconnected", assert_is_pending=assert_is_pending@entry=1)
at devices/nm-device.c:7753
#1 0x00007f9134459d51 in queued_set_state (user_data=<optimized out>) at devices/nm-device.c:6881
#2 0x00007f91305f1ac6 in g_main_dispatch (context=0x7f9135376270) at gmain.c:3058
#3 g_main_context_dispatch (context=context@entry=0x7f9135376270) at gmain.c:3634
#4 0x00007f91305f1e48 in g_main_context_iterate (context=0x7f9135376270, block=block@entry=1, dispatch=dispatch@entry=1,
self=<optimized out>) at gmain.c:3705
#5 0x00007f91305f225a in g_main_loop_run (loop=0x7f91353747a0) at gmain.c:3899
#6 0x00007f91344468ea in main (argc=1, argv=0x7fffa105cd08) at main.c:644
This crash complicates the workaround of https://bugzilla.redhat.com/show_bug.cgi?id=1142701 (this bug is a clone of an original NM bug).
Can you reproduce the crash? I would be glad if you could find a solution. I offer any help with debugging, of course :-)
The problem is fixed in NetworkManager-0.9.9.1-35.git20140326.4dba720.el7 (for bug 1136843), which makes sure that the connection is not even created for unmanaged devices. So you won't get a crash with it, and it also helps you with un-managing the interface.
However, I tried your steps with NetworkManager-0.9.9.1-33.git20140326.4dba720.el7 and I can reproduce the crash with it. Basically, the device is disposed and then some methods are called on it, but I can't understand why this happens. What is very strange to me is that dispose() is called from within g_signal_emit_by_name(). What does this mean?
6786 g_signal_emit_by_name (device, "state-changed", state, old_state, reason);
Breakpoint 3, dispose (object=0x7f2060910470) at devices/nm-device.c:5736
5736 {
(gdb) bt
#0 dispose (object=0x7f2060910470) at devices/nm-device.c:5736
#1 0x00007f205c613c68 in g_object_unref () from /lib64/libgobject-2.0.so.0
#2 0x00007f205c634453 in g_value_unset () from /lib64/libgobject-2.0.so.0
#3 0x00007f205c628e6e in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0
#4 0x00007f205c629638 in g_signal_emit_by_name () from /lib64/libgobject-2.0.so.0
#5 0x00007f205ffa042d in nm_device_state_changed (device=<optimized out>, state=<optimized out>, reason=NM_DEVICE_STATE_REASON_CONNECTION_REMOVED) at devices/nm-device.c:6786
#6 0x00007f205ffa4e92 in queued_set_state (user_data=<optimized out>) at devices/nm-device.c:6914
#7 0x00007f205c1119ba in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#8 0x00007f205c111d08 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#9 0x00007f205c111fda in g_main_loop_run () from /lib64/libglib-2.0.so.0
#10 0x00007f205ff919ba in main (argc=1, argv=0x7fff0c2bc768) at main.c:642
Maybe here is a similar problem: https://bugzilla.redhat.com/show_bug.cgi?id=859879#c14
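One possible reading of the dispose() backtrace, offered as an assumption rather than a confirmed diagnosis: GObject signal emission holds its own reference to the emitting instance while the handlers run and releases it at the end (the g_value_unset() frame). If a "state-changed" handler drops the last other reference, that final release is what runs dispose(), and queued_set_state() then calls nm_device_remove_pending_action() on a freed device. The sketch below models that sequence in plain C. The Device type and every function name in it are hypothetical stand-ins, not NetworkManager or GLib APIs. It also shows the usual guard, where the caller keeps its own reference across the emission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device object.
 * This is NOT NetworkManager's NMDevice or a real GObject. */
typedef struct Device {
    int   ref_count;
    bool *disposed_flag;  /* caller-owned flag, set when the object is freed */
} Device;

static Device *device_new(bool *disposed_flag)
{
    Device *d = malloc(sizeof *d);
    if (d == NULL)
        abort();
    d->ref_count = 1;              /* the owner's reference */
    d->disposed_flag = disposed_flag;
    *disposed_flag = false;
    return d;
}

static Device *device_ref(Device *d)
{
    d->ref_count++;
    return d;
}

static void device_unref(Device *d)
{
    assert(d->ref_count > 0);
    if (--d->ref_count == 0) {
        *d->disposed_flag = true;  /* models dispose() running */
        free(d);
    }
}

typedef void (*StateChangedHandler)(Device *d);

/* Models signal emission: the emitter takes a temporary reference for
 * the duration of the handler call and releases it afterwards. If the
 * handler dropped every other reference, the object dies right here. */
static void emit_state_changed(Device *d, StateChangedHandler handler)
{
    device_ref(d);
    handler(d);
    device_unref(d);
}

/* Models a handler (for example a manager removing the device) that
 * drops the owner's reference in response to the state change. */
static void handler_drops_owner_ref(Device *d)
{
    device_unref(d);
}

/* Unguarded caller: after the emission the pointer may be dangling,
 * so this function only inspects the flag and never touches d again. */
bool alive_after_unguarded_emit(void)
{
    bool disposed;
    Device *d = device_new(&disposed);
    emit_state_changed(d, handler_drops_owner_ref);
    return !disposed;
}

/* Guarded caller: holds its own reference across the emission, so d
 * stays valid until the caller is done with it. */
bool alive_after_guarded_emit(void)
{
    bool disposed;
    Device *d = device_new(&disposed);
    device_ref(d);                 /* guard reference */
    emit_state_changed(d, handler_drops_owner_ref);
    bool alive = !disposed;        /* d could still be used safely here */
    device_unref(d);               /* release the guard; object is freed now */
    return alive;
}
```

In this model, alive_after_unguarded_emit() reports that the device was already freed when the emission returned, while alive_after_guarded_emit() reports that it was still alive. Whether the actual fix for this bug took that form is not stated in the report; the comment above only says that newer builds avoid creating the connection for unmanaged devices.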