Bug 1055101

Summary: [abrt] NetworkManager: _g_log_abort(): NetworkManager killed by SIGABRT
Product: [Fedora] Fedora Reporter: Niko <nfink95>
Component: NetworkManagerAssignee: Dan Williams <dcbw>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 20CC: dcbw, dieter.kasper, jklimes, KoVadim
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/0f5c3477151430b2ee6418ed35bb478e4a528e94
Whiteboard: abrt_hash:d6dd577390db5ce1ec04d902a4766efeffceb3d9
Fixed In Version: NetworkManager-0.9.9.0-38.git20131003.fc20 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-22 03:59:22 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
File: backtrace
none
File: cgroup
none
File: core_backtrace
none
File: dso_list
none
File: environ
none
File: limits
none
File: maps
none
File: open_fds
none
File: proc_pid_status
none
File: var_log_messages none

Description Niko 2014-01-18 19:45:49 UTC
Version-Release number of selected component:
NetworkManager-0.9.9.0-23.git20131003.fc20

Additional info:
reporter:       libreport-2.1.11
backtrace_rating: 4
cmdline:        /usr/sbin/NetworkManager --no-daemon
crash_function: _g_log_abort
executable:     /usr/sbin/NetworkManager
kernel:         3.12.7-300.fc20.x86_64
runlevel:       N 5
type:           CCpp
uid:            0

Truncated backtrace:
Thread no. 1 (10 frames)
 #2 _g_log_abort at gmessages.c:255
 #5 activation_failure_handler at devices/nm-device-wifi.c:3254
 #6 device_state_changed at devices/nm-device-wifi.c:3344
 #7 ffi_call_unix64 at ../src/x86/unix64.S:76
 #8 ffi_call at ../src/x86/ffi64.c:522
 #9 g_cclosure_marshal_generic at gclosure.c:1454
 #13 g_signal_emit_by_name at gsignal.c:3426
 #14 nm_device_state_changed at devices/nm-device.c:6064
 #20 g_object_notify_by_spec_internal at gobject.c:1141
 #21 g_object_notify at gobject.c:1183

Comment 1 Niko 2014-01-18 19:45:53 UTC
Created attachment 852088 [details]
File: backtrace

Comment 2 Niko 2014-01-18 19:45:55 UTC
Created attachment 852089 [details]
File: cgroup

Comment 3 Niko 2014-01-18 19:45:58 UTC
Created attachment 852090 [details]
File: core_backtrace

Comment 4 Niko 2014-01-18 19:46:00 UTC
Created attachment 852091 [details]
File: dso_list

Comment 5 Niko 2014-01-18 19:46:01 UTC
Created attachment 852092 [details]
File: environ

Comment 6 Niko 2014-01-18 19:46:03 UTC
Created attachment 852093 [details]
File: limits

Comment 7 Niko 2014-01-18 19:46:06 UTC
Created attachment 852094 [details]
File: maps

Comment 8 Niko 2014-01-18 19:46:09 UTC
Created attachment 852095 [details]
File: open_fds

Comment 9 Niko 2014-01-18 19:46:11 UTC
Created attachment 852096 [details]
File: proc_pid_status

Comment 10 Niko 2014-01-18 19:46:13 UTC
Created attachment 852097 [details]
File: var_log_messages

Comment 11 Dan Williams 2014-04-09 19:19:21 UTC
Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0): disconnecting for new activation request.
Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0): device state change: secondaries -> disconnected (reason 'none') [90 30 0]
Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0): deactivating device (reason 'none') [0]
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> (wlp9s0): canceled DHCP transaction, DHCP client pid 1002
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> NetworkManager state is now DISCONNECTED
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: (devices/nm-device.c:5983):nm_device_state_changed: runtime check failed: (priv->in_state_changed == FALSE)
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> (wlp9s0): device state change: disconnected -> failed (reason 'secondary-connection-failed') [30 120 54]
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: ** Message: Terminated openvpn daemon with PID 1038.
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: **
Jan 16 10:08:44 SD959-LT NetworkManager[13490]: ERROR:devices/nm-device-wifi.c:3254:activation_failure_handler: assertion failed: (connection)

I think the segfault is fixed by Thomas' rework of the WiFi bits for bug 1025371.

But the real problem is that the device is moved to DISCONNECTED state, which terminates a secondary VPN that is in the process of connecting.  That triggers nm-policy.c::vpn_connection_deactivated() which then calls nm-policy.c::process_secondaries(), which then changes the device state to FAILED even though it's already in DISCONNECTED state.

process_secondaries() should take the device's current state into account before changing the device state.  If the secondary connected, it should ensure that the device is in the SECONDARIES state before moving the device to ACTIVATED, and if the secondary failed, it should ensure the device is in SECONDARIES || ACTIVATED states before moving the device to FAILED.

Comment 12 Fedora Update System 2014-04-15 11:56:55 UTC
NetworkManager-0.9.9.0-37.git20131003.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/NetworkManager-0.9.9.0-37.git20131003.fc20

Comment 13 Jirka Klimes 2014-04-15 12:20:41 UTC
(In reply to Dan Williams from comment #11)
> Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0):
> disconnecting for new activation request.
> Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0): device
> state change: secondaries -> disconnected (reason 'none') [90 30 0]
> Jan 16 10:08:43 SD959-LT NetworkManager[13490]: <info> (wlp9s0):
> deactivating device (reason 'none') [0]
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> (wlp9s0): canceled
> DHCP transaction, DHCP client pid 1002
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> NetworkManager state
> is now DISCONNECTED
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]:
> (devices/nm-device.c:5983):nm_device_state_changed: runtime check failed:
> (priv->in_state_changed == FALSE)
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]: <info> (wlp9s0): device
> state change: disconnected -> failed (reason 'secondary-connection-failed')
> [30 120 54]
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]: ** Message: Terminated
> openvpn daemon with PID 1038.
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]: **
> Jan 16 10:08:44 SD959-LT NetworkManager[13490]:
> ERROR:devices/nm-device-wifi.c:3254:activation_failure_handler: assertion
> failed: (connection)
> 
> I think the segfault is fixed by Thomas' rework of the WiFi bits for bug
> 1025371.
> 
> But the real problem is that the device is moved to DISCONNECTED state,
> which terminates a secondary VPN that is in the process of connecting.  That
> triggers nm-policy.c::vpn_connection_deactivated() which then calls
> nm-policy.c::process_secondaries(), which then changes the device state to
> FAILED even though it's already in DISCONNECTED state.
> 
> process_secondaries() should take the device's current state into account
> before changing the device state.  If the secondary connected, it should
> ensure that the device is in the SECONDARIES state before moving the device
> to ACTIVATED, and if the secondary failed, it should ensure the device is in
> SECONDARIES || ACTIVATED states before moving the device to FAILED.

Right, that caused a crash in bug 1055099 (on Fedora).

On upstream master, the bug didn't result in a crash, but the problem is the same. I pushed the fix:
c54faa4 policy: check device state before changing it for secondaries (rh #1055099)
and backported that to Fedora 20.

Testcase:
1) add a VPN connection as a secondary to an ethernet connection
2) activate the ethernet connection
3) activate the ethernet connection again (without providing or cancelling VPN password dialog)

Logs on upstream master:
NetworkManager[2655]: <info> Policy set 'ethernet-12' (eth0) as default for IPv4 routing and DNS.
NetworkManager[2655]: <info> Starting VPN service 'openvpn'...
Detaching after fork from child process 11412.
NetworkManager[2655]: <info> VPN service 'openvpn' started (org.freedesktop.NetworkManager.openvpn), PID 11412
NetworkManager[2655]: <info> VPN service 'openvpn' appeared; activating connections
NetworkManager[2655]: <info> (eth0): disconnecting for new activation request.
NetworkManager[2655]: <info> (eth0): device state change: secondaries -> deactivating (reason 'none') [90 110 0]
NetworkManager[2655]: <info> (eth0): device state change: deactivating -> disconnected (reason 'none') [110 30 0]
NetworkManager[2655]: <info> (eth0): deactivating device (reason 'none') [0]
NetworkManager[2655]: <info> (eth0): canceled DHCP transaction, DHCP client pid 11409
NetworkManager[2655]: <info> NetworkManager state is now DISCONNECTED
NetworkManager[2655]: (devices/nm-device.c:6591):nm_device_state_changed: runtime check failed: (priv->in_state_changed == FALSE)
NetworkManager[2655]: <info> (eth0): device state change: disconnected -> failed (reason 'secondary-connection-failed') [30 120 54]
NetworkManager[2655]: <warn> Activation (eth0) failed for connection '<unknown>'
NetworkManager[2655]: <warn> (eth0): add_pending_action (4): 'queued state change to disconnected' already added
NetworkManager[2655]: file devices/nm-device.c: line 7682 (nm_device_add_pending_action): should not be reached
NetworkManager[2655]: <info> Activation (eth0) starting connection 'ethernet-12'
NetworkManager[2655]: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) scheduled...
NetworkManager[2655]: <info> (eth0): device state change: failed -> disconnected (reason 'none') [120 30 0]
NetworkManager[2655]: <info> (eth0): deactivating device (reason 'none') [0]
NetworkManager[2655]: <warn> (eth0): remove_pending_action (2): 'queued state change to disconnected' never added
NetworkManager[2655]: file devices/nm-device.c: line 7733 (nm_device_remove_pending_action): should not be reached
NetworkManager[2655]: <info> VPN service 'openvpn' disappeared


Logs on Fedora 20:
NetworkManager[24363]: <info> (em1): disconnecting for new activation request.
NetworkManager[24363]: <info> (em1): device state change: secondaries -> disconnected (reason 'none') [90 30 0]
NetworkManager[24363]: <info> (em1): deactivating device (reason 'none') [0]
Received signal 15, initiating shutdown.
DHCPRELEASE on em1 to 10.11.5.19 port 67 (xid=0x60a5080f)
NetworkManager[24363]: <warn> (em1): DHCP client pid 25694 didn't exit, will kill it.
NetworkManager[24363]: <info> (em1): canceled DHCP transaction, DHCP client pid 25694
NetworkManager[24363]: <info> NetworkManager state is now DISCONNECTED
NetworkManager[24363]: (devices/nm-device.c:6281):nm_device_state_changed: runtime check failed: (priv->in_state_changed == FALSE)
NetworkManager[24363]: <info> (em1): device state change: disconnected -> failed (reason 'secondary-connection-failed') [30 120 54]
** Message: Terminated openvpn daemon with PID 25708.
NetworkManager[24363]: <warn> Activation (em1) failed for connection '<unknown>'
NetworkManager[24363]: <info> Activation (em1) starting connection 'ethernet'
NetworkManager[24363]: (devices/nm-device.c:4749):nm_device_activate: runtime check failed: (priv->state == NM_DEVICE_STATE_DISCONNECTED)
NetworkManager[24363]: <info> (em1): device state change: failed -> prepare (reason 'none') [120 40 0]

(NetworkManager:24363): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'NMActiveConnection'

(NetworkManager:24363): GLib-GObject-CRITICAL **: g_type_instance_get_private: assertion 'instance != NULL && instance->g_class != NULL' failed

Program received signal SIGSEGV, Segmentation fault.
nm_active_connection_get_path (self=<optimized out>) at nm-active-connection.c:157
157		return NM_ACTIVE_CONNECTION_GET_PRIVATE (self)->path;

Comment 14 Fedora Update System 2014-04-16 09:26:04 UTC
Package NetworkManager-0.9.9.0-37.git20131003.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.9.0-37.git20131003.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-5242/NetworkManager-0.9.9.0-37.git20131003.fc20
then log in and leave karma (feedback).

Comment 15 Fedora Update System 2014-04-17 16:41:31 UTC
NetworkManager-0.9.9.0-38.git20131003.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/NetworkManager-0.9.9.0-38.git20131003.fc20

Comment 16 Fedora Update System 2014-04-22 03:59:22 UTC
NetworkManager-0.9.9.0-38.git20131003.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.