Bug 1058843
| Summary: | [abrt] NetworkManager: nm_active_connection_export(): NetworkManager killed by SIGABRT | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Martin <mholec> |
| Component: | NetworkManager | Assignee: | Dan Williams <dcbw> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 7.0 | CC: | danw, dcbw, jgrulich, jklimes, nstraz, tpelka |
| Target Milestone: | rc | | |
| Target Release: | 7.0 | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | abrt_hash:36c2159ffcc8a9d6088e0f81917bf589be32bfa5 | | |
| Fixed In Version: | NetworkManager-0.9.9.1-0.git20140228.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-13 13:24:04 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 916275 | | |
| Attachments: | | | |
|
Description
Martin
2014-01-28 15:43:52 UTC
Created attachment 856676 [details]
File: backtrace
Created attachment 856677 [details]
File: cgroup
Created attachment 856678 [details]
File: core_backtrace
Created attachment 856679 [details]
File: dso_list
Created attachment 856680 [details]
File: environ
Created attachment 856681 [details]
File: limits
Created attachment 856682 [details]
File: maps
Created attachment 856683 [details]
File: open_fds
Created attachment 856684 [details]
File: proc_pid_status
Created attachment 856685 [details]
File: var_log_messages
I'm able to reproduce it even in F20, but only with the old KDE applet, which is the one in RHEL 7. It looks like the old applet is calling something wrongly, but I haven't found where the problem could be. It's also reproducible with wireless networks secured with WPA2. I remember it worked, so maybe some change in NM causes this problem, but what is weird is that the crash is in NetworkManager.

I've just tested the new update of NM in Fedora, and it looks like it was related to https://bugzilla.gnome.org/show_bug.cgi?id=723163 and seems to be fixed.

Could be fixed by ff350c04c0546383841126ea43bed93d302482fb upstream, but the backtrace doesn't implicate AddAndActivate. I'd like to dig further into this though. If you can easily reproduce on *RHEL7*, please install the debuginfo so we can grab a better backtrace if possible.

Also, if possible, could you reproduce the issue under valgrind and attach the result?

Created attachment 857415 [details]
Backtrace with installed debuginfo
Adding a backtrace with installed debuginfo
I tried to run NetworkManager under valgrind, but probably not properly, because I got a lot of errors about connecting to dbus. I did reproduce the issue, though, and the only thing I see is:

ERROR:nm-active-connection.c:288:nm_active_connection_export: assertion failed: (priv->device || priv->vpn)

Looks like an auto-activated request is then superseded by a manual request for the same device. The new request already has the INT_DEVICE set on it, and thus is watching for device state events. When the device state changes to DISCONNECTED to clean up the old request, the new request is listening and calls _device_cleanup(), but then proceeds and hits the assertion because priv->device is NULL.

NetworkManager[3324]: <info> NetworkManager state is now CONNECTING
NetworkManager[3324]: <info> Activation (wlp3s0) Stage 1 of 5 (Device Prepare) scheduled...
NetworkManager[3324]: <info> Activation (wlp3s0) Stage 1 of 5 (Device Prepare) started...
NetworkManager[3324]: <info> Activation (wlp3s0) Stage 2 of 5 (Device Configure) scheduled...
NetworkManager[3324]: <info> Activation (wlp3s0) Stage 1 of 5 (Device Prepare) complete.
NetworkManager[3324]: <info> (wlp3s0): disconnecting for new activation request.
NetworkManager[3324]: <info> (wlp3s0): device state change: prepare -> disconnected (reason 'none') [40 30 0]
NetworkManager[3324]: <info> (wlp3s0): deactivating device (reason 'none') [0]
NetworkManager[3324]: <info> NetworkManager state is now DISCONNECTED
NetworkManager[3324]: <info> Activation (wlp3s0) starting connection 'Red Hat'
NetworkManager[3324]: <info> (wlp3s0): device state change: disconnected -> prepare (reason 'none') [30 40 0]
NetworkManager[3324]: <info> Activation (wlp3s0) Stage 1 of 5 (Device Prepare) scheduled...
**
ERROR:nm-active-connection.c:288:nm_active_connection_export: assertion failed: (priv->device || priv->vpn)

Similar or the same issue is Fedora bug 1062984.

Fixes pushed to dcbw/reactivate upstream.
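For readers following the analysis above, here is a minimal, self-contained sketch of the suspected sequence. It is a toy model only: `Device`, `ActiveConnection`, `on_state_changed_buggy()`, `can_export()` and `replay_buggy_sequence()` are hypothetical names made up for illustration, not NetworkManager's real types or functions. The only things taken from the report are the idea that a cleanup step clears the device pointer on a DISCONNECTED event and that the export step requires `priv->device || priv->vpn`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT NetworkManager's real types. */
typedef enum { STATE_DISCONNECTED, STATE_PREPARE } DeviceState;

typedef struct {
    DeviceState state;
} Device;

typedef struct {
    Device *device; /* models priv->device */
    void   *vpn;    /* models priv->vpn */
} ActiveConnection;

/* Buggy handler: every listener drops its device pointer on any
 * DISCONNECTED event, even when the event only concerns the request
 * that is being replaced (models the _device_cleanup() call). */
static void on_state_changed_buggy(ActiveConnection *ac, DeviceState new_state)
{
    if (new_state == STATE_DISCONNECTED)
        ac->device = NULL;
}

/* Models the precondition asserted in nm_active_connection_export();
 * returns false where the real daemon would abort. */
static bool can_export(const ActiveConnection *ac)
{
    return ac->device != NULL || ac->vpn != NULL;
}

/* Replays the order of events seen in the log above. */
static bool replay_buggy_sequence(void)
{
    Device dev = { STATE_PREPARE };
    ActiveConnection old_req = { &dev, NULL }; /* auto-activated request */
    ActiveConnection new_req = { &dev, NULL }; /* manual request, already listening */

    /* prepare -> disconnected, emitted to clean up the old request */
    dev.state = STATE_DISCONNECTED;
    on_state_changed_buggy(&old_req, dev.state);
    on_state_changed_buggy(&new_req, dev.state); /* new request reacts too */

    /* disconnected -> prepare for the new request, then export */
    dev.state = STATE_PREPARE;
    return can_export(&new_req);
}
```

In this model `replay_buggy_sequence()` reports that the export precondition no longer holds for the new request, which corresponds to the `(priv->device || priv->vpn)` assertion failure in the log.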
> core: queue re-activations to allow DEACTIVATING state

any reason for the clear_act_request() rewrite? It seems less clear this way to me. The commit message doesn't explain the "HACK..." that you removed in nm_device_activate().

> core: better ignore deactivations before a new activation starts (rh #1058843)

The AC could also just check if nm_device_get_act_request() returns itself (well, assuming the AC is an NMActRequest and not an NMVpnConnection)? But this way works too.

(In reply to Dan Winship from comment #21)
> > core: queue re-activations to allow DEACTIVATING state
>
> any reason for the clear_act_request() rewrite? It seems less clear this way
> to me.

Reverted.

> The commit message doesn't explain the "HACK..." that you removed in
> nm_device_activate().

Archaeology time! This was fun, back to 2007. Updated the commit message.

> > core: better ignore deactivations before a new activation starts (rh #1058843)
>
> The AC could also just check if nm_device_get_act_request() returns itself
> (well, assuming the AC is an NMActRequest and not an NMVpnConnection)?

This is why we have code review :) I like your way better, though it is a bit less explicit. Updated.

(In reply to Dan Williams from comment #22)
> Archaeology time! This was fun, back to 2007. Updated the commit message.

OK. That's more explanation than I was expecting. :) everything looks good now

Another user experienced a similar problem:

NetworkManager crashed during boot preventing network from coming online.

reporter: libreport-2.1.11
backtrace_rating: 4
cmdline: /usr/sbin/NetworkManager --no-daemon
crash_function: nm_active_connection_export
executable: /usr/sbin/NetworkManager
kernel: 3.10.0-93.el7.revolver.x86_64
package: NetworkManager-0.9.9.0-39.git20140131.el7
reason: NetworkManager killed by SIGABRT
runlevel: unknown
type: CCpp
uid: 0

Merged to git master.
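The review suggestion above (have the AC check whether the device's current request is itself before reacting) can be sketched as follows. This is an assumption-laden toy, not the merged patch: the `current` field stands in for whatever nm_device_get_act_request() would return, and `on_state_changed_guarded()`, `can_export()` and `replay_guarded_sequence()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT NetworkManager's real types. */
typedef enum { STATE_DISCONNECTED, STATE_PREPARE } DeviceState;

typedef struct ActiveConnection ActiveConnection;

typedef struct {
    DeviceState       state;
    ActiveConnection *current; /* models the result of nm_device_get_act_request() */
} Device;

struct ActiveConnection {
    Device *device; /* models priv->device */
    void   *vpn;    /* models priv->vpn */
};

/* Guarded handler: a request ignores state changes unless it is the
 * request the device is currently running. */
static void on_state_changed_guarded(ActiveConnection *ac, const Device *dev,
                                     DeviceState new_state)
{
    if (dev->current != ac)
        return; /* event belongs to some other request */
    if (new_state == STATE_DISCONNECTED)
        ac->device = NULL;
}

/* Models the precondition asserted in nm_active_connection_export(). */
static bool can_export(const ActiveConnection *ac)
{
    return ac->device != NULL || ac->vpn != NULL;
}

/* Same event order as in the crash log, with the guard in place. */
static bool replay_guarded_sequence(void)
{
    Device dev = { STATE_PREPARE, NULL };
    ActiveConnection old_req = { &dev, NULL };
    ActiveConnection new_req = { &dev, NULL };
    dev.current = &old_req;

    dev.state = STATE_DISCONNECTED;
    on_state_changed_guarded(&old_req, &dev, dev.state); /* cleaned up */
    on_state_changed_guarded(&new_req, &dev, dev.state); /* ignored */

    dev.current = &new_req;
    dev.state = STATE_PREPARE;

    /* Old request released its device; new request is still exportable. */
    return old_req.device == NULL && can_export(&new_req);
}
```

With the guard, only the request that owns the device reacts to the DISCONNECTED event, so the new request keeps its device pointer in this model. As noted in the review, the check only makes sense when the AC is a device request rather than a VPN connection.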
e19f48ec2601a37641cfbcd4cc4b0c63b407c7a2 brings a regression in that active connections are not removed any more, but stuck in deactivating state.

Test:

$ nmcli device
DEVICE  TYPE      STATE      CONNECTION
ens3    ethernet  connected  c3
ens4    ethernet  connected  c4
ens5    ethernet  connected  c5

$ nmcli connection
NAME  TYPE      UUID                                  TYPE            DEVICE
c3    ethernet  cfac552d-710a-4ff7-9ef1-0edd218e52e8  802-3-ethernet  ens3
c4    ethernet  3cde06c2-735c-41e3-a316-ad4a9d2c27dd  802-3-ethernet  ens4
c5    ethernet  b3007cf2-f1b4-49a4-af76-32afd8bb49d8  802-3-ethernet  ens5

$ nmcli dev disc ens4
$ nmcli dev disc ens5

$ nmcli device
DEVICE  TYPE      STATE         CONNECTION
ens3    ethernet  connected     c3
ens4    ethernet  disconnected  --
ens5    ethernet  disconnected  --

$ nmcli connection
NAME  TYPE      UUID                                  TYPE            DEVICE
c3    ethernet  cfac552d-710a-4ff7-9ef1-0edd218e52e8  802-3-ethernet  ens3
c4    ethernet  3cde06c2-735c-41e3-a316-ad4a9d2c27dd  802-3-ethernet  ens4
c5    ethernet  b3007cf2-f1b4-49a4-af76-32afd8bb49d8  802-3-ethernet  ens5

I pushed a fix to jk/acon-remove-fix upstream branch.

seems right to me, but it would be good to get dcbw's input

Yeah, I tracked this down yesterday and today too. Your fix is correct, but I'd prefer one that makes the decision clearer. I pushed dcbw/acon-remove-fix with my version of the patch isolated. Would this be OK instead?

also looks good

after acks from thaller and danw, merged dcbw/acon-remove-fix to master

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
|