Red Hat Bugzilla – Bug 865009
GString mem alloc crashes after dbus op
Last modified: 2015-03-03 18:05:45 EST
This might not be a NetworkManager issue after all; take it as a starting point. Potential candidates include dbus-glib and glib2.
Description of problem:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start GNOME with GNOME Shell
2. Run gnome-control-center network
3. Watch the NM daemon crash
Created attachment 624964 [details]
Created attachment 624965 [details]
Forgot to add this is a clean F18 (Alpha) install, i686 32-bit PAE mode, SELinux in enforcing mode.
One physical Ethernet NIC.
> # ip link sh
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN mode DEFAULT
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 2: p2p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
> link/ether 00:xx:xx:xx:xx:92 brd ff:ff:ff:ff:ff:ff
Both traces are from /usr/sbin/NetworkManager processes. Since it's started on demand, I'm using the following script to attach to a newly started instance:
> (while [[ -z `pidof NetworkManager` ]]; do sleep 1; done) && gdb /usr/sbin/NetworkManager `pidof NetworkManager`
Created attachment 624981 [details]
valgrind log, try #1
valgrind log, still using GSlice; I will try to catch a better one
Okay, with G_SLICE=always-malloc set, the daemon no longer seems to crash. An allocator fault, perhaps?
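For anyone who wants to repeat the G_SLICE test, here is a minimal sketch of a wrapper that sets the variable for a single command only. The function name run_always_malloc is made up for illustration, and the commented NetworkManager invocation assumes a root shell and the --no-daemon flag; adjust it to your setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetworkManager or GLib): run one command
# with GLib's slice allocator bypassed, so GSlice requests go through the
# system malloc()/free(), which valgrind can track.
run_always_malloc() {
    # env sets G_SLICE only for the child process; the calling shell's
    # environment is left untouched.
    env G_SLICE=always-malloc "$@"
}

# Example usage (assumption: run as root, daemon kept in the foreground):
#   run_always_malloc valgrind /usr/sbin/NetworkManager --no-daemon
```

Note that a crash disappearing under always-malloc does not by itself prove the allocator is at fault; a mismatched free call elsewhere can corrupt slice bookkeeping and show up the same way.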
Created attachment 624985 [details]
valgrind log, try #2, with G_SLICE=always-malloc
Another valgrind log; this time the process didn't crash and I terminated it manually with Ctrl+C. Attaching just for curiosity and comparison.
Thanks; likely a mixup of g_array_unref() used on a GValueArray instead of g_value_array_unref() in src/nm-dispatcher.c.
(In reply to comment #8)
> Thanks; likely a mixup of g_array_unref() used on a GValueArray instead of
> g_value_array_unref() in src/nm-dispatcher.c.
By which I mean g_value_array_free().
Created attachment 625498 [details]
Free GValueArray with g_value_array_free ()
Use g_value_array_free () as Dan suggests. g_array_unref () seems to corrupt the heap.
I'm not sure about the other "Invalid read" in foreach_route_cb (nm-netlink-utils.c:410). The code looks OK at first glance. Maybe nl_addr_get_binary_addr () returns a bad address?
Anyway, this one shouldn't cause any harm.
(In reply to comment #10)
> Created attachment 625498 [details]
> Free GValueArray with g_value_array_free ()
> Use g_value_array_free () as Dan suggests. g_array_unref () seems to corrupt
> the heap.
Looks like this did the trick, thanks! I can't seem to make it crash again with the patch applied.
Patch from comment #10 pushed to upstream master:
NetworkManager-0.9.7.0-6.git20121004.fc18 has been submitted as an update for Fedora 18.
*** Bug 865042 has been marked as a duplicate of this bug. ***
Package NetworkManager-0.9.7.0-6.git20121004.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.7.0-6.git20121004.fc18'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
*** Bug 863544 has been marked as a duplicate of this bug. ***
*** Bug 866434 has been marked as a duplicate of this bug. ***
Jirka, please transfer Blocks field. Doing that now.
*** Bug 864300 has been marked as a duplicate of this bug. ***
For anyone reproducing this - please note that the issue might be tied just to i386 systems.
Discussed at 2012-10-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-qa/2012-10-17/f18beta-blocker-review-4.2012-10-17-16.00.log.txt . Note that this bug can potentially affect installation - see #866434 - which is why it's proposed as a blocker.
We agreed the bug constitutes a conditional violation of the criteria - when it affects installation, the install crashes, which is obviously against the criteria. However, it seems to affect only 32-bit installs and does so only occasionally: 2 in 5 tries for kparal, but 1 in 20 tries for Jirka. On the basis that you can just restart and try again, and you should be able to get it to go through after a couple of tries, this is rejected as a blocker. It is accepted as NTH, as install crashers are obviously worth fixing post-freeze.
If we get data indicating this might be more than just an occasional problem, it can be re-proposed as a blocker, but it's very likely to be fixed in future builds anyway since it's been accepted as NTH.
I'm sorry I couldn't attend yesterday's blocker bug meeting. But I have to re-propose this as a Beta blocker, because this is one of the biggest blockers I have ever seen. I have learned that this manifests only on 32-bit systems, and that's the reason why my colleagues weren't able to reproduce it. But I performed about 30 Anaconda boots yesterday on two different 32-bit bare metal machines and I see an 80-90% failure rate. Moreover, in the remaining 10-20% of cases, where it successfully boots to the installer, it's even worse, because it causes so many weird things to happen; just look at the duplicates of bug 866434: no network devices, installation hangs, invalid hostname, etc.
On that basis I'm changing my vote to +1 blocker.
Well, the update has gone stable now, so we could in fact close this bug. But let's wait for TC5 to verify that the fix is good. Kamil, can you re-test with TC5 - which should include the fixed NM - and close this bug if it works OK? Thanks!
"For anyone reproducing this - please note that the issue might be tied just to i386 systems."
My system is x86_64. I have seen this problem and was added to the CC list by abrt accordingly.
I have done many boots of Beta TC6 and I'm pretty sure this bug no longer affects the Anaconda installer. Closing.
*** Bug 863554 has been marked as a duplicate of this bug. ***