Bug 865009
Summary: | GString mem alloc crashes after dbus op | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Tomáš Bžatek <tbzatek>
Component: | NetworkManager | Assignee: | Dan Williams <dcbw>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 18 | CC: | awilliam, danw, dcbw, dcharlespyle, jklimes, kparal, mikhail.v.gavrilov, psimerda, robatino, sergei.litvinenko, tsmetana
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | AcceptedNTH | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-10-22 12:41:41 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 752660, 752664 | |
Attachments: | | |
Description
Tomáš Bžatek
2012-10-10 15:39:19 UTC
Created attachment 624964 [details]
backtrace
Created attachment 624965 [details]
backtrace
Forgot to add this is a clean F18 (Alpha) install, i686 32-bit PAE mode, SELinux in enforcing mode.
One physical Ethernet NIC.
> # ip link sh
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN mode DEFAULT
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 2: p2p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
> link/ether 00:xx:xx:xx:xx:92 brd ff:ff:ff:ff:ff:ff
Both traces are /usr/sbin/NetworkManager processes. Since it's started on demand, I'm using the following script to attach to a newly started instance:
> (while [[ -z `pidof NetworkManager` ]]; do sleep 1; done) && gdb /usr/sbin/NetworkManager `pidof NetworkManager`
Created attachment 624981 [details]
valgrind log, try #1
valgrind log, still using GSlice; will try to catch a better one
Okay, with G_SLICE=always-malloc set, the daemon no longer seems to crash. Allocator fault perhaps?
Created attachment 624985 [details]
valgrind log, try #2, with G_SLICE=always-malloc
Another valgrind log, this time the process didn't crash, terminated manually with Ctrl+C. Attaching just for curiosity and comparison.
Thanks; likely a mixup of g_array_unref() used on a GValueArray instead of g_value_array_unref() in src/nm-dispatcher.c.
(In reply to comment #8)
> Thanks; likely a mixup of g_array_unref() used on a GValueArray instead of
> g_value_array_unref() in src/nm-dispatcher.c.
By which I mean g_value_array_free().
Created attachment 625498 [details]
Free GValueArray with g_value_array_free ()
Use g_value_array_free () as Dan suggests. g_array_unref () seems to make harm to heap.
I'm not sure about the other "Invalid read" in foreach_route_cb (nm-netlink-utils.c:410). The code looks OK at first glance. Maybe nl_addr_get_binary_addr () returns a bad addr? Anyway, this one shouldn't cause any harm.
looks good
(In reply to comment #10)
> Created attachment 625498 [details]
> Free GValueArray with g_value_array_free ()
>
> Use g_value_array_free () as Dan suggests. g_array_unref () seems to make
> harm to heap.
Looks like this did the trick, thanks! I can't seem to make it crash again with the patch applied.
Patch from comment #10 pushed to upstream master: b95b6c8aa1b2e2d6a662e93843e50b50d5a9c6c6
NetworkManager-0.9.7.0-6.git20121004.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/NetworkManager-0.9.7.0-6.git20121004.fc18
*** Bug 865042 has been marked as a duplicate of this bug. ***
Package NetworkManager-0.9.7.0-6.git20121004.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.7.0-6.git20121004.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-16127/NetworkManager-0.9.7.0-6.git20121004.fc18
then log in and leave karma (feedback).
*** Bug 863544 has been marked as a duplicate of this bug. ***
*** Bug 866434 has been marked as a duplicate of this bug. ***
Jirka, please transfer Blocks field.
Doing that now.
*** Bug 864300 has been marked as a duplicate of this bug. ***
For anyone reproducing this - please note that the issue might be tied just to i386 systems.
Discussed at 2012-10-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-qa/2012-10-17/f18beta-blocker-review-4.2012-10-17-16.00.log.txt . Note that this bug can potentially affect installation - see #866434 - which is why it's proposed as a blocker.
We agreed the bug constitutes a conditional violation of the criteria - when it affects installation, install crashes, which is obviously against the criteria. However, it seems to affect only 32-bit installs and does so only occasionally: 2 in 5 tries for kparal, but 1 in 20 tries for Jirka. On the basis that you can just restart and try again, and you should be able to get it to go through after a couple of tries, this is rejected as a blocker. It is accepted as NTH, as obviously install crashers are worth fixing post-freeze. If we get data indicating this might be more than just an occasional problem, it can be re-proposed as a blocker, but it's very likely to be fixed in future builds anyway since it's been accepted as NTH.
I'm sorry I couldn't attend yesterday's blocker bug meeting. But I have to re-propose this as a Beta blocker, because this is one of the biggest blockers I have ever seen. I have learned that this manifests only on 32bit systems and that's the reason why my colleagues weren't able to reproduce it. But I performed about 30 Anaconda boots yesterday on two different 32bit bare metal machines and I see an 80-90% failure rate. Moreover, in the remaining 10-20% of cases, where it successfully boots to the installer, it's even worse, because it causes so many weird things to happen; just look at the duplicates of bug 866434: no network devices, installation hangs, invalid hostname, etc.
On that basis I change vote to +1 blocker.
Well, the update has gone stable now, so we could in fact close this bug. But let's wait for TC5 to verify that the fix is good.
Kamil, can you re-test with TC5 - which should include the fixed NM - and close this bug if it works OK? Thanks!
"For anyone reproducing this - please note that the issue might be tied just to i386 systems."
My system is x86_64. I have seen this problem and was added to the CC list by abrt accordingly.
I have done many boots of Beta TC6 and I'm pretty sure this bug no longer affects Anaconda installer. Closing.
*** Bug 863554 has been marked as a duplicate of this bug. ***
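For readers following the fix discussed in this report (freeing a GValueArray with g_value_array_free () rather than g_array_unref ()), here is a minimal, GLib-free sketch of the underlying rule: a container must be released with the destructor that belongs to its own type. Everything in it is an assumption made for illustration. ToyArray, ToyValueArray and all toy_* functions are hypothetical stand-ins, not the real GLib types or the actual NetworkManager patch, and the struct layouts are deliberately simplified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for illustration only.
 * These are NOT the real GLib GArray / GValueArray definitions. */
typedef struct {
    char *data;       /* one flat buffer of raw bytes */
    int   ref_count;  /* storage is released when this drops to zero */
} ToyArray;

typedef struct {
    unsigned n_values;  /* number of elements in use */
    unsigned capacity;  /* number of slots allocated */
    char   **values;    /* each element is a separately owned string */
} ToyValueArray;

/* Counts outstanding allocations so a caller can check for leaks. */
static int toy_live_allocs = 0;

static void *toy_alloc(size_t n)
{
    void *p = calloc(1, n);
    if (p == NULL)
        abort();
    toy_live_allocs++;
    return p;
}

static void toy_free(void *p)
{
    if (p != NULL) {
        toy_live_allocs--;
        free(p);
    }
}

ToyArray *toy_array_new(size_t bytes)
{
    ToyArray *a = toy_alloc(sizeof *a);
    a->data = toy_alloc(bytes ? bytes : 1);
    a->ref_count = 1;
    return a;
}

/* Destructor for ToyArray only: it reads and writes ref_count and data,
 * fields that a ToyValueArray does not have. Passing a ToyValueArray
 * here (via a cast) would be undefined behaviour. */
void toy_array_unref(ToyArray *a)
{
    assert(a != NULL && a->ref_count > 0);
    if (--a->ref_count == 0) {
        toy_free(a->data);
        toy_free(a);
    }
}

ToyValueArray *toy_value_array_new(unsigned capacity)
{
    ToyValueArray *va = toy_alloc(sizeof *va);
    va->values = toy_alloc((capacity ? capacity : 1) * sizeof *va->values);
    va->capacity = capacity;
    return va;
}

/* Appends a private copy of s; returns 0 on success, -1 if the array is full. */
int toy_value_array_append(ToyValueArray *va, const char *s)
{
    size_t n;
    char *copy;

    assert(va != NULL && s != NULL);
    if (va->n_values == va->capacity)
        return -1;
    n = strlen(s) + 1;
    copy = toy_alloc(n);
    memcpy(copy, s, n);
    va->values[va->n_values++] = copy;
    return 0;
}

/* Matching destructor for ToyValueArray: releases every element,
 * then the slot table, then the struct itself. */
void toy_value_array_free(ToyValueArray *va)
{
    unsigned i;

    assert(va != NULL);
    for (i = 0; i < va->n_values; i++)
        toy_free(va->values[i]);
    toy_free(va->values);
    toy_free(va);
}
```

A caller would pair toy_value_array_new () with toy_value_array_free () and toy_array_new () with toy_array_unref (), then check that toy_live_allocs is back to zero. The sketch does not attempt to reproduce the crash itself, since calling the wrong destructor is undefined behaviour; it only shows why the two destructors are not interchangeable.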