Bug 976067
| Field | Value |
|---|---|
| Summary | vmware tools automatically starts on non vmware machines |
| Product | [Fedora] Fedora |
| Component | glib2 |
| Version | 19 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED EOL |
| Severity | high |
| Priority | high |
| Reporter | Dan Mashal <dan.mashal> |
| Assignee | Matthias Clasen <mclasen> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | awreece, dan.mashal, johannbg, lnykryn, mark.harfouche, mclasen, msekleta, negativo17, plautrba, ravindrakumar, rjones, rob2098, rvokal, systemd-maint, vmware-gos-qa, vpavlin, zbyszek |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2015-02-17 15:38:28 UTC |
| Bug Blocks | 834091 |
Description
Dan Mashal
2013-06-19 20:21:55 UTC
Confirmed this is a wider bug than just open-vm-tools.

What does 'systemctl status vmtoolsd' say?

systemctl status vmtoolsd
vmtoolsd.service - Service for virtual machines hosted on VMware
   Loaded: loaded (/usr/lib/systemd/system/vmtoolsd.service; enabled)
   Active: inactive (dead)
           start condition failed at Wed 2013-06-19 13:07:55 PDT; 2h 24min ago
     Docs: http://open-vm-tools.sourceforge.net/about.php
Jun 18 20:14:09 Fedora19 systemd[1]: Started Service for virtual machines h...e.
Jun 18 22:49:50 Fedora19 systemd[1]: Started Service for virtual machines h...e.
Jun 19 01:01:31 Fedora19 systemd[1]: Started Service for virtual machines h...e.

Probably because I killed it.

Nope, open-vm-tools has two services: vmtoolsd, maintained by systemd, and "vmtoolsd -n vmusr", which is attached to a user login. The latter is not controlled by systemd; it starts with a desktop session and ends with the session, i.e. it starts/ends with each user login/logout.

Systemd does not start vmtoolsd in non-VMware environments (though the service is enabled, as you can see from "start condition failed at ..." in the service status message). Also, vmtoolsd is designed to exit if it is not running on a VMware VM. However, I've noticed that vmtoolsd gets stuck in a g_main_loop_unref() call. It appears to be a glib issue, but I'm not totally sure.

From the bug perspective, the "vmtoolsd -n vmusr" service is hung at cleanup, and it is safe to kill it. I'm investigating the root cause, so I will take it.

I did not intend to change the hardware field.

Following is the backtrace of a stuck "vmtoolsd -n vmusr":
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x0000003ac3a09bd7 in _L_lock_974 () from /lib64/libpthread.so.0
#2  0x0000003ac3a09b80 in __GI___pthread_mutex_lock (mutex=0xa981f0) at pthread_mutex_lock.c:104
#3  0x0000003ac5a872f1 in g_mutex_lock (mutex=mutex@entry=0xa997e0) at gthread-posix.c:210
#4  0x0000003ac5a455a8 in g_source_destroy_internal (source=0xa9bec0, context=context@entry=0xa997e0, have_lock=have_lock@entry=0) at gmain.c:1181
#5  0x0000003ac5a45b47 in g_main_context_unref (context=0xa997e0) at gmain.c:527
#6  0x0000003ac5a48558 in g_main_loop_unref (loop=0xa99a70) at gmain.c:3833
#7  0x000000000040443f in ToolsCoreCleanup (state=0x60a4e0 <gState>) at mainLoop.c:66
#8  ToolsCoreRunLoop (state=0x60a4e0 <gState>) at mainLoop.c:240
#9  0x00000000004039cb in main (argc=3, argv=0xa93330, envp=0x7fff4c6e88b8) at mainPosix.c:233
(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0x7f159f9c1740 (LWP 3566) __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
(gdb) f 2
#2  0x0000003ac3a09b80 in __GI___pthread_mutex_lock (mutex=0xa981f0) at pthread_mutex_lock.c:104
104       LLL_MUTEX_LOCK (mutex);
(gdb) p *mutex
$1 = {
  __data = {
    __lock = 2,
    __count = 0,
    __owner = 3566,
    __nusers = 0,
    __kind = 3,
    __spins = 0,
    __list = {
      __prev = 0x0,
      __next = 0x0
    }
  },
  __size = "\002\000\000\000\000\000\000\000\356\r\000\000\000\000\000\000\003", '\000' <repeats 22 times>,
  __align = 2
}

Thread 3566 holds the lock on the mutex and is trying to lock it again. Unless it is a recursive mutex (PTHREAD_MUTEX_RECURSIVE), this deadlocks the caller, which is what is happening here. Additionally, vmtoolsd doesn't manage this mutex. As it is a glib-managed mutex, I think it is some problem with glib or with the way vmtoolsd is using the glib GMainLoop.

There were changes to mutexes in glib, but let's not go that route if possible; that is a dead end.

I have a question regarding bug 834091.
Given the overall situation above, would you still block bug 834091 on this bug?

Yes. It's not a blocker, it's an FE.

What's the output of systemd-detect-virt?

$ systemd-detect-virt
oracle

I can't see anything that was a bug in systemd here. The service in question is not a systemd service, right? And the systemd service got properly excluded from starting. Reassigning.

Turns out this affects RHEL 7 as well, making it quite a serious bug.

What's the status of this? It appears to be a bug in vmtoolsd, right?

It appears to be an issue with glib; please see my comment #c6. I will try to debug it again and see if I find anything to be fixed in vmtoolsd.

I can reproduce this behavior with the vmtoolsd from open-vm-tools on Solaris:
$ mdb /var/crash/core.vmtoolsd.10213.1389811413
Loading modules: [ libc.so.1 ld.so.1 ]
> ::stack
libc.so.1`_lwp_kill+0x15(1, 6, 8047b58, fed74000, fed74000, 1)
libc.so.1`raise+0x2b(6, 0, 8047b40, fecc93db, 0, 0)
libc.so.1`abort+0x10e(fed75c20, feab8850, feab88bf, fed088f1, 1, feacd668)
libglib-2.0.so.0.3400.1`g_mutex_impl_new(2d, feab88bf, 8047bc8, 0, 8, 806d0a0)
libglib-2.0.so.0.3400.1`g_mutex_lock+0x43(806da00, 806da00, 1, 4, 8051bf7, feace824)
libglib-2.0.so.0.3400.1`g_source_destroy_internal+0x23(8075230, 806da00, 0, 0, 0, 8075230)
libglib-2.0.so.0.3400.1`g_main_context_unref+0x114(806da00, 4, 2000, 806a488, feffb0a4, 8047d24)
libglib-2.0.so.0.3400.1`g_main_loop_unref+0xa0(806a488, 80696a0, 8074f00, 806a488, 8047ca8, feffb0a4)
ToolsCoreCleanup+0x5d(80696a0, 80545e4, 80696a0, 80696a0, 0, 0)
ToolsCoreRunLoop+0x12b(80696a0, fefc3330, fefa0500, 228, 8055114, 0)
ToolsCore_Run+0x72(80696a0, 1, 0, 0, 8047d24, 8047db4)
main+0x553(1, 806a980, 8047d70, feffb0a4, 8047d5c, 8053b02)
_start+0x83(1, 8047e24, 0, 8047e2d, 8047e41, 8047e51)
>
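The gdb analysis earlier in this report describes a thread relocking a non-recursive mutex that it already owns. The following is a minimal sketch of that failure mode using plain pthreads. It is not glib or vmtoolsd code: the helper name `relock_result` is hypothetical, and an error-checking mutex (PTHREAD_MUTEX_ERRORCHECK) is an assumption chosen so that the second lock reports EDEADLK instead of hanging the way the real process did.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock a mutex of the given type twice from the
 * same thread and return the result of the second lock attempt. */
static int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int first, second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    first = pthread_mutex_lock(&mutex);   /* the owner takes the lock */
    assert(first == 0);
    second = pthread_mutex_lock(&mutex);  /* the owner locks it again */

    /* Unlock once per successful lock before destroying the mutex. */
    if (second == 0)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second;
}
```

Calling `relock_result(PTHREAD_MUTEX_ERRORCHECK)` returns EDEADLK, while `relock_result(PTHREAD_MUTEX_RECURSIVE)` returns 0 because a recursive mutex allows its owner to lock it again. With a mutex type that is neither recursive nor error-checking, the second lock would simply block, which matches the hang at `__lll_lock_wait` in the Linux backtrace.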
This is not an open-vm-tools bug. This is a lock-related bug in glib 2.36. It has been fixed by the patch at https://bug697595.bugzilla-attachments.gnome.org/attachment.cgi?id=241179. I found that the fix is available in glib 2.37.0 and later versions.

Given that this is a glib issue, either glib needs to be upgraded to version 2.37.0 (or later), or this patch (https://bug697595.bugzilla-attachments.gnome.org/attachment.cgi?id=241179) should be applied to the glib 2.36.3 source included in Fedora 19.

awreece, what is the glib version you are using?

I verified this on Fedora 20, which contains glib 2.38.2, and I could not reproduce it. So, this bug has been fixed in Fedora 20.

Based on my previous two comments (#c16 and #c17), I'm moving this bug to glib2. I would suggest that the glib2 packager pull the following fix into glib2: https://bug697595.bugzilla-attachments.gnome.org/attachment.cgi?id=241179.

*** Bug 1008114 has been marked as a duplicate of this bug. ***

Ravindra: indeed, I suspect it's a glib bug. I am using glib 2.34.1.

This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.