Bug 1627073
| Summary: | [ESXi][RHEL7.6][abrt] [faf] open-vm-tools: _g_log_abort(): /usr/bin/vmtoolsd killed by 5 | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Vladimir Benes <vbenes> | |
| Component: | open-vm-tools | Assignee: | Cathy Avery <cavery> | |
| Status: | CLOSED ERRATA | QA Contact: | Bo Yang <boyang> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 7.6 | CC: | cavery, jen, jjarvis, jomurphy, jsavanyo, kkong, ldu, leiwang, linl, mkyral, ravindrakumar, ribarry, vbenes, vmware-gos-qa, yacao, ybhasin | |
| Target Milestone: | rc | Keywords: | Regression, TestOnly | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| URL: | https://faf.lab.eng.brq.redhat.com/faf/reports/bthash/42c958988dfc6e43f999e955816e95c3c51e2798/ | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1628959 | Environment: | ||
| Last Closed: | 2019-08-06 12:57:03 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1667549 | |||
| Bug Blocks: | 1628959 | |||
Description
Vladimir Benes
2018-09-10 11:44:37 UTC
@Vladimir,

I am not familiar with this tool. Did you run it? What was the last rev of open-vm-tools where it passed?

Thanks,

Cathy

(In reply to Cathy Avery from comment #4)
> @Vladimir,
>
> I am not familiar with this tool. Did you run it? What was the last rev of
> open-vm-tools where it passed?
>
> Thanks,
>
> Cathy

Click the link in comment #0; you can see more info there (date of occurrence, RHEL version affected, etc.).

More details (like the backtrace) should be in ftp://bordell.englab.brq.redhat.com/pub/open-vm-tools-10.2.5-2.el8-ccpp-2018-08-23-20-56-13-27328.tar.gz (also linked there).

No idea why we have just el8 info. Martin, can we dig up some more el7 info too? Versions in el7.6 and el8 are now the same, so I think you have to fix that crash for el8 anyway.

And no, I didn't run it, I just saw the crash (as it's reported automatically when it's enabled in beaker automated runs). You can check various problems coming to the FAF server here: https://faf.lab.eng.brq.redhat.com/faf/problems/

Vladimir

Yes, please give us more info on RHEL7 so we can determine if this is indeed a RHEL7 regression.

Thanks,

Cathy

If this crash is from the "vmtoolsd -n vmusr" service, then it might be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1530902. You should be able to compare the backtrace to confirm this.

@Ravindra Here's the back trace.
It looks the same as https://bugzilla.redhat.com/show_bug.cgi?id=1530902

| Frame | Function | Binary | Offset |
|---|---|---|---|
| 1 | _g_log_abort | /usr/lib64/libglib-2.0.so.0.5600.1 | 0x51b11 |
| 2 | g_log_writer_default | /usr/lib64/libglib-2.0.so.0.5600.1 | 0x543d2 |
| 3 | g_log_structured_array | /usr/lib64/libglib-2.0.so.0.5600.1 | 0x52685 |
| 4 | g_log_structured_standard | /usr/lib64/libglib-2.0.so.0.5600.1 | 0x5316e |
| 5 | _gdk_x11_display_error_event | /usr/lib64/libgdk-3.so.0.2200.30 | 0x612f7 |
| 6 | gdk_x_error | /usr/lib64/libgdk-3.so.0.2200.30 | 0x6de09 |
| 7 | _XError | /usr/lib64/libX11.so.6.3.0 | 0x4507b |
| 8 | handle_error | /usr/lib64/libX11.so.6.3.0 | 0x420d7 |
| 9 | _XReply | /usr/lib64/libX11.so.6.3.0 | 0x43193 |
| 10 | XGetWindowProperty | /usr/lib64/libX11.so.6.3.0 | 0x2923f |
| 11 | XFetchName | /usr/lib64/libX11.so.6.3.0 | 0x23ae0 |
| 12 | X11Lock_Init | /usr/lib64/open-vm-tools/plugins/vmusr/libdesktopEvents.so | 0x1c01 |
| 13 | ToolsOnLoad | /usr/lib64/open-vm-tools/plugins/vmusr/libdesktopEvents.so | 0x1789 |
| 14 | ToolsCore_LoadPlugins | /usr/bin/vmtoolsd | 0x5e7d |
| 15 | ToolsCoreRunLoop | /usr/bin/vmtoolsd | 0x4cc8 |
| 16 | main | /usr/bin/vmtoolsd | 0x3f6a |

OK, so BZ 1530902 is reported against Fedora 27.

Ravindra, can you answer a couple of questions for me? We're at the end of the 7.6 release and we may need to determine if this bug is a legitimate blocker for 7.6 (i.e., is it a severe regression from 7.5).

1. Do you know if this bug was introduced in open-vm-tools 10.2.*? (RHEL 7.5 contains open-vm-tools 10.1.3.)
2. What's the customer impact if this problem occurs? Does it self-heal, require customer action or a support call?
3. How often is this problem seen?

(In reply to Rick Barry from comment #12)
> 1. Do you know if this bug was introduced in open-vm-tools 10.2.*? (rhel 7.5 contains open-vm-tools 10.1.3)?

My understanding is that this bug gets exposed by the system environment. The faulting code is pretty much the same between the 10.2 and 10.1 versions. Our understanding is that this issue is mostly seen on systems with Wayland, because the faulting code is not yet enhanced for Wayland. AFAIK, RHEL 7 does not use Wayland by default.
Am I right? Are you seeing this on RHEL 7 as well? Is RHEL 7.6 using Wayland by default now? If you have a consistent repro, it might help us triage this issue better. I'm including Kevin Kong as he has been looking into this issue.

> 2. What's the customer impact if this problem occurs?

This will affect the "vmtoolsd -n vmusr" process, which is responsible for drag/drop and copy/paste; these are primarily UI operations. So, it does not affect headless deployments.

> Does it self-heal, require customer action or a support call?

There is no self-heal, as "vmtoolsd -n vmusr" will crash. We haven't had any customer calls on this so far. But there are several Fedora reports. Please note that Fedora switched to Wayland a while back and we started seeing these.

> 3. How often is this problem seen?

As you can see in the Fedora reports, there are several reports of "vmtoolsd -n vmusr" crashes. The frequency of reports seems to have gone down after we applied the following patch - https://src.fedoraproject.org/rpms/open-vm-tools/c/fec5bc0e046ed3f7f3563fca3ea41ba8e31cf406?branch=master. This patch is included in open-vm-tools 10.2.5, which you are using.

Rick, I'm sorry, I forgot to mention that we also upgraded open-vm-tools to 10.3.0 in Fedora, and that has the following fix - https://github.com/vmware/open-vm-tools/commit/c80bb3fc7960bc78a6d39c89b6952218a401b0cf#diff-5cca602d926ceb52deb9a46b9b017c16 - which potentially fixes this problem. So, I'd suggest either trying 10.3.0 if possible or trying the patch above to see if the problem gets solved. Also adding the needInfo I removed accidentally.

Given the drastic reduction in the failures after rebasing to 10.3.0, I'm going to mark the Fedora bug 1530902 as fixed with the 10.3.0 rebase.

(In reply to Vladimir Benes from comment #5)
> (In reply to Cathy Avery from comment #4)
> > @Vladimir,
> >
> > I am not familiar with this tool. Did you run it? What was the last rev of
> > open-vm-tools where it passed?
> >
> > Thanks,
> >
> > Cathy
>
> Click the link in comment #0, you can see more info (date of occurrence,
> RHEL version affected, etc)
>
> More details (like backtrace) should be in
> ftp://bordell.englab.brq.redhat.com/pub/open-vm-tools-10.2.5-2.el8-ccpp-2018-08-23-20-56-13-27328.tar.gz
> (also linked there)
>
> No idea why we have just el8 info, Martin can we dig some more el7 info too?
> Versions in el7.6 and el8 are now the same so I think you have to fix that
> crash for el8 anyway.

Unfortunately, there is no crash data from RHEL-7 yet. The only information I can see is that on RHEL-7 the segfault occurs way more frequently (which in turn may be caused by the higher testing effort on RHEL-7.6, which is nearing the end of its testing phase).

Thanks for the additional comments 10 - 17. Since this issue has existed in open-vm-tools 10.1.3 (RHEL 7.5) and may be partially fixed in 10.2.5 (RHEL 7.6), this should not be a blocker for 7.6.

This may not be a regression, as I see a large number of vmtoolsd traces in RHEL 7.5 (not exactly the same trace between releases, though):
https://faf.lab.eng.brq.redhat.com/faf/reports/4504/
https://bugzilla.redhat.com/show_bug.cgi?id=1452100

open-vm-tools 10.3.0 has the fix, but it's too late to back-port this for RHEL 7.6 at this stage of the release. I'll clone this bug for RHEL 8.0 (where we'll be rebasing open-vm-tools to 10.3.0) and move this bug to 7.7, where we can back-port the fix.

WW08.5 UPDATE: Checked this issue in Platform QE FAF for RHEL7.6 over the past month; it occurred again on Feb 06. Based on comment 18, waiting for the back-port.

BR, BO

WW20.2 UPDATE: In Platform QE FAF, filtered on RHEL7.7 with its released OVT: https://faf.lab.eng.brq.redhat.com/faf/summary/?opsysreleases=48&component_names=open-vm-tools&daterange=2019-04-15%3A2019-05-14&resolution=d

Changing "Status" to "VERIFIED". If you have any questions or requirements, you can update the "Status" again.
BR, BO

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2139
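
The thread suggests comparing this backtrace with the one in bug 1530902 to confirm that both crashes are the same. A minimal sketch of how that comparison could be scripted is shown here, assuming each frame is written as "index function /path/to/binary 0xoffset" as in the trace pasted above. The names `Frame`, `parse_backtrace` and `same_call_chain` are hypothetical helpers for illustration and are not part of FAF, abrt or open-vm-tools.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    index: int
    function: str
    binary: str
    offset: int


# Assumed frame layout: "<index> <function> </path/to/binary> <0xoffset>".
# Works whether the frames are on separate lines or flattened onto one line.
FRAME_RE = re.compile(r"\b(\d+)\s+(\S+)\s+(/\S+)\s+(0x[0-9a-fA-F]+)")


def parse_backtrace(text: str) -> List[Frame]:
    """Extract every frame found in the given text, in order of appearance."""
    return [
        Frame(int(idx), func, binary, int(offset, 16))
        for idx, func, binary, offset in FRAME_RE.findall(text)
    ]


def same_call_chain(a: List[Frame], b: List[Frame],
                    depth: Optional[int] = None) -> bool:
    """Compare two traces by function name only.

    Offsets and library versions are ignored because they change between
    builds; `depth` limits the comparison to the innermost frames.
    """
    names_a = [f.function for f in a][:depth]
    names_b = [f.function for f in b][:depth]
    return bool(names_a) and names_a == names_b
```

To use it, pass the text of both backtraces through `parse_backtrace` and hand the results to `same_call_chain`. Comparing function names rather than offsets is a deliberate choice here, since the two reports come from different distributions and builds.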