Bug 1963032
| Summary: | qemu-ga sometimes fails to install at firstboot: Failed to grab execution mutex. System error 258. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | mxie <mxie> |
| Component: | virt-v2v | Assignee: | Richard W.M. Jones <rjones> |
| Status: | CLOSED MIGRATED | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 9.0 | CC: | chhu, juzhou, lersek, mzhan, rjones, tyan, tzheng, vwu, xiaodwan, yvugenfi |
| Target Milestone: | beta | Keywords: | MigratedToJIRA, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | x86_64 | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-09-22 17:36:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Attachments: | |||
|
Description
mxie@redhat.com
2021-05-21 08:08:08 UTC
Created attachment 1785441 [details]
msiexec-error-windows-log.png
Created attachment 1785442 [details]
esx7.0-win2016-date-yyyy-m-d-china-rhel9.0.log
Apparently the problem means that another instance of msiexec is running at the same time, and rather than do anything sensible only one is allowed to run and the other fails. Of course the question then becomes what other instance of msiexec is running?

Is this a regression over RHEL 8? I can't imagine what has changed that would stop this from working between RHEL 8 and RHEL 9.

What kind of installer runs during v2v? Is it https://github.com/virtio-win/virtio-win-guest-tools-installer or qemu-ga installer separately?

It's run via virt-v2v, and the command run is:

qemu-ga-x86_64.msi /forcerestart /qn /l+*vx C:\qemu-ga.log

qemu-ga-x86_64.msi comes from virtio-win-1.9.15-3.el9.noarch (filename: /guest-agent/qemu-ga-x86_64.msi)

Looking at the log file, I suspect what could be happening here is that the instance of MsiExec that we started running 60 seconds earlier to uninstall Vmware-Tools is still running.

I wonder if there's a way to "chain" MsiExec jobs so that one runs after another, and the whole lot only runs after the network has come up.

> It's run via virt-v2v,

What I mean by this is virt-v2v sets it up, and it runs once the first time the guest is booted on KVM.
(In reply to Richard W.M. Jones from comment #6)
> > It's run via virt-v2v,
>
> What I mean by this is virt-v2v sets it up, and it runs once the first
> time the guest is booted on KVM.

Sure.

(In reply to Richard W.M. Jones from comment #5)
> It's run via virt-v2v, and the command run is:
>
> qemu-ga-x86_64.msi /forcerestart /qn /l+*vx C:\qemu-ga.log
>
> qemu-ga-x86_64.msi comes from virtio-win-1.9.15-3.el9.noarch
> (filename: /guest-agent/qemu-ga-x86_64.msi)
>
> Looking at the log file, I suspect what could be happening here
> is that the instance of MsiExec that we started running 60 seconds
> earlier to uninstall Vmware-Tools is still running.
>
> I wonder if there's a way to "chain" MsiExec jobs so that one
> runs after another, and the whole lot only runs after the network
> has come up.

You can try to create a wrapper batch file and use "start /wait" for each command that triggers the execution of MsiExec.

(In reply to Richard W.M. Jones from comment #3)
> Apparently the problem means that another instance of msiexec is
> running at the same time, and rather than do anything sensible
> only one is allowed to run and the other fails. Of course the question
> then becomes what other instance of msiexec is running?
>
> Is this a regression over RHEL 8? I can't imagine what has changed
> that would stop this from working between RHEL 8 and RHEL 9.

I didn't hit this bug when I tested bug 1895323 with virt-v2v-1.42.0-6.module+el8.4.0+8855; please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1895323#c15

I downgraded virt-v2v and libguestfs to 1.42.0-6.module+el8.4.0+8855 and converted the same guest as in this bug; qemu-ga installed normally after the v2v conversion. Comparing the debug logs of virt-v2v-1.42.0-6 and virt-v2v-1.44.0-1.el9.1, I found that the content of install-qemu-ga-x86_64-msi.bat differs slightly between them:

# cat esx7.0-win2016-date-yyyy-m-d-china-v2v-libguestfs-1.42.0-6-virtio-win-1.9.16-2.log |grep install-qemu-ga-x86_64-msi.bat
libguestfs: trace: v2v: write "/Program Files/Guestfs/Firstboot/scripts/0002-install-qemu-ga-x86_64-msi.bat" "echo Removing any previously scheduled qemu-ga installation\x0d\x0aschtasks.exe /Delete /TN Firstboot-qemu-ga /F\x0d\x0aecho Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi\x0d\x0apowershell.exe -command "$d = (get-date).AddSeconds(120); schtasks.exe /Cre"<truncated, original size 446 bytes>
libguestfs: trace: v2v: internal_write "/Program Files/Guestfs/Firstboot/scripts/0002-install-qemu-ga-x86_64-msi.bat" "echo Removing any previously scheduled qemu-ga installation\x0d\x0aschtasks.exe /Delete /TN Firstboot-qemu-ga /F\x0d\x0aecho Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi\x0d\x0apowershell.exe -command "$d = (get-date).AddSeconds(120); schtasks.exe /Cre"<truncated, original size 446 bytes>

# cat esx7.0-win2016-date-yyyy-m-d-china-rhel9.0.log |grep install-qemu-ga-x86_64-msi.bat
libguestfs: trace: v2v: write "/Program Files/Guestfs/Firstboot/scripts/0001-install-qemu-ga-x86_64-msi.bat" "echo Removing any previously scheduled qemu-ga installation\x0d\x0aschtasks.exe /Delete /TN Firstboot-qemu-ga /F\x0d\x0aecho Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi\x0d\x0apowershell.exe -command "$d = (get-date).AddSeconds(120); $FormatHack = ($("<truncated, original size 581 bytes>
libguestfs: trace: v2v: internal_write "/Program Files/Guestfs/Firstboot/scripts/0001-install-qemu-ga-x86_64-msi.bat" "echo Removing any previously scheduled qemu-ga installation\x0d\x0aschtasks.exe /Delete /TN Firstboot-qemu-ga /F\x0d\x0aecho Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi\x0d\x0apowershell.exe -command "$d = (get-date).AddSeconds(120); $FormatHack = ($("<truncated, original size 581 bytes>

Created attachment 1786835 [details]
esx7.0-win2016-date-yyyy-m-d-china-v2v-libguestfs-1.42.0-6-virtio-win-1.9.16-2.log
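The "start /wait" wrapper suggested earlier in this thread could look roughly like the sketch below. It is written in Python purely for illustration and only generates the text of a batch file; it is not virt-v2v's code. The function name `build_wrapper_batch`, the explicit `msiexec /i` form of the qemu-ga command, and the VMware Tools uninstall command with its placeholder product code are all assumptions and do not come from the logs in this bug.

```python
def build_wrapper_batch(commands):
    """Return the text of a batch file that runs each command in turn.

    `start /wait` makes cmd.exe block until the started program exits,
    so the next msiexec invocation cannot begin while the previous one
    is still running. The empty "" is the window-title argument that
    `start` expects before the command.
    """
    lines = ["@echo off"]
    for cmd in commands:
        lines.append(f'start /wait "" {cmd}')
    # Batch files conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    script = build_wrapper_batch([
        # Hypothetical uninstall command; the real product code is not in this bug.
        "msiexec /x {PLACEHOLDER-PRODUCT-CODE} /qn",
        # Assumed explicit form of the qemu-ga install command quoted in this bug.
        r"msiexec /i qemu-ga-x86_64.msi /forcerestart /qn /l+*vx C:\qemu-ga.log",
    ])
    print(script)
```

Putting both MsiExec invocations in one wrapper serialises them, which is the point of the suggestion. It does not address the separate wish to wait until the network is up.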
mxie: I suspect there is simply a race between the two runs of MsiExec.

Yan is correct here that we need to go about this differently and his suggestion of using start /wait is what I'll try. Still looking for a way to wait until the network is available however.

FWIW my analysis of the two logs ...

In your log (virt-v2v 1.42):

0001-configure-rhev-apt.bat - installs rhev-apt immediately
0002-install-qemu-ga-x86_64-msi.bat - after 120 seconds installs qemu-ga via msiexec
0003-uninstall-VMware-Tools.bat - removes VMware tools immediately via msiexec

In the current bug (virt-v2v 1.44):

0001-install-qemu-ga-x86_64-msi.bat - after 120 seconds installs qemu-ga via msiexec
0002-uninstall-VMware-Tools.bat - removes VMware tools immediately via msiexec

The difference is because in virt-v2v 1.44 we no longer try to install rhev-apt, since it's obsolete.

(In reply to Richard W.M. Jones from comment #10)
> mxie: I suspect there is simply a race between the two runs of MsiExec.
>
> Yan is correct here that we need to go about this differently and his
> suggestion of using start /wait is what I'll try. Still looking for a
> way to wait until the network is available however.
>
> FWIW my analysis of the two logs ...
>
> In your log (virt-v2v 1.42):
>
> 0001-configure-rhev-apt.bat - installs rhev-apt immediately
> 0002-install-qemu-ga-x86_64-msi.bat - after 120 seconds installs qemu-ga via
> msiexec
> 0003-uninstall-VMware-Tools.bat - removes VMware tools immediately via
> msiexec
>
> In the current bug (virt-v2v 1.44):
>
> 0001-install-qemu-ga-x86_64-msi.bat - after 120 seconds installs qemu-ga via
> msiexec
> 0002-uninstall-VMware-Tools.bat - removes VMware tools immediately via
> msiexec
>
> The difference is because in virt-v2v 1.44 we no longer try to install
> rhev-apt, since it's obsolete.

Is this something we can mitigate (perhaps with a delay/sleep), or perhaps something for us to document and move on?
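One way to picture the delay/sleep style of mitigation is a retry loop around the installer, sketched below in Python. This is an illustration only, not something virt-v2v does. It assumes the failed run exits with Windows Installer code 1618 (ERROR_INSTALL_ALREADY_RUNNING), which is the code normally paired with the "Failed to grab execution mutex" message but is not shown in this bug's logs. The name `run_with_retry` and the default attempt count and delay are hypothetical.

```python
import subprocess
import time

# Windows Installer exit code for "another installation is already in progress".
# Assumed to be what the failing msiexec run returns; not taken from this bug's logs.
ERROR_INSTALL_ALREADY_RUNNING = 1618


def run_with_retry(argv, attempts=5, delay_seconds=30,
                   run=subprocess.call, sleep=time.sleep):
    """Run argv, retrying while the exit code says the installer is busy.

    Returns the last exit code seen. `run` and `sleep` are injectable so
    the retry logic can be exercised without a real msiexec.
    """
    code = None
    for attempt in range(attempts):
        code = run(argv)
        if code != ERROR_INSTALL_ALREADY_RUNNING:
            return code
        # Sleep only if another attempt will follow.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return code
```

A call might look like `run_with_retry(["msiexec", "/i", "qemu-ga-x86_64.msi", "/qn"])`. A fixed delay only narrows the race; it does not give the ordering guarantees that sequencing the firstboot scripts would.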
msiexec is a piece of junk that causes these kinds of random problems (see also uninstall VMware Tools not working). But also we need to rewrite firstboot so it can run scripts in sequence and with dependencies between scripts (bug 1788823) which would help a lot here.

(In reply to Richard W.M. Jones from comment #13)
> msiexec is a piece of junk that causes these kinds of random problems (see
> also uninstall VMware Tools not working). But also we need to rewrite
> firstboot so it can run scripts in sequence and with dependencies between
> scripts (bug 1788823) which would help a lot here.

Considering the efforts in getting v2v 2.0 on RHEL-9, I'd suggest closing this one as WONTFIX and tracking the improvement of the use case described here in Bug 1788823...

Rich - the stale date is approaching (strange that the bot didn't fire). Do you believe we should extend the stale date or just decide this won't be fixed? I see the bug Klaus mentions above is in Verified, but I'm unclear how that affects what's described here though.

Kill stale bug stuff. We still have to work on scheduling firstboot scripts on Windows which is a long term effort.

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information. |