Bug 1465849
Summary: | v2v hangs on removing vmware-tool
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Jaroslav Spanko <jspanko>
Component: | libguestfs
Assignee: | Richard W.M. Jones <rjones>
Status: | CLOSED ERRATA
QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | high
Docs Contact: |
Priority: | high
Version: | 7.4
CC: | cww, jcoscia, jspanko, juzhou, mtessun, mxie, mzhan, ptoscano, tzheng
Target Milestone: | rc
Target Release: | ---
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: | V2V
Fixed In Version: | libguestfs-1.40.1-1.el7
Doc Type: | If docs needed, set a value
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2019-08-06 12:44:11 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: | 1477905, 1621895
Bug Blocks: | 1420851
Attachments: |
Description
Jaroslav Spanko
2017-06-28 11:05:09 UTC
Can you please run virt-v2v with the -v -x parameters added, and provide the full log? I.e. something like:

    $ virt-v2v -v -x [all the other parameters already used] 2>&1 | tee v2v.log

Uploaded, thanks.

Created attachment 1292613 [details]
v2v hangs log

Ehm, you forgot -x too... Also I don't think we have the Perl script. Could you download it and attach it to the bug:

    virt-copy-out -a guest /usr/bin/vmware-uninstall-tools.pl /tmp

and upload /tmp/vmware-uninstall-tools.pl

See also comment 5.

I am not able to provide /usr/bin/vmware-uninstall-tools.pl now. I will try to reproduce this, as we hit it every time during the remote session with the customer. Thanks

Attached the pl script. I am trying to reproduce it, but so far without success. Thanks!

Created attachment 1299845 [details]
uninstall script
I tried to reproduce this using the following steps:

(1) Download VMwaretools-10.0.6-3560309.zip (or any other version) from VMware's site.

(2) Unpack the ZIP file, find linux.iso, open it [eg using guestfish] and extract VMwareTools-10.0.6-3560309.tar.gz (exact version number may be different).

(3) Install a new RHEL 7.3 guest with VMware tools enabled:

    $ virt-builder rhel-7.3 \
        --install bash,perl,net-tools \
        --copy-in /var/tmp/VMwareTools-10.0.6-3560309.tar.gz:/tmp \
        --run-command 'cd /tmp && tar zxf /tmp/VMwareTools-10.0.6-3560309.tar.gz' \
        --run-command 'cd /tmp/vmware-tools-distrib && ./vmware-install.pl -d default --force-install'

(4) Perform a conversion:

    $ virt-v2v -i disk rhel-7.3.img -o null

However I could not reproduce the hang (which is possibly not surprising because VMware tools isn't "really" installed here).

I had a closer look at the log provided and (1) it's a SUSE guest and (2) the guest is hanging when rebuilding the kdump initrd.

I've heard that one before ...

I wonder if we could set rootdev as we do in this patch:

https://www.redhat.com/archives/libguestfs/2017-June/msg00000.html

A speculative patch would look like this:

    diff --git a/v2v/convert_linux.ml b/v2v/convert_linux.ml
    index c34bf3e91..5d650561f 100644
    --- a/v2v/convert_linux.ml
    +++ b/v2v/convert_linux.ml
    @@ -304,7 +304,11 @@ let rec convert (g : G.guestfs) inspect source output rcaps =
         let uninstaller = "/usr/bin/vmware-uninstall-tools.pl" in
         if g#is_file ~followsymlinks:true uninstaller then (
           try
    -        ignore (g#command [| uninstaller |]);
    +        if family = `SUSE_family then
    +          ignore (g#sh (sprintf "/usr/bin/env rootdev=%s %s"
    +                                inspect.i_root uninstaller))
    +        else
    +          ignore (g#command [| uninstaller |]);

             (* Reload Augeas to detect changes made by vbox tools uninst. *)
             Linux.augeas_reload g

Pino, do we have any SUSE Enterprise templates we can use?

(In reply to Richard W.M. Jones from comment #12)
> I had a closer look at the log provided and (1) it's a SUSE guest and
> (2) the guest is hanging when rebuilding the kdump initrd.
>
> I've heard that one before ...
>
> I wonder if we could set rootdev as we do in this patch:
>
> https://www.redhat.com/archives/libguestfs/2017-June/msg00000.html
>
> A speculative patch would look like this:
>
> diff --git a/v2v/convert_linux.ml b/v2v/convert_linux.ml
> index c34bf3e91..5d650561f 100644
> --- a/v2v/convert_linux.ml
> +++ b/v2v/convert_linux.ml
> @@ -304,7 +304,11 @@ let rec convert (g : G.guestfs) inspect source output
> rcaps =
>      let uninstaller = "/usr/bin/vmware-uninstall-tools.pl" in
>      if g#is_file ~followsymlinks:true uninstaller then (
>        try
> -        ignore (g#command [| uninstaller |]);
> +        if family = `SUSE_family then
> +          ignore (g#sh (sprintf "/usr/bin/env rootdev=%s %s"
> +                                inspect.i_root uninstaller))

Better use g#command here too, as done when mkinitrd is run:

    ignore (g#command [| "/usr/bin/env"; "rootdev=" ^ inspect.i_root; uninstaller |]);

this way quoting issues are avoided.

Also, theoretically this could be done regardless of the distro, I don't think on other distros the rootdev environment variable should do much. (Although it can be avoided, to be safe.)

> Pino, do we have any SUSE Enterprise templates we can use?

Nope, neither on VMware.

Also, I recall being told the hang happened also on other guests than SUSE, i.e. RHEL, but I cannot find references now. Jaroslav, do you remember more?

> Also, I recall being told the hang happened also on other guests than SUSE,
> i.e. RHEL, but I cannot find references now. Jaroslav, do you remember more?
Hi Pino, Rich
Yes, it happened for SLES 11 and RHEL 6.x. I attached screenshots from the remote sessions; unfortunately that is all I have for now.
There are 3 pictures where the conversion hung for ~20-30 minutes. I am not sure if it was related to the customer's environment, but it was not limited to one distro.
Thanks a lot!
Created attachment 1300309 [details]
hang for 20 minutes
Created attachment 1300310 [details]
Seg fault in one case
Created attachment 1300311 [details]
RHEL hang
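
The review point made earlier in this thread, that passing the command as an argument array (g#command) avoids the quoting issues of a formatted shell string (g#sh), can be illustrated outside libguestfs. The following is only a minimal shell sketch of the general principle: the function names are hypothetical, `printenv rootdev` stands in for the real uninstaller, and the device name containing a space is deliberately contrived.

```shell
#!/bin/sh
# Shell-string style (analogous to g#sh with sprintf): the whole command line
# is parsed again by a shell, so the rootdev value is subject to word splitting.
run_as_shell_string() {
    root=$1; shift
    sh -c "env rootdev=$root $*"
}

# Argument-vector style (analogous to g#command [| ... |]): each element is
# handed to env unchanged, with no second round of shell parsing.
run_as_argv() {
    root=$1; shift
    env "rootdev=$root" "$@"
}

# Contrived device name with a space, chosen only to make the difference visible.
dev='/dev/vg0/lv 1'

# Prints the value intact.
run_as_argv "$dev" printenv rootdev

# The value is split, env tries to execute "1" as the command, and the call fails.
run_as_shell_string "$dev" printenv rootdev 2>/dev/null \
    || echo "shell-string form failed for: $dev"
```

With an ordinary device name such as /dev/sda2 both forms behave the same, which is why the difference is easy to overlook.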
(In reply to Jaroslav Spanko from comment #16)
> Created attachment 1300310 [details]
> Seg fault in one case

VMware is running a tool which communicates with its own hypervisor. However there's no VMware hypervisor because everything is run under qemu, so instead of doing the right thing it crashes. I actually talked to VMware about this a while back and they fixed it.

(In reply to Jaroslav Spanko from comment #17)
> Created attachment 1300311 [details]
> RHEL hang

The RHEL hang is distinctly different from the SUSE hang, although with less information available. Unfortunately I wasn't able to reproduce the RHEL hang using a RHEL 6.8 guest and the same steps as in comment 11.

I spent a couple of days on this and read a lot of documentation, but I still cannot work out how to enable kdump on OpenSUSE. I think a better approach is that I prepare a package containing the proposed patch and the reporter can see if it makes any difference.

I just wonder whether the hang and the crash are two different results of the same issue, i.e. the vmware-uninstall-tools.pl script trying to communicate with VMX, and failing.

(In reply to Pino Toscano from comment #20)
> I just wonder whether the hang, and the crash are two different results of
> the same issue, i.e. the vmware-uninstall-tools.pl script trying to
> communicate with VMX, and failing.

Likely there are several different things going on, the only common theme being that they are all caused by running vmware-uninstall-tools.pl.

#1 The log posted in comment 4 looks almost certainly as if the SUSE mkdumprd script is hanging (see my full analysis in comment 12). This is what the proposed patch may fix.

#2 There is also a hang in a RHEL 6 guest, but that cannot be the same thing as above.

#3 There is also a segfault in a VMware tool, which is caused by the VMware port not being available.

It could be as you say that #2 and #3 are different aspects of the same thing, or not. I'm unable to reproduce any of them.
As a bit of clarification, it seems like #1 is a problem, but would not cause a hang.

Note that in the short term (i.e. for the whole 7.4.z) we will disable the execution of vmware-uninstall-tools.pl, since it is turning out to be problematic -- see also bug #1480623. Regarding 7.5, so far it will be disabled too (bug #1477905), unless we find out what exact issues the uninstallation script is hitting, and we fix them somehow.

Self-note: fixing this for RHEL 7.5 will make bug 1477905 and bug 1481930 obsolete.

A couple of months ago I sent this upstream series:
https://www.redhat.com/archives/libguestfs/2018-October/msg00044.html

In particular, patch #2 deals with the uninstallation of VMware tools from tarball:
https://www.redhat.com/archives/libguestfs/2018-October/msg00046.html
The approach chosen works around the behaviour of the vmware-uninstall-tools.pl script, making sure that it does not do more work than needed, and thus it should not take long anymore.

This was implemented as commit 04a157d0f8529e8bf6a5bc0ac4ee146aaedc3c15, which is included in libguestfs >= v1.39.12.

This bug will be fixed by the rebase scheduled for RHEL 7.7, see bug 1621895.

Try to verify this bug with new build:
libvirt-4.5.0-15.el7.x86_64
libguestfs-1.40.2-4.el7.x86_64
virt-v2v-1.40.2-4.el7.x86_64
qemu-kvm-rhev-2.12.0-27.el7.x86_64
python-ovirt-engine-sdk4-4.3.1-1.el7ev.x86_64
rhv-guest-tools-iso-4.3-6.el7ev.noarch
kernel: 3.10.0-1040.el7.x86_64
rhv:4.3.0.1-0.1.el7

Steps:

Scenario-1: Testing with a SLES guest whose VMware Tools installation type is "Open-source VMware Tools"

1.1 Prepare a SLES guest whose VMware Tools installation type is "Open-source VMware Tools".

1.2 Log in to the VM and check that a vmtoolsd process is running.
# ps -ef |grep vmtoolsd
root 848 1 0 05:33 ? 00:00:00 /usr/sbin/vmtoolsd
root 2084 1 0 05:35 ? 00:00:00 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr

1.3 Shut down the VM and convert it from VMware to RHV with virt-v2v.
# virt-v2v -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA esx6.7-sles15-x86_64-GUI --password-file /tmp/passwd -o rhv-upload -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data -op /tmp/rhvpasswd -oo rhv-cafile=/home/ca.pem -oo rhv-direct=true -oo rhv-cluster=nfs -of raw -b ovirtmgmt
Exception AttributeError: "'module' object has no attribute 'dump_plugin'" in <module 'threading' from '/usr/lib64/python2.7/threading.pyc'> ignored
[ 0.2] Opening the source -i libvirt -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 esx6.7-sles15-x86_64-GUI -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA
[ 1.9] Creating an overlay to protect the source from being modified
[ 5.4] Opening the overlay
[ 11.7] Inspecting the overlay
[ 29.5] Checking for sufficient free disk space in the guest
[ 29.5] Estimating space required on target for each disk
[ 29.5] Converting SUSE Linux Enterprise Server 15 to run on KVM
virt-v2v: QEMU Guest Agent installed for this guest.
virt-v2v: This guest has virtio drivers installed.
[ 72.2] Mapping filesystem data to avoid copying unused and blank areas
virt-v2v: warning: fstrim on guest filesystem /dev/sda1 failed. Usually you can ignore this message. To find out more read "Trimming" in virt-v2v(1).
Original message: fstrim: fstrim: /sysroot/: the discard operation is not supported
[ 73.9] Closing the overlay
[ 74.2] Assigning disks to buses
[ 74.2] Checking if the guest needs BIOS or UEFI to boot
virt-v2v: This guest requires UEFI on the target to boot.
[ 74.2] Initializing the target -o rhv-upload -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api -op /tmp/rhvpasswd -os nfs_data
[ 75.5] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/var/tmp/rhvupload.C8JY1A/nbdkit0.sock", "file.export": "/" } (raw)
(100.00/100%)
[ 835.9] Creating output metadata
[ 857.3] Finishing off

1.4 After the conversion finishes, power on the guest in RHV, log in, and check whether vmtoolsd is still present.
# ll /usr/sbin/vmtoolsd
ls: cannot access /usr/sbin/vmtoolsd: No such file or directory

Result: virt-v2v doesn't hang while removing VMware tools during conversion, and VMware tools are removed successfully.

Scenario-2: Testing with a RHEL 6.10 guest with VMware tools installed from tarball

2.1 Log in to the VM and check that a vmtoolsd process is running.
# ps -ef |grep vmtoolsd
root 1666 1 0 05:33 ? 00:00:00 /usr/sbin/vmtoolsd
root 3002 1 0 05:35 ? 00:00:00 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr

2.2 Shut down the VM and convert it from VMware to RHV with virt-v2v.
# virt-v2v -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA esx6.7-rhel6.10-x86_64-vmware-tools --password-file /tmp/passwd -o rhv-upload -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data -op /tmp/rhvpasswd -oo rhv-cafile=/home/ca.pem -oo rhv-direct=true -oo rhv-cluster=nfs -of raw -b ovirtmgmt
Exception AttributeError: "'module' object has no attribute 'dump_plugin'" in <module 'threading' from '/usr/lib64/python2.7/threading.pyc'> ignored
[ 0.2] Opening the source -i libvirt -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 esx6.7-rhel6.10-x86_64-vmware-tools -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA
[ 2.0] Creating an overlay to protect the source from being modified
[ 3.7] Opening the overlay
[ 34.5] Inspecting the overlay
[ 47.7] Checking for sufficient free disk space in the guest
[ 47.7] Estimating space required on target for each disk
[ 47.7] Converting Red Hat Enterprise Linux Server release 6.10 (Santiago) to run on KVM
virt-v2v: QEMU Guest Agent installed for this guest.
virt-v2v: This guest has virtio drivers installed.
[ 142.4] Mapping filesystem data to avoid copying unused and blank areas
[ 142.8] Closing the overlay
[ 143.0] Assigning disks to buses
[ 143.0] Checking if the guest needs BIOS or UEFI to boot
[ 143.0] Initializing the target -o rhv-upload -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api -op /tmp/rhvpasswd -os nfs_data
[ 145.1] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/var/tmp/rhvupload.mIB9Ne/nbdkit0.sock", "file.export": "/" } (raw)
(100.00/100%)
[ 790.0] Creating output metadata
[ 812.4] Finishing off

2.3 After the conversion finishes, power on the guest in RHV, log in, and check whether vmtoolsd is still present.
# ll /usr/sbin/vmtoolsd
ls: cannot access /usr/sbin/vmtoolsd: No such file or directory

Result: virt-v2v doesn't hang while removing VMware tools during conversion, and VMware tools are removed successfully.

I also tested with many other RHEL and SLES VMs; virt-v2v doesn't hang while removing VMware tools during conversion, so I am moving this bug from ON_QA to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2096
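
For anyone re-checking a conversion log for this hang, the bracketed timestamps in virt-v2v's progress output give the elapsed seconds at the start of each stage, so the gap between consecutive stages shows where the time went. The sketch below is one possible way to compute those gaps; `stage_durations` is a hypothetical helper written for this illustration and is not part of virt-v2v, and it assumes progress lines of the form shown in the verification logs in this bug.

```shell
#!/bin/sh
# stage_durations (hypothetical helper, not part of virt-v2v):
# reads virt-v2v progress output on stdin and prints "<seconds> <stage>" for
# every timestamped stage that is followed by another timestamped line.
# Lines without a leading "[ N.N] " timestamp (warnings, notes) are skipped.
stage_durations() {
    awk '
        /^\[ *[0-9]+\.[0-9]+\] / {
            close_pos = index($0, "]")
            t = substr($0, 2, close_pos - 2) + 0   # "[ 72.2]" -> 72.2
            name = substr($0, close_pos + 2)       # text after "] "
            if (seen) printf "%.1f %s\n", t - prev_t, prev_name
            prev_t = t; prev_name = name; seen = 1
        }'
}

# Excerpt from the SLES 15 verification run quoted in this bug;
# sort and head pick out the slowest stage.
stage_durations <<'EOF' | sort -rn | head -n 1
[ 29.5] Estimating space required on target for each disk
[ 29.5] Converting SUSE Linux Enterprise Server 15 to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 72.2] Mapping filesystem data to avoid copying unused and blank areas
[ 73.9] Closing the overlay
EOF
```

In the two verification runs above, the "Converting" stage (which includes the VMware tools removal) accounts for roughly 43 and 95 seconds, compared with the 20-30 minute stalls described in the original report.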