2182024 – improve UX when running as root and we can't chown v2v tmpdir or socks

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 9
Classification:	Red Hat
Component:	virt-v2v
Sub Component:
Version:	9.3
Hardware:	x86_64
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Laszlo Ersek
QA Contact:	mxie@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2023-03-27 10:19 UTC by mxie@redhat.com
Modified:	2023-11-07 09:28 UTC
CC List:	12 users
Fixed In Version:	virt-v2v-2.3.4-5.el9
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-11-07 08:28:57 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	2182505	medium	CLOSED	Create a selinux policy for nbdkit	2023-10-03 08:13:50 UTC
Red Hat Issue Tracker	RHELPLAN-153148	None	None	None	2023-03-27 10:38:07 UTC
Red Hat Product Errata	RHBA-2023:6376	None	None	None	2023-11-07 08:29:15 UTC

Description mxie@redhat.com 2023-03-27 10:19:53 UTC

Description of problem:
virt-v2v conversion fails with a permission error after running 'yum remove libvirt*' followed by 'yum install virt-v2v'.
 
Version-Release number of selected component (if applicable):
virt-v2v-2.2.0-5.el9.x86_64
libvirt-libs-9.1.0-1.el9.x86_64
qemu-img-7.2.0-14.el9_2.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Remove all libvirt packages and the packages that depend on them; virt-v2v is removed as part of this uninstallation
#yum remove libvirt* -y

2. Install virt-v2v again
#yum install virt-v2v -y

3. Convert a guest from VMware with virt-v2v
# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.9] Opening the source
virt-v2v: error: libguestfs error: could not connect to libvirt (URI = 
qemu:///system): Failed to connect socket to 
'/var/run/libvirt/virtqemud-sock': No such file or directory [code=38 
int1=2]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]


4. Start the virtqemud.socket unit and retry step 3
# systemctl start virtqemud.socket

# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.8] Opening the source
virt-v2v: error: libguestfs error: could not create appliance through 
libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while 
connecting to monitor: 2023-03-27T09:58:18.838679Z qemu-kvm: -blockdev 
{"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.sKlulY/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: 
Failed to connect to '/tmp/v2v.sKlulY/in0': Permission denied [code=1 
int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]


5. Reboot the v2v server
#init 6

6. Log in to the v2v server and retry step 3; the permission error no longer occurs
# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.3] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   2.4] Opening the source
[  14.5] Inspecting the source
[  24.0] Checking for sufficient free disk space in the guest
[  24.0] Converting Red Hat Enterprise Linux 9.2 Beta (Plow) to run on KVM
....


Actual results:
As described above

Expected results:
virt-v2v conversion succeeds without errors

Additional info:
The bug can also be reproduced on RHEL 9.2

Package versions:
virt-v2v-2.2.0-5.el9.x86_64
libvirt-libs-9.0.0-10.el9_2.x86_64
qemu-img-7.2.0-14.el9_2.x86_64

Steps to reproduce on rhel9.2:

1. Remove all libvirt packages and the packages that depend on them; virt-v2v is removed as part of this uninstallation
#yum remove libvirt* -y

2. Install virt-v2v again
#yum install virt-v2v -y

3. Convert a guest from VMware with virt-v2v
# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.8] Opening the source
virt-v2v: error: libguestfs error: could not create appliance through 
libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while 
connecting to monitor: 2023-03-27T10:00:47.413893Z qemu-kvm: -blockdev 
{"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.mqzF1X/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: 
Failed to connect to '/tmp/v2v.mqzF1X/in0': Permission denied [code=1 
int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]


4. Restart the libvirtd service and retry step 3
# systemctl restart libvirtd

# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.8] Opening the source
virt-v2v: error: libguestfs error: could not create appliance through 
libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while 
connecting to monitor: 2023-03-27T10:00:47.413893Z qemu-kvm: -blockdev 
{"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.mqzF1X/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: 
Failed to connect to '/tmp/v2v.mqzF1X/in0': Permission denied [code=1 
int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

5. Reboot the v2v server
#init 6

6. Log in to the v2v server and retry step 3; the permission error no longer occurs

# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.2] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   2.2] Opening the source
[   9.9] Inspecting the source
[  17.5] Checking for sufficient free disk space in the guest
[  17.5] Converting Red Hat Enterprise Linux 9.2 Beta (Plow) to run on KVM
^Cvirt-v2v: Exiting on signal SIGINT

Comment 1 Laszlo Ersek 2023-03-28 16:14:19 UTC

I don't have the slightest idea. In fact for several months (years?) now I've not known how to restart the libvirt services pristinely. There are so many of them, and I have no idea about their inter-dependencies. I just tend to restart my laptop (hello Windows!)

Daniel, Michal, any help pls? Thanks.

@ Ming Xie: I assume the "Permission denied" error, when connecting to the unix domain socket exposed by nbdkit, originates from SELinux. Something could go wrong with the labeling that (I assume) libvirtd performs. Can you retry with "setenforce 0" (just as a check), and can you capture AVCs from "audit.log" ("sealert -a /var/log/audit/audit.log")? Thanks!

Comment 2 Laszlo Ersek 2023-03-29 06:42:29 UTC

*Somewhat* (not entirely!) related: bug 2182505.

Comment 3 mxie@redhat.com 2023-03-29 07:38:06 UTC

(In reply to Laszlo Ersek from comment #1)
> I don't have the slightest idea. In fact for several months (years?) now
> I've not known how to restart the libvirt services pristinely. There are so
> many of them, and I have no idea about their inter-dependencies. I just tend
> to restart my laptop (hello Windows!)
> 
> Daniel, Michal, any help pls? Thanks.
> 
> @ Ming Xie: I assume the "Permission denied" error, when connecting to the
> unix domain socket exposed by nbdkit, originates from SELinux. Something
> could go wrong with the labeling that (I assume) libvirtd performs. Can you
> retry with "setenforce 0" (just as a check), and can you capture AVCs from
> "audit.log" ("sealert -a /var/log/audit/audit.log")? Thanks!

Running 'setenforce 0' does make the permission error go away, but strangely it has to be run again to fix the problem even though SELinux is already in Permissive mode. Please refer to the following steps for details, and to the attached audit.log.

#yum remove libvirt* -y

#yum install virt-v2v -y

#  virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.9] Opening the source
virt-v2v: error: libguestfs error: could not connect to libvirt (URI = 
qemu:///system): Failed to connect socket to 
'/var/run/libvirt/virtqemud-sock': No such file or directory [code=38 
int1=2]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

# systemctl start virtqemud.socket

# virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.8] Opening the source
virt-v2v: error: libguestfs error: could not create appliance through 
libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while 
connecting to monitor: 2023-03-29T07:03:59.615625Z qemu-kvm: -blockdev 
{"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.YSV2PJ/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: 
Failed to connect to '/tmp/v2v.YSV2PJ/in0': Permission denied [code=1 
int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]


# getenforce
Permissive

# setenforce 0

#  virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it vddk -io vddk-libdir=/home/vddk8.0.0 -io vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94 -ip /home/passwd  esx8.0-rhel9.2-x86_64 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk esx8.0-rhel9.2-x86_64
[   1.9] Opening the source
[   8.4] Inspecting the source
[  18.2] Checking for sufficient free disk space in the guest
[  18.2] Converting Red Hat Enterprise Linux 9.2 Beta (Plow) to run on KVM
^Cvirt-v2v: Exiting on signal SIGINT

Comment 5 Laszlo Ersek 2023-03-29 14:12:01 UTC

(In reply to mxie from comment #3)
> (In reply to Laszlo Ersek from comment #1)

> > @ Ming Xie: I assume the "Permission denied" error, when connecting to the
> > unix domain socket exposed by nbdkit, originates from SELinux. Something
> > could go wrong with the labeling that (I assume) libvirtd performs. Can you
> > retry with "setenforce 0" (just as a check), and can you capture AVCs from
> > "audit.log" ("sealert -a /var/log/audit/audit.log")? Thanks!
> 
> Setting 'setenforce 0' can fix the permission error, but it is strange that
> you must reset 'setenforce 0' to fix the problem even if the selinux policy
> is Permissive,

If your enforcement level is *already* permissive when you first encounter the problem, then issuing "setenforce 0" for the second time will make zero difference. If you see a difference at the second time, then whatever changes between the two attempts is unrelated to SELinux.

(I didn't know your enforcement level was permissive to begin with. That's not a standard or recommended RHEL setting. I assumed all tests were run with strict (= enforcing) enforcement level.)

> please refer to the following steps for details and attached
> audit.log
> 
> #yum remove libvirt* -y
> 
> #yum install virt-v2v -y
> 
> #  virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it
> vddk -io vddk-libdir=/home/vddk8.0.0 -io
> vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94
> -ip /home/passwd  esx8.0-rhel9.2-x86_64 
> [   0.0] Setting up the source: -i libvirt -ic
> vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk
> esx8.0-rhel9.2-x86_64
> [   1.9] Opening the source
> virt-v2v: error: libguestfs error: could not connect to libvirt (URI = 
> qemu:///system): Failed to connect socket to 
> '/var/run/libvirt/virtqemud-sock': No such file or directory [code=38 
> int1=2]

OK so this is a sign that the libvirt daemon(s) are not running after reinstalling them. It's not that surprising; IIRC it's been general RHEL policy (or at least tradition) that services are not immediately enabled after installation.

> 
> If reporting bugs, run virt-v2v with debugging enabled and include the 
> complete output:
> 
>   virt-v2v -v -x [...]
> 
> # systemctl start virtqemud.socket

Makes sense.

> 
> # virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it
> vddk -io vddk-libdir=/home/vddk8.0.0 -io
> vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94
> -ip /home/passwd  esx8.0-rhel9.2-x86_64 
> [   0.0] Setting up the source: -i libvirt -ic
> vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk
> esx8.0-rhel9.2-x86_64
> [   1.8] Opening the source
> virt-v2v: error: libguestfs error: could not create appliance through 
> libvirt.
> 
> Try running qemu directly without libvirt using this environment variable:
> export LIBGUESTFS_BACKEND=direct
> 
> Original error from libvirt: internal error: process exited while 
> connecting to monitor: 2023-03-29T07:03:59.615625Z qemu-kvm: -blockdev 
> {"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.YSV2PJ/in0"},"node-
> name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-
> read-only":true,"discard":"unmap"}: 
> Failed to connect to '/tmp/v2v.YSV2PJ/in0': Permission denied [code=1 
> int1=-1]

Yes, so this is the curious bit, and this is where I expected that you'd have "setenforce 1" in place.

Apparently that's not the case:

> 
> If reporting bugs, run virt-v2v with debugging enabled and include the 
> complete output:
> 
>   virt-v2v -v -x [...]
> 
> 
> # getenforce
> Permissive

Yeah, I don't understand *why* your env is set up like this.

Either way, it does tell us that the "Permission denied" problem is not related to SELinux.

> 
> # setenforce 0

So this is a no-op.

And then:

> 
> #  virt-v2v -ic vpx://root.212.149/data/10.73.212.36/?no_verify=1  -it
> vddk -io vddk-libdir=/home/vddk8.0.0 -io
> vddk-thumbprint=D1:03:96:7E:11:3D:7C:4C:B6:50:28:1B:63:74:B5:40:5F:9D:9F:94
> -ip /home/passwd  esx8.0-rhel9.2-x86_64 
> [   0.0] Setting up the source: -i libvirt -ic
> vpx://root.212.149/data/10.73.212.36/?no_verify=1 -it vddk
> esx8.0-rhel9.2-x86_64
> [   1.9] Opening the source
> [   8.4] Inspecting the source
> [  18.2] Checking for sufficient free disk space in the guest
> [  18.2] Converting Red Hat Enterprise Linux 9.2 Beta (Plow) to run on KVM
> ^Cvirt-v2v: Exiting on signal SIGINT

Whatever "fixed" this final attempt cannot be related to SELinux. The "setenforce 0" command was a no-op.

Comment 7 Klaus Heinrich Kiwi 2023-03-30 11:27:50 UTC

This is an interesting problem, but probably also one that customers will not hit in realistic scenarios.

Also, please be aware that if you are, at any point in time, disabling SELinux, the domain transitions can fail and therefore important context setting could get lost. I'm not sure what the supported scenario is for such cases, but at the very least I'd recommend a complete filesystem relabel.

Comment 10 mxie@redhat.com 2023-06-19 10:55:01 UTC

This bug reproduces frequently in the v2v appliance, which is an instance in the OSP 17.1 environment. For example, the two cases below fail with this bug:

https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/INT%20Runtest/view/V2V-OSP/job/OSP-17.1-runtest-rhel9-v2v-RHEL-174566/3/console
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/INT%20Runtest/view/V2V-OSP/job/OSP-17.1-runtest-rhel9-v2v-VIRT-43579/2/console

So I think the priority of this bug should be raised to high.


Steps to reproduce in the v2v appliance:

1. Running v2v for the first time in v2v appliance
#  virt-v2v -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=24:40:7E:C8:C8:1F:9C:DF:E4:E0:48:D0:9E:25:64:94:64:AF:C6:8C  -ip /home/passwd  Auto-esx6.5-rhel7.6-selinux-mls -o null 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk Auto-esx6.5-rhel7.6-selinux-mls
[   1.9] Opening the source
virt-v2v: error: libguestfs error: could not connect to libvirt (URI = 
qemu:///system): Failed to connect socket to 
'/var/run/libvirt/virtqemud-sock': No such file or directory [code=38 
int1=2]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

2. Start virtqemud.socket, as the error in step 1 suggests
# systemctl start virtqemud.socket

3. Run the v2v command a second time
#  virt-v2v -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=24:40:7E:C8:C8:1F:9C:DF:E4:E0:48:D0:9E:25:64:94:64:AF:C6:8C  -ip /home/passwd  Auto-esx6.5-rhel7.6-selinux-mls -o null 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk Auto-esx6.5-rhel7.6-selinux-mls
[   2.0] Opening the source
virt-v2v: error: libguestfs error: could not create appliance through 
libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while 
connecting to monitor: 2023-06-13T08:15:14.006136Z qemu-kvm: -blockdev 
{"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.zzrRXA/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: 
Failed to connect to '/tmp/v2v.zzrRXA/in0': Permission denied [code=1 
int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

4. Run the v2v command a third time; v2v now runs normally
#  virt-v2v -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=24:40:7E:C8:C8:1F:9C:DF:E4:E0:48:D0:9E:25:64:94:64:AF:C6:8C  -ip /home/passwd  esx6.5-rhel8.8-x86_64 -o null 
[   0.1] Setting up the source: -i libvirt -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk esx6.5-rhel8.8-x86_64
[   2.1] Opening the source
[  16.0] Inspecting the source
[  40.0] Checking for sufficient free disk space in the guest
[  40.0] Converting Red Hat Enterprise Linux 8.8 Beta (Ootpa) to run on KVM
......

Comment 14 Laszlo Ersek 2023-06-22 12:50:35 UTC

For reproducing this error, it is not necessary to remove the libvirt* components (and things that require them, such as virt-v2v), and to reinstall virt-v2v.

For reproducing the error, it is sufficient to just do

# systemctl stop virtqemud.socket
# systemctl start virtqemud.socket
# virt-v2v ...

Note that "systemctl restart virtqemud.socket" is *not* sufficient for triggering the symptom. Separate stop and start actions are necessary.

Comment 15 Richard W.M. Jones 2023-06-22 13:38:09 UTC

Does it happen if you just use "virsh list --all" (as root) or a similar command
("virsh nodeinfo" might be worth a go)?  If so, it's a libvirt bug.

Comment 16 Laszlo Ersek 2023-06-22 14:00:16 UTC

(Daniel is already CC'd, good.)

This is a bug in:

(1) *either* the original bug report (invalid actions taken by the
user),

(2) *or* in the systemd *.socket service files that are provided by the
"libvirt-daemon-driver-qemu" subpackage.

Here's why.

Initially -- that is, after just installing the
"libvirt-daemon-driver-qemu" package, or else after the sequence

> # systemctl stop virtqemud.socket
> # systemctl start virtqemud.socket

-- we have the following situation:

> # ls -l /var/run/libvirt/virtqemud-sock*
> srw-rw-rw-. 1 root root 0 Jun 22 15:30 /var/run/libvirt/virtqemud-sock
> srw-rw-rw-. 1 root root 0 Jun 22 15:25 /var/run/libvirt/virtqemud-sock-ro

Note: *two* UNIX domain sockets. However, systemd is listening to only
one of them:

> # fuser -v /var/run/libvirt/virtqemud-sock*
>                      USER        PID ACCESS COMMAND
> /run/libvirt/virtqemud-sock:
>                      root          1 F.... systemd

This means we can make no read-only connection to the daemon. I can show
this even with plain "virsh":

> # virsh --readonly list
> error: failed to connect to the hypervisor
> error: Failed to connect socket to '/var/run/libvirt/virtqemud-sock-ro': Connection refused

When we start virt-v2v in this state of virtqemud (that is, with
virtqemud not even running yet, only systemd waiting for socket
activation, and systemd only listening to the read-write socket), a
"best effort" action in virt-v2v fails -- namely commit 4e7f20684373
("lib: Improve security of in/out sockets when running virt-v2v as
root", 2022-03-23), which we had made for bug 2066773.

Virt-v2v, *running as root*, fires up nbdkit (for vddk access), and
attempts to change the ownership of both the nbdkit socket and the
directory containing it, so that qemu-kvm, running as the "qemu" user,
can connect to the socket. This is the logic that fails. Virt-v2v asks
libvirtd for the name of the UNIX user ("qemu") that libvirtd is going
to run qemu-kvm as, in order to change the ownership of the directory
and the socket. For this libvirtd API call, virt-v2v makes a
*read-only* connection; refer to the "connect_readonly" call in the
"libvirt_qemu_user" function in "lib/utils.ml".

This connection fails (nobody is listening to "virtqemud-sock-ro"!).
Virt-v2v even logs that error in verbose mode:

> could not set owner of /tmp/v2v.6E90he: libvirt: VIR_ERR_SYSTEM_ERROR:
> VIR_FROM_RPC: Failed to connect socket to
> '/var/run/libvirt/virtqemud-sock-ro': Connection refused

As documented in commit 4e7f20684373, we intentionally ignore this
failure ("best effort"), and carry on. That produces the visible symptom
in the end.
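
To make the "best effort" behaviour concrete, here is a rough Python model of it. This is only an illustration, not the OCaml code from "lib/utils.ml": the names chown_best_effort and refused_lookup are made up for this sketch, and the user lookup is passed in as a callable that stands in for the read-only libvirt query.

```python
import os
import pwd
import sys
import tempfile


def chown_best_effort(path, lookup_user, log=sys.stderr):
    """Try to hand `path` over to the user returned by `lookup_user()`.

    Returns True on success. Any failure, including a failed lookup, is
    logged and swallowed so that the caller simply carries on.
    """
    try:
        user = lookup_user()          # may raise, e.g. "Connection refused"
        entry = pwd.getpwnam(user)    # raises KeyError for an unknown user
        os.chown(path, entry.pw_uid, entry.pw_gid)
        return True
    except (OSError, KeyError) as exc:
        print(f"could not set owner of {path}: {exc}", file=log)
        return False


def refused_lookup():
    # Hypothetical stand-in for the read-only connection being refused.
    raise ConnectionRefusedError(
        "Failed to connect socket to 'virtqemud-sock-ro'")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory(prefix="v2v.") as tmpdir:
        changed = chown_best_effort(tmpdir, refused_lookup)
        print("ownership changed:", changed)
```

The point of the sketch is that the failed lookup produces only a log line. The directory keeps its root ownership, so the later "Permission denied" from qemu-kvm shows up far away from the actual cause.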

Now, once this failure has been triggered, virtqemud does start up --
possibly due to virt-v2v using a read-write connection too, regardless
of the read-only connection error --, and then *both* virtqemud *and*
systemd are listening to *both* sockets:

> # fuser -v /var/run/libvirt/virtqemud-sock*
>                      USER        PID ACCESS COMMAND
> /run/libvirt/virtqemud-sock:
>                      root          1 F.... systemd
>                      root      39422 F.... virtqemud
> /run/libvirt/virtqemud-sock-ro:
>                      root          1 F.... systemd
>                      root      39422 F.... virtqemud

This is why the *next* invocation of the same virt-v2v command line
succeeds.

So, we need to decide the following now:

(1) Is this a user error (i.e., NOTABUG)?

Because, the problem is avoided if we manually start *both* socket
services in the beginning:

> # systemctl start virtqemud.socket
> # systemctl start virtqemud-ro.socket

This way the source is opened fine immediately. (In fact, if you only
start "virtqemud-ro.socket" explicitly, that one pulls in / starts
"virtqemud.socket" too.)

I'm not sure about proper socket service usage here; is the user
supposed to start both socket services?
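
A pre-flight check for the condition above could look roughly like the following Python sketch. The function names are invented for illustration; the two paths are the ones from the listing earlier in this comment. One caveat: a connection attempt is not free of side effects, since connecting to a socket held by systemd would be expected to trigger socket activation of the service.

```python
import socket


def socket_accepts_connections(path, timeout=1.0):
    """Return True if something is listening on the UNIX socket at `path`."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        try:
            client.connect(path)
            return True
        except OSError:
            # Covers "No such file or directory" and "Connection refused".
            return False


def check_sockets(paths):
    """Map each socket path to whether a connection attempt succeeds."""
    return {path: socket_accepts_connections(path) for path in paths}


if __name__ == "__main__":
    # Run as root on the v2v host.
    results = check_sockets([
        "/var/run/libvirt/virtqemud-sock",
        "/var/run/libvirt/virtqemud-sock-ro",
    ])
    for path, ok in results.items():
        print(f"{path}: {'listening' if ok else 'NOT listening'}")
```

A socket file that exists but has no listener behind it is reported as not listening, which matches the "Connection refused" that "virsh --readonly" shows above.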

(2) The other interpretation could be that the r/w and r/o *.socket
services are incorrectly implemented, and they should (how?)
cross-depend on each other. Currently they contain:

> # tail -n +1 /usr/lib/systemd/system/virtqemud{,-ro}.socket
> ==> /usr/lib/systemd/system/virtqemud.socket <==
> [Unit]
> Description=Libvirt qemu local socket
> Before=virtqemud.service
>
>
> [Socket]
> ListenStream=/run/libvirt/virtqemud-sock
> Service=virtqemud.service
> SocketMode=0666
> RemoveOnStop=yes
>
> [Install]
> WantedBy=sockets.target
>
> ==> /usr/lib/systemd/system/virtqemud-ro.socket <==
> [Unit]
> Description=Libvirt qemu local read-only socket
> Before=virtqemud.service
> BindsTo=virtqemud.socket
> After=virtqemud.socket
>
>
> [Socket]
> ListenStream=/run/libvirt/virtqemud-sock-ro
> Service=virtqemud.service
> SocketMode=0666
>
> [Install]
> WantedBy=sockets.target

I admit I don't know what the "BindsTo" and "After" directives do, in
"virtqemud-ro.socket". I figure we have *one* of those to thank for the
fact that once we start "virtqemud-ro.socket", it pulls in
"virtqemud.socket" too. But I'm not sure about the *other* directive.
And either way, this doesn't implement the circular dependency -- the
question is, do we want a circular dependency here?
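
For reference, the dependency directives in the unit text quoted above can be pulled out with a few lines of Python. This is a simplified sketch: configparser only approximates the systemd unit syntax (for example, it does not merge repeated keys), and the name unit_dependencies is made up. As far as the systemd.unit documentation describes it, "BindsTo" is a requirement dependency similar to "Requires" that additionally stops this unit when the bound-to unit stops, while "After" only affects ordering.

```python
import configparser

# Unit text copied from the listing above.
VIRTQEMUD_RO_SOCKET = """\
[Unit]
Description=Libvirt qemu local read-only socket
Before=virtqemud.service
BindsTo=virtqemud.socket
After=virtqemud.socket

[Socket]
ListenStream=/run/libvirt/virtqemud-sock-ro
Service=virtqemud.service
SocketMode=0666

[Install]
WantedBy=sockets.target
"""

DEPENDENCY_KEYS = ("Requires", "BindsTo", "Wants", "Before", "After")


def unit_dependencies(text):
    """Return the dependency directives of the [Unit] section as lists."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep directive names case-sensitive
    parser.read_string(text)
    unit = parser["Unit"]
    return {key: unit[key].split() for key in DEPENDENCY_KEYS if key in unit}


if __name__ == "__main__":
    for key, units in unit_dependencies(VIRTQEMUD_RO_SOCKET).items():
        print(f"{key}: {', '.join(units)}")
```

Read this way, "virtqemud-ro.socket" requires and follows "virtqemud.socket", but nothing in "virtqemud.socket" points back at the read-only unit.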

Comment 17 Laszlo Ersek 2023-06-22 14:05:01 UTC

(In reply to Richard W.M. Jones from comment #15)
> Does it happen if you just use "virsh list --all" (as root) or a
> similar command ("virsh nodeinfo" might be worth a go).  If so it's a
> libvirt bug.

hah, another "mid-air collision" :/ So yes, plain "virsh" fails too, we
just need to pass it "--readonly".

Comment 18 Daniel Berrangé 2023-06-22 14:06:58 UTC

(In reply to mxie from comment #0)
> Description of problem:
> Steps to Reproduce:
> 1. Remove all libvirt packages and the packages that depend on them;
> virt-v2v is removed as part of this uninstallation
> #yum remove libvirt* -y
> 
> 2. Install virt-v2v again
> #yum install virt-v2v -y

IMHO that is not a valid installation process.

The 'yum install virt-v2v' will pull in libvirt as a dependency; however, nothing here is activating the systemd sockets.

Installing libvirt RPMs will register the systemd units, and they will get marked to start on *subsequent* boots, in accordance with distro presets. AFAIK, nothing is actually guaranteed to be running, however, until *after* the OS is rebooted.

Comment 19 Daniel Berrangé 2023-06-22 14:11:14 UTC

(In reply to Laszlo Ersek from comment #14)
> For reproducing this error, it is not necessary to remove the libvirt*
> components (and things that require them, such as virt-v2v), and to
> reinstall virt-v2v.
> 
> For reproducing the error, it is sufficient to just do
> 
> # systemctl stop virtqemud.socket
> # systemctl start virtqemud.socket
> # virt-v2v ...
> 
> Note that "systemctl restart virtqemud.socket" is *not* sufficient for
> triggering the symptom. Separate stop and start actions are necessary.

This is also not a valid sequence. The two systemctl operations there are not the inverse of each other.

When you stop 'virtqemud.socket', the BindsTo directive means that 'virtqemud-ro.socket' and 'virtqemud-admin.socket' are also taken offline.

When you start 'virtqemud.socket', however, that alone will start. The 'BindsTo' directive doesn't work in the reverse way, to make the -ro and -admin sockets get started.

You didn't see a problem with 'systemctl restart virtqemud.socket' because with the atomic restart, there is no need for systemd to apply the BindsTo directive, so the -ro and -admin sockets remained unchanged.
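
The asymmetry described above can be illustrated with a toy model in Python. This is not systemd's real job logic: the class ToyUnitManager is invented for this sketch, it tracks only BindsTo edges, and the BindsTo entry for the -admin socket is assumed from the description in this comment (its unit file is not quoted in this bug).

```python
class ToyUnitManager:
    """Toy model of start/stop propagation along BindsTo= edges only."""

    def __init__(self, binds_to):
        # Maps a unit name to the units it declares BindsTo= on.
        self.binds_to = binds_to
        self.active = set()

    def start(self, unit):
        # Starting a unit pulls in what it is bound to, never the reverse.
        for dep in self.binds_to.get(unit, ()):
            self.start(dep)
        self.active.add(unit)

    def stop(self, unit):
        self.active.discard(unit)
        # Units bound to the stopped unit are taken down with it.
        for other, deps in self.binds_to.items():
            if unit in deps and other in self.active:
                self.stop(other)

    def restart(self, unit):
        # Modelled as atomic, as described above: dependents are untouched.
        self.active.add(unit)


if __name__ == "__main__":
    manager = ToyUnitManager({
        "virtqemud-ro.socket": ["virtqemud.socket"],
        "virtqemud-admin.socket": ["virtqemud.socket"],
    })
    for name in ("virtqemud.socket", "virtqemud-ro.socket",
                 "virtqemud-admin.socket"):
        manager.start(name)
    manager.stop("virtqemud.socket")
    manager.start("virtqemud.socket")
    print("active after stop + start:", sorted(manager.active))
```

In this model, stop followed by start leaves only "virtqemud.socket" active, whereas restart leaves all three sockets as they were.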

Comment 20 Daniel Berrangé 2023-06-22 14:15:22 UTC

(In reply to Daniel Berrangé from comment #19)
> (In reply to Laszlo Ersek from comment #14)
> > For reproducing this error, it is not necessary to remove the libvirt*
> > components (and things that require them, such as virt-v2v), and to
> > reinstall virt-v2v.
> > 
> > For reproducing the error, it is sufficient to just do
> > 
> > # systemctl stop virtqemud.socket
> > # systemctl start virtqemud.socket
> > # virt-v2v ...
> > 
> > Note that "systemctl restart virtqemud.socket" is *not* sufficient for
> > triggering the symptom. Separate stop and start actions are necessary.
> 
> This is also not a valid sequence. The two systemctl operations there are
> not the inverse of each other.

BTW, we could have made them an inverse by making virtqemud.socket include "Requires=virtqemud-ro.socket virtqemud-admin.socket". That is a stronger relationship than we want to express though.

We want the admin to retain the ability to turn off the -ro.socket and -admin.socket if they want to lock down their installation.

Normally none of this should be an issue, because the .socket units are configured to start on boot, and there's no common reason why an admin should ever stop them on a running machine. They should generally be left alone to just do their job providing auto-start of the service upon first connection.

Comment 21 Laszlo Ersek 2023-06-22 17:06:35 UTC

OK, so this BZ is NOTABUG then; incorrect libvirt startup.

For future reference -- also because it is a direct problem for the
RHEL9 virt-v2v "osci.brew-build.tier0.functional" gating test that runs
in some (?) containerized environment --, how exactly are we supposed to
start libvirtd right after installation (i.e., without a reboot)? What
services / sockets do we have to launch?

We depend on the "libvirt-daemon-kvm" metapackage, which is itself
empty; it depends on:

  libvirt-daemon
  libvirt-daemon-driver-interface
  libvirt-daemon-driver-network
  libvirt-daemon-driver-nodedev
  libvirt-daemon-driver-nwfilter
  libvirt-daemon-driver-qemu
  libvirt-daemon-driver-secret
  libvirt-daemon-driver-storage
  qemu-kvm

That set of RPMs, taken together, installs the following list of systemd
services and sockets (and even targets):

  /usr/lib/systemd/system/libvirt-guests.service
  /usr/lib/systemd/system/libvirtd-admin.socket
  /usr/lib/systemd/system/libvirtd-ro.socket
  /usr/lib/systemd/system/libvirtd-tcp.socket
  /usr/lib/systemd/system/libvirtd-tls.socket
  /usr/lib/systemd/system/libvirtd.service
  /usr/lib/systemd/system/libvirtd.socket
  /usr/lib/systemd/system/virt-guest-shutdown.target
  /usr/lib/systemd/system/virtinterfaced-admin.socket
  /usr/lib/systemd/system/virtinterfaced-ro.socket
  /usr/lib/systemd/system/virtinterfaced.service
  /usr/lib/systemd/system/virtinterfaced.socket
  /usr/lib/systemd/system/virtlockd-admin.socket
  /usr/lib/systemd/system/virtlockd.service
  /usr/lib/systemd/system/virtlockd.socket
  /usr/lib/systemd/system/virtlogd-admin.socket
  /usr/lib/systemd/system/virtlogd.service
  /usr/lib/systemd/system/virtlogd.socket
  /usr/lib/systemd/system/virtnetworkd-admin.socket
  /usr/lib/systemd/system/virtnetworkd-ro.socket
  /usr/lib/systemd/system/virtnetworkd.service
  /usr/lib/systemd/system/virtnetworkd.socket
  /usr/lib/systemd/system/virtnodedevd-admin.socket
  /usr/lib/systemd/system/virtnodedevd-ro.socket
  /usr/lib/systemd/system/virtnodedevd.service
  /usr/lib/systemd/system/virtnodedevd.socket
  /usr/lib/systemd/system/virtnwfilterd-admin.socket
  /usr/lib/systemd/system/virtnwfilterd-ro.socket
  /usr/lib/systemd/system/virtnwfilterd.service
  /usr/lib/systemd/system/virtnwfilterd.socket
  /usr/lib/systemd/system/virtproxyd-admin.socket
  /usr/lib/systemd/system/virtproxyd-ro.socket
  /usr/lib/systemd/system/virtproxyd-tcp.socket
  /usr/lib/systemd/system/virtproxyd-tls.socket
  /usr/lib/systemd/system/virtproxyd.service
  /usr/lib/systemd/system/virtproxyd.socket
  /usr/lib/systemd/system/virtqemud-admin.socket
  /usr/lib/systemd/system/virtqemud-ro.socket
  /usr/lib/systemd/system/virtqemud.service
  /usr/lib/systemd/system/virtqemud.socket
  /usr/lib/systemd/system/virtsecretd-admin.socket
  /usr/lib/systemd/system/virtsecretd-ro.socket
  /usr/lib/systemd/system/virtsecretd.service
  /usr/lib/systemd/system/virtsecretd.socket

This doesn't look encouraging.

What is the precise, minimal set of sockets and/or services we need to
start, after installing "libvirt-daemon-kvm", *without* rebooting, so we
can make both r/w and r/o client connections to the local (system-level)
virtqemud? (I don't think we need admin connections.) Thanks!

Comment 22 Daniel Berrangé 2023-06-22 17:19:44 UTC

(In reply to Laszlo Ersek from comment #21)
> OK, so this BZ is NOTABUG then; incorrect libvirt startup.
> 
> For future reference -- also because it is a direct problem for the
> RHEL9 virt-v2v "osci.brew-build.tier0.functional" gating test that runs
> in some (?) containerized environment --, how exactly are we supposed to
> start libvirtd right after installation (i.e., without a reboot)? What
> services / sockets do we have to launch?


> What is the precise, minimal set of sockets and/or services we need to
> start, after installing "libvirt-daemon-kvm", *without* rebooting, so we
> can make both r/w and r/o client connections to the local (system-level)
> virtqemud? (I don't think we need admin connections.) Thanks!

The set of unit files depends on what range of functionality you need to use from libvirt.

The most strictly minimal setup would be

  systemctl start virtqemud.service

This will immediately start virtqemud and, indirectly, the virtqemud, virtlockd and virtlogd sockets:

# systemctl | grep ' virt'
  virtlockd.socket                                                                         loaded active listening Virtual machine lock manager socket
  virtlogd.socket                                                                          loaded active listening Virtual machine log manager socket
  virtqemud-admin.socket                                                                   loaded active listening Libvirt qemu admin socket
  virtqemud-ro.socket                                                                      loaded active listening Libvirt qemu local read-only socket
  virtqemud.socket                                                                         loaded active listening Libvirt qemu local socket


If you need to use disk encryption, you'll also need virtsecretd.service.

If you need the virtual NAT 'default' network, you'll need virtnetworkd.service.

If you need libvirt storage pools, you'll need virtstoraged.service

Note, I'm suggesting the .service, because I'm assuming the intention is that the tests will immediately start using them, and as such relying on socket activation for lazy startup is not important.
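
The selection above can be sketched as a small helper that maps the features a test needs to the service units to start. The feature labels (secrets, network, storage) and the function name are invented for this sketch; the unit names are the ones listed in this comment.

```sh
# Print the libvirt service units to start for a given set of features.
# virtqemud.service is always included as the strict minimum.
# The feature labels are hypothetical names used only in this sketch.
libvirt_units_for() {
    units="virtqemud.service"
    for feature in "$@"; do
        case "$feature" in
            secrets) units="$units virtsecretd.service" ;;   # disk encryption
            network) units="$units virtnetworkd.service" ;;  # NAT 'default' network
            storage) units="$units virtstoraged.service" ;;  # libvirt storage pools
            *) echo "unknown feature: $feature" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$units"
}
```

A test setup script could then run something like systemctl start $(libvirt_units_for network storage), leaving the word splitting of the unquoted substitution intact on purpose.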

Comment 23 Daniel Berrangé 2023-06-22 17:26:51 UTC

Actually a simpler approach could be simply

   'systemctl isolate multi-user.target'

This will ensure that all unit files implied by 'multi-user.target' are running, if they were not already running, even if the system is already at multi-user.target. As a consequence, all the system presets of the newly installed packages should get honoured.
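
In a test script it may help to confirm that the libvirt sockets have appeared before invoking virt-v2v. The following is a minimal sketch with a hypothetical helper name; the default timeout of 10 seconds is an arbitrary choice.

```sh
# Wait until a path exists as a Unix socket, polling once per second.
# Returns 0 when the socket appears, 1 when the timeout (in seconds) expires.
# (wait_for_socket is a hypothetical helper, not part of libvirt or systemd.)
wait_for_socket() {
    path=$1
    remaining=${2:-10}
    while [ ! -S "$path" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

For example, after systemctl isolate multi-user.target, a script could call wait_for_socket /var/run/libvirt/virtqemud-sock and wait_for_socket /var/run/libvirt/virtqemud-sock-ro. This only covers the "No such file or directory" case: a socket file that exists with no listener behind it still yields "Connection refused".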

Comment 24 mxie@redhat.com 2023-06-26 02:38:23 UTC

(In reply to Laszlo Ersek from comment #21)
> OK, so this BZ is NOTABUG then; incorrect libvirt startup.
> 
> For future reference -- also because it is a direct problem for the
> RHEL9 virt-v2v "osci.brew-build.tier0.functional" gating test that runs
> in some (?) containerized environment --, how exactly are we supposed to
> start libvirtd right after installation (i.e., without a reboot)? What
> services / sockets do we have to launch?

Hi Laszlo, as you saw, the v2v gating test encounters this bug and, as comment 10 said, the bug occurs on a freshly installed v2v server; these scenarios are definitely valid libvirt startup. Besides, customers will also encounter the bug when running v2v on a new v2v server, so I think this bug is valid and should be fixed. Maybe v2v should depend on more libvirt packages so that the necessary services get started; for example, the bug doesn't happen when all libvirt packages are installed and the libvirtd service is running.

Comment 25 Laszlo Ersek 2023-06-26 08:57:57 UTC

Hi Ming, I'm about to fix the v2v gating test. I plan to replace the current (failing) "systemctl restart libvirtd" command in "tests/basic-test.sh" with "systemctl isolate multi-user.target", per Daniel's suggestion.

Regarding the unexpected behavior for customers -- there's not much we can do there. Even if some service has a vendor preset of "enabled", it will not be auto-started right after installation. Furthermore, Daniel explained that the rpm SPEC file (%postinst etc) is not a proper environment for starting services, so we cannot just start services from within the libvirtd spec file. It's effectively RHEL tradition that once you install new packages with new services (having vendor-preset "enabled"), you either have to reboot, or manually run some commands (after the package installation is complete) to start the services. With libvirtd having been modularized into a number of distinct daemons and services, the easiest command by far is "systemctl isolate multi-user.target".

Best we could do is modify the virt-v2v documentation. The following two manual pages seem to deal with libvirtd connections:
- for input: https://libguestfs.org/virt-v2v.1.html
- for output: https://libguestfs.org/virt-v2v-output-local.1.html

We could add two sentences to both pages (exact location TBD): "If you have just installed libvirtd from distribution packages, make sure the libvirtd services are actually running, before invoking virt-v2v. Some distributions will auto-start these services, but e.g. on Fedora and RHEL, you may have to issue 'systemctl isolate multi-user.target'."

If Rich agrees, we can repurpose / reopen this BZ for updating the documentation.

Comment 26 mxie@redhat.com 2023-06-26 09:23:14 UTC

(In reply to Laszlo Ersek from comment #25)
> Hi Ming, I'm about to fix the v2v gating test. I plan to replace the current
> (failing) "systemctl restart libvirtd" command in "tests/basic-test.sh" with
> "systemctl isolate multi-user.target", per Daniel's suggestion.
> 
> Regarding the unexpected behavior for customers -- there's not much we can
> do there. Even if some service has a vendor preset of "enabled", it will not
> be auto-started right after installation. Furthermore, Daniel explained that
> the rpm SPEC file (%postinst etc) is not a proper environment for starting
> services, so we cannot just start services from within the libvirtd spec
> file. It's effectively RHEL tradition that once you install new packages
> with new services (having vendor-preset "enabled"), you either have to
> reboot, or manually run some commands (after the package installation is
> complete) to start the services. With libvirtd having been modularized into
> a number of distinct daemons and services, the easiest command by far is
> "systemctl isolate multi-user.target".
> 
> Best we could do is modify the virt-v2v documentation. The following two
> manual pages seem to deal with libvirtd connections:
> - for input: https://libguestfs.org/virt-v2v.1.html
> - for output: https://libguestfs.org/virt-v2v-output-local.1.html
> 
> We could add two sentences to both pages (exact location TBD): "If you have
> just installed libvirtd from distribution packages, make sure the libvirtd
> services are actually running, before invoking virt-v2v. Some distributions
> will auto-start these services, but e.g. on Fedora and RHEL, you may have to
> issue 'systemctl isolate multi-user.target'."
> 
> If Rich agrees, we can repurpose / reopen this BZ for updating the
> documentation.

Can v2v add a hint such as 'you can start the related services with the command systemctl isolate multi-user.target and then rerun the virt-v2v command' to the error message, to tell the user how to start the services correctly during a v2v conversion? For example:


#  virt-v2v -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=24:40:7E:C8:C8:1F:9C:DF:E4:E0:48:D0:9E:25:64:94:64:AF:C6:8C  -ip /home/passwd  Auto-esx6.5-rhel7.6-selinux-mls -o null 
[   0.0] Setting up the source: -i libvirt -ic vpx://root.74.72/data/10.73.196.89/?no_verify=1 -it vddk Auto-esx6.5-rhel7.6-selinux-mls
[   1.9] Opening the source
virt-v2v: error: libguestfs error: could not connect to libvirt (URI = 
qemu:///system): Failed to connect socket to 
'/var/run/libvirt/virtqemud-sock': No such file or directory [code=38 
int1=2], you can start services by         -------> add info

systemctl isolate multi-user.target         -------> add info

and then rerun the virt-v2v command.        -------> add info

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

Comment 27 Richard W.M. Jones 2023-06-26 09:43:19 UTC

Yes I agree we ought to document this, perhaps with a link back to this discussion
so that people can understand the issues.

Not that I necessarily agree with the outcome.  It does seem counterintuitive that
simply installing virt-v2v and running it as root won't work because of some (to the user)
obscurity in how systemd or rpm starts services.

Comment 28 Laszlo Ersek 2023-06-26 10:49:28 UTC

We don't have structured libvirt errors at the OCaml level, AFAICT. As a complication, in one set of cases, the error goes through 3 layers (libvirt client library --> libguestfs --> virt-v2v); in another case, through 2 layers (libvirt client library --> virt-v2v). We don't have a good way to identify "connection failure" programmatically. And even if we had structured errors, the libvirt error codes that are visible in the above comments are not too descriptive:

- VIR_ERR_INTERNAL_ERROR = 1,
- VIR_ERR_SYSTEM_ERROR = 38.

I'll attempt sending a small docs patch later.
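
Outside virt-v2v itself, a wrapper can only match on the message text. The sketch below does that under the assumption that the phrase "Failed to connect socket to" appears on a single line, as it does in the outputs quoted in this bug; the function name and the hint wording are made up, and text matching of this kind is brittle.

```sh
# Read virt-v2v's error output on stdin, pass it through unchanged, and append
# a hint when libvirt's "Failed to connect socket to" message was seen.
# (add_libvirt_hint and the hint text are hypothetical, for illustration only.)
add_libvirt_hint() {
    seen=0
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
        case "$line" in
            *"Failed to connect socket to"*) seen=1 ;;
        esac
    done
    if [ "$seen" -eq 1 ]; then
        echo "hint: the libvirt daemons may not be running; try 'systemctl isolate multi-user.target' and rerun virt-v2v"
    fi
}
```

It could be used as virt-v2v [...] 2>&1 | add_libvirt_hint, keeping in mind that the pipeline's exit status is then the helper's and not virt-v2v's.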

Comment 30 Laszlo Ersek 2023-06-27 17:15:17 UTC

[v2v PATCH] docs/virt-v2v: document libvirt system instance startup
Message-Id: <20230627171436.231770-1-lersek>
https://listman.redhat.com/archives/libguestfs/2023-June/031910.html

Comment 31 Laszlo Ersek 2023-06-29 12:14:28 UTC

Hi mxie, ultimately we're not only documenting the proper libvirtd startup procedure, but also changing the exception handling a bit.

Going forward, you shouldn't see

  Failed to connect to '/tmp/v2v.sKlulY/in0': Permission denied

in response to the repro steps, but

  Failed to connect socket to '/var/run/libvirt/virtqemud-sock-ro': Connection refused

and the documentation will address the latter.

Comment 32 Laszlo Ersek 2023-06-29 12:35:30 UTC

[v2v PATCH v2 0/3] improve UX when running as root and we can't chown
Message-Id: <20230629123443.188350-4-lersek>
https://listman.redhat.com/archives/libguestfs/2023-June/031919.html

Comment 33 Laszlo Ersek 2023-06-30 09:01:37 UTC

(In reply to Laszlo Ersek from comment #32)
> [v2v PATCH v2 0/3] improve UX when running as root and we can't chown
> Message-Id: <20230629123443.188350-4-lersek>
> https://listman.redhat.com/archives/libguestfs/2023-June/031919.html

Commit range 97d8c28b7eb1..dcfea1b9b5d0.

Comment 35 Laszlo Ersek 2023-06-30 09:51:47 UTC

Backport on the rhel-9.3 branch in the upstream repo: commit range 10192f8ee3a7..f2e233b9e073.

Comment 41 Laszlo Ersek 2023-07-03 09:53:24 UTC

Rich, should we revert upstream commit d2b64ecc6701 ("v2v: Set the
number of vCPUs to same as host number of pCPUs.", 2020-12-01), on the
rhel-9.3 branch only, as a workaround?

Comment 42 Richard W.M. Jones 2023-07-03 10:51:04 UTC

(In reply to Laszlo Ersek from comment #41)
> Rich, should we revert upstream commit d2b64ecc6701 ("v2v: Set the
> number of vCPUs to same as host number of pCPUs.", 2020-12-01), on the
> rhel-9.3 branch only, as a workaround?

This change is really essential to get good performance from dracut, when
it compresses the final initramfs using pigz.

Some better ideas:
 - Add an --smp flag to virt-v2v.
 - Somehow detect if we're using TCG and skip the set_smp call.

The second one is better because it doesn't push the problem to the user.
However I'm not sure that there is a reliable way to detect if KVM / TCG
are available.  I think you can ask qemu (via QMP) for that, and I thought
we actually did that already inside libguestfs, but I can't find it right now.

Comment 43 Daniel Berrangé 2023-07-03 10:59:06 UTC

(In reply to Richard W.M. Jones from comment #42)

> The second one is better because it doesn't push the problem to the user.
> However I'm not sure that there is a reliable way to detect if KVM / TCG
> are available.  I think you can ask qemu (via QMP) for that, and I thought
> we actually did that already inside libguestfs, but I can't find it right
> now.

Libvirt can tell you which are supported in capabilities XML.

For the non-libvirt driver, spawn qemu with -accel kvm:tcg and then do 'query-kvm' to find out whether KVM is actually enabled or not; if not, assume it's TCG.
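
For the libvirt case, a rough sketch of reading the capabilities XML from a shell script follows. It assumes the XML lists supported accelerators as domain elements with a type attribute (for example type='kvm'); the function name is hypothetical, and a real XML parser would be more robust than this pattern match.

```sh
# Read libvirt capabilities XML on stdin and print "kvm" if any guest entry
# offers a KVM domain type, otherwise print "tcg".
# (accel_from_capabilities is a hypothetical helper for illustration.)
accel_from_capabilities() {
    if grep -q "<domain type=['\"]kvm['\"]"; then
        echo kvm
    else
        echo tcg
    fi
}
```

Typical use would be virsh -c qemu:///system capabilities | accel_from_capabilities. This reports what libvirt believes the host supports, not which accelerator a particular running guest ended up with.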

Comment 44 Richard W.M. Jones 2023-07-03 11:30:35 UTC

So we do in fact use query-kvm ourselves here:
https://github.com/libguestfs/libguestfs/blob/master/lib/qemu.c
and also we ask libvirt for the capabilities XML.  However I don't think we
make this information available up to libguestfs API users, although we could do.

Comment 45 Laszlo Ersek 2023-07-03 12:56:58 UTC

- I think dracut / pigz performance is irrelevant when we have a bug
  that prevents us from booting the appliance in the first place. That's
  a problem that affects all of virt-v2v, libguestfs, guestfs-tools
  etc. In my view, as long as things are this broken, TCG should imply a
  uniprocessor guest.

- We only want this temporarily, not forever, so a big bunch of new
  infrastructure should be avoided. I've found the following avenue:
  both the libvirt backend and the direct backend already detect whether
  the acceleration is known to be TCG, and if so, they both modify the
  guest kernel (appliance kernel) command line.

See commits

- aeea803ad0fa ("appliance: Pass lpj=... on the appliance command line
                 (thanks Marcelo Tosatti).", 2012-11-24)

  --> libvirt backend

- 012b01a0fb87 ("launch: direct: Make sure we pass lpj= parameter when
                 using TCG.", 2014-01-18)

  --> direct backend

"lpj" is a super obscure, and here, irrelevant, artifact ("loops per
jiffy"); the point is that the infrastructure already exists for
tweaking the guest kernel cmdline based on TCG acceleration, and that
both backends support it. As of this writing:

construct_libvirt_xml_boot() [lib/launch-libvirt.c]:

>   if (!params->data->is_kvm)
>     flags |= APPLIANCE_COMMAND_LINE_IS_TCG;
>   cmdline = guestfs_int_appliance_command_line (g, params->appliance, flags);

launch_direct() [lib/launch-direct.c]:

>   if (!has_kvm || force_tcg)
>     flags |= APPLIANCE_COMMAND_LINE_IS_TCG;
>   append = guestfs_int_appliance_command_line (g, appliance, flags);

and APPLIANCE_COMMAND_LINE_IS_TCG is commonly handled inside
guestfs_int_appliance_command_line():

>   bool tcg = flags & APPLIANCE_COMMAND_LINE_IS_TCG;
>   ...
>   if (tcg) {
>     const int lpj = guestfs_int_get_lpj (g);
>     if (lpj > 0)
>       guestfs_int_add_sprintf (g, &argv, "lpj=%d", lpj);
>   }

So that is what we should scavenge: if "tcg" is set, also pass the
"nosmp" kernel parameter:

From "Documentation/admin-guide/kernel-parameters.txt":

>         maxcpus=        [SMP] Maximum number of processors that an SMP kernel
>                         will bring up during bootup.  maxcpus=n : n >= 0 limits
>                         the kernel to bring up 'n' processors. Surely after
>                         bootup you can bring up the other plugged cpu by executing
>                         "echo 1 > /sys/devices/system/cpu/cpuX/online". So maxcpus
>                         only takes effect during system bootup.
>                         While n=0 is a special case, it is equivalent to "nosmp",
>                         which also disables the IO APIC.
> 
>         nosmp           [SMP] Tells an SMP kernel to act as a UP kernel,
>                         and disable the IO APIC.  legacy for "maxcpus=0".

Comment 46 Laszlo Ersek 2023-07-03 13:15:04 UTC

Ah, regarding my proposal for reverting d2b64ecc6701, downstream -- sorry, that was wrong; it would regress performance on KVM too! Sorry for not realizing that earlier.

Comment 47 Laszlo Ersek 2023-07-03 13:40:13 UTC

LIBGUESTFS_APPEND should lend itself to testing the idea from comment 45, with one of the reproducers from bug 2216496.
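
A minimal sketch of such a test run, assuming that TCG is forced through the force_tcg backend setting (the reproducers from bug 2216496 are not shown here and may set things up differently). The wrapper name is made up; LIBGUESTFS_APPEND and LIBGUESTFS_BACKEND_SETTINGS are the libguestfs environment variables.

```sh
# Run a command with the libguestfs appliance forced to TCG and "nosmp" added
# to the appliance kernel command line. An existing LIBGUESTFS_APPEND value is
# kept and "nosmp" is appended to it; LIBGUESTFS_BACKEND_SETTINGS is overwritten.
# (run_with_tcg_nosmp is a hypothetical wrapper for illustration.)
run_with_tcg_nosmp() {
    env LIBGUESTFS_BACKEND_SETTINGS=force_tcg \
        LIBGUESTFS_APPEND="${LIBGUESTFS_APPEND:+$LIBGUESTFS_APPEND }nosmp" \
        "$@"
}
```

For instance, run_with_tcg_nosmp libguestfs-test-tool would show whether the appliance boots as a uniprocessor guest under TCG.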

Comment 48 Richard W.M. Jones 2023-07-03 14:00:50 UTC

One possibility is to modify the guestfs_impl_set_smp function in libguestfs
so it ignores the call if it knows TCG is being used.  We're already collecting
that information in libguestfs, I think.

Having said that I'm sort of confident I might be able to fix or at least
understand this bug with a bit more work.

Comment 58 mxie@redhat.com 2023-07-06 13:05:01 UTC

Test the bug with virt-v2v-2.3.4-5.el9.x86_64

Steps:
1. Check the description of 'Starting the libvirt system instance' in the virt-v2v man page
# man virt-v2v |grep 'Starting the libvirt system instance' -A 12 -B 7

-i libvirt
Set the input method to libvirt. This is the default.

In this mode you have to specify a libvirt guest name or UUID on the command line. You may also specify a
libvirt connection URI (see -ic).

See "Starting the libvirt system instance" below.

.....

-o libvirt
Set the output method to libvirt. This is the default.

In this mode, the converted guest is created as a libvirt guest. You may also specify a libvirt connection
URI (see -oc).

See "Starting the libvirt system instance" below, and virt-v2v-output-local(1).

-o local
Set the output method to local.

In this mode, the converted guest is written to a local directory specified by -os /dir (the directory must
exist). The converted guest’s disks are written as:

/dir/name-sda
/dir/name-sdb
[etc]

and a libvirt XML file is created containing guest metadata:
--
When using -o libvirt, you may need to run virt-v2v as root so that it can write to the libvirt system
instance (ie. "qemu:///system") and to the default location for disk images (usually
/var/lib/libvirt/images).

You can avoid this by setting up libvirt connection authentication, see http://libvirt.org/auth.html.
Alternatively, use -oc qemu:///session, which will write to your per-user libvirt instance.

See also "Starting the libvirt system instance".

Writing to Openstack
....
---
....

Starting the libvirt system instance
Failed to connect socket to '/var/run/libvirt/virtqemud-sock': No such file or directory
Failed to connect socket to '/var/run/libvirt/virtqemud-sock-ro': Connection refused

If you have just installed libvirt and virt-v2v, then you may see the errors above. This is caused by libvirt
daemons that provide various services not running straight after installation. (This may depend on your
distribution and vendor presets).

To fix this on systemd-based distributions, do:

systemctl isolate multi-user.target

See also https://bugzilla.redhat.com/2182024.

2. Stop and start the virtqemud.socket unit, then execute the virt-v2v command
# systemctl stop virtqemud.socket
# systemctl start virtqemud.socket
# virt-v2v -ic vpx://administrator%40vsphere.local.213.93/data/10.73.212.38/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=1B:83:D8:5A:33:31:62:DB:BA:9E:73:6D:A8:29:14:48:3F:82:F6:FD -ip /home/passwd esx7.0-win10-x64-ltsc-2021 -o null
virt-v2v: error: exception: libvirt: VIR_ERR_SYSTEM_ERROR: VIR_FROM_RPC:
Failed to connect socket to '/var/run/libvirt/virtqemud-sock-ro':
Connection refused

If reporting bugs, run virt-v2v with debugging enabled and include the
complete output:

virt-v2v -v -x [...]

3. Start the virtqemud-ro.socket unit, as indicated by the error in step 2, then execute the virt-v2v command again
# systemctl start virtqemud-ro.socket
# virt-v2v -ic vpx://administrator%40vsphere.local.213.93/data/10.73.212.38/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk8.0.1 -io vddk-thumbprint=1B:83:D8:5A:33:31:62:DB:BA:9E:73:6D:A8:29:14:48:3F:82:F6:FD -ip /home/passwd esx7.0-win10-x64-ltsc-2021 -o null
[ 1.1] Setting up the source: -i libvirt -ic vpx://administrator%40vsphere.local.213.93/data/10.73.212.38/?no_verify=1 -it vddk esx7.0-win10-x64-ltsc-2021
[ 2.9] Opening the source
[ 9.0] Inspecting the source
[ 14.8] Checking for sufficient free disk space in the guest
[ 14.8] Converting Windows 10 Enterprise LTSC 2021 to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 22.7] Mapping filesystem data to avoid copying unused and blank areas
[ 24.6] Closing the overlay
[ 24.9] Assigning disks to buses
[ 24.9] Checking if the guest needs BIOS or UEFI to boot
[ 24.9] Setting up the destination: -o null
[ 26.4] Copying disk 1/1
█ 100% [****************************************]
[ 433.2] Creating output metadata
[ 433.2] Finishing off

Result:
The bug is fixed not only in the virt-v2v man page; the error reported during v2v conversion is improved as well.

Comment 59 Laszlo Ersek 2023-07-06 14:39:02 UTC

Thank you for the thorough verification.

Comment 65 mxie@redhat.com 2023-07-12 02:53:01 UTC

Fixed version unchanged, so move the bug from ON_QA to VERIFIED

Comment 67 errata-xmlrpc 2023-11-07 08:28:57 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt-v2v bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6376
