Created attachment 2026348 [details]
virt-builder verbose output

Description of problem:
When I run virt-builder on my laptop, it hangs until I press Ctrl+C.

[smayhew@smayhew-thinkpadp1gen4i ~]$ virt-builder fedora-39 -o f39.qcow2 --format qcow2 --size 20G
[   4.2] Downloading: http://builder.libguestfs.org/fedora-39.xz
[   5.0] Planning how to build this image
[   5.0] Uncompressing
[   7.1] Resizing (using virt-resize) to expand the disk to 20.0G
^Cvirt-builder: Exiting on signal SIGINT

Version-Release number of selected component (if applicable):
libguestfs-1.50.1-1.fc38.x86_64
guestfs-tools-1.50.1-1.fc38.x86_64

Additional info:
If I run libguestfs-test-tool, it also hangs in the "libguestfs: launch libvirt guest" step until the timeout hits. If I run libguestfs-test-tool with "LIBGUESTFS_BACKEND=direct", it works.

Attaching verbose output from virt-builder as well as output from libguestfs-test-tool.
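For anyone else hitting this, the workaround is a sketch of the variable mentioned above; it bypasses libvirt entirely, so both commands should complete instead of hanging:

$ export LIBGUESTFS_BACKEND=direct
$ libguestfs-test-tool
$ virt-builder fedora-39 -o f39.qcow2 --format qcow2 --size 20G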
As the hang requires libvirt, it's likely a libvirt bug (although it could also be a qemu bug).
Works for me on RHEL 9:

Version:
libvirt-10.0.0-6.1.el9_4.x86_64
qemu-kvm-8.2.0-11.el9_4.x86_64
guestfs-tools-1.51.6-2.el9.x86_64
libguestfs-1.50.1-7.el9.x86_64

$ virt-builder fedora-39 -o f39.qcow2 --format qcow2 --size 20G
[   4.6] Downloading: http://builder.libguestfs.org/fedora-39.xz
######################################## 100.0%
[ 106.1] Planning how to build this image
[ 106.1] Uncompressing
[ 108.3] Resizing (using virt-resize) to expand the disk to 20.0G
[ 151.2] Opening the new disk
[ 156.0] Setting a random seed
[ 156.0] Setting passwords
virt-builder: Setting random password of root to Z1kCaiBKeAhIL0yZ
[ 157.0] SELinux relabelling
[ 168.2] Finishing off
                   Output file: f39.qcow2
                   Output size: 20.0G
                 Output format: qcow2
            Total usable space: 19.9G
                    Free space: 18.5G (93%)

And fails on Fedora rawhide:

Version:
libvirt-10.2.0-2.fc41.x86_64
qemu-kvm-8.2.2-2.fc41.x86_64
guestfs-tools-1.52.0-4.fc41.x86_64
libguestfs-1.52.0-8.fc41.x86_64

$ virt-builder fedora-39 -o f39.qcow2 --format qcow2 --size 20G
[   6.1] Downloading: http://builder.libguestfs.org/fedora-39.xz
######################################## 100.0%
[  95.0] Planning how to build this image
[  95.0] Uncompressing
[  98.1] Resizing (using virt-resize) to expand the disk to 20.0G
[ 142.8] Opening the new disk
virt-builder: error: libguestfs error: could not create appliance through libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: Child process (passt --one-off --socket /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket --pid /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0-passt.pid --address 169.254.2.15 --netmask 16) unexpected exit status 1: No IPv6 nameserver available for NDP/DHCPv6
Template interface: eno1 (IPv4), eno1 (IPv6)
MAC:
    host: 2c:ea:7f:7a:0e:59
DHCP:
    assign: 169.254.2.15
    mask: 255.255.0.0
    router: XXXXX
DNS:
    XXXXX
DNS search list:
    XXXXX
NDP/DHCPv6:
    assign: 2620:52:0:4972:2eea:7fff:fe7a:e59
    router: fe80::f24b:3a02:8b8f:69a1
    our link-local: fe80::2eea:7fff:fe7a:e59
DNS search list:
    XXXXX
UNIX domain socket bound at /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket

You can now start qemu (>= 7.2, with commit 13c6be96618c):
    kvm ... -device virtio-net-pci,netdev=s -netdev stream,id=s,server=off,addr.type=unix,addr.path=/home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket
or qrap, for earlier qemu versions:
    ./qrap 5 kvm ... -net socket,fd=5 -net nic,model=virtio
PID file open: Permission denied [code=1 int1=-1]

If reporting bugs, run virt-builder with debugging enabled and include the complete output:
    virt-builder -v -x [...]
The error is from SELinux:

Apr 12 09:04:40 dell-per440-16 audit[99814]: AVC avc: denied { search } for pid=99814 comm="passt.avx2" name=".cache" dev="sda3" ino=22637843 scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:cache_home_t:s0 tclass=dir permissive=0
[the same AVC denial is logged three more times]
Apr 12 09:04:40 dell-per440-16 passt[99814]: UNIX domain socket bound at /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket
Apr 12 09:04:40 dell-per440-16 passt[99814]: You can now start qemu (>= 7.2, with commit 13c6be96618c):
Apr 12 09:04:40 dell-per440-16 passt[99814]:   kvm ... -device virtio-net-pci,netdev=s -netdev stream,id=s,server=off,addr.type=unix,addr.path=/home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket
Apr 12 09:04:40 dell-per440-16 passt[99814]: or qrap, for earlier qemu versions:
Apr 12 09:04:40 dell-per440-16 passt[99814]:   ./qrap 5 kvm ... -net socket,fd=5 -net nic,model=virtio
Apr 12 09:04:40 dell-per440-16 virtqemud[99020]: internal error: Child process (passt --one-off --socket /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket --pid /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0-passt.pid --address 169.254.2.15 --netmask 16) unexpected exit status 1: No IPv6 nameserver available for NDP/DHCPv6
    Template interface: eno1 (IPv4), eno1 (IPv6)
    MAC:
        host: 2c:ea:7f:7a:0e:59
    DHCP:
        assign: 169.254.2.15
        mask: 255.255.0.0
        router: 10.73.115.254
    DNS:
        10.73.115.254
    DNS search list:
        lab.eng.pek2.redhat.com
    NDP/DHCPv6:
        assign: 2620:52:0:4972:2eea:7fff:fe7a:e59
        router: fe80::f24b:3a02:8b8f:69a1
        our link-local: fe80::2eea:7fff:fe7a:e59
    DNS search list:
        lab.eng.pek2.redhat.com
    UNIX domain socket bound at /home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket
    You can now start qemu (>= 7.2, with commit 13c6be96618c):
        kvm ... -device virtio-net-pci,netdev=s -netdev stream,id=s,server=off,addr.type=unix,addr.path=/home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0.socket
    or qrap, for earlier qemu versions:
        ./qrap 5 kvm ... -net socket,fd=5 -net nic,model=virtio
    PID file open: Permission denied
Apr 12 09:04:42 dell-per440-16 systemd[1]: Starting setroubleshootd.service - SETroubleshoot daemon for processing new SELinux denial logs...
Apr 12 09:04:42 dell-per440-16 systemd[1]: Started setroubleshootd.service - SETroubleshoot daemon for processing new SELinux denial logs.
[systemd SERVICE_START audit records and unrelated ceph-alertmanager "connection refused" log lines omitted]
Apr 12 09:04:44 dell-per440-16 SetroubleshootPrivileged.py[99835]: failed to retrieve rpm info for path '/var/lib/selinux/targeted/active/modules/200/passt':
Apr 12 09:04:44 dell-per440-16 setroubleshoot[99818]: SELinux is preventing passt.avx2 from search access on the directory .cache. For complete SELinux messages run: sealert -l 63cacc91-aa16-4779-9992-84f55fb4d808
Apr 12 09:04:44 dell-per440-16 setroubleshoot[99818]: SELinux is preventing passt.avx2 from search access on the directory .cache.
    ***** Plugin catchall (100. confidence) suggests **************************
    If you believe that passt.avx2 should be allowed search access on the .cache directory by default.
    Then you should report this as a bug.
    You can generate a local policy module to allow this access.
    Do allow this access for now by executing:
    # ausearch -c 'passt.avx2' --raw | audit2allow -M my-passtavx2
    # semodule -X 300 -i my-passtavx2.pp
[the same setroubleshoot message pair is logged three more times]
Hi @yalzhang, the above is a passt SELinux issue. Please help check whether it is a bug.

passt & SELinux versions for RHEL 9:
passt-0^20231204.gb86afe3-1.el9.x86_64
selinux-policy-38.1.35-2.el9_4.noarch

passt & SELinux versions for Fedora rawhide:
passt-0^20240405.g954589b-1.fc41.x86_64
selinux-policy-40.15-1.fc41.noarch
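If it helps, the shipped policy can be queried directly for the denied access (a sketch; sesearch comes from setools-console, and passt_t/cache_home_t are the types taken from the AVCs above — no output would suggest the policy indeed lacks the rule):

$ sesearch -A -s passt_t -t cache_home_t -c dir -p search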
The permission issue in comment 3 — passt trying to access "/home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0-passt.pid" — is not a bug; it is by design.

The problem comes from the fact that the XDG_RUNTIME_DIR environment variable is not set in your environment:

# echo $XDG_RUNTIME_DIR
#

libvirt uses that variable to decide where sockets and PID files should be stored. If it is not set, it falls back to ~/.cache instead. This is described in general terms at:
https://access.redhat.com/solutions/6634751

Try switching to the unprivileged user with:

$ machinectl shell hhan@

That should set the directory, and passt will then use these paths instead:

/run/user/$UID/libvirt/qemu/run/passt/3-rhel-net0.socket
/run/user/$UID/libvirt/qemu/run/passt/3-rhel-net0-passt.pid
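A quick sanity check that the new session sets the variable (a sketch; the UID path below is just an example) — if it prints a /run/user path, passt's runfiles will land there instead of under ~/.cache:

$ machinectl shell hhan@
$ echo $XDG_RUNTIME_DIR
/run/user/1000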
(In reply to yalzhang from comment #5)
> The permission issue in comment 3 — passt trying to access
> "/home/hhan/.cache/libvirt/qemu/run/passt/3-guestfs-yqpgltwh1z3j-net0-passt.pid"
> — is not a bug; it is by design.
>
> The problem comes from the fact that the XDG_RUNTIME_DIR environment
> variable is not set in your environment, and libvirt uses that variable to
> decide where sockets and PID files should be stored. If it is not set, it
> falls back to ~/.cache instead. This is described in general terms at:
> https://access.redhat.com/solutions/6634751
>
> Try switching to the unprivileged user with:
> $ machinectl shell hhan@
> That should set the directory, and passt will then use these paths instead:
> /run/user/$UID/libvirt/qemu/run/passt/3-rhel-net0.socket
> /run/user/$UID/libvirt/qemu/run/passt/3-rhel-net0-passt.pid

Thanks for your reply. It works for me now with machinectl on Fedora rawhide.
I believe I am seeing this with guestfish:

$ export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1
$ guestfish -a Fedora-Server-KVM-38-1.6.x86_64.qcow2 -i
...
libguestfs: launch libvirt guest
\x1bc\x1b[?7l\x1b[2J\x1b[0mSeaBIOS (version 1.16.2-1.fc38)
Machine UUID 67c38d36-c16b-4dc7-aa20-6c474da151ef
Booting from ROM..\x1bc\x1b[?7l\x1b[2J^Z
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Hangs here.
[1]+  Stopped                 guestfish -a Fedora-Server-KVM-38-1.6.x86_64.qcow2 -i
$ kill -s SIGTERM %1
[1]+  Terminated              guestfish -a Fedora-Server-KVM-38-1.6.x86_64.qcow2 -i

==

Related services are running:

$ systemctl list-units -a \*vir\*.service
  UNIT              LOAD   ACTIVE SUB     DESCRIPTION
  libvirtd.service  loaded active running Virtualization daemon
  virtlockd.service loaded active running Virtual machine lock manager
  virtlogd.service  loaded active running Virtual machine log manager

==

$ rpm -qa libguestfs libvirt-daemon libvirt-libs qemu-kvm | sort
libguestfs-1.50.1-1.fc38.x86_64
libvirt-daemon-9.0.0-5.fc38.x86_64
libvirt-libs-9.0.0-5.fc38.x86_64
qemu-kvm-7.2.10-1.fc38.x86_64

$ uname -r
6.8.5-101.fc38.x86_64
Unfortunately, without further diagnosis of where precisely it's hanging, there's not much we can do here. You might try choosing different kernels and versions of qemu to see if you can isolate which kernel or qemu first caused the problem:

https://libguestfs.org/guestfs-faq.1.html#broken-kernel-or-trying-a-different-kernel
https://libguestfs.org/guestfs-faq.1.html#broken-qemu-or-trying-a-different-qemu

(Note: rm -rf /var/tmp/.guestfs-* after changing environment variables.)
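Concretely, following those FAQ entries looks roughly like this (a sketch — the kernel version and qemu path are placeholders; SUPERMIN_KERNEL/SUPERMIN_MODULES and LIBGUESTFS_HV are the knobs the FAQ documents):

# Try a specific installed kernel for the appliance:
$ export SUPERMIN_KERNEL=/boot/vmlinuz-6.7.10-100.fc38.x86_64
$ export SUPERMIN_MODULES=/lib/modules/6.7.10-100.fc38.x86_64
$ rm -rf /var/tmp/.guestfs-*       # discard the cached appliance
$ libguestfs-test-tool

# Try a different qemu binary:
$ export LIBGUESTFS_HV=/path/to/other/qemu-system-x86_64
$ rm -rf /var/tmp/.guestfs-*
$ libguestfs-test-tool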
The qcow2 image is read-only, but the guestfish "--ro" option seems to be required anyway.

Removed:
/var/tmp/.guestfs-*
~/.cache/libvirt/
~/.cache/libvirt-sandbox/

Copied Fedora-Server-KVM-38-1.6.x86_64.qcow2 to /tmp/test1/ and made it read-only:

$ ll -n /tmp/test1/Fedora-Server-KVM-38-1.6.x86_64.qcow2
-r--r--r--. 1 1000 1000 629276672 Apr 13  2023 /tmp/test1/Fedora-Server-KVM-38-1.6.x86_64.qcow2

$ guestfish -a Fedora-Server-KVM-38-1.6.x86_64.qcow2 -i
libguestfs: error: could not create appliance through libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while connecting to monitor: 2024-04-16T13:39:02.991772Z qemu-kvm: -blockdev {"node-name":"libvirt-2-format","read-only":false,"cache":{"direct":false,"no-flush":false},"driver":"qcow2","file":"libvirt-2-storage","backing":null}: Could not open '/tmp/test1/Fedora-Server-KVM-38-1.6.x86_64.qcow2': Permission denied [code=1 int1=-1]

With the "--ro" option added to the command line, there is no error and no hang:

$ guestfish --ro -a Fedora-Server-KVM-38-1.6.x86_64.qcow2 -i
Welcome to guestfish, the guest filesystem shell for ...
Operating system: Fedora Linux 38 (Server Edition)
/dev/sysvg/root mounted on /
/dev/sda2 mounted on /boot
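An alternative when the image itself must stay untouched but libvirt needs write access (not from this thread — a standard qemu-img technique, sketched here): put a throwaway qcow2 overlay on top of the read-only image and open the overlay read-write instead:

$ qemu-img create -f qcow2 -b Fedora-Server-KVM-38-1.6.x86_64.qcow2 -F qcow2 /tmp/overlay.qcow2
$ guestfish -a /tmp/overlay.qcow2 -i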
(In reply to Steve from comment #7)
...
> Booting from ROM..\x1bc\x1b[?7l\x1b[2J^Z
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Hangs here.
...

That isn't quite accurate: "top" showed CPU usage was at 100%, and strace showed repeated calls to "poll" and failed attempts to connect to a socket. I realize that is vague, but the problem is not occurring now, so I will regard my problem report as resolved.
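For the record, a generic way to capture that kind of spin next time (a sketch; the process name and looping syscalls may differ):

$ strace -f -p "$(pgrep -n guestfish)" -e trace=poll,connect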
(In reply to Scott Mayhew from comment #0)
...
> ... it hangs until I press Ctrl+C. ...

In my case, pressing Ctrl+C did not work, which is why I suspended guestfish and used the "kill" command, as shown in comment 7.
(In reply to Richard W.M. Jones from comment #8)
> Unfortunately, without further diagnosis of where precisely it's hanging,
> there's not much we can do here. You might try choosing different kernels
> and versions of qemu to see if you can isolate which kernel or qemu first
> caused the problem:
>
> https://libguestfs.org/guestfs-faq.1.html#broken-kernel-or-trying-a-different-kernel
> https://libguestfs.org/guestfs-faq.1.html#broken-qemu-or-trying-a-different-qemu
>
> (Note: rm -rf /var/tmp/.guestfs-* after changing environment variables.)

Thanks, Rich. This started working for me again after a dnf upgrade. There weren't any qemu packages updated, so I focused on testing different kernels. The hang occurs with 6.8.4-100. The other kernels I have installed (6.7.7-100, 6.7.10-100, and 6.8.6-100) all work fine.
Let's move this to kernel. There is a known bug in current upstream kernels, but I don't think it applies to 6.8.4, and it's a crasher, not a hang. If 6.8.6-100 works, then we can probably close this bug on the assumption that the issue has been fixed.
Fedora Linux 38 entered end-of-life (EOL) status on 2024-05-21.

Fedora Linux 38 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux, please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field.

If you are unable to reopen this bug, please file a new report against an active release.

Thank you for reporting this bug and we are sorry it could not be fixed.