A scratch build of libguestfs failed for aarch64:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19612808

I talked with Richard Jones and he mentioned that it's an actual bug in the kernel/qemu/AAVMF stack: this kernel doesn't boot on qemu on this architecture. Unfortunately there's nothing useful in the logs, so someone will probably need to fire up a RHEL 8 instance in Beaker to try to diagnose the issue.
kernel 4.18.0-57.el8
qemu-kvm 15:2.12.0-50.module+el8+2596+0a642e54
Created attachment 1516625 [details] root.log for package versions etc
Created attachment 1516626 [details]
build.log

Not much in the build log because it fails silently, so it's most likely actually an AAVMF thing. Attaching it anyway.
Based on the hw_info.log it appears this is on an HP Moonshot. I just tried to provision a couple of them with rhel8, but the machines can't get IPMI connections. I wish we would stop using these unsupported machines. Lately they seem to be more trouble than they're worth.
As per internal discussion on 8Dev, my understanding is that we're trying to determine whether this only happens on HP Moonshot? And if so, the solution will be to stop using those machines for build? I assume brew has other machines than HP Moonshots, right? In this case, what happens if we keep trying a scratch build until brew picks a different machine?
(In reply to Luiz Capitulino from comment #5)
> As per internal discussion on 8Dev, my understanding is that we're trying
> to determine whether this only happens on HP Moonshot? And if so, the
> solution will be to stop using those machines for build?
>
> I assume brew has other machines than HP Moonshots, right? In this case,
> what happens if we keep trying a scratch build until brew picks a
> different machine?

This is a no-go. This affects regular builds, not only scratch builds. It also happens when building modules. Building a module can take a couple of hours, and all packages need to build successfully; if one package fails to build, the whole module build is rejected. That prevents us from releasing other packages.
(In reply to Danilo Cesar de Paula from comment #8)
> (In reply to Luiz Capitulino from comment #5)
> > As per internal discussion on 8Dev, my understanding is that we're
> > trying to determine whether this only happens on HP Moonshot? And if
> > so, the solution will be to stop using those machines for build?
> >
> > I assume brew has other machines than HP Moonshots, right? In this
> > case, what happens if we keep trying a scratch build until brew picks
> > a different machine?
>
> This is a no-go.

Oh, I didn't suggest this as a solution. It was just one way to confirm that the problem only happens on HP Moonshot.
I was able to provision hp-moonshot-03-c36.lab.eng.rdu.redhat.com with the latest RHEL 8. I didn't see any problems booting the bare metal or installing a RHEL 8 guest VM on it. So maybe the problem is brew-server specific?

PS: serial console access does indeed have trouble, but that only happens on the RDU servers, not on the Boston Moonshot servers.

-Wei
I am also _unable_ to reproduce this on HP Moonshot.
Guys,

Is it possible that there's something that's specific to certain HP Moonshots that could be causing this? Say, a chipset or something? Do you think it is worth spending time debugging this?
Danilo,

Have you tried doing the scratch build again, maybe until brew picks a different machine, to see if that works? Note that I'm not suggesting this as a solution, just to help show that the problem is somehow specific to that system.
Latest build failed again, in the same way, and only on aarch64:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19703675

I'm not sure how to tell from the hwinfo what hardware that is, although I'm not sure this is really hardware-related at all anyway.
(In reply to Richard W.M. Jones from comment #16)
> Latest build failed again, in the same way, and only on aarch64:
>
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19703675
>
> I'm not sure how to tell from the hwinfo what hardware that is, although
> I'm not sure this is really hardware-related at all anyway.

Another important detail: the libguestfs test suite is using tcg. Is it at all possible to force it to use kvm to see what happens?

But in any case, we have to unblock the build. I think we're going to go with that hack we talked about at the last meeting: skip make check for arm until we figure out what's wrong. Do you think we can do that?
The problem is we'll build something that we know doesn't work. Might as well ExcludeArch aarch64 in that case. The test is there for a reason, and the fact that the kernel doesn't boot on qemu is some problem with the kernel or qemu (and unlikely to be anything to do with libguestfs).

I'm currently attempting to diagnose this further, but my efforts are stymied by being unable to do scratch builds at all because of a series of interlinked breakages in our tooling.
(In reply to Richard W.M. Jones from comment #18)
> The problem is we'll build something that we know doesn't work. Might as
> well ExcludeArch aarch64 in that case. The test is there for a reason,
> and the fact that the kernel doesn't boot on qemu is some problem with
> the kernel or qemu (and unlikely to be anything to do with libguestfs).

OK, makes sense. But what about trying with kvm? It seems to be using tcg, so is it possible it's a tcg-related issue? Btw, when you tried to reproduce in comment 12, did you use tcg as well?

> I'm currently attempting to diagnose this further, but my efforts are
> stymied by being unable to do scratch builds at all because of a series
> of interlinked breakages in our tooling.

Thanks a lot, that's going to be very helpful.
First of all I can very reliably reproduce this in Koji (not elsewhere). I temporarily changed the backend from "libvirt" to "direct"[1]. The test now succeeds, which suggests that it's something to do with the qemu command line which libvirt constructs. The command lines. Of course exactly what the important difference is, that's very hard indeed to say. libvirt (fails): LC_ALL=C LD_LIBRARY_PATH=/builddir/build/BUILD/libguestfs-1.38.4/ruby/ext/guestfs:/builddir/build/BUILD/libguestfs-1.38.4/lib/.libs:/builddir/build/BUILD/libguestfs-1.38.4/java/.libs:/builddir/build/BUILD/libguestfs-1.38.4/gobject/.libs PATH=/builddir/build/BUILD/libguestfs-1.38.4/v2v:/builddir/build/BUILD/libguestfs-1.38.4/tools:/builddir/build/BUILD/libguestfs-1.38.4/test-tool:/builddir/build/BUILD/libguestfs-1.38.4/sysprep:/builddir/build/BUILD/libguestfs-1.38.4/sparsify:/builddir/build/BUILD/libguestfs-1.38.4/resize:/builddir/build/BUILD/libguestfs-1.38.4/rescue:/builddir/build/BUILD/libguestfs-1.38.4/p2v:/builddir/build/BUILD/libguestfs-1.38.4/make-fs:/builddir/build/BUILD/libguestfs-1.38.4/inspector:/builddir/build/BUILD/libguestfs-1.38.4/get-kernel:/builddir/build/BUILD/libguestfs-1.38.4/fuse:/builddir/build/BUILD/libguestfs-1.38.4/format:/builddir/build/BUILD/libguestfs-1.38.4/fish:/builddir/build/BUILD/libguestfs-1.38.4/erlang:/builddir/build/BUILD/libguestfs-1.38.4/edit:/builddir/build/BUILD/libguestfs-1.38.4/diff:/builddir/build/BUILD/libguestfs-1.38.4/dib:/builddir/build/BUILD/libguestfs-1.38.4/df:/builddir/build/BUILD/libguestfs-1.38.4/customize:/builddir/build/BUILD/libguestfs-1.38.4/cat:/builddir/build/BUILD/libguestfs-1.38.4/builder:/builddir/build/BUILD/libguestfs-1.38.4/align:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/sbin HOME=/builddir USER=mockbuild LOGNAME=mockbuild QEMU_AUDIO_DRV=none TMPDIR=/builddir/build/BUILD/libguestfs-1.38.4/tmp /usr/libexec/qemu-kvm -name guest=guestfs-6e9ikzf2eardkjpv,debug-threads=on -S -object 
secret,id=masterKey0,format=raw,file=/builddir/.config/libvirt/qemu/lib/domain-1-guestfs-6e9ikzf2eard/master-key.aes -machine virt-rhel7.6.0,accel=tcg,usb=off,dump-guest-core=off,gic-version=2 -cpu cortex-a57 -drive file=/usr/share/AAVMF/AAVMF_CODE.verbose.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfsYNOkt3/vars2.fd,if=pflash,format=raw,unit=1 -m 768 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 78f54c12-4f91-4b7d-aedb-5b7d8e3edd47 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=20,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-reboot -no-acpi -boot strict=on -kernel /builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/kernel -initrd /builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/initrd -append 'panic=1 console=ttyAMA0 earlyprintk=pl011,0x9000000 ignore_loglevel efi-rtc=noprobe edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=vt100' -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 -device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 -drive file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfsYNOkt3/scratch1.img,format=raw,if=none,id=drive-scsi0-0-0-0,cache=unsafe -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -drive 
file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfsYNOkt3/overlay3.qcow2,format=qcow2,if=none,id=drive-scsi0-0-1-0,cache=unsafe -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0,write-cache=on -chardev socket,id=charserial0,path=/tmp/libguestfsgP02RS/console.sock -serial chardev:charserial0 -chardev socket,id=charchannel0,path=/tmp/libguestfsgP02RS/guestfsd.sock -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on direct (succeeds): /usr/libexec/qemu-kvm \ -global virtio-blk-pci.scsi=off \ -enable-fips \ -nodefaults \ -display none \ -machine virt,accel=kvm:tcg \ -cpu cortex-a57 \ -m 768 \ -no-reboot \ -rtc driftfix=slew \ -drive if=pflash,format=raw,file=/usr/share/AAVMF/AAVMF_CODE.verbose.fd,readonly \ -drive if=pflash,format=raw,file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfs6AbJjz/vars3.fd \ -kernel /builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/kernel \ -initrd /builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/initrd \ -object rng-random,filename=/dev/urandom,id=rng0 \ -device virtio-rng-pci,rng=rng0 \ -device virtio-scsi-pci,id=scsi \ -drive file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfs6AbJjz/scratch1.img,cache=unsafe,format=raw,id=hd0,if=none \ -device scsi-hd,drive=hd0 \ -drive file=/builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \ -device scsi-hd,drive=appliance \ -device virtio-serial-pci \ -serial stdio \ -chardev socket,path=/tmp/libguestfsUhWnPW/guestfsd.sock,id=channel0 \ -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \ -append "panic=1 console=ttyAMA0 
earlyprintk=pl011,,0x9000000 ignore_loglevel efi-rtc=noprobe edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=vt100" [1] http://libguestfs.org/guestfs.3.html#backend
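For reference, the backend switch used above ("libvirt" vs. "direct", documented in [1]) is selected through an environment variable that libguestfs reads when a handle is launched. A minimal illustrative sketch (the comment about a real run is an assumption; no guestfs handle is actually created here):

```python
import os

# Select the libguestfs backend for any libguestfs handle launched by this
# process: "direct" makes libguestfs exec qemu itself, "libvirt" goes
# through libvirtd.
os.environ["LIBGUESTFS_BACKEND"] = "direct"
print("backend:", os.environ["LIBGUESTFS_BACKEND"])
# With python3-libguestfs installed, creating and launching a guestfs
# handle now would use the direct backend (not done in this sketch).
```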
(In reply to Luiz Capitulino from comment #19)
> OK, makes sense. But what about trying with kvm? It seems to be using
> tcg, so is it possible it's a tcg-related issue?

The problem is that KVM isn't available, so it'll use TCG; we can't try KVM to see if that would make a difference.

In comment 20 above it's using TCG in both cases, and in one case it *works*, which IMHO excludes TCG as a cause.

> Btw, when you tried to reproduce on comment 12 did you use tcg as well?

Yes, I've tried all sorts:

- Running in L0 with L1 KVM appliance
- Running in L0 with L1 TCG appliance
- Running in L1 with L2 TCG appliance

and I've not been able to reproduce it on the Moonshot at all under any scenario. Since this is not using KVM, I doubt the hardware makes any difference.
(In reply to Richard W.M. Jones from comment #21)
> (In reply to Luiz Capitulino from comment #19)
> > OK, makes sense. But what about trying with kvm? It seems to be using
> > tcg, so is it possible it's a tcg-related issue?
>
> The problem is that KVM isn't available, so it'll use TCG; we can't try
> KVM to see if that would make a difference.
>
> In comment 20 above it's using TCG in both cases, and in one case it
> *works*, which IMHO excludes TCG as a cause.

Yeah, I agree. Your debugging work on comment 20 supersedes my question :)

So, what are we down to now? I can think of the following:

1. Some "environmental" issue relating to libvirt, ie. some lib that libvirt needs is unavailable
2. libvirt is passing some bad command-line to qemu
3. libvirt is using some command-line option that in turn is triggering a bug in qemu
Here is the successful build using LIBGUESTFS_BACKEND=direct:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19712035

(In reply to Luiz Capitulino from comment #22)
> 1. Some "environmental" issue relating to libvirt, ie. some lib that
> libvirt needs is unavailable

We know that libvirt does run qemu, so that would seem to exclude a missing component. We know that no messages are seen from qemu in the failure case. Normally we'd expect to see:

add-symbol-file /builddir/build/BUILD/edk2-ee3198e672e2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPlatformPkg/PrePeiCore/PrePeiCoreUniCore/DEBUG/ArmPlatformPrePeiCore.dll 0x1800
add-symbol-file /builddir/build/BUILD/edk2-ee3198e672e2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Pei/PeiMain/DEBUG/PeiCore.dll 0x7980
UEFI firmware starting.

which are messages from AAVMF, but we do not see those (or else the console is broken somehow, but that seems unlikely).

> 2. libvirt is passing some bad command-line to qemu

This seems the strongest possibility.

> 3. libvirt is using some command-line option that in turn is triggering
> a bug in qemu

Indeed, something like -sandbox would be a strong contender here.
Based on the evidence that we have so far (comment 20 and comment 23), this looks like a libvirt issue. Re-assigning. Please re-assign back if you find evidence to the contrary.
The libvirt command line in comment 20 has -no-acpi. RHEL kernels require ACPI. Everything else in the command line looks sane, except for some of the kernel command line parameters in -append (they're obviously for x86, not aarch64), but those are the same in the direct command line, so they cannot be the culprit.
Yes the -no-acpi thing worried me too. However when I ran the same test on our HP Moonshot, I noticed that libvirt is also using -no-acpi there, and it's the same AAVMF and kernel, and it works there. (This bug would be a lot easier to solve if I could reproduce it!)
(In reply to Andrew Jones from comment #25)
> The libvirt command line in comment 20 has -no-acpi. RHEL kernels
> require ACPI.

I just tried with a RHEL 8 guest I had lying around, so it might not be the latest and greatest, but it booted just fine after I removed the <acpi/> element from the corresponding guest XML... Looks like we're not really requiring ACPI after all?

Either way, AAVMF and GRUB should come up regardless of what the kernel requirements are, and that's not the case here. I'm trying to make sense of this but I don't have any specific insights yet. I'll update the bug as I find out more.
I've compared the command lines from comment 20.

(01) There's a difference in the kernel command line ("-append") option. While I think this difference is irrelevant, it's worth pointing out in general.

     libvirt: "earlyprintk=pl011,0x9000000"
     direct:  "earlyprintk=pl011,,0x9000000"

(02) The libvirt backend starts the domain with "-S" (i.e., paused). If libguestfs failed to unpause the domain for whatever reason, that would be consistent with the symptom, I believe.

(03) The libvirt backend specifies "-boot strict=on". This should make no difference, as it only generates a HALT entry at the end of the "bootorder" fw_cfg file, and AAVMF simply ignores that (AAVMF behaves as if "-boot strict=on" were present at all times, anyway). Also, this difference in fw_cfg would be encountered much later in the boot process.

(04) Libvirt specifies "-uuid". Most likely irrelevant.

(05) Libvirt specifies "-sandbox", as highlighted above. Could be relevant.

(06) The -machine options are quite different:

     (a) libvirt: "virt-rhel7.6.0", direct: "virt". These are identical in practice.
     (b) libvirt adds "usb=off,dump-guest-core=off". Should be irrelevant.
     (c) libvirt specifies "gic-version=2". This is the default, so it's irrelevant.
     (d) The libvirt backend forces "accel=tcg", while the QEMU backend prefers KVM ("accel=kvm:tcg"). In comment 21, Rich wrote, "In comment 20 above it's using TCG in both cases" -- how do we know that?

(07) The qemu backend specifies "-enable-fips". While I think it should not matter in this case, I do find it pretty interesting -- I'm sure it was added manually for a particular reason, and I wonder why it's OK to not do the same through the libvirt backend.

(08) The qemu backend specifies "-global virtio-blk-pci.scsi=off". Irrelevant, neither backend uses any virtio-blk-pci devices.

(09) The libvirt backend specifies "-msg timestamp=on". Irrelevant (and "on" seems to be the default anyway, according to the QEMU manual).
(10) The libvirt backend sets the guest name, the host-side process name, and the host-side thread names, with "-name ...". Irrelevant.

(11) Libvirt sets "-no-user-config". I don't think it makes a difference in the Brew build environment.

(12) The libvirt backend specifies "-object secret,id=masterKey0,...". This should not matter, as "masterKey0" is not referenced anywhere on the command line.

(13) The libvirt backend generates a PCIe hierarchy with the virtio devices plugged into PCIe root ports. The qemu backend places the virtio devices directly on the root complex. As a result, the libvirt backend's domain will have modern-only virtio devices, while the qemu backend's guest will see transitional virtio devices. This is certainly guest-visible, but should not matter until much later into the boot process.

(14) The SCSI hard disk devices are specified with more details by libvirt. "write-cache" is enabled (which I *think* should be the default anyway), plus "bootindex=1" is set explicitly. Even if it mattered eventually, it shouldn't matter early in the boot process.

(15) There's a difference in how the "appliance" (boot?) disk's backend is configured. QEMU uses a raw disk image (but with "snapshot=on"???), whereas libvirt uses a qcow2 image which seems to be based on another backing file. However... none of this should matter for the early boot hang.

(16) The libvirt backend spells out "base=utc" for "-rtc", but that's the default anyway. Not relevant.

(17) The libvirt backend says "-no-acpi". I think that's a grave divergence from the qemu backend (it is extremely guest-visible), so I think it would be worth reconciling with the QEMU backend. However, it doesn't explain the *early* boot hang.

(18) The libvirt backend says "-realtime mlock=off". The qemu documentation is extremely confusing on this, but the source code helps. Namely, "mlock=on" is the default *IF* a bare-bones "-realtime" option is passed.
However, if "-realtime" itself is not specified, then qemu & the guest RAM are *not* mlocked by default. Therefore, "-realtime mlock=off" only re-states the default, and there is no difference with the qemu backend in this regard.

(19) While it's not immediately obvious, the "guestfsd.sock"-backed virtio serial *port* is set up identically between the QEMU and libvirt backends.

(20) The libvirt backend uses the "console.sock" unix domain socket to expose the guest's PL011 serial port to the host. The QEMU backend uses QEMU's stdin/stdout for the same. Given that the symptom we're seeing is tied to the (emulated PL011, not virtio) serial console, I think this difference could be relevant.

(21) The QEMU backend doesn't set up a monitor connection at all. The libvirt backend does. It uses "fd=20", so I think this is simply libvirtd's internal monitor connection. I think this monitor connection is minimally used to unpause the guest (see point (02)), although I'm unsure how exactly libguestfs asks libvirtd to do that. Still, on the same note as (02), it remains a difference that the QEMU backend lets the guest run at once, while with libvirtd, a separate resume action is needed.

I guess we could start adapting the direct backend's command line, gradually, to see where the issue is triggered first. With reference to:

- (02)/(21) -- start paused (although this could require an entirely new communication pattern between libguestfs and QEMU),
- (05) -- sandbox the emulator,
- (06)(d) -- force TCG,
- (20) -- expose PL011 via a unix domain socket.
(Sorry, I started writing what ended up as comment 30 before seeing Drew's comment 25 and the subsequent comments. That's the only reason I covered those same observations myself -- I didn't mean to ignore the earlier findings.)
(In reply to Laszlo Ersek from comment #30)
> (01) There's a difference in the kernel command line ("-append") option.
> While I think this difference is irrelevant, it's worth pointing out in
> general.
>
> libvirt: "earlyprintk=pl011,0x9000000"
> direct:  "earlyprintk=pl011,,0x9000000"

What's happening here is libguestfs is attempting to quote commas in the -append argument by doubling them, which looks like it's a bug. Filed it:

https://bugzilla.redhat.com/show_bug.cgi?id=1664711

> (02) The libvirt backend starts the domain with "-S" (i.e., paused). If
> libguestfs failed to unpause the domain for whatever reason, that would
> be consistent with the symptom, I believe.

libvirt should be unpausing the domain. There's some reason why libvirt starts it paused which I forget now, but it's all hidden inside libvirt.

> (d) The libvirt backend forces "accel=tcg", while the QEMU backend
> prefers KVM ("accel=kvm:tcg"). In comment 21, Rich wrote, "In comment 20
> above it's using TCG in both cases" -- how do we know that?

We know that because the first output from qemu in direct mode is:

Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize KVM: No such file or directory
qemu-kvm: Back to tcg accelerator

(followed by AAVMF messages).

> (07) The qemu backend specifies "-enable-fips". While I think it should
> not matter in this case, I do find it pretty interesting -- I'm sure it
> was added manually for a particular reason, and I wonder why it's OK to
> not do the same through the libvirt backend.

Yes this is interesting. Libvirt used to add it, apparently it doesn't add it any more. The reason we added it is so that qemu "does the right thing" when FIPS mode is enabled on the host, but note that the option doesn't actually "enable FIPS" so it is misnamed. See fips_enabled in util/osdep.c for the logic.
Oh and by the way, FIPS is _only_ used in qemu upstream to prevent you from using password auth in VNC, and nothing else.

> (11) Libvirt sets "-no-user-config". I don't think it makes a difference
> in the Brew build environment.

Yes I noticed this too - direct mode should set it as well, so it's a bug, but not one that should affect us here.

> (20) The libvirt backend uses the "console.sock" unix domain socket to
> expose the guest's PL011 serial port to the host. The QEMU backend uses
> QEMU's stdin/stdout for the same. Given that the symptom we're seeing is
> tied to the (emulated PL011, not virtio) serial console, I think this
> difference could be relevant.

So my belief is that it's not a serial console failure. Although not seeing any messages would be an obvious symptom of this, the serial console is only used for log messages and nothing important. There is a second connection which is used to pass messages between the guestfsd daemon (inside the appliance) and the library outside, and the test could still succeed as long as this second connection is established and working. However, we don't see an expected debug message from libguestfs when it successfully handshakes with the daemon.

(However, this doesn't discount the possibility that the serial console fails and then later something else fails and the error is not seen because the serial console is broken.)

> I guess we could start adapting the direct backend's command line,
> gradually, to see where the issue is triggered first. With reference to:
> - (02)/(21) -- start paused (although this could require an entirely new
>   communication pattern between libguestfs and QEMU),
> - (05) -- sandbox the emulator,
> - (06)(d) -- force TCG,
> - (20) -- expose PL011 via a unix domain socket.

I'll have a look at some of these today, thanks for your detailed analysis.
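As a side note on the comma-doubling issue from (01): QEMU's option parser treats ",," as an escaped literal comma inside key=value option strings, but "-append" takes the whole kernel command line as a single argument, so applying that escaping there corrupts it. A small illustrative sketch (not libguestfs code; the helper name is made up):

```python
def escape_commas_for_qemu_option(value):
    # Doubling a comma is how a literal ',' is embedded in a QEMU
    # key=value option string (e.g. inside a -drive property value).
    return value.replace(",", ",,")

# Escaping the -append payload (as libguestfs apparently did) mangles the
# earlyprintk address that the kernel will later parse:
append = "earlyprintk=pl011,0x9000000 console=ttyAMA0"
print(escape_commas_for_qemu_option(append))
# -> earlyprintk=pl011,,0x9000000 console=ttyAMA0
```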
As a test, I added the following extra flags to the direct command line, but it still did _not_ fail (in brew), so I guess we can conclude that the problem is NOT related to any of these.

-machine virt,accel=kvm:tcg,usb=off,dump-guest-core=off,gic-version=2 \
-no-user-config \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-no-acpi \
(In reply to Richard W.M. Jones from comment #32)
> (In reply to Laszlo Ersek from comment #30)
> > (d) The libvirt backend forces "accel=tcg", while the QEMU backend
> > prefers KVM ("accel=kvm:tcg"). In comment 21, Rich wrote, "In comment
> > 20 above it's using TCG in both cases" -- how do we know that?
>
> We know that because the first output from qemu in direct mode is:
>
> Could not access KVM kernel module: No such file or directory
> qemu-kvm: failed to initialize KVM: No such file or directory
> qemu-kvm: Back to tcg accelerator
>
> (followed by AAVMF messages).

Note that libvirt correctly detects and reports that KVM support is not available on the host:

<capabilities>
  ...
  <guest>
    <os_type>hvm</os_type>
    <arch name='aarch64'>
      <wordsize>64</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine maxCpus='255'>virt-rhel7.6.0</machine>
      <machine canonical='virt-rhel7.6.0' maxCpus='255'>virt</machine>
      <domain type='qemu'/>
    </arch>
    ...
  </guest>
</capabilities>

Perhaps that's because brew builds happen inside virtual machines already? That wouldn't surprise me. In any case, making sure the test suite requests the same accelerator both when libvirt is used and when it's not sounds like a good idea.
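For what it's worth, "accel=kvm:tcg" just means "KVM if possible, else fall back to TCG", and KVM availability comes down to whether the host exposes /dev/kvm. A rough host-side sketch of that check (an illustration, not qemu's actual probing code):

```python
import os

# If /dev/kvm exists, hardware virtualization is usable and qemu can take
# the "kvm" half of "accel=kvm:tcg"; otherwise it falls back to pure TCG
# emulation -- which is what happens on the KVM-less brew builders.
accel = "kvm" if os.path.exists("/dev/kvm") else "tcg"
print("accel =", accel)
```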
We recently fixed this properly upstream (bug 1605071), but the fix hasn't been included in RHEL 8 yet. In any case I don't believe it's related to this bug.
(In reply to Richard W.M. Jones from comment #36) > We recently fixed this properly upstream (bug 1605071), but the fix hasn't > been included in RHEL 8 yet. In any case I don't believe it's related to > this bug. Yeah, I just thought I'd point that out - and I had forgotten all about Bug 1605071 :)
Dan came up with the excellent suggestion of running gstack periodically to get a stack trace. The stack trace is as follows, repeatedly:

Thread 5 (Thread 0xffffae8bea10 (LWP 19204)):
#0  0x0000ffffbdbd1a00 in nanosleep () from /lib64/libpthread.so.0
#1  0x0000ffffbdf974c0 in g_usleep () from /lib64/libglib-2.0.so.0
#2  0x0000aaaaaae878a0 in qemu_chr_write_buffer ()
#3  0x0000aaaaaae87bd4 in qemu_chr_write ()
#4  0x0000aaaaaad8ae3c in pl011_write ()
#5  0x0000aaaaaac61114 in memory_region_write_accessor ()
#6  0x0000aaaaaac5f4ec in access_with_adjusted_size ()
#7  0x0000aaaaaac63248 in memory_region_dispatch_write ()
#8  0x0000aaaaaac74160 in io_writex.isra ()
#9  0x0000ffffafb55388 in code_gen_buffer ()
#10 0x0000aaaaaac83a68 in cpu_exec ()
#11 0x0000aaaaaac4f170 in qemu_tcg_cpu_thread_fn ()
#12 0x0000ffffbdbc77f8 in start_thread () from /lib64/libpthread.so.0
#13 0x0000ffffbdb1157c in thread_start () from /lib64/libc.so.6

Thread 4 (Thread 0xffffaf0fea10 (LWP 19203)):
#0  0x0000ffffbdb07c5c in ppoll () from /lib64/libc.so.6
#1  0x0000aaaaaaeebb54 in qemu_poll_ns ()
#2  0x0000aaaaaaeedb04 in aio_poll ()
#3  0x0000aaaaaad3e8c4 in iothread_run ()
#4  0x0000ffffbdbc77f8 in start_thread () from /lib64/libpthread.so.0
#5  0x0000ffffbdb1157c in thread_start () from /lib64/libc.so.6

Thread 3 (Thread 0xffffbc3bea10 (LWP 19201)):
#0  0x0000ffffbda73aa0 in sigtimedwait () from /lib64/libc.so.6
#1  0x0000ffffbdbd2224 in sigwait () from /lib64/libpthread.so.0
#2  0x0000aaaaaaeedf6c in sigwait_compat ()
#3  0x0000ffffbdbc77f8 in start_thread () from /lib64/libpthread.so.0
#4  0x0000ffffbdb1157c in thread_start () from /lib64/libc.so.6

Thread 2 (Thread 0xffffbcbcea10 (LWP 19200)):
#0  0x0000ffffbdb0d810 in syscall () from /lib64/libc.so.6
#1  0x0000aaaaaaeeff18 in qemu_event_wait ()
#2  0x0000aaaaaaf00d14 in call_rcu_thread ()
#3  0x0000ffffbdbc77f8 in start_thread () from /lib64/libpthread.so.0
#4  0x0000ffffbdb1157c in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0xffffbd1b77b0 (LWP 19199)):
#0  0x0000ffffbdbd0df4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000ffffbdbc9e98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x0000aaaaaaeef684 in qemu_mutex_lock_impl ()
#3  0x0000aaaaaac4e89c in qemu_mutex_lock_iothread ()
#4  0x0000aaaaaaeecaa8 in main_loop_wait ()
#5  0x0000aaaaaac10eb8 in main ()

Therefore it appears to get stuck in this loop:

https://git.qemu.org/?p=qemu.git;a=blob;f=chardev/char.c;h=ccba36bafb5f0d6fe8306886ad4af30523118e37;hb=HEAD#l111
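The loop linked above retries the chardev write (sleeping via g_usleep) for as long as the backend keeps returning EAGAIN. A minimal sketch of that failure mode, in Python rather than QEMU's C, using a socketpair whose reading end is never drained (capped at a few retries here, where QEMU retries indefinitely):

```python
import socket

# Sketch only (not QEMU code): a writer whose peer never reads eventually
# fills the kernel socket buffer; every further non-blocking write fails
# with EAGAIN, which is the state the gstack trace shows qemu stuck in.
reader, writer = socket.socketpair()
writer.setblocking(False)

buffered = 0
retries = 0
while retries < 3:
    try:
        buffered += writer.send(b"x" * 65536)
    except BlockingIOError:
        # qemu_chr_write_buffer() would g_usleep() here and retry forever
        retries += 1
print("buffered", buffered, "bytes, then hit EAGAIN repeatedly")
```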
I got the strace results. The final lines before the hang are: write(2, "libguestfs: ", 12libguestfs: ) = 12 write(2, "launch libvirt guest", 20launch libvirt guest) = 20 write(2, "\n", 1 ) = 1 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 26152019-01-09 18:30:51.790+0000: 22154: debug : virDomainCreateXML:166 : conn=0xaaaaaaaf7eb0, xmlDesc=<?xml version="1.0"?> <domain type="qemu" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0"> <name>guestfs-kwd20ic8pldsqgx3</name> <memory unit="MiB">768</memory> <currentMemory unit="MiB">768</currentMemory> <cpu> <model>cortex-a57</model> </cpu> <vcpu>1</vcpu> <clock offset="utc"> <timer name="rtc" tickpolicy="catchup"/> <timer name="pit" tickpolicy="delay"/> </clock> <os> <type machine="virt">hvm</type> <loader readonly="yes" type="pflash">/usr/share/AAVMF/AAVMF_CODE.verbose.fd</loader> <nvram>/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfs1ULedD/vars2.fd</nvram> <kernel>/builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/kernel</kernel> <initrd>/builddir/build/BUILD/libguestfs-1.38.4/tmp/.guestfs-996/appliance.d/initrd</initrd> <cmdline>panic=1 console=ttyAMA0 earlyprintk=pl011,0x9000000 ignore_loglevel efi-rtc=noprobe edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=vt100</cmdline> </os> <on_reboot>destroy</on_reboot> <devices> <rng model="virtio"> <backend model="random">/dev/urandom</backend> </rng> <controller type="scsi" index="0" model="virtio-scsi"/> <disk device="disk" type="file"> <source file="/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfs1ULedD/scratch1.img"/> <target dev="sda" bus="scsi"/> <driver name="qemu" type="raw" cache="unsafe"/> <address type="drive" controller="0" bus="0" target="0" unit="0"/> </disk> <disk type="file" device="disk"> <source 
file="/builddir/build/BUILD/libguestfs-1.38.4/tmp/libguestfs1ULedD/overlay3.qcow2"/> <target dev="sdb" bus="scsi"/> <driver name="qemu" type="qcow2" cache="unsafe"/> <address type="drive" controller="0" bus="0" target="1" unit="0"/> </disk> <serial type="unix"> <source mode="connect" path="/tmp/libguestfsUw59Lt/console.sock"/> <target port="0"/> </serial> <channel type="unix"> <source mode="connect" path="/tmp/libguestfsUw59Lt/guestfsd.sock"/> <target type="virtio" name="org.libguestfs.channel.0"/> </channel> <controller type="usb" model="none"/> <memballoon model="none"/> </devices> <qemu:commandline> <qemu:env name="TMPDIR" value="/builddir/build/BUILD/libguestfs-1.38.4/tmp"/> </qemu:commandline> </domain> , flags=0x2 ) = 2615 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 962019-01-09 18:30:51.790+0000: 22154: debug : virNetMessageNew:46 : msg=0xaaaaaaafff90 tracked=0 ) = 96 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1002019-01-09 18:30:51.790+0000: 22154: debug : virNetMessageEncodePayload:389 : Encode length as 2540 ) = 100 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1872019-01-09 18:30:51.790+0000: 22154: info : virNetClientSendInternal:2118 : RPC_CLIENT_MSG_TX_QUEUE: client=0xaaaaaaafe7f0 len=2540 prog=536903814 vers=1 proc=10 type=0 status=0 serial=6 ) = 187 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1432019-01-09 18:30:51.790+0000: 22154: debug : virNetClientCallNew:2071 : New call 0xaaaaaaafef90: msg=0xaaaaaaafff90, expectReply=1, nonBlock=0 ) = 143 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1602019-01-09 18:30:51.790+0000: 22154: debug : virNetClientIO:1877 : Outgoing message prog=536903814 version=1 serial=6 proc=10 type=0 length=2540 dispatch=(nil) ) = 160 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1242019-01-09 18:30:51.790+0000: 22154: debug : virNetClientIO:1936 : We have the buck head=0xaaaaaaafef90 call=0xaaaaaaafef90 ) = 
124 gettid() = 22154 write(2, "2019-01-09 18:30:51.790+0000: 22"..., 1272019-01-09 18:30:51.790+0000: 22154: debug : virNetSocketUpdateIOCallback:2206 : Watch not registered on socket 0xaaaaaaafe200 ) = 127 rt_sigprocmask(SIG_BLOCK, [PIPE CHLD WINCH], [], 8) = 0 ppoll([{fd=4, events=POLLOUT}, {fd=3, events=POLLIN}], 2, NULL, NULL, 0) = 1 ([{fd=4, revents=POLLOUT}]) rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 write(4, "\0\0\t\354 \0\200\206\0\0\0\1\0\0\0\n\0\0\0\0\0\0\0\6\0\0\0\0\0\0\t\310"..., 2540) = 2540 rt_sigprocmask(SIG_BLOCK, [PIPE CHLD WINCH], [], 8) = 0 <--- hang happens here These indicate that we're still inside virDomainCreateXML at the point where the hang occurs. This firmly points to a problem in libvirt. The detailed reasoning is as follows: (1) libguestfs prints the message "launch libvirt guest" (https://github.com/libguestfs/libguestfs/blob/fe1ee20a59ba62a89ef7e233b5b909c1942684bf/lib/launch-libvirt.c#L606) (2) Libguestfs calls virDomainCreateXML. (3) The next system call in libguestfs is a call to accept4(), but no such call is seen in the strace output.
We suspected that MT-TCG might be causing a problem; however, adding this to the libvirt XML did NOT make any difference, it still hangs. (Given we are not using -smp, I suppose this makes sense?)

<qemu:commandline>
  <qemu:arg value="-accel"/>
  <qemu:arg value="tcg,thread=single"/>
</qemu:commandline>
Created attachment 1519573 [details]
build.log with libvirtd debugging and libvirtd stack trace

The information in this attachment was requested by Dan for further debugging of libvirtd. Near the end it contains the libvirtd debug output, collected using:

log_filters="1:qemu"

Above that, it contains stack traces collected from libvirtd every 10 seconds while it (or qemu) was hanging. Note that the stack trace from one thread includes a call to virDomainCreateXML. It seems to be hanging waiting on the qemu monitor?
My current theory on this bug:

(1) libvirtd creates and unpauses the qemu process.

(2) The qemu process quickly starts writing to the serial console (because AAVMF produces copious early output).

(3) The serial console is configured as:

  -chardev socket,id=charserial0,path=/tmp/libguestfsgP02RS/console.sock \
  -serial chardev:charserial0

but initially nothing is listening on that socket.

(4) Libvirt starts probing qemu features using the qemu monitor.

(5) For some reason - why? - the qemu monitor does not respond to monitor queries, and libvirtd gets stuck waiting.

(6) Because libvirtd never progresses, it never returns to libguestfs, and so nothing ever accepts the serial console connection.
Rich, if libvirtd unpauses the guest in step (1), then libvirtd uses the QEMU monitor for that. It seems strange that libvirt would then probe qemu features only in step (4), after having used the monitor for unpausing the guest.

Anyway, if QEMU waits on libguestfs (to accept()), and libguestfs waits on libvirtd (to return from an API call), and libvirtd waits on QEMU (to respond on the monitor), then we have a circular dependency, and if *all* of these waits are blocking, then we get a deadlock.

What I don't get is why QEMU effectively implements a blocking write in qemu_chr_write_buffer() (see near the top of your comment 38):

> 111 retry:
> 112     res = cc->chr_write(s, buf + *offset, len - *offset);
> 113     if (res < 0 && errno == EAGAIN && write_all) {
> 114         g_usleep(100);
> 115         goto retry;
> 116     }

Hm... the logic was introduced in QEMU commit cd18720a294b ("char: introduce a blocking version of qemu_chr_fe_write", 2013-03-26), but it's not immediately clear what for. The child commit, 0fbf01fe911e ("qtest: use synchronous I/O for char device", 2013-03-26), provides the first use case; however that commit only added a qtest-related call site, not production code.

... OK, so I think the currently relevant use of the blocking logic goes back to commit 6ab3fc32ea64 ("hw: replace most use of qemu_chr_fe_write with qemu_chr_fe_write_all", 2016-09-13). The function pl011_write() is seen in the stack trace from comment 38, and said commit modified it as follows:

> diff --git a/hw/char/pl011.c b/hw/char/pl011.c
> index c0fbf8a87428..786e605fdd61 100644
> --- a/hw/char/pl011.c
> +++ b/hw/char/pl011.c
> @@ -146,7 +146,9 @@ static void pl011_write(void *opaque, hwaddr offset,
>          /* ??? Check if transmitter is enabled.  */
>          ch = value;
>          if (s->chr)
> -            qemu_chr_fe_write(s->chr, &ch, 1);
> +            /* XXX this blocks entire thread. Rewrite to use
> +             * qemu_chr_fe_write and background I/O callbacks */
> +            qemu_chr_fe_write_all(s->chr, &ch, 1);
>          s->int_level |= PL011_INT_TX;
>          pl011_update(s);
>          break;

Given that commit 6ab3fc32ea64 fixed another bug (see the commit message for explanation / references), this is not a regression, but "technical debt". It is a problem in the PL011 device model. If Dan agrees, I'd suggest reassigning this BZ to QEMU.

(The blocking write in pl011_write(), in "hw/char/pl011.c", is present in QEMU as of commit 8ae951fbc106 ("Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190109' into staging", 2019-01-09).)
I reproduced this bug locally using libvirt 4.10 compiled from source with this patch. The qemu and libvirtd stack traces look the same as observed in Koji. I believe this proves my theory outlined in comment 42.

diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 5a806f6c0e..8d4cd6079b 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -7102,6 +7102,9 @@ qemuMonitorJSONGetDeviceAliases(qemuMonitorPtr mon,
 
     *aliases = NULL;
 
+    VIR_DEBUG("the-big-sleep");
+    sleep (5);
+
     n = qemuMonitorJSONGetObjectListPaths(mon, "/machine/peripheral", &paths);
     if (n < 0)
         return -1;
Folks, I made a test yesterday night and got the result this morning:

- I pushed a branch into qemu-kvm dist-git (private-no-edk2-aarch64) removing "Requires: edk2-aarch64" from the qemu-kvm spec file.
- I pushed a branch into the virt module (private-danilo-test), based on 8.0.0 (fast-train), pointing to that qemu-kvm dist-git. All other packages are the same.
- I asked for a new module build based on that private one [1].
- It took a few hours, but it seems that libguestfs built fine.

A few considerations:

- The module build still failed because perl-sys-virt is broken [2].
- This might be a sign that the problem was introduced by the edk2-aarch64 requirement, or it might indicate that the issue is intermittent.
- It might or might not help; I just thought that you would like to know.

[1] https://mbsweb.engineering.redhat.com/module/2653
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1664422
Dan, would you mind taking this one? It's not really arch-specific after all, and you know a whole lot more than me about this part of libvirt... We might also want to revisit the severity, as fixing either this one or Bug 1665063 will avoid the hang and thus make the remaining bug not as urgent to fix.
I'm thinking the problem in libvirt is introduced in this commit:

commit 93db7eea1b86408e02852a8c886d473f4f5007e3
Author: Peter Krempa <pkrempa>
Date:   Thu Feb 1 15:02:17 2018 +0100

    qemu: migration: Refresh device information after transferring state

    In my first approach in 4b480d10768c I overlooked the comment in
    qemuMigrationRunIncoming stating that during actual migration the
    qemuMigrationRunIncoming does not wait until the migration is complete
    but rather offloads that to the Finish phase of migration.

    This means that during actual migration qemuProcessRefreshState was
    called prior to qemu actually transferring the full state and thus the
    queries did not get the correct information. The approach worked only
    for restore, where we wait for the migration to finish during qemu
    startup.

    Fix the issue by calling qemuProcessRefreshState both from
    qemuProcessStart if there's no incomming migration and from
    qemuMigrationFinish so that the code actually works as expected.

This causes us to use the QMP monitor /after/ unpausing CPUs. I think we need to refactor this again so that it is run /after/ migration completes, but still before we unpause CPUs.
Disclaimer: socket programming is not my strong suit, quite the opposite in fact; that said, I've discussed this with colleagues so I'm somewhat confident I'm not just gonna write nonsense and publicly make a fool of myself :)

It seems to me that, while both libvirt and QEMU should be fixed (not to block on the pl011 write, and not to resume guest CPUs until the very last moment possible, respectively), libguestfs is also not being robust enough when starting libvirt guests.

Relevant snippet:

  if (bind (console_sock, (struct sockaddr *) &addr, sizeof addr) == -1) {
    perrorf (g, "bind");
    goto cleanup;
  }

  if (listen (console_sock, 1) == -1) {
    perrorf (g, "listen");
    goto cleanup;
  }

  ...

  dom = virDomainCreateXML (conn, (char *) xml,
                            VIR_DOMAIN_START_AUTODESTROY);
  if (!dom) {
    libvirt_error (g, _(
      "could not create appliance through libvirt.\n"
      "\n"
      "Try running qemu directly without libvirt using this environment variable:\n"
      "export LIBGUESTFS_BACKEND=direct\n"
      "\n"
      "Original error from libvirt"));
    goto cleanup;
  }

  g->state = LAUNCHING;

  /* Wait for console socket to be opened (by qemu). */
  r = accept4 (console_sock, NULL, NULL, SOCK_NONBLOCK|SOCK_CLOEXEC);

So libguestfs will prepare a socket for the serial console, start listening on it, and then start a guest that's configured to connect to it, but will only actually accept connections on the socket once virDomainCreateXML() has returned, instead of right away, thus making it possible for QEMU to block on the connection attempt.

A more resilient solution would be to have two separate threads: one setting up the socket and immediately trying to accept connections in a loop, and the other one starting the guest in parallel once the above has been set up. The direct backend is actually already doing something similar, due to the fact that it has to fork() in order to spawn QEMU.
I would suggest cloning this bug to libguestfs, and using that clone to track progress on a fix for the above; said bug would also be the perfect place to note e.g. that tests have temporarily been disabled on aarch64 to unblock module builds or, if we decide to start using silent AAVMF in the libguestfs test suite to work around the issue, that we have done so. Thoughts?
(In reply to Andrea Bolognani from comment #51)
> Disclaimer: socket programming is not my strong suit, quite the
> opposite in fact; that said, I've discussed this with colleagues
> so I'm somewhat confident I'm not just gonna write nonsense and
> publicly make a fool of myself :)
> 
> It seems to me that, while both libvirt and QEMU should be
> fixed, not to block on pl011 write and not to resume guest CPUs
> up until the very last moment possible respectively, libguestfs
> is also not being robust enough when starting libvirt guests.
> 
> Relevant snippet:
> 
>   if (bind (console_sock, (struct sockaddr *) &addr, sizeof addr) == -1) {
>     perrorf (g, "bind");
>     goto cleanup;
>   }
> 
>   if (listen (console_sock, 1) == -1) {
>     perrorf (g, "listen");
>     goto cleanup;
>   }
> 
>   ...
> 
>   dom = virDomainCreateXML (conn, (char *) xml,
>                             VIR_DOMAIN_START_AUTODESTROY);
>   if (!dom) {
>     libvirt_error (g, _(
>       "could not create appliance through libvirt.\n"
>       "\n"
>       "Try running qemu directly without libvirt using this environment variable:\n"
>       "export LIBGUESTFS_BACKEND=direct\n"
>       "\n"
>       "Original error from libvirt"));
>     goto cleanup;
>   }
> 
>   g->state = LAUNCHING;
> 
>   /* Wait for console socket to be opened (by qemu). */
>   r = accept4 (console_sock, NULL, NULL, SOCK_NONBLOCK|SOCK_CLOEXEC);
> 
> So libguestfs will prepare a socket for the serial console, start
> listening on it and then start a guest that's configured to connect
> to it, but will only actually accept connections to the socket once
> virDomainCreateXML() has returned instead of right away, thus
> making it possible for QEMU to block on the connection attempt.

QEMU shouldn't block on the connect() call. The kernel will do the connection handshake in the background and queue the incoming connection for a later accept() call. QEMU can even start sending data before we accept(), I believe, and will only block once the socket buffers are all full.
(In reply to Daniel Berrange from comment #52)
> (In reply to Andrea Bolognani from comment #51)
> > So libguestfs will prepare a socket for the serial console, start
> > listening on it and then start a guest that's configured to connect
> > to it, but will only actually accept connections to the socket once
> > virDomainCreateXML() has returned instead of right away, thus
> > making it possible for QEMU to block on the connection attempt.
> 
> QEMU shouldn't block on the connect() call. The kernel will do the
> connection handshake in the background and queue the incoming connection for
> a later accept() call. QEMU can even start sending data before we accept()
> I believe, and will only block once the socket buffers are all full.

I assume the above is why we're only hitting the issue in a very narrow scenario, as opposed to the whole thing not working at all, ever? If you're using verbose AAVMF then the buffers are going to fill up fairly quickly, whereas normally the guest won't produce nearly as much output and libguestfs will be able to accept the connection after all.

This doesn't invalidate the fact that setting up the accept() loop earlier will reduce or even remove the window where the deadlock can occur, right?
(In reply to Andrea Bolognani from comment #53)
> This doesn't invalidate the fact that setting up the accept() loop
> earlier will reduce or even remove the window where the deadlock
> can occur, right?

Doing the accept earlier won't help. It would need to actually be reading data from the socket to avoid the deadlock.
(In reply to Daniel Berrange from comment #54)
> (In reply to Andrea Bolognani from comment #53)
> > This doesn't invalidate the fact that setting up the accept() loop
> > earlier will reduce or even remove the window where the deadlock
> > can occur, right?
> 
> Doing the accept earlier won't help. It would need to actually be reading
> data from the socket to avoid the deadlock.

Yeah, sorry, of course I didn't mean just calling accept() but also reading the data and logging it or whatnot. The point is, only starting the data processing loop after the guest is up is not as robust as calling virDomainCreateXML() in a separate thread once the loop is in place.
In addition, libguestfs can't start threads or process the data in threads. It's a library and might be running in a non-threaded program. What we might do is to start the guest in the paused state, accept the connection (assuming qemu connects in this scenario), and then unpause the guest. However this will cause boot times to increase, quite likely considerably - I measured at least 300ms extra delay based on libvirt debug messages - and that is not really acceptable either given the amount of time I spent two years ago knocking tens of milliseconds off the boot time.
(In reply to Richard W.M. Jones from comment #56)
> In addition, libguestfs can't start threads or process the data in threads.
> It's a library and might be running in a non-threaded program.
> 
> What we might do is to start the guest in the paused state, accept the
> connection (assuming qemu connects in this scenario), and then unpause
> the guest. However this will cause boot times to increase, quite likely
> considerably - I measured at least 300ms extra delay based on libvirt
> debug messages - and that is not really acceptable either given the amount
> of time I spent two years ago knocking tens of milliseconds off the boot
> time.

I'm pretty surprised to see such a noticeable time delay from starting paused. It is only adding one extra libvirt RPC roundtrip, which should be minimal for a local UNIX socket connection. The number of QMP commands libvirt does wrt QEMU should still be the same - we just move the "cont" command to the second RPC call.
Let me actually measure this before I make any claims ...
As of libvirt upstream commit:

commit 41d37d31b3ea7105bb3a4cc064032a7fe6142f04
Author: Marc Hartmayer <mhartmay.com>
Date:   Mon Feb 4 13:36:24 2019 +0100

    qemu: Refresh state before starting the VCPUs

    For normal starts (no incoming migration) the refresh of the QEMU
    state must be done before the VCPUs getting started since otherwise
    there might be a race condition between a possible shutdown of the
    guest OS and the QEMU monitor queries.

    This fixes "qemu: migration: Refresh device information after
    transferring state" (93db7eea1b864).

    Signed-off-by: Marc Hartmayer <mhartmay.com>
    Reviewed-by: Peter Krempa <pkrempa>

we no longer try to refresh the state prior to vCPUs being started.
(In reply to Peter Krempa from comment #63)
> As of libvirt upstream commit:
> 
> commit 41d37d31b3ea7105bb3a4cc064032a7fe6142f04
> Author: Marc Hartmayer <mhartmay.com>
> Date:   Mon Feb 4 13:36:24 2019 +0100
> 
>     qemu: Refresh state before starting the VCPUs
> 
>     For normal starts (no incoming migration) the refresh of the QEMU
>     state must be done before the VCPUs getting started since otherwise
>     there might be a race condition between a possible shutdown of the
>     guest OS and the QEMU monitor queries.
> 
>     This fixes "qemu: migration: Refresh device information after
>     transferring state" (93db7eea1b864).
> 
>     Signed-off-by: Marc Hartmayer <mhartmay.com>
>     Reviewed-by: Peter Krempa <pkrempa>
> 
> we no longer try to refresh the state prior to vCPUs being started.

That's great, thanks for the heads-up! Moving to POST since we're going to get the fix through the next rebase.

(In reply to Richard W.M. Jones from comment #49)
> At the moment we have disabled the test on aarch64 to allow module builds
> to go through

Rich, is there a bug associated with that change? We should make sure the libguestfs test is re-enabled once libvirt has picked up the fix.
> (In reply to Richard W.M. Jones from comment #49)
> > At the moment we have disabled the test on aarch64 to allow module builds
> > to go through
> 
> Rich, is there a bug associated with that change? We should make
> sure the libguestfs test is re-enabled once libvirt has picked up
> the fix.

I'm afraid there isn't any bug associated with that. Pino simply disabled the tests in dist-git commit 8d36cc88fd09bd3e2b4bebc1dc0ead00b6cf0171.
Given the general perception that this defect is VERIFIED, and RHEL AV 8.1.0 was shipped (went GA) on 19 Nov 2019, I am closing this bug.