Since we moved our builders to fedora 31, and after bug #1769600 was fixed, we are still seeing libguestfs fail in composes. It's used in the Cloud and Container images to add modifications after the image is created. F31 cloud: https://koji.fedoraproject.org/koji/taskinfo?taskID=39714385 ... Exception encountered in _build_image_from_template thread guestfs_launch failed. This usually means the libguestfs appliance failed to start or crashed. Do: export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 and run the command again. For further information, read: http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs You can also run 'libguestfs-test-tool' and post the *complete* output into a bug report or message to the libguestfs mailing list. Traceback (most recent call last): File "/usr/lib/python3.7/site-packages/imgfac/Builder.py", line 132, in _build_image_from_template self.os_plugin.create_base_image(self, template, parameters) File "/usr/lib/python3.7/site-packages/imagefactory_plugins/TinMan/TinMan.py", line 354, in create_base_image gfs = launch_inspect_and_mount(self.image, readonly=True) File "/usr/lib/python3.7/site-packages/imgfac/FactoryUtils.py", line 25, in launch_inspect_and_mount g.launch() File "/usr/lib64/python3.7/site-packages/guestfs.py", line 5872, in launch r = libguestfsmod.launch(self._o) RuntimeError: guestfs_launch failed. This usually means the libguestfs appliance failed to start or crashed. Do: export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 and run the command again. For further information, read: http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs You can also run 'libguestfs-test-tool' and post the *complete* output into a bug report or message to the libguestfs mailing list. 
ABORT called in TinMan plugin Domain not found: no domain with matching name 'factory-build-39b625f4-da5d-459e-8693-463de8a82dc3' Traceback (most recent call last): File "/usr/lib/python3.7/site-packages/imagefactory_plugins/TinMan/TinMan.py", line 243, in abort guest_dom = self.guest.libvirt_conn.lookupByName(self.tdlobj.name) File "/usr/lib64/python3.7/site-packages/libvirt.py", line 4364, in lookupByName if ret is None:raise libvirtError('virDomainLookupByName() failed', conn=self) libvirt.libvirtError: Domain not found: no domain with matching name 'factory-build-39b625f4-da5d-459e-8693-463de8a82dc3' No Oz VM found with name (factory-build-39b625f4-da5d-459e-8693-463de8a82dc3) - nothing to do This likely means the local VM has already been destroyed or never started Resetting dropped connection: koji.fedoraproject.org https://koji.fedoraproject.org:443 "POST /kojihub?session-id=92178375&session-key=3682-PpaGAeidVV93gCOchWf&callnum=9 HTTP/1.1" 200 114 ... F30 container: https://koji.fedoraproject.org/koji/taskinfo?taskID=39713832 Note that these are on a Fedora 31 power9 machine with Fedora 31 guests (the task runs on the buildvm). libguestfs-test-tool gives: ************************************************************ * IMPORTANT NOTICE * * When reporting bugs, include the COMPLETE, UNEDITED * output below in your bug report. 
* ************************************************************ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin XDG_RUNTIME_DIR=/run/user/0 SELinux: Permissive guestfs_get_append: (null) guestfs_get_autosync: 1 guestfs_get_backend: libvirt guestfs_get_backend_settings: [] guestfs_get_cachedir: /var/tmp guestfs_get_hv: /usr/bin/qemu-system-ppc64 guestfs_get_memsize: 1024 guestfs_get_network: 0 guestfs_get_path: /usr/lib64/guestfs guestfs_get_pgroup: 0 guestfs_get_program: libguestfs-test-tool guestfs_get_recovery_proc: 1 guestfs_get_smp: 1 guestfs_get_sockdir: /tmp guestfs_get_tmpdir: /tmp guestfs_get_trace: 0 guestfs_get_verbose: 1 host_cpu: powerpc64le Launching appliance, timeout set to 600 seconds. libguestfs: launch: program=libguestfs-test-tool libguestfs: launch: version=1.40.2fedora=31,release=8.fc31,libvirt libguestfs: launch: backend registered: unix libguestfs: launch: backend registered: uml libguestfs: launch: backend registered: libvirt libguestfs: launch: backend registered: direct libguestfs: launch: backend=libvirt libguestfs: launch: tmpdir=/tmp/libguestfsI8FPWT libguestfs: launch: umask=0022 libguestfs: launch: euid=0 libguestfs: libvirt version = 5006000 (5.6.0) libguestfs: guest random name = guestfs-z2tu3i19vx35na9x libguestfs: connect to libvirt libguestfs: opening libvirt handle: URI = qemu:///system, auth = default+wrapper, flags = 0 libguestfs: successfully opened libvirt handle: conn = 0x11b29cae0 libguestfs: qemu version (reported by libvirt) = 4001001 (4.1.1) libguestfs: get libvirt capabilities libguestfs: parsing capabilities XML libguestfs: build appliance libguestfs: begin building supermin appliance libguestfs: run supermin libguestfs: command: run: /usr/bin/supermin libguestfs: command: run: \ --build libguestfs: command: run: \ --verbose libguestfs: command: run: \ --if-newer libguestfs: command: run: \ --lock /var/tmp/.guestfs-0/lock libguestfs: command: run: \ --copy-kernel libguestfs: command: run: \ -f ext2 
libguestfs: command: run: \ --host-cpu powerpc64le libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d libguestfs: command: run: \ -o /var/tmp/.guestfs-0/appliance.d supermin: version: 5.1.20 supermin: rpm: detected RPM version 4.15 supermin: package handler: fedora/rpm supermin: acquiring lock on /var/tmp/.guestfs-0/lock supermin: if-newer: output does not need rebuilding libguestfs: finished building supermin appliance libguestfs: command: run: qemu-img libguestfs: command: run: \ create libguestfs: command: run: \ -f qcow2 libguestfs: command: run: \ -o backing_file=/var/tmp/.guestfs-0/appliance.d/root,backing_fmt=raw libguestfs: command: run: \ /tmp/libguestfsI8FPWT/overlay2.qcow2 Formatting '/tmp/libguestfsI8FPWT/overlay2.qcow2', fmt=qcow2 size=4294967296 backing_file=/var/tmp/.guestfs-0/appliance.d/root backing_fmt=raw cluster_size=65536 lazy_refcounts=off refcount_bits=16 libguestfs: create libvirt XML libguestfs: libvirt XML:\n<?xml version="1.0"?>\n<domain type="kvm" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">\n <name>guestfs-z2tu3i19vx35na9x</name>\n <memory unit="MiB">1024</memory>\n <currentMemory unit="MiB">1024</currentMemory>\n <vcpu>1</vcpu>\n <clock offset="utc">\n <timer name="rtc" tickpolicy="catchup"/>\n <timer name="pit" tickpolicy="delay"/>\n </clock>\n <os>\n <type machine="pseries">hvm</type>\n <kernel>/var/tmp/.guestfs-0/appliance.d/kernel</kernel>\n <initrd>/var/tmp/.guestfs-0/appliance.d/initrd</initrd>\n <cmdline>panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen</cmdline>\n </os>\n <on_reboot>destroy</on_reboot>\n <devices>\n <rng model="virtio">\n <backend model="random">/dev/urandom</backend>\n </rng>\n <controller type="scsi" index="0" model="virtio-scsi"/>\n <disk device="disk" type="file">\n <source 
file="/tmp/libguestfsI8FPWT/scratch1.img"/>\n <target dev="sda" bus="scsi"/>\n <driver name="qemu" type="raw" cache="unsafe"/>\n <address type="drive" controller="0" bus="0" target="0" unit="0"/>\n </disk>\n <disk type="file" device="disk">\n <source file="/tmp/libguestfsI8FPWT/overlay2.qcow2"/>\n <target dev="sdb" bus="scsi"/>\n <driver name="qemu" type="qcow2" cache="unsafe"/>\n <address type="drive" controller="0" bus="0" target="1" unit="0"/>\n </disk>\n <serial type="unix">\n <source mode="connect" path="/tmp/libguestfsSJfQVT/console.sock"/>\n <target port="0"/>\n </serial>\n <channel type="unix">\n <source mode="connect" path="/tmp/libguestfsSJfQVT/guestfsd.sock"/>\n <target type="virtio" name="org.libguestfs.channel.0"/>\n </channel>\n <controller type="usb" model="none"/>\n <memballoon model="none"/>\n </devices>\n <qemu:commandline>\n <qemu:env name="TMPDIR" value="/var/tmp"/>\n </qemu:commandline>\n</domain>\n libguestfs: command: run: ls libguestfs: command: run: \ -a libguestfs: command: run: \ -l libguestfs: command: run: \ -R libguestfs: command: run: \ -Z /var/tmp/.guestfs-0 libguestfs: /var/tmp/.guestfs-0: libguestfs: total 184 libguestfs: drwxr-xr-x. 3 root root unconfined_u:object_r:user_tmp_t:s0 4096 Dec 18 19:56 . libguestfs: drwxrwxrwt. 9 root root system_u:object_r:tmp_t:s0 4096 Dec 18 19:56 .. libguestfs: drwxr-xr-x. 2 root root unconfined_u:object_r:user_tmp_t:s0 4096 Dec 17 23:15 appliance.d libguestfs: -rw-r--r--. 1 root root unconfined_u:object_r:user_tmp_t:s0 0 Dec 6 23:04 lock libguestfs: -rw-r--r--. 1 root root unconfined_u:object_r:user_tmp_t:s0 11104 Dec 7 21:02 qemu-16765032-1573858419.devices libguestfs: -rw-r--r--. 1 root root unconfined_u:object_r:user_tmp_t:s0 26890 Dec 7 21:02 qemu-16765032-1573858419.help libguestfs: -rw-r--r--. 1 root root unconfined_u:object_r:user_tmp_t:s0 124801 Dec 7 21:02 qemu-16765032-1573858419.qmp-schema libguestfs: -rw-r--r--. 
1 root root unconfined_u:object_r:user_tmp_t:s0 48 Dec 7 21:02 qemu-16765032-1573858419.query-kvm libguestfs: -rw-r--r--. 1 root root unconfined_u:object_r:user_tmp_t:s0 49 Dec 7 21:02 qemu-16765032-1573858419.stat libguestfs: libguestfs: /var/tmp/.guestfs-0/appliance.d: libguestfs: total 446932 libguestfs: drwxr-xr-x. 2 root root unconfined_u:object_r:user_tmp_t:s0 4096 Dec 17 23:15 . libguestfs: drwxr-xr-x. 3 root root unconfined_u:object_r:user_tmp_t:s0 4096 Dec 18 19:56 .. libguestfs: -rw-r--r--. 1 qemu qemu unconfined_u:object_r:user_tmp_t:s0 2116096 Dec 18 19:56 initrd libguestfs: -rwxr-xr-x. 1 qemu qemu unconfined_u:object_r:user_tmp_t:s0 25101936 Dec 18 19:56 kernel libguestfs: -rw-r--r--. 1 qemu qemu system_u:object_r:virt_content_t:s0 4294967296 Dec 18 19:56 root libguestfs: command: run: ls libguestfs: command: run: \ -a libguestfs: command: run: \ -l libguestfs: command: run: \ -Z /tmp/libguestfsSJfQVT libguestfs: total 8 libguestfs: drwxr-xr-x. 2 root root unconfined_u:object_r:user_tmp_t:s0 4096 Dec 18 19:56 . libguestfs: drwxrwxrwt. 8 root root system_u:object_r:tmp_t:s0 4096 Dec 18 19:56 .. libguestfs: srw-rw----. 1 root qemu unconfined_u:object_r:user_tmp_t:s0 0 Dec 18 19:56 console.sock libguestfs: srw-rw----. 1 root qemu unconfined_u:object_r:user_tmp_t:s0 0 Dec 18 19:56 guestfsd.sock libguestfs: launch libvirt guest SLOF\x1b[0m\x1b[?25l ********************************************************************** \x1b[1mQEMU Starting \x1b[0m Build Date = Jul 24 2019 00:00:00 FW Version = mockbuild@ release 20190114 Press "s" to enter Open Firmware. 
Populating /vdevice methods Populating /vdevice/vty@30000000 Populating /vdevice/nvram@71000000 Populating /pci@800000020000000 00 0800 (D) : 1af4 1004 virtio [ scsi ] Populating /pci@800000020000000/scsi@1 SCSI: Looking for devices 100000000000000 DISK : "QEMU QEMU HARDDISK 2.5+" 101000000000000 DISK : "QEMU QEMU HARDDISK 2.5+" 00 1000 (D) : 1af4 1003 virtio [ serial ] 00 1800 (D) : 1af4 1005 unknown-legacy-device* No NVRAM common partition, re-initializing... Scanning USB Using default console: /vdevice/vty@30000000 Detected RAM kernel at 400000 (1a5b040 bytes) Welcome to Open Firmware Copyright (c) 2004, 2017 IBM Corporation All rights reserved. This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Booting from memory... OF stdout device is: /vdevice/vty@30000000 Preparing to boot Linux version 5.3.16-300.fc31.ppc64le (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Fri Dec 13 17:59:56 UTC 2019 Detected machine type: 0000000000000101 command line: panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen Max number of cores passed to firmware: 1024 (NR_CPUS = 1024) Calling ibm,client-architecture-support...libguestfs: error: appliance closed the connection unexpectedly, see earlier error messages libguestfs: child_cleanup: 0x11b29a290: child process died libguestfs: error: guestfs_launch failed, see earlier error messages libguestfs: closing guestfs handle 0x11b29a290 (state 0) libguestfs: command: run: rm libguestfs: command: run: \ -rf /tmp/libguestfsI8FPWT libguestfs: command: run: rm libguestfs: command: run: \ -rf /tmp/libguestfsSJfQVT If I manually start something via qemu, it works, but it restarts in the middle: 
[root@buildvm-ppc64le-09 tmp][PROD]# qemu-system-ppc64 -m 4096 -boot d -enable-kvm -smp 4 -net nic -net user -hda test.img -cdrom Fedora-Everything-netinst-ppc64le-Rawhide-20191217.n.0.iso -nographic SLOF ********************************************************************** QEMU Starting Build Date = Jul 24 2019 00:00:00 FW Version = mockbuild@ release 20190114 Press "s" to enter Open Firmware. Populating /vdevice methods Populating /vdevice/vty@71000000 Populating /vdevice/nvram@71000001 Populating /vdevice/l-lan@71000002 Populating /vdevice/v-scsi@71000003 SCSI: Looking for devices 8000000000000000 DISK : "QEMU QEMU HARDDISK 2.5+" 8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+" Populating /pci@800000020000000 00 0000 (D) : 1234 1111 qemu vga 00 0800 (D) : 1033 0194 serial bus [ usb-xhci ] No NVRAM common partition, re-initializing... Installing QEMU fb Scanning USB XHCI: Initializing USB Keyboard USB mouse No console specified using screen & keyboard Welcome to Open Firmware Copyright (c) 2004, 2017 IBM Corporation All rights reserved. This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Trying to load: from: /vdevice/v-scsi@71000003/disk@8200000000000000 ... Successfully loaded qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: IRQ_XIVE capability must be present for KVM Falling back to kernel-irqchip=off SLOF ********************************************************************** QEMU Starting Build Date = Jul 24 2019 00:00:00 FW Version = mockbuild@ release 20190114 Press "s" to enter Open Firmware. 
Populating /vdevice methods Populating /vdevice/vty@71000000 Populating /vdevice/nvram@71000001 Populating /vdevice/l-lan@71000002 Populating /vdevice/v-scsi@71000003 SCSI: Looking for devices 8000000000000000 DISK : "QEMU QEMU HARDDISK 2.5+" 8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+" Populating /pci@800000020000000 00 0000 (D) : 1234 1111 qemu vga 00 0800 (D) : 1033 0194 serial bus [ usb-xhci ] Installing QEMU fb Scanning USB XHCI: Initializing USB Keyboard USB mouse No console specified using screen & keyboard Welcome to Open Firmware Copyright (c) 2004, 2017 IBM Corporation All rights reserved. This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Trying to load: from: /vdevice/v-scsi@71000003/disk@8200000000000000 ... Successfully loaded Linux ppc64le #1 SMP Mon Dec 1 (it then puts the anaconda prompt inside the already rendered screen) Perhaps the restart is confusing libguestfs?
There may be more information from the libguestfs-test-tool run if you look in /var/log/libvirt/qemu/guestfs-z2tu3i19vx35na9x.log. Also if qemu segfaulted then abrt/coredumpctl may have captured a core dump. However the basic problem is that qemu is crashing, so this is most likely to be a qemu (or possibly kernel/firmware) problem.
The /var/log/libvirt/guestfs*.log:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-1-guestfs-z2tu3i19vx35 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-guestfs-z2tu3i19vx35/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-guestfs-z2tu3i19vx35/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-guestfs-z2tu3i19vx35/.config \
QEMU_AUDIO_DRV=none \
TMPDIR=/var/tmp \
/usr/bin/qemu-system-ppc64 \
-name guest=guestfs-z2tu3i19vx35na9x,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-guestfs-z2tu3i19vx35/master-key.aes \
-machine pseries-4.1,accel=kvm,usb=off,dump-guest-core=off \
-m 1024 \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 0024f525-c02c-42c5-9326-0bfa6acf4a9a \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=32,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-no-reboot \
-boot strict=on \
-kernel /var/tmp/.guestfs-0/appliance.d/kernel \
-initrd /var/tmp/.guestfs-0/appliance.d/initrd \
-append 'panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen' \
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x1 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x2 \
-drive file=/tmp/libguestfsI8FPWT/scratch1.img,format=raw,if=none,id=drive-scsi0-0-0-0,cache=unsafe \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,device_id=drive-scsi0-0-0-0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on \
-drive file=/tmp/libguestfsI8FPWT/overlay2.qcow2,format=qcow2,if=none,id=drive-scsi0-0-1-0,cache=unsafe \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,device_id=drive-scsi0-0-1-0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0,write-cache=on \
-chardev socket,id=charserial0,path=/tmp/libguestfsSJfQVT/console.sock \
-device spapr-vty,chardev=charserial0,id=serial0,reg=0x30000000 \
-chardev socket,id=charchannel0,path=/tmp/libguestfsSJfQVT/guestfsd.sock \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x3 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2019-12-18 19:56:10.134+0000: Domain id=1 is tainted: custom-argv
2019-12-18 19:56:14.213+0000: shutting down, reason=shutdown

There's no crash or core...
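For anyone comparing runs, the two timestamped lines at the end of that log give a rough idea of how long the guest lived before libvirt recorded the shutdown. A minimal Python sketch, assuming the timestamp format shown in this log; the function names are made up for illustration and are not part of libvirt or libguestfs:

```python
from datetime import datetime

# Timestamp layout as it appears in the per-domain log quoted above,
# e.g. "2019-12-18 19:56:14.213+0000: shutting down, reason=shutdown".
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def parse_event(line):
    """Split one log line into (timestamp, message). Hypothetical helper."""
    stamp, _, message = line.partition(": ")
    return datetime.strptime(stamp, STAMP_FORMAT), message


def seconds_between(first_line, last_line):
    """Elapsed seconds between two timestamped log lines."""
    start, _ = parse_event(first_line)
    end, _ = parse_event(last_line)
    return (end - start).total_seconds()


tainted = "2019-12-18 19:56:10.134+0000: Domain id=1 is tainted: custom-argv"
shutdown = "2019-12-18 19:56:14.213+0000: shutting down, reason=shutdown"
print(round(seconds_between(tainted, shutdown), 3))
```

With the two lines from this log the gap is about four seconds, and the recorded reason is a plain shutdown rather than a crash.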
Any ideas or news here? Still happening. ;(
For the record, I have already seen the "restarting" VM when playing with RHEL 8 cloud images (and perhaps with others too). The symptom was similar: boot starts, writes the grub boot menu and, instead of booting the selected OS, boots again to the grub menu. On the second grub run, it allows Linux to boot.
Also, have you tried passing the same number of threads as the host? I mean "-smp 1,sockets=1,cores=1,threads=1" (P9 usually has 4 threads per core). See Bug 1789199. []'s Gustavo
I meant to write "-smp 1,sockets=1,cores=1,threads=4"
should be "-smp 4,sockets=1,cores=1,threads=4" :-) But I see no change. Kevin's command line gives a good reproducer, so let's switch to qemu or start a new bug against qemu.
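To avoid the kind of mismatch corrected in the last few comments, the -smp value can be derived from the topology so that the leading CPU count always equals sockets * cores * threads. A small Python sketch; the helper name is hypothetical, and it assumes no maxcpus or other -smp suboptions are in play:

```python
def smp_argument(sockets, cores, threads):
    """Build a value for qemu's -smp option from a CPU topology.

    The leading number is the total vCPU count, computed here as
    sockets * cores * threads so the two can never disagree.
    """
    for name, value in (("sockets", sockets), ("cores", cores), ("threads", threads)):
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    total = sockets * cores * threads
    return f"{total},sockets={sockets},cores={cores},threads={threads}"


# One core with four threads, as suggested for a POWER9 host:
print("-smp", smp_argument(1, 1, 4))
```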
It could be related to this warning: qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: IRQ_XIVE capability must be present for KVM Falling back to kernel-irqchip=off because it appears when the Linux kernel is loaded/booted for the first time and is missing in the second boot. I think there has already been a bug for it.
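One way to check this on a captured console log is to split the text at each SLOF "QEMU Starting" banner and see which boot the warning falls into. A rough Python sketch; the marker strings are copied from the output in this report, the function names are invented for illustration, and the sample is abbreviated from the manual qemu run above:

```python
SLOF_BANNER = "QEMU Starting"
XIVE_WARNING = "IRQ_XIVE capability must be present for KVM"


def boot_segments(console_text):
    """Return one chunk of console text per firmware start."""
    parts = console_text.split(SLOF_BANNER)
    return parts[1:]  # anything before the first banner is not a boot


def describe_boots(console_text):
    """Count boots and list (1-based) which boots contain the XIVE warning."""
    segments = boot_segments(console_text)
    return {
        "boots": len(segments),
        "warning_in_boot": [
            index + 1
            for index, segment in enumerate(segments)
            if XIVE_WARNING in segment
        ],
    }


sample = (
    "SLOF **** QEMU Starting Build Date = Jul 24 2019 ... Successfully loaded "
    "qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: "
    "IRQ_XIVE capability must be present for KVM Falling back to kernel-irqchip=off "
    "SLOF **** QEMU Starting Build Date = Jul 24 2019 ... Successfully loaded Linux ppc64le"
)
print(describe_boots(sample))
```

On that sample it reports two boots with the warning only in the first one, which matches what is described here.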
Seems it's the machine type problem again, using -M pseries-4.0 makes the "double boot" problem in qemu go away.
(In reply to Dan Horák from comment #9) > Seems it's the machine type problem again, using -M pseries-4.0 makes the > "double boot" problem in qemu go away. This has been fixed upstream by 8deb8019d696 ("spapr: Don't trigger a CAS reboot for XICS/XIVE mode changeover")
Laurent, could it be backported to qemu 4.2? Are there any prerequisite patches? Cedric has already explained the "double boot" to me in https://bugzilla.redhat.com/show_bug.cgi?id=1769600#c34 so I'm wondering which route we should take. Running regular VMs is OK, but libguestfs can't deal with the double boot. It would allow us to override the machine parameters via http://libguestfs.org/guestfs.3.html#qemu-wrappers, but using a patched qemu would be easier.
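A wrapper of the kind described in that qemu-wrappers section would have to rewrite the -machine argument before handing off to the real qemu binary. A minimal Python sketch of just that rewrite; the function name is hypothetical, it only handles the "-machine TYPE,option=..." form seen in the libvirt log above, and whether such a wrapper takes effect with the libvirt backend was not tested in this report:

```python
def override_machine_type(argv, new_type):
    """Return a copy of argv with the machine type replaced.

    Only the first comma-separated field of the value following
    -machine (or -M) is changed; options such as accel=kvm are kept.
    """
    out = list(argv)
    for index in range(len(out) - 1):
        if out[index] in ("-machine", "-M"):
            _old_type, comma, options = out[index + 1].partition(",")
            out[index + 1] = new_type + comma + options
    return out


original = [
    "/usr/bin/qemu-system-ppc64",
    "-machine", "pseries-4.1,accel=kvm,usb=off,dump-guest-core=off",
    "-m", "1024",
]
print(override_machine_type(original, "pseries-4.0"))
```

A real wrapper script would apply this to its own arguments (sys.argv[1:]) and then exec the actual qemu binary with the result.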
(In reply to Dan Horák from comment #11) > Laurent, could it be backported to qemu 4.2? Yes, and it's straightforward. No prerequisite patches.
(In reply to Laurent Vivier from comment #12) > (In reply to Dan Horák from comment #11) > > Laurent, could it be backported to qemu 4.2? > > Yes, and it's straightforward. No prerequisite patches. And how about qemu 4.1? That's the version in F-31 that is installed on the Fedora builders.
(In reply to Dan Horák from comment #13) > (In reply to Laurent Vivier from comment #12) > > (In reply to Dan Horák from comment #11) > > > Laurent, could it be backported to qemu 4.2? > > > > Yes, and it's straightforward. No prerequisite patches. > > and how about qemu 4.1? Because that's the version in F-31 that installed on > the Fedora builders. It's more complicated: 4.1 needs more patches to regenerate the device tree when CAS is called and to add a code path to activate and deactivate interrupt controllers, plus a new version of SLOF. dgibson knows better than I do which patches are needed.
OK, then it would be better to use the virt stack from the virt-preview repo which has 4.2 for F-31 and F-30
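The libguestfs-test-tool output earlier prints the qemu and libvirt versions as single integers (4001001 for 4.1.1, 5006000 for 5.6.0), i.e. major * 1000000 + minor * 1000 + micro. A short Python sketch for decoding that number and checking whether a builder is already on a 4.2 base; the function names are made up for illustration, and being on 4.2 says nothing about whether 8deb8019d696 is applied:

```python
def decode_version(number):
    """Decode libvirt's integer version encoding into (major, minor, micro)."""
    major, rest = divmod(number, 1_000_000)
    minor, micro = divmod(rest, 1_000)
    return major, minor, micro


def is_at_least(number, minimum):
    """True if the encoded version is >= the (major, minor, micro) tuple."""
    return decode_version(number) >= minimum


# Values taken from the test-tool output in this report:
print(decode_version(4001001), is_at_least(4001001, (4, 2, 0)))
print(decode_version(5006000))
```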
And I confirm that the "double boot" goes away when I use qemu 4.2 with the 8deb8019d696 patch applied. I'll now check if oz/image-factory works too.
And I got an image created ============ Final Image Details ============ UUID: 48b2d109-d0ef-4f0b-8e20-4a62424ea25b Type: base_image Image filename: /var/lib/imagefactory/storage/48b2d109-d0ef-4f0b-8e20-4a62424ea25b.body Image build completed SUCCESSFULLY! So the main question is about the next steps: can infra use qemu from virt-preview? How do we best integrate the patch into our qemu package?
So to clarify, this is in the guest right? We could do a qemu build + the patch in our infra repo... that would upgrade all the builders, but I guess that might be ok?
Dan, there is perhaps another way to fix the problem easily, if the problem is with the no-reboot parameter and not with the double boot itself. The commit 9146206eb26c ("spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots") allows the reboot for CAS negotiation to happen even if the --no-reboot parameter is provided. It is already included in 4.2 and is easy to backport to 4.1.
(In reply to Kevin Fenzi from comment #18) > So to clarify, this is in the guest right? yes, qemu in the builder VM needs the update > We could do a qemu build + the patch in our infra repo... that would upgrade > all the builders, but I guess that might be ok? yes, it should be OK, you could give the other arches some testing in staging env first
(In reply to Laurent Vivier from comment #19) > Dan, > > There is perhaps another way to fix the problem easily if the problem is > with the no-reboot parameter and not with the double-boot. > > The commit 9146206eb26c ("spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS > reboots") allows to reboot to do the CAS negotiation even if the --no-reboot > parameter is provided. > > It is already included in 4.2 and easy to backport to 4.1 I think it's a question for the libguestfs guys and how libguestfs communicates with qemu. Also I think we have no way to pass additional parameters to the domain XML or qemu, so qemu 4.2 + the patch looks like a good solution to me :-)
Not really sure of the question but maybe this diagram helps? http://libguestfs.org/guestfs-internals.1.html#architecture We try not to need custom tweaking for each architecture. If qemu can't by default boot a kernel image then we usually think of that architecture as needing to be fixed. I even have a tool to test this: https://people.redhat.com/~rjones/qemu-sanity-check/
Applying 8deb8019d696 to 4.1 will require a *lot* of preliminary patches. However, for the problem you have specifically with libguestfs (which occurs because you use -no-reboot), it should only be necessary to use the stopgap fix in 9146206eb26c1436c80a7c2ca1e4c5f86b27179d "spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots". That one should apply to 4.1 much more easily.
The reboots turned out to actually be a problem for openQA as well, it was just a bit more subtle than I first realized. We have some tests which are set to specify kernel parameters, by typing them into the bootloader when the VM boots. But those tests are broken by the reboot behaviour, because the test only types the parameters on the *first* boot, then the reboot happens and they are effectively lost. Rewriting the test code to handle the VM spontaneously rebooting like this would be a bit awkward. For now, Kevin has done a backport of qemu 4.2 in the infra repo, and I have that deployed on the openQA VMs as well; with a newer SLOF it seems to be working OK.
Ah, right. 9146206eb26c1436c80a7c2ca1e4c5f86b27179d alone won't fix that kernel parameters problem. For that you will need 8deb8019d696 and all its preliminaries.
Switching to qemu to build an update with the mentioned patch applied. Cole (or another qemu maintainer), could you build new qemu 4.2 (f32 + rawhide + virt-preview) with 8deb8019d696 ("spapr: Don't trigger a CAS reboot for XICS/XIVE mode changeover") applied? Thanks, Dan.
Patches pushed and f32 build is done, rawhide qemu build is failing due to some kernel headers breakage though: https://bugzilla.redhat.com/show_bug.cgi?id=1804330
This message is a reminder that Fedora 32 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '32'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 32 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.