Rawhide composes have been failing since 5.9 rc1 landed. The cause is the aarch64 Cloud-Base and/or Workstation images failing to compose: they time out. On investigation, the install itself seems to complete fine, but at the end the VM just hangs on reboot. Here are the last messages from one of the VMs:

[ OK ] Stopped Rebuild Hardware Database.
[ OK ] Stopped Rebuild Journal Catalog.
Stopping Update UTMP about System Boot/Shutdown...
[ OK ] Stopped Update UTMP about System Boot/Shutdown.
[ OK ] Stopped Create Volatile Files and Directories.
[ OK ] Stopped Import network configuration from initramfs.
[ OK ] Stopped Restore /run/initramfs on shutdown.
[ OK ] Stopped target Local File Systems.
Unmounting /mnt/sysimage/boot/efi...
Unmounting /mnt/sysimage/dev/pts...
Unmounting /mnt/sysimage/dev/shm...
Unmounting /mnt/sysimage/proc...
Unmounting /mnt/sysimage/run...
Unmounting /mnt/sysimage/sys/firmware/efi/efivars...
Unmounting /mnt/sysimage/sys/fs/selinux...
Unmounting /mnt/sysroot/boot/efi...
Unmounting /mnt/sysroot/dev/pts...
Unmounting /mnt/sysroot/dev/shm...
Unmounting /mnt/sysroot/proc...
Unmounting /mnt/sysroot/run...
Unmounting /mnt/sysroot/sys/firmware/efi/efivars...
Unmounting /mnt/sysroot/sys/fs/selinux...
Unmounting Temporary Directory (/tmp)...
[ OK ] Unmounted /mnt/sysimage/boot/efi.
[ OK ] Unmounted /mnt/sysimage/dev/pts.
[ OK ] Unmounted /mnt/sysimage/dev/shm.
[ OK ] Unmounted /mnt/sysimage/proc.
[ OK ] Unmounted /mnt/sysimage/run.
[ OK ] Unmounted /mnt/sysimage/sys/firmware/efi/efivars.
[ OK ] Unmounted /mnt/sysimage/sys/fs/selinux.
[ OK ] Unmounted /mnt/sysroot/boot/efi.
[ OK ] Unmounted /mnt/sysroot/dev/pts.
[ OK ] Unmounted /mnt/sysroot/dev/shm.
[ OK ] Unmounted /mnt/sysroot/proc.
[ OK ] Unmounted /mnt/sysroot/run.
[ OK ] Unmounted /mnt/sysroot/sys/firmware/efi/efivars.
[ OK ] Unmounted /mnt/sysroot/sys/fs/selinux.
Unmounting /mnt/sysimage/dev...
Unmounting /mnt/sysimage/sys...
Unmounting /mnt/sysroot/dev...
Unmounting /mnt/sysroot/sys...
[ OK ] Unmounted /mnt/sysimage/dev.
[ OK ] Unmounted /mnt/sysimage/sys.
[ OK ] Unmounted /mnt/sysroot/dev.
[ OK ] Unmounted /mnt/sysroot/sys.
Unmounting /mnt/sysimage...
Unmounting /mnt/sysroot...
[ OK ] Unmounted /mnt/sysimage.
[ OK ] Unmounted Temporary Directory (/tmp).
[ OK ] Stopped target Swap.
Deactivating swap Compressed swap on /dev/zram0...
[ OK ] Deactivated swap Compressed swap on /dev/zram0.
Stopping Create swap on /dev/zram0...
[ OK ] Stopped Create swap on /dev/zram0.
[ OK ] Removed slice system-swap\x2dcreate.slice.
[ OK ] Unmounted /mnt/sysroot.
[ OK ] Stopped target Local File Systems (Pre).
[ OK ] Reached target Unmount All Filesystems.
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Stopped Create System Users.
[ OK ] Stopped Remount Root and Kernel File Systems.
[ OK ] Reached target Shutdown.
[ OK ] Reached target Final Step.
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 1774.256311] dracut Warning: Killing all remaining processes
dracut Warning: Killing all remaining processes
[ 1774.866439] dracut Warning: Unmounted /oldroot.
Rebooting.

It then sits there until the compose times out and fails. These VMs run on Fedora 32 hosts, sometimes on Mustangs and sometimes on Lenovo eMAGs.
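For triaging further timed-out builds, a small check on the saved console output can tell this failure mode apart from an install that never finished. The following is only a sketch: the function name, the verdict strings, and the idea of matching on "Reached target Reboot" and a trailing "Rebooting." are assumptions based on the log above, not part of any compose tooling.

```python
def classify_console_tail(log_text: str) -> str:
    """Roughly classify where an installer console log stopped.

    Hypothetical helper: the verdict strings are made up for illustration.
    """
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    if not lines:
        return "empty-log"
    # systemd prints this once the reboot target has been reached.
    reached_reboot = any("Reached target Reboot" in ln for ln in lines)
    # In the log above, "Rebooting." is the very last thing printed.
    if reached_reboot and lines[-1].endswith("Rebooting."):
        return "stopped-at-reboot-request"
    if reached_reboot:
        return "reboot-target-reached"
    return "shutdown-incomplete"


# Shortened excerpt of the console output quoted above.
SAMPLE_TAIL = """\
[ OK ] Reached target Shutdown.
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 1774.866439] dracut Warning: Unmounted /oldroot.
Rebooting.
"""

print(classify_console_tail(SAMPLE_TAIL))
```

A log that ends this way, combined with a build timeout, suggests the guest asked for a reset that never took effect, rather than the installation itself failing.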
What are the versions of qemu/libvirt/edk2 on the hosts?
libvirt-daemon-6.1.0-4.fc32.aarch64
qemu-4.2.1-1.fc32.aarch64
edk2-aarch64-20200201stable-1.fc32.noarch
Usually on aarch64 it's the PSCI firmware interface that handles reboots and related bits. I'm looking through the changes there to see if anything of note shows up.
I haven't been able to reproduce on an F33 Mustang or an F32 eMAG. Verified the same packages are installed on the F32 host:

libvirt-daemon-6.1.0-4.fc32.aarch64
qemu-system-aarch64-4.2.1-1.fc32.aarch64
edk2-aarch64-20200201stable-1.fc32.noarch

[root@ampere-hr350a-06 ~]# uname -r
5.7.16-200.fc32.aarch64

On the VM: 5.9.0-0.rc1.20200821gitda2968ff879b.1.fc34.aarch64
I wonder if this is an imagefactory-specific issue.
It's possible. Or the way it's defining the guest?
Reproduced with imagefactory-1.1.15-2.fc32.noarch when trying to run the cloud-base image build.
Marking automatic f34-beta blocker: "Bugs which entirely prevent the composition of one or more of the release-blocking images required to be built for a currently-pending (pre-)release"
The Cloud-Base kickstart installs and reboots fine outside of imagefactory.
I think kernel-5.9.0-0.rc3.1.fc34 is still having issues (https://pagure.io/releng/failed-composes/issue/1697). Untagging it and running another compose now.
We need to find a fix for imagefactory ASAP. Another Rawhide failure due to kernel-5.9.0-0.rc3.20200902git9c7d619be5a0.1.fc34. I don't want to keep untagging them to get a compose out. @Paul Whalen, any update on your issue tracking?
(In reply to Mohan Boddu from comment #11)
> We need to find a fix for imagefactory ASAP. Another Rawhide failure due to
> kernel-5.9.0-0.rc3.20200902git9c7d619be5a0.1.fc34.
>
> I don't want to keep untagging them to get a compose out.
>
> @Paul Whalen, any update on your issue tracking?

Unfortunately not. Perhaps we can make the image fail-able until resolved?
But Cloud Base for x86_64 and aarch64 is release-blocking. Although it's Rawhide, we have always followed the same conditions. Maybe we could make an exception. @Kevin, are you okay with that?
I started looking at it last week. I'm on PTO this week and will deal with it on Monday.
It might be oz/imagefactory passing some no-longer-wanted options to libvirt/qemu, or something like that ...
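One way to compare how the guest is defined inside and outside imagefactory is to dump the domain XML on the host (for example with `virsh dumpxml`) and list the devices and their models. The sketch below does that with the standard library. The XML here is a made-up minimal sample, not what oz/imagefactory actually generates, and `device_summary` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed libvirt domain XML used only for illustration.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>example-guest</name>
  <os><type arch='aarch64' machine='virt'>hvm</type></os>
  <devices>
    <interface type='network'><model type='virtio'/></interface>
    <video><model type='virtio'/></video>
    <console type='pty'/>
  </devices>
</domain>
"""


def device_summary(domain_xml: str):
    """Return (device tag, model type or None) for each device in the domain."""
    root = ET.fromstring(domain_xml.strip())
    summary = []
    for dev in root.findall("./devices/*"):
        model = dev.find("model")
        summary.append((dev.tag, model.get("type") if model is not None else None))
    return summary


for tag, model in device_summary(SAMPLE_DOMAIN_XML):
    print(f"{tag}: {model}")
```

Running the same summary against a working manual install and an imagefactory guest would show any difference in the devices each one is given.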
There is a crash when using the virtio_gpu driver that seems to be causing this (full boot log attached):

[ 12.928989] [drm] Initialized virtio_gpu 0.1.0 0 for virtio5 on minor 0
[ 11.982271] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
[ 11.984567] Mem abort info:
[ 11.985118] ESR = 0x96000004
[ 11.985759] EC = 0x25: DABT (current EL), IL = 32 bits
[ 11.986789] SET = 0, FnV = 0
[ 11.987382] EA = 0, S1PTW = 0
[ 11.988028] Data abort info:
[ 11.988638] ISV = 0, ISS = 0x00000004
[ 11.989762] CM = 0, WnR = 0
[ 11.990424] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001d24e9000
[ 11.991753] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 11.993227] Internal error: Oops: 96000004 [#1] SMP
[ 11.993955] Modules linked in: virtio_gpu(+) drm_kms_helper crct10dif_ce ghash_ce syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm virtio_blk virtio_console xhci_pci(+) xhci_pci_renesas qemu_fw_cfg virtio_mmio dm_multipath aes_neon_bs
[ 11.997225] CPU: 3 PID: 527 Comm: systemd-udevd Not tainted 5.9.0-0.rc3.20200902git9c7d619be5a0.1.fc34.aarch64 #1
[ 11.998815] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 12.000068] pstate: 60400005 (nZCv daif +PAN -UAO BTYPE=--)
[ 12.001083] pc : swiotlb_map+0x194/0x1b0
[ 12.001771] lr : swiotlb_map+0x174/0x1b0
[ 12.002470] sp : ffff800010ac35e0
[ 12.003089] x29: ffff800010ac35e0 x28: 0000000000000000
[ 12.004094] x27: ffff000193108850 x26: ffff000192b1a000
[ 12.005138] x25: 0000000000000000 x24: ffffa60579bd40f8
[ 12.006185] x23: 00000001d2654000 x22: 0000000000000000
[ 12.007211] x21: 0000000000000000 x20: 0000000000001000
[ 12.008244] x19: ffff000193108850 x18: 0000000000000000
[ 12.009274] x17: 0000000000000000 x16: ffffa605769dc7d4
[ 12.009345] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.09
[ 12.010352] x15: ffffa60577ca2808 x14: 0000000000000000
[ 12.011933] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 12.013058] x13: 0000000000000000 x12: 0000000000000000
[ 12.014364] usb usb1: Product: xHCI Host Controller
[ 12.015315] x11: 0000000000000000 x10: 0000000000000000
[ 12.016229] usb usb1: Manufacturer: Linux 5.9.0-0.rc3.20200902git9c7d619be5a0.1.fc34.aarch64 xhci-hcd
[ 12.017155] x9 : ffffa605769e3b08 x8 : 0000000000000070
[ 12.018924] usb usb1: SerialNumber: 0000:02:00.0
[ 12.019853] x7 : ffff0001fe608000 x6 : 0000000000000000
[ 12.021693] x5 : 0000000000000001 x4 : 0000000000001000
[ 12.022632] x3 : ffff800010ac3628 x2 : ffff000193064a00
[ 12.023573] x1 : ffffa60511d55620 x0 : 0000000000000000
[ 12.024518] Call trace:
[ 12.024959] swiotlb_map+0x194/0x1b0
[ 12.025613] dma_direct_map_sg+0x12c/0x214
[ 12.026412] dma_map_sg_attrs+0x94/0xac
[ 12.027225] drm_gem_shmem_get_pages_sgt+0x84/0xd4 [drm]
[ 12.028310] virtio_gpu_object_shmem_init+0x4c/0x170 [virtio_gpu]
[ 12.029617] virtio_gpu_object_create+0x184/0x220 [virtio_gpu]
[ 12.030815] virtio_gpu_mode_dumb_create+0xb0/0x1a0 [virtio_gpu]
[ 12.032032] drm_mode_create_dumb+0x9c/0xc0 [drm]
[ 12.033039] drm_client_buffer_create+0x84/0x114 [drm]
[ 12.034133] drm_client_framebuffer_create+0x30/0x90 [drm]
[ 12.035254] drm_fb_helper_generic_probe+0x5c/0x170 [drm_kms_helper]
[ 12.036587] drm_fb_helper_single_fb_probe+0x2a8/0x440 [drm_kms_helper]
[ 12.037910] __drm_fb_helper_initial_config_and_unlock+0x48/0x154 [drm_kms_helper]
[ 12.039426] drm_fbdev_client_hotplug+0xbc/0x1c0 [drm_kms_helper]
[ 12.040717] drm_fbdev_generic_setup+0xc0/0x1c0 [drm_kms_helper]
[ 12.041165] hub 1-0:1.0: USB hub found
[ 12.041942] virtio_gpu_probe+0xc0/0x194 [virtio_gpu]
[ 12.043722] virtio_dev_probe+0x154/0x200
[ 12.044524] really_probe+0xf0/0x504
[ 12.045235] driver_probe_device+0xe4/0x100
[ 12.046065] device_driver_attach+0xd4/0xe0
[ 12.046895] __driver_attach+0xb4/0x180
[ 12.047619] bus_for_each_dev+0x6c/0xb0
[ 12.048300] driver_attach+0x30/0x3c
[ 12.048969] bus_add_driver+0x154/0x250
[ 12.049669] driver_register+0x84/0x140
[ 12.050338] register_virtio_driver+0x30/0x50
[ 12.050459] hub 1-0:1.0: 15 ports detected
[ 12.051101] virtio_gpu_driver_init+0x28/0x1000 [virtio_gpu]
[ 12.052762] do_one_initcall+0x44/0x170
[ 12.053559] do_init_module+0x60/0x27c
[ 12.054282] load_module+0x60c/0x760
[ 12.054692] xhci_hcd 0000:02:00.0: xHCI Host Controller
[ 12.054930] __do_sys_init_module+0xb0/0x120
[ 12.056549] __arm64_sys_init_module+0x28/0x34
[ 12.057357] el0_svc_common.constprop.0+0x80/0x1b0
[ 12.058223] do_el0_svc+0x30/0xa0
[ 12.058900] el0_sync_handler+0x90/0x1ec
[ 12.059807] el0_sync+0x15c/0x180
[ 12.060570] Code: aa1403e4 f9423660 910123e3 f9423e66 (f9400005)
[ 12.061936] ---[ end trace e80946b15db81549 ]---
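Because unrelated USB messages are interleaved with the oops, it can help to pull just the call-trace frames out of the boot log when comparing kernels. A minimal sketch follows, assuming one kernel message per line and the `func+0xoff/0xsize [module]` frame format seen in the trace above. The regex and the `call_trace` helper are illustrative assumptions, not an existing tool.

```python
import re

# Matches e.g. "[ 12.027225] drm_gem_shmem_get_pages_sgt+0x84/0xd4 [drm]".
FRAME_RE = re.compile(
    r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def call_trace(log_text: str):
    """Return (function, module or None) for frames in the first call trace."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call trace:" in line:
            in_trace = True
            continue
        if "---[ end trace" in line:
            break
        if in_trace:
            match = FRAME_RE.search(line)
            if match:  # interleaved non-frame messages simply do not match
                frames.append((match.group(1), match.group(2)))
    return frames


# Shortened excerpt of the oops above, including one interleaved USB line.
SAMPLE_OOPS = """\
[ 12.001083] pc : swiotlb_map+0x194/0x1b0
[ 12.024518] Call trace:
[ 12.024959] swiotlb_map+0x194/0x1b0
[ 12.025613] dma_direct_map_sg+0x12c/0x214
[ 12.027225] drm_gem_shmem_get_pages_sgt+0x84/0xd4 [drm]
[ 12.041165] hub 1-0:1.0: USB hub found
[ 12.041942] virtio_gpu_probe+0xc0/0x194 [virtio_gpu]
[ 12.061936] ---[ end trace e80946b15db81549 ]---
"""

frames = call_trace(SAMPLE_OOPS)
print("faulting function:", frames[0][0])
print("modules in trace:", sorted({mod for _, mod in frames if mod}))
```

On the excerpt, the first frame is the faulting function and the module list shows which loadable drivers were on the stack at the time of the crash.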
Created attachment 1714358: full boot log
Gerd are you aware of any issues around virtio-gpu on aarch64?
(In reply to Peter Robinson from comment #18)
> Gerd are you aware of any issues around virtio-gpu on aarch64?

virtio-gpu in general has problems, 5.9-rc5 & newer should be good again.
Fedora-Rawhide-20200915.n.1 compose with kernel-5.9.0-0.rc5.11.fc34 completed successfully.
> virtio-gpu in general has problems, 5.9-rc5 & newer should be good again.

Confirmed it is. Thanks.