Bug 2218719

Summary: Cloud vm guest hangs after installing new kernel
Product: [Fedora] Fedora
Reporter: Josef Bacik <josef>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: NEW
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: unspecified
Version: 38
CC: berrange, cfergeau, crobinso, mcascell, pbonzini, philmd, rjones, virt-maint
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux

Description Josef Bacik 2023-06-30 02:02:37 UTC
If I build a new virt guest from an F38 cloud image (I assume other images are affected as well; this just happens to be how I'm doing it) and reinstall the kernel, I can no longer boot the guest. It simply hangs, with no console output at all.

Reproducible: Always

Steps to Reproduce:
1. Download a qcow2 image of Fedora 38.
2. Use virt-customize to inject your ssh key or set a root password.
3. Create the guest with virt-install:
virt-install --memory 4096 --vcpus 2 --name testvm \
        --import --disk $VM_IMAGE,format=qcow2,bus=virtio \
        --os-variant fedora38 \
        --network bridge=virbr0,model=virtio \
        --graphics none \
        --noautoconsole
4. Log into the VM, run dnf reinstall kernel-core, and reboot.
Actual Results:  
The VM no longer boots; it just hangs.

Expected Results:  
I would like it to actually boot.

The same workflow works fine on an older Fedora machine.
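
Step 2 of the reproduction does not show a command, so here is a rough sketch of what the key injection might look like. The helper name customize_cmd, the RUN variable, and the key path in the usage note are illustrative assumptions, not taken from the report; --ssh-inject is a standard virt-customize option, and --root-password password:... would be the alternative if a root password is preferred.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): builds the virt-customize
# command for step 2. By default it only prints the command; set RUN=1
# to actually execute it.
customize_cmd() {
    image=$1    # path to the qcow2 cloud image
    pubkey=$2   # path to the public ssh key to inject for root

    # Reuse the positional parameters to hold the command line.
    set -- virt-customize -a "$image" --ssh-inject "root:file:$pubkey"

    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}
```

For example, customize_cmd "$VM_IMAGE" "$HOME/.ssh/id_rsa.pub" prints the command so it can be checked before running it with RUN=1.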

Comment 1 Richard W.M. Jones 2023-06-30 07:46:43 UTC
You'll need to add a serial console to the guest, then interrupt grub and
edit the kernel command line.  Remove rhgb quiet.  Add console=ttyS0.  Hit
Ctrl+x to boot and see where it's hanging.
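
To make the suggested edit concrete, the sketch below applies the same change to a kernel command line given as a string: it drops rhgb and quiet and appends console=ttyS0. The function name edit_cmdline and the sample arguments in the usage note are illustrative assumptions; in practice the edit is made by hand in the grub editor.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): shows the effect of the
# manual grub edit on a kernel command line passed as one string.
edit_cmdline() {
    out=""
    # Intentionally unquoted so the command line splits into arguments.
    for arg in $1; do
        case $arg in
            rhgb|quiet) ;;                      # drop the two quieting options
            *) out="${out:+$out }$arg" ;;       # keep everything else in order
        esac
    done
    # Append the serial console so boot messages go to ttyS0.
    echo "${out:+$out }console=ttyS0"
}
```

For instance, edit_cmdline "root=UUID=abc ro rhgb quiet" yields "root=UUID=abc ro console=ttyS0". Once the guest has a serial console, virsh console testvm is the usual way to attach to it.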

Comment 2 Josef Bacik 2023-06-30 13:40:44 UTC
That makes it work, which is maddening.  This has gotten a bit confusing:

WORKING CASE 1:
1) create guest in the above described way.
2) grubby --remove-args="rhgb quiet" --args=console=ttyS0,115200 --update-kernel=DEFAULT
3) dnf reinstall -y kernel-core
4) reboot

WORKING CASE 2:
1) add '--boot uefi' to the above virt-install command.
2) dnf reinstall -y kernel-core
3) reboot

BROKEN CASE:
1) create guest in the above described way
2) dnf reinstall -y kernel-core (or even make install from a kernel source tree, which is what I was originally trying to do)
3) reboot

The image is set up to be UEFI booted; it has a UEFI partition and everything.  With the original virt-install command it gets booted in BIOS mode.  If I change the grubby options, something changes in how the grub install is done and it boots properly.  In case #1 we're still booted in BIOS mode after I reboot.  In the broken case the kernel reinstall appears to break BIOS booting, though I'm not entirely sure how.  In case #2 we're always UEFI and everything is happy.
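
Since the three cases differ in firmware mode, a quick check from inside the guest can confirm which mode a given boot used. This is a minimal sketch: the function name boot_mode and its optional root-prefix argument are illustrative assumptions. It relies on the fact that Linux exposes /sys/firmware/efi only when the kernel was booted through UEFI.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): reports the firmware mode
# of the running system. The optional argument is a root prefix, which
# is only useful for testing against a fake directory tree.
boot_mode() {
    root=${1:-}
    if [ -d "$root/sys/firmware/efi" ]; then
        echo UEFI
    else
        echo BIOS
    fi
}
```

Running boot_mode with no argument inside the guest should report BIOS for the original virt-install command and UEFI when '--boot uefi' is added.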

Comment 3 Richard W.M. Jones 2023-06-30 13:51:54 UTC
What's the full output from the kernel when it hangs?

Comment 4 Josef Bacik 2023-06-30 14:00:17 UTC
There's no output, it just sits there and spins.

Comment 5 Josef Bacik 2023-06-30 14:00:54 UTC
I don't think it's even getting to grub, the VM just spins on a cpu.

Comment 7 Josef Bacik 2023-06-30 16:00:06 UTC
I have the corresponding fix in my devel tree, which I was having the same issue with yesterday; that's how I noticed the problem.  I did the dnf reinstall to make sure it wasn't an upstream bug.