Description of problem:

We regularly see qemu fail to show the kernel boot output when the guest is a Fedora 22 guest.

Version-Release number of selected component (if applicable):

Kernel 4.0.0-0.rc3.git0.1.fc22.x86_64 on an x86_64

In my case qemu is also running on a Fedora 22 host. We have also seen this with RHEL 7.x hosts.

qemu-2.2.0-5.fc22.x86_64

How reproducible:

One out of 10 boots.

Steps to Reproduce:
1. Run the Cockpit CI suite.

Actual results:

The entirety of the boot output:

Fedora release 22 (Twenty Two)
Kernel 4.0.0-0.rc3.git0.1.fc22.x86_64 on an x86_64 (ttyS0)

m11 login:

Expected results:

The entire boot output, including kernel messages, systemd initialization, etc. This is what we see the other 9 out of 10 times.
This breaks Cockpit development.
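To make the difference between the "actual" and "expected" results checkable by a script, here is a minimal sketch of a shell function that classifies a captured console log. The function name `classify_boot_log` and the patterns it looks for (a `login:` prompt, systemd `[  OK  ]` status lines, kernel `[    0.000000]` timestamps) are assumptions for illustration, not something the Cockpit test suite is known to use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a captured serial console log.
#   no-login   the login prompt never appeared
#   complete   boot messages were seen in addition to the login prompt
#   truncated  only the login banner/prompt made it through
classify_boot_log() {
    local log="$1"
    if ! grep -q 'login:' "$log"; then
        echo "no-login"
    # First pattern: systemd status lines such as "[  OK  ] Started ...".
    # "[^]]*" also tolerates color escape sequences around "OK".
    # Second pattern: kernel timestamps such as "[    1.234567]".
    elif grep -Eq '\[[^]]*OK[^]]*\]|\[ *[0-9]+\.[0-9]+\]' "$log"; then
        echo "complete"
    else
        echo "truncated"
    fi
}
```

For example, `classify_boot_log console.log` would print `truncated` for a log that contains only the release banner and the login prompt shown under "Actual results".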
Is there a reproducer which isn't "Run the Cockpit CI suite"? I can pretty much guarantee that no one will investigate this bug without considerably more information, such as the qemu command line being used and how you're expecting to get the console messages. Ideally I'd want to see a qemu command line which can be run and which intermittently demonstrates the loss of console messages, eg:

$ qemu-kvm -nodefaults -nographic -m 1024 -kernel /boot/vmlinuz-XXX -append "console=ttyS0" -serial stdio

FWIW here is a simple libguestfs-based test you can try:

$ libguestfs-test-tool

We have never seen intermittent lost console messages, however.
> I can pretty much guarantee that no one will investigate this bug without considerably more information,

Indeed, and I wanted to see what kind of information to provide. Thanks for the notes, that's a good place to get started.
I've started trying to 'tee' the output from qemu. This may have turned it into a heisenbug, where the way tee reads from the file descriptor makes the bug go away. Will keep you posted. In the meantime, this is the sort of qemu command line we're running:

qemu-kvm -m 1024 -drive if=virtio,file=/data/src/cockpit/test/run/cockpit-fedora-22-x86_64-root,index=0,serial=ROOT,snapshot=on -kernel /data/src/cockpit/test/run/cockpit-fedora-22-x86_64-kernel -initrd /data/src/cockpit/test/run/cockpit-fedora-22-x86_64-initrd -append 'root=/dev/vda console=ttyS0 quiet ' -nographic -net nic,model=virtio,macaddr=52:54:00:9e:00:00 -net bridge,vlan=0,br=cockpit0 -device virtio-scsi-pci,id=hot -monitor unix:path=/data/src/cockpit/test/run/machine-lKrTWb.mon,server,nowait
We continue to see this behavior off and on. We had to refactor our test suite so we didn't depend on qemu console output. But again, that doesn't help you debug this ... so I can close this for now. Sorry about that.
I have a very unreliable reproducer that I was meaning to upload and link to...

http://files.cockpit-project.org/~mvo/bootlog-reproducer.tar.xz (Warning, 600 MB.)

Instructions: Untar it and cd into the directory.

$ sudo ./vm-prep
$ ./check-example

This will very occasionally time out while waiting for a certain boot message. You might try this:

$ while ./check-example; do true; done

At this point, my personal hunch is that it's actually usually Fedora 22 that sometimes fails to output boot messages, but we have definitely also seen breakage with a Fedora 21 image. With Fedora 22, we always see the final "<hostname> login: " output, but sometimes no "[ OK ] Starting BlitzGewitter" etc messages. With Fedora 21, we used to sometimes not see any output. This is what made us think that the breakage happens in qemu.
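Since the failure is intermittent, it may help to put a number on how often it happens rather than stopping at the first failure. The following is a minimal sketch, assuming `./check-example` exits non-zero when it times out; the function name `count_failures` is made up for this example and is not part of the reproducer tarball.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command a fixed number of times and
# report how many runs failed.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
    local runs="$1"
    shift
    local failures=0
    local i=0
    while [ "$i" -lt "$runs" ]; do
        # Send the command's own output to stderr so that the summary
        # line is the only thing written to stdout.
        if ! "$@" >&2; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}
```

With the reproducer unpacked, this could be run as `count_failures 50 ./check-example`. Unlike the `while` loop above, it keeps going after a failure, so the result can be compared against the "one out of 10 boots" estimate.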