Bug 839156

Summary: Fedora 16 and 17 guests hang during boot
Product: Red Hat Enterprise Linux 6
Reporter: Amit Shah <amit.shah>
Component: qemu-kvm
Assignee: Amit Shah <amit.shah>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.4
CC: abaron, acathrow, bazulay, bsarathy, cpelland, dyasny, iheim, jbrooks, juzhang, mdeng, mfojtik, mgoldboi, michen, mkenneth, qzhang, rh-bugzilla, rhod, tvvcox, virt-maint, xuhj
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: virt
Fixed In Version: qemu-kvm-0.12.1.2-2.298.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 837925
Environment:
Last Closed: 2013-02-21 07:37:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 806717, 861049

Description Amit Shah 2012-07-11 06:25:40 UTC
systemd spawns a getty on each available serial port, and the virtio-console port is one of them.  If no listener is connected to that port on the host, the output from the guest is throttled until a listener connects.  This is desirable for virtio-serial ports, but not for console ports, whose output should be discarded in such cases.

With the current guest code, the guest just keeps spinning in a while() loop till it hears back from the host after writing data on a console port.  In this case, since there's no listener connected, the guest never hears back, and just keeps spinning.
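The spin described above can be illustrated with a toy model. This is only a sketch in Python, not the guest driver code; `guest_write`, `host_acks` and `max_spins` are hypothetical names, and the cap exists only so the example terminates (the loop described above has no such cap).

```python
def guest_write(host_acks, max_spins=1_000_000):
    """Toy model of the guest-side busy-wait after a console write.

    host_acks: zero-argument callable that returns True once the host
               has consumed the buffer (hypothetical stand-in).
    Returns the number of spins before the host answered, or None if
    the cap was hit, which stands in for "spins forever".
    """
    spins = 0
    while not host_acks():
        spins += 1
        if spins >= max_spins:
            return None  # no listener on the host: the guest never hears back
    return spins


# Host answers on the third poll: the guest spins twice, then continues.
acks = iter([False, False, True])
print(guest_write(lambda: next(acks)))

# No listener connected: the host never answers.
print(guest_write(lambda: False, max_spins=1000))
```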

The fix is to not throttle console ports, and to discard the data if no listener is connected.  This is done in upstream commit ed8e5a85a1741147ce06932b478a509ce3407061.

The hang occurs only when the console chardev is of type 'pty'.  'unix' and 'tcp' chardevs will just throttle.

With the fix, the default behaviour for 'unix' and 'tcp' chardevs will change as well to not throttle if the port is a console port.
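A rough sketch of the policy described above, written as a toy Python model and not taken from the upstream commit. `Port`, `host_write` and the `fixed` switch are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    is_console: bool                 # console port vs. generic virtio-serial port
    listener_connected: bool = False
    throttled: bool = False
    delivered: list = field(default_factory=list)


def host_write(port, data, fixed=True):
    """Return how many bytes the guest is told were consumed."""
    if port.listener_connected:
        port.delivered.append(data)
        return len(data)
    if fixed and port.is_console:
        # Fixed behaviour: console output with no listener is discarded,
        # but reported as consumed so the guest does not wait for it.
        return len(data)
    # Serial ports (and console ports before the fix) are throttled
    # until a listener connects.
    port.throttled = True
    return 0


console = Port(is_console=True)
serial = Port(is_console=False)
print(host_write(console, b"login: "), console.throttled)
print(host_write(serial, b"payload"), serial.throttled)
```

With `fixed=False` the console port takes the same throttled path as the serial port, which is the situation that leads to the hang on a 'pty' chardev.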

+++ This bug was initially created as a clone of Bug #837925 +++

Description of problem:

On oVirt 3.1, Fedora 16 and Fedora 17 guests hang during boot. F16 guests will install normally from the DVD iso, and then hang on first boot. F17 guests hang when booting from the DVD iso.

I booted up my F16 VM in rescue mode and started each of the services wanted by the default runlevel individually. getty.target appeared to be the cause of the hang: all the other services started without an issue, but right after "systemctl start getty.target" the system CPU usage climbed quickly to 99% and the VM stopped responding.

Version-Release number of selected component (if applicable):

4.10.0-4.fc17

I chose vdsm as the affected component because a user on the list mentioned a similar-sounding issue with F16 and F17 guests, manipulating vdsm directly: http://lists.ovirt.org/pipermail/users/2012-June/002555.html.

How reproducible:

Install an F16 guest on oVirt 3.1 and attempt to boot, or attempt to install an F17 guest from the DVD install image.

Actual results:

Guest hangs during boot.


Expected results:

Guest boots normally.

Comment 1 Amit Shah 2012-07-11 06:31:38 UTC
Just to highlight a behaviour change: without the fix, if a guest writes to a console port and the host-side unix or tcp chardev isn't connected, the output is buffered until the host connects or until the guest vq becomes full.

After the fix is applied, the guest output will be discarded till the host chardev gets connected.

The previous behaviour isn't important for console ports, as console output getting discarded is normal if no one is listening.

Before the fix, the guest freezes only when a 'pty' chardev is associated with the virtio-console port.
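To make the unix/tcp behaviour change concrete, here is a toy Python model of a console port's guest-to-host queue. It is a sketch under stated assumptions, not QEMU code: `ConsoleQueue`, `capacity` and `discard_when_disconnected` are hypothetical names, and the capacity of 2 is arbitrary.

```python
from collections import deque


class ConsoleQueue:
    """Toy guest-to-host queue for a console port on a unix/tcp chardev."""

    def __init__(self, capacity, discard_when_disconnected):
        self.capacity = capacity                    # stand-in for the vq size
        self.discard = discard_when_disconnected    # True models the fixed build
        self.pending = deque()
        self.connected = False
        self.received = []

    def guest_write(self, buf):
        """Return True if the guest can carry on, False if it has to wait."""
        if self.connected:
            self.received.append(buf)
            return True
        if self.discard:
            return True              # after the fix: output is dropped
        if len(self.pending) < self.capacity:
            self.pending.append(buf)  # before the fix: buffered for later
            return True
        return False                 # before the fix: vq is full

    def host_connect(self):
        self.connected = True
        while self.pending:
            self.received.append(self.pending.popleft())


before = ConsoleQueue(capacity=2, discard_when_disconnected=False)
print([before.guest_write(b) for b in (b"a", b"b", b"c")])
before.host_connect()
print(before.received)

after = ConsoleQueue(capacity=2, discard_when_disconnected=True)
print([after.guest_write(b) for b in (b"a", b"b", b"c")])
after.host_connect()
print(after.received)
```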

Comment 2 Qunfang Zhang 2012-07-11 10:40:51 UTC
Reproduced this issue with the following packages and steps:

Host:
2.6.32-272.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
seabios-0.6.1.2-19.el6.x86_64

Guest:
Fedora16-64 and Fedora17-64 released versions.

1. Fedora 17:
(1) Install Fedora 17 with virtio-console with pty chardev backend
Result:
Guest hangs at the beginning of installation, qemu-kvm process consumes 200% cpu (-smp 2)

(2) Install Fedora 17 with virtio-console with 'unix' chardev.
Result: 
Guest does not hang at the beginning, installation could proceed.

CLI:
# /usr/libexec/qemu-kvm  -m 2048 -smp 1,sockets=1,cores=1,threads=1 -enable-kvm -name fedora17 -uuid 32caa689-717d-4851-9800-908a55ee7d98 -k en-us -rtc base=localtime,driftfix=slew -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/home/qzhang-test/fedoar17-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=koTUXQrb,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/home/qzhang-test/Fedora-17-x86_64-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1b:42:72:12:1b,bus=pci.0,addr=0x5 -monitor stdio -qmp tcp:0:6667,server,nowait -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x7 -chardev unix,path=/tmp/qzhang,server,nowait,id=channel0 -device virtconsole,bus=virtio-serial0.0,chardev=channel0,name=org.fedoraproject.console.foo,id=console1,nr=1 -vnc :11 -boot dc

2. Fedora 16:
(1) Install fedora-16 with virtio-console device with pty backend. 
Result: Finished installation.

(2) Reboot guest after installation finished.
Result: Guest hangs and consumes 100% cpu.

Comment 3 Qunfang Zhang 2012-07-11 10:43:36 UTC
(In reply to comment #2)
> Reproduced this issue with the following packages and steps:
> 
> Host:
> 2.6.32-272.el6.x86_64
> qemu-kvm-0.12.1.2-2.295.el6.x86_64
> seabios-0.6.1.2-19.el6.x86_64
> 
> Guest:
> Fedora16-64 and Fedora17-64 released versions.
> 
> 1. Fedora 17:
> (1) Install Fedora 17 with virtio-console with pty chardev backend
> Result:
> Guest hangs at the beginning of installation, qemu-kvm process consumes 200%
> cpu (-smp 2)
> 
> (2) Install Fedora 17 with virtio-console with 'unix' chardev.
> Result: 
> Guest does not hang at the beginning, installation could proceed.
> 
> CLI:
> # /usr/libexec/qemu-kvm  -m 2048 -smp 1,sockets=1,cores=1,threads=1
> -enable-kvm -name fedora17 -uuid 32caa689-717d-4851-9800-908a55ee7d98 -k
> en-us -rtc base=localtime,driftfix=slew -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0
> -drive
> file=/home/qzhang-test/fedoar17-64.qcow2,if=none,id=drive-virtio-disk0,
> format=qcow2,serial=koTUXQrb,cache=none,werror=stop,rerror=stop,aio=native
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,
> id=virtio-disk0 -drive
> file=/home/qzhang-test/Fedora-17-x86_64-DVD.iso,if=none,media=cdrom,id=drive-
> ide0-1-0,readonly=on,format=raw -device
> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev
> tap,id=hostnet0,vhost=on -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1b:42:72:12:1b,bus=pci.0,
> addr=0x5 -monitor stdio -qmp tcp:0:6667,server,nowait -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,
> addr=0x7 -chardev unix,path=/tmp/qzhang,server,nowait,id=channel0 -device
> virtconsole,bus=virtio-serial0.0,chardev=channel0,name=org.fedoraproject.
> console.foo,id=console1,nr=1 -vnc :11 -boot dc
> 
pty:
-device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x7 -chardev pty,id=channel0 -device virtconsole,bus=virtio-serial0.0,chardev=channel0,name=org.fedoraproject.console.foo,id=console1,nr=1



> 2. Fedora 16:
> (1) Install fedora-16 with virtio-console device with pty backend. 
> Result: Finished installation.
> 
> (2) Reboot guest after installation finished.
> Result: Guest hangs and consumes 100% cpu.
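
For anyone scripting the two test cases, the console-related arguments differ only in the -chardev option. A small Python helper along these lines could generate them; `virtconsole_args` is a hypothetical name, and the option strings are copied from the command lines in this report rather than from any reference.

```python
def virtconsole_args(backend, path=None, chardev_id="channel0"):
    """Build the virtio-console part of the qemu-kvm command line.

    backend: "pty" (reproduces the hang on unfixed builds) or "unix".
    path:    socket path, required for the "unix" backend.
    """
    if backend == "pty":
        chardev = f"pty,id={chardev_id}"
    elif backend == "unix":
        if path is None:
            raise ValueError("the unix backend needs a socket path")
        chardev = f"unix,path={path},server,nowait,id={chardev_id}"
    else:
        raise ValueError(f"unsupported backend: {backend}")
    return [
        "-device",
        "virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,"
        "bus=pci.0,addr=0x7",
        "-chardev",
        chardev,
        "-device",
        f"virtconsole,bus=virtio-serial0.0,chardev={chardev_id},"
        "name=org.fedoraproject.console.foo,id=console1,nr=1",
    ]


print(" ".join(virtconsole_args("pty")))
print(" ".join(virtconsole_args("unix", path="/tmp/qzhang")))
```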

Comment 8 Amit Shah 2012-08-22 04:42:37 UTC
*** Bug 849461 has been marked as a duplicate of this bug. ***

Comment 9 Andrew Cathrow 2012-09-24 18:41:37 UTC
*** Bug 859017 has been marked as a duplicate of this bug. ***

Comment 13 Qunfang Zhang 2012-11-06 08:37:47 UTC
Reproduced this bug on qemu-kvm-0.12.1.2-2.295.el6.x86_64 and verified the fix on qemu-kvm-0.12.1.2-2.334.el6.x86_64.

Steps:

Install a Fedora 17 guest with pty console chardev.

# /usr/libexec/qemu-kvm  -m 2048 -smp 1,sockets=1,cores=1,threads=1 -enable-kvm -name fedora17-67 -uuid 11caa689-717d-4851-9800-908a55ee7d98 -k en-us -rtc base=localtime,driftfix=slew -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/home/fedora17-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=koTUXQrb,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/home/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:10:40:72:12:1b,bus=pci.0,addr=0x5 -monitor stdio -qmp tcp:0:6667,server,nowait -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x7 -chardev pty,id=channel0 -device virtconsole,bus=virtio-serial0.0,chardev=channel0,name=org.fedoraproject.console.foo,id=console1,nr=1 -vnc :11 -boot dc

In the unfixed version qemu-kvm-0.12.1.2-2.295.el6, the guest hangs at the beginning of installation and consumes 100% cpu.
In the fixed version qemu-kvm-0.12.1.2-2.334.el6, the guest doesn't hang and the installation proceeds until finished.

So this bug is fixed.
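
A quick way to tell whether a given build is at or past the "Fixed In Version" is to compare the release number. The Python sketch below assumes the package string looks like the ones quoted in this report; `has_console_fix` is a hypothetical helper, and it deliberately returns None for anything it does not recognise (for example z-stream builds, whose numbering this simple comparison does not cover).

```python
import re

# From the "Fixed In Version" field: qemu-kvm-0.12.1.2-2.298.el6
FIXED_RELEASE = 298

_PATTERN = re.compile(
    r"^qemu-kvm-0\.12\.1\.2-2\.(\d+)\.el6(?:\.[A-Za-z0-9_]+)?$"
)


def has_console_fix(package):
    """Return True/False for recognised builds, None when unsure."""
    match = _PATTERN.match(package)
    if match is None:
        return None
    return int(match.group(1)) >= FIXED_RELEASE


for pkg in (
    "qemu-kvm-0.12.1.2-2.295.el6.x86_64",
    "qemu-kvm-0.12.1.2-2.334.el6.x86_64",
):
    print(pkg, has_console_fix(pkg))
```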

Comment 15 errata-xmlrpc 2013-02-21 07:37:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0527.html