Bug 1562750 - VM doesn't boot from HD

Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Liran Rotenberg <lrotenbe>
Assignee: Fam Zheng <famz>
QA Contact: aihua liang <aliang>
Docs Contact:
CC: aliang, areis, asanders, chayang, chhu, coli, cshao, dgilbert, dyuan, famz, fweimer, huzhao, jiyan, juzhang, knoel, kraxel, lersek, lmen, lrotenbe, mavital, michal.skrivanek, michen, mkalinin, mkarg, mrezanin, ngu, pbonzini, pmatyas, qiyuan, ratamir, stefanha, virt-maint, weiwang, xuwei, xuzhang, yaniwang, ycui, ylavi
Target Milestone: rc
Target Release: ---
Keywords: Automation, AutomationBlocker, Reopened, ZStream
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-11.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-01 11:07:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1619128, 1623085
Description: Liran Rotenberg, 2018-04-02 11:18:18 UTC
Hi, Liran. I tried several scenarios in pure libvirt, but I am not sure I hit this bug, because in test scenario S2 below the VM cannot even start ("IOThreads not available for bus scsi target sda"). Could you please provide more details about the disk configuration? Thanks in advance.

When I checked 'engine.log' I found the XML configuration below. Note that I did not find a disk on the 'scsi' bus, only on the 'virtio' bus.

# cat engine.log
<disk type="file" device="cdrom" snapshot="no">
    <driver name="qemu" type="raw" error_policy="report"/>
    <source file="" startupPolicy="optional"/>
    <target dev="hdc" bus="ide"/>
    <readonly/>
    <alias name="ua-acb27e03-ae91-4876-8c26-c75092d9be36"/>
    <address bus="1" controller="0" unit="0" type="drive" target="0"/>
</disk>
<disk snapshot="no" type="file" device="disk">
    <target dev="vda" bus="virtio"/>
    <source file="/rhev/data-center/3ba182ae-3a1b-4c96-b742-41ca091cfc5e/7b10e8fe-c6bb-496a-8873-80b1067629db/images/d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c/2653a8c8-6aa6-4466-aa85-0ba5e58af129"/>
    <driver name="qemu" iothread="1" io="threads" type="qcow2" error_policy="stop" cache="none"/>
    <alias name="ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c"/>
    <address bus="0x00" domain="0x0000" function="0x0" slot="0x07" type="pci"/>
    <boot order="1"/>
    <serial>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</serial>
</disk>

Then I tried the following scenarios in pure libvirt.

Versions:
qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64
libvirt-3.9.0-14.virtcov.el7_5.2.x86_64
kernel-3.10.0-862.el7.x86_64

S1: Use the XML above and start the VM.

# virsh dumpxml test1 --inactive |grep "<disk" -A10
<disk type='file' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads' iothread='1'/>
    <source file='/var/lib/libvirt/images/optionalRHEL-7.5-x86_64-latest.qcow2'/>
    <target dev='sda' bus='virtio'/>
    <serial>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</serial>
    <boot order='1'/>
    <alias name='ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
    <driver name='qemu' type='raw' error_policy='report'/>
    <source startupPolicy='optional'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
    <alias name='ua-acb27e03-ae91-4876-8c26-c75092d9be36'/>
    <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>

# virsh start test1
Domain test1 started

# virsh console test1
Connected to domain test1
Escape character is ^]

Red Hat Enterprise Linux Server 7.5 Beta (Maipo)
Kernel 3.10.0-855.el7.x86_64 on an x86_64

localhost login: root
Password:
[root@localhost ~]#

S2: Use the XML above, but replace the 'virtio' bus with the 'scsi' bus.
# virsh dumpxml test1 --inactive |grep "<disk" -A10
<disk type='file' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads' iothread='1'/>
    <source file='/var/lib/libvirt/images/optionalRHEL-7.5-x86_64-latest.qcow2'/>
    <target dev='sda' bus='scsi'/>
    <serial>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</serial>
    <boot order='1'/>
    <alias name='ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='file' device='cdrom'>
    <driver name='qemu' type='raw' error_policy='report'/>
    <source startupPolicy='optional'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
    <alias name='ua-acb27e03-ae91-4876-8c26-c75092d9be36'/>
    <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>

# virsh start test1
error: Failed to start domain test1
error: unsupported configuration: IOThreads not available for bus scsi target sda

Sorry, that was the wrong way to enable an iothread for virtio-scsi; updated test steps follow. I tried maybe more than 20 times and could always log in to the guest.

S2 (updated): 'virtio-scsi' bus disk with the iothread enabled on the controller.

# virsh dumpxml test1 --inactive
<disk type='file' device='disk' snapshot='no'>
    <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads'/>
    <source file='/var/lib/libvirt/images/optionalRHEL-7.5-x86_64-latest.qcow2'/>
    <target dev='sda' bus='scsi'/>
    <serial>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</serial>
    <boot order='1'/>
    <alias name='ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='file' device='cdrom'>
    <driver name='qemu' type='raw' error_policy='report'/>
    <source startupPolicy='optional'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
    <alias name='ua-acb27e03-ae91-4876-8c26-c75092d9be36'/>
    <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
    <driver iothread='1'/>
    <alias name='ua-46657f70-84ff-4a4f-9ff2-345441b34e7a'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</controller>

# virsh start test1
Domain test1 started

# virsh console test1
Connected to domain test1
Escape character is ^]

Red Hat Enterprise Linux Server 7.5 Beta (Maipo)
Kernel 3.10.0-855.el7.x86_64 on an x86_64

localhost login: root
Password:
[root@localhost ~]#

Could you please upload an engine.log with the complete dumpxml for a 'virtio-scsi' bus disk? (I can only find the 'virtio' bus in the current engine.log.)

Created attachment 1416719 [details]
engine dumpxml
Using the original XML from 'engine.log' I can hit this bug: sometimes the VM boots successfully and sometimes it fails, roughly half and half.

Steps: define the VM from the XML file and then start it.

# diff bug.xml originalfromengine.xml
3c3
< <name>bug</name>
---
> <name>golden_env_mixed_virtio_3</name>
37,39d36
< <graphics type='vnc' port='-1' autoport='yes'>
< <listen type='address'/>
< </graphics>
42c39
< <source mode="bind" path="/var/lib/libvirt/qemu/1"/>
---
> <source mode="bind" path="/var/lib/libvirt/qemu/channels/3e685cf8-ea27-4a87-9c51-0f7edb550e64.ovirt-guest-agent.0"/>
46c43
< <source mode="bind" path="/var/lib/libvirt/qemu/2"/>
---
> <source mode="bind" path="/var/lib/libvirt/qemu/channels/3e685cf8-ea27-4a87-9c51-0f7edb550e64.org.qemu.guest_agent.0"/>
84a82
> <listen type="network" network="vdsm-ovirtmgmt"/>
92c90
< <source bridge="br0"/>
---
> <source bridge="ovirtmgmt"/>
109c107
< <source file="/var/lib/libvirt/images/optionalRHEL-7.5-x86_64-latest.qcow2"/>
---
> <source file="/rhev/data-center/c73ab90f-ae2e-4289-bfd5-076a20c1501f/3bcff1ce-0131-4a5b-bddc-59240b1e6b36/images/48800e5d-1cd3-461b-b690-ed3a9684fe62/7ce2a429-5e5c-4d90-8556-46db52aa1987"/>

# virsh define bug.xml
Domain bug defined from bug.xml

# virsh start bug
Domain bug started

As for the result, sometimes the VM just hangs at 'Booting from Hard Disk' and sometimes I can log in to the VM. The following attachment 'bug.tar.gz' includes: bug.xml, originalfromengine.xml, failedboot.png and successfullyboot.png.

Created attachment 1417002 [details]
attachment for comment7

I was not able to reproduce this issue and Jing Yan is not able to reproduce it anymore either. However, the limited logs we have for the last domain start which was reported as the one that hit this bug show that libvirt started QEMU with the specified disk as expected, and QEMU did not report any errors accessing the disk. In other words, it looks like some magic issue between QEMU and SeaBIOS.
vdsm.log: 2018-04-02 11:44:55,592+0300 INFO (vm/cadb924c) [virt.vm] (vmId='cadb924c-9eff-4f96-a4d9-42d053d62da9') <?xml version='1.0' encoding='utf-8'?> <domain xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" type="kvm"> <name>golden_env_mixed_virtio_1_0</name> <uuid>cadb924c-9eff-4f96-a4d9-42d053d62da9</uuid> <memory>1048576</memory> <currentMemory>1048576</currentMemory> <iothreads>1</iothreads> <maxMemory slots="16">4194304</maxMemory> <vcpu current="1">16</vcpu> <sysinfo type="smbios"> <system> <entry name="manufacturer">oVirt</entry> <entry name="product">RHEV Hypervisor</entry> <entry name="version">7.5-8.el7</entry> <entry name="serial">4C4C4544-0056-3010-8056-C8C04F563232</entry> <entry name="uuid">cadb924c-9eff-4f96-a4d9-42d053d62da9</entry> </system> </sysinfo> <clock adjustment="0" offset="variable"> <timer name="rtc" tickpolicy="catchup" /> <timer name="pit" tickpolicy="delay" /> <timer name="hpet" present="no" /> </clock> <features> <acpi /> </features> <cpu match="exact"> <model>Conroe</model> <topology cores="1" sockets="16" threads="1" /> <numa> <cell cpus="0" memory="1048576" /> </numa> </cpu> <cputune /> <devices> <input bus="ps2" type="mouse" /> <channel type="unix"> <target name="ovirt-guest-agent.0" type="virtio" /> <source mode="bind" path="/var/lib/libvirt/qemu/channels/cadb924c-9eff-4f96-a4d9-42d053d62da9.ovirt-guest-agent.0" /> </channel> <channel type="unix"> <target name="org.qemu.guest_agent.0" type="virtio" /> <source mode="bind" path="/var/lib/libvirt/qemu/channels/cadb924c-9eff-4f96-a4d9-42d053d62da9.org.qemu.guest_agent.0" /> </channel> <graphics autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" port="-1" tlsPort="-1" type="spice"> <channel mode="secure" name="main" /> <channel mode="secure" name="inputs" /> <channel mode="secure" name="cursor" /> <channel mode="secure" name="playback" /> <channel mode="secure" name="record" /> <channel mode="secure" name="display" /> <channel mode="secure" name="smartcard" /> <channel mode="secure" name="usbredir" /> <listen network="vdsm-ovirtmgmt" type="network" /> </graphics> <controller index="0" model="piix3-uhci" type="usb"> <address bus="0x00" domain="0x0000" function="0x2" slot="0x01" type="pci" /> </controller> <rng model="virtio"> <backend model="random">/dev/urandom</backend> <alias name="ua-70d0edb3-4dc7-48d5-be0a-30109d5c50a6" /> </rng> <controller index="0" model="virtio-scsi" type="scsi"> <driver iothread="1" /> <alias name="ua-97cc068a-eafa-4c6b-a1e1-d61ffbcd6524" /> </controller> <controller index="0" type="ide"> <address bus="0x00" domain="0x0000" function="0x1" slot="0x01" type="pci" /> </controller> <memballoon model="virtio"> <stats period="5" /> <alias name="ua-a925c27a-880d-43a1-adf3-b398968fd8f8" /> <address bus="0x00" domain="0x0000" function="0x0" slot="0x08" type="pci" /> </memballoon> <sound model="ich6"> <alias name="ua-bd10e3e9-bb2d-4eac-a56b-944122b21d8d" /> <address bus="0x00" domain="0x0000" function="0x0" slot="0x04" type="pci" /> </sound> <video> <model heads="1" ram="65536" type="qxl" vgamem="16384" vram="8192" /> <alias name="ua-d6bf25f1-27c1-4977-ba25-29041fea2c1a" /> <address bus="0x00" domain="0x0000" function="0x0" slot="0x02" type="pci" /> </video> <controller index="0" ports="16" type="virtio-serial"> <alias name="ua-dfd83dc8-909c-4c5f-9af2-4d546abe82db" /> <address bus="0x00" domain="0x0000" function="0x0" slot="0x06" type="pci" /> </controller> <channel type="spicevmc"> <target name="com.redhat.spice.0" type="virtio" /> </channel> 
<disk device="cdrom" snapshot="no" type="file"> <address bus="1" controller="0" target="0" type="drive" unit="0" /> <source file="" startupPolicy="optional" /> <target bus="ide" dev="hdc" /> <readonly /> <driver error_policy="report" name="qemu" type="raw" /> <alias name="ua-acb27e03-ae91-4876-8c26-c75092d9be36" /> </disk> <disk device="disk" snapshot="no" type="file"> <address bus="0x00" domain="0x0000" function="0x0" slot="0x07" type="pci" /> <source file="/rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com:_nas01_ge__4__nfs__0/7b10e8fe-c6bb-496a-8873-80b1067629db/images/d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c/2653a8c8-6aa6-4466-aa85-0ba5e58af129" /> <target bus="virtio" dev="vda" /> <serial>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</serial> <boot order="1" /> <driver cache="none" error_policy="stop" io="threads" iothread="1" name="qemu" type="qcow2" /> <alias name="ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c" /> </disk> <interface type="bridge"> <address bus="0x00" domain="0x0000" function="0x0" slot="0x03" type="pci" /> <mac address="00:1a:4a:16:88:c9" /> <model type="virtio" /> <source bridge="ovirtmgmt" /> <filterref filter="vdsm-no-mac-spoofing" /> <alias name="ua-e5ae5ef0-eb1e-41bf-ba4b-277c95d52131" /> </interface> </devices> <pm> <suspend-to-disk enabled="no" /> <suspend-to-mem enabled="no" /> </pm> <os> <type arch="x86_64" machine="pc-i440fx-rhel7.5.0">hvm</type> <smbios mode="sysinfo" /> </os> <metadata> <ns0:qos /> <ovirt-vm:vm> <minGuaranteedMemoryMb type="int">1024</minGuaranteedMemoryMb> <clusterVersion>4.2</clusterVersion> <ovirt-vm:custom /> <ovirt-vm:device mac_address="00:1a:4a:16:88:c9"> <ovirt-vm:custom /> </ovirt-vm:device> <ovirt-vm:device devtype="disk" name="vda"> <ovirt-vm:poolID>3ba182ae-3a1b-4c96-b742-41ca091cfc5e</ovirt-vm:poolID> <ovirt-vm:volumeID>2653a8c8-6aa6-4466-aa85-0ba5e58af129</ovirt-vm:volumeID> <ovirt-vm:imageID>d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c</ovirt-vm:imageID> <ovirt-vm:domainID>7b10e8fe-c6bb-496a-8873-80b1067629db</ovirt-vm:domainID> </ovirt-vm:device> <launchPaused>false</launchPaused> <resumeBehavior>auto_resume</resumeBehavior> </ovirt-vm:vm> </metadata> </domain> qemu.log: 2018-04-02 08:44:56.272+0000: starting up libvirt version: 3.9.0, package: 14.el7_5.2 (Red Hat, Inc. 
<http://bugzilla.redhat.com/bugzilla>, 2018-03-20-14:43:16, x86-041.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-21.el7_5.1), hostname: lynx12.lab.eng.tlv2.redhat.com LC_ALL=C \ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ QEMU_AUDIO_DRV=spice \ /usr/libexec/qemu-kvm \ -name guest=golden_env_mixed_virtio_1_0,debug-threads=on \ -S \ -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-66-golden_env_mixed_vir/master-key.aes \ -machine pc-i440fx-rhel7.5.0,accel=kvm,usb=off,dump-guest-core=off \ -cpu Conroe \ -m size=1048576k,slots=16,maxmem=4194304k \ -realtime mlock=off \ -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 \ -object iothread,id=iothread1 \ -numa node,nodeid=0,cpus=0,mem=1024 \ -uuid cadb924c-9eff-4f96-a4d9-42d053d62da9 \ -smbios 'type=1,manufacturer=oVirt,product=RHEV Hypervisor,version=7.5-8.el7,serial=4C4C4544-0056-3010-8056-C8C04F563232,uuid=cadb924c-9eff-4f96-a4d9-42d053d62da9' \ -no-user-config \ -nodefaults \ -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-66-golden_env_mixed_vir/monitor.sock,server,nowait \ -mon chardev=charmonitor,id=monitor,mode=control \ -rtc base=2018-04-02T08:44:55,driftfix=slew \ -global kvm-pit.lost_tick_policy=delay \ -no-hpet \ -no-shutdown \ -global PIIX4_PM.disable_s3=1 \ -global PIIX4_PM.disable_s4=1 \ -boot strict=on \ -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ -device virtio-scsi-pci,iothread=iothread1,id=ua-97cc068a-eafa-4c6b-a1e1-d61ffbcd6524,bus=pci.0,addr=0x5 \ -device virtio-serial-pci,id=ua-dfd83dc8-909c-4c5f-9af2-4d546abe82db,max_ports=16,bus=pci.0,addr=0x6 \ -drive if=none,id=drive-ua-acb27e03-ae91-4876-8c26-c75092d9be36,readonly=on,werror=report,rerror=report \ -device ide-cd,bus=ide.1,unit=0,drive=drive-ua-acb27e03-ae91-4876-8c26-c75092d9be36,id=ua-acb27e03-ae91-4876-8c26-c75092d9be36 \ -drive file=/rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com:_nas01_ge__4__nfs__0/7b10e8fe-c6bb-496a-8873-80b1067629db/images/d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c/2653a8c8-6aa6-4466-aa85-0ba5e58af129,format=qcow2,if=none,id=drive-ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,serial=d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,cache=none,werror=stop,rerror=stop,aio=threads \ -device virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,addr=0x7,drive=drive-ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,id=ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,bootindex=1 \ -netdev tap,fd=33,id=hostua-e5ae5ef0-eb1e-41bf-ba4b-277c95d52131,vhost=on,vhostfd=35 \ -device virtio-net-pci,netdev=hostua-e5ae5ef0-eb1e-41bf-ba4b-277c95d52131,id=ua-e5ae5ef0-eb1e-41bf-ba4b-277c95d52131,mac=00:1a:4a:16:88:c9,bus=pci.0,addr=0x3 \ -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/cadb924c-9eff-4f96-a4d9-42d053d62da9.ovirt-guest-agent.0,server,nowait \ -device virtserialport,bus=ua-dfd83dc8-909c-4c5f-9af2-4d546abe82db.0,nr=1,chardev=charchannel0,id=channel0,name=ovirt-guest-agent.0 \ -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/cadb924c-9eff-4f96-a4d9-42d053d62da9.org.qemu.guest_agent.0,server,nowait \ -device virtserialport,bus=ua-dfd83dc8-909c-4c5f-9af2-4d546abe82db.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 \ -chardev spicevmc,id=charchannel2,name=vdagent \ -device virtserialport,bus=ua-dfd83dc8-909c-4c5f-9af2-4d546abe82db.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 \ -spice 
port=5902,tls-port=5903,addr=10.46.16.27,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on \ -device qxl-vga,id=ua-d6bf25f1-27c1-4977-ba25-29041fea2c1a,ram_size=67108864,vram_size=8388608,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 \ -device intel-hda,id=ua-bd10e3e9-bb2d-4eac-a56b-944122b21d8d,bus=pci.0,addr=0x4 \ -device hda-duplex,id=ua-bd10e3e9-bb2d-4eac-a56b-944122b21d8d-codec0,bus=ua-bd10e3e9-bb2d-4eac-a56b-944122b21d8d.0,cad=0 \ -device virtio-balloon-pci,id=ua-a925c27a-880d-43a1-adf3-b398968fd8f8,bus=pci.0,addr=0x8 \ -object rng-random,id=objua-70d0edb3-4dc7-48d5-be0a-30109d5c50a6,filename=/dev/urandom \ -device virtio-rng-pci,rng=objua-70d0edb3-4dc7-48d5-be0a-30109d5c50a6,id=ua-70d0edb3-4dc7-48d5-be0a-30109d5c50a6,bus=pci.0,addr=0x9 \ -msg timestamp=on 2018-04-02 08:44:56.287+0000: 80940: info : virObjectUnref:350 : OBJECT_UNREF: obj=0x7f2210114a90 2018-04-02T08:44:56.466027Z qemu-kvm: -drive file=/rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com:_nas01_ge__4__nfs__0/7b10e8fe-c6bb-496a-8873-80b1067629db/images/d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c/2653a8c8-6aa6-4466-aa85-0ba5e58af129,format=qcow2,if=none,id=drive-ua-d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,serial=d5316ae9-c6a4-4cca-a3ee-ef890f63cd8c,cache=none,werror=stop,rerror=stop,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead 2018-04-02T08:44:56.478337Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 1 [socket-id: 1, core-id: 0, thread-id: 0], CPU 2 [socket-id: 2, core-id: 0, thread-id: 0], CPU 3 [socket-id: 3, core-id: 0, thread-id: 0], CPU 4 [socket-id: 4, core-id: 0, thread-id: 0], CPU 5 [socket-id: 5, core-id: 0, thread-id: 0], CPU 6 [socket-id: 6, core-id: 0, thread-id: 0], CPU 7 [socket-id: 7, core-id: 0, thread-id: 0], CPU 8 [socket-id: 8, core-id: 0, thread-id: 0], CPU 9 [socket-id: 9, core-id: 0, thread-id: 0], CPU 10 [socket-id: 10, core-id: 0, thread-id: 0], CPU 11 [socket-id: 11, core-id: 0, thread-id: 0], CPU 12 [socket-id: 12, core-id: 0, thread-id: 0], CPU 13 [socket-id: 13, core-id: 0, thread-id: 0], CPU 14 [socket-id: 14, core-id: 0, thread-id: 0], CPU 15 [socket-id: 15, core-id: 0, thread-id: 0] 2018-04-02T08:44:56.478362Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future main_channel_link: add main channel client main_channel_client_handle_pong: net test: latency 6.278000 ms, bitrate 143982002 bps (137.311937 Mbps) inputs_connect: inputs channel client create red_qxl_set_cursor_peer: red_channel_client_disconnect: rcc=0x5613c1b4cdd0 (channel=0x5613c1bdf090 type=4 id=0) red_channel_client_disconnect: rcc=0x5613c267a5b0 (channel=0x5613c092ea30 type=2 id=0) red_channel_client_disconnect: rcc=0x5613c12f89c0 (channel=0x5613c092f100 type=3 id=0) red_channel_client_disconnect: rcc=0x5613c24881b0 (channel=0x5613c092eae0 type=5 id=0) red_channel_client_disconnect: rcc=0x5613c1b901b0 (channel=0x5613c092eba0 type=6 id=0) red_channel_client_disconnect: rcc=0x5613c131c9f0 (channel=0x5613c092e960 type=1 id=0) main_channel_client_on_disconnect: rcc=0x5613c131c9f0 red_client_destroy: destroy client 0x5613c0970170 with #channels=6 red_qxl_disconnect_cursor_peer: red_qxl_disconnect_display_peer: 2018-04-02T08:45:13.486342Z qemu-kvm: terminating on signal 15 
from pid 72746 (<unknown process>) 2018-04-02 08:45:13.686+0000: shutting down, reason=destroyed

(In reply to Jiri Denemark from comment #9)
> I was not able to reproduce this issue and Jing Yan is not able to reproduce
> it anymore either.

:-( CondNAK(Reproducer). In case it reproduces somehow, somewhere, please report the details here.

> However, the limited logs we have for the last domain start
> which was reported as the one that hit this bug show that libvirt started QEMU
> with the specified disk as expected and QEMU did not report any errors
> accessing the disk. In other words, it looks like some magic issue between
> QEMU and SeaBIOS.

OK, we'll keep an eye on it, but we are waiting for a reproducer.

QE tried 20 times and could not reproduce this issue with the attached XML file.

Versions:
Host:
kernel-3.10.0-862.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.1
libvirt-3.9.0-14.el7.x86_64
seabios-bin-1.11.0-2.el7.noarch
Guest:
kernel-3.10.0-862.el7.x86_64

(In reply to Xueqiang Wei from comment #11)
> QE tried 20 times, could not reproduce this issue by attached xml file.

Looks like this can't be reproduced anywhere (see also comment #9), so I'm closing this bug. Please reopen if it can be reproduced somehow or if there's new data.

Hi,
It seems this bug has popped up again. Reopening and attaching logs.

ovirt-engine-4.2.5-0.1.el7ev.noarch
kernel-3.10.0-862.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64
seabios-bin-1.11.0-2.el7.noarch
vdsm-4.20.33-1.el7ev.x86_64

Additional info: the logs contain the sosreport for host3 (the host with the problematic start-up, also the SPM) and the engine log. The correlation ID on the engine is vms_syncAction_5e62130a-9eb2-41a5.

Created attachment 1459389 [details]
logs
(In reply to Liran Rotenberg from comment #13)
> Hi,
> It seems this bug has popped up again. Reopening and attaching logs.
>
> ovirt-engine-4.2.5-0.1.el7ev.noarch
> kernel-3.10.0-862.el7.x86_64
> qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64
> seabios-bin-1.11.0-2.el7.noarch
> vdsm-4.20.33-1.el7ev.x86_64
>
> Additional info: the logs contain the sosreport for host3 (the host with the
> problematic start-up, also the SPM) and the engine log.
> The correlation ID on the engine is vms_syncAction_5e62130a-9eb2-41a5.

Can we access the machine where this reproduces? How frequently is it reproducing? Are the steps/configuration the same as before?

(In reply to Jiri Denemark from comment #9)
> I was not able to reproduce this issue and Jing Yan is not able to reproduce
> it anymore either. However, the limited logs we have for the last domain start
> which was reported as the one that hit this bug show that libvirt started QEMU
> with the specified disk as expected and QEMU did not report any errors
> accessing the disk. In other words, it looks like some magic issue between
> QEMU and SeaBIOS.
> qemu.log:
> 2018-04-02 08:44:56.272+0000: starting up libvirt version: 3.9.0, package: 14.el7_5.2
> (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-03-20-14:43:16,
> x86-041.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-21.el7_5.1),
> hostname: lynx12.lab.eng.tlv2.redhat.com
> [...]

QE still could not reproduce this issue as described in comment 11 with the QEMU command line above.

"""
> 2018-04-02T08:45:13.486342Z qemu-kvm: terminating on signal 15 from pid
> 72746 (<unknown process>)
> 2018-04-02 08:45:13.686+0000: shutting down, reason=destroyed
"""
The qemu-kvm process was killed here; I am not sure whether that affected the VM boot.

I've tried to take a look at the logs in comment 13 and comment 14. The things I'm seeing could be "business as usual" for oVirt, but they are certainly new to me.

(1) Is this a nested virt scenario? When I untar "sosreport-host_mixed_3-20180717134200.tar", it produces a directory called "sosreport-virt-nested-vm13.scl.lab.tlv.redhat.com-20180717134200".

(2) Grepping "engine.log" for "virt-nested-vm13", there's a good number of "Heartbeat exceeded for host 'virt-nested-vm13.scl.lab.tlv.redhat.com'" messages.

(3) Looking at "/var/log/messages" in the sosreport, I see three soft lockups. The first such message seems to be the first lockup since the host had been booted (because when it is logged, the kernel isn't tainted yet); the other two already display "soft lockup" as the taint reason:

Jul 16 16:18:08 virt-nested-vm13 kernel: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [python:22132]
Jul 16 16:39:36 virt-nested-vm13 kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [python:8725]
Jul 16 16:47:40 virt-nested-vm13 kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd:1]

(References: <https://access.redhat.com/solutions/40594>, <https://access.redhat.com/articles/371803>.)

Note that these soft lockups all precede the actual VM start in question -- the time for that can be located in "engine.log", using the correlation ID specified in comment 13 --> "2018-07-17 12:28:32,448+03".

(4) Looking at the generated domain XML, it's not clear to me what storage the disk image is on.
<disk snapshot="no" type="file" device="disk"> <target dev="vda" bus="virtio"/> <source file="/rhev/data-center/efffe225-3927-4563-968c-42d3d98f482e/d011d522-efb3-47f2-859e-d3e444eb4d4b/images/d4af4c5d-ade6-46d7-acde-d25e6b5109f2/10de0412-2454-46a1-a453-8750f0d3a3af"/> <driver name="qemu" iothread="1" io="threads" type="qcow2" error_policy="stop" cache="none"/> <alias name="ua-d4af4c5d-ade6-46d7-acde-d25e6b5109f2"/> <address bus="0x00" domain="0x0000" function="0x0" slot="0x07" type="pci"/> <boot order="1"/> <serial>d4af4c5d-ade6-46d7-acde-d25e6b5109f2</serial> </disk> Can it be confirmed please that this recent domain XML fragment is consistent with the originally reported symptom? ("The VM doesn't start from the bootable HD with OS. It goes to the other bootable options(PXE for example)".) Because, if the storage backend had failed, then I believe the above fragment would have caused QEMU to pause the guest (error_policy="stop"). But, according to the original report, the guest is not paused, it advances to the next boot option. Now, in the domain XML generated for correlation ID "vms_syncAction_5e62130a-9eb2-41a5", only this one disk is marked with <boot order=...>, and libvirt always generates "HALT" in the "bootorder" fw_cfg file for the firmware. Meaning that, if all of the explicitly marked boot devices fail (= the sole disk), then the firmware will not boot at all. So... points (1) through (3) suggest to me that the host is over-subscribed (nested virt, heartbeats missed, CPUs intermittently stuck), and the boot fails due to that, for some reason. Whereas point (4) suggests that the firmware fails to read from the storage backend (which is of unknown kind), such that QEMU doesn't actually detect the failure as an I/O error (because in that case QEMU would set the VM state to "paused"); as a consequence, SeaBIOS has nothing else to boot, and stops. This could be validated against the graphical display of the VM (the domain XML does specify QXL + spice). Additionally, the SeaBIOS log should be possible to capture, assuming we can modify the generated domain XML fragment, and write to some filesystem (such as /tmp). I sent Markus some ideas for this earlier. ... Is it possible that the storage backend is over-subscribed (... as well)? Because, from comment 19, "This bug reproduced very easily on <REDACTED>. Just mark all vms and start them." Launching a herd of VMs simultaneously suggests a spike in both host CPU load and storage backend load. Made some progress. The following SeaBIOS debug patch was non-intrusive enough to let the hang / spin reproduce (in 1 out of 6 VMs launched at the same time), and it provides some insight: > diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c > index 88d7e54a2a92..853738f96eea 100644 > --- a/src/hw/virtio-blk.c > +++ b/src/hw/virtio-blk.c > @@ -23,57 +23,69 @@ > #include "virtio-ring.h" > #include "virtio-blk.h" > > +#define OPS_DONE_LOG_LIMIT 10 > + > struct virtiodrive_s { > struct drive_s drive; > struct vring_virtqueue *vq; > struct vp_device vp; > + unsigned ops_done; > }; > > static int > virtio_blk_op(struct disk_op_s *op, int write) > { > struct virtiodrive_s *vdrive = > container_of(op->drive_fl, struct virtiodrive_s, drive); > struct vring_virtqueue *vq = vdrive->vq; > struct virtio_blk_outhdr hdr = { > .type = write ? 
VIRTIO_BLK_T_OUT : VIRTIO_BLK_T_IN, > .ioprio = 0, > .sector = op->lba, > }; > u8 status = VIRTIO_BLK_S_UNSUPP; > struct vring_list sg[] = { > { > .addr = (void*)(&hdr), > .length = sizeof(hdr), > }, > { > .addr = op->buf_fl, > .length = vdrive->drive.blksize * op->count, > }, > { > .addr = (void*)(&status), > .length = sizeof(status), > }, > }; > + unsigned op_nr = vdrive->ops_done++; > + > + if (op_nr < OPS_DONE_LOG_LIMIT) { > + dprintf(1, "virtio-blk @ %p op#%u submit\n", vdrive, op_nr); > + } > > /* Add to virtqueue and kick host */ > if (write) > vring_add_buf(vq, sg, 2, 1, 0, 0); > else > vring_add_buf(vq, sg, 1, 2, 0, 0); > vring_kick(&vdrive->vp, vq, 1); > > /* Wait for reply */ > while (!vring_more_used(vq)) > usleep(5); > > + if (op_nr < OPS_DONE_LOG_LIMIT) { > + dprintf(1, "virtio-blk @ %p op#%u done\n", vdrive, op_nr); > + } > + > /* Reclaim virtqueue element */ > vring_get_buf(vq, NULL); > > /* Clear interrupt status register. Avoid leaving interrupts stuck if > * VRING_AVAIL_F_NO_INTERRUPT was ignored and interrupts were raised. > */ > vp_get_isr(&vdrive->vp); > > return status == VIRTIO_BLK_S_OK ? DISK_RET_SUCCESS : DISK_RET_EBADTRACK; > } 5 out of 6 VMs logged: > enter handle_19: > NULL > Booting from Hard Disk... > virtio-blk @ 0x000ebb80 op#0 submit > virtio-blk @ 0x000ebb80 op#0 done > Booting from 0000:7c00 > virtio-blk @ 0x000ebb80 op#1 submit > virtio-blk @ 0x000ebb80 op#1 done > virtio-blk @ 0x000ebb80 op#2 submit > virtio-blk @ 0x000ebb80 op#2 done > virtio-blk @ 0x000ebb80 op#3 submit > virtio-blk @ 0x000ebb80 op#3 done > virtio-blk @ 0x000ebb80 op#4 submit > virtio-blk @ 0x000ebb80 op#4 done > virtio-blk @ 0x000ebb80 op#5 submit > virtio-blk @ 0x000ebb80 op#5 done > virtio-blk @ 0x000ebb80 op#6 submit > virtio-blk @ 0x000ebb80 op#6 done > virtio-blk @ 0x000ebb80 op#7 submit > virtio-blk @ 0x000ebb80 op#7 done > virtio-blk @ 0x000ebb80 op#8 submit > virtio-blk @ 0x000ebb80 op#8 done > virtio-blk @ 0x000ebb80 op#9 submit > virtio-blk @ 0x000ebb80 op#9 done > VBE mode info request: 140 > ... but the 1 that got stuck spinning logged: > enter handle_19: > NULL > Booting from Hard Disk... > virtio-blk @ 0x000ebb80 op#0 submit > [HANG] In other words, SeaBIOS submits the virtio-blk read request for the MBR, but it never sees the request complete, so it waits forever. I'll enable some tracing in QEMU to see what happens. More progress. 
After hours of struggle which I won't go into (basically, read "/usr/share/doc/qemu-kvm/README.systemtap", and try to fill in the gaps the best you can...), I managed to create the following files on the host: "/etc/systemtap/conf.d/qemu_kvm.conf": > # Force load uprobes (see BZ#1118352) > stap -e 'probe process("/usr/libexec/qemu-kvm").function("main") { printf("") }' -c true > > qemu_kvm_OPT="-s1024" # per-CPU buffer size, in megabytes "/etc/systemtap/script.d/qemu_kvm.stp" > probe qemu.kvm.simpletrace.virtio_blk_data_plane_start {} > probe qemu.kvm.simpletrace.virtio_blk_data_plane_stop {} > probe qemu.kvm.simpletrace.virtio_blk_handle_read {} > probe qemu.kvm.simpletrace.virtio_blk_handle_write {} > probe qemu.kvm.simpletrace.virtio_blk_req_complete {} > probe qemu.kvm.simpletrace.virtio_blk_rw_complete {} > probe qemu.kvm.simpletrace.virtio_blk_submit_multireq {} > probe qemu.kvm.simpletrace.virtio_notify {} > probe qemu.kvm.simpletrace.virtio_notify_irqfd {} > probe qemu.kvm.simpletrace.virtio_queue_notify {} > probe qemu.kvm.simpletrace.virtio_set_status {} > probe qemu.kvm.simpletrace.virtqueue_alloc_element {} > probe qemu.kvm.simpletrace.virtqueue_fill {} > probe qemu.kvm.simpletrace.virtqueue_flush {} > probe qemu.kvm.simpletrace.virtqueue_pop {} Started the flight recorder: # service systemtap start qemu_kvm Started all 6 VMs on the WebUI. Made sure one was stuck again, and grabbed its PID with "top" (easy to see from the CPU load which one it was -- it was spinning). That was PID 74403. Powered off all VMs; forced off the one that was stuck. (At this point all non-stuck VMs had booted their OSes and quiesced.) Dumped the flight recorder buffer: # staprun -A qemu_kvm >/tmp/trace.dat & After "trace.dat" stopped growing, I interrupted "staprun". (This is *absolutely* non-intuitive: "staprun -A" attaches to the stap module, dumps whatever's accumulated in the buffer, and then continues dumping new entries as they appear. One has no idea when the buffer is empty, except when the file stops growing. Yes, I guess I could read the hundreds of kilobytes of stap documentation. :) ) Formatted the binary records to text: # /usr/share/qemu-kvm/simpletrace.py \ --no-header \ /usr/share/qemu-kvm/trace-events-all \ /tmp/trace.dat \ > /tmp/trace.txt Then collated the trace into files by PID: # egrep -o '\<pid=[0-9]+\>' trace.txt | sort -u pid=74403 pid=74495 pid=74589 pid=74699 pid=74852 pid=74965 # for PID in 74403 74495 74589 74699 74852 74965; do egrep '\<pid='"$PID"'\>' trace.txt > trace-$PID.txt done The file sizes are telling: # ls -l --sort=size trace-*.txt -rw-r--r--. 1 root root 7832694 Aug 2 03:53 trace-74699.txt -rw-r--r--. 1 root root 6679923 Aug 2 03:53 trace-74589.txt -rw-r--r--. 1 root root 5778395 Aug 2 03:53 trace-74852.txt -rw-r--r--. 1 root root 5309522 Aug 2 03:53 trace-74495.txt -rw-r--r--. 1 root root 4904952 Aug 2 03:53 trace-74965.txt -rw-r--r--. 1 root root 109037 Aug 2 03:53 trace-74403.txt <-- stuck guy I counted how many times each trace point was logged for each VM. 
Here's a table (PID 74403 is the stuck guy):

PID                            74403   74495   74589   74699   74852   74965
virtio_blk_data_plane_start        1       2       2       2       2       2
virtio_blk_data_plane_stop         0       1       1       1       1       1
virtio_blk_handle_read             0    5198    6228    7253    5211    5199
virtio_blk_handle_write            0     583      75      76    1095     199
virtio_blk_req_complete            0    5800    6322    7348    6325    5417
virtio_blk_rw_complete             0    5781    6303    7329    6306    5398
virtio_blk_submit_multireq         0      11       9       8      11      20
virtio_notify                      0     123     125     127     124     127
virtio_notify_irqfd                0    3871    4191    5159    4275    3409
virtio_queue_notify              279   15140   27976   33593   16623   13425
virtio_set_status                 38     122     122     122     122     122
virtqueue_alloc_element          256    6449    6974    7999    6974    6069
virtqueue_fill                   256    6448    6973    7998    6973    6068
virtqueue_flush                  256    6448    6973    7998    6973    6068
virtqueue_pop                    256    6449    6974    7999    6974    6069

The relevant observation is that there are *zero* virtio_blk_* tracepoints logged for PID=74403, beyond the sole "virtio_blk_data_plane_start". Compare comment 29: SeaBIOS did try to submit a read request. My take is that the virtio-blk data plane misses the virtio kick from SeaBIOS when the host is under load.

Trying to correlate "trace-74403.txt" with "trace-74495.txt" (i.e. the stuck one with one of the lucky ones), the last *common* entries are:

> virtio_set_status 337.846 pid=74495 vdev=0x55e0c1d1c170 val=0x0
> virtio_set_status 3.248 pid=74495 vdev=0x55e0c1d1c170 val=0x0

and the first entries in the latter trace that seem to be missing from the former (= hung one) are:

> virtio_queue_notify 102338.774 pid=74495 vdev=0x55e0c3724170 n=0x0 vq=0x55e0c3732000
> virtqueue_alloc_element 26.545 pid=74495 elem=0x55e0c42b74a0 sz=0xa8 in_num=0x2 out_num=0x1
> virtqueue_pop 4.815 pid=74495 vq=0x55e0c3732000 elem=0x55e0c42b74a0 in_num=0x2 out_num=0x1
> virtio_blk_handle_read 6.049 pid=74495 vdev=0x55e0c3724170 req=0x55e0c42b74a0 sector=0x0 nsectors=0x1
> virtio_blk_rw_complete 2705.791 pid=74495 vdev=0x55e0c3724170 req=0x55e0c42b74a0 ret=0x0
> virtio_blk_req_complete 4.540 pid=74495 vdev=0x55e0c3724170 req=0x55e0c42b74a0 status=0x0
> virtqueue_fill 3.757 pid=74495 vq=0x55e0c3732000 elem=0x55e0c42b74a0 len=0x201 idx=0x0
> virtqueue_flush 6.440 pid=74495 vq=0x55e0c3732000 count=0x1

This is where the MBR is read by one of the lucky VMs.

One interesting bit about that: note the value "102338.774". If I understand correctly, that's the time (in microseconds) since the last tracepoint was hit. It's *huge* relative to the rest of the intervals, and it's also not small in an absolute sense -- it's ~0.1 second. Why does it take so long? And, it takes so long for three out of the five non-stuck VMs:

> trace-74495.txt-virtio_queue_notify 102338.774 pid=74495 vdev=0x55e0c3724170 n=0x0 vq=0x55e0c3732000
> trace-74589.txt-virtio_queue_notify 102939.733 pid=74589 vdev=0x56082c58e170 n=0x0 vq=0x56082c59a000
> trace-74699.txt-virtio_queue_notify 106781.136 pid=74699 vdev=0x557220736170 n=0x0 vq=0x557220742000
> trace-74852.txt-virtio_queue_notify 21459.920 pid=74852 vdev=0x56231fd62170 n=0x0 vq=0x56231fd70000
> trace-74965.txt-virtio_queue_notify 108.849 pid=74965 vdev=0x55cdf3b20170 n=0x0 vq=0x55cdf3b2c000

To highlight it -- the virtio_queue_notify tracepoint is not triggered in QEMU when SeaBIOS submits the virtio-blk read request for the MBR. This means that QEMU doesn't even notice the request (hence it doesn't start any host-side IO either).

(In reply to Laszlo Ersek from comment #31)
> To highlight it -- the virtio_queue_notify tracepoint is not triggered in
> QEMU when SeaBIOS submits the virtio-blk read request for the MBR. This
> means that QEMU doesn't even notice the request (hence it doesn't start any
> host-side IO either).

In this setup I believe the notification path is MMIO -> KVM ioeventfd -> QEMU IOThread. But there is a chance that the QEMU IOThread is polling the virtqueue directly.

I think it will help if I can gdb attach to the stuck QEMU and check the VQ and event loop status. It might be a race condition specific to IOThread. Can I have access to the machine and the reproducing steps, Laszlo?

(In reply to Fam Zheng from comment #34)
> [...] Can I have access to the machine and the reproducing steps, Laszlo?

Yes, feel free to use it.

I've successfully reproduced the hang following Laszlo's steps. Then I gdb attached to the hanging QEMU. The IOThread is not making progress with a blocking aio_poll():

(gdb) bt
#0  0x00007ff7909e3f5f in ppoll () from /lib64/libc.so.6
#1  0x000055a3fc8e64cb in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at util/qemu-timer.c:322
#3  0x000055a3fc8e8195 in aio_poll (ctx=0x55a3ff44bb80, blocking=blocking@entry=true) at util/aio-posix.c:629
#4  0x000055a3fc6dab56 in iothread_run (opaque=0x55a3ff453960) at iothread.c:59
#5  0x00007ff790cc4dd5 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ff7909eeb3d in clone () from /lib64/libc.so.6

aio_poll() is only polling the dummy event notifier used in aio_notify():

#6 0x000055a3fc8e8195 in aio_poll (ctx=0x55a3ff44bb80, blocking=blocking@entry=true) at util/aio-posix.c:629
629         ret = qemu_poll_ns(pollfds, npfd, timeout);
(gdb) p npfd
$36 = 1
(gdb) p *pollfds
$37 = {fd = 13, events = 25, revents = 0}

But there is an active AioHandler which should be polled:

(gdb) set $n = ctx.aio_handlers.lh_first
(gdb) while $n
 >p *$n
 >set $n = $n.node.le_next
 >end
[hiding deleted nodes...]
$6 = {pfd = {fd = 37, events = 25, revents = 0}, io_read = 0x55a3fc652cf0 <virtio_queue_host_notifier_aio_read>, io_write = 0x0, io_poll = 0x55a3fc6530f0 <virtio_queue_host_notifier_aio_poll>, io_poll_begin = 0x55a3fc652ed0 <virtio_queue_host_notifier_aio_poll_begin>, io_poll_end = 0x55a3fc652eb0 <virtio_queue_host_notifier_aio_poll_end>, deleted = 0, opaque = 0x55a400eca068, is_external = true, node = {le_next = 0x55a3ff4721e0, le_prev = 0x55a3ff473068}}
$7 = {pfd = {fd = 13, events = 25, revents = 0}, io_read = 0x55a3fc8e4dc0 <event_notifier_dummy_cb>, io_write = 0x0, io_poll = 0x55a3fc8e4dd0 <event_notifier_poll>, io_poll_begin = 0x0, io_poll_end = 0x0, deleted = 0, opaque = 0x55a3ff44bc2c, is_external = false, node = {le_next = 0x0, le_prev = 0x55a3ff472e28}}

Checking the handler virtio_queue_host_notifier_aio_read and opaque 0x55a400eca068, this is exactly the virtio-blk dataplane handler.
So the question is why aio_poll() didn't add this node to "pollfds", or alternatively whether an aio_notify() was missed. Will continue to debug tomorrow.

Fam, thanks for the excellent analysis! :)

You mention "alternatively an aio_notify() is missed". I'm eerily reminded of:

- bug 1184405, and the corresponding downstream commits on the "rhev7/master-2.6.0" branch (aka RHV-7.3.z):
  - 3e531f1d6e28 ("aio_notify(): force main loop wakeup with SIGIO (aarch64 host only)", 2016-05-12)
  - d76a59993511 ("ignore SIGIO in tests that use AIO context (aarch64 host only)", 2016-05-12)
- and bug 1188656, and the corresponding downstream commit (on the same branch):
  - 02daa7f3cb1a ("Revert "aio_notify(): force main loop wakeup with SIGIO (aarch64 host only)"", 2016-06-21)

The last downstream commit mentioned above references the following upstream commits as the real fix:

- eabc977 ("AioContext: fix broken ctx->dispatching optimization", 2015-07-22)
- 21a03d1 ("AioContext: fix broken placement of event_notifier_test_and_clear", 2015-07-22)

I don't know if those commits have any relevance now, but the summary back then looked somewhat similar ("aio_notify() wouldn't do its job", kind of); perhaps reviewing the upstream fixes from back then could help your investigation. Just an idea. (If, at the top, you meant that aio_notify() wasn't even called, then I'm sorry about the noise.)

Summary: continuing the investigation in comment 42, I came to this change, which seems to fix the problem (it survives >20 repeats of the reproducer):

Author: Fam Zheng <famz>
Date:   Fri Aug 3 21:47:18 2018 +0800

    async: Fix aio_notify_accept

    Signed-off-by: Fam Zheng <famz>

diff --git a/util/async.c b/util/async.c
index 4dd9d95a9e..6fbadda2b1 100644
--- a/util/async.c
+++ b/util/async.c
@@ -346,7 +346,11 @@ void aio_notify(AioContext *ctx)
 
 void aio_notify_accept(AioContext *ctx)
 {
-    if (atomic_xchg(&ctx->notified, false)) {
+    /* If ctx->notify_me >= 2, another aio_poll() is waiting which may need the
+     * ctx->notifier event to wake up, so don't clear clear it because "we" are
+     * already in a running state. */
+    if (atomic_read(&ctx->notify_me) < 2
+        && atomic_xchg(&ctx->notified, false)) {
         event_notifier_test_and_clear(&ctx->notifier);
     }
 }

---

To avoid forgetting details over the weekend (I'm off on Monday too), I will elaborate on the analysis I did.

From comment 42 we already know that aio_notify() is the suspect. But before I get to it, let's say that other flaws in the event loop may also lead to the same abnormal condition. The approach to get more clues is like how Laszlo narrowed things down previously: using tracepoints. I wanted to trace what happened during the timespan of the hanging aio_poll() in the IOThread - not just that aio_poll() itself, but also other activities in the main loop and the vCPU thread.

I decided to write poor man's tracing code instead of using the existing tracepoints, for a few reasons:
1) I can significantly limit the number of events to log and analyze;
2) I can log a wide range of things, not limited by the defined QEMU tracepoints;
3) I can keep the log in a QEMU global array which can easily be inspected with gdb once I attach.

So my tracer keeps records of (file, func, line, thread id, some scalar), like any other tracer does.
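As an aside, a minimal sketch of such a flight-recorder tracer (plain C11) could look like the following; the names trace_rec, trace_buf, TRACE and fake_aio_poll are hypothetical, and this is not the code used in the debugging session, just an illustration of a global, gdb-inspectable ring buffer of (file, func, line, tid, scalar) records:

#define _GNU_SOURCE
/* Minimal sketch of a "poor man's" flight-recorder tracer (hypothetical names). */
#include <stdatomic.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

struct trace_rec {
    const char *file;
    const char *func;
    int         line;
    long        thread;    /* kernel tid: cheap to read, easy to match in gdb */
    long        data;      /* one interesting scalar, e.g. npfd or notify_me */
};

#define TRACE_RECS 4096
static struct trace_rec trace_buf[TRACE_RECS];   /* global, so gdb can dump it */
static atomic_uint      trace_idx;

/* Append one record; wraps around, so the buffer always holds the latest events. */
#define TRACE(scalar)                                                   \
    do {                                                                \
        unsigned i_ = atomic_fetch_add(&trace_idx, 1) % TRACE_RECS;     \
        trace_buf[i_] = (struct trace_rec){                             \
            __FILE__, __func__, __LINE__,                               \
            (long)syscall(SYS_gettid), (long)(scalar)                   \
        };                                                              \
    } while (0)

/* Example instrumentation point, mirroring the kind of data shown below. */
static void fake_aio_poll(int npfd)
{
    TRACE(npfd);
}

int main(void)
{
    fake_aio_poll(1);
    fake_aio_poll(25);
    for (unsigned i = 0; i < trace_idx && i < TRACE_RECS; i++) {
        printf("%s:%s:%d tid=%ld data=%ld\n", trace_buf[i].file,
               trace_buf[i].func, trace_buf[i].line,
               trace_buf[i].thread, trace_buf[i].data);
    }
    return 0;
}

Because the buffer is a plain global array, it can be inspected from gdb with something like "p trace_buf" even when the process is wedged.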
For example, here are the most interesting events I've collected right at the point of the hanging (don't try to understand it, I'll explain): file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 623, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 619, thread = 43911, data = 94828190945312 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 620, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 621, thread = 43911, data = 25 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 622, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 623, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 619, thread = 43911, data = 94828190945216 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 620, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 646, thread = 0, data = 5 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 621, thread = 43911, data = 25 file = "util/async.c\000\000\000", func = "otify_accept", line = 356, thread = 0, data = 94828190768000 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 622, thread = 43911, data = 1 file = "util/async.c\000\000\000", func = "otify_accept", line = 357, thread = 0, data = 94828190767680 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 623, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 619, thread = 43911, data = 94828190944640 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 620, thread = 43911, data = 0 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 621, thread = 43911, data = 25 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 622, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 623, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 619, thread = 43911, data = 94828190941568 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 620, thread = 43911, data = 0 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 621, thread = 43911, data = 25 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 622, thread = 43911, data = 0 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 623, thread = 43911, data = 1 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 626, thread = 43911, data = 94828190941568 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 644, thread = 43911, data = 4 file = "util/async.c\000\000\000", func = "aio_notify\000", line = 344, thread = 0, data = 94828190768000 file = "util/async.c\000\000\000", func = "aio_notify\000", line = 345, thread = 0, data = 94828190767680 file = "util/async.c\000\000\000", func = "aio_notify\000", line = 348, thread = 0, data = 2 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 590, thread = 0, data = 94828190768000 file = "util/aio-posix.c", func = "aio_poll\000\000\000", line = 610, thread = 0, data = 1 file = "util/async.c\000\000\000", func = "otify_accept", line = 356, thread = 0, data = 94828190768000 file = "util/async.c\000\000\000", func = "otify_accept", line = 357, thread = 0, data = 94828190767680 file = "util/async.c\000\000\000", func = "otify_accept", line = 359, thread = 0, data = 1 Match the tracepoints back to the modified QEMU code, from 
the func="aio_poll" lines I know that the IOThread is doing an aio_poll() with ctx->external_disable_cnt == 1. That's why pollfds doesn't include the VQ ioeventfd. From the func="aio_notify" and func="otify_accept" (truncated aio_notify_accept) lines it's obvious that the main loop or the vCPU thread is doing aio_disable_external()/aio_enable_external().

What looks wrong here is the aio_notify_accept part: it is STEALING the ctx->notifier event before the IOThread is awakened! The fix is obvious at this point.

The hanging aio_poll() made sure ctx->notify_me is 2 before entering ppoll(). The main loop (or vCPU thread) sees that when it calls aio_enable_external(), and calls aio_notify(). The ctx->notifier is notified. But immediately afterwards, the main loop calls aio_poll(), probably from bdrv_set_aio_context() (which is very questionable by itself, but let's see what Paolo and others think). This questionable aio_poll() clears the event on ctx->notifier, so the IOThread never wakes up.

Upstream discussion started by Fam (let's make the connection bi-directional):

[Qemu-devel] [RFC PATCH] async: Fix aio_notify_accept
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg00470.html
http://mid.mail-archive.com/20180803154955.25251-1-famz@redhat.com

> This questionable aio_poll() clears the event on ctx->notifier so the IOThread
> never wakes up.
Yeah, that aio_poll is a bit questionable... even though I added it. :) But as long as blocking is false it should be okay.
My concern was that this caused a busy wait in bdrv_set_aio_context, but again blocking=false avoids that concern.
I think the patch is correct and the idea below (skip aio_notify_accept) is certainly the right one. I commented on the upstream mailing list about less tricky ways to get the same effect, instead of changing aio_notify_accept directly.
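To make the lost-wakeup ordering above easier to see, here is a minimal standalone sketch (an illustration only, not QEMU code), with an eventfd standing in for ctx->notifier. In the real code the race window is the gap between the IOThread raising ctx->notify_me and entering ppoll(); the sketch forces that ordering explicitly so the hang is deterministic.

/*
 * Minimal sketch of the lost wakeup (illustration only, not QEMU code).
 * An eventfd plays the role of ctx->notifier.  Build with:
 *   gcc -pthread -o notify_race notify_race.c
 */
#include <pthread.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

static int notifier;                         /* stands in for ctx->notifier */

/* The IOThread half of a blocking aio_poll(): sleep until notified. */
static void *iothread_fn(void *arg)
{
    struct pollfd pfd = { .fd = notifier, .events = POLLIN };

    (void)arg;
    printf("iothread: entering poll(), waiting for a notification\n");
    poll(&pfd, 1, -1);                       /* hangs: the event was stolen */
    printf("iothread: woke up\n");           /* never reached */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    uint64_t v = 1;

    notifier = eventfd(0, 0);

    /* Main loop / vCPU thread: aio_enable_external() -> aio_notify() */
    if (write(notifier, &v, sizeof(v)) < 0) {
        perror("write");
    }

    /*
     * ...immediately followed by a non-blocking aio_poll() (e.g. from
     * bdrv_set_aio_context()) whose unconditional aio_notify_accept()
     * drains the event that was meant for the IOThread.
     */
    if (read(notifier, &v, sizeof(v)) < 0) {
        perror("read");
    }

    /* The IOThread now blocks with nothing left to wake it up. */
    pthread_create(&tid, NULL, iothread_fn, NULL);
    pthread_join(tid, NULL);                 /* never returns */
    return 0;
}

With the fix below, aio_notify_accept() is only called from a blocking aio_poll(), i.e. by the thread that actually armed ctx->notify_me and is waiting for the wakeup, so a concurrent non-blocking poll can no longer consume the IOThread's notification.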
Two changes are included in the build:

Author: Fam Zheng <famz>
Date:   Fri Aug 3 21:47:18 2018 +0800

    aio: Don't call aio_notify_accept unless blocking

    Signed-off-by: Fam Zheng <famz>

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 429bc0e592..c4ee624047 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -633,6 +633,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     if (blocking) {
         atomic_sub(&ctx->notify_me, 2);
+        aio_notify_accept(ctx);
     }
 
     /* Adjust polling time */
@@ -676,8 +677,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
         }
     }
 
-    aio_notify_accept(ctx);
-
     /* if we have any readable fds, dispatch event */
     if (ret > 0) {
         for (i = 0; i < npfd; i++) {
diff --git a/util/aio-win32.c b/util/aio-win32.c
index d6d5e02f00..0bc41e1bf2 100644
--- a/util/aio-win32.c
+++ b/util/aio-win32.c
@@ -374,10 +374,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
         if (blocking) {
             assert(first);
             atomic_sub(&ctx->notify_me, 2);
+            if (first) {
+                aio_notify_accept(ctx);
+            }
         }
 
         if (first) {
-            aio_notify_accept(ctx);
             progress |= aio_bh_poll(ctx);
             first = false;
         }

commit 4ee3e6d1acea818f7a7c39ab375ed04039dfb18f
Author: Fam Zheng <famz>
Date:   Tue Aug 7 10:16:05 2018 +0800

    aio-posix: ctx->notifier doesn't count as progress

    Signed-off-by: Fam Zheng <famz>

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 5946ac09f0..429bc0e592 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -494,7 +494,8 @@ static bool run_poll_handlers_once(AioContext *ctx)
     QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
         if (!node->deleted && node->io_poll &&
             aio_node_check(ctx, node->is_external) &&
-            node->io_poll(node->opaque)) {
+            node->io_poll(node->opaque) &&
+            node->opaque != &ctx->notifier) {
             progress = true;
         }

Hi, Fam,
I can reproduce this bug without any dataplane setting, just by booting multiple VMs on one host. Could you please help confirm whether my reproducer hits this issue?
Reproduce env:
kernel version: 3.10.0-926.el7.x86_64
qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-8.el7.x86_64

****host's meminfo:*******
# cat /proc/meminfo
MemTotal:       32775484 kB
MemFree:        27778608 kB
MemAvailable:   27589008 kB
Buffers:               0 kB
Cached:           109768 kB
SwapCached:        28216 kB
Active:          4154444 kB
Inactive:         149352 kB
Active(anon):    4139712 kB
Inactive(anon):    98808 kB
Active(file):      14732 kB
Inactive(file):    50544 kB
Unevictable:       10100 kB
Mlocked:   2765958948724 kB
SwapTotal:      16515068 kB
SwapFree:       16375540 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:       4180996 kB
Mapped:            65336 kB
Shmem:             35404 kB
Slab:              85536 kB
SReclaimable:      31456 kB
SUnreclaim:        54080 kB
KernelStack:        5120 kB
PageTables:        22408 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    32902808 kB
Committed_AS:   24856892 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      391892 kB
VmallocChunk:   34358947836 kB
HardwareCorrupted:     0 kB
AnonHugePages:   3977216 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      267584 kB
DirectMap2M:    33282048 kB

***********host's cpuinfo:***********
# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
stepping        : 6
microcode       : 0x612
cpu MHz         : 1861.994
cache size      : 6144 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority dtherm
bogomips        : 3723.98
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
stepping        : 6
microcode       : 0x612
cpu MHz         : 1861.994
cache size      : 6144 KB
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority dtherm
bogomips        : 3723.82
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
stepping        : 6
microcode       : 0x612
cpu MHz         : 1861.994
cache size      : 6144 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority dtherm
bogomips        : 3723.98
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
stepping        : 6
microcode       : 0x612
cpu MHz         : 1861.994
cache size      : 6144 KB
physical id     : 3
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority dtherm
bogomips        : 3723.82
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

Reproduce steps:
1. Start multiple VMs (without dataplane settings) on one host; in my env I started 5 VMs. The qemu command line of one VM is below:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20180803-032320-wyZb4cEl,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20180803-032320-wyZb4cEl,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idyjQL6c \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20180803-032320-wyZb4cEl,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20180803-032320-wyZb4cEl,path=/var/tmp/seabios-20180803-032320-wyZb4cEl,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180803-032320-wyZb4cEl,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=//home/kvm_autotest_root/images/rhel610-32-virtio.raw \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4 \
    -device virtio-net-pci,mac=9a:7f:80:81:82:83,id=id6T4Avb,vectors=4,netdev=id3Wo1qc,bus=pci.0,addr=0x6 \
    -netdev tap,id=id3Wo1qc,vhost=on \
    -m 4096 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    -cpu host \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,strict=on \
    -enable-kvm \
    -monitor stdio \
2. Wait a few minutes and check whether all VMs start successfully.

Test result:
4 VMs start successfully, but the last one fails to start, with the screen display hanging at "Booting from Hard Disk....".

Expected result:
All VMs start successfully.

Additional info:
I tried booting from scsi-hd/ide-hd/virtio-blk-pci; all of these device types can hit this issue.

Does the scratch build in comment 52 fix the issue for you?

Hi, Fam,
Sorry for the late reply; there is something wrong with my test host. While I was preparing the test env for the new build, the host hung (probably under too heavy a load, I think). The only thing I could do was restart it by force, and after the restart the heavy-load environment was gone and I couldn't rebuild it. I have tried for days with no progress. So please cc others who have the env to help verify the build. I'll continue trying to rebuild the env, and once it's ready I'll verify it.
Thanks, aliang

Adding additional info from the email thread between Fam and Liran:
Liran did 50 iterations; each iteration starts the 6 VMs in the environment. It took around 75 minutes and he didn't hit the bug. The patch (below) fixes the issue.
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17644958

Fix included in qemu-kvm-rhev-2.12.0-11.el7

Hi, Liran,
We still can't reproduce this bug, even in our own RHV env with seven VMs started. The bug is "ON_QA" now, so can you help verify it in your env?
Thanks, aliang

From the RHEV side:
qemu-img-rhev-2.10.0-21.el7_5.5.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.5.x86_64
qemu-kvm-common-rhev-2.10.0-21.el7_5.5.x86_64
libvirt-3.9.0-14.el7_5.7.x86_64
vdsm-4.20.37-1.el7ev.x86_64
ovirt-engine-4.2.6.3-0.1.el7ev.noarch

First I made sure I actually hit the bug.
Steps:
1. Create 6 VMs in the environment.
2. Pin the VMs to one host.
3. Enable IO Threads on each VM.
4. Start the 6 VMs.
5. Shut down and repeat step 4 until the bug is hit.
On the 4th iteration I got the bug on one VM.

From this step on I tested Fam's patch:
6. Install Fam's RPMs on the desired host:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17946743
The new rpms:
# rpm -qa | grep qemu
libvirt-daemon-driver-qemu-3.9.0-14.el7_5.7.x86_64
qemu-kvm-common-rhev-2.10.0-21.el7_5.5.fzheng201808201632.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-rhev-debuginfo-2.10.0-21.el7_5.5.fzheng201808201632.x86_64
qemu-kvm-tools-rhev-2.10.0-21.el7_5.5.fzheng201808201632.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.5.fzheng201808201632.x86_64
qemu-img-rhev-2.10.0-21.el7_5.5.fzheng201808201632.x86_64
7. Go back to step 4 and try to reproduce again.

Results:
After 10 hours and 317 iterations, all the VMs came up as expected. I didn't hit the bug again. Fam's patch appears to fix this bug.

Ran regression tests on kernel-3.10.0-935 + qemu-kvm-rhev-2.12.0-11.el7; no regression was introduced.

In Liran's RHV env with qemu-kvm-rhev-2.12.0-11.el7.x86_64, tested 100 times without hitting this issue, so the bug's status is set to "Verified".
Test steps:
1. Create 6 VMs.
2. Pin the VMs to one host.
3. Enable IO Threads on each VM.
4. Start the 6 VMs and check whether they all start successfully.
5. Shut down the VMs, then start them again.
6. Repeat step 5 several times.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443