Bug 1435660 - Windows 2016 got stuck when creating volume on shared vioscsi disk
Summary: Windows 2016 got stuck when creating volume on shared vioscsi disk
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-03-24 13:23 UTC by Jiri Belka
Modified: 2020-01-10 10:19 UTC (History)
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-24 13:15:33 UTC
Target Upstream Version:
Embargoed:


Attachments
screenshot (63.65 KB, image/png)
2017-03-24 13:41 UTC, Jiri Belka
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1111784 0 high CLOSED [RFE] Provide SCSI reservation support for virtio-scsi via rhev-guest-tools for win-8 and win-2012 guests using Shared d... 2023-09-14 23:57:43 UTC

Internal Links: 1111784

Description Jiri Belka 2017-03-24 13:23:02 UTC
Description of problem:

A Windows 2016 VM got stuck while I was creating a volume on a shared SAS disk using the vioscsi driver; the disk was added via the QEMU iSCSI initiator from an EMC VNX.

PS C:\Users\ad-w2k12r2> Get-WmiObject Win32_PnPSignedDriver| select devicename, driverversion, driverdate | where {$_.devicename -like "*scsi*" }

devicename                                  driverversion   driverdate
----------                                  -------------   ----------
Red Hat VirtIO SCSI controller              62.73.104.12600 20160811000000.******+***
Red Hat VirtIO SCSI pass-through controller 62.73.104.12400 20160729000000.******+***
iscsi

main_channel_handle_parsed: net test: latency 23.343000 ms, bitrate 4249112 bps (4.052269 Mbps) LOW BANDWIDTH
red_dispatcher_set_cursor_peer: 
inputs_connect: inputs channel client create
main_channel_handle_parsed: agent start
2017-03-24T11:24:42.359200Z qemu-kvm: virtio-serial-bus: Unexpected port id 736681576 for device virtio-serial0.0
2017-03-24T11:24:45.382932Z qemu-kvm: virtio-serial-bus: Unexpected port id 1185792 for device virtio-serial0.0
2017-03-24T11:25:06.623340Z qemu-kvm: virtio-serial-bus: Unexpected port id 3528894800 for device virtio-serial0.0
2017-03-24T11:25:06.624055Z qemu-kvm: virtio-serial-bus: Unexpected port id 739323064 for device virtio-serial0.0
2017-03-24T11:25:14.019747Z qemu-kvm: virtio-serial-bus: Unexpected port id 3528894800 for device virtio-serial0.0
2017-03-24T11:25:57.144946Z qemu-kvm: virtio-serial-bus: Unexpected port id 3491010928 for device virtio-serial0.0
2017-03-24T11:27:26.876178Z qemu-kvm: virtio-serial-bus: Unexpected port id 0 for device virtio-serial0.0
2017-03-24T11:27:29.913275Z qemu-kvm: virtio-serial-bus: Unexpected port id 3552624224 for device virtio-serial0.0
2017-03-24T11:27:32.784350Z qemu-kvm: virtio-serial-bus: Unexpected port id 739049208 for device virtio-serial0.0
2017-03-24T11:27:34.025448Z qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio-serial0.0
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
red_channel_client_disconnect: rcc=0x7fa34bdca000 (channel=0x7fa349cccc00 type=2 id=0)
red_channel_client_disconnect: rcc=0x7fa34aacd000 (channel=0x7fa349d71080 type=4 id=0)
red_channel_client_disconnect: rcc=0x7fa34aabe000 (channel=0x7fa349d86000 type=3 id=0)
red_channel_client_disconnect: rcc=0x7fa34ab1a000 (channel=0x7fa349d7e000 type=1 id=0)
main_channel_client_on_disconnect: rcc=0x7fa34ab1a000
red_client_destroy: destroy client 0x7fa34dcbbb80 with #channels=4
red_dispatcher_disconnect_cursor_peer: 
red_dispatcher_disconnect_display_peer: 

Version-Release number of selected component (if applicable):
1.9.0-3

How reproducible:
just happened

Steps to Reproduce:
1. Windows 2016 installed on a virtio-blk disk
2. Attach an iSCSI direct LUN, shared (unfiltered), via the QEMU initiator
3. Attach the disk to two VMs
4. Bring the disk online on both nodes
5. Create a volume on VM 1

Actual results:
VM got stuck

Expected results:
Volume creation should complete without the VM hanging.

Additional info:

Comment 2 Jiri Belka 2017-03-24 13:25:25 UTC
kernel-3.10.0-514.6.1.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
libiscsi-1.9.0-7.el7.x86_64
virtio-win-1.9.0-3.el7.noarch

Comment 3 Jiri Belka 2017-03-24 13:29:57 UTC
# cat /var/log/libvirt/qemu/vm2-w2k16-X.log 
2017-03-24 09:57:55.865+0000: starting up libvirt version: 2.0.0, package: 10.el7_3.5 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-02-10-03:02:29, x86-020.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-28.el7_3.6), hostname: dell-r210ii-03.rhev.lab.eng.brq.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=vm2-w2k16-X,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-16-vm2-w2k16-X/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu SandyBridge,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -m size=4194304k,slots=16,maxmem=16777216k -realtime mlock=off -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=4096 -uuid a2ff2496-d8c9-41f6-8337-35c8e3689a5e -smbios 'type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=7.3-0.10.el7,serial=4C4C4544-0058-3410-8058-C3C04F38354A,uuid=a2ff2496-d8c9-41f6-8337-35c8e3689a5e' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-16-vm2-w2k16-X/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2017-03-24T09:57:55,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot menu=on,splash-time=10000,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive file=/rhev/data-center/mnt/data.rhev.lab.eng.brq.redhat.com:_data_iso_shared/0c78b4d6-ba00-4d3e-9f9f-65c7d5899d71/images/11111111-1111-1111-1111-111111111111/en_windows_server_2016_x64_dvd_9327751.iso,format=raw,if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -drive file=/rhev/data-center/mnt/data.rhev.lab.eng.brq.redhat.com:_data_iso_shared/0c78b4d6-ba00-4d3e-9f9f-65c7d5899d71/images/11111111-1111-1111-1111-111111111111/virtio-win-1.9.0-3.el7-amd64.vfd,format=raw,if=none,id=drive-fdc0-0-0,readonly=on -global isa-fdc.driveA=drive-fdc0-0-0 -drive file=/rhev/data-center/00000001-0001-0001-0001-00000000017c/d93e6782-f200-4e5e-9713-2a53ceca3c49/images/c649791a-ac9e-45e0-89d7-16c882ad2f2e/c91cc2ec-cdaf-4f58-8abf-211d37aeaace,format=raw,if=none,id=drive-virtio-disk0,serial=c649791a-ac9e-45e0-89d7-16c882ad2f2e,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=33 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:51,bus=pci.0,addr=0x3,bootindex=3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/a2ff2496-d8c9-41f6-8337-35c8e3689a5e.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/a2ff2496-d8c9-41f6-8337-35c8e3689a5e.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice tls-port=5900,addr=10.34.63.222,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=8388608,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -object rng-random,id=objrng0,filename=/dev/urandom -device 
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -msg timestamp=on
2017-03-24T09:57:56.032400Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2017-03-24T09:57:56.032524Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 39.291000 ms, bitrate 6622281 bps (6.315499 Mbps) LOW BANDWIDTH
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 
red_channel_client_disconnect: rcc=0x7fa34bcb8000 (channel=0x7fa349cccc00 type=2 id=0)
red_channel_client_disconnect: rcc=0x7fa34d269000 (channel=0x7fa349d86000 type=3 id=0)
red_channel_client_disconnect: rcc=0x7fa34d26e000 (channel=0x7fa349d71080 type=4 id=0)
red_channel_client_disconnect: rcc=0x7fa34c924000 (channel=0x7fa349d7e000 type=1 id=0)
main_channel_client_on_disconnect: rcc=0x7fa34c924000
red_client_destroy: destroy client 0x7fa349c45700 with #channels=4
red_dispatcher_disconnect_cursor_peer: 
red_dispatcher_disconnect_display_peer: 
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 23.343000 ms, bitrate 4249112 bps (4.052269 Mbps) LOW BANDWIDTH
red_dispatcher_set_cursor_peer: 
inputs_connect: inputs channel client create
main_channel_handle_parsed: agent start
2017-03-24T11:24:42.359200Z qemu-kvm: virtio-serial-bus: Unexpected port id 736681576 for device virtio-serial0.0
2017-03-24T11:24:45.382932Z qemu-kvm: virtio-serial-bus: Unexpected port id 1185792 for device virtio-serial0.0
2017-03-24T11:25:06.623340Z qemu-kvm: virtio-serial-bus: Unexpected port id 3528894800 for device virtio-serial0.0
2017-03-24T11:25:06.624055Z qemu-kvm: virtio-serial-bus: Unexpected port id 739323064 for device virtio-serial0.0
2017-03-24T11:25:14.019747Z qemu-kvm: virtio-serial-bus: Unexpected port id 3528894800 for device virtio-serial0.0
2017-03-24T11:25:57.144946Z qemu-kvm: virtio-serial-bus: Unexpected port id 3491010928 for device virtio-serial0.0
2017-03-24T11:27:26.876178Z qemu-kvm: virtio-serial-bus: Unexpected port id 0 for device virtio-serial0.0
2017-03-24T11:27:29.913275Z qemu-kvm: virtio-serial-bus: Unexpected port id 3552624224 for device virtio-serial0.0
2017-03-24T11:27:32.784350Z qemu-kvm: virtio-serial-bus: Unexpected port id 739049208 for device virtio-serial0.0
2017-03-24T11:27:34.025448Z qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio-serial0.0
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
red_channel_client_disconnect: rcc=0x7fa34bdca000 (channel=0x7fa349cccc00 type=2 id=0)
red_channel_client_disconnect: rcc=0x7fa34aacd000 (channel=0x7fa349d71080 type=4 id=0)
red_channel_client_disconnect: rcc=0x7fa34aabe000 (channel=0x7fa349d86000 type=3 id=0)
red_channel_client_disconnect: rcc=0x7fa34ab1a000 (channel=0x7fa349d7e000 type=1 id=0)
main_channel_client_on_disconnect: rcc=0x7fa34ab1a000
red_client_destroy: destroy client 0x7fa34dcbbb80 with #channels=4
red_dispatcher_disconnect_cursor_peer: 
red_dispatcher_disconnect_display_peer: 
2017-03-24T13:17:49.736414Z qemu-kvm: terminating on signal 15 from pid 1702
2017-03-24 13:18:14.944+0000: shutting down

Comment 4 Jiri Belka 2017-03-24 13:40:34 UTC
It has happened again. Steps in W2k16:

- In Server Manager, ensure you see two VMs (I added them to the domain, logged in
  to the first VM as domain admin, then added the second VM into Server Manager).
- I attached the iSCSI direct LUN via 'attach-device' in virsh (see the sketch after this list):

    <disk type='network' device='lun' sgio='unfiltered' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source protocol='iscsi' name='iqn.1992-05.com.emc:ckm001201002300000-5-vnxe/5'>
        <host name='10.34.63.200' port='3260'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
      <alias name='scsi0-0-0-0'/>
    </disk>

- Bring the disk online, so it is up on both nodes in Server Manager.
- On the first VM do New Volume on the disk (it gets stuck at "Formatting Volume").
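
A minimal sketch of that attach step, assuming the <disk> element above is saved as shared-lun.xml (the file name and the first domain name are assumptions; vm2-w2k16-X is the domain from comment 3):

    # Attach the shared LUN to both running guests
    virsh attach-device vm1-w2k16-X shared-lun.xml --live
    virsh attach-device vm2-w2k16-X shared-lun.xml --live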

Comment 5 Jiri Belka 2017-03-24 13:41:03 UTC
Created attachment 1266059 [details]
screenshot

Comment 6 Peixiu Hou 2017-03-28 05:28:38 UTC
I tried to reproduce this bug with a LIO iSCSI target but could not reproduce it. My steps were as follows:

1. Boot 2 VMs on qemu level with qemu initiator:

1). /usr/libexec/qemu-kvm -name guest=peixiu_vm4,debug-threads=on -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Nehalem -m size=1048576k,slots=16,maxmem=4194304k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid b31d5b11-5fd6-44c6-a3ad-365cb2e146eb -rtc base=2017-03-28T03:16:50,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive file=/rhev/data-center/a41ee8e2-4e1c-4549-865d-3c9cf1bcc727/c6f00513-2812-4fbd-ab28-b6aba6079280/images/b07818ad-815c-4f5f-a13a-9e8392fca2db/17ce52a2-97a8-44c6-8115-31bfbc50cb87,format=raw,if=none,id=drive-virtio-disk0,serial=b07818ad-815c-4f5f-a13a-9e8392fca2db,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -drive file=/rhev/data-center/a41ee8e2-4e1c-4549-865d-3c9cf1bcc727/c6f00513-2812-4fbd-ab28-b6aba6079280/images/b6130e3b-8175-4fad-b513-0d2433d8af88/3af3c62c-00a5-40ef-93cc-735452fc822a,format=raw,if=none,id=drive-virtio-disk1,serial=b6130e3b-8175-4fad-b513-0d2433d8af88,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:54,bus=pci.0,addr=0x3 -vnc :2 -monitor stdio -drive file=iscsi://10.66.4.129/iqn.2016-06.local.server:sas/0,format=raw,if=none,id=drive-scsi0-0-0-1,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1

2) /usr/libexec/qemu-kvm -name guest=peixiu_vm2,debug-threads=on -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Nehalem -m size=1048576k,slots=16,maxmem=4194304k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid d58a72a1-6f72-4a8f-8578-c80dd9cf04dc -rtc base=2017-03-28T02:22:38,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive file=iscsi://10.66.4.129/iqn.2016-06.local.server:sas/0,format=raw,if=none,id=drive-scsi0-0-0-1,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1 -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:52,bus=pci.0,addr=0x3 -vnc :3 -monitor stdio -drive file=/rhev/data-center/a41ee8e2-4e1c-4549-865d-3c9cf1bcc727/c6f00513-2812-4fbd-ab28-b6aba6079280/images/e09ee7e5-1d0a-48d7-bd07-e7087cfe3635/f16e06b7-e533-4a2c-8e27-b7918137782b,format=raw,if=none,id=drive-virtio-disk1,serial=b6130e3b-8175-4fad-b513-0d2433d8af88,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=1

2. Install the vioscsi driver.

3. In the VM's Disk Management, bring the disk online and create a new volume.

Used versions:
kernel-3.10.0-514.6.1.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
virtio-win-1.9.0-3.el7.noarch

Additional info:
Also tried on RHEL 7.4 with a LIO iSCSI target; it works normally there as well.

Best Regards~
Peixiu Hou
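
For context, a hedged sketch of how a LIO target such as iqn.2016-06.local.server:sas is commonly set up with targetcli; the backing device and the open demo-mode settings are assumptions, not taken from this bug:

    # Export a block device as an iSCSI LUN via LIO (backing device is hypothetical)
    targetcli /backstores/block create name=sas0 dev=/dev/sdb
    targetcli /iscsi create iqn.2016-06.local.server:sas
    targetcli /iscsi/iqn.2016-06.local.server:sas/tpg1/luns create /backstores/block/sas0
    # Allow both initiators to log in without per-initiator ACLs (test setup only)
    targetcli /iscsi/iqn.2016-06.local.server:sas/tpg1 set attribute authentication=0 generate_node_acls=1 demo_mode_write_protect=0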

Comment 7 Jiri Belka 2017-03-28 06:27:50 UTC
(In reply to Peixiu Hou from comment #6)
> I tried to reproduce this bug with a LIO iSCSI target but could not reproduce
> it. My steps were as follows:
> 
> 1. Boot 2 VMs on qemu level with qemu initiator:
> 
> 1). /usr/libexec/qemu-kvm -name guest=peixiu_vm4,debug-threads=on -machine
> pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Nehalem -m
                                         ^^^^^^^^^^^^^^

different qemu configuration... vs

  cpu SandyBridge,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff
  ...
  -smbios 'type=1,manufacturer=Red Hat,product=RHEV   
  Hypervisor,version=7.3-0.10.el7,serial=4C4C4544-0058-3410-8058-
  C3C04F38354A,uuid=a2ff2496-d8c9-41f6-8337-35c8e3689a5e'

My env was RHV 4.1; yours seems to be just "plain" qemu on EL7. I also used the EMC VNX.

Comment 8 Jiri Belka 2017-03-29 11:28:07 UTC
I tried to get a crash dump from Windows, but it seems the qemu process is stuck as well.

I tried 'inject-nmi' via virsh; I saw a blue screen in Windows with a notification about the NMI, but after that qemu got stuck.

I used en_windows_server_2016_x64_dvd_9327751.iso for installation.

virsh # qemu-monitor-command 23 --hmp 'info block'
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainGetBlockIoTune)

virsh # inject-nmi 23
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainGetBlockIoTune)

Comment 12 Jiri Belka 2017-04-11 15:18:38 UTC
It is just this:

- reset the disk (it will have no MBR/GPT - undefined)
- bring the disk online on both VMs
- new volume in Server Manager -> File and Storage Services -> Disks

Comment 13 Jiri Belka 2017-04-12 07:22:26 UTC
- When the VM is frozen, send 'inject-nmi' via virsh.
- The blue screen shows '0% complete' for ages and the VM's status changes to '?' in the Admin Portal...

2017-04-11 17:16:56,026+0200 WARN  (jsonrpc/6) [virt.vm] (vmId='a2ff2496-d8c9-41f6-8337-35c8e3689a5e') monitor became unresponsive (command timeout, age=61.1299999999) (vm:5013)

  and the status in virsh is 'paused'

I also tried the following without success:

- qemu-monitor-command $domain --hmp 'dump-guest-memory /var/log/core/vmdump' in virsh
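
A hedged alternative worth trying here is libvirt's memory-only dump (the output path is an assumption):

    # Memory-only guest core dump via libvirt
    virsh dump --memory-only vm2-w2k16-X /var/log/core/vm2-w2k16-X.dump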

Comment 14 Jiri Belka 2017-04-12 07:26:43 UTC
# grep "doesn't" /var/log/messages
Apr 11 16:01:13 localhost journal: host doesn't support hyperv 'relaxed' feature
Apr 11 16:01:13 localhost journal: host doesn't support hyperv 'vapic' feature
Apr 11 16:04:36 localhost journal: host doesn't support hyperv 'relaxed' feature
Apr 11 16:04:36 localhost journal: host doesn't support hyperv 'vapic' feature


    <cpu>
      <arch>x86_64</arch>
      <model>SandyBridge</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='2'/>
      <feature name='vme'/>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
      <feature name='pbe'/>
      <feature name='dtes64'/>
      <feature name='monitor'/>
      <feature name='ds_cpl'/>
      <feature name='vmx'/>
      <feature name='smx'/>
      <feature name='est'/>
      <feature name='tm2'/>
      <feature name='xtpr'/>
      <feature name='pdcm'/>
      <feature name='pcid'/>
      <feature name='osxsave'/>
      <feature name='arat'/>
      <feature name='xsaveopt'/>
      <feature name='invtsc'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
    </cpu>


Any other info I could provide?

Comment 16 Paolo Bonzini 2017-04-28 12:28:40 UTC
Jiri, since QEMU is hung, can you grab a stack backtrace with gdb? Or is it in D state?

Comment 17 Ademar Reis 2017-05-09 14:22:58 UTC
(In reply to Paolo Bonzini from comment #16)
> Jiri, since QEMU is hung, can you grab a stack backtrace with gdb? Or is it
> in D state?

ping

Comment 18 Jiri Belka 2017-05-10 07:06:18 UTC
(In reply to Paolo Bonzini from comment #16)
> Jiri, since QEMU is hung, can you grab a stack backtrace with gdb? Or is it
> in D state?

What is 'D' state, please? The last time I tried to get more info, I found the VM had been paused by some component (systemd or vdsm?)... I'll try to get more info soon.

Comment 19 Paolo Bonzini 2017-05-10 07:54:14 UTC
"D" state means that "ps" shows a "D" for the process state, such as this:

root     29965  0.0  0.0  42632  3240 pts/2    D+   09:53   0:00 /sbin/mount.nfs
                                               ^^

When this happens you cannot use "gdb" to show the process's stack backtrace. If the process is not in "D" state, gdb works.
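
For reference, a minimal sketch of checking the state and grabbing the backtrace, assuming the domain name from comment 3 (the pgrep pattern is an assumption):

    # Find the QEMU process of the stuck guest and check its state ("D" = uninterruptible sleep)
    pid=$(pgrep -f 'guest=vm2-w2k16-X')
    ps -o pid,stat,cmd -p "$pid"
    # If the STAT column does not show "D", dump all thread backtraces non-interactively
    gdb -p "$pid" -batch -ex 'thread apply all bt full'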

Comment 20 Ademar Reis 2017-05-16 17:18:12 UTC
(In reply to Ademar Reis from comment #17)
> (In reply to Paolo Bonzini from comment #16)
> > Jiri, since QEMU is hung, can you grab a stack backtrace with gdb? Or is it
> > in D state?
> 
> ping

Restoring needinfo, as Paolo is still waiting for a backtrace.

Comment 23 Paolo Bonzini 2018-08-03 13:06:29 UTC
The VM is trying to reconnect to the disk:

Thread 1 (Thread 0x7f3f7c8c6c80 (LWP 5687)):
#0  0x00007f3f7230d66d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00007f3f7230d504 in __sleep (seconds=0, seconds@entry=5) at ../sysdeps/unix/sysv/linux/sleep.c:137
        ts = {tv_sec = 2, tv_nsec = 554646223}
        set = {__val = {65536, 0 <repeats 15 times>}}
        oset = {__val = {268444224, 139910641029281, 139910888208128, 139910641043794, 4294967296, 139910845870080, 139910845870080, 139910690283872, 0, 139910641029281, 139910690283872, 139910641076894, 14429289160618278914, 0, 0, 68719476740}}
        result = <optimized out>
#2  0x00007f3f7c0f0ad2 in iscsi_reconnect (old_iscsi=old_iscsi@entry=0x7f3f81aec000) at lib/connect.c:242
        iscsi = <optimized out>
        retry = 0

If the bug happens as soon as the second VM is started, or very shortly after, it could be a configuration issue in the iSCSI target. Is it possible to get access to the storage system?
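
For reference, a hedged sketch of sanity-checking the target from a host, using the portal and IQN from comment 4 (the /dev/sdX node is a placeholder):

    # Discover and log in to the EMC target from the hypervisor
    iscsiadm -m discovery -t sendtargets -p 10.34.63.200:3260
    iscsiadm -m node -T iqn.1992-05.com.emc:ckm001201002300000-5-vnxe -p 10.34.63.200 --login
    # Check persistent-reservation keys and capabilities on the shared LUN
    sg_persist --in --read-keys /dev/sdX
    sg_persist --in --report-capabilities /dev/sdX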

Comment 24 Jiri Belka 2018-08-13 07:29:29 UTC
(In reply to Paolo Bonzini from comment #23)
> The VM is trying to reconnect to the disk:
> 
> Thread 1 (Thread 0x7f3f7c8c6c80 (LWP 5687)):
> #0  0x00007f3f7230d66d in nanosleep () at
> ../sysdeps/unix/syscall-template.S:81
> No locals.
> #1  0x00007f3f7230d504 in __sleep (seconds=0, seconds@entry=5) at
> ../sysdeps/unix/sysv/linux/sleep.c:137
>         ts = {tv_sec = 2, tv_nsec = 554646223}
>         set = {__val = {65536, 0 <repeats 15 times>}}
>         oset = {__val = {268444224, 139910641029281, 139910888208128,
> 139910641043794, 4294967296, 139910845870080, 139910845870080,
> 139910690283872, 0, 139910641029281, 139910690283872, 139910641076894,
> 14429289160618278914, 0, 0, 68719476740}}
>         result = <optimized out>
> #2  0x00007f3f7c0f0ad2 in iscsi_reconnect
> (old_iscsi=old_iscsi@entry=0x7f3f81aec000) at lib/connect.c:242
>         iscsi = <optimized out>
>         retry = 0
> 
> If the bug happens as soon as the second VM is started, or very shortly
> after, it could be a configuration issue in the iSCSI target. Is it
> possible to get access to the storage system?

I no longer have that environment (it has been a year), and when I tried a similar flow on another environment I did not hit the issue anymore.

