Bug 1525375 - [Engine 4.1] Failed to start VM with direct lun on both 4.1 and 4.2 host
Summary: [Engine 4.1] Failed to start VM with direct lun on both 4.1 and 4.2 host
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.19.41
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.2.1
Target Release: ---
Assignee: Eyal Shenitzky
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-12-13 08:36 UTC by Israel Pinto
Modified: 2019-04-28 13:40 UTC
CC: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-01-02 09:21:18 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2?
ylavi: planning_ack+
ipinto: devel_ack?
ipinto: testing_ack?


Attachments
logs (engine, vdsm - both hosts) (821.82 KB, application/x-xz) - 2017-12-13 08:37 UTC, Israel Pinto
Events - vm is up (136.97 KB, image/png) - 2017-12-13 08:38 UTC, Israel Pinto
Events - Failed to start VM (235.75 KB, image/png) - 2017-12-13 08:38 UTC, Israel Pinto

Description Israel Pinto 2017-12-13 08:36:21 UTC
Description of problem:
The setup is a 4.1 engine with a 4.1 cluster containing two hosts, one 4.1 and one 4.2.
Starting a VM with a direct LUN fails on both the 4.1 and the 4.2 host.

The VM was running and migrated successfully between the hosts (see the attached events screenshots).
After that it failed to migrate, so I stopped it and tried to rerun it on each host, where it failed to start.
 
VM name: direct_lun
VM id: c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4

Version-Release number of selected component (if applicable):
Engine: Red Hat Virtualization Manager Version: 4.1.8.2-0.1.el7
vdsm 4.1: vdsm-4.19.42-1.el7ev.x86_64	
vdsm 4.2: vdsm-4.20.9-30.gite026991.el7.centos

Steps to reproduce:
1. Run the VM on the 4.1 host
2. Migrate it to the 4.2 host
3. Restart vdsm on the 4.2 host
4. Migrate the VM back to the 4.1 host
5. Put the host into maintenance (failed here)
6. Stop the VM
7. Start the VM on either host (4.1/4.2) - the VM fails to start (see the SDK sketch below)
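
For reference, the same flow expressed as a minimal sketch with the oVirt Python SDK (ovirtsdk4). The engine URL, credentials and host names below are placeholders, not taken from this environment, and steps 3 and 5 are performed outside the SDK:

# Hedged repro sketch using ovirtsdk4; URL, credentials and host names are placeholders.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=direct_lun')[0]
vm_service = vms_service.vm_service(vm.id)

def wait_for_status(service, status):
    # Simplistic poll with no timeout; real code would also wait for migrations to finish.
    while service.get().status != status:
        time.sleep(5)

vm_service.start()                                     # step 1: run the VM on the 4.1 host
wait_for_status(vm_service, types.VmStatus.UP)
vm_service.migrate(host=types.Host(name='host-4-1'))   # step 2: migrate to the 4.2 host
# step 3: restart vdsm on the 4.2 host (on the host: systemctl restart vdsmd)
vm_service.migrate(host=types.Host(name='host-4-2'))   # step 4: migrate back to the 4.1 host
# step 5: put the host into maintenance (Administration Portal) - fails here
vm_service.stop()                                      # step 6: stop the VM
wait_for_status(vm_service, types.VmStatus.DOWN)
vm_service.start()                                     # step 7: start again - fails with "Bad volume specification"

connection.close()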

Additional info:
Engine log:
2017-12-13 09:56:48,913+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-38) [204ca1cf] EVENT_ID: USER_FAILED_RUN_VM(54), Correlation ID: 101f892e-b5b0-45eb-9904-fda9b9a9378b, Job ID: 73cabd36-77ba-4f1e-b600-49eda1a1d2b7, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: Failed to run VM direct_lun (User: admin@internal-authz).

4.1 vdsm log:

2017-12-13 09:56:07,276+0200 ERROR (vm/c524a3ed) [virt.vm] (vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4') The vm start process failed (vm:631)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 562, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1978, in _run
    self._preparePathsForDrives(dev_spec_map[hwclass.DISK])
  File "/usr/share/vdsm/virt/vm.py", line 657, in _preparePathsForDrives
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/share/vdsm/clientIF.py", line 337, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {u'index': 0, u'iface': u'scsi', u'format': u'raw', u'bootOrder': u'1', u'discard': False, u'sgio': u'filtered', u'GUID': u'3514f0c5a51601089', u'specParams': {}, u'readonly': u'false', u'optional': u'false', u'deviceId': u'eb7b2891-7f87-4be0-a233-55ffd3993398', u'address': {u'bus': u'0', u'controller': u'0', u'type': u'drive', u'target': u'0', u'unit': u'0'}, u'device': u'lun', u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'}
2017-12-13 09:56:07,277+0200 INFO  (vm/c524a3ed) [virt.vm] (vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4') Changed state to Down: Bad volume specification {u'index': 0, u'iface': u'scsi', u'format': u'raw', u'bootOrder': u'1', u'discard': False, u'sgio': u'filtered', u'GUID': u'3514f0c5a51601089', u'specParams': {}, u'readonly': u'false', u'optional': u'false', u'deviceId': u'eb7b2891-7f87-4be0-a233-55ffd3993398', u'address': {u'bus': u'0', u'controller': u'0', u'type': u'drive', u'target': u'0', u'unit': u'0'}, u'device': u'lun', u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'} (code=1) (vm:1259)
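
The traceback ends in clientIF.prepareVolumePath raising VolumeError for the device='lun' drive with GUID 3514f0c5a51601089, i.e. the direct LUN spec could not be turned into a usable block device path (the 4.2 host log below shows the same failure). The snippet below is not vdsm source, only an illustrative sketch of the kind of check involved, assuming the direct LUN is expected to appear as a multipath device under /dev/mapper/<GUID>:

# Illustrative sketch only (not the actual vdsm code): the kind of check that
# ends in VolumeError ("Bad volume specification") for a device='lun' drive.
# Assumption: the direct LUN shows up as a multipath device /dev/mapper/<GUID>.
import os


class VolumeError(Exception):
    pass


def prepare_direct_lun_path(drive):
    """Return the block device path for a direct LUN drive spec, or raise
    VolumeError if the device cannot be resolved on this host."""
    guid = drive.get('GUID')
    if drive.get('device') != 'lun' or not guid:
        raise VolumeError(drive)
    path = os.path.join('/dev/mapper', guid)
    if not os.path.exists(path):
        # The multipath device for this LUN is not visible on the host,
        # matching the failure in the traceback above.
        raise VolumeError(drive)
    return path


# With the spec from the log this raises unless the LUN is visible on the host:
# prepare_direct_lun_path({'device': 'lun', 'GUID': '3514f0c5a51601089'})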


4.2 host vdsm log:
2017-12-13 09:56:37,224+0200 ERROR (vm/c524a3ed) [virt.vm] (vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4') The vm start process failed (vm:914)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 843, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2632, in _run
    self._devices = self._make_devices()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2540, in _make_devices
    self._preparePathsForDrives(dev_spec_map[hwclass.DISK])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 988, in _preparePathsForDrives
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 407, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {u'index': 0, u'iface': u'scsi', 'vm_custom': {u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860bdevice_b138dc19-a6e0-4960-975d-900cc99b624a': u"VmDevice:{id='VmDeviceId:{deviceId='b138dc19-a6e0-4960-975d-900cc99b624a', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='spicevmc', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=3}', managed='false', plugged='true', readOnly='false', deviceAlias='channel2', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676': u"VmDevice:{id='VmDeviceId:{deviceId='08b0c6a6-a0ea-470b-9bdb-938695a4d676', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='unix', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=1}', managed='false', plugged='true', readOnly='false', deviceAlias='channel0', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860b': u"VmDevice:{id='VmDeviceId:{deviceId='62bae4e5-d01d-4951-b21d-115c8510860b', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='unix', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=2}', managed='false', plugged='true', readOnly='false', deviceAlias='channel1', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860bdevice_b138dc19-a6e0-4960-975d-900cc99b624adevice_1eadae15-2af4-4515-94b4-1f73aadd8836': u"VmDevice:{id='VmDeviceId:{deviceId='1eadae15-2af4-4515-94b4-1f73aadd8836', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='ide', type='CONTROLLER', bootOrder='0', specParams='[]', address='{slot=0x01, bus=0x00, domain=0x0000, type=pci, function=0x1}', managed='false', plugged='true', readOnly='false', deviceAlias='ide', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}"}, u'format': u'raw', u'bootOrder': u'1', u'discard': False, 'vmid': 'c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4', u'sgio': u'filtered', u'GUID': u'3514f0c5a51601089', u'specParams': {}, u'readonly': u'false', u'optional': u'false', u'deviceId': u'eb7b2891-7f87-4be0-a233-55ffd3993398', u'address': {u'bus': u'0', u'controller': u'0', u'type': u'drive', u'target': u'0', u'unit': u'0'}, u'device': u'lun', u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'}
2017-12-13 09:56:37,224+0200 INFO  (vm/c524a3ed) [virt.vm] (vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4') Changed state to Down: Bad volume specification {u'index': 0, u'iface': u'scsi', 'vm_custom': {u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860bdevice_b138dc19-a6e0-4960-975d-900cc99b624a': u"VmDevice:{id='VmDeviceId:{deviceId='b138dc19-a6e0-4960-975d-900cc99b624a', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='spicevmc', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=3}', managed='false', plugged='true', readOnly='false', deviceAlias='channel2', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676': u"VmDevice:{id='VmDeviceId:{deviceId='08b0c6a6-a0ea-470b-9bdb-938695a4d676', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='unix', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=1}', managed='false', plugged='true', readOnly='false', deviceAlias='channel0', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860b': u"VmDevice:{id='VmDeviceId:{deviceId='62bae4e5-d01d-4951-b21d-115c8510860b', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='unix', type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0, type=virtio-serial, port=2}', managed='false', plugged='true', readOnly='false', deviceAlias='channel1', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}", u'device_08b0c6a6-a0ea-470b-9bdb-938695a4d676device_62bae4e5-d01d-4951-b21d-115c8510860bdevice_b138dc19-a6e0-4960-975d-900cc99b624adevice_1eadae15-2af4-4515-94b4-1f73aadd8836': u"VmDevice:{id='VmDeviceId:{deviceId='1eadae15-2af4-4515-94b4-1f73aadd8836', vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4'}', device='ide', type='CONTROLLER', bootOrder='0', specParams='[]', address='{slot=0x01, bus=0x00, domain=0x0000, type=pci, function=0x1}', managed='false', plugged='true', readOnly='false', deviceAlias='ide', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}"}, u'format': u'raw', u'bootOrder': u'1', u'discard': False, 'vmid': 'c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4', u'sgio': u'filtered', u'GUID': u'3514f0c5a51601089', u'specParams': {}, u'readonly': u'false', u'optional': u'false', u'deviceId': u'eb7b2891-7f87-4be0-a233-55ffd3993398', u'address': {u'bus': u'0', u'controller': u'0', u'type': u'drive', u'target': u'0', u'unit': u'0'}, u'device': u'lun', u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'} (code=1) (vm:1633)
2017-12-13 09:56:37,253+0200 INFO  (vm/c524a3ed) [virt.vm] (vmId='c524a3ed-d3f0-4f1b-bb7c-6ff31c1a49d4') Stopping connection (guestagent:435)

Comment 1 Israel Pinto 2017-12-13 08:37:22 UTC
Created attachment 1367175 [details]
logs(engine,vdsm - both hosts)

Comment 2 Israel Pinto 2017-12-13 08:38:08 UTC
Created attachment 1367176 [details]
Events - vm is up

Comment 3 Israel Pinto 2017-12-13 08:38:40 UTC
Created attachment 1367177 [details]
Events - Failed to start VM

Comment 4 Milan Zamazal 2017-12-13 19:54:54 UTC
I couldn't reproduce the bug with the steps above on my setup. I created a VM and attached an iSCSI disk to it. In step 5 the VM automatically and successfully migrated to the 4.2 host, so I put that host into maintenance as well, in order to perform step 5 with a failure. Even then, the VM starts successfully in step 7 on either host.

Considering the weird errors reported from libvirt in bug 1524941, I'd suggest double checking that there is no problem outside oVirt, e.g. with storage.
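
A minimal host-side check along those lines, assuming the LUN is exposed through multipath (the GUID is the one from the failing drive spec in the vdsm logs above):

# Quick host-side sanity check; assumes the direct LUN is a multipath device.
import os
import subprocess

GUID = '3514f0c5a51601089'  # GUID from the failing drive spec in the vdsm logs

path = '/dev/mapper/' + GUID
print('%s exists: %s' % (path, os.path.exists(path)))

# Show the multipath topology for this LUN (requires device-mapper-multipath).
subprocess.call(['multipath', '-ll', GUID])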

Comment 5 Sandro Bonazzola 2017-12-14 04:01:34 UTC
Dropping blocker flag according to https://lists.ovirt.org/pipermail/devel/2017-December/032213.html

Comment 6 Allon Mureinik 2017-12-14 10:31:09 UTC
(In reply to Sandro Bonazzola from comment #5)
> Dropping blocker flag according to
> https://lists.ovirt.org/pipermail/devel/2017-December/032213.html

Pushing out to 4.2.1 accordingly.

Comment 7 Yaniv Kaul 2017-12-27 15:03:12 UTC
(In reply to Milan Zamazal from comment #4)
> I couldn't reproduce the bug with the steps above on my setup. I created a
> VM and attached an iSCSI disk to it. In step 5 the VM automatically and
> successfully migrated to the 4.2 host, so I put that host into maintenance
> as well, in order to perform step 5 with a failure. Even then, the VM
> starts successfully in step 7 on either host.
> 
> Considering the weird errors reported from libvirt in bug 1524941, I'd
> suggest double checking that there is no problem outside oVirt, e.g. with
> storage.

Can you please check?

Comment 15 Israel Pinto 2018-01-02 09:21:18 UTC
Closing this BZ after checking with storage DEV.
There was a problem with the storage settings that caused this failure.

