Bug 834893 - vdsm: vms with shared disk will pause due to I/O errors on double use of PCI Address
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
3.1.0
x86_64 Linux
Priority: urgent, Severity: high
: ---
: 3.1.0
Assigned To: Eli Mesika
Dafna Ron
storage
:
Depends On: 840386
Blocks:
Reported: 2012-06-24 12:11 EDT by Dafna Ron
Modified: 2016-02-10 11:48 EST
CC List: 12 users

See Also:
Fixed In Version: SI13
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 840386 (view as bug list)
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
logs (512.70 KB, application/x-gzip), 2012-06-24 12:11 EDT, Dafna Ron
logs (5.21 MB, application/x-gzip), 2012-06-26 11:30 EDT, Dafna Ron
logs (13.81 MB, application/x-gzip), 2012-07-29 10:58 EDT, Dafna Ron

Description Dafna Ron 2012-06-24 12:11:01 EDT
Description of problem:

Running VMs with a shared disk on the same host causes the VMs to pause due to I/O errors, with the error "XML error: Attempted double use of PCI Address".

Version-Release number of selected component (if applicable):

vdsm-4.9.6-16.0.el6.x86_64
si6

How reproducible:

100%

Steps to Reproduce:
1. Create a shared disk and attach it to several VMs.
2. Run all VMs on the same host.
Actual results:

The VMs pause due to I/O errors with the following error:

XML error: Attempted double use of PCI Address

Expected results:

We should be able to run the VMs on the same host.

Additional info: full backend and vdsm logs

Thread-411::ERROR::2012-06-24 18:49:07,264::vm::604::vm.Vm::(_startUnderlyingVm) vmId=`0ffa8e45-f64d-45f4-9df1-6d165c48f8d8`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1364, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2490, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: XML error: Attempted double use of PCI Address '0:0:4.0'
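The libvirt error above means that two devices in the domain XML passed to createXML claim the same PCI address. As a minimal sketch (not vdsm code), the following scans a domain XML string for PCI addresses used by more than one device. The function name and the sample XML are hypothetical; the sample is an illustration and was not taken from the attached logs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_pci_addresses(domxml):
    """Return {address: [device tags]} for PCI addresses used more than once."""
    root = ET.fromstring(domxml.strip())
    devices = root.find("devices")
    if devices is None:
        return {}
    users = defaultdict(list)
    for dev in devices:
        addr = dev.find("address")
        if addr is None or addr.get("type") != "pci":
            continue
        # libvirt writes these attributes as hex strings such as '0x04'.
        key = tuple(int(addr.get(name, "0"), 16)
                    for name in ("domain", "bus", "slot", "function"))
        users[key].append(dev.tag)
    # Format the address the same way the libvirt error message does.
    return {"%x:%x:%x.%x" % key: tags
            for key, tags in users.items() if len(tags) > 1}


# Hypothetical domain XML: two devices sharing slot 0x04.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <disk type='block' device='disk'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(find_duplicate_pci_addresses(SAMPLE_XML))
```

Running a check like this against the domxml that vdsm logs before the createXML call would show which devices collide.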
Comment 1 Dafna Ron 2012-06-24 12:11:57 EDT
Created attachment 594021 [details]
logs
Comment 2 Dan Kenigsberg 2012-06-25 07:10:24 EDT
{'device': 'ich6', 'specParams': {}, 'type': 'sound'}

is specified twice in the devices list.
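A quick way to spot this kind of duplication is to count identical entries in the devices list sent to vdsm. The sketch below is an illustration only: the helper name is hypothetical, and the sample list is made up around the ich6 entry quoted above.

```python
import json
from collections import Counter


def duplicate_devices(devices):
    """Return the device dicts that appear more than once in the list."""
    # Dicts are not hashable, so compare them through a canonical JSON dump.
    counts = Counter(json.dumps(dev, sort_keys=True) for dev in devices)
    return [json.loads(dump) for dump, n in counts.items() if n > 1]


# Hypothetical devices list; only the ich6 entry comes from this report.
sample_devices = [
    {'device': 'ich6', 'specParams': {}, 'type': 'sound'},
    {'device': 'disk', 'specParams': {}, 'type': 'disk'},
    {'device': 'ich6', 'specParams': {}, 'type': 'sound'},
]

if __name__ == "__main__":
    print(duplicate_devices(sample_devices))
```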
Comment 4 Itamar Heim 2012-06-26 04:25:13 EDT
ich6 is a sound device. Not sure how it is related to the shared disk.
Comment 5 Eli Mesika 2012-06-26 05:50:46 EDT
Tested with 5 VMs using the same shared disk, all running on a single host.
The problem is not reproducible.
Looked at the code; it cannot be connected to the shared disk.

Please let me know how to proceed.
Comment 6 Itamar Heim 2012-06-26 05:59:51 EDT
Dafna - per comment 5 - please try to reproduce and provide steps.
thanks
Comment 7 Dafna Ron 2012-06-26 11:26:21 EDT
reproduces on si7 with vdsm-4.9.6-17.0.el6.x86_64

[root@orange-vdsd ~]# vdsClient -s 0 list table
345b7456-c365-4334-87f7-0a5eb6e54ad3  27414  NEW3                 Paused                                   
d9ed9c4a-0892-4992-9005-4a538cdba77b  27321  NEW                  Paused                                   
c790c4ae-edfd-46c0-b79e-7008d748b44f  27464  NEW2                 Up          
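When many VMs are involved, the paused ones can be picked out of this table with a few lines of Python. This is a sketch that assumes the four whitespace-separated columns shown above (VM id, pid, name, status) and VM names without spaces; the function name is hypothetical.

```python
def paused_vms(table_text):
    """Return (vm_id, name) for every VM whose status column is 'Paused'."""
    paused = []
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        vm_id, _pid, name, status = fields[:4]
        if status == "Paused":
            paused.append((vm_id, name))
    return paused


# The table output quoted in this comment.
TABLE = """\
345b7456-c365-4334-87f7-0a5eb6e54ad3  27414  NEW3                 Paused
d9ed9c4a-0892-4992-9005-4a538cdba77b  27321  NEW                  Paused
c790c4ae-edfd-46c0-b79e-7008d748b44f  27464  NEW2                 Up
"""

if __name__ == "__main__":
    for vm_id, name in paused_vms(TABLE):
        print(name, vm_id)
```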


event log:

VM NEW2 started on Host orange-vdsd

Logs will be attached again. I restarted vdsm before the test, so look for "I am" in the logs to find the restart point.

reproduce: 

1. Create several VMs with no disks.
2. Create a shared disk.
3. Attach the shared disk to all VMs (as a single, non-bootable disk).
4. Run all VMs on the same host.
Comment 8 Dafna Ron 2012-06-26 11:30:28 EDT
Created attachment 594521 [details]
logs
Comment 9 Eli Mesika 2012-06-27 03:23:47 EDT
seems like a vdsm issue 

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1290, in _run
    self.preparePaths(devices[vm.DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 616, in preparePaths
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/share/vdsm/clientIF.py", line 190, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {'index': '0', 'iface': 'virtio', 'format': 'raw', 'type': 'disk', 'specParams': {}, 'readonly': 'false', 'deviceId': '070fe1ec-18c1-4941-85b8-c857735f0bb4', 'propagateErrors': 'off', 'address': {'bus': '0x00', ' slot': '0x06', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device': 'disk', 'shared': 'false', 'GUID': '1Dafna-Direct41340269', 'optional': 'false'}


It seems that the /dev/mapper/1Dafna-Direct41340269 volume is not valid or not accessible.
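Whether the device-mapper path is present and is a block device can be checked directly on the host. The sketch below uses only the Python standard library; the helper name is hypothetical, and it does not reproduce what vdsm's prepareVolumePath checks internally.

```python
import os
import stat


def check_block_device(path):
    """Report whether path exists and is a block device."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return "not accessible: %s" % exc.strerror
    if not stat.S_ISBLK(st.st_mode):
        return "exists but is not a block device"
    return "ok"


if __name__ == "__main__":
    # Path built from the GUID in the VolumeError above.
    print(check_block_device("/dev/mapper/1Dafna-Direct41340269"))
```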

danken , please recheck ....
Comment 10 Eli Mesika 2012-06-27 05:31:49 EDT
The bug was not reproduced on latest, even when using the same vdsm, libvirt, and qemu RPMs as Dafna:

vdsm-python-4.9.6-17.0.el6.noarch
vdsm-4.9.6-17.0.el6.x86_64
vdsm-cli-4.9.6-17.0.el6.noarch
libvirt-0.9.10-21.el6.x86_64
qemu-img-rhev-0.12.1.2-2.295.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.295.el6.x86_64
Comment 11 Eli Mesika 2012-06-27 15:38:05 EDT
Tested again with a git branch at git hash 1e1966cfd65cc2008fd2317ef127e3c09fc40d16
(this is the git hash reported in si7).

Did not succeed in reproducing the bug.

So I now have an environment identical to Dafna's (core, kernel, vdsm, libvirt, and qemu), and the bug is still not reproducible.

Will need additional information to proceed.
Setting NEEDINFO on Dafna again.
Comment 12 Itamar Heim 2012-06-27 21:59:12 EDT
eli - you are mentioning seeing a bad volume.
dafna/danken are discussing a duplicate ich (sound iirc) device.
assuming dafna reproduces the duplicate ich error, please take a look in her db at the device table to see if ich is defined more than once.
if the error is the bad volume specification, i agree we need to look in vdsm, but we need the environment reproducing this.

dafna - for the repro steps in comment 7, did this reproduce for you consistently each time you tried to start the VM?
Comment 13 Dafna Ron 2012-06-28 04:30:24 EDT
> 
> dafna - for the repro steps in comment 7, did this reproduce for you
> consistently each time you tried to start the VM?

yes
Comment 14 Eli Mesika 2012-06-28 05:56:55 EDT
Checked again with an iSCSI domain, as Dafna uses (my previous checks were on an NFS domain).

Same result: not reproduced on si7.

Dafna is going to check it on si8 as the next step.

The sound card duplication seems totally unrelated.
Comment 15 Yair Zaslavsky 2012-07-01 03:41:34 EDT
Dafna, following comment #14, can you please reproduce on si8?
Comment 16 Eli Mesika 2012-07-01 08:11:51 EDT
(In reply to comment #15)
> Dafna, following comment #14, can you please reproduce on si8?

Checked on si8 (Kiril's env).
Unable to reproduce the bug; the reported scenario works perfectly.
Comment 18 Eli Mesika 2012-07-08 09:35:20 EDT
Updated scenario:

1. Create 3-4 VMs with a NIC but no disk.
2. Go to the disk tab.
3. Create a new shared disk.
4. Go back to the VM tab.
5. Attach the disk you created, one VM at a time :)
6. Run the VMs on the hosts.
Comment 21 Eli Mesika 2012-07-16 04:40:54 EDT
Correction: the only patch is
http://gerrit.ovirt.org/#/c/6282/
Comment 23 Dafna Ron 2012-07-29 10:46:10 EDT
not verified. 
vms still paused due to I/O errors
attaching new logs
Comment 24 Eli Mesika 2012-07-29 10:54:37 EDT
(In reply to comment #23)
> not verified. 
> vms still paused due to I/O errors
> attaching new logs

As you can see, this bug blocks 840386, which is a vdsm bug in POST status, so it will not work until 840386 is merged.
Comment 25 Dafna Ron 2012-07-29 10:58:58 EDT
Created attachment 601042 [details]
logs

si12 - logs attached
Comment 26 Dafna Ron 2012-07-29 11:02:38 EDT
Actually, this bug is marked as if it's blocking 840386 and not the other way around :)

Changing this bug to depend on 840386.
Comment 27 Dafna Ron 2012-08-12 09:46:38 EDT
verified on si13.2 vdsm-4.9.6-27.0.el6_3.x86_64
