Bug 834893
| Summary: | vdsm: vms with shared disk will pause due to I/O errors on double use of PCI Address | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Dafna Ron <dron> | ||||||||
| Component: | ovirt-engine | Assignee: | Eli Mesika <emesika> | ||||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Dafna Ron <dron> | ||||||||
| Severity: | high | Docs Contact: | |||||||||
| Priority: | urgent | ||||||||||
| Version: | 3.1.0 | CC: | abaron, amureini, bazulay, danken, dyasny, hateya, iheim, lpeer, Rhev-m-bugs, yeylon, ykaul, yzaslavs | ||||||||
| Target Milestone: | --- | ||||||||||
| Target Release: | 3.1.0 | ||||||||||
| Hardware: | x86_64 | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | storage | ||||||||||
| Fixed In Version: | SI13 | Doc Type: | Bug Fix | ||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | |||||||||||
| : | 840386 (view as bug list) | Environment: | |||||||||
| Last Closed: | Type: | Bug | |||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Embargoed: | |||||||||||
| Bug Depends On: | 840386 | ||||||||||
| Bug Blocks: | |||||||||||
| Attachments: |
Created attachment 594021 [details]
logs
{'device': 'ich6', 'specParams': {}, 'type': 'sound'}
is specified twice in the devices list.
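A minimal sketch of how a devices list could be checked for repeated entries like the one above. The function name and the sample list are hypothetical and are not taken from vdsm or ovirt-engine; only the ich6 entry comes from this report.

```python
import json
from collections import Counter


def find_duplicate_devices(devices):
    """Return (device_dict, count) pairs for entries that occur more than once."""
    # Dicts are unhashable, so a canonical JSON string serves as the counting key.
    counts = Counter(json.dumps(dev, sort_keys=True) for dev in devices)
    return [(json.loads(key), n) for key, n in counts.items() if n > 1]


# Hypothetical devices list; only the ich6 entry is copied from this bug.
devices = [
    {'device': 'ich6', 'specParams': {}, 'type': 'sound'},
    {'device': 'disk', 'specParams': {}, 'type': 'disk'},
    {'device': 'ich6', 'specParams': {}, 'type': 'sound'},
]

for dev, n in find_duplicate_devices(devices):
    print("%s appears %d times" % (dev['device'], n))
```

Comparing whole entries keeps the check independent of which fields a given device type carries.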
ich6 is a sound device. Not sure how it is related to the shared disk.

Tested with 5 VMs using the same shared disk, running on a single host. The problem is not reproducible. Looked at the code; this cannot be connected to the shared disk. Please let me know how to proceed.

Dafna - per comment 5 - please try to reproduce and provide steps. Thanks.

Reproduces on si7 with vdsm-4.9.6-17.0.el6.x86_64
[root@orange-vdsd ~]# vdsClient -s 0 list table
345b7456-c365-4334-87f7-0a5eb6e54ad3 27414 NEW3 Paused
d9ed9c4a-0892-4992-9005-4a538cdba77b 27321 NEW Paused
c790c4ae-edfd-46c0-b79e-7008d748b44f 27464 NEW2 Up
Event log: VM NEW2 started on Host orange-vdsd
Logs will be attached again -> I restarted vdsm before the test, so look for "I am" in the logs.
Steps to reproduce:
1. Create several VMs with no disks
2. Create a shared disk
3. Attach the shared disk to all VMs (as a single, non-bootable disk)
4. Run all VMs on the same host

Created attachment 594521 [details]
logs
seems like a vdsm issue
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1290, in _run
    self.preparePaths(devices[vm.DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 616, in preparePaths
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/share/vdsm/clientIF.py", line 190, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {'index': '0', 'iface': 'virtio', 'format': 'raw', 'type': 'disk', 'specParams': {}, 'readonly': 'false', 'deviceId': '070fe1ec-18c1-4941-85b8-c857735f0bb4', 'propagateErrors': 'off', 'address': {'bus': '0x00', ' slot': '0x06', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device': 'disk', 'shared': 'false', 'GUID': '1Dafna-Direct41340269', 'optional': 'false'}
Seems like the /dev/mapper/1Dafna-Direct41340269 volume is not valid or not accessible.
danken, please recheck.
The bug was not reproduced on latest, even when using the same vdsm, libvirt and qemu RPMs as Dafna:
vdsm-python-4.9.6-17.0.el6.noarch
vdsm-4.9.6-17.0.el6.x86_64
vdsm-cli-4.9.6-17.0.el6.noarch
libvirt-0.9.10-21.el6.x86_64
qemu-img-rhev-0.12.1.2-2.295.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.295.el6.x86_64

Testing again with a git branch on git hash 1e1966cfd65cc2008fd2317ef127e3c09fc40d16 (this is the git hash reported in si7): did not succeed in reproducing the bug. So I now have an environment identical to Dafna's (core, kernel, vdsm, libvirt and qemu) and the bug is still not reproducible. Will need additional information to proceed. Setting NEEDINFO on Dafna again.

eli - you are mentioning seeing a bad volume. dafna/danken are discussing a duplicate ich (sound, iirc) device. Assuming dafna reproduces the duplicate ich error, please take a look in her db at the device table to see if ich is defined more than once. If the error is the bad volume specification, I agree we need to look in vdsm, but we need the environment reproducing this.
dafna - for the repro steps in comment 7, did this reproduce for you consistently each time you tried to start the VM?

> dafna - for the repro steps in comment 7, did this reprodcue for you
> consistently each time you tried to start the VM?
yes
Checking again with an iSCSI domain, as Dafna uses (my previous checks were on an NFS domain). Same result: not reproduced on si7. Dafna is going to check it on si8 as the next step. The sound card duplication seems totally unrelated.

Dafna, following comment #14, can you please reproduce on si8?

(In reply to comment #15)
> Dafna , following comment #14 Can you please reproduce on si8?
Checked on si8 (Kiril's env). Unable to reproduce the bug; the reported scenario works perfectly.

Updated scenario:
1. Create 3-4 VMs with a NIC but no disk
2. Go to the disk tab
3. Create a new shared disk
4. Go back to the VM tab
5. Attach the disk you created to the VMs, one VM at a time :)
6. Run the VMs on the hosts

Correction: the patch is only http://gerrit.ovirt.org/#/c/6282/

Not verified. VMs still paused due to I/O errors. Attaching new logs.

(In reply to comment #23)
> not verified.
> vms still paused due to I/O errors
> attaching new logs
As you can see, this bug blocks 840386, which is a vdsm bug in POST status, so it will not work until 840386 is merged.

Created attachment 601042 [details]
logs
si12 - logs attached
Actually, this bug is marked as if it's blocking 840386, and not the other way around :) Changing this bug to depend on 840386.

Verified on si13.2 with vdsm-4.9.6-27.0.el6_3.x86_64
Description of problem:
Running VMs with a shared disk on the same host causes the VMs to pause due to I/O errors, with the error: XML error: Attempted double use of PCI Address

Version-Release number of selected component (if applicable):
vdsm-4.9.6-16.0.el6.x86_64
si6

How reproducible:
100%

Steps to Reproduce:
1. Create a shared disk and attach it to several VMs
2. Run all VMs on the same host

Actual results:
The VMs pause due to I/O errors with the following error: XML error: Attempted double use of PCI Address

Expected results:
We should be able to run the VMs on the same host.

Additional info:
Full backend and vdsm logs.

Thread-411::ERROR::2012-06-24 18:49:07,264::vm::604::vm.Vm::(_startUnderlyingVm) vmId=`0ffa8e45-f64d-45f4-9df1-6d165c48f8d8`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1364, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2490, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: XML error: Attempted double use of PCI Address '0:0:4.0'
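
For anyone triaging a similar failure, the following is a rough sketch of how a device list could be scanned for two devices claiming the same PCI address before the domain XML reaches libvirt. It is not vdsm or engine code. The helper names, the device names and the sample addresses are hypothetical. The 'domain:bus:slot.function' string only approximates the form seen in the libvirt message, and the key stripping assumes address dicts may carry stray spaces in their keys, as in the VolumeError dump in this bug.

```python
from collections import defaultdict


def pci_key(address):
    """Return a 'domain:bus:slot.function' string, or None for non-PCI addresses."""
    # Strip stray whitespace from keys and values before parsing.
    addr = {k.strip(): str(v).strip() for k, v in address.items()}
    if addr.get('type') != 'pci':
        return None
    # int(..., 16) accepts the '0x' prefix used in the address dicts.
    parts = tuple(int(addr[name], 16)
                  for name in ('domain', 'bus', 'slot', 'function'))
    return '%x:%x:%x.%x' % parts


def find_pci_conflicts(devices):
    """Map each PCI address used more than once to the devices that claim it."""
    by_address = defaultdict(list)
    for dev in devices:
        key = pci_key(dev.get('address') or {})
        if key is not None:
            by_address[key].append(dev['device'])
    return {key: names for key, names in by_address.items() if len(names) > 1}


# Hypothetical devices; names and addresses are made up for illustration.
devices = [
    {'device': 'dev-a', 'address': {'type': 'pci', 'domain': '0x0000',
                                    'bus': '0x00', 'slot': '0x04', 'function': '0x0'}},
    {'device': 'dev-b', 'address': {'type': 'pci', 'domain': '0x0000',
                                    'bus': '0x00', 'slot': '0x04', 'function': '0x0'}},
    {'device': 'dev-c', 'address': {'type': 'pci', 'domain': '0x0000',
                                    'bus': '0x00', 'slot': '0x06', 'function': '0x0'}},
]

print(find_pci_conflicts(devices))
```

Devices without an address are skipped, since libvirt assigns those itself.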