Bug 1088508
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | libvirt: Duplicate ID 'drive-virtio-disk1' for drive when trying to attach a volume to instance for a second time | | |
| Product | Red Hat Enterprise Linux 6 | Reporter | Dafna Ron <dron> |
| Component | libvirt | Assignee | Libvirt Maintainers <libvirt-maint> |
| Status | CLOSED DUPLICATE | QA Contact | Virtualization Bugs <virt-bugs> |
| Severity | high | Docs Contact | |
| Priority | unspecified | | |
| Version | 6.6 | CC | acathrow, bili, chhu, dron, dyuan, jdenemar, mzhan |
| Target Milestone | rc | Keywords | Reopened |
| Target Release | --- | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | storage | | |
| Fixed In Version | | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2014-04-17 13:41:51 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Attachments | logs (attachment 886967) | | |
*** This bug has been marked as a duplicate of bug 807023 ***

I'm not sure I understand why this is a duplicate of bug 807023. Bug 807023 is about changing device_del from an async to a sync event that alerts on a failure to perform the device_del. However, my bug is about not being able to attach the device again, which means that a device_del does not happen when we detach the device. This is a regression in the current build of openstack, so when a device_del does not succeed when I detach the volume from the instance, is that Nova? Cinder? Libvirt?

And that's exactly the point of bug 807023. The device_del QEMU command is and has always been asynchronous, even though libvirt thought it wasn't. Thus libvirt was not able to detect when the device was not in fact removed. And that's your case: you tried to remove the device, which did not succeed, but libvirt still removed it from its records. When you tried to attach it again, libvirt reused the original alias "drive-virtio-disk1", which was however still in use by the original disk, and so QEMU complained about it. If you see this as a regression, it's either because you were previously lucky and the first disk was actually removed before you tried to attach a new disk, or because the current guest OS does not cooperate in the removal.

Jiri, I think we understand the same thing, only referring to different issues. My case is actually that we fail to clean the record, which did not happen in past builds, and I am reproducing it 100% (so it's not my luck that changed :)). I am trying to figure out which process is responsible for cleaning this record and why we have a regression (is it libvirt? qemu? nova? cinder?). If we have a 100% failure in detach, who can we attribute this to? Bug 807023 would allow me to debug this more easily, or would refuse to detach the volume in the first place, which would also make the problem easier to debug, but it would not solve my underlying issue, which is that for some reason the detach fails. Perhaps the headline should be modified?

A failure to detach is most likely caused by the guest OS ignoring the detach request. What guest OS do you use?

It's some Fedora image. Let me create some new images and check with them. If this only happens for a specific image then indeed this is a clone and I will close :) Thanks!

Yeah - it's the Fedora image. Thanks Jiri! Closing this as duplicate.

*** This bug has been marked as a duplicate of bug 807023 ***
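Since device_del is asynchronous on the QEMU side, a management layer cannot assume the disk is gone the moment the detach call returns; it has to confirm the removal (for example by re-reading the live domain XML) before reusing the target or attaching the volume again. Below is a minimal sketch of such a check, assuming the libvirt Python bindings; the domain name, source path, and target dev are hypothetical, and this is illustrative rather than the actual fix tracked in bug 807023.

```python
# Minimal sketch, assuming the libvirt Python bindings. The domain name
# "instance-0001", the source path, and the target dev "vdb" are hypothetical;
# this illustrates the idea, it is not the Nova/libvirt code path from this bug.
import time
import libvirt

def wait_for_disk_removal(dom, target_dev, timeout=30.0, interval=1.0):
    """Poll the live domain XML until the disk with the given target dev
    disappears, or the timeout expires. Returns True if it is gone."""
    deadline = time.time() + timeout
    needle = "<target dev='%s'" % target_dev
    while time.time() < deadline:
        if needle not in dom.XMLDesc(0):  # live config no longer lists the disk
            return True
        time.sleep(interval)
    return False

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-0001')

# Device XML for the disk being detached (hypothetical values).
disk_xml = """<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/mapper/volume-1234'/>
  <target dev='vdb' bus='virtio'/>
</disk>"""

dom.detachDeviceFlags(disk_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE)

if not wait_for_disk_removal(dom, 'vdb'):
    # The guest never released the device (e.g. it ignored the unplug request).
    # Re-attaching now would reuse the alias and fail with
    # "Duplicate ID 'drive-virtio-disk1' for drive".
    raise RuntimeError("disk vdb was not actually removed by the guest")
```

With a check like this, the "Duplicate ID" error on re-attach is replaced by an explicit "the guest did not release the device" failure, which is much easier to attribute to the right component.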
Created attachment 886967 [details]
logs

Description of problem:
Attach a volume to an instance -> detach the volume from the instance -> attach a volume again to the same instance: the second attach fails with a Duplicate ID 'drive-virtio-disk1' for drive error from libvirt.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. install an openstack setup with standalone cinder + gluster backend
2. create 2 volumes
3. launch an instance
4. attach the first volume to the instance
5. detach the first volume from the instance
6. attach the first volume to the instance again
7. attach the second volume to the instance

Actual results:
we fail to re-attach a volume after detaching it, with the following error:

libvirtError: internal error unable to execute QEMU command '__com.redhat_drive_add': Duplicate ID 'drive-virtio-disk1' for drive

Expected results:
we should be able to re-attach volumes after detaching

Additional info:
logs

full trace:
2014-04-16 20:17:48.527 10254 DEBUG nova.openstack.common.rpc.amqp [req-6d6c61c2-259f-4206-abac-e45264393bbc 97e5450b24624fd78ad6fa6d8a14ef3d c3178ebef2c24d1b9a045bd67483a83c] UNIQUE_ID is 928d2ad4bb574ed69d933b23db60db9e. _add_unique_id /usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py:341
2014-04-16 20:17:48.557 10254 DEBUG nova.ceilometer.notifier [req-6d6c61c2-259f-4206-abac-e45264393bbc 97e5450b24624fd78ad6fa6d8a14ef3d c3178ebef2c24d1b9a045bd67483a83c] ignoring attach_volume notify /usr/lib/python2.6/site-packages/ceilometer/compute/nova_notifier.py:146
2014-04-16 20:17:48.557 10254 ERROR nova.openstack.common.rpc.amqp [req-6d6c61c2-259f-4206-abac-e45264393bbc 97e5450b24624fd78ad6fa6d8a14ef3d c3178ebef2c24d1b9a045bd67483a83c] Exception during message handling
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp **args)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 90, in wrapped
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp payload)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 73, in wrapped
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp return f(self, context, *args, **kw)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 244, in decorated_function
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp pass
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 230, in decorated_function
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 272, in decorated_function
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp e, sys.exc_info())
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 259, in decorated_function
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3657, in attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp context, instance, mountpoint)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3652, in attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp mountpoint, instance)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3699, in _attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp connector)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3689, in _attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp encryption=encryption)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1117, in attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp disk_dev)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1104, in attach_volume
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp virt_dom.attachDeviceFlags(conf.to_xml(), flags)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 179, in doit
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp result = proxy_call(self._autowrap, f, *args, **kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 139, in proxy_call
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp rv = execute(f,*args,**kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 77, in tworker
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp rv = meth(*args,**kwargs)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/libvirt.py", line 419, in attachDeviceFlags
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp libvirtError: internal error unable to execute QEMU command '__com.redhat_drive_add': Duplicate ID 'drive-virtio-disk1' for drive
2014-04-16 20:17:48.557 10254 TRACE nova.openstack.common.rpc.amqp
2014-04-16 20:17:49.996 10254 DEBUG nova.openstack.common.periodic_task [-] Running periodic task ComputeManager._poll_volume_usage run_periodic_tasks /usr/lib/python2.6/site-packages/nova/openstack/common/periodic_task.py:176
2014-04-16 20:17:49.997 10254 DEBUG nova.openstack.common.periodic_task [-] Running periodic task ComputeManager._instance_usage_audit run_periodic_tasks /usr/lib/python2.6/site-packages/nova/openstack/common/periodic_task.py:176
2014-04-16 20:17:49.997 10254 DEBUG nova.openstack.common.rpc.amqp [-] Making synchronous call on conductor ... multicall /usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py:553
2014-04-16 20:17:49.997 10254 DEBUG nova.openstack.common.rpc.amqp [-] MSG_ID is 2d52b78fc1154139b627657238cd3854 multicall /usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py:556
2014-04-16 20:17:49.998 10254 DEBUG nova.openstack.common.rpc.amqp [-] UNIQUE_ID is 4d03764ccd8f430ebbd8c04f91a9771b. _add_unique_id /usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py:341
2014-04-16 20:17:50.018 10254 DEBUG nova.openstack.common.periodic_task [-] Running periodic task ComputeManager.update_available_resource run_periodic_tasks /usr/lib/python2.6/site-packages/nova/openstack/common/periodic_task.py:176
2014-04-16 20:17:50.019 10254 DEBUG nova.openstack.common.lockutils [-] Got semaphore "compute_resources" lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:166
2014-04-16 20:17:50.019 10254 DEBUG nova.openstack.common.lockutils [-] Got semaphore / lock "update_available_resource" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:245
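The trace shows the failure surfacing in Nova's libvirt driver at virt_dom.attachDeviceFlags(). Before retrying an attach that failed like this, one can inspect the live domain XML to see whether the supposedly detached disk is still present. A small diagnostic sketch follows, again assuming the libvirt Python bindings; the domain name "instance-0001" is hypothetical.

```python
# Diagnostic sketch, assuming the libvirt Python bindings; the domain name
# "instance-0001" is hypothetical. It lists the disks the running guest still
# has attached, which helps distinguish "the guest ignored the unplug" from a
# Nova/Cinder bookkeeping problem before retrying the attach.
import libvirt
import xml.etree.ElementTree as ET

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-0001')

root = ET.fromstring(dom.XMLDesc(0))
for disk in root.findall('./devices/disk'):
    target = disk.find('target')
    alias = disk.find('alias')
    print("%s -> alias %s" % (
        target.get('dev') if target is not None else '(no target)',
        alias.get('name') if alias is not None else '(no alias)'))

# If vdb still shows up here (typically with alias "virtio-disk1", from which
# QEMU's drive ID "drive-virtio-disk1" is derived) even though the detach
# appeared to succeed, the guest has not released the device and a re-attach
# will fail exactly as in the trace above.
```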