Bug 1546826 - error detaching volumes when under load
Summary: error detaching volumes when under load
Keywords:
Status: CLOSED DUPLICATE of bug 1551733
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 12.0 (Pike)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 13.0 (Queens)
Assignee: Lee Yarwood
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-19 16:46 UTC by coldford@redhat.com
Modified: 2023-03-21 18:44 UTC
CC: 17 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-21 10:17:48 UTC
Target Upstream Version:
Embargoed:




Links
System ID Status Summary Last Updated
OpenStack gerrit 551950 MERGED Avoid exploding if guest refuses to detach a volume 2020-10-28 22:41:12 UTC
Red Hat Issue Tracker OSP-11355 None None 2021-12-10 15:58:12 UTC

Internal Links: 1548070

Description coldford@redhat.com 2018-02-19 16:46:49 UTC
Description of problem:

Running a stack delete while instances are busy sometimes results in errors such as:

2018-02-14 20:27:59.748 1 ERROR oslo_messaging.rpc.server libvirtError: internal error: unable to execute QEMU command 'device_del': Device 'virtio-disk2' not found

and a failure to detach the volumes, resulting in stack delete failure.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create stack
2. Create load 
3. Delete stack
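
For illustration (the stack name and template file are hypothetical), the above can be driven from the openstack CLI:

  openstack stack create -t stack_with_volumes.yaml teststack
  # inside the guests, keep the attached volumes busy, e.g.:
  #   dd if=/dev/zero of=/dev/vdb bs=1M oflag=direct
  openstack stack delete teststack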

Actual results:

Volumes showing as DELETE_FAILED


Expected results:

Volumes get deleted.

Additional info:

We've tried raising the following to compensate (see the snippet after this list for where these settings live):
- haproxy server and client timeouts to 5 minutes
- cinder workers to 56 (to match the number of CPUs)
- keystone rpc timeout to 300s
- cinder rpc timeout to 300s
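
For reference, assuming the default file locations (in OSP these are normally managed through the overcloud templates rather than edited by hand), the haproxy and cinder settings above correspond roughly to:

  # /etc/haproxy/haproxy.cfg
  defaults
      timeout client  5m
      timeout server  5m

  # /etc/cinder/cinder.conf
  [DEFAULT]
  osapi_volume_workers = 56
  rpc_response_timeout = 300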

Comment 1 Eric Harney 2018-02-19 16:54:32 UTC
What volume driver is being used here?

Comment 3 coldford@redhat.com 2018-02-19 16:57:24 UTC
As per the sosreports:
etc/cinder/cinder.conf:volume_driver=cinder.volume.drivers.dell_emc.vnx.driver.VNXDriver

Comment 5 Thiago da Silva 2018-03-09 15:03:34 UTC
Lee, can someone from the Nova team take a look at the nova error? Cinder team is not sure whether this bug is Cinder or Nova related. It is not possible to duplicate without a VNX array.

Comment 6 Lee Yarwood 2018-03-12 10:19:19 UTC
(In reply to Thiago da Silva from comment #5)
> Lee, can someone from the Nova team take a look at the nova error? Cinder
> team is not sure whether this bug is Cinder or Nova related. It is not
> possible to duplicate without a VNX array.

Moving this across to Nova. This looks like the guest is still sending I/O to the volume; we've recently started ignoring this failure upstream [1][2], and it should be a trivial backport downstream into OSP 12. A minimal sketch of the pattern follows the links below.

[1] https://review.openstack.org/#/c/546423/
[2] https://bugs.launchpad.net/nova/+bug/1750680
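
Purely to illustrate the pattern the upstream change adopts -- this is a minimal sketch using the libvirt Python bindings, not the actual Nova patch, and detach_volume_device() and its parameters are hypothetical:

import time
import libvirt

def detach_volume_device(dom, disk_xml, attempts=8, delay=5):
    """Detach a disk, treating an already-gone device as success."""
    for _ in range(attempts):
        try:
            dom.detachDeviceFlags(disk_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE)
            return True
        except libvirt.libvirtError as exc:
            # On this libvirt/QEMU the race surfaces as an internal error:
            # "unable to execute QEMU command 'device_del': Device
            # 'virtio-diskN' not found". The device is already gone, so
            # treat the detach as complete instead of failing the whole
            # volume detach (and the stack delete with it).
            if (exc.get_error_code() == libvirt.VIR_ERR_INTERNAL_ERROR
                    and 'not found' in (exc.get_error_message() or '')):
                return True
            # A busy guest may simply not have processed device_del yet;
            # wait and retry before giving up.
            time.sleep(delay)
    return False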

Comment 10 Lee Yarwood 2018-03-21 10:17:48 UTC

*** This bug has been marked as a duplicate of bug 1551733 ***

