Description of problem:
Here are a few samples:

2017-03-02 20:20:02.838 6293 INFO nova.compute.manager [req-bca5e2cb-d5e1-4c21-b5ea-3e732116aca2 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] [instance: e6f37a83-d85a-4740-9637-151b2966a0a7] Detach volume 05bd4067-53d8-4d74-b58e-f196883b8d31 from mountpoint /dev/vdb
2017-03-02 20:20:16.395 6293 INFO os_brick.initiator.linuxscsi [req-bca5e2cb-d5e1-4c21-b5ea-3e732116aca2 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] Find Multipath device file for volume WWN 3600601609e603d004ec5f36e80ffe611
2017-03-02 20:20:16.474 6293 WARNING os_brick.initiator.connectors.iscsi [req-bca5e2cb-d5e1-4c21-b5ea-3e732116aca2 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] Failed to parse the output of multipath -ll. stdout:
3600601609e603d00db0e7da761ffe611 dm-0
3600601609e603d004ec5f36e80ffe611 dm-6 ,
size=1.0G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  `- #:#:#:# - #:# failed faulty running
3600601609e603d00db0e7da761ffe611 dm-0
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
3600601609e603d00f541bb3a6affe611 dm-2
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
3600601609e603d005b9abc497cffe611 dm-4
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw

and also:

2017-03-02 20:20:02.440 6289 INFO nova.compute.manager [req-4512de11-1cfe-46ce-860c-e37df8307141 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] [instance: b4db63c3-5c29-4238-8eb2-844202beac58] Detach volume f7766f20-b1f6-49a4-aaab-60a64975c885 from mountpoint /dev/vdb
2017-03-02 20:20:03.926 6289 INFO os_brick.initiator.linuxscsi [req-4512de11-1cfe-46ce-860c-e37df8307141 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] Find Multipath device file for volume WWN 3600601609e603d00237b277580ffe611
2017-03-02 20:20:04.003 6289 WARNING os_brick.initiator.connectors.iscsi [req-4512de11-1cfe-46ce-860c-e37df8307141 241af030f55c4f80ada2471b7d890d02 6cf231df5cd34237ae96713a13fe35e0 - - -] Failed to parse the output of multipath -ll. stdout:
3600601609e603d007c6bb7427cffe611 dm-2
3600601609e603d00237b277580ffe611 dm-4 ,
size=1.0G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  `- #:#:#:# - #:# failed faulty running
3600601609e603d007c6bb7427cffe611 dm-2
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
3600601609e603d002cbf54d161ffe611 dm-0
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw

Version-Release number of selected component (if applicable):
OSP10. The customer has upgraded all packages. VNX backend storage (iSCSI).

How reproducible:
Not every time.

Steps to Reproduce:
1. Delete a stack of several (32) instances, each with Cinder volumes attached.

Actual results:
'openstack volume list' shows some volumes still as 'attached to <uuid>', but those UUIDs no longer exist.

Expected results:
Devices should be detached and deleted correctly.

Additional info:
Running `multipath -F` seems to work around the issue.
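The parse failure in the logs happens because one multipath map line (e.g. `3600601609e603d00db0e7da761ffe611 dm-0`) has no size/features detail lines under it, presumably because the map was mid-removal when `multipath -ll` ran. This is not the actual os-brick parser; it is a minimal sketch of a tolerant grouping pass that drops such orphaned entries instead of failing the whole parse:

```python
import re

# WWNs in the log are 33 hex characters ("3" plus a 32-digit identifier)
WWN_RE = re.compile(r'^([0-9a-f]{33})\s+(dm-\d+)')

def parse_multipath_ll(output):
    """Group `multipath -ll` output into per-device entries, dropping
    map lines that never get size/features detail lines under them
    (e.g. a map that is mid-removal) instead of failing entirely."""
    entries = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        m = WWN_RE.match(line)
        if m:
            current = m.group(1)
            entries[current] = {'dm': m.group(2), 'detail': []}
        elif current and line:
            entries[current]['detail'].append(line)
    # keep only entries that actually carried detail lines
    return {w: e for w, e in entries.items() if e['detail']}
```

Fed the truncated stdout from the first sample above, the orphaned `dm-0` entry is silently skipped while the complete entries still parse.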
In further testing last week, some key additional fixes were identified for this collection, and Gorka is in the process of getting those ready to post upstream. We don't have a specific ETA, but we will update the BZs once the patches are ready.
Patches are posted for review, and comprehensive testing is now passing on iSCSI (FC multipath testing will follow). The relevant patch set will be the following; we will confirm the exact collection needed, which may vary from the initial set proposed. This BZ will be the right place to track status for the collection.
https://review.openstack.org/#/c/455394/
https://review.openstack.org/#/c/455393/
https://review.openstack.org/#/c/455392/
Verified on: openstack-cinder-9.1.4-9.el7ost.noarch

Configured a Cinder Kaminario iSCSI multipath backend. Using a virt deployment, maxed out at 18 tiny instances. Booted instances, creating a Cinder boot volume from image with shutdown=preserve:

# nova boot --flavor tinydisk --nic net-id=0a4d14d4-735b-4aaa-a0b3-f40c76b8d277 --block-device source=image,id=83a1ffc8-38e5-4b41-817e-09b105cef3f4,dest=volume,size=1,shutdown=preserve,bootindex=0 one1 --poll --min-count 9

Couldn't use --min-count 18; the system fails to create them all at once. Ran the command again; now have 18 instances with 18 attached volumes.

Deleted all 18 instances:

# for i in $(nova list --all-tenants | awk '{print $2}'); do nova delete $i ; done

All 18 instances were deleted, and all 18 volumes remained detached/available. Reran the whole 18-instance create/delete loop for four cycles; no issues were spotted. As expected, no volume remained attached to phantom UUIDs. Verified.

Out of curiosity, also ran a fifth cycle, this time with shutdown=remove: 18 instances/volumes were created, and again all instances/volumes were deleted with none left behind.
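The "attached to phantom UUID" check used in the verification above can be scripted. The helper below is a hypothetical sketch, not part of any shipped tooling: the field names 'ID', 'Attachments', and 'server_id' are assumptions about the client's JSON output shape, which varies across releases, so check them against your version before relying on this:

```python
import json
import subprocess

def phantom_attachments(volumes, live_server_ids):
    """Return IDs of volumes whose attachment references a server UUID
    that no longer exists -- the leak symptom reported in this bug.
    NOTE: 'ID', 'Attachments', and 'server_id' are assumed field names."""
    leaked = []
    for vol in volumes:
        for att in vol.get('Attachments') or []:
            if att.get('server_id') not in live_server_ids:
                leaked.append(vol['ID'])
    return leaked

def check_cloud():
    """Wire the helper to the CLI (requires admin credentials)."""
    servers = json.loads(subprocess.check_output(
        ['openstack', 'server', 'list', '--all-projects', '-f', 'json']))
    volumes = json.loads(subprocess.check_output(
        ['openstack', 'volume', 'list', '--all-projects', '-f', 'json']))
    return phantom_attachments(volumes, {s['ID'] for s in servers})
```

An empty return from phantom_attachments() after a delete cycle corresponds to the "no volume remained attached to phantom UUIDs" result above.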
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2821