Bug 1462346 - Heat stack creating volumes has intermittent volume failures.
Summary: Heat stack creating volumes has intermittent volume failures.
Keywords:
Status: CLOSED DUPLICATE of bug 1396112
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Eric Harney
QA Contact: Avi Avraham
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-06-16 19:43 UTC by Jeremy
Modified: 2020-12-14 08:53 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-07 14:41:43 UTC
Target Upstream Version:
Embargoed:



Description Jeremy 2017-06-16 19:43:54 UTC
Description of problem:

/cases/01870330/sosreport-20170614-135807/overcloud-controller-prod5b-0.localdomain/var/log/cinder/volume.log
 
### Volume creation starts:
2017-06-14 11:53:04.678 26097 DEBUG cinder.volume.manager [req-c070feb8-761c-463f-9d2d-761d0474837b c59125b2f80d402ba9251a90da6e0171 caaf3b67128749f3b96b69eab1b3bf35 - - -] Task 'cinder.volume.flows.manager.create_volume.ExtractVolumeSpecTask;volume:create' (8a1c7088-d2fa-49d3-af6d-dd2c8b0bb692) transitioned into state 'SUCCESS' from state 'RUNNING' with result '{'status': u'creating', 'volume_size': 100, 'volume_name': 'volume-f799e366-75dc-4c2b-9af0-ee93ef3197d2', 'type': 'raw', 'volume_id': 'f799e366-75dc-4c2b-9af0-ee93ef3197d2'}' _task_receiver /usr/lib/python2.7/site-packages/taskflow/listeners/logging.py:178
 
### Volume creation task completes successfully:
cinder.volume.flows.manager.create_volume.ExtractVolumeRefTask;volume:create' (f70d0782-405e-4e4e-9918-9a1854908696) transitioned into state 'SUCCESS' from state 'RUNNING'
 
### Volume cannot be found on the backend:
2017-06-14 11:53:05.941 26097 DEBUG oslo_concurrency.processutils [-] u'/opt/Navisphere/bin/naviseccli -address 10.224.52.96 -user bete -password beter00t -scope global -np lun -list -name volume-f799e366-75dc-4c2b-9af0-ee93ef3197d2 -state -status -opDetails -userCap -owner -attachedSnapshot' failed. Not Retrying. execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:422
2017-06-14 11:53:05.942 26097 DEBUG cinder.volume.drivers.emc.emc_vnx_cli [-] EMC: Command: ('/opt/Navisphere/bin/naviseccli', '-address', '10.224.52.96', '-np', 'lun', '-list', '-name', 'volume-f799e366-75dc-4c2b-9af0-ee93ef3197d2', '-state', '-status', '-opDetails', '-userCap', '-owner', '-attachedSnapshot'). Result: Could not retrieve the specified (pool lun). The (pool lun) may not exist\n\n. _command_execute_on_active_ip /usr/lib/python2.7/site-packages/cinder/volume/drivers/emc/emc_vnx_cli.py:2056
 
 
### Volume ends up in error state:
+--------------------------------------+--------+------------------------+------+-------------+----------+-------------+
|                  ID                  | Status |          Name          | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+------------------------+------+-------------+----------+-------------+
| f799e366-75dc-4c2b-9af0-ee93ef3197d2 | error  |  elasticsearch_volume  | 100  |      -      |  false   |             |
+--------------------------------------+--------+------------------------+------+-------------+----------+-------------+
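For reference, the backend check that failed can be re-run by hand with the same naviseccli arguments the driver logged. A minimal Python wrapper, with the address, flags, and volume name copied from the DEBUG lines above and the credentials replaced by placeholders:

import subprocess

# Re-issue the exact LUN query from the DEBUG line above to check
# whether the LUN is visible on the array. Arguments are copied
# verbatim from the log; only the credentials are placeholders.
cmd = ['/opt/Navisphere/bin/naviseccli',
       '-address', '10.224.52.96',
       '-user', 'USER', '-password', 'PASSWORD', '-scope', 'global',
       '-np', 'lun', '-list',
       '-name', 'volume-f799e366-75dc-4c2b-9af0-ee93ef3197d2',
       '-state', '-status', '-opDetails', '-userCap', '-owner',
       '-attachedSnapshot']
try:
    print(subprocess.check_output(cmd))
except subprocess.CalledProcessError as exc:
    # The driver saw: "Could not retrieve the specified (pool lun).
    # The (pool lun) may not exist"
    print('LUN query failed: %s' % exc)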


Version-Release number of selected component (if applicable):
openstack-cinder-8.1.0-1.el7ost.noarch

How reproducible:
intermittent

Steps to Reproduce:
1. run heat stack
2. notice some volumes fail
3.

Actual results:
volumes fail

Expected results:
volumes created.

Additional info:

Comment 2 Jeremy 2017-06-19 00:21:27 UTC
Update: 
It seems this problem happens when creating volumes from images. As a test, a loop was run that created volumes from images; the test script left 5 of 200 volumes in an error state. We see multipath issues:

2017-06-17 18:05:09.303 2120 ERROR cinder.volume.manager ImageCopyFailure: Failed to copy image to volume: Failed to execute command multipath -l /dev/disk/by-path/ip-10.148.43.135:3260-iscsi-iqn.1992-04.com.emc:cx.ckm00171601134.a3-lun-149
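The test loop described above can be approximated with python-cinderclient. A hypothetical reproduction sketch; the credentials, auth URL, and image UUID are placeholders, not values from the case:

import time

from cinderclient import client

# Hypothetical reproduction: create volumes from an image in a loop
# and count how many end up in 'error'. Credentials, auth URL, and
# image UUID are placeholders.
cinder = client.Client('2', 'USER', 'PASSWORD', 'PROJECT',
                       'http://KEYSTONE:5000/v2.0')

failed = []
for i in range(200):
    vol = cinder.volumes.create(100, name='repro-%03d' % i,
                                imageRef='IMAGE-UUID')
    # Poll until the volume settles in a terminal state.
    while vol.status not in ('available', 'error'):
        time.sleep(5)
        vol = cinder.volumes.get(vol.id)
    if vol.status == 'error':
        failed.append(vol.id)

print('%d of 200 volumes failed: %s' % (len(failed), failed))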

It appears to try the /dev/mapper address first and, after multiple failed attempts, switches to addressing by path; see the log excerpt below.
In each volume failure case the volume actually exists on the EMC array.

2017-06-18 02:35:50.445 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Have been at this for 6.036 seconds _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:40
2017-06-18 02:35:50.446 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Checking to see if /dev/mapper/36006016076e04400f3b8bc80ce53e711 exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:163
2017-06-18 02:35:50.446 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] /dev/mapper/36006016076e04400f3b8bc80ce53e711 doesn't exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:165
2017-06-18 02:35:50.447 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Failed attempt 1 _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:38
2017-06-18 02:35:50.448 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Have been at this for 0.001 seconds _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:40
2017-06-18 02:35:50.448 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Sleeping for 2 seconds _backoff_sleep /usr/lib/python2.7/site-packages/os_brick/utils.py:33
2017-06-18 02:35:52.484 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Checking to see if /dev/mapper/36006016076e04400f3b8bc80ce53e711 exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:163
2017-06-18 02:35:52.487 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] /dev/mapper/36006016076e04400f3b8bc80ce53e711 doesn't exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:165
2017-06-18 02:35:52.487 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Failed attempt 2 _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:38
2017-06-18 02:35:52.488 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Have been at this for 2.042 seconds _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:40
2017-06-18 02:35:52.488 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Sleeping for 4 seconds _backoff_sleep /usr/lib/python2.7/site-packages/os_brick/utils.py:33
2017-06-18 02:35:56.489 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Checking to see if /dev/mapper/36006016076e04400f3b8bc80ce53e711 exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:163
2017-06-18 02:35:56.492 2120 DEBUG os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] /dev/mapper/36006016076e04400f3b8bc80ce53e711 doesn't exists yet. wait_for_path /usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py:165
2017-06-18 02:35:56.493 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Failed attempt 3 _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:38
2017-06-18 02:35:56.493 2120 DEBUG os_brick.utils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Have been at this for 6.047 seconds _print_stop /usr/lib/python2.7/site-packages/os_brick/utils.py:40
2017-06-18 02:35:56.494 2120 WARNING os_brick.initiator.linuxscsi [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] couldn't find a valid multipath device path for 36006016076e04400f3b8bc80ce53e711
2017-06-18 02:35:56.495 2120 DEBUG oslo_concurrency.processutils [req-15a31bdd-158f-4b53-b575-f5bee74e945c 9481bf9971604ee986034e0a542f1cdb 7ae5391e0a87438e9b4fb530e06c861e - - -] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf multipath -l /dev/disk/by-path/ip-10.148.44.7:3260-iscsi-iqn.1992-04.com.emc:cx.ckm00171601134.a1-lun-130 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:344
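For reference, the retry pattern visible above (attempts 1-3 with 2 s and 4 s sleeps, then a fallback to the by-path device) is a standard exponential backoff. A minimal sketch of that behavior, not the actual os_brick implementation:

import os
import time

def wait_for_mpath_device(wwn, max_attempts=3):
    """Poll for the /dev/mapper node with exponential backoff, as in
    the os_brick log excerpt above (sketch, not the real code)."""
    path = '/dev/mapper/%s' % wwn
    for attempt in range(1, max_attempts + 1):
        if os.path.exists(path):
            return path
        if attempt < max_attempts:
            time.sleep(2 ** attempt)  # 2 s after attempt 1, 4 s after 2
    return None  # caller then falls back to /dev/disk/by-path/...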

Comment 3 Gorka Eguileor 2017-06-21 12:24:06 UTC

*** This bug has been marked as a duplicate of bug 1432315 ***

Comment 4 Jeremy 2017-06-29 13:23:01 UTC
Hello,
The customer applied the fix from bug 1432315; however, the issue persists:

'Failure' object has no attribute '__getitem__'

Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.919 38828 ERROR cinder.volume.manager DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.919 38828 ERROR cinder.volume.manager
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher [req-47d4cd80-5e2d-4331-a6c2-f96c07d54f23 50e6177297fc434993703a59ac801dc4 32c4a188f6ba44d794cc3b3c911caf2e - - -] Exception during message handling: 'Failure' object has no attribute '__getitem__'
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher     incoming.message))
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 654, in create_volume
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher     self._update_allocated_capacity(vol_ref)
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 2727, in _update_allocated_capacity
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher     pool = vol_utils.extract_host(vol['host'], 'pool')
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher TypeError: 'Failure' object has no attribute '__getitem__'
Jun 29 07:35:57 overcloud-controller-prodca-0 cinder-volume: 2017-06-29 07:35:57.928 38828 ERROR oslo_messaging.rpc.dispatcher
Jun 29 07:37:04 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:3: multipath: error getting device
Jun 29 07:37:04 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:06 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:0: multipath: error getting device
Jun 29 07:37:06 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:19 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:4: multipath: error getting device
Jun 29 07:37:19 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:21 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:1: multipath: error getting device
Jun 29 07:37:21 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:23 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:1: multipath: error getting device
Jun 29 07:37:23 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:26 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:0: multipath: error getting device
Jun 29 07:37:26 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:52 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:0: multipath: error getting device
Jun 29 07:37:52 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
Jun 29 07:37:53 overcloud-controller-prodca-0 kernel: device-mapper: table: 253:0: multipath: error getting device
Jun 29 07:37:53 overcloud-controller-prodca-0 kernel: device-mapper: ioctl: error adding target to table
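
The TypeError above looks like a secondary failure: the DBConnectionError aborts the create_volume flow, taskflow hands back a Failure object in place of the volume, and _update_allocated_capacity then subscripts it as if it were a volume. A minimal illustration of the mechanism; this Failure class is a stand-in, not the real taskflow.types.failure.Failure:

class Failure(object):
    """Stand-in for taskflow.types.failure.Failure: wraps the exception
    raised inside the flow instead of the task's normal result."""
    def __init__(self, exc):
        self.exc = exc

# The flow fails, so the "volume" handed back is a Failure...
vol_ref = Failure(Exception('Lost connection to MySQL server during query'))

# ...and manager.py then indexes it like a volume dict/object:
try:
    host = vol_ref['host']
except TypeError as exc:
    # Python 2 reports: 'Failure' object has no attribute '__getitem__'
    # (Python 3 would say: 'Failure' object is not subscriptable)
    print(exc)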


Sosreport on the collab shell is here:
/cases/01870330/sosreport-20170629-111343/overcloud-controller-prodca-0.localdomain/

Comment 5 Eric Harney 2017-09-07 14:41:43 UTC
A lot of work has been done on multipath recently, which should help this issue. This needs to be retested with the updates from bug 1440576 and bug 1396112.

*** This bug has been marked as a duplicate of bug 1396112 ***

