1298264 – Cinder volumes stuck in downloading state

Bug 1298264 - Cinder volumes stuck in downloading state

Summary: Cinder volumes stuck in downloading state

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	python-oslo-messaging
Sub Component:
Version:	6.0 (Juno)
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	async
Target Release:	6.0 (Juno)
Assignee:	Eric Harney
QA Contact:	lkuchlan
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2016-01-13 16:00 UTC by Jeremy
Modified:	2023-09-14 03:19 UTC (History)
CC List:	13 users (show)
Fixed In Version:	python-oslo-messaging-1.4.1-8.el7ost
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1301634 (view as bug list)
Environment:
Last Closed:	2016-02-18 17:44:49 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	OSP-28586	0	None	None	None	2023-09-14 03:18:13 UTC
Red Hat Product Errata	RHBA-2016:0267	0	normal	SHIPPED_LIVE	python-oslo-messaging bug fix advisory	2016-02-18 22:44:28 UTC

Description Jeremy 2016-01-13 16:00:21 UTC

Description of problem:
[root@phllcu401 ~(cloud_admin_v2)]# cinder show af405553-e0cf-4f3c-a8b0-dc772422ed40
+---------------------------------------+--------------------------------------------------------------------------------------------+
|                Property               |                                           Value                                            |
+---------------------------------------+--------------------------------------------------------------------------------------------+
|              attachments              |                                             []                                             |
|           availability_zone           |                                  SungardAS-IntegrationLab                                  |
|                bootable               |                                           false                                            |
|               created_at              |                                 2016-01-13T06:53:25.000000                                 |
|          display_description          |                                                                                            |
|              display_name             |                                                                                            |
|               encrypted               |                                           False                                            |
|                   id                  |                            af405553-e0cf-4f3c-a8b0-dc772422ed40                            |
|                metadata               |                                             {}                                             |
|         os-vol-host-attr:host         | phllcu402@netapp#phllnasnet1000-storage.edisonqa.sgns.net:/phllnasnet1000v_vol8_uat_cinder |
|     os-vol-mig-status-attr:migstat    |                                            None                                            |
|     os-vol-mig-status-attr:name_id    |                                            None                                            |
|      os-vol-tenant-attr:tenant_id     |                              396f6c55aba34d8fade57c21ae00ed5d                              |
|   os-volume-replication:driver_data   |                                            None                                            |
| os-volume-replication:extended_status |                                            None                                            |
|                  size                 |                                             80                                             |
|              snapshot_id              |                                            None                                            |
|              source_volid             |                                            None                                            |
|                 status                |                                        downloading                                         |
|              volume_type              |                                           netapp                                           |
+---------------------------------------+--------------------------------------------------------------------------------------------+
 
 
 
21672: ERROR cinder.volume.drivers.netapp.nfs [req-818755ac-04de-4447-a5be-56034df0cfbd 2fa2cef88f9742448fd8180054eeade5 396f6c55aba34d8fade57c21ae00ed5d - - -] Error in workflow copy from cache. Unexpected error while running command.
Command: None
Exit code: -
Stdout: u"Unexpected error while running command.\nCommand: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-0890828d-41f9-4268-8834-dffe250a4ab4 /phllnasnet1000v_vol8_uat_cinder/volume-af405553-e0cf-4f3c-a8b0-dc772422ed40\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
Stderr: None.
21672: TRACE cinder.volume.drivers.netapp.nfs Traceback (most recent call last):
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/nfs.py", line 1226, in _copy_from_cache
21672: TRACE cinder.volume.drivers.netapp.nfs     check_exit_code=0)
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 142, in execute
21672: TRACE cinder.volume.drivers.netapp.nfs     return processutils.execute(*cmd, **kwargs)
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/openstack/common/processutils.py", line 200, in execute
21672: TRACE cinder.volume.drivers.netapp.nfs     cmd=sanitized_cmd)
21672: TRACE cinder.volume.drivers.netapp.nfs ProcessExecutionError: Unexpected error while running command.
21672: TRACE cinder.volume.drivers.netapp.nfs Command: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-0890828d-41f9-4268-8834-dffe250a4ab4 /phllnasnet1000v_vol8_uat_cinder/volume-af405553-e0cf-4f3c-a8b0-dc772422ed40
21672: TRACE cinder.volume.drivers.netapp.nfs Exit code: 24
21672: TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'
21672: TRACE cinder.volume.drivers.netapp.nfs Stderr: u''
21672: TRACE cinder.volume.drivers.netapp.nfs



Version-Release number of selected component (if applicable):
openstack-cinder-2014.2.3-3.el7ost.noarch 
python-cinderclient-1.1.1-1.el7ost.noarch  

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Restart of cinder services, or server reboot solve the problem temporarily, but it seems to occur again fairly quiclky.

Comment 2 Jon Bernard 2016-01-13 16:37:59 UTC

Huge number of rabbit missing exchanges, I think your messaging configuration is suspect.  I didn't see any proper cinder exceptions there beyond the copyoffload, but that will fallback to an unoptimized operation.

Comment 9 Jeremy 2016-01-18 14:40:31 UTC

Hello,
We are still having issues with volumes stuck in downloading state after patching rabbitmq. Below is the error we are seeing:

volume.log:Stdout: u"Unexpected error while running command.\nCommand: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-d5b5ff16-e609-4aed-938c-7bda3921fc64 /phllnasnet1000v_vol8_uat_cinder/volume-07f0cd02-6cf4-4b83-b370-5f8d05b4f0cf\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
volume.log:14172: TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'

Comment 10 Mehdi ABAAKOUK 2016-01-18 15:22:09 UTC

Do you still seen issue with oslo.messaging/rabbit ? Or only the cinder one is present ?

Comment 11 Freddy Wissing 2016-01-18 15:24:23 UTC

They have reported that the oslo.messaging/rabbit issue no longer manifests, but the original symptom of it (the cinder volumes stuck in downloading state) is still there.

Comment 14 Rich Boyce 2016-01-21 11:57:36 UTC

We are experiencing this issue in our environment as well. Here is the relevant part of /var/log/cinder/volume.log:

2016-01-21 11:17:32.702 65499 ERROR cinder.volume.drivers.netapp.nfs [req-2030a5c0-258f-455b-b312-431448f1d6e2 a1c9ba1fc9964a0f845aef41a22de7b5 2dc4661120364eaba4744eb0348a81b3 - - -] Error 
in workflow copy from cache. Unexpected error while running command.
Command: None
Exit code: -
Stdout: u"Unexpected error while running command.\nCommand: /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/volume-1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
Stderr: None.
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Traceback (most recent call last):
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/nfs.py", line 1226, in _copy_from_cache
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     check_exit_code=0)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 142, in execute
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     return processutils.execute(*cmd, **kwargs)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/openstack/common/processutils.py", line 200, in execute
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     cmd=sanitized_cmd)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs ProcessExecutionError: Unexpected error while running command.
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Command: /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/volume-1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Exit code: 24
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Stderr: u''
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs 

Here's the resulting cinder volume:

[root@ctrl002 mnt(openstack_admin)]# cinder show 1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d
+---------------------------------------+----------------------------------------------------------+
|                Property               |                          Value                           |
+---------------------------------------+----------------------------------------------------------+
|              attachments              |                            []                            |
|           availability_zone           |                           nova                           |
|                bootable               |                          false                           |
|               created_at              |                2016-01-21T11:15:53.000000                |
|          display_description          |                                                          |
|              display_name             |                                                          |
|               encrypted               |                          False                           |
|                   id                  |           1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d           |
|                metadata               |                            {}                            |
|         os-vol-host-attr:host         | ha-controller#10.35.4.12:/vol/hh_extcl03_nfsprod01_vol03 |
|     os-vol-mig-status-attr:migstat    |                           None                           |
|     os-vol-mig-status-attr:name_id    |                           None                           |
|      os-vol-tenant-attr:tenant_id     |             2dc4661120364eaba4744eb0348a81b3             |
|   os-volume-replication:driver_data   |                           None                           |
| os-volume-replication:extended_status |                           None                           |
|                  size                 |                            20                            |
|              snapshot_id              |                           None                           |
|              source_volid             |                           None                           |
|                 status                |                       downloading                        |
|              volume_type              |                           None                           |
+---------------------------------------+----------------------------------------------------------+


And when running the /etc/cinder/na_copyoffload_64 command manually:

[root@ctrl002 mnt]# /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/rbtest210116
Program exiting with return code 10008.

We are not seeing any rabbitmq issues as far as I can tell - what logs should I check for that?

Comment 16 Lon Hohberger 2016-01-28 21:44:36 UTC

Is this bug a cinder issue or oslo-messaging issue?  If it's "both", please clone to cinder, as we have a fix for oslo-messaging.

Comment 17 lkuchlan 2016-02-01 08:00:47 UTC

The latest rhos6 does not contain the petch python-oslo-messaging-1.4.1-8.el7ost

Tested using:
python-oslo-messaging-1.4.1-7.el7ost.noarch

Comment 19 lkuchlan 2016-02-04 17:05:49 UTC

Tested using:
python-oslo-messaging-1.4.1-8.el7ost.noarch
python-cinder-2014.2.3-11.el7ost.noarch
python-cinderclient-1.1.1-1.el7ost.noarch
openstack-cinder-2014.2.3-11.el7ost.noarch

Verification instructions:
1. Configure cinder and glance for using Netapp NFS driver as back-end
2. Create a volume from image

Results:

[root@cougar01 ~]# mount | grep 'glance\|cinder'
10.35.64.11:/vol/vol_rhos_auto_nfs on /var/lib/cinder/mnt/b412ab27c39458e94f0027cd1aff3877 type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.35.64.11,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.35.64.11)
10.35.64.11:/vol/vol_rhos_auto_nfs on /var/lib/glance/images type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.35.64.11,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.35.64.11)

[root@cougar01 ~(keystone_admin)]# cinder create 1 --image cbe8b84d-8269-4126-8216-286fd42b454e
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|     attachments     |                  []                  |
|  availability_zone  |                 nova                 |
|       bootable      |                false                 |
|      created_at     |      2016-02-04T15:27:44.371045      |
| display_description |                 None                 |
|     display_name    |                 None                 |
|      encrypted      |                False                 |
|          id         | 2615ab3b-a649-4d2f-b4e9-0d880d7f8d79 |
|       image_id      | cbe8b84d-8269-4126-8216-286fd42b454e |
|       metadata      |                  {}                  |
|         size        |                  1                   |
|     snapshot_id     |                 None                 |
|     source_volid    |                 None                 |
|        status       |               creating               |
|     volume_type     |                 None                 |
+---------------------+--------------------------------------+

[root@cougar01 ~(keystone_admin)]# cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
|                  ID                  |   Status  | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| 2615ab3b-a649-4d2f-b4e9-0d880d7f8d79 | available |     None     |  1   |     None    |   true   |             |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+

Comment 21 errata-xmlrpc 2016-02-18 17:44:49 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0267.html

Comment 22 Red Hat Bugzilla 2023-09-14 03:16:03 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

Note You need to log in before you can comment on or make changes to this bug.