Description of problem:

[root@phllcu401 ~(cloud_admin_v2)]# cinder show af405553-e0cf-4f3c-a8b0-dc772422ed40
+---------------------------------------+--------------------------------------------------------------------------------------------+
| Property                              | Value                                                                                      |
+---------------------------------------+--------------------------------------------------------------------------------------------+
| attachments                           | []                                                                                         |
| availability_zone                     | SungardAS-IntegrationLab                                                                   |
| bootable                              | false                                                                                      |
| created_at                            | 2016-01-13T06:53:25.000000                                                                 |
| display_description                   |                                                                                            |
| display_name                          |                                                                                            |
| encrypted                             | False                                                                                      |
| id                                    | af405553-e0cf-4f3c-a8b0-dc772422ed40                                                       |
| metadata                              | {}                                                                                         |
| os-vol-host-attr:host                 | phllcu402@netapp#phllnasnet1000-storage.edisonqa.sgns.net:/phllnasnet1000v_vol8_uat_cinder |
| os-vol-mig-status-attr:migstat        | None                                                                                       |
| os-vol-mig-status-attr:name_id        | None                                                                                       |
| os-vol-tenant-attr:tenant_id          | 396f6c55aba34d8fade57c21ae00ed5d                                                           |
| os-volume-replication:driver_data     | None                                                                                       |
| os-volume-replication:extended_status | None                                                                                       |
| size                                  | 80                                                                                         |
| snapshot_id                           | None                                                                                       |
| source_volid                          | None                                                                                       |
| status                                | downloading                                                                                |
| volume_type                           | netapp                                                                                     |
+---------------------------------------+--------------------------------------------------------------------------------------------+

21672: ERROR cinder.volume.drivers.netapp.nfs [req-818755ac-04de-4447-a5be-56034df0cfbd 2fa2cef88f9742448fd8180054eeade5 396f6c55aba34d8fade57c21ae00ed5d - - -] Error in workflow copy from cache. Unexpected error while running command.
Command: None
Exit code: -
Stdout: u"Unexpected error while running command.\nCommand: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-0890828d-41f9-4268-8834-dffe250a4ab4 /phllnasnet1000v_vol8_uat_cinder/volume-af405553-e0cf-4f3c-a8b0-dc772422ed40\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
Stderr: None.
21672: TRACE cinder.volume.drivers.netapp.nfs Traceback (most recent call last):
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/nfs.py", line 1226, in _copy_from_cache
21672: TRACE cinder.volume.drivers.netapp.nfs     check_exit_code=0)
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 142, in execute
21672: TRACE cinder.volume.drivers.netapp.nfs     return processutils.execute(*cmd, **kwargs)
21672: TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/openstack/common/processutils.py", line 200, in execute
21672: TRACE cinder.volume.drivers.netapp.nfs     cmd=sanitized_cmd)
21672: TRACE cinder.volume.drivers.netapp.nfs ProcessExecutionError: Unexpected error while running command.
21672: TRACE cinder.volume.drivers.netapp.nfs Command: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-0890828d-41f9-4268-8834-dffe250a4ab4 /phllnasnet1000v_vol8_uat_cinder/volume-af405553-e0cf-4f3c-a8b0-dc772422ed40
21672: TRACE cinder.volume.drivers.netapp.nfs Exit code: 24
21672: TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'
21672: TRACE cinder.volume.drivers.netapp.nfs Stderr: u''

Version-Release number of selected component (if applicable):
openstack-cinder-2014.2.3-3.el7ost.noarch
python-cinderclient-1.1.1-1.el7ost.noarch

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Restarting the cinder services, or rebooting the server, resolves the problem temporarily, but it seems to recur fairly quickly.
There is a huge number of missing RabbitMQ exchanges; I think your messaging configuration is suspect. I didn't see any proper cinder exceptions there beyond the copyoffload failure, but that will fall back to an unoptimized copy operation.
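The fall-back behavior described here can be sketched as a small, hypothetical Python pattern; the function and callback names are illustrative only, not the real driver API:

```python
def create_volume_from_image(volume, image, copy_offload, plain_copy):
    """Try the fast copy-offload path first; fall back to a plain copy on error.

    `copy_offload` stands in for invoking /usr/bin/na_copyoffload_64, and
    `plain_copy` for the slower host-side copy the driver falls back to.
    """
    try:
        # Fast path: ask the filer to copy the cached image server-side.
        copy_offload(volume, image)
        return "offloaded"
    except RuntimeError:
        # The non-zero exit (exit code 24 / return code 10008 in the logs
        # above) lands here; the volume is still created, just more slowly.
        plain_copy(volume, image)
        return "fallback"
```

The key point for triage: a failing copyoffload binary alone should not leave a volume stuck in `downloading`; it only disables the optimization.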
Hello,

We are still having issues with volumes stuck in the downloading state after patching rabbitmq. Below is the error we are seeing:

volume.log:Stdout: u"Unexpected error while running command.\nCommand: /usr/bin/na_copyoffload_64 10.140.231.6 10.140.231.6 /phllnasnet1000v_vol9_uat_cinder/img-cache-d5b5ff16-e609-4aed-938c-7bda3921fc64 /phllnasnet1000v_vol8_uat_cinder/volume-07f0cd02-6cf4-4b83-b370-5f8d05b4f0cf\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
volume.log:14172: TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'
Do you still see the issue with oslo.messaging/rabbit, or is only the cinder one present?
They have reported that the oslo.messaging/rabbit issue no longer manifests, but the original symptom of it (the cinder volumes stuck in downloading state) is still there.
We are experiencing this issue in our environment as well. Here is the relevant part of /var/log/cinder/volume.log:

2016-01-21 11:17:32.702 65499 ERROR cinder.volume.drivers.netapp.nfs [req-2030a5c0-258f-455b-b312-431448f1d6e2 a1c9ba1fc9964a0f845aef41a22de7b5 2dc4661120364eaba4744eb0348a81b3 - - -] Error in workflow copy from cache. Unexpected error while running command.
Command: None
Exit code: -
Stdout: u"Unexpected error while running command.\nCommand: /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/volume-1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d\nExit code: 24\nStdout: u'Program exiting with return code 10008.\\n'\nStderr: u''"
Stderr: None.
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Traceback (most recent call last):
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/nfs.py", line 1226, in _copy_from_cache
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     check_exit_code=0)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 142, in execute
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     return processutils.execute(*cmd, **kwargs)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs   File "/usr/lib/python2.7/site-packages/cinder/openstack/common/processutils.py", line 200, in execute
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs     cmd=sanitized_cmd)
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs ProcessExecutionError: Unexpected error while running command.
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Command: /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/volume-1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Exit code: 24
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Stdout: u'Program exiting with return code 10008.\n'
2016-01-21 11:17:32.702 65499 TRACE cinder.volume.drivers.netapp.nfs Stderr: u''

Here's the resulting cinder volume:

[root@ctrl002 mnt(openstack_admin)]# cinder show 1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d
+---------------------------------------+----------------------------------------------------------+
| Property                              | Value                                                    |
+---------------------------------------+----------------------------------------------------------+
| attachments                           | []                                                       |
| availability_zone                     | nova                                                     |
| bootable                              | false                                                    |
| created_at                            | 2016-01-21T11:15:53.000000                               |
| display_description                   |                                                          |
| display_name                          |                                                          |
| encrypted                             | False                                                    |
| id                                    | 1f75a6ee-6de8-43fb-9ac8-3aceb990bf3d                     |
| metadata                              | {}                                                       |
| os-vol-host-attr:host                 | ha-controller#10.35.4.12:/vol/hh_extcl03_nfsprod01_vol03 |
| os-vol-mig-status-attr:migstat        | None                                                     |
| os-vol-mig-status-attr:name_id        | None                                                     |
| os-vol-tenant-attr:tenant_id          | 2dc4661120364eaba4744eb0348a81b3                         |
| os-volume-replication:driver_data     | None                                                     |
| os-volume-replication:extended_status | None                                                     |
| size                                  | 20                                                       |
| snapshot_id                           | None                                                     |
| source_volid                          | None                                                     |
| status                                | downloading                                              |
| volume_type                           | None                                                     |
+---------------------------------------+----------------------------------------------------------+

And when running the /etc/cinder/na_copyoffload_64 command manually:

[root@ctrl002 mnt]# /etc/cinder/na_copyoffload_64 10.35.4.12 10.35.4.12 /vol/hh_extcl03_nfsprod01_vol01/img-cache-8f74dd15-46ae-4ce2-b33a-75459ff8869a /vol/hh_extcl03_nfsprod01_vol03/rbtest210116
Program exiting with return code 10008.

We are not seeing any RabbitMQ issues as far as I can tell; what logs should I check for that?
Is this bug a cinder issue or oslo-messaging issue? If it's "both", please clone to cinder, as we have a fix for oslo-messaging.
The latest RHOS 6 does not contain the patch (python-oslo-messaging-1.4.1-8.el7ost).

Tested using:
python-oslo-messaging-1.4.1-7.el7ost.noarch
Tested using:
python-oslo-messaging-1.4.1-8.el7ost.noarch
python-cinder-2014.2.3-11.el7ost.noarch
python-cinderclient-1.1.1-1.el7ost.noarch
openstack-cinder-2014.2.3-11.el7ost.noarch

Verification instructions:
1. Configure cinder and glance to use the NetApp NFS driver as a back-end
2. Create a volume from an image

Results:

[root@cougar01 ~]# mount | grep 'glance\|cinder'
10.35.64.11:/vol/vol_rhos_auto_nfs on /var/lib/cinder/mnt/b412ab27c39458e94f0027cd1aff3877 type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.35.64.11,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.35.64.11)
10.35.64.11:/vol/vol_rhos_auto_nfs on /var/lib/glance/images type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.35.64.11,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.35.64.11)

[root@cougar01 ~(keystone_admin)]# cinder create 1 --image cbe8b84d-8269-4126-8216-286fd42b454e
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| created_at          | 2016-02-04T15:27:44.371045           |
| display_description | None                                 |
| display_name        | None                                 |
| encrypted           | False                                |
| id                  | 2615ab3b-a649-4d2f-b4e9-0d880d7f8d79 |
| image_id            | cbe8b84d-8269-4126-8216-286fd42b454e |
| metadata            | {}                                   |
| size                | 1                                    |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| volume_type         | None                                 |
+---------------------+--------------------------------------+

[root@cougar01 ~(keystone_admin)]# cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID                                   | Status    | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| 2615ab3b-a649-4d2f-b4e9-0d880d7f8d79 | available | None         | 1    | None        | true     |             |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
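For step 1 of the verification, the copy-offload path is wired up in the back-end section of cinder.conf via the driver's `netapp_copyoffload_tool_path` option. A minimal sketch, assuming a cluster-mode NFS back-end; the section name, share file path, and binary location are examples, not values taken from this deployment:

```ini
# Illustrative cinder.conf back-end section for the NetApp unified driver.
[netapp]
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_cluster
netapp_storage_protocol = nfs
nfs_shares_config = /etc/cinder/nfs_shares
# Absolute path to the NetApp copy-offload binary. If this command fails
# (as in the logs above), the driver falls back to a slower host-side copy.
netapp_copyoffload_tool_path = /usr/bin/na_copyoffload_64
```

If the option is omitted, the driver simply never attempts the offloaded copy, which can be a quick way to rule the binary out while debugging.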
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0267.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days