Created attachment 765094 [details]
logs

Description of problem:
When running `cinder create --display_name <name> 5` while the external NFS
server is unavailable, the volume is stuck in "creating" instead of moving to
"error" (snapshot-create, for example, will end up in status "error").

Version-Release number of selected component (if applicable):
openstack-cinder-2013.1.2-3.el6ost.noarch
python-cinderclient-1.0.4-1.el6ost.noarch
python-cinder-2013.1.2-3.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. add an NFS server to cinder
2. block connectivity to the server using iptables
3. create a volume

Actual results:
volume is stuck in "creating" status instead of moving to "error"

Expected results:
we should have consistency and move the status to "error"

Additional info: logs

Creating a volume is stuck in "creating":

[root@opens-vdsb tmp(keystone_admin)]# cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 | available | orion-share3 |  5   |     None    |  false   |                                      |
| 6d7ff4c0-86f7-40b6-b59d-99e412bf6bbd |   in-use  | orion-share  |  5   |     None    |  false   | 8f107ee9-809f-45c3-b062-520968e53bb5 |
| 7325035d-01c7-4f8b-bb3e-7c2aac40fb1a |  creating | orion-share7 |  5   |     None    |  false   |                                      |
| b88a6220-e22d-4297-a072-516a146f9474 |  creating | orion-share6 |  5   |     None    |  false   |                                      |
| e243132e-3f70-4a02-b414-3cc450edc1f0 |  creating | orion-share5 |  5   |     None    |  false   |                                      |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+

Creating a snapshot results in "error":

[root@opens-vdsb ~(keystone_admin)]# cinder snapshot-create 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|      created_at     |      2013-06-25T13:30:23.477040      |
| display_description |                 None                 |
|     display_name    |                 None                 |
|          id         | fc8862e0-f3cc-4cec-b3c9-931182c6e856 |
|       metadata      |                  {}                  |
|         size        |                  5                   |
|        status       |               creating               |
|      volume_id      | 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 |
+---------------------+--------------------------------------+

[root@opens-vdsb ~(keystone_admin)]# cinder snapshot-list
+--------------------------------------+--------------------------------------+--------+--------------+------+
|                  ID                  |              Volume ID               | Status | Display Name | Size |
+--------------------------------------+--------------------------------------+--------+--------------+------+
| fc8862e0-f3cc-4cec-b3c9-931182c6e856 | 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 | error  |     None     |  5   |
+--------------------------------------+--------------------------------------+--------+--------------+------+
Just for my own testing, how did you block connectivity w/ iptables?

Note that the snapshot-create operation fails because the NFS driver doesn't support snapshots; that is unrelated to the NFS server's availability.
iptables -A OUTPUT -d <nfs_server> -j DROP
The volume is never moved to "error" because no error is being raised. The
reason is that the NFS mount is stalled and the commands (stat, du, ...) have
*way too long* timeouts. Besides this, the NFS client never raises an error.

A workaround for this is setting the parameter below in `cinder.conf`:

    nfs_mount_options=soft,timeo=20,retrans=1,retry=0

soft    -> report an error back to the caller when an NFS operation fails
timeo   -> timeout (in deciseconds) for each NFS request
retrans -> number of times an NFS request should be retried
retry   -> number of minutes a `mount` operation should be retried

I believe `soft` should be present by default, `timeo` should be set to 40
(4s), and `retrans` to 2. These sound like saner default values to me. The
user can manually change them if necessary.
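For reference, a minimal sketch of how those suggested defaults would look in `cinder.conf` (assuming the options live in the `[DEFAULT]` section, as in a typical Grizzly-era setup; adjust the NFS shares config path to your deployment):

```ini
[DEFAULT]
# NFS shares list used by the cinder NFS driver (path is deployment-specific)
nfs_shares_config=/etc/cinder/nfs_shares

# Suggested mount options:
#   soft      - fail NFS operations with an error instead of hanging forever
#   timeo=40  - 4s per-request timeout (timeo is measured in deciseconds)
#   retrans=2 - retry each request twice before soft returns an error
#   retry=0   - do not keep retrying the mount itself
nfs_mount_options=soft,timeo=40,retrans=2,retry=0
```

With `soft` set, a stalled `stat`/`du` against the dead export returns an I/O error after the timeout instead of blocking, which lets the volume move to "error".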