Created attachment 765094 [details]
logs

Description of problem:
When running `cinder create --display_name <name> 5` while the external NFS
server is unavailable, the volume is stuck in "creating" instead of moving to
"error" (snapshot-create, for example, will end up in status "error").

Version-Release number of selected component (if applicable):
openstack-cinder-2013.1.2-3.el6ost.noarch
python-cinderclient-1.0.4-1.el6ost.noarch
python-cinder-2013.1.2-3.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. add an NFS server to cinder
2. block connectivity to the server using iptables
3. create a volume

Actual results:
volume is stuck in "creating" status instead of moving to "error"

Expected results:
we should have consistency and move the status to "error"

Additional info: logs

Creating a volume is stuck in "creating":

[root@opens-vdsb tmp(keystone_admin)]# cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 | available | orion-share3 |  5   |     None    |  false   |                                      |
| 6d7ff4c0-86f7-40b6-b59d-99e412bf6bbd |   in-use  | orion-share  |  5   |     None    |  false   | 8f107ee9-809f-45c3-b062-520968e53bb5 |
| 7325035d-01c7-4f8b-bb3e-7c2aac40fb1a |  creating | orion-share7 |  5   |     None    |  false   |                                      |
| b88a6220-e22d-4297-a072-516a146f9474 |  creating | orion-share6 |  5   |     None    |  false   |                                      |
| e243132e-3f70-4a02-b414-3cc450edc1f0 |  creating | orion-share5 |  5   |     None    |  false   |                                      |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+

Creating a snapshot results in "error":

[root@opens-vdsb ~(keystone_admin)]# cinder snapshot-create 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|      created_at     |      2013-06-25T13:30:23.477040      |
| display_description |                 None                 |
|     display_name    |                 None                 |
|          id         | fc8862e0-f3cc-4cec-b3c9-931182c6e856 |
|       metadata      |                  {}                  |
|         size        |                  5                   |
|        status       |               creating               |
|      volume_id      | 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 |
+---------------------+--------------------------------------+

[root@opens-vdsb ~(keystone_admin)]# cinder snapshot-list
+--------------------------------------+--------------------------------------+--------+--------------+------+
|                  ID                  |              Volume ID               | Status | Display Name | Size |
+--------------------------------------+--------------------------------------+--------+--------------+------+
| fc8862e0-f3cc-4cec-b3c9-931182c6e856 | 0d1c1913-0e9d-4de4-87d1-2d051b6fcfc8 | error  |     None     |  5   |
+--------------------------------------+--------------------------------------+--------+--------------+------+
Just for my own testing, how did you block connectivity w/ iptables?

Note that the snapshot-create operation fails because the NFS driver doesn't support snapshots; that is unrelated to the NFS server's availability.
iptables -A OUTPUT -d <nfs_server> -j DROP
The volume is never moved to "error" because no error is being raised. The
reason is that the NFS mount is stalled and the commands (stat, du, ...) have
*way too long* timeouts. Besides this, the NFS client never raises an error.

A workaround for this is setting the parameter below in `cinder.conf`:

    nfs_mount_options=soft,timeo=20,retrans=1,retry=0

soft    -> report an error back to the caller when an NFS operation fails
timeo   -> timeout (in deciseconds) for each NFS request
retrans -> number of times an NFS request should be retried
retry   -> number of minutes a `mount` operation should be retried

I believe `soft` should be present by default, `timeo` should be set to 40
(4s), and `retrans` to 2. These sound like saner default values to me. The
user can manually change them if necessary.
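For reference, a minimal sketch of how those suggested defaults would look in `cinder.conf` (assuming the options live in the `[DEFAULT]` section, as in a typical Grizzly-era setup; adjust the NFS shares config path to your deployment):

```ini
[DEFAULT]
# NFS shares list used by the cinder NFS driver (path is deployment-specific)
nfs_shares_config=/etc/cinder/nfs_shares

# Suggested mount options:
#   soft      - fail NFS operations with an error instead of hanging forever
#   timeo=40  - 4s per-request timeout (timeo is measured in deciseconds)
#   retrans=2 - retry each request twice before soft returns an error
#   retry=0   - do not keep retrying the mount itself
nfs_mount_options=soft,timeo=40,retrans=2,retry=0
```

With `soft` set, a stalled `stat`/`du` against the dead export returns an I/O error after the timeout instead of blocking, which lets the volume move to "error".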