After a poweroff/poweron of the 2 "arbiter:disabled" nodes, new volume creation fails with "Create Volume Build Failed: read-only file system"

Description of problem:
+++++++++++++++++++++++++
We were testing the new arbiter feature for heketi. As per the test case, "arbiter:disabled" was set on 2 nodes and "arbiter:required" on one node in a 3 node CNS cluster. We were supposed to bring down the two data nodes and try to create a volume (to confirm that volume creation fails). However, even after bringing the failed nodes back up, we are now unable to create any volume via PVC or heketi-cli. The volume creations fail with the error message: "Create Volume Build Failed: read-only file system"

Also, the heal status for the "heketidbstorage" volume reports "Transport endpoint is not connected", whereas heal info for all other existing volumes shows no errors. As "heketidbstorage" is a plain 1 x 3 replicate volume where none of the 3 bricks is an arbiter, the reason for getting this error only for the "heketidbstorage" volume is unclear.

Note:
+++++++++
a) The Storage Class used for creating volumes has "volumeoptions=user.heketi.arbiter true" set, so a 1 x (2 + 1) = 3 PVC/volume should be created (see the sketch after this description).
b) 2 nodes were tagged as "arbiter:disabled" and one as "arbiter:required". The two data nodes were powered off for some time.
c) After power on, the cluster peer status is "Connected" and we are able to use the already existing volumes. Only new volume creations are failing.

Arbiter:disabled tagged nodes = Data nodes
Arbiter:required tagged node  = Arbiter node

Following is a snippet of the error message seen in the heketi logs:
--------------------------------------
[heketi] INFO 2018/05/14 04:59:08 Placing brick with discounted size: 81920
[heketi] ERROR 2018/05/14 04:59:08 /src/github.com/heketi/heketi/apps/glusterfs/operations.go:973: Create Volume Build Failed: read-only file system
[negroni] Completed 500 Internal Server Error in 23.685005ms

PVC creation error message
-------------------------------
[root@dhcp47-29 home]# oc describe pvc down-pvc
Name:          down-pvc
Namespace:     glusterfs
StorageClass:  gluster-container
Status:        Pending
Volume:
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-class=gluster-container
               volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/glusterfs
Finalizers:    []
Capacity:
Access Modes:
Events:
  Type     Reason              Age                From                          Message
  ----     ------              ----               ----                          -------
  Warning  ProvisioningFailed  14s (x2 over 15s)  persistentvolume-controller   Failed to provision volume with StorageClass "gluster-container": create volume error: error creating volume Failed to allocate new volume: read-only file system

Heketi-cli volume creation error message
----------------------------------------------
[root@dhcp47-29 home]# heketi-cli volume create --size=3 --name=down$i --gluster-volume-options='user.heketi.arbiter true'
Error: Failed to allocate new volume: read-only file system

Version-Release number of selected component (if applicable):
++++++++++++++++++++++++++
CNS 3.9 with arbiter support

[root@dhcp47-29 home]# oc rsh heketi-storage-1-hfwjv
sh-4.2# rpm -qa| grep heketi
python-heketi-6.0.0-11.el7rhgs.x86_64
heketi-6.0.0-11.el7rhgs.x86_64
heketi-client-6.0.0-11.el7rhgs.x86_64
sh-4.2#

How reproducible:
++++++++++++++++++++++++++
The issue is reproducible whenever we power down and bring back up the "data-only" nodes in a 3 node CNS cluster. When all 3 nodes are powered off and resumed, we do not encounter any issue with volume creation after poweron.
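For reference, the node tagging and StorageClass setup from the Note above were along the following lines. This is only a sketch: the node IDs, resturl and secret values are placeholders (not taken from this cluster), and it assumes the heketi-cli "node settags" subcommand and the glusterfs provisioner's "volumeoptions" StorageClass parameter available in this CNS 3.9 build.
-----------------------------------------------------------
# Tag the two data nodes and the arbiter node (node IDs are placeholders;
# they can be listed with `heketi-cli node list`).
heketi-cli node settags <data-node-1-id> arbiter:disabled
heketi-cli node settags <data-node-2-id> arbiter:disabled
heketi-cli node settags <arbiter-node-id> arbiter:required

# StorageClass carrying "user.heketi.arbiter true" so that PVCs provision
# 1 x (2 + 1) = 3 arbiter volumes (resturl/secret values are placeholders).
oc create -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gluster-container
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage.glusterfs.svc.cluster.local:8080"
  restuser: "admin"
  secretNamespace: "glusterfs"
  secretName: "heketi-storage-admin-secret"
  volumeoptions: "user.heketi.arbiter true"
EOF
-----------------------------------------------------------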
Steps to Reproduce:
++++++++++++++++++++++++++
1. Create a CNS setup with the new "arbiter support" for heketi.
2. In the running CNS setup with arbiter support, tag 2 nodes as "arbiter:disabled" and one node as "arbiter:required".
3. Power off the 2 nodes for which the "arbiter:disabled" tag is set, i.e. the 2 data nodes.
4. Power on both nodes after some time and wait until the "Ready" status of all the gluster pods changes to "1/1".
5. Try creating a new volume via heketi-cli/PVC with the following type of command (a sketch of the post-poweron checks is included after the status output below):
# heketi-cli volume create --size=3 --name=down$i --gluster-volume-options='user.heketi.arbiter true'

Full details of the issue will be shared shortly, along with heketi and glusterd logs.

Actual results:
+++++++++++++++++++
Even after resuming the 2 data nodes, and with gluster peer status showing "Connected", we are unable to create any more volumes.

Expected results:
++++++++++++++++++++
Once all nodes are up and the 3 node CNS cluster is in Ready state, volume creation should be successful.

Additional info:
+++++++++++++++++++
Some information about the heketidbstorage volume
+++++++++++++++++++++++++++++++++++++++++++++++++++++
Mounted on: master node

[root@dhcp47-29 home]# df -kh| grep heketidb
10.70.46.114:heketidbstorage  2.0G  33M  2.0G  2%  /var/lib/origin/openshift.local.volumes/pods/8a837427-5508-11e8-a3b2-005056a54e86/volumes/kubernetes.io~glusterfs/db
[root@dhcp47-29 home]#
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[root@dhcp47-29 home]# oc exec glusterfs-storage-t9mdg -- gluster v heal heketidbstorage info
Brick 10.70.46.114:/var/lib/heketi/mounts/vg_66ede0d16ea3218ed31b15fb585f58cf/brick_b7e7cf184f37a4c4c2f8fb42f50180e4/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick 10.70.46.130:/var/lib/heketi/mounts/vg_c891346cafb7ab29cdf641beca71de68/brick_1521644fdd65bafe550e0672cb05d5b9/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick 10.70.46.212:/var/lib/heketi/mounts/vg_2e906b85353509bcfe3b6fdeb296a0f3/brick_0fbc5b61989e5747b9cd3f3fadc44ea3/brick
/container.log
/
/heketi.db
Status: Connected
Number of entries: 3

[root@dhcp47-29 home]#
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[root@dhcp47-29 home]# oc exec glusterfs-storage-t9mdg -- gluster v info heketidbstorage
Volume Name: heketidbstorage
Type: Replicate
Volume ID: 5a729da8-6f6c-4059-8c96-026c43fb19bc
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.46.114:/var/lib/heketi/mounts/vg_66ede0d16ea3218ed31b15fb585f58cf/brick_b7e7cf184f37a4c4c2f8fb42f50180e4/brick
Brick2: 10.70.46.130:/var/lib/heketi/mounts/vg_c891346cafb7ab29cdf641beca71de68/brick_1521644fdd65bafe550e0672cb05d5b9/brick
Brick3: 10.70.46.212:/var/lib/heketi/mounts/vg_2e906b85353509bcfe3b6fdeb296a0f3/brick_0fbc5b61989e5747b9cd3f3fadc44ea3/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
cluster.brick-multiplex: on
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[root@dhcp47-29 home]# oc exec glusterfs-storage-t9mdg -- gluster v info status
Volume status does not exist
command terminated with exit code 1
[root@dhcp47-29 home]# oc exec glusterfs-storage-t9mdg -- gluster v status heketidbstorage
Status of volume: heketidbstorage
Gluster process                                                                                              TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------------------------------------------------------------
Brick 10.70.46.114:/var/lib/heketi/mounts/vg_66ede0d16ea3218ed31b15fb585f58cf/brick_b7e7cf184f37a4c4c2f8fb42f50180e4/brick   49153   0   Y   1022
Brick 10.70.46.130:/var/lib/heketi/mounts/vg_c891346cafb7ab29cdf641beca71de68/brick_1521644fdd65bafe550e0672cb05d5b9/brick   49153   0   Y   972
Brick 10.70.46.212:/var/lib/heketi/mounts/vg_2e906b85353509bcfe3b6fdeb296a0f3/brick_0fbc5b61989e5747b9cd3f3fadc44ea3/brick   49153   0   Y   1025
Self-heal Daemon on localhost                                                                                 N/A       N/A        Y       4041
Self-heal Daemon on dhcp46-114.lab.eng.blr.redhat.com                                                         N/A       N/A        Y       794
Self-heal Daemon on 10.70.46.130                                                                              N/A       N/A        Y       777

Task Status of Volume heketidbstorage
---------------------------------------------------------------------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp47-29 home]#
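For completeness, the post-poweron checks from steps 4 and 5 above, and the per-volume heal comparison mentioned in the description, were along these lines. This is only a sketch: the gluster pod name is reused from the outputs above, and the loop simply repeats the heal-info check for every volume.
-----------------------------------------------------------
# Wait for the gluster pods to go back to 1/1 Ready.
oc get pods -n glusterfs -o wide | grep glusterfs-storage

# Confirm that all peers show as connected again.
oc exec glusterfs-storage-t9mdg -- gluster peer status

# Compare heal info across all volumes; in this setup only heketidbstorage
# reports "Transport endpoint is not connected".
for vol in $(oc exec glusterfs-storage-t9mdg -- gluster volume list); do
    echo "===== $vol ====="
    oc exec glusterfs-storage-t9mdg -- gluster volume heal "$vol" info | grep -E 'Brick|Status'
done
-----------------------------------------------------------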
sh-4.2# gluster vol status heketidbstorage
Status of volume: heketidbstorage
Gluster process                                                                                              TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------------------------------------------------------------
Brick 10.70.46.114:/var/lib/heketi/mounts/vg_66ede0d16ea3218ed31b15fb585f58cf/brick_b7e7cf184f37a4c4c2f8fb42f50180e4/brick   49153   0   Y   1022
Brick 10.70.46.130:/var/lib/heketi/mounts/vg_c891346cafb7ab29cdf641beca71de68/brick_1521644fdd65bafe550e0672cb05d5b9/brick   49153   0   Y   972
Brick 10.70.46.212:/var/lib/heketi/mounts/vg_2e906b85353509bcfe3b6fdeb296a0f3/brick_0fbc5b61989e5747b9cd3f3fadc44ea3/brick   49153   0   Y   1025
Self-heal Daemon on localhost                                                                                 N/A       N/A        Y       777
Self-heal Daemon on 10.70.46.212                                                                              N/A       N/A        Y       4041
Self-heal Daemon on dhcp46-114.lab.eng.blr.redhat.com                                                         N/A       N/A        Y       794

Task Status of Volume heketidbstorage
---------------------------------------------------------------------------------------------------------------------------------------------
There are no active volume tasks

sh-4.2# exit
exit
[root@dhcp47-29 ~]# ssh root@10.70.46.130
Last login: Thu May 17 00:32:28 2018 from 10.70.47.29
[root@dhcp46-130 ~]# netstat -tnap | grep 972
tcp        0      0 10.70.46.130:49152      10.70.46.114:972        ESTABLISHED 7173/glusterfsd
tcp        0      0 10.70.46.130:972        10.70.46.114:24007      TIME_WAIT   -

According to gluster status, the brick on this node is supposed to be listening on 49153, but the netstat output says otherwise. The only gluster brick process on that node is listening on 49152:

tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      7173/glusterfsd

Adding a needinfo on Mohit to determine whether this is a duplicate of any bug he is aware of.
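The port mismatch described above can be cross-checked with something like the following (a sketch; ss is used alongside netstat, and the grep filters are only illustrative):
-----------------------------------------------------------
# Port that glusterd believes the brick on 10.70.46.130 is using (49153 here).
gluster volume status heketidbstorage | grep 10.70.46.130

# Ports the brick processes are actually listening on (only 49152 here).
ss -ltnp | grep glusterfsd        # or: netstat -ltnp | grep glusterfsd

# PIDs and brick arguments of the running glusterfsd processes, to match the
# PID from the netstat/ss output against the one reported by gluster status.
ps -ef | grep '[g]lusterfsd'
-----------------------------------------------------------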
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2986