Bug 1511698 - Enabling Halo sets volume to Read Only
Summary: Enabling Halo sets volume to Read Only
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-09 22:20 UTC by Jon Cope
Modified: 2023-09-14 04:11 UTC
CC List: 6 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-03-12 12:38:54 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Node1-glusterd.log (537.78 KB, text/plain), 2017-11-10 17:02 UTC, Jon Cope
Node2-glusterd.log (177.71 KB, text/plain), 2017-11-10 17:06 UTC, Jon Cope
Node3-glusterd.log (21.34 KB, text/plain), 2017-11-10 17:07 UTC, Jon Cope
Node4-glusterd.log (21.12 KB, text/plain), 2017-11-10 17:08 UTC, Jon Cope
Node1-glustershd.log (127.05 KB, text/plain), 2017-11-10 17:09 UTC, Jon Cope
Node2-glustershd.log (138.55 KB, text/plain), 2017-11-10 17:10 UTC, Jon Cope
Node3-glustershd.log (8.54 KB, text/plain), 2017-11-10 17:11 UTC, Jon Cope
Node4-glustershd.log (8.54 KB, text/plain), 2017-11-10 17:11 UTC, Jon Cope
Node1-data-brick-gv0.log (53.09 KB, text/plain), 2017-11-10 17:13 UTC, Jon Cope
Node2-data-brick-gv0.log (33.81 KB, text/plain), 2017-11-10 17:14 UTC, Jon Cope
Node3-data-brick-gv0.log (12.42 KB, text/plain), 2017-11-10 17:14 UTC, Jon Cope
Node4-data-brick-gv0.log (12.42 KB, text/plain), 2017-11-10 17:15 UTC, Jon Cope

Description Jon Cope 2017-11-09 22:20:07 UTC
Description of problem:

In GCE, I have 4 instances, each with one 10 GB brick. Two instances are in the US and the other two are in Asia (with the hope that this drives up I/O latency sufficiently). The bricks make up a replica 4 volume. Before I enable halo, I can mount the volume and read/write files.

However, when I set `cluster.halo-enabled` to `yes`, I can no longer write to the volume:

[root@jcope-rhs-g2fn vol]# touch /mnt/vol/test1
touch: setting times of ‘test1’: Read-only file system.

Thanks to a helpful user on the mailing list, setting these volume values resolves the issue (see the commands sketched below):

cluster.quorum-type fixed 
cluster.quorum-count 2
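
A minimal sketch of applying that workaround with the gluster CLI, assuming the volume name gv0 used elsewhere in this report:

# gluster volume set gv0 cluster.quorum-type fixed
# gluster volume set gv0 cluster.quorum-count 2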

Version-Release number of selected component (if applicable):

glusterfs-client-xlators-3.12.1-2.el7.x86_64
glusterfs-libs-3.12.1-2.el7.x86_64
glusterfs-api-3.12.1-2.el7.x86_64
glusterfs-server-3.12.1-2.el7.x86_64
glusterfs-3.12.1-2.el7.x86_64
glusterfs-cli-3.12.1-2.el7.x86_64
glusterfs-fuse-3.12.1-2.el7.x86_64

How reproducible:

100%

Steps to Reproduce:
1.  Set up a replica volume.
2.  Enable halo (gluster volume set gv0 cluster.halo-enabled yes).
3.  Write to the volume (see the command sketch after these steps).
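
A command sketch of those steps, assuming the gv0 volume name, the four gce-node* hosts and /data/brick/gv0 brick paths shown in the volume info below, and the /mnt/vol mount point from the description:

# gluster volume create gv0 replica 4 gce-node1:/data/brick/gv0 gce-node2:/data/brick/gv0 gce-node3:/data/brick/gv0 gce-node4:/data/brick/gv0
# gluster volume start gv0
# gluster volume set gv0 cluster.halo-enabled yes
# mount -t glusterfs gce-node1:/gv0 /mnt/vol
# touch /mnt/vol/test1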

Actual results:
touch: setting times of ‘test1’: Read-only file system 

Expected results:
File written to volume

Additional info:

Comment 1 Sanoj Unnikrishnan 2017-11-10 09:09:31 UTC
Hi Jon,
Could you please share the volume info, volume status, brick logs, client logs, and self-heal daemon logs from all the nodes?

Comment 2 Jon Cope 2017-11-10 17:02:59 UTC
Created attachment 1350594 [details]
Node1-glusterd.log

Comment 3 Jon Cope 2017-11-10 17:05:26 UTC
Here are the volume info and status. I'll attach the logs individually as text.

# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 24831bec-32bb-46a6-9507-6b4c8a8dd14f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: gce-node1:/data/brick/gv0
Brick2: gce-node2:/data/brick/gv0
Brick3: gce-node3:/data/brick/gv0
Brick4: gce-node4:/data/brick/gv0
Options Reconfigured:
cluster.halo-enabled: yes
transport.address-family: inet
nfs.disable: on

# gluster volume status
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gce-node1:/data/brick/gv0             49152     0          Y       19918
Brick gce-node2:/data/brick/gv0             49152     0          Y       14428
Brick gce-node3:/data/brick/gv0             49152     0          Y       2676
Brick gce-node4:/data/brick/gv0             49152     0          Y       2518
Self-heal Daemon on localhost               N/A       N/A        Y       19939
Self-heal Daemon on gce-node2               N/A       N/A        Y       14449
Self-heal Daemon on gce-node4               N/A       N/A        Y       2539
Self-heal Daemon on gce-node3               N/A       N/A        Y       2697

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

Comment 4 Jon Cope 2017-11-10 17:06:39 UTC
Created attachment 1350597 [details]
Node2-glusterd.log

Comment 5 Jon Cope 2017-11-10 17:07:28 UTC
Created attachment 1350598 [details]
Node3-glusterd.log

Comment 6 Jon Cope 2017-11-10 17:08:18 UTC
Created attachment 1350599 [details]
Node4-glusterd.log

Comment 7 Jon Cope 2017-11-10 17:09:50 UTC
Created attachment 1350600 [details]
Node1-glustershd.log

Comment 8 Jon Cope 2017-11-10 17:10:30 UTC
Created attachment 1350601 [details]
Node2-glustershd.log

Comment 9 Jon Cope 2017-11-10 17:11:00 UTC
Created attachment 1350602 [details]
Node3-glustershd.log

Comment 10 Jon Cope 2017-11-10 17:11:42 UTC
Created attachment 1350604 [details]
Node4-glustershd.log

Comment 11 Jon Cope 2017-11-10 17:13:23 UTC
Created attachment 1350605 [details]
Node1-data-brick-gv0.log

Comment 12 Jon Cope 2017-11-10 17:14:06 UTC
Created attachment 1350606 [details]
Node2-data-brick-gv0.log

Comment 13 Jon Cope 2017-11-10 17:14:44 UTC
Created attachment 1350607 [details]
Node3-data-brick-gv0.log

Comment 14 Jon Cope 2017-11-10 17:15:27 UTC
Created attachment 1350608 [details]
Node4-data-brick-gv0.log

Comment 15 Jon Cope 2017-11-10 17:21:03 UTC
To add to the previous comment, here are the relevant config values when I reproduced the bug for the attached logs (a query sketch follows the list):

cluster.quorum-type                     none
cluster.quorum-count                    (null)
cluster.server-quorum-type              off
cluster.server-quorum-ratio             0
cluster.quorum-reads                    no
cluster.halo-enabled                    yes  # set by me.
cluster.halo-shd-max-latency            99999
cluster.halo-nfsd-max-latency           5
cluster.halo-max-latency                5
cluster.halo-max-replicas               99999
cluster.halo-min-replicas               2
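
For reference, a sketch of one way such values can be listed on this version, assuming the gv0 volume name (the grep filter is just one way to narrow the output):

# gluster volume get gv0 all | grep -E 'halo|quorum'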

Comment 16 Shyamsundar 2018-10-23 13:59:42 UTC
Bug is moved to mainline, as there has been no analysis of the data presented and 3.12 has reached EOL.

Request reporter (@JonCope) to attempt reproduction on later releases and refresh the provided data.

Request @Rafi to post any findings and/or updates here.

Comment 17 Worker Ant 2020-03-12 12:38:54 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/918 and will be tracked there from now on. Visit the GitHub issue URL for further details.

Comment 18 Red Hat Bugzilla 2023-09-14 04:11:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.

