Bug 1177603

Summary: CTDB: On a 1X2 ctdb setup, after reboot, ctdb status shows OK even when /gluster/lock is not mounted.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: surabhi <sbhaloth>
Component: ctdb
Assignee: Anoop C S <anoopcs>
Status: CLOSED ERRATA
QA Contact: Vivek Das <vdas>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.0
CC: akrishna, anoopcs, bkunal, gdeschner, ira, madam, nlevinki, rhs-smb, sanandpa, sankarshan, sheggodu
Target Milestone: ---
Keywords: Reopened, ZStream
Target Release: RHGS 3.4.z Batch Update 3
Hardware: Unspecified
OS: Unspecified
Whiteboard: ctdb
Fixed In Version: samba-4.8.5-103.el7rhgs, samba-4.8.5-103.el6rhs
Doc Type: Bug Fix
Doc Text:
Systemd reads dependency and ordering information from unit files. Previously, the Clustered Trivial Database (CTDB) service did not provide correct dependency information. As a consequence, the CTDB service tried to come up before the lock volume listed in the file systems table (fstab) was mounted. Because the node then failed to synchronize with the other nodes in the cluster, it entered an unhealthy state. With this fix, a dependency on the remote network file systems target is added to the CTDB systemd service file, so the lock volume is mounted and the lock file is available to CTDB during its startup. (A sketch of this kind of ordering dependency follows the metadata fields below.)
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-04 07:36:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1649191    
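
The following is a minimal sketch of the kind of ordering dependency described in the Doc Text above. It is written as a systemd drop-in created from the shell; the drop-in file name is hypothetical and the shipped fix modifies the packaged ctdb.service itself, but the directives (After=remote-fs.target, RequiresMountsFor=) illustrate the behaviour:

# Sketch only: delay CTDB until remote filesystems from fstab, including the
# lock volume, are mounted. Drop-in name and lock path are assumptions.
mkdir -p /etc/systemd/system/ctdb.service.d
cat > /etc/systemd/system/ctdb.service.d/lockvol.conf <<'EOF'
[Unit]
After=remote-fs.target
RequiresMountsFor=/gluster/lock
EOF
systemctl daemon-reload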

Description surabhi 2014-12-29 11:06:11 UTC
Description of problem:
****************************************
In a 1x2 CTDB setup, if both nodes are rebooted, /gluster/lock does not get mounted on one of the nodes once that node is up, yet ctdb status shows OK for that node.
On the other node, where /gluster/lock is mounted, ctdb status shows UNHEALTHY.
Once a node is rebooted and /gluster/lock is mounted, the ctdb status should be consistent.
As per BZ https://bugzilla.redhat.com/show_bug.cgi?id=1164222, if both nodes are down then



Version-Release number of selected component (if applicable):
**************************************************************
glusterfs-3.6.0.40-1.el6rhs.x86_64
samba-glusterfs-3.6.509-169.4.el6rhs.x86_64

How reproducible:
*************************************************************
Always

Steps to Reproduce:
1. Create a 1x2 volume and do the ctdb setup.
2. Reboot one node in the cluster; check ctdb status and the mount point (see the check sketch below).
3. Reboot the other node; check ctdb status and the mount point.
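
For reference, a minimal sketch of the checks in steps 2 and 3; the lock file path /gluster/lock/lockfile is an assumption based on the mount point used in this report:

# Run on each node after it comes back up.
ctdb status                      # CTDB's own view of node health (OK / UNHEALTHY)
mount | grep /gluster/lock       # is the lock volume actually mounted?
df -h /gluster/lock              # should show the glusterfs volume, not the root filesystem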

Actual results:
**************************************************************
Once the nodes are up, the node where the gluster mount did not happen shows ctdb status as OK, while the other node, where the mount did happen, shows ctdb status as UNHEALTHY.


Expected results:
****************************************************************
After the nodes come up, /gluster/lock should be mounted on both nodes and ctdb status should report the correct state.


Additional info:

Comment 3 surabhi 2016-06-28 06:20:35 UTC
After redoing the setup with the public and private interfaces configured separately, where the gluster volume uses the public IPs and the ctdb nodes use the private IPs, rebooting one of the nodes does not cause the ctdb node to stay in an unhealthy state once the gluster lock is mounted.

Also, as I learned, if the gluster lock is not mounted then ctdb will create and access its own lock and will show status as OK.
As per discussion with the glusterd team, on a two-node gluster cluster, if one node goes down the gluster services may not come up once that node is back (because 50% of the nodes in the cluster were down), and so the lock does not get mounted.
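
A quick way to confirm whether the gluster side actually recovered after the reboot; the lock volume name 'ctdb' is an assumption, substitute the real volume name:

# Did glusterd and the bricks come back on the rebooted node?
gluster peer status
gluster volume status ctdb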

Closing this BZ as WORKSFORME. If I see a similar issue in a 4-node setup, we will reopen the BZ.

Comment 4 surabhi 2016-06-29 08:47:52 UTC
As mentioned in the above comment, ctdb creates its own lock and ctdb status shows OK even if the gluster lock is not mounted; that happens because the lock file gets created on the root filesystem, under the unmounted mount point path. We may want to have the lock file created in a sub-directory of the mount point and not on the root, so that the path only exists when the volume is actually mounted.
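
A sketch of how to check where the lock file actually lives; the lock file path /gluster/lock/lockfile is an assumption:

# If the two device IDs match, /gluster/lock is not a separate mount and the
# lock file is sitting on the root filesystem.
stat -c %d /
stat -c %d /gluster/lock
df /gluster/lock/lockfile        # shows which filesystem actually holds the lock file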

Reopening the BZ to get this fixed.

Comment 11 Michael Adam 2018-04-10 10:05:13 UTC
*** Bug 1202328 has been marked as a duplicate of this bug. ***

Comment 25 Anjana KD 2019-01-29 03:09:02 UTC
Updated the doc text. Kindly review it for technical accuracy.

Comment 28 errata-xmlrpc 2019-02-04 07:36:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0261