Bug 1356942 - Problem with persistent mount on quorum fail when using SSL
Summary: Problem with persistent mount on quorum fail when using SSL
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.7.13
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On: 1354394
Blocks:
 
Reported: 2016-07-15 10:37 UTC by Klas Mattsson
Modified: 2017-03-08 10:48 UTC (History)
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-08 10:48:49 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
Logfile (28.73 KB, text/plain)
2016-07-15 10:37 UTC, Klas Mattsson
logfile2 (1.72 KB, text/plain)
2016-07-15 10:56 UTC, Klas Mattsson
Concatenated logfiles from all servers. (3.25 MB, text/plain)
2016-07-15 11:16 UTC, Klas Mattsson
Second version of logs. (10.59 KB, application/x-gzip)
2016-07-15 16:34 UTC, Klas Mattsson

Description Klas Mattsson 2016-07-15 10:37:44 UTC
Created attachment 1180086 [details]
Logfile

Description of problem:
When quorum is lost and then regained while SSL is in use, the clients are unable to regain write access.
In some cases, the mount turns stale as well.
mount.glusterfs is being used.

Version-Release number of selected component (if applicable):
3.7.13

How reproducible:
Every time in my lab.

Steps to Reproduce (a minimal command-line sketch follows the list):
1. Create a replica volume across 3 peers (uncertain whether other counts also apply), with or without an arbiter.
2. Activate SSL (self-signed certs; the CA file holds all client and server pems on the servers, and the server pems on the clients).
3. Activate quorum.
4. Mount on a client.
5. Turn off two nodes.
6. The mount becomes read-only.
7. Start one or both of the stopped nodes.
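
For reference, a minimal sketch of such a setup; the hostnames (server1..server3), brick paths, and volume name (myvol) are placeholders, not taken from this report, and the SSL file locations are the standard GlusterFS ones:

  # On every server and client: key plus self-signed cert at the standard paths
  openssl genrsa -out /etc/ssl/glusterfs.key 2048
  openssl req -new -x509 -key /etc/ssl/glusterfs.key -subj "/CN=test" -days 365 -out /etc/ssl/glusterfs.pem
  # CA file: concatenate the relevant pem files as described in step 2
  cat server1.pem server2.pem server3.pem client1.pem > /etc/ssl/glusterfs.ca

  # On one server: create the replica-3 volume, enable SSL and client quorum, start it
  gluster volume create myvol replica 3 server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1
  gluster volume set myvol client.ssl on
  gluster volume set myvol server.ssl on
  gluster volume set myvol cluster.quorum-type auto
  gluster volume start myvol

  # On the client
  mount -t glusterfs server1:/myvol /mnt/03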

Actual results:
The mount is still read-only.

Expected results:
The mount should recover.


Additional info:

Comment 1 Klas Mattsson 2016-07-15 10:41:49 UTC
Note that the "date" command in the command line log is UTC+2.

Comment 2 Atin Mukherjee 2016-07-15 10:51:49 UTC
Could you paste the output of 'gluster v status'?

Comment 3 Atin Mukherjee 2016-07-15 10:53:23 UTC
And which quorum is it? Server or client?

Comment 4 Klas Mattsson 2016-07-15 10:56:39 UTC
Created attachment 1180102 [details]
logfile2

Comment 5 Klas Mattsson 2016-07-15 10:58:16 UTC
Added logfile.
This occurs with and without arbiter.
In this case, the volume is a 3 node replica without arbiter.

I still don't remember whether it was server or client quorum, but I activated it with:
gluster volume set ${volume} cluster.quorum-type auto
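
For clarity, that option sets client-side quorum; the server-side counterpart is a separate option. A sketch of both, with the volume name as a placeholder:

  gluster volume set ${volume} cluster.quorum-type auto            # client quorum (enforced in AFR)
  gluster volume set ${volume} cluster.server-quorum-type server   # server quorum (enforced by glusterd)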

Comment 6 rjoseph 2016-07-15 10:59:50 UTC
Could you also post the brick logs and glusterd logs? Is SSL activated on both the I/O and mgmt connections? I see that I/O encryption is enabled.

Comment 7 Klas Mattsson 2016-07-15 11:02:57 UTC
I'll get those logs (I assume from all servers).

/var/lib/glusterd/secure-access
That's the mgmt connection, right?
If so, it's disabled here; the error occurred with it activated as well.

Comment 8 rjoseph 2016-07-15 11:06:17 UTC
Yes, the presence of the /var/lib/glusterd/secure-access file enables encryption on the mgmt connection.

Logs from all servers would be good, but I guess it would be OK to get logs from the server you mounted the volume from.
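
For reference, a sketch of toggling management encryption this way, run on every node (the service name is an assumption and differs by distro, e.g. glusterfs-server on Debian-based systems):

  touch /var/lib/glusterd/secure-access    # enable mgmt (glusterd) encryption
  systemctl restart glusterd
  rm /var/lib/glusterd/secure-access       # disable it again
  systemctl restart glusterd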

Comment 9 Klas Mattsson 2016-07-15 11:16:27 UTC
Created attachment 1180118 [details]
Concatenated logfiles from all servers.

Which log would glusterd logs be?
bricks  cli.log  cmd_history.log  etc-glusterfs-glusterd.vol.log  geo-replication  geo-replication-slaves  glustershd.log  nfs.log  snaps

I'm assuming glustershd.log and etc-glusterfs-glusterd.vol.log

I added those, plus the brick logs, concatenated.

Comment 10 rjoseph 2016-07-15 12:24:01 UTC
glustershd.log is the Self Heal Daemon log.
etc-glusterfs-glusterd.vol.log is the glusterd log.

Comment 11 Klas Mattsson 2016-07-15 12:26:55 UTC
Ah, ok.

You got both ;)

Comment 12 Atin Mukherjee 2016-07-15 12:51:27 UTC
FWIW, after regaining quorum, do you see the bricks running? That's why I asked for the gluster v status and gluster v info output.
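
For anyone following along, a sketch of the requested checks, run on any server:

  gluster peer status    # peers should show State: Peer in Cluster (Connected)
  gluster volume status  # each brick should be Online: Y, with a port and PID
  gluster volume info    # shows the quorum and SSL options set on the volume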

Comment 13 Klas Mattsson 2016-07-15 13:11:05 UTC
Oh, I forgot to mention one very central part, sorry.

If I mount the share again, everything is fine, whether at a new mount point or after unmounting and remounting at the old one.

This is only an issue for as long as the mount that was in place when quorum was lost remains.
And only when SSL is activated.
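
In other words, the workaround is a remount; roughly, using the mount point and volume that appear later in this report:

  umount /mnt/03    # or umount -l /mnt/03 if the mount has gone stale
  mount -t glusterfs server1.some.domain:klma03 /mnt/03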

Comment 14 Klas Mattsson 2016-07-15 16:34:58 UTC
Created attachment 1180205 [details]
Second version of logs.

New logs according to instructions from Joe on IRC.

Comment 15 Klas Mattsson 2016-07-15 16:36:38 UTC
mnt-03_2-.log is from a secondary mount made after the servers rebooted.
mnt-03_2 has full RW; mnt-03 is RO.
From mnt-03 I can read anything I write from mnt-03_2.

Comment 16 Mohit Agrawal 2016-07-18 10:23:29 UTC
Hi,

  I am able to reproduce the issue in my VM environment without enabling SSL. After checking the logs I think it is defined behavior. As you also mentioned, the issue occurs only while the mount from when quorum was lost remains.

Reproduction steps
  1) Create a 1x3 cluster
  2) Activate quorum (set cluster.quorum-type to auto)
  3) Run the command below on the client to append a small line to a file in a loop
     cd /mnt; cnt=0; while true; do echo "instance no. is $cnt"; echo "Start instance is $cnt" >> f1; cnt=`expr $cnt + 1`; done
  4) Mount the client and start the loop
  5) Kill glusterfsd on two nodes
  6) "Read-only file system" failure messages appear on the client
  7) Press CTRL-C to kill the while loop
  8) Checking the contents of f1 shows that no lines were appended after glusterfsd was killed
  9) Restart glusterd to start glusterfsd on one node; after quorum is regained, the client logs a message like "Connected to replicate-client-0"
  

The (Read-only file system) entries appear in mnt-03.log when quorum is not met and the client keeps trying to write a buffer to the mount point.

>>>>>>>>>>>>>>>

[2016-07-15 16:11:20.399760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 97: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:21.725593] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 105: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:22.310991] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 112: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:22.951805] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 120: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:43.370784] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 128: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:44.233100] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 135: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:45.018485] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 143: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:12:28.109760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 227: FLUSH() ERR => -1 (Read-only file system)

>>>>>>>>>>>>>>>

After quorum is regained, messages like the following appear in mnt.log:

>>>>>>>>>>>>>>>>>>>>>>

[2016-07-18 10:12:12.167800] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 0-replicate-client-1: Connected to replicate-client-1, attached to remote volume '/dist1/brick2'.
[2016-07-18 10:12:12.167841] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 0-replicate-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2016-07-18 10:12:12.167863] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 0-replicate-client-1: 1 fds open - Delaying child_up until they are re-opened

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>



As per the documentation, if cluster.quorum-type is auto, writes to a file are allowed only if the percentage of active replica bricks is more than 50% of the total number of bricks that constitute that replica, so I think this is expected behavior. After stopping glusterfsd on one node you will not get any error, but after killing glusterfsd on the other node you will get this kind of error message.
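
Concretely, for this 1x3 replica: with 3 of 3 or 2 of 3 bricks active, more than 50% are up and writes are allowed; with only 1 of 3 (about 33%) the client turns read-only until a second brick returns.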

Regards
Mohit Agrawal

Comment 17 Klas Mattsson 2016-07-18 10:41:13 UTC
Mohit:
That it goes read-only is expected, that is how it's supposed to work.
That is the functionality I want (to prevent split-brain).

The problem is that with SSL, it doesn't recover even after quorum has been reached.

So, the following flow is expected:
1: A mount is working, 3 servers are functioning, quorum has been reached.
2: A mount is still working, 2 servers are functioning, quorum has been reached.
3: A mount has failed, 1 server is functioning, quorum has not been reached.
4: A mount starts working again, 2 servers are functioning, quorum has been reached.

What actually happens:
1: A mount is working, 3 servers are functioning, quorum has been reached.
2: A mount is still working, 2 servers are functioning, quorum has been reached.
3: A mount has failed, 1 server is functioning, quorum has not been reached.
4: A mount is still failed, 2 servers are functioning, quorum has been reached.
5: A remount of the volume is required to make it work again, even though quorum has been reached.

For me, this does not happen when client.ssl and server.ssl are off, but happens when they are on.
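
For reference, the SSL toggle being compared here, with the volume name as a placeholder (note, as an assumption: changing these options on a running volume typically requires a volume stop/start to take effect):

  gluster volume set ${volume} client.ssl on   # encrypt the I/O path, client side
  gluster volume set ${volume} server.ssl on   # encrypt the I/O path, server side
  # set both to "off" to disable I/O encryption again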

Comment 18 Mohit Agrawal 2016-07-18 11:02:32 UTC
Even after enabling SSL (client.ssl/server.ssl on), I am not able to reproduce the issue following the steps I mentioned in Comment 16.

After restarting glusterd on the other two nodes, the messages below appear on the client side (in mnt.log) and writes work fine.


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>.

[2016-07-18 10:52:40.206263] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 12611: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 10:52:40.238430] I [rpc-clnt.c:1855:rpc_clnt_reconfig] 0-replicate-client-2: changing port to 49152 (from 0)
[2016-07-18 10:52:40.271836] I [socket.c:343:ssl_setup_connection] 0-replicate-client-2: peer CN =  test
[2016-07-18 10:52:40.271923] I [socket.c:346:ssl_setup_connection] 0-replicate-client-2: SSL verification succeeded (client: )
[2016-07-18 10:52:40.272245] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 12618: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 10:52:40.273095] I [MSGID: 114057] [client-handshake.c:1446:select_server_supported_programs] 0-replicate-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-07-18 10:52:40.277389] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 0-replicate-client-2: Connected to replicate-client-2, attached to remote volume '/dist1/brick3'.
[2016-07-18 10:52:40.277429] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 0-replicate-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2016-07-18 10:52:40.277446] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 0-replicate-client-2: 1 fds open - Delaying child_up until they are re-opened
[2016-07-18 10:52:40.278426] I [MSGID: 114041] [client-handshake.c:675:client_child_up_reopen_done] 0-replicate-client-2: last fd open'd/lock-self-heal'd - notifying CHILD-UP
[2016-07-18 10:52:40.278587] I [MSGID: 108002] [afr-common.c:4213:afr_notify] 0-replicate-replicate-0: Client-quorum is met
[2016-07-18 10:52:40.278803] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-replicate-client-2: Server lk version = 1


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

In your attached mnt-03_2.log there are no entries after the FLUSH (Read-only file system) messages.

Can you please confirm there are no entries in mnt.log after the FLUSH ERR messages?

Comment 19 Klas Mattsson 2016-07-18 11:30:31 UTC
Good if it's not a general bug!
Too bad it still happens to me!

Nothing new has appeared in the client logs since this Friday, barring the two read-only errors supplied below.
I also added the steps I use to test whether writing works:

root@client:/mnt# mount | grep /mnt/03
server1.some.domain:klma03 on /mnt/03 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
server1.some.domain:klma03 on /mnt/03_2 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
root@client:/mnt# date >> 03/date3
-bash: 03/date3: Read-only file system
root@client:/mnt# date >> 03_2/date3
root@client:/mnt# cat 03/date3
Mon 18 Jul 13:18:54 CEST 2016
root@client:/mnt# 
root@client:/mnt# tail -n 3 
[2016-07-15 16:12:28.109760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 227: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 11:17:02.217629] W [fuse-bridge.c:2002:fuse_create_cbk] 0-glusterfs-fuse: 253: /date3 => -1 (Read-only file system)
[2016-07-18 11:18:43.640304] W [fuse-bridge.c:2002:fuse_create_cbk] 0-glusterfs-fuse: 289: /date3 => -1 (Read-only file system)

Comment 20 Atin Mukherjee 2016-07-18 11:37:48 UTC
(In reply to Klas Mattsson from comment #17)
> Mohit:
> That it goes read-only is expected, that is how it's supposed to work.
> That is the functionality I want (to prevent split-brain).
> 
> The problem is that with SSL, it doesn't recover even after quorum has been
> reached.
> 
> So, the following flow is expected:
> 1: A mount is working, 3 servers are functioning, quorum has been reached.
> 2: A mount is still working, 2 servers are functioning, quorum has been
> reached.
> 3: A mount has failed, 1 server is functioning, quorum has not been reached.
> 4: A mount starts working again, 2 servers are functioning, quorum has been
> reached.
> 
> What actually happens:
> 1: A mount is working, 3 servers are functioning, quorum has been reached.
> 2: A mount is still working, 2 servers are functioning, quorum has been
> reached.
> 3: A mount has failed, 1 server is functioning, quorum has not been reached.
> 4: A mount is still failed, 2 servers are functioning, quorum has been reached.

When you mention that quorum is reached, could you provide the output of gluster peer status and gluster volume status?

> 5: A remount of the volume is required to make it work again, even though
> quorum has been reached.
> 
> For me, this does not happen when client.ssl and server.ssl is off, but
> happens when they are on.

Comment 21 Klas Mattsson 2016-07-18 11:42:18 UTC
I already have, in logfile2.
Well, not peer status, but all of them say connected, which is what I assume you are after?

Comment 22 Atin Mukherjee 2016-07-18 12:50:43 UTC
(In reply to Klas Mattsson from comment #21)
> I already have, in logfile2.
> Well, not peer status, but all of them say connected, which is what I assume
> you are after?

Right. We will deep dive into it more. Thanks Klas.

Comment 23 Klas Mattsson 2016-07-18 12:55:41 UTC
Please tell me if you need any more info, up to and including the .deb files being used.

Comment 24 Mohit Agrawal 2016-07-20 05:47:46 UTC
Hi Klas,
  
  I tried to reproduce the issue again in my VM environment. I am able to reproduce it, but the behavior is slightly different from what you were facing:
  I am not getting any FLUSH() ERR logs, but after restarting glusterfsd the mount connection turns stale.

  After debugging further on the client side why it hangs, I found that it happens due to the thread-reaper code in the socket_poller thread in the SSL case.

  The issue is already known and is fixed on the mainline branch (http://review.gluster.org/#/c/14694/); it will soon be merged into release-3.7 and 3.8 as well.

  I have tested the patch; after applying the fix it works in my environment.
  For the time being, let me know if you want to test the patch and I will share RPMs.

  Before you test the patch I want to share a disclaimer: it is only for testing purposes, not for production systems.

Regards
Mohit Agrawal

Comment 25 Klas Mattsson 2016-07-20 06:27:54 UTC
Ah, ok.

Sounds good.

I assume this will be in 3.7.14 or something along those lines as well?

Building my own packages atm.

Comment 26 Mohit Agrawal 2016-07-25 04:52:55 UTC
Hi Klas,

  Do you have any update?
  Did you face the issue again after applying the patch?

Regards
Mohit Agrawal

Comment 27 Mohit Agrawal 2016-07-26 13:59:17 UTC
Hi,

The patch was merged via the upstream commit below:

http://review.gluster.org/#/c/14886/

Regards
Mohit Agrawal

Comment 28 Klas Mattsson 2016-08-15 12:06:01 UTC
Hi,

Great, I've built new packages with 3.7.14 and I can't reproduce that error anymore =)

It took a while since I was on vacation.

Good job!

Comment 29 Klas Mattsson 2016-08-19 08:14:08 UTC
Hmm, unfortunately, I was wrong.

It still persists.

Comment 30 Mohit Agrawal 2016-08-19 08:25:30 UTC

Can you please share the logs?

Thanks 
Mohit Agrawal

Comment 31 Klas Mattsson 2016-08-19 08:28:23 UTC
Ah, sorry; I checked again, and it's a new bug, not fully reproducible.

I will make a new bug report once I've narrowed it down, if I can.

Comment 32 Kaushal 2017-03-08 10:48:49 UTC
This bug is being closed because GlusterFS-3.7 has reached its end of life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

