Created attachment 1180086 [details]
Logfile

Description of problem:
When quorum is lost and regained and SSL is used, the clients are unable to regain working write capabilities. In some cases, the mount turns stale as well. mount.glusterfs is being used.

Version-Release number of selected component (if applicable):
3.7.13

How reproducible:
Every time in my lab.

Steps to Reproduce:
1. Create a replica across 3 peers (uncertain if other counts are applicable), arbiter or otherwise.
2. Activate SSL (self-signed certs, with a CA file containing all client and server pems on the servers, and the server pems on the clients).
3. Activate quorum.
4. Mount on a client.
5. Turn off two nodes.
6. The mount becomes read-only.
7. Start one or both of the nodes again.

Actual results:
The mount is still read-only.

Expected results:
The mount should recover.

Additional info:
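For reference, the steps above can be sketched with the standard gluster CLI. This is a minimal sketch: the volume name (myvol), hostnames (gfs1..gfs3), and brick paths are placeholders, not taken from the report, and the cert locations assume GlusterFS's default /etc/ssl/glusterfs.{pem,key,ca} paths.

```shell
# 1. Create and start a replica-3 volume across three peers
#    (gfs1..gfs3 and the brick paths are placeholder names).
gluster volume create myvol replica 3 \
    gfs1:/bricks/b1 gfs2:/bricks/b1 gfs3:/bricks/b1
gluster volume start myvol

# 2. Activate SSL on the I/O path. GlusterFS looks for
#    /etc/ssl/glusterfs.pem, /etc/ssl/glusterfs.key and
#    /etc/ssl/glusterfs.ca by default.
gluster volume set myvol client.ssl on
gluster volume set myvol server.ssl on

# 3. Activate client-side quorum.
gluster volume set myvol cluster.quorum-type auto

# 4. Mount on the client.
mount -t glusterfs gfs1:/myvol /mnt/03
```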
Note that the output of the "date" command in the command line log is UTC+2.
Could you paste the output of 'gluster v status'?
And which quorum is it? Server or client?
Created attachment 1180102 [details] logfile2
Added the logfile. This occurs with and without arbiter. In this case, the volume is a 3-node replica without arbiter. I still don't remember whether it was server or client quorum, but I activated it with:

gluster volume set ${volume} cluster.quorum-type auto
Could you also post the brick logs and glusterd logs? Is SSL activated on both the I/O and mgmt connections? I see that I/O encryption is enabled.
I'll get on those logs (I assume from all servers). /var/lib/glusterd/secure-access is the mgmt connection switch, right? If so, it's disabled; the error did occur with it activated as well.
Yes, the presence of the /var/lib/glusterd/secure-access file enables encryption on the mgmt connection. Logs from all servers would be good, but I guess it would be OK to get the logs from the server you mounted the volume from.
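In other words, enabling it is just creating that file (a sketch; the systemctl restart assumes a systemd-managed glusterd, which the report does not state):

```shell
# The mere presence of this file enables encryption on the management
# (glusterd) connection; create it on every server and client.
touch /var/lib/glusterd/secure-access

# Restart glusterd on the servers for it to take effect
# (assumes systemd manages the service).
systemctl restart glusterd
```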
Created attachment 1180118 [details]
Concatenated logfiles from all servers.

Which of these would the glusterd logs be?

bricks
cli.log
cmd_history.log
etc-glusterfs-glusterd.vol.log
geo-replication
geo-replication-slaves
glustershd.log
nfs.log
snaps

I'm assuming glustershd.log and etc-glusterfs-glusterd.vol.log. I added those and the brick logs, concatenated.
glustershd.log is the Self Heal Daemon log; etc-glusterfs-glusterd.vol.log is the glusterd log.
Ah, ok. You got both ;)
FWIW, after regaining quorum, do you see the bricks running? That's why I asked for the gluster v status and gluster v info output.
Oh, I forgot to mention one very central part, sorry. If I mount the share again, everything is fine — either in a new place, or by unmounting the old one and mounting in the old place again. This is only an issue for as long as the mount remains from when quorum was lost. And only when SSL is activated.
Created attachment 1180205 [details]
Second version of logs.

New logs according to instructions from Joe on IRC.
mnt-03_2-.log is from a secondary mount done after the servers rebooted. mnt-03_2 has full RW; mnt-03 has RO. It can read anything I write from mnt-03_2.
Hi,

I am able to reproduce the issue in my VM environment without enabling SSL. After checking the logs I think it is defined behaviour. As you also mentioned, the issue occurs only when the mount remains from when quorum was lost.

Reproduction steps:
1) Create a 1x3 cluster
2) Activate quorum (set cluster.quorum-type to auto)
3) Mount the client
4) Run the below command on the client to append a small line to a file:
   cd /mnt; cnt=0; while [ 1 ]; do echo "instance no. is $cnt"; echo "Start instance is $cnt" >> f1; cnt=`expr $cnt + 1`; done
5) Kill glusterfsd on two nodes
6) Got "Read-only file system" failure messages on the client
7) Press CTRL-C to kill the while loop
8) Checking the contents of the f1 file shows that after killing glusterfsd, no line was appended to the file.
9) Restart glusterd to start glusterfsd on one node; after regaining quorum it shows a message like ("Connected to replicate-client-0")

The logs (Read-only file system) show up in mnt-03.log when quorum is not met and the client continuously tries to write a buffer on the mount point.
>>>>>>>>>>>>>>>
[2016-07-15 16:11:20.399760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 97: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:21.725593] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 105: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:22.310991] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 112: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:22.951805] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 120: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:43.370784] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 128: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:44.233100] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 135: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:11:45.018485] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 143: FLUSH() ERR => -1 (Read-only file system)
[2016-07-15 16:12:28.109760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 227: FLUSH() ERR => -1 (Read-only file system)
>>>>>>>>>>>>>>>

After regaining quorum, it shows the below kind of message in mnt.log:

>>>>>>>>>>>>>>>>>>>>>>
[2016-07-18 10:12:12.167800] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 0-replicate-client-1: Connected to replicate-client-1, attached to remote volume '/dist1/brick2'.
[2016-07-18 10:12:12.167841] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 0-replicate-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2016-07-18 10:12:12.167863] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 0-replicate-client-1: 1 fds open - Delaying child_up until they are re-opened
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

As per the documentation, if cluster.quorum-type is auto, writes to a file are allowed only if the percentage of active replica bricks is more than 50% of the total number of bricks that constitute that replica, so I think it is expected behaviour. After stopping glusterfsd on one node you will not get any error, but after killing glusterfsd on the other node you will get this kind of error message.

Regards
Mohit Agrawal
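The quorum arithmetic described above can be sketched as a small shell helper. This is a hypothetical illustration, not GlusterFS code; it covers the odd-replica case only (for even replica counts, auto quorum additionally treats exactly half as met when the first brick is among the active ones).

```shell
# Illustration of cluster.quorum-type=auto for odd replica counts:
# writes are allowed only while strictly more than 50% of the
# replica bricks are up; otherwise the client returns EROFS.
quorum_met() {
    local up=$1 total=$2
    # strictly more than half: up/total > 1/2, i.e. 2*up > total
    [ $(( up * 2 )) -gt "$total" ]
}

quorum_met 3 3 && echo "3/3 up: rw"
quorum_met 2 3 && echo "2/3 up: rw"
quorum_met 1 3 || echo "1/3 up: ro (Read-only file system on writes)"
```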
Mohit:
That it goes read-only is expected; that is how it's supposed to work. That is the functionality I want (to prevent split-brain).

The problem is that with SSL, it doesn't recover even after quorum has been reached.

So, the following flow is expected:
1: A mount is working, 3 servers are functioning, quorum has been reached.
2: The mount is still working, 2 servers are functioning, quorum has been reached.
3: The mount has failed, 1 server is functioning, quorum has not been reached.
4: The mount starts working again, 2 servers are functioning, quorum has been reached.

What actually happens:
1: A mount is working, 3 servers are functioning, quorum has been reached.
2: The mount is still working, 2 servers are functioning, quorum has been reached.
3: The mount has failed, 1 server is functioning, quorum has not been reached.
4: The mount is still failed, 2 servers are functioning, quorum has been reached.
5: A remount of the volume is required to make it work again, even though quorum has been reached.

For me, this does not happen when client.ssl and server.ssl are off, but happens when they are on.
Even after enabling SSL (client.ssl/server.ssl on), I am not able to reproduce the issue following the steps I mentioned in comment 16. After restarting glusterd on the other two nodes, once the below messages show on the client side (in mnt.log), writes are working fine.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2016-07-18 10:52:40.206263] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 12611: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 10:52:40.238430] I [rpc-clnt.c:1855:rpc_clnt_reconfig] 0-replicate-client-2: changing port to 49152 (from 0)
[2016-07-18 10:52:40.271836] I [socket.c:343:ssl_setup_connection] 0-replicate-client-2: peer CN = test
[2016-07-18 10:52:40.271923] I [socket.c:346:ssl_setup_connection] 0-replicate-client-2: SSL verification succeeded (client: )
[2016-07-18 10:52:40.272245] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 12618: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 10:52:40.273095] I [MSGID: 114057] [client-handshake.c:1446:select_server_supported_programs] 0-replicate-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-07-18 10:52:40.277389] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 0-replicate-client-2: Connected to replicate-client-2, attached to remote volume '/dist1/brick3'.
[2016-07-18 10:52:40.277429] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 0-replicate-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2016-07-18 10:52:40.277446] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 0-replicate-client-2: 1 fds open - Delaying child_up until they are re-opened
[2016-07-18 10:52:40.278426] I [MSGID: 114041] [client-handshake.c:675:client_child_up_reopen_done] 0-replicate-client-2: last fd open'd/lock-self-heal'd - notifying CHILD-UP
[2016-07-18 10:52:40.278587] I [MSGID: 108002] [afr-common.c:4213:afr_notify] 0-replicate-replicate-0: Client-quorum is met
[2016-07-18 10:52:40.278803] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-replicate-client-2: Server lk version = 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

In your attached mnt-03_2.log there is no log after the FLUSH (Read-only file system) messages. Can you please confirm there is no log in mnt.log after the FLUSH ERR messages appear?
Good that it's not a general bug! Too bad it still happens to me!

Nothing new has happened in the client logs since this Friday, barring the two read-only errors supplied below. I also added the steps I use to test whether writing works:

root@client:/mnt# mount | grep /mnt/03
server1.some.domain:klma03 on /mnt/03 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
server1.some.domain:klma03 on /mnt/03_2 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
root@client:/mnt# date >> 03/date3
-bash: 03/date3: Read-only file system
root@client:/mnt# date >> 03_2/date3
root@client:/mnt# cat 03/date3
Mon 18 Jul 13:18:54 CEST 2016
root@client:/mnt#
root@client:/mnt# tail -n 3
[2016-07-15 16:12:28.109760] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 227: FLUSH() ERR => -1 (Read-only file system)
[2016-07-18 11:17:02.217629] W [fuse-bridge.c:2002:fuse_create_cbk] 0-glusterfs-fuse: 253: /date3 => -1 (Read-only file system)
[2016-07-18 11:18:43.640304] W [fuse-bridge.c:2002:fuse_create_cbk] 0-glusterfs-fuse: 289: /date3 => -1 (Read-only file system)
(In reply to Klas Mattsson from comment #17) > Mohit: > That it goes read-only is expected, that is how it's supposed to work. > That is the functionality I want (to prevent split-brain). > > The problem is that with SSL, it doesn't recover even after quorum has been > reached. > > So, the following flow is expected: > 1: A mount is working, 3 servers are functioning, quorum has been reached. > 2: A mount is still working, 2 servers are functioning, quorum has been > reached. > 3: A mount has failed, 1 server is functioning, quorum has not been reached. > 4: A mount starts working again, 2 servers are functioning, quorum been > reached. > > What actually happens: > 1: A mount is working, 3 servers are functioning, quorum has been reached. > 2: A mount is still working, 2 servers are functioning, quorum has been > reached. > 3: A mount has failed, 1 server is functioning, quorum has not been reached. > 4: A mount is still failed, 2 servers are functioning, quorum been reached. When you mention that quorum is reached, could you provide the output of gluster peer status and gluster volume status? > 5: A remount is required of the volume to make it work again, even though > quorum has been reached. > > For me, this does not happen when client.ssl and server.ssl is off, but > happens when they are on.
I already have, in logfile2. Well, not peer status, but all of them say connected, which is what I'm assuming you are after?
(In reply to Klas Mattsson from comment #21) > I already have, in logfile2. > Well, not peer status, but all of the says connected, which is what I'm > assuming you are after? Right. We will deep dive into it more. Thanks Klas.
Please tell me if you need any more info, up to and including the .deb files being used.
Hi Klas,

I tried to reproduce the issue again in my VM environment. I am able to reproduce the issue, but the behaviour is slightly different from what you were facing: I am not getting any FLUSH() ERR logs, but after restarting glusterfsd the mount connection turns stale.

After debugging further on the client side to see why it hangs, I found that it is happening due to the thread_reaper code in the socket_poller thread in the SSL case. The issue is already known and is fixed in the mainline branch (http://review.gluster.org/#/c/14694/); it will soon be merged into release-3.7 and 3.8 as well. I have tested the patch; after applying the fix it works in my environment.

For the time being, let me know if you want to test the patch and I will share rpms. Before you test the patch I want to share a disclaimer: it is only for testing purposes, not for production systems.

Regards
Mohit Agrawal
Ah, ok. Sounds good. I assume this will be in 3.7.14 or something along those lines as well? Building my own packages atm.
Hi Klas,

Do you have any update? Did you face the issue again after applying the patch?

Regards
Mohit Agrawal
Hi,

The patch is merged via the below upstream commit:
http://review.gluster.org/#/c/14886/

Regards
Mohit Agrawal
Hi,

Great, I've built new packages with 3.7.14 and I can't reproduce that error anymore =) It took a while since I was on vacation. Good job!
Hmm, unfortunately, I was wrong. It still persists.
Can you please share the logs? Thanks Mohit Agrawal
Ah, sorry, I checked again; it's a new bug and not fully reproducible. I will make a new bug report once I've narrowed it down, if I can.
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life. Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.