Bug 1228598
Summary: | [Backup]: Glusterfind session(s) created before starting the volume results in 'changelog not available' error, eventually | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Sweta Anandpara <sanandpa>
Component: | glusterfind | Assignee: | Kotresh HR <khiremat>
Status: | CLOSED ERRATA | QA Contact: | Sweta Anandpara <sanandpa>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.1 | CC: | avishwan, khiremat, mchangir, nsathyan, rhs-bugs, storage-qa-internal, vagarwal
Target Milestone: | --- | |
Target Release: | RHGS 3.1.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | glusterfs-3.7.1-6 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1232729 (view as bug list) | Environment: |
Last Closed: | 2015-07-29 04:58:07 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1202842, 1223636, 1232729, 1233518 | |
Description
Sweta Anandpara
2015-06-05 10:02:23 UTC
RCA:
----
This has nothing to do with disperse volumes; it can happen with any volume type if the following sequence is used:

1. The gluster volume is created first.
2. A glusterfind session is created immediately afterwards, which records the current time, say 't1', as the start time from which changelogs are expected to be available. Although changelog is enabled at this point (marked 'on' in the volfile), the backend .glusterfs directory and the actual changelog files are only created when the volume is started.
3. The gluster volume is started. The changelog and HTIME.TSTAMP files are created during volume start, and the TSTAMP is the then-current time, say 't1+n'.

In this sequence, glusterfind pre requests the History API with start time 't1', whereas the changelog is actually available only from 't1+n', so the request always fails.

Solution:
---------
With the following patches, which are already merged, glusterfind session creation fails unless the volume is online, which fixes the issue above.

Upstream Master: http://review.gluster.org/#/c/10955/
Upstream 3.7: http://review.gluster.org/#/c/11187/

BZ 1224236 fixes this bug. As noted in the RCA, the title of this bug has been changed accordingly. Also, the patch mentioned above fixes the issue only when the volume is in the 'stopped' state; the issue still persists when the volume status is 'created'.

Upstream Patch (Master): http://review.gluster.org/#/c/11278/
Upstream Patch (3.7): http://review.gluster.org/#/c/11322/
Downstream Patch: https://code.engineering.redhat.com/gerrit/#/c/51457/

Tested and verified this on the build glusterfs-server-3.7.1-6.el6rhs.x86_64. When the volume's state is not 'started' (i.e., it is either 'created' or 'stopped'), glusterfind does not allow session creation, removing the possibility of hitting the error mentioned in the title. Moving this to fixed in 3.1 Everglades.
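To make the timing race described in the RCA concrete, here is a minimal, self-contained Python sketch of the failure mode. The names used (register_session, start_volume, changelog_history) are illustrative stand-ins, not the actual glusterfind or libgfchangelog APIs.

```python
# Sketch of the race described in the RCA above (illustrative names only).
import time

changelog_start = None  # set only when the brick starts and HTIME.TSTAMP is created

def register_session():
    """'glusterfind create' records 'now' (t1) as the session start time."""
    return time.time()

def start_volume():
    """Changelog files appear on disk only at volume start (t1 + n)."""
    global changelog_start
    changelog_start = time.time()

def changelog_history(start, end):
    """A history query fails if asked for a window older than the first changelog."""
    if changelog_start is None or start < changelog_start:
        raise RuntimeError("changelog not available for the requested window")
    return []  # changes recorded in [start, end]

t1 = register_session()  # session created while the volume is still in 'Created' state
time.sleep(1)
start_volume()           # changelogs become available only now, at t1 + n
try:
    changelog_history(t1, time.time())  # what 'glusterfind pre' asks for, with start time t1
except RuntimeError as e:
    print(e)             # always fails: changelog not available
```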
Detailed logs are pasted below:

```
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v status
Status of volume: gv1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.43.93:/rhs/b1 49154 0 Y 13880
NFS Server on localhost 2049 0 Y 13881
NFS Server on 10.70.43.155 2049 0 Y 23445
Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: slave
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.43.93:/rhs/thinbrick1/slave 49152 0 Y 13892
Brick 10.70.43.155:/rhs/thinbrick1/slave 49152 0 Y 23444
Brick 10.70.43.93:/rhs/thinbrick2/slave 49153 0 Y 13901
Brick 10.70.43.155:/rhs/thinbrick2/slave 49153 0 Y 23455
NFS Server on localhost 2049 0 Y 13881
Self-heal Daemon on localhost N/A N/A N N/A
NFS Server on 10.70.43.155 2049 0 Y 23445
Self-heal Daemon on 10.70.43.155 N/A N/A N N/A
Task Status of Volume slave
------------------------------------------------------------------------------
There are no active volume tasks
[root@dhcp43-93 ~]# gluster peer status
Number of Peers: 1
Hostname: 10.70.43.155
Uuid: 97f53dc5-1ba1-45dc-acdd-ddf38229035b
State: Peer in Cluster (Connected)
[root@dhcp43-93 ~]#
[root@dhcp43-93 thinbrick2]# gluster v create vol1 10.70.43.93:/rhs/thinbrick1/vol1 10.70.43.155:/rhs/thinbrick1/vol1 10.70.43.93:/rhs/thinbrick2/vol1 10.70.43.155:/rhs/thinbrick2/vol1
volume create: vol1: success: please start the volume to access data
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# gluster v status
Status of volume: gv1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.43.93:/rhs/b1 49154 0 Y 13880
NFS Server on localhost 2049 0 Y 13881
NFS Server on 10.70.43.155 2049 0 Y 23445
Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: slave
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.43.93:/rhs/thinbrick1/slave 49152 0 Y 13892
Brick 10.70.43.155:/rhs/thinbrick1/slave 49152 0 Y 23444
Brick 10.70.43.93:/rhs/thinbrick2/slave 49153 0 Y 13901
Brick 10.70.43.155:/rhs/thinbrick2/slave 49153 0 Y 23455
NFS Server on localhost 2049 0 Y 13881
Self-heal Daemon on localhost N/A N/A N N/A
NFS Server on 10.70.43.155 2049 0 Y 23445
Self-heal Daemon on 10.70.43.155 N/A N/A N N/A
Task Status of Volume slave
------------------------------------------------------------------------------
There are no active volume tasks
Volume vol1 is not started
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# glutser v info vol1
-bash: glutser: command not found
[root@dhcp43-93 thinbrick2]# gluster v info vol1
Volume Name: vol1
Type: Distribute
Volume ID: 8918e433-d903-4bb8-80c2-42a1b5a0244e
Status: Created
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.43.93:/rhs/thinbrick1/vol1
Brick2: 10.70.43.155:/rhs/thinbrick1/vol1
Brick3: 10.70.43.93:/rhs/thinbrick2/vol1
Brick4: 10.70.43.155:/rhs/thinbrick2/vol1
Options Reconfigured:
performance.readdir-ahead: on
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# glusterfind create sv1 vol1
Volume vol1 is not online
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# glusterfind create fdsfds vol1
Volume vol1 is not online
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]#
[root@dhcp43-93 thinbrick2]# ls /var/lib/glusterd/glusterfind/
ss1 ss2
[root@dhcp43-93 thinbrick2]# ls
slave vol1
[root@dhcp43-93 thinbrick2]# cd
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# ls /var/log/glusterfs/glusterfind/
cli.log fdsfds nash plutos1 ps1 ps2 ps3 sess21 sessn1 sessn2 sessn3 sessn4 sesso1 sesso2 sesso3 sessp1 sessp2 sessv1 sgv1 ss1 ss2 sumne sv1 vol1s1 vol1s2 vol1s3
[root@dhcp43-93 ~]# cat /var/log/glusterfs/glusterfind/sv1/vol1/cli.log
[2015-07-04 15:48:32,839] ERROR [utils - 152:fail] - Volume vol1 is not online
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v start vol1
volume start: vol1: success
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind create sv1 vol1
Session sv1 created with volume vol1
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
sv1 vol1 2015-07-04 15:50:02
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/
ss1 ss2 sv1
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1
vol1
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem sv1_vol1_secret.pem.pub
```
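As the log above shows, `glusterfind create` is refused with "Volume vol1 is not online" while the volume is in the 'Created' state. A backup script can check for this condition up front rather than relying on the error message. Below is a minimal sketch (Python 3.7+ assumed) that parses the `Status:` line of `gluster volume info`, as seen in the output above; the helper `volume_is_started` is hypothetical and not part of glusterfind.

```python
# Hedged helper: confirm a volume is started before creating a glusterfind session.
import subprocess

def volume_is_started(volname):
    # Parse the plain-text 'Status:' line printed by 'gluster volume info <VOLNAME>'.
    out = subprocess.run(
        ["gluster", "volume", "info", volname],
        capture_output=True, text=True, check=True
    ).stdout
    for line in out.splitlines():
        if line.strip().startswith("Status:"):
            return line.split(":", 1)[1].strip() == "Started"
    return False

if volume_is_started("vol1"):
    subprocess.run(["glusterfind", "create", "sv1", "vol1"], check=True)
else:
    print("Volume vol1 is not online; start it before creating a glusterfind session")
```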
```
[root@dhcp43-93 ~]# gluster v stop vol1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol1: success
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v status vol1
Volume vol1 is not started
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind create sv2 vol1
Volume vol1 is not online
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
sv1 vol1 2015-07-04 15:50:02
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind pre sv1 vol1 /tmp/out.txt
Volume vol1 is not online
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/
.keys/ ss1/ ss2/ sv1/
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem sv1_vol1_secret.pem.pub
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem sv1_vol1_secret.pem.pub
[root@dhcp43-93 ~]# glusterfind pre sv2 vol1 /tmp/out.t
Invalid session sv2
[root@dhcp43-93 ~]# glusterfind post sv1 vol1
Pre script is not run
[root@dhcp43-93 ~]# glusterfind delete sv1 vol1^C
[root@dhcp43-93 ~]# glusterfind delete sv2 vol1
Invalid session sv2
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v start vol1
volume start: vol1: success
[root@dhcp43-93 ~]# glusterfind pre sv2 vol1 /tmp/out.txt
Invalid session sv2
[root@dhcp43-93 ~]# glusterfind pre sv1 vol1 /tmp/out.txt
Generated output file /tmp/out.txt
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
sv1 vol1 2015-07-04 15:50:02
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/
.keys/ ss1/ ss2/ sv1/
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem %2Frhs%2Fthinbrick1%2Fvol1.status.pre %2Frhs%2Fthinbrick2%2Fvol1.status.pre status.pre sv1_vol1_secret.pem.pub
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem %2Frhs%2Fthinbrick1%2Fvol1.status.pre %2Frhs%2Fthinbrick2%2Fvol1.status.pre status.pre sv1_vol1_secret.pem.pub
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind pre sv2 vol1 /tmp/out.txt
Invalid session sv2
[root@dhcp43-93 ~]# glusterfind delete sv1 vol1
root.43.155's password:
root.43.155's password:
root.43.155's password:
root.43.155's password:
10.70.43.155 - delete failed: Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Command delete failed in 10.70.43.155:/rhs/thinbrick1/vol1
[root@dhcp43-93 ~]# glusterfind lsit
usage: glusterfind [-h] {pre,create,list,post,delete} ...
glusterfind: error: argument mode: invalid choice: 'lsit' (choose from 'pre', 'create', 'list', 'post', 'delete')
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]# glusterfind lsit
usage: glusterfind [-h] {pre,create,list,post,delete} ...
glusterfind: error: argument mode: invalid choice: 'lsit' (choose from 'pre', 'create', 'list', 'post', 'delete')
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/
ss1 ss2
[root@dhcp43-93 ~]# ls /var/log/glusterfs/glusterfind/
cli.log nash/ ps1/ ps3/ sessn1/ sessn3/ sesso1/ sesso3/ sessp2/ sgv1/ ss2/ sv1/ vol1s1/ vol1s3/ fdsfds/ plutos1/ ps2/ sess21/ sessn2/ sessn4/ sesso2/ sessp1/ sessv1/ ss1/ sumne/ sv2/ vol1s2/
[root@dhcp43-93 ~]# ls /var/log/glusterfs/glusterfind/sv1/vol1/c
changelog.1c27a488a584181d698698190ce633eae6ab4a90.log changelog.log changelog.b85984854053ba4529aeaba8bd2c93408cb68773.log cli.log
[root@dhcp43-93 ~]# ls /var/log/glusterfs/glusterfind/sv1/vol1/^C
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind create sv1 vol1
glutSession sv1 created with volume vol1
[root@dhcp43-93 ~]# gluster v info vol1
Volume Name: vol1
Type: Distribute
Volume ID: 8918e433-d903-4bb8-80c2-42a1b5a0244e
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.43.93:/rhs/thinbrick1/vol1
Brick2: 10.70.43.155:/rhs/thinbrick1/vol1
Brick3: 10.70.43.93:/rhs/thinbrick2/vol1
Brick4: 10.70.43.155:/rhs/thinbrick2/vol1
Options Reconfigured:
changelog.capture-del-path: on
changelog.changelog: on
storage.build-pgfid: on
performance.readdir-ahead: on
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v stop vol1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol1: success
[root@dhcp43-93 ~]# glusterfind pre sv1 vol1 /tmp/out.txt
Volume vol1 is not online
[root@dhcp43-93 ~]# glusterfind post sv1 vol1
Pre script is not run
[root@dhcp43-93 ~]# gluster v start vol1
volume start: vol1: success
[root@dhcp43-93 ~]# glusterfind pre sv1 vol1 /tmp/out.txt
Generated output file /tmp/out.txt
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
```
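The pre/post pair exercised above is the basic incremental cycle: `pre` generates the change list into the output file, the backup tool consumes it, and `post` advances the session time. A minimal driver sketch follows, assuming the glusterfind CLI from the logs is on PATH; `backup_changed_files` is a hypothetical placeholder, not part of glusterfind.

```python
# Sketch of one incremental glusterfind cycle, mirroring the commands in the logs above.
import subprocess

SESSION, VOLUME, OUTFILE = "sv1", "vol1", "/tmp/out.txt"

def backup_changed_files(outfile):
    # Placeholder: hand the generated change list over to the real backup tool.
    with open(outfile) as f:
        for line in f:
            print("would back up:", line.rstrip())

subprocess.run(["glusterfind", "pre", SESSION, VOLUME, OUTFILE], check=True)
backup_changed_files(OUTFILE)
# Run 'post' only after the backup succeeds, so the session checkpoint is not
# advanced past changes that were never backed up.
subprocess.run(["glusterfind", "post", SESSION, VOLUME], check=True)
```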
```
[root@dhcp43-93 ~]# gluster v stop vol1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol1: success
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# gluster v info vol1
Volume Name: vol1
Type: Distribute
Volume ID: 8918e433-d903-4bb8-80c2-42a1b5a0244e
Status: Stopped
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.43.93:/rhs/thinbrick1/vol1
Brick2: 10.70.43.155:/rhs/thinbrick1/vol1
Brick3: 10.70.43.93:/rhs/thinbrick2/vol1
Brick4: 10.70.43.155:/rhs/thinbrick2/vol1
Options Reconfigured:
changelog.capture-del-path: on
changelog.changelog: on
storage.build-pgfid: on
performance.readdir-ahead: on
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind post sv1 vol1
Session sv1 with volume vol1 updated
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# glusterfind list
SESSION VOLUME SESSION TIME
---------------------------------------------------------------------------
sv1 vol1 2015-07-04 15:58:10
ss2 slave 2015-06-27 00:08:39
ss1 slave 2015-06-27 00:25:26
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/
%2Frhs%2Fthinbrick1%2Fvol1.status %2Frhs%2Fthinbrick2%2Fvol1.status status sv1_vol1_secret.pem sv1_vol1_secret.pem.pub
[root@dhcp43-93 ~]# ls /var/lib/glusterd/glusterfind/sv1/vol1/^C
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]#
[root@dhcp43-93 ~]# rpm -qa | grep gluster
glusterfs-client-xlators-3.7.1-6.el6rhs.x86_64
glusterfs-server-3.7.1-6.el6rhs.x86_64
glusterfs-3.7.1-6.el6rhs.x86_64
glusterfs-api-3.7.1-6.el6rhs.x86_64
glusterfs-cli-3.7.1-6.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-6.el6rhs.x86_64
glusterfs-libs-3.7.1-6.el6rhs.x86_64
glusterfs-fuse-3.7.1-6.el6rhs.x86_64
[root@dhcp43-93 ~]#
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html