Bug 1232729 - [Backup]: Glusterfind session(s) created before starting the volume results in 'changelog not available' error, eventually
Summary: [Backup]: Glusterfind session(s) created before starting the volume results i...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterfind
Version: mainline
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
Assignee: Kotresh HR
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1228598
Blocks: 1223636 1233518
TreeView+ depends on / blocked
 
Reported: 2015-06-17 11:51 UTC by Kotresh HR
Modified: 2016-06-16 13:13 UTC (History)
4 users (show)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1228598
: 1233518 (view as bug list)
Environment:
Last Closed: 2016-06-16 13:13:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments (Terms of Use)

Description Kotresh HR 2015-06-17 11:51:41 UTC
+++ This bug was initially created as a clone of Bug #1228598 +++

Description of problem:

Have a 4+2 disperse volume 'disperse' and create glusterfind sessions for the same. Mount it over nfs/fuse and create a couple of files/dirs. Execute the glusterfind pre command it fails with 'Historical changelogs not available' on both the nodes of the cluster


[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# gluster v status dispersevol
Status of volume: dispersevol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.43.140:/rhs/thinbrick1/disperse 49188     0          Y       21579
Brick 10.70.43.140:/rhs/thinbrick2/disperse 49189     0          Y       21599
Brick 10.70.42.75:/rhs/thinbrick2/disperse  49184     0          Y       17892
Brick 10.70.42.75:/rhs/thinbrick1/disperse  49185     0          Y       17912
Brick 10.70.43.140:/rhs/thinbrick3/disperse 49190     0          Y       21619
Brick 10.70.42.75:/rhs/thinbrick3/disperse  49186     0          Y       17932
NFS Server on localhost                     2049      0          Y       21640
NFS Server on 10.70.42.75                   2049      0          Y       17953
 
Task Status of Volume dispersevol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp43-140 ~]# mkdir /mnt/disperse
[root@dhcp43-140 ~]# mount -t glusterfs 10.70.43.140:/disperse /mnt/disperse
Mount failed. Please check the log file for more details.
[root@dhcp43-140 ~]# mount -t glusterfs 10.70.43.140:/dispersevol /mnt/disperse
[root@dhcp43-140 ~]# cd /mnt/disperse
[root@dhcp43-140 disperse]# ls -a
.  ..  .trashcan
[root@dhcp43-140 disperse]# 
[root@dhcp43-140 disperse]# 
[root@dhcp43-140 disperse]# 
[root@dhcp43-140 disperse]# echo "what a day.. hmph" > test1
[root@dhcp43-140 disperse]# mkdir dir2
[root@dhcp43-140 disperse]# ln test1 test1_ln
[root@dhcp43-140 disperse]# echo "what a day.. hmph" > test2
[root@dhcp43-140 disperse]# ln -s test2 test2_sln
[root@dhcp43-140 disperse]# cd
[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# gluster v info dispersevol
 
Volume Name: dispersevol
Type: Disperse
Volume ID: e70dccdf-9f39-494a-a166-aa142049de07
Status: Created
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.43.140:/rhs/thinbrick1/disperse
Brick2: 10.70.43.140:/rhs/thinbrick2/disperse
Brick3: 10.70.42.75:/rhs/thinbrick2/disperse
Brick4: 10.70.42.75:/rhs/thinbrick1/disperse
Brick5: 10.70.43.140:/rhs/thinbrick3/disperse
Brick6: 10.70.42.75:/rhs/thinbrick3/disperse
Options Reconfigured:
changelog.changelog: on
storage.build-pgfid: on
performance.readdir-ahead: on
[root@dhcp43-140 ~]# gluster v start dispersevol
volume start: dispersevol: success
[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# glusterfind list
SESSION                   VOLUME                    SESSION TIME             
---------------------------------------------------------------------------
testvol_sess              testvol                   2015-06-04 23:38:02      
cross32                   cross3vol                 2015-06-04 22:46:01      
sessd2                    dispersevol               2015-06-05 19:44:31      
sessd3                    dispersevol               2015-06-05 19:44:39      
sessd1                    dispersevol               2015-06-05 19:44:23      
cross31                   cross3vol                 2015-06-05 00:00:28      
sessc                     cross3vol                 2015-06-05 15:44:47      
[root@dhcp43-140 ~]# 
[root@dhcp43-140 ~]# glusterfind pre sessd1 dispersevol outd1.txt --regenerate-outfile
10.70.43.140 - pre failed: /rhs/thinbrick2/disperse Historical Changelogs not available: [Errno 2] No such file or directory

10.70.43.140 - pre failed: /rhs/thinbrick1/disperse Historical Changelogs not available: [Errno 2] No such file or directory

10.70.43.140 - pre failed: /rhs/thinbrick3/disperse Historical Changelogs not available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick2/disperse Historical Changelogs not available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick1/disperse Historical Changelogs not available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick3/disperse Historical Changelogs not available: [Errno 2] No such file or directory

Generated output file /root/outd1.txt
[root@dhcp43-140 ~]#

***********************************************************88
This is what is seen in the backend brick logs:


[2015-06-05 14:32:15.422293] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-06-05 14:32:15.422417] W [options.c:936:xl_opt_validate] 0-dispersevol-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2015-06-05 14:32:15.425732] E [changelog-helpers.c:634:htime_open] 0-dispersevol-changelog: Error extracting HTIME_CURRENT: No data available.
[2015-06-05 14:32:15.427484] E [ctr-helper.c:250:extract_ctr_options] 0-gfdbdatastore: CTR Xlator is disabled.
...
...
...
[2015-06-05 14:42:29.646309] I [server-handshake.c:585:server_setvolume] 0-dispersevol-server: accepted client from dhcp43-140.lab.eng.blr.redhat.com-21847-2015/06/05-14:42:29:558126-dispersevol-client-0-0-0 (version: 3.7.0)
[2015-06-05 14:42:53.867949] E [posix-helpers.c:1088:posix_handle_pair] 0-dispersevol-posix: /rhs/thinbrick1/disperse/dir2: key:glusterfs.inodelk-dom-count flags: 1 length:23 error:Operation not supported
[2015-06-05 14:42:53.868063] E [posix.c:1391:posix_mkdir] 0-dispersevol-posix: setting xattrs on /rhs/thinbrick1/disperse/dir2 failed (Operation not supported)
[2015-06-05 14:43:19.746591] E [posix-helpers.c:1088:posix_handle_pair] 0-dispersevol-posix: /rhs/thinbrick1/disperse/test2_sln: key:glusterfs.inodelk-dom-count flags: 1 length:23 error:Operation not supported
[2015-06-05 14:43:19.746643] E [posix.c:1897:posix_symlink] 0-dispersevol-posix: setting xattrs on /rhs/thinbrick1/disperse/test2_sln failed (Operation not supported)
[2015-06-05 14:44:27.716197] I [server-handshake.c:585:server_setvolume] 0-dispersevol-server: accepted client from dhcp42-75.lab.eng.blr.redhat.com-18098-2015/06/05-14:44:27:456748-dispersevol-client-0-0-0 (version: 3.7.0)
[2015-06-05 14:44:56.663817] W [socket.c:642:__socket_rwv] 0-dispersevol-changelog: readv on /var/run/gluster/.0ea84248fa0f63dc12f2a29bd7f49b5921945.sock failed (No data available)

*********************************************************



Version-Release number of selected component (if applicable):


How reproducible: 1:1

RCA:
---

This is nothing to do with disperse volume. It should happen with any other
volume if it is done in following way.

1. gluster volume is created first
2. Immediately glusterfind session is created which updates cuurent time as
   say 't1' as start time for changelogs available time.
   Even though, the changelog is enabled (marked 'on' in vol file), backend  
   .glusterfs and actual changelog files gets created during volume start.
3. gluster volume is started.
   The changelog and HITIME.TSTAMP gets created during volume start and the
   TSTAMP is current time say 't1+n'

  If above is the case, then glustefind pre is requesting History API with
  start time as 't1' where as the changelog is actually available from 't1+s'
  which will always fail.

Solution:
--------
With the following patches which is already merged, the glusterfind session creation will fail unless volume is oneline. So that would fix the above issue


As mentioned in the RCA, changing the title of this bug.

Also, the patch mentioned above fixes the issue only when the status of the volume is in 'stopped' state. The issue still persists if the volume status is 'created'.

Comment 1 Anand Avati 2015-06-17 11:56:11 UTC
REVIEW: http://review.gluster.org/11278 (tools/glusterfind: Fail glusterfind creation if volume is offline) posted (#1) for review on master by Kotresh HR (khiremat@redhat.com)

Comment 2 Anand Avati 2015-06-17 17:39:48 UTC
REVIEW: http://review.gluster.org/11278 (tools/glusterfind: Fail glusterfind creation if volume is offline) posted (#3) for review on master by Kotresh HR (khiremat@redhat.com)

Comment 3 Nagaprasad Sathyanarayana 2015-10-25 15:03:18 UTC
Fix for this BZ is already present in a GlusterFS release. You can find clone of this BZ, fixed in a GlusterFS release and closed. Hence closing this mainline BZ as well.

Comment 4 Niels de Vos 2016-06-16 13:13:39 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user


Note You need to log in before you can comment on or make changes to this bug.