Bug 1450722 - [RFE] glusterfind: add --end-time and --field-separator options
Summary: [RFE] glusterfind: add --end-time and --field-separator options
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfind
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Milind Changire
QA Contact: Anil Shah
URL:
Whiteboard:
Depends On: 1453151
Blocks: 1417138
Reported: 2017-05-15 03:20 UTC by WenhanShi
Modified: 2020-07-16 09:34 UTC (History)

Fixed In Version: glusterfs-3.8.4-28
Doc Type: Enhancement
Doc Text:
The glusterfind command now provides a 'query' subcommand that provides a list of changed files.
Clone Of:
: 1453151 (view as bug list)
Environment:
Last Closed: 2017-09-21 04:41:45 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2017:2774
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: glusterfs bug fix and enhancement update
Last Updated: 2017-09-21 08:16:29 UTC

Description WenhanShi 2017-05-15 03:20:42 UTC
Description of problem:
We use a Windows client to access the gluster volume.
After a file is copied, pasted, and renamed on the client,
the glusterfind pre session outputs wrong information about
the list of modified files.

How reproducible:
Each time

Steps to Reproduce:
1. Create a backup session.
2. Create fileA on the client and check that it is in the brick dir of the gluster node.
3. Copy fileA by right-clicking it and pasting it into the same dir.
4. Rename fileA to fileB, and check that both fileA and fileB are in the brick dir.
5. Run glusterfind pre on the session to generate the output file.

Actual results:
Only "fileA - Copy" was in the output file, listed as a new file:

  NEW test/fileA+-+Copy

Expected results:

Both fileA and fileB should be in the output, and "fileA - Copy" should not appear.

  NEW test/fileA
  NEW test/fileB

Additional info:
md-cache-timeout has been set to 0

Comment 22 Milind Changire 2017-05-22 12:56:34 UTC
Tentative upstream patch: https://review.gluster.org/17355

Comment 68 Atin Mukherjee 2017-06-06 16:04:31 UTC
upstream patch : https://review.gluster.org/#/c/17439/

Comment 70 Milind Changire 2017-06-12 05:06:33 UTC
The summary of the bug is that there was a difference of expectation.
Essentially this is NOT A BUG.

However, there were two enhancements that got generated out of this exercise:
1. add --end-time option to be used with --since-time for the "query" command
   This option will help to specify the time up to which the change set is
   desired. An application which maintains the check-pointing time-stamp 
   externally will find it helpful to control and supply the --end-time to be 
   passed to glusterfind. The --end-time option is optional: the default is to
   take the current time and deduct the changelog roll-over time and use the
   resulting value as the end-time.

2. add --field-separator option; to be used with "pre" and "query" commands
   The glusterfind output file has a change tag followed by at most two file
   names, all of which are space separated. If a file name itself has embedded
   spaces and the user chooses to avoid file name encoding, it becomes difficult
   to identify where one file name ends and the next begins.
   The --field-separator option can be used to pass a well known string to be
   embedded between the string fields in the output file. The output file can then
   be processed to safely extract file names with embedded spaces by parsing out
   the field separator string.
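
A minimal Python sketch of the two points above, for illustration only and not
taken from the patch. The function names are hypothetical. It assumes that the
"+" characters seen in glusterfind output are URL-style plus-encoded spaces, and
that the roll-over time is supplied by the caller in seconds.

```python
from urllib.parse import unquote_plus


def default_end_time(now, rollover_seconds):
    """Default end time as described above: the current time minus the
    changelog roll-over time (both in seconds; hypothetical helper)."""
    return now - rollover_seconds


def parse_line(line, separator, encoded=True):
    """Split one output line into (change tag, list of file names).

    The separator is whatever string was passed to --field-separator.
    When encoded is True, names are assumed to be plus/percent encoded
    and are decoded; pass encoded=False for output written with
    --no-encode.
    """
    tag, *names = line.rstrip("\n").split(separator)
    if encoded:
        names = [unquote_plus(name) for name in names]
    return tag, names
```

For output written without encoding, the separator should be a string that
cannot occur in any file name on the volume; otherwise the split is ambiguous
again.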

Comment 74 Anil Shah 2017-07-11 12:03:44 UTC
Executed a basic test case to verify this RFE.
Will execute some more cases to verify this RFE further.

Bug verified on build glusterfs-3.8.4-33


[root@dhcp35-199 yum.repos.d]# glusterfind query  --field-separator "="  --since-time $((now - 3600)) --end-time $now  vol0 /tmp/v1.log 
Generated output file /tmp/v1.log

[root@dhcp35-199 yum.repos.d]# cat /tmp/v1.log 
RENAME=Sales+Department+Budget.xls=Sales+Department+Budget-2017.xls
NEW=file1
NEW=file2
NEW=file3
NEW=file6
NEW=file9
NEW=file10
NEW=file4
NEW=file5
NEW=file7
NEW=file8


Generated output file /tmp/a1.log
[root@dhcp35-199 yum.repos.d]# cat /tmp/a1.log 
NEW=test1
NEW=test2
NEW=test4
NEW=test5
NEW=test3
NEW=test8
NEW=test9
NEW=test6
NEW=test7
NEW=test10
NEW=Sales+Department+Budget-2017.xls

Comment 75 Laura Bailey 2017-08-15 07:08:24 UTC
We can only keep the doc text short for the errata because of character limits, so providing the previous doc text here:

Feature:
glusterfind query

Reason:
Backup and Restore software usually maintain their own checkpoints/timestamps
outside of glusterfind. The "query"
command can be used to extract changed files by providing a
timestamp as desired by the backup application.

Result:
The synopsis of the glusterfind "query" command is as follows:
usage: glusterfind query [-h]
                         [--since-time SINCE_TIME]
                         [--end-time END_TIME]
                         [--no-encode] [--full]
                         [--debug]
                         [--disable-partial]
                         [--output-prefix OUTPUT_PREFIX]
                         [-N]
                         [--tag-for-full-find
                            TAG_FOR_FULL_FIND]
                         [--field-separator FIELD_SEPARATOR]
                         volume outfile

The glusterfind "query" command is similar to the "pre"
command. It helps to fetch the list of changed files. The
"query" command expects one of the following options:
1. --since-time <UNIX-time-stamp>
   --end-time <UNIX-time-stamp>
   where:
   --since-time: the starting time from which changes are desired
   --end-time: the time up to which changed files are desired

2. --full
   Lists all the files in the volume. This option does
   not look at the changelogs and instead runs a command at
   the bricks to fetch file names.

The "query" command does not create any internal session
files to log checkpoint timestamps.
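
Because "query" keeps no session state, the calling application has to carry
the checkpoint itself. The following Python sketch shows one way a backup
application might assemble the command line from its own timestamps. The
function name and the checkpoint handling are hypothetical; only the flags and
positional arguments come from the synopsis above. The sketch builds the
argument list and does not run glusterfind.

```python
def build_query_command(volume, outfile, since_time,
                        end_time=None, field_separator=None):
    """Return the argv list for a glusterfind "query" run.

    since_time and end_time are UNIX time stamps kept by the caller.
    If end_time is None, --end-time is omitted and glusterfind applies
    its own default.
    """
    cmd = ["glusterfind", "query", "--since-time", str(since_time)]
    if end_time is not None:
        cmd += ["--end-time", str(end_time)]
    if field_separator is not None:
        cmd += ["--field-separator", field_separator]
    cmd += [volume, outfile]
    return cmd


# Hypothetical usage: the application stores the end time of one run
# and passes it as --since-time on the next run.
last_checkpoint = 1499770000          # value loaded from the application's own store
this_run_end = last_checkpoint + 3600
argv = build_query_command("vol0", "/tmp/v1.log",
                           last_checkpoint, this_run_end, "=")
```

Whether a change stamped exactly at the boundary falls into the earlier or the
later window is not stated in this report, so an application that chains
windows this way should verify that behaviour before relying on it.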

The --field-separator option accepts a string that can be used to delimit the strings on a single line in the output file.

Comment 79 errata-xmlrpc 2017-09-21 04:41:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774

