Bug 1654805 - Bitrot: Scrub status says file is corrupted even though it was just created AND 'path' in the output is broken
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: bitrot
Version: mainline
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1654370
Blocks:
 
Reported: 2018-11-29 17:32 UTC by Raghavendra Bhat
Modified: 2019-03-25 16:32 UTC

Fixed In Version: glusterfs-6.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1654370
Environment:
Last Closed: 2019-03-25 16:32:19 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 21752 0 None Merged features/bitrot: compare the signature with proper length 2018-12-03 08:45:49 UTC

Description Raghavendra Bhat 2018-11-29 17:32:35 UTC
+++ This bug was initially created as a clone of Bug #1654370 +++

Description of problem:

'gluster volume bitrot <volume> scrub status' shows a file as corrupted even though it was just created. In addition, the path of that file is broken in the output: the suffix "rusted.gfid" is appended to the file name ("trusted.gfid" is the name of an extended attribute). See below for details.
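A plausible mechanism for the false positives (the eventual fix is titled "features/bitrot: compare the signature with proper length") can be sketched in C. The struct layout and function names below are hypothetical, for illustration only; the real code lives in the bit-rot xlator:

```c
#include <string.h>

/* Illustration (hypothetical names): the on-disk
 * trusted.bit-rot.signature xattr carries a small header (version,
 * type) in front of the SHA-256 checksum. If the scrubber compares
 * using the full xattr length instead of the checksum length, the
 * extra header bytes make a perfectly good signature compare unequal,
 * so every freshly signed file looks "corrupted". */
struct br_signature {
    unsigned char header[8];    /* hypothetical version/type prefix */
    unsigned char checksum[32]; /* SHA-256 of the file contents */
};

static int sig_matches(const struct br_signature *ondisk,
                       const unsigned char *computed, size_t csum_len)
{
    /* Buggy pattern: memcmp(ondisk, computed, sizeof(*ondisk))
     * would compare 40 bytes against a 32-byte checksum buffer. */

    /* Fixed pattern: compare only the checksum, with its proper length. */
    return memcmp(ondisk->checksum, computed, csum_len) == 0;
}
```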


Version-Release number of selected component (if applicable): 
Gluster v5.1


How reproducible:

Write a file on a freshly created FUSE mount of a volume. Wait until it has been signed, then run "scrub ondemand" and "scrub status". Repeat with a second file; as shown below, both files are marked as corrupted, which is very unlikely for freshly written files.

Steps to Reproduce:

1. fs-davids-c1-n1:# echo file1 >> /gluster/archives/archive1/data/file1.txt

2. fs-davids-c1-n1:# getfattr -d -m "" /gluster/brick1/glusterbrick/data/file1.txt
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/glusterbrick/data/file1.txt
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.bit-rot.signature=0sAQIAAAAAAAAA7NxVNvc72uiBbw6kBybvXpuBDZFEkwdZA7uQYj2Xsdg=
trusted.bit-rot.version=0sAgAAAAAAAABb/qsZAAmLDQ==
trusted.gfid=0sXR/JcssRRYaslDCny8BpDw==
trusted.gfid2path.b7b5820f548c9129="f49da84b-d63c-40f5-9ed4-e66a262c671b/file1.txt"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAFv+rY0AAAAAJDHeIQAAAABb/q2NAAAAACQx3iEAAAAAW/6tjQAAAAAaza+K
trusted.pgfid.f49da84b-d63c-40f5-9ed4-e66a262c671b=0sAAAAAQ==
trusted.start_time="1543417229"
trusted.worm_file=0sMQA=

3. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0

=========================================================

4. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub ondemand
volume bitrot: scrubber started ondemand for volume archive1
5. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

=========================================================

6. fs-davids-c1-n1:# echo file2 >> /gluster/archives/archive1/data/file2.txt
7. fs-davids-c1-n1:# getfattr -d -m "" /gluster/brick1/glusterbrick/data/file2.txt
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/glusterbrick/data/file2.txt
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.bit-rot.signature=0sAQIAAAAAAAAAZ+5UeOqtsDS6WZROuXd5e0nKaqjTV0WH8268vutl9w4=
trusted.bit-rot.version=0sAgAAAAAAAABb/qsZAAmLDQ==
trusted.gfid=0szJ3EL9TvSseLHkZOmdw3HQ==
trusted.gfid2path.d3c3e99ef3917352="f49da84b-d63c-40f5-9ed4-e66a262c671b/file2.txt"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAFv+sTIAAAAAN+ztgwAAAABb/rEyAAAAADfs7YMAAAAAW/6xMgAAAAA3uxck
trusted.pgfid.f49da84b-d63c-40f5-9ed4-e66a262c671b=0sAAAAAQ==
trusted.start_time="1543418162"
trusted.worm_file=0sMQA=

8. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

=========================================================

9. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub ondemand
volume bitrot: scrubber started ondemand for volume archive1
10. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status
Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid

=========================================================


Actual results:
Both files are marked as corrupted, which is very unlikely for freshly created files. The 'path' in the output has a suffix "rusted.gfid", taken from the extended attribute name "trusted.gfid".

Expected results:
Files should not be marked as corrupted, and if a file really is corrupted, the 'path' in the output should be correct.
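The mangled 'path' (the file name with most of the xattr name "trusted.gfid" fused onto it) is consistent with an unterminated or off-by-one string copy when the status reply is built. A hypothetical C sketch of the failure mode and the safe pattern (the function name is made up for illustration):

```c
#include <stdio.h>

/* If the scrubber copies the object path into a reply buffer without
 * writing the terminating NUL (e.g. memcpy(out, path, strlen(path))),
 * printing that buffer runs into whatever sits next in memory --
 * here, apparently, the xattr name "trusted.gfid" -- producing
 * "/data/file1.txtrusted.gfid". A bounded, always-terminating copy
 * avoids that: */
static void build_report_path(char *out, size_t outsz, const char *path)
{
    snprintf(out, outsz, "%s", path); /* always NUL-terminates */
}
```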

Additional info:

fs-davids-c1-n1:/home/iternity # gluster vo info archive1
 
Volume Name: archive1
Type: Replicate
Volume ID: 6f238af7-5fef-4bea-bb59-ac4e9ea5bf3a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: fs-davids-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-davids-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-davids-c1-n3:/gluster/brick1/glusterbrick
Brick4: fs-davids-c1-n4:/gluster/brick1/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: on
storage.build-pgfid: on
auth.ssl-allow: *
client.ssl: on
server.ssl: on
features.utime: on
storage.ctime: on
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
cluster.enable-shared-storage: enable

--- Additional comment from  on 2018-11-28 10:53:37 EST ---

fs-davids-c1-n1:/home/iternity # cat /var/log/glusterfs/scrub.log 
[2018-11-28 14:50:01.347943] I [MSGID: 118035] [bit-rot-scrub.c:1313:br_scrubber_scale_up] 0-archive1-bit-rot-0: Scaling up scrubbers [0 => 1]
[2018-11-28 14:50:01.348019] I [MSGID: 118048] [bit-rot-scrub.c:1564:br_scrubber_log_option] 0-archive1-bit-rot-0: SCRUB TUNABLES:: [Frequency: daily, Throttle: lazy]
[2018-11-28 14:50:01.348072] I [MSGID: 118031] [bit-rot.c:2054:init] 0-archive1-bit-rot-0: bit-rot xlator loaded in "SCRUBBER" mode
[2018-11-28 14:50:01.351245] I [socket.c:4167:ssl_setup_connection_params] 0-archive1-client-0: SSL support on the I/O path is ENABLED
[2018-11-28 14:50:01.351265] I [socket.c:4170:ssl_setup_connection_params] 0-archive1-client-0: SSL support for glusterd is ENABLED
[2018-11-28 14:50:01.351272] I [socket.c:4180:ssl_setup_connection_params] 0-archive1-client-0: using certificate depth 1
[2018-11-28 14:50:01.348178] I [MSGID: 101190] [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2018-11-28 14:50:01.356510] I [socket.c:4225:ssl_setup_connection_params] 0-archive1-client-0: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2018-11-28 14:50:01.357084] I [MSGID: 118035] [bit-rot-scrub.c:1313:br_scrubber_scale_up] 0-archive2-bit-rot-0: Scaling up scrubbers [0 => 1]
[2018-11-28 14:50:01.357148] I [MSGID: 118048] [bit-rot-scrub.c:1564:br_scrubber_log_option] 0-archive2-bit-rot-0: SCRUB TUNABLES:: [Frequency: daily, Throttle: lazy]
[2018-11-28 14:50:01.357564] I [MSGID: 118031] [bit-rot.c:2054:init] 0-archive2-bit-rot-0: bit-rot xlator loaded in "SCRUBBER" mode
[2018-11-28 14:50:01.357688] I [socket.c:4167:ssl_setup_connection_params] 0-archive2-client-0: SSL support on the I/O path is ENABLED
[2018-11-28 14:50:01.357697] I [socket.c:4170:ssl_setup_connection_params] 0-archive2-client-0: SSL support for glusterd is ENABLED
[2018-11-28 14:50:01.357702] I [socket.c:4180:ssl_setup_connection_params] 0-archive2-client-0: using certificate depth 1
[2018-11-28 14:50:01.357843] I [socket.c:4225:ssl_setup_connection_params] 0-archive2-client-0: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2018-11-28 14:50:01.358350] I [MSGID: 114020] [client.c:2354:notify] 0-archive1-client-0: parent translators are ready, attempting connect on transport
[2018-11-28 14:50:01.358526] I [MSGID: 114020] [client.c:2354:notify] 0-archive2-client-0: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume archive1-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host fs-davids-c1-n1
  5:     option remote-subvolume /gluster/brick1/glusterbrick
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option username f79d8acf-5cbc-4066-b3a2-fefa79990db8
  9:     option password 9e2ab4e1-8de3-4b57-ac5e-4da52511e1de
 10:     option transport.socket.ssl-enabled true
 11: end-volume
 12:  
 13: volume archive1-bit-rot-0
 14:     type features/bit-rot
 15:     option scrubber true
 16:     option scrub-throttle lazy
 17:     option scrub-freq daily
 18:     subvolumes archive1-client-0
 19: end-volume
 20:  
 21: volume archive2-client-0
 22:     type protocol/client
 23:     option ping-timeout 42
 24:     option remote-host fs-davids-c1-n1
 25:     option remote-subvolume /gluster/brick2/glusterbrick
 26:     option transport-type socket
 27:     option transport.address-family inet
 28:     option username 1cde2f1b-24d2-4f8e-a3e6-aa0fd339df5c
 29:     option password 6dfe02e1-7c86-46d7-aaaf-6e4b96d89713
 30:     option transport.socket.ssl-enabled true
 31: end-volume
 32:  
 33: volume archive2-bit-rot-0
 34:     type features/bit-rot
 35:     option scrubber true
 36:     option scrub-throttle lazy
 37:     option scrub-freq daily
 38:     subvolumes archive2-client-0
 39: end-volume
 40:  
 41: volume scrub
 42:     type debug/io-stats
 43:     option log-level INFO
 44:     subvolumes archive1-bit-rot-0 archive2-bit-rot-0
 45: end-volume
 46:  
+------------------------------------------------------------------------------+
[2018-11-28 14:50:01.818206] I [rpc-clnt.c:2042:rpc_clnt_reconfig] 0-archive1-client-0: changing port to 49152 (from 0)
[2018-11-28 14:50:01.818540] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 14:50:01.818560] I [socket.c:4167:ssl_setup_connection_params] 0-archive1-client-0: SSL support on the I/O path is ENABLED
[2018-11-28 14:50:01.818566] I [socket.c:4170:ssl_setup_connection_params] 0-archive1-client-0: SSL support for glusterd is ENABLED
[2018-11-28 14:50:01.818572] I [socket.c:4180:ssl_setup_connection_params] 0-archive1-client-0: using certificate depth 1
[2018-11-28 14:50:01.818721] I [socket.c:4225:ssl_setup_connection_params] 0-archive1-client-0: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2018-11-28 14:50:01.857748] W [dict.c:1002:str_to_data] (-->/usr/lib64/glusterfs/5.1/xlator/protocol/client.so(+0x3d6b4) [0x7f18347506b4] -->/usr/lib64/libglusterfs.so.0(dict_set_str+0x16) [0x7f183c0283f6] -->/usr/lib64/libglusterfs.so.0(str_to_data+0x60) [0x7f183c0251c0] ) 0-dict: value is NULL [Invalid argument]
[2018-11-28 14:50:01.832490] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 14:50:01.857814] I [MSGID: 114006] [client-handshake.c:1238:client_setvolume] 0-archive1-client-0: failed to set process-name in handshake msg
[2018-11-28 14:50:01.858415] I [MSGID: 114046] [client-handshake.c:1107:client_setvolume_cbk] 0-archive1-client-0: Connected to archive1-client-0, attached to remote volume '/gluster/brick1/glusterbrick'.
[2018-11-28 14:50:01.860145] I [MSGID: 118039] [bit-rot.c:1274:br_child_enaction] 0-archive1-bit-rot-0: Connected to brick /gluster/brick1/glusterbrick..
[2018-11-28 14:50:01.860180] I [MSGID: 118038] [bit-rot-scrub.c:962:br_fsscan_schedule] 0-archive1-bit-rot-0: Scrubbing is scheduled to run at 2018-11-29 14:50:01
[2018-11-28 14:50:01.860196] I [MSGID: 118038] [bit-rot-scrub.c:1913:wait_for_scrub_to_finish] 0-archive1-bit-rot-0: Waiting for all children to start and finish scrub
[2018-11-28 14:50:01.897105] I [rpc-clnt.c:2042:rpc_clnt_reconfig] 0-archive2-client-0: changing port to 49153 (from 0)
[2018-11-28 14:50:01.897414] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 14:50:01.897441] I [socket.c:4167:ssl_setup_connection_params] 0-archive2-client-0: SSL support on the I/O path is ENABLED
[2018-11-28 14:50:01.897448] I [socket.c:4170:ssl_setup_connection_params] 0-archive2-client-0: SSL support for glusterd is ENABLED
[2018-11-28 14:50:01.897453] I [socket.c:4180:ssl_setup_connection_params] 0-archive2-client-0: using certificate depth 1
[2018-11-28 14:50:01.897605] I [socket.c:4225:ssl_setup_connection_params] 0-archive2-client-0: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2018-11-28 14:50:01.938013] W [dict.c:1002:str_to_data] (-->/usr/lib64/glusterfs/5.1/xlator/protocol/client.so(+0x3d6b4) [0x7f18347506b4] -->/usr/lib64/libglusterfs.so.0(dict_set_str+0x16) [0x7f183c0283f6] -->/usr/lib64/libglusterfs.so.0(str_to_data+0x60) [0x7f183c0251c0] ) 0-dict: value is NULL [Invalid argument]
[2018-11-28 14:50:01.938038] I [MSGID: 114006] [client-handshake.c:1238:client_setvolume] 0-archive2-client-0: failed to set process-name in handshake msg
[2018-11-28 14:50:01.943407] I [MSGID: 114046] [client-handshake.c:1107:client_setvolume_cbk] 0-archive2-client-0: Connected to archive2-client-0, attached to remote volume '/gluster/brick2/glusterbrick'.
[2018-11-28 14:50:01.945974] I [MSGID: 118039] [bit-rot.c:1274:br_child_enaction] 0-archive2-bit-rot-0: Connected to brick /gluster/brick2/glusterbrick..
[2018-11-28 14:50:01.946141] I [MSGID: 118038] [bit-rot-scrub.c:962:br_fsscan_schedule] 0-archive2-bit-rot-0: Scrubbing is scheduled to run at 2018-11-29 14:50:01
[2018-11-28 14:50:01.946166] I [MSGID: 118038] [bit-rot-scrub.c:1913:wait_for_scrub_to_finish] 0-archive2-bit-rot-0: Waiting for all children to start and finish scrub
[2018-11-28 14:50:05.454712] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:05.468263] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:50:05.655672] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 14:50:05.655745] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:05.668527] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:50:05.817169] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:05.833762] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:50:05.985347] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:06.000810] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:50:06.014248] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:06.031913] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:50:06.287688] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:50:06.300733] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 6 times between [2018-11-28 14:50:05.655672] and [2018-11-28 14:50:06.300691]
[2018-11-28 14:57:31.505688] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:57:31.516034] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 14:57:31.516159] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:57:31.576154] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:57:31.589743] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:57:31.783030] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:57:31.815136] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 14:57:31.885691] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 14:57:31.896616] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 9 times between [2018-11-28 14:57:31.516034] and [2018-11-28 14:57:31.896576]
[2018-11-28 15:07:36.246938] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 15:07:36.259385] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:07:36.311037] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 15:07:50.377803] I [bit-rot.c:1703:notify] 0-archive1-bit-rot-0: BitRot scrub ondemand called
[2018-11-28 15:07:50.377828] I [MSGID: 118038] [bit-rot-scrub.c:1105:br_fsscan_ondemand] 0-archive1-bit-rot-0: Ondemand Scrubbing scheduled to run at 2018-11-28 15:07:51
[2018-11-28 15:07:50.389810] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:07:50.398858] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 15:07:51.603162] I [MSGID: 118044] [bit-rot-scrub.c:618:br_scrubber_log_time] 0-archive1-bit-rot-0: Scrubbing started at 2018-11-28 15:07:51
[2018-11-28 15:07:52.609932] A [MSGID: 118023] [bit-rot-scrub.c:239:bitd_compare_ckum] 0-archive1-bit-rot-0: CORRUPTION DETECTED: Object /data/file1.txt {Brick: /gluster/brick1/glusterbrick | GFID: 5d1fc972-cb11-4586-ac94-30a7cbc0690f}
[2018-11-28 15:07:52.609999] A [MSGID: 118024] [bit-rot-scrub.c:260:bitd_compare_ckum] 0-archive1-bit-rot-0: Marking /data/file1.txt [GFID: 5d1fc972-cb11-4586-ac94-30a7cbc0690f | Brick: /gluster/brick1/glusterbrick] as corrupted..
[2018-11-28 15:07:53.611063] I [MSGID: 118045] [bit-rot-scrub.c:622:br_scrubber_log_time] 0-archive1-bit-rot-0: Scrubbing finished at 2018-11-28 15:07:53
[2018-11-28 15:07:53.611105] I [MSGID: 118038] [bit-rot-scrub.c:1009:br_fsscan_activate] 0-archive1-bit-rot-0: Scrubbing is rescheduled to run at 2018-11-29 15:07:53
[2018-11-28 15:07:53.611113] I [MSGID: 118038] [bit-rot-scrub.c:1913:wait_for_scrub_to_finish] 0-archive1-bit-rot-0: Waiting for all children to start and finish scrub
[2018-11-28 15:07:36.259349] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2018-11-28 15:08:03.304028] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:08:03.345501] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 15:18:37.925316] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:18:37.972416] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 15:19:01.922193] I [bit-rot.c:1703:notify] 0-archive1-bit-rot-0: BitRot scrub ondemand called
[2018-11-28 15:19:01.922225] I [MSGID: 118038] [bit-rot-scrub.c:1105:br_fsscan_ondemand] 0-archive1-bit-rot-0: Ondemand Scrubbing scheduled to run at 2018-11-28 15:19:02
[2018-11-28 15:19:01.935622] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:19:02.000110] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2018-11-28 15:19:03.213076] I [MSGID: 118044] [bit-rot-scrub.c:618:br_scrubber_log_time] 0-archive1-bit-rot-0: Scrubbing started at 2018-11-28 15:19:03
[2018-11-28 15:19:04.219146] E [MSGID: 114031] [client-rpc-fops_v2.c:281:client4_0_open_cbk] 0-archive1-client-0: remote operation failed. Path: /data/file1.txt (5d1fc972-cb11-4586-ac94-30a7cbc0690f) [Input/output error]
[2018-11-28 15:19:04.219213] E [MSGID: 118008] [bit-rot.c:471:br_log_object] 0-archive1-bit-rot-0: open() failed on object 5d1fc972-cb11-4586-ac94-30a7cbc0690f [Input/output error]
[2018-11-28 15:19:05.222689] A [MSGID: 118023] [bit-rot-scrub.c:239:bitd_compare_ckum] 0-archive1-bit-rot-0: CORRUPTION DETECTED: Object /data/file2.txt {Brick: /gluster/brick1/glusterbrick | GFID: cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d}
[2018-11-28 15:19:05.222753] A [MSGID: 118024] [bit-rot-scrub.c:260:bitd_compare_ckum] 0-archive1-bit-rot-0: Marking /data/file2.txt [GFID: cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d | Brick: /gluster/brick1/glusterbrick] as corrupted..
[2018-11-28 15:19:06.223988] I [MSGID: 118045] [bit-rot-scrub.c:622:br_scrubber_log_time] 0-archive1-bit-rot-0: Scrubbing finished at 2018-11-28 15:19:06
[2018-11-28 15:19:06.224046] I [MSGID: 118038] [bit-rot-scrub.c:1009:br_fsscan_activate] 0-archive1-bit-rot-0: Scrubbing is rescheduled to run at 2018-11-29 15:19:06
[2018-11-28 15:19:06.224055] I [MSGID: 118038] [bit-rot-scrub.c:1913:wait_for_scrub_to_finish] 0-archive1-bit-rot-0: Waiting for all children to start and finish scrub
[2018-11-28 15:19:12.608391] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-11-28 15:19:12.623021] I [glusterfsd-mgmt.c:2005:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing

--- Additional comment from  on 2018-11-28 11:28:49 EST ---

After the files are marked as corrupted, they appear to be read-only, although the write bits are not removed. The files are not WORMed either.

fs-davids-c1-n1:/home/iternity # rm -f /gluster/archives/archive1/data/file*
rm: cannot remove '/gluster/archives/archive1/data/file1.txt': Read-only file system
rm: cannot remove '/gluster/archives/archive1/data/file2.txt': Read-only file system
fs-davids-c1-n1:/home/iternity # stat /gluster/archives/archive1/data/file*
  File: /gluster/archives/archive1/data/file1.txt
  Size: 6         	Blocks: 1          IO Block: 131072 regular file
Device: 2fh/47d	Inode: 12435617968312838415  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-11-28 15:00:29.449687434 +0000
Modify: 2018-11-28 15:00:29.607247905 +0000
Change: 2018-11-28 15:00:29.607247905 +0000
 Birth: -
  File: /gluster/archives/archive1/data/file2.txt
  Size: 6         	Blocks: 1          IO Block: 131072 regular file
Device: 2fh/47d	Inode: 10024527123976042269  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-11-28 15:16:02.935008036 +0000
Modify: 2018-11-28 15:16:02.938274179 +0000
Change: 2018-11-28 15:16:02.938274179 +0000
 Birth: -

fs-davids-c1-n1:/home/iternity # getfattr -d -m "" /gluster/brick1/glusterbrick/data/file2.txt
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/glusterbrick/data/file2.txt
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.bit-rot.bad-file=0sMQA=
trusted.bit-rot.signature=0sAQIAAAAAAAAAZ+5UeOqtsDS6WZROuXd5e0nKaqjTV0WH8268vutl9w4=
trusted.bit-rot.version=0sAgAAAAAAAABb/qsZAAmLDQ==
trusted.gfid=0szJ3EL9TvSseLHkZOmdw3HQ==
trusted.gfid2path.d3c3e99ef3917352="f49da84b-d63c-40f5-9ed4-e66a262c671b/file2.txt"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAFv+sTIAAAAAN+ztgwAAAABb/rEyAAAAADfs7YMAAAAAW/6xMgAAAAA3uxck
trusted.pgfid.f49da84b-d63c-40f5-9ed4-e66a262c671b=0sAAAAAQ==
trusted.start_time="1543418162"
trusted.worm_file=0sMQA=

Comment 1 Worker Ant 2018-11-29 17:35:28 UTC
REVIEW: https://review.gluster.org/21752 (features/bitrot: compare the signature with proper length) posted (#1) for review on master by Raghavendra Bhat

Comment 2 Worker Ant 2018-12-03 08:45:47 UTC
REVIEW: https://review.gluster.org/21752 (features/bitrot: compare the signature with proper length) posted (#3) for review on master by Kotresh HR

Comment 3 Shyamsundar 2019-03-25 16:32:19 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/

