Bug 1503514

Summary: [Bitrot+Arbiter]: Errors seen in bitd.log
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rochelle <rallan>
Component: arbiter Assignee: Ravishankar N <ravishankar>
Status: CLOSED WONTFIX QA Contact: Karan Sandha <ksandha>
Severity: medium Docs Contact:
Priority: low    
Version: rhgs-3.3 CC: amukherj, bkunal, ccalhoun, lav, nchilaka, ravishankar, rhinduja, rhs-bugs, sanandpa, sankarshan, sheggodu, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.3.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-15 13:33:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1481177    

Description Rochelle 2017-10-18 10:52:38 UTC
Description of problem:
========================
After discussion: bitrot does not create a signature on the arbiter brick, yet errors were observed in bitd.log on the node hosting the arbiter brick.


[2017-10-18 06:55:58.466438] E [MSGID: 118002] [bit-rot.c:303:br_object_read_block_and_sign] 0-master-bit-rot-0: readv on 6ee876b8-a1db-44fc-a3de-46afbf7c8ce7 failed
[2017-10-18 06:55:58.466543] E [MSGID: 118003] [bit-rot.c:356:br_calculate_obj_checksum] 0-master-bit-rot-0: reading block with offset 0 of object 6ee876b8-a1db-44fc-a3de-46afbf7c8ce7 failed
[2017-10-18 06:55:58.466568] E [MSGID: 118004] [bit-rot.c:409:br_object_read_sign] 0-master-bit-rot-0: calculating checksum for the object 6ee876b8-a1db-44fc-a3de-46afbf7c8ce7 failed
[2017-10-18 06:55:58.466590] E [MSGID: 118009] [bit-rot.c:618:br_sign_object] 0-master-bit-rot-0: reading and signing of the object 6ee876b8-a1db-44fc-a3de-46afbf7c8ce7 failed
[2017-10-18 06:55:58.466892] E [MSGID: 118010] [bit-rot.c:674:br_process_object] 0-master-bit-rot-0: SIGNING FAILURE [6ee876b8-a1db-44fc-a3de-46afbf7c8ce7]
[2017-10-18 06:55:58.466939] W [MSGID: 114031] [client-rpc-fops.c:3004:client3_3_readv_cbk] 0-master-client-5: remote operation failed [Transport endpoint is not connected]
[2017-10-18 06:55:58.467042] E [MSGID: 118002] [bit-rot.c:303:br_object_read_block_and_sign] 0-master-bit-rot-0: readv on 5152ece3-8a10-453e-a70b-c01a35acc9bb failed
[2017-10-18 06:55:58.467078] E [MSGID: 118003] [bit-rot.c:356:br_calculate_obj_checksum] 0-master-bit-rot-0: reading block with offset 0 of object 5152ece3-8a10-453e-a70b-c01a35acc9bb failed
[2017-10-18 06:55:58.467117] E [MSGID: 118004] [bit-rot.c:409:br_object_read_sign] 0-master-bit-rot-0: calculating checksum for the object 5152ece3-8a10-453e-a70b-c01a35acc9bb failed
[2017-10-18 06:55:58.467139] E [MSGID: 118009] [bit-rot.c:618:br_sign_object] 0-master-bit-rot-0: reading and signing of the object 5152ece3-8a10-453e-a70b-c01a35acc9bb failed
[2017-10-18 06:55:58.467288] E [MSGID: 118010] [bit-rot.c:674:br_process_object] 0-master-bit-rot-0: SIGNING FAILURE [5152ece3-8a10-453e-a70b-c01a35acc9bb]
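
To see which objects are affected, the log lines above can be grouped by GFID. The following is a minimal sketch, not a Gluster tool: the regular expressions are assumptions derived only from the line format visible in this excerpt, and the function name `group_errors_by_gfid` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout of one log line, taken from the excerpt above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<src>[^\]]+)\] (?P<xlator>\S+): (?P<msg>.*)$"
)
# GFIDs in the excerpt are lowercase UUID strings.
GFID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def group_errors_by_gfid(lines):
    """Map each GFID to the MSGIDs of the error-level lines that mention it."""
    by_gfid = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("level") != "E":
            continue  # skip warnings and lines that do not fit the pattern
        gfid = GFID.search(match.group("msg"))
        if gfid:
            by_gfid[gfid.group(0)].append(int(match.group("msgid")))
    return dict(by_gfid)


# Three lines copied from the excerpt above (two errors, one warning).
sample = [
    "[2017-10-18 06:55:58.466438] E [MSGID: 118002] "
    "[bit-rot.c:303:br_object_read_block_and_sign] 0-master-bit-rot-0: "
    "readv on 6ee876b8-a1db-44fc-a3de-46afbf7c8ce7 failed",
    "[2017-10-18 06:55:58.466892] E [MSGID: 118010] "
    "[bit-rot.c:674:br_process_object] 0-master-bit-rot-0: "
    "SIGNING FAILURE [6ee876b8-a1db-44fc-a3de-46afbf7c8ce7]",
    "[2017-10-18 06:55:58.466939] W [MSGID: 114031] "
    "[client-rpc-fops.c:3004:client3_3_readv_cbk] 0-master-client-5: "
    "remote operation failed [Transport endpoint is not connected]",
]

for gfid, msgids in group_errors_by_gfid(sample).items():
    print(gfid, msgids)
```

Each GFID in the excerpt carries the same chain of message IDs (118002 through 118010), which suggests a single failed readv per object rather than several independent failures.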

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.8.4-49.el7rhgs.x86_64

How reproducible:
=================

2/2

Steps to Reproduce:
===================
1. Created a 2x(2+1) volume from a 3 node cluster and started writing data.
2. Enabled bitrot on the volume 
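
The two steps above roughly correspond to the gluster CLI invocations assembled below. This is a sketch only: the volume name `master` is inferred from the translator names in the log, while the host names, brick paths, and the helper `repro_commands` are hypothetical. The code builds the argument lists and prints them; it does not run anything.

```python
def repro_commands(volname, bricks):
    """Build CLI argument lists for an arbiter volume with bitrot enabled.

    `bricks` must be given in replica-set order, three per set, with the
    arbiter brick last in each set.
    """
    if not bricks or len(bricks) % 3 != 0:
        raise ValueError("expected bricks in groups of three (2 data + 1 arbiter)")
    create = ["gluster", "volume", "create", volname,
              "replica", "3", "arbiter", "1", *bricks]
    start = ["gluster", "volume", "start", volname]
    bitrot = ["gluster", "volume", "bitrot", volname, "enable"]
    return [create, start, bitrot]


# Hypothetical 3-node layout for a 2x(2+1) volume; not taken from the report.
bricks = [
    "node1:/bricks/b1", "node2:/bricks/b1", "node3:/bricks/arb1",
    "node1:/bricks/b2", "node2:/bricks/b2", "node3:/bricks/arb2",
]

for argv in repro_commands("master", bricks):
    print(" ".join(argv))
```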


Actual results:
===============

"SIGNING FAILURE" errors are logged in bitd.log, even though bitrot signing does not happen on the arbiter brick.

Expected results:
=================

The error should not be present in the log messages.

Comment 3 Sweta Anandpara 2017-10-18 12:39:22 UTC
Post discussion with Ravi and Kotresh:

* Files on the arbiter brick are 0 bytes; the brick stores only metadata. Bitrot is not aware that it is an arbiter brick and hence tries to read from it. The errors appear in the logs because arbiter does not allow bitd/scrub to read from the arbiter brick.
* The errors are seen in the brick logs and in bitd.log. They will not hamper bitrot or arbiter functionality.
* Bitrot detects bit-flips in data, and those are HIGHLY unlikely on an arbiter brick (because it has no data). If by chance some data does get written to the arbiter brick, theoretically it will still not impact bitrot or arbiter functionality. (However, the error messages will continue to be seen in the mentioned log locations.)
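
The warning in the description names `0-master-client-5` as the translator whose readv failed. A small sketch of how that name can be checked against the arbiter position follows. It rests on two assumptions that the report does not confirm for this setup: client translators are numbered from 0 in brick order, and the arbiter is the last brick of each replica set of three. The helper `is_arbiter_client` is hypothetical.

```python
import re

# Trailing brick index in a client translator name such as "0-master-client-5".
CLIENT_INDEX = re.compile(r"-client-(\d+)$")


def is_arbiter_client(xlator, replica_count=3):
    """Return True if the client index falls on the last slot of a replica set.

    Assumes zero-based numbering in brick order, with the arbiter brick
    placed last in each set of `replica_count` bricks.
    """
    match = CLIENT_INDEX.search(xlator)
    if match is None:
        raise ValueError(f"not a client translator name: {xlator!r}")
    return int(match.group(1)) % replica_count == replica_count - 1


# In a 2x(2+1) volume, indices 2 and 5 would be the arbiter slots.
for name in ("0-master-client-4", "0-master-client-5"):
    print(name, is_arbiter_client(name))
```

Under those assumptions, `0-master-client-5` maps to the arbiter of the second replica set, which is consistent with the explanation given in the bullets above.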

If replica 3 with arbiter is the way forward that we are recommending to our customers, I would like this bug to be discussed for the present release. Hence setting the blocker flag to '?'.