Bug 1297300 - Stale stat information for corrupted objects (replicated volume)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: bitrot
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.2
Assignee: Bug Updates Notification Mailing List
QA Contact: RamaKasturi
URL:
Whiteboard:
Duplicates: 1297284
Depends On: 1296399
Blocks: 1297213 1297284
Reported: 2016-01-11 06:24 UTC by Venky Shankar
Modified: 2016-09-17 14:24 UTC (History)

Fixed In Version: glusterfs-3.7.5-16
Doc Type: Bug Fix
Doc Text:
Clone Of: 1296399
Environment:
Last Closed: 2016-03-01 06:07:09 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:0193 | 0 | normal | SHIPPED_LIVE | Red Hat Gluster Storage 3.1 update 2 | 2016-03-01 10:20:36 UTC

Description Venky Shankar 2016-01-11 06:24:18 UTC
+++ This bug was initially created as a clone of Bug #1296399 +++

Description of problem:
Stale stat information returned when an object is corrupted on a replicated volume.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
always

Steps to Reproduce:
1. Create a replicated volume, enable bitrot, mount, perform I/O
2. Wait for files to get signed
3. Corrupt an object (modify it directly on the brick) on one of the replicas (preferably the first replica)
4. Wait till scrubber marks the object as corrupted
5. Perform I/O on the file again
6. stat to check object size

Actual results:
Stale (and incorrect) stat information (size, mtime, etc.) is reported

Expected results:
Correct stat information (updated size, mtime, etc.) should be returned
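
Steps 5 and 6 can be scripted from the client mount. The following is a minimal sketch, not part of the original report: it appends to a file, stats it before and after, and flags a size that did not grow by the number of bytes written. The helper names append_and_stat and size_is_stale are hypothetical, and the check assumes no other writer touches the file in between. Timestamps are deliberately not compared, since their granularity varies by file system.

```python
import os


def append_and_stat(path, payload):
    """Append payload to path and return (stat_before, stat_after)."""
    before = os.stat(path)
    with open(path, "ab") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())  # push the write out before re-checking
    after = os.stat(path)
    return before, after


def size_is_stale(before, after, appended_len):
    """True if the reported size did not grow by the bytes just appended.

    Assumption: nothing else modified the file between the two stat calls.
    """
    return after.st_size != before.st_size + appended_len
```

On a mounted volume this might be called as append_and_stat("/mnt/vol_rep/newfile3", b"more data\n"), where the mount path is only an example; a True result from size_is_stale would correspond to the "Actual results" above.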

Additional info:

--- Additional comment from Vijay Bellur on 2016-01-07 02:22:37 EST ---

REVIEW: http://review.gluster.org/13120 (features/bitrot: add check for corrupted object in f{stat}) posted (#3) for review on master by Venky Shankar (vshankar)

--- Additional comment from Vijay Bellur on 2016-01-10 07:43:33 EST ---

COMMIT: http://review.gluster.org/13120 committed in master by Venky Shankar (vshankar) 
------
commit d5d6918ce7dc9f54496da435af546611dfbe7d5c
Author: Venky Shankar <vshankar>
Date:   Wed Dec 30 14:56:12 2015 +0530

    features/bitrot: add check for corrupted object in f{stat}
    
    Check for corrupted objects is done by the bitrot stub component
    for data operations and such fops are denied processing by
    returning EIO. These checks were not done for operations such
    as get/set extended attribute, stat and the likes - IOW, stub
    only blocked pure data operations.
    
    However, it's necessary to have these checks for certain other
    fops, most importantly stat (and fstat). This is due to the
    fact that clients could possibly get stale stat information
    (such as size, {a,c,m}time) resulting in incorrect operation
    of the applications that rely on these fields. Note that
    replication would take care of fetching good (and correct)
    data, but the staleness of stat information could lead to
    data inconsistencies (e.g., rebalance, tier).
    
    Change-Id: I5a22780373b182a13f8d2c4ca6b7d9aa0ffbfca3
    BUG: 1296399
    Signed-off-by: Venky Shankar <vshankar>
    Reviewed-on: http://review.gluster.org/13120
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: mohammed rafi  kc <rkavunga>
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
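
The behavior change described in the commit message can be pictured with a toy model. This is an illustrative sketch only: the real bitrot stub is a C translator, and the class name, the fop groupings, and the check_stat switch below are assumptions made for the example, not the actual implementation. It shows a gate that denies fops on objects marked bad with EIO, first for data fops only and then for stat and fstat as well.

```python
import errno

# Hypothetical groupings for illustration; the actual set of fops
# checked by the stub is defined in the C translator.
DATA_FOPS = {"readv", "writev"}
STAT_FOPS = {"stat", "fstat"}


class BitrotStubModel:
    """Toy model of a brick-side gate that refuses fops on bad objects."""

    def __init__(self, check_stat):
        self.check_stat = check_stat  # False models behavior before the change
        self.bad_objects = set()      # GFIDs the scrubber marked as corrupted

    def mark_bad(self, gfid):
        self.bad_objects.add(gfid)

    def guarded_fops(self):
        """Fops that are denied when the target object is marked bad."""
        return (DATA_FOPS | STAT_FOPS) if self.check_stat else DATA_FOPS

    def dispatch(self, fop, gfid):
        """Return 0 to let the fop through, or -EIO to deny it."""
        if gfid in self.bad_objects and fop in self.guarded_fops():
            return -errno.EIO
        return 0
```

With check_stat=False, a stat on a bad object passes through and the brick can answer with stale attributes. With check_stat=True, the same stat is denied, so in this model the corrupted copy can no longer be the source of the size and time fields.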

Comment 3 Bhaskarakiran 2016-01-11 07:32:24 UTC
Just an update to this. 

A lot of bitrot error messages are seen in the brick logs.

[2016-01-09 08:10:31.194770] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode 1572c2f5-9917-423b-a987-86a7a78d342e
[2016-01-09 08:10:31.252071] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode 1572c2f5-9917-423b-a987-86a7a78d342e
[2016-01-09 08:10:31.288027] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode ba5f4b4c-2022-43c5-8d45-fa65ac8f3227
[2016-01-09 08:10:31.296192] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode b7b9d689-4b21-44eb-86c8-1c7f9f6b32ba
[2016-01-09 08:10:31.383698] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode ba5f4b4c-2022-43c5-8d45-fa65ac8f3227
[2016-01-09 08:10:31.501594] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode f7d80536-d2c1-48e0-a9de-0a7c3c55689b
[2016-01-09 08:10:31.566981] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode 9676610d-8d4b-4362-b230-6f2107050059
[2016-01-09 08:10:31.569765] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode f7d80536-d2c1-48e0-a9de-0a7c3c55689b
[2016-01-09 08:10:31.630599] E [MSGID: 116012] [bit-rot-stub.h:364:br_stub_is_bad_object] 0-disperse_vol1-bitrot-stub: failed to get the inode context for the inode 9676610d-8d4b-4362-b230-6f2107050059
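
To gauge how many distinct inodes are behind messages like these, the log lines can be tallied per inode. The sketch below is an assumption-laden helper, not an existing Gluster tool: the regular expression is derived only from the line format shown in this comment, and the function name count_by_inode is hypothetical.

```python
import re
from collections import Counter

# Pattern based solely on the log lines quoted in this comment.
LINE_RE = re.compile(
    r"\[MSGID: (?P<msgid>\d+)\].*"
    r"failed to get the inode context for the inode "
    r"(?P<gfid>[0-9a-f-]{36})"
)


def count_by_inode(lines, msgid="116012"):
    """Count matching log messages per inode GFID."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("msgid") == msgid:
            counts[match.group("gfid")] += 1
    return counts
```

Applied to the nine lines above, this would report five distinct inodes, four of them logged twice and one logged once.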

Comment 4 Bhaskarakiran 2016-01-11 07:34:24 UTC
*** Bug 1297284 has been marked as a duplicate of this bug. ***

Comment 6 RamaKasturi 2016-01-18 05:34:38 UTC
Verified and works fine with build glusterfs-3.7.5-16.el7rhgs.x86_64.

Followed the steps below to verify this bug:

1. Create a replicated volume, enable bitrot, mount, perform I/O
2. Wait for files to get signed
3. Corrupt an object (modify it directly on the brick) on one of the replicas (preferably the first replica)
4. Wait till scrubber marks the object as corrupted
5. Perform I/O on the file again
6. stat to check object size

stat of the corrupted file before performing I/O on the file:
============================================================

[root@dhcp37-75 vol_rep]# stat newfile3
  File: ‘newfile3’
  Size: 5         	Blocks: 1          IO Block: 131072 regular file
Device: 2ah/42d	Inode: 9552773721163233512  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2016-01-17 17:15:21.570454864 +0530
Modify: 2016-01-17 17:15:21.572454864 +0530
Change: 2016-01-18 11:29:36.455934094 +0530
 Birth: -

stat of the corrupted file after performing I/O on the file:
============================================================

[root@dhcp37-75 vol_rep]# stat newfile3
  File: ‘newfile3’
  Size: 42        	Blocks: 1          IO Block: 131072 regular file
Device: 2ah/42d	Inode: 9552773721163233512  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2016-01-17 17:15:21.570454864 +0530
Modify: 2016-01-18 11:30:07.493933375 +0530
Change: 2016-01-18 11:30:08.920933342 +0530
 Birth: -

I see that the size and the modify and change times of the file have been updated after performing I/O, and stale entries are no longer returned.

Comment 8 errata-xmlrpc 2016-03-01 06:07:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

