Bug 1773558

Summary: DHT: Lookup-optimize is disabled for dirs with null layout ranges
Product: [Community] GlusterFS Reporter: Nithya Balachandran <nbalacha>
Component: distribute Assignee: Susant Kumar Palai <spalai>
Status: CLOSED UPSTREAM QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: mainline CC: bugs, rhs-bugs, saraut, spalai, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1761311 Environment:
Last Closed: 2020-03-12 14:49:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1761311    
Bug Blocks:    

Description Nithya Balachandran 2019-11-18 12:57:22 UTC
+++ This bug was initially created as a clone of Bug #1761311 +++

Description of problem:
DHT: A directory with a null layout range on one or more subvols will cause lookup everywhere to be called for its contents.

Version-Release number of selected component (if applicable):


How reproducible:

Consistently.

Steps to Reproduce:

I added the following log message to the dht_lookup_everywhere() function and did a source install:

         gf_msg_debug (this->name, 0,
                       "winding lookup call to %d subvols", call_cnt);
 
+gf_msg ("Nithya", GF_LOG_INFO, 0, 0, "Calling lookup everywhere on %s", loc->path);
+

1. Create a 2 brick distribute volume, fuse mount it and create a directory dir1
2. Add a brick and rebalance the volume
3. Create a file inside dir1 and rename it so that a linkto file is created
4. Add a brick but do not run a rebalance.
5. Delete the linkto file from the backend brick.
6. Unmount and remount the volume
7. cd /mnt/gluster/dir1 (do not list the dir contents)
8. stat xyz1 

Actual results:
stat succeeds.
The following message is printed in the mount logs:

[2019-10-14 04:53:35.586127] I [MSGID: 0] [dht-common.c:2989:dht_lookup_everywhere] 0-Nithya: Calling lookup everywhere on /dir1/xyz1

The linkto file is recreated on the hashed brick.



Expected results:
stat should fail.

Additional info:


RCA:

A null trusted.glusterfs.dht xattr has a null commit hash as well.

When merging the disk layout information for the parent directory into the layout structure, dht_layout_merge() does the following:


    if (layout->commit_hash == 0) {                                             
        layout->commit_hash = layout->list[i].commit_hash;                      
    } else if (layout->commit_hash != layout->list[i].commit_hash) {            
        layout->commit_hash = DHT_LAYOUT_HASH_INVALID;                          
    } 

As the layout xattr on the newly added brick is null, its commit hash is 0. This does not match the commit hash already merged from the other bricks, so layout->commit_hash is set to DHT_LAYOUT_HASH_INVALID. In dht_lookup_cbk for xyz1, the parent_layout->commit_hash will then not match the conf->vol_commit_hash, causing lookup everywhere to be called.
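
To make the RCA concrete, here is a minimal standalone sketch of the merge rule quoted above and of the comparison that follows it. It is not the GlusterFS code: sketch_merge_commit_hash(), sketch_can_skip_lookup_everywhere() and SKETCH_HASH_INVALID are hypothetical names, the value chosen for the invalid marker is an assumption, and the real dht_lookup_cbk checks more than this single comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DHT_LAYOUT_HASH_INVALID; the value is an assumption. */
#define SKETCH_HASH_INVALID 1u

/* Hypothetical helper: fold the per-subvol commit hashes using the same
 * rule as the dht_layout_merge() excerpt above, one subvol at a time. */
static uint32_t
sketch_merge_commit_hash(const uint32_t *subvol_hash, size_t count)
{
    uint32_t merged = 0;

    for (size_t i = 0; i < count; i++) {
        if (merged == 0) {
            /* Nothing merged yet: take this subvol's hash. */
            merged = subvol_hash[i];
        } else if (merged != subvol_hash[i]) {
            /* Any mismatch, including a 0 from a null layout,
             * poisons the merged value. */
            merged = SKETCH_HASH_INVALID;
        }
    }
    return merged;
}

/* Hypothetical simplification of the check described for dht_lookup_cbk:
 * lookup everywhere can be skipped only if the parent's merged commit
 * hash matches the volume commit hash. */
static bool
sketch_can_skip_lookup_everywhere(uint32_t parent_hash, uint32_t vol_hash)
{
    return parent_hash == vol_hash;
}
```

With three subvols that all report the same non-zero hash, the merged value equals that hash and the lookup can be skipped. If a subvol with a null layout (hash 0) is merged after one with a non-zero hash, the merged value becomes the invalid marker and the comparison fails, which is the behaviour reported here.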


The same behaviour is seen in 3.5.0.

--- Additional comment from RHEL Product and Program Management on 2019-10-14 05:17:54 UTC ---

This bug is automatically being proposed for the next minor release of Red Hat Gluster Storage by setting the release flag 'rhgs-3.5.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Nithya Balachandran on 2019-10-14 05:34:21 UTC ---

As this does not affect file ops and is not a newly introduced issue, I am moving this to 3.5.0-beyond.

--- Additional comment from RHEL Product and Program Management on 2019-10-14 05:34:29 UTC ---

The RHGS 3.5.0 release is at a stage where only release blockers are accepted for fix.

This bug has been deferred from RHGS 3.5.0 since it has not been proposed as a release blocker within the expected time.

If there is sufficient data to justify this bug as a release blocker for RHGS 3.5.0, please provide the justification in a BZ comment, clear the '3.5.0-beyond' entry from the 'Internal Whiteboard' field, set the current proposed release flag to 'null', and set both the 'rhgs-3.5.0' release flag and the 'blocker' flag to '?'.

If you do not have the rights to set the 'blocker' flag, please contact RHGS QE to help set it.

Comment 1 Worker Ant 2019-11-18 13:07:29 UTC
REVIEW: https://review.gluster.org/23720 (cluster/dht: Set commit hash for null layouts) posted (#1) for review on master by N Balachandran

Comment 3 Worker Ant 2020-03-12 14:49:00 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/1069 and will be tracked there from now on. Visit the GitHub issue URL for further details.