Bug 1747844
Summary: | Rebalance doesn't work correctly if performance.parallel-readdir on and with some other specific options set
---|---
Product: | [Community] GlusterFS
Component: | distribute
Version: | 4.1
Status: | CLOSED WORKSFORME
Severity: | urgent
Priority: | unspecified
Hardware: | x86_64
OS: | Linux
Reporter: | Howard <Howard.Chen>
Assignee: | Nithya Balachandran <nbalacha>
CC: | bugs, nbalacha, pasik
Type: | Bug
Last Closed: | 2019-11-11 11:37:21 UTC
Description
Howard
2019-09-02 03:48:15 UTC
I'll take a look and get back to you.

Hi,

Apologies for the delay but I finally managed to spend some time on this. Here is what I have so far:

Release 4 is EOL so I tried with release-5. I used FUSE, not NFS, and could not reproduce the issue with rebalance: the contents of all directories were being migrated to the new bricks.

I did, however, see an issue where I could not list the directories from the FUSE mount immediately after they were created. This issue was not seen with parallel-readdir off.

[root@rhgs313-7 ~]# glusterd; gluster v create test 192.168.122.7:/bricks/brick1/t-{1..5} ; gluster v set test readdir-ahead on; gluster v set test parallel-readdir on; gluster v start test;
volume create: test: success: please start the volume to access data
volume set: success
volume set: success
volume start: test: success
[root@rhgs313-7 ~]# mount -t glusterfs -s 192.168.122.7:/test /mnt/fuse1
[root@rhgs313-7 ~]# cd /mnt/fuse1/; mkdir dir_1; mkdir dir_1/dir_2; mkdir dir_1/dir_2/dir_3; mkdir dir_1/dir_2/dir_3/dir_4
[root@rhgs313-7 fuse1]# ll
total 0

On further analysis, this was happening because the stat information for the dirs received in dht_readdirp_cbk was invalid, which causes dht to strip those entries out of the listing. This was fixed by https://review.gluster.org/#/c/glusterfs/+/21811/ and is available from release-6 onwards.

It is possible that the same issue occurred on your volume, so rebalance never processed these dirs. As the log level has been set to ERROR, there are no messages in the rebalance log which can be used to figure out what happened.

Please do the following:
1. Enable info level logging for client-log-level, reproduce the issue and send me the rebalance log.
2. Upgrade to release 6.x and see if you can still see the issue.

(In reply to Nithya Balachandran from comment #2)
> Hi,
>
> Apologies for the delay but I finally managed to spend some time on this.
> Here is what I have so far:
>
> Release 4 is EOL so I tried with release-5.
Apologies - 4 is not EOL yet. I retried the test above with the latest release-4.1 code and could not reproduce the rebalance problem. Please send the logs requested earlier and I will look into it.

I'm closing this with WorksForMe. Please reopen if you still see this in the latest releases.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
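
For anyone retesting the listing symptom from the analysis above (new directories not showing up on the FUSE mount right after creation), the check could be scripted roughly as follows. This is only a sketch: `check_new_dirs_listed` is a hypothetical helper, not part of GlusterFS, and the directory names simply mirror the ones used in the reproduction. It assumes the target directory is empty and writable.

```shell
#!/bin/sh
# Hypothetical helper (not a GlusterFS tool): create the nested directories
# from the reproduction under the given root and report every level that is
# missing from its parent's listing immediately after creation.
check_new_dirs_listed() {
    root=$1          # assumed to be the mount point, e.g. /mnt/fuse1
    path=$root
    missing=0
    for name in dir_1 dir_2 dir_3 dir_4; do
        mkdir "$path/$name" || return 2
        # List the parent and look for an exact, literal match of the new entry.
        if ! ls -A "$path" | grep -qxF "$name"; then
            echo "not listed: $path/$name"
            missing=1
        fi
        path=$path/$name
    done
    # 0 = every new directory was listed, 1 = at least one was missing
    return "$missing"
}
```

Calling `check_new_dirs_listed /mnt/fuse1` returns 0 when every new directory is visible and 1 when at least one is not, so the result can be compared between runs with parallel-readdir on and off.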