Bug 1726219
Summary: | Volume start failed when shd is down on one of the nodes in the cluster | | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Anees Patel <anepatel>
Component: | glusterd | Assignee: | Mohammed Rafi KC <rkavunga>
Status: | CLOSED DEFERRED | QA Contact: | Bala Konda Reddy M <bmekala>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.5 | CC: | amukherj, nchilaka, rhs-bugs, rkavunga, sheggodu, srakonde, storage-qa-internal, vbellur, vdas
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Linux | |
Whiteboard: | shd-multiplexing | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
Cloned To: | 1728766 (view as bug list) | Environment: |
Last Closed: | 2020-01-20 07:57:56 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1728766 | |
Description
Anees Patel 2019-07-02 11:04:56 UTC
Version-Release number of selected component:

# rpm -qa | grep gluster
glusterfs-cli-6.0-7.el7rhgs.x86_64
glusterfs-api-6.0-7.el7rhgs.x86_64
glusterfs-resource-agents-6.0-7.el7rhgs.noarch
python2-gluster-6.0-7.el7rhgs.x86_64
glusterfs-geo-replication-6.0-7.el7rhgs.x86_64
glusterfs-6.0-7.el7rhgs.x86_64
glusterfs-fuse-6.0-7.el7rhgs.x86_64
glusterfs-api-devel-6.0-7.el7rhgs.x86_64

Changing the component to CLI, as it is failing in volume start and giving inconsistent outputs in the status. It might need some attention from the glusterd folks, so CCing them as well.

From the reproducer:

5. Now start the volume from node 1:
# gluster v start test3
volume start: test3: failed: Commit failed on localhost. Please check log file for details.

The output says the volume start failed. Because the volume failed to start, the half-cooked "volume start" transaction might have written to the store that this volume is started. But since the commit failed, the commit request was not sent to the peers. That is why the peers show this volume as stopped while the originator shows it as started.

Here, we need to root-cause why the volume start transaction failed. I will follow the reproducer and try to reproduce this on my setup. It looks like it has some relation with shd too; I will update the BZ with details soon. Also, changing the component to glusterd for now.

Thanks,
Sanju

(In reply to Sanju from comment #4)
> From the reproducer:
> 5. Now start volume from node 1
> # gluster v start test3
> volume start: test3: failed: Commit failed on localhost. Please check log
> file for details.
> O/p says volume start failed.
>
> As the volume failed to start, the half-cooked "volume start" transaction
> might have written to the store that this volume is started. But as the
> commit failed, the commit request is not sent to the peers. That's why
> peers show this volume as stopped when the originator shows it as started.

No doubt, that is how it happened. The title of the bug is misleading now, and that is expected. Just like you mentioned, we should check why the volume start failed. Have we not looked at the respective glusterd logs to see what happened there?

> Here, we need to root cause why the volume start transaction has failed. I
> will follow the reproducer and try to reproduce this on my setup. Looks
> like it has some relation with shd too; will update the BZ with details
> soon.
>
> And, changing the component to glusterd for now.
>
> Thanks,
> Sanju

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
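The store inconsistency described above (originator shows the volume as started, peers still show it as stopped) can be observed directly in glusterd's on-disk store, which keeps per-volume state in `/var/lib/glusterd/vols/<volname>/info` (the `status=` line: 1 for started, 0 for stopped). Below is a minimal, hedged sketch of that check; the `/tmp/node1` and `/tmp/node2` directories are mocked stand-ins for the two nodes' store paths so the snippet is runnable anywhere, and the real check on a cluster would run the `grep` against `/var/lib/glusterd` on each node:

```shell
# Mock the glusterd store of two nodes for volume "test3".
mkdir -p /tmp/node1/vols/test3 /tmp/node2/vols/test3

# Originator (node 1): the half-cooked transaction persisted "started".
printf 'status=1\n' > /tmp/node1/vols/test3/info
# Peer (node 2): never received the commit, so the store still says "stopped".
printf 'status=0\n' > /tmp/node2/vols/test3/info

# On a real cluster, run on each node instead:
#   grep '^status=' /var/lib/glusterd/vols/test3/info
for node in node1 node2; do
    printf '%s: %s\n' "$node" "$(grep '^status=' /tmp/$node/vols/test3/info)"
done
# → node1: status=1
# → node2: status=0
```

If the two nodes print different `status=` values for the same volume, the stores have diverged exactly as the failed-commit analysis in comment #4 predicts.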