Description of problem:
=======================
Killed glusterd, glusterfs and glusterfsd on one of the storage servers. Removed a brick directory using "rm", then created a directory with the same name under the same path. Started glusterd and tried to start the volume with force; it failed with the error:

"volume start: vol-dis-rep: failed: Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /rhs/brick1/b1. Reason : No data available"

Version-Release number of selected component (if applicable):
=============================================================
[root@rhs-client11 ~]# rpm -qa | grep gluster
glusterfs-debuginfo-3.4.0.1rhs-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.1rhs-1.el6rhs.x86_64
gluster-swift-container-1.4.8-4.el6.noarch
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
vdsm-gluster-4.10.2-4.0.qa5.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.4.0.1rhs-1.el6rhs.x86_64
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-3.4.0.1rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.1rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.1rhs-1.el6rhs.x86_64
gluster-swift-object-1.4.8-4.el6.noarch
[root@rhs-client11 ~]#

How reproducible:
=================
1/1

Steps to Reproduce:
1. Create a 6x2 distributed-replicate volume across 4 storage nodes (bricks b1 to b12).
2. Mount the volume on FUSE and NFS clients.
3. On one of the storage nodes, run: killall glusterfs ; killall glusterfsd ; killall glusterd
4. Remove the brick directories on the same storage node where the processes were killed in step 3.
5. Recreate the brick directories under the same paths.
6. Start glusterd.
7. Start the volume with force.

Actual results:
===============
[root@rhs-client11 ~]# gluster v i

Volume Name: vol-dis-rep
Type: Distributed-Replicate
Volume ID: 5d6c5e6b-9ab5-450c-8fb1-9e33a16acb64
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.36.35:/rhs/brick1/b1
Brick2: 10.70.36.36:/rhs/brick1/b2
Brick3: 10.70.36.35:/rhs/brick1/b3
Brick4: 10.70.36.36:/rhs/brick1/b4
Brick5: 10.70.36.35:/rhs/brick1/b5
Brick6: 10.70.36.36:/rhs/brick1/b6
Brick7: 10.70.36.37:/rhs/brick1/b7
Brick8: 10.70.36.38:/rhs/brick1/b8
Brick9: 10.70.36.37:/rhs/brick1/b9
Brick10: 10.70.36.38:/rhs/brick1/b10
Brick11: 10.70.36.37:/rhs/brick1/b11
Brick12: 10.70.36.38:/rhs/brick1/b12
Options Reconfigured:
performance.io-cache: off
[root@rhs-client11 ~]#

[root@rhs-client11 ~]# killall glusterfs ; killall glusterfsd ; killall glusterd
[root@rhs-client11 ~]#
[root@rhs-client11 ~]# rm -rf /rhs/brick1/b*
[root@rhs-client11 ~]# mkdir /rhs/brick1/b1
[root@rhs-client11 ~]# mkdir /rhs/brick1/b3
[root@rhs-client11 ~]# mkdir /rhs/brick1/b5
[root@rhs-client11 ~]# ls /rhs/brick1/b
b1/  b3/  b5/
[root@rhs-client11 ~]# ls /rhs/brick1/b1
[root@rhs-client11 ~]#
[root@rhs-client11 ~]# service glusterd start
Starting glusterd:                                         [  OK  ]
[root@rhs-client11 ~]#
[root@rhs-client11 ~]# gluster volume start vol-dis-rep force
volume start: vol-dis-rep: failed: Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /rhs/brick1/b1. Reason : No data available
[root@rhs-client11 ~]#

Expected results:
=================
The volume should start successfully with "force".

Additional info:
================
The above case used to work on RHS 2.0.
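As a side note on the failure above: the error indicates that the recreated brick directory is missing the trusted.glusterfs.volume-id xattr that glusterd stamps on a brick when the volume is created. An illustrative way to confirm this (not part of the original run; getfattr comes from the attr package) is to query the xattr on the recreated brick and compare with a brick that was left untouched:

# query the recreated brick (the attribute is expected to be absent)
getfattr -n trusted.glusterfs.volume-id /rhs/brick1/b1

# compare with an untouched brick on another node, e.g. b2 on 10.70.36.36
getfattr -n trusted.glusterfs.volume-id -e hex /rhs/brick1/b2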
Rahul,

That is expected behavior. The only way to start the volume is to manually set the extended attribute 'trusted.glusterfs.volume-id' on the newly created brick directory to the volume's volume-id before attempting the start. The volume-id can be found in the file /var/lib/glusterd/vols/<volname>/info.

However, the log message could be changed to point the user at this workaround for starting the volume.
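For anyone who hits this, a minimal sketch of that workaround, using the volume name and volume-id visible in this report (on other setups the UUID, volume name and brick paths will differ; the xattr holds the binary UUID, hence the 0x... value with the dashes stripped):

# find the volume-id recorded by glusterd
grep volume-id /var/lib/glusterd/vols/vol-dis-rep/info

# stamp the recreated brick directory with that UUID (repeat for b3 and b5)
setfattr -n trusted.glusterfs.volume-id \
         -v 0x5d6c5e6b9ab5450c8fb19e33a16acb64 /rhs/brick1/b1

# then start the volume
gluster volume start vol-dis-rep force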
Krutika,

IMO a plain volume start should fail (to guard against cases where the brick was accidentally unmounted), but when force is used, the user is saying they know what they are doing, and the volume should start.
(In reply to comment #3)
> Rahul,
> 
> That is expected behavior.
> 
> The only way you can start the volume is by manually setting the extended
> attribute 'trusted.glusterfs.volume-id' to the volume-id of the volume, on
> the newly created brick directory, before attempting to start the volume.
> And the volume-id of the volume can be found in the file
> /var/lib/glusterd/vols/<volname>/info.
> 
> However, the log message could be changed to provide the workaround for
> starting the volume.

This is a regression; it used to work in RHS 2.0, and as mentioned in comment 4, "start force" should work here. We use this case to simulate a disk-replacement scenario.
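For context, the recovery flow we expect after a disk replacement is roughly the following (a sketch of the intent, not a procedure verified on this build): mount the replacement disk, recreate the empty brick directory, force-start the volume, and let self-heal repopulate the brick from its replica:

# after mounting the new disk and recreating the empty brick directory
gluster volume start vol-dis-rep force

# repopulate the new brick from its replica pair
gluster volume heal vol-dis-rep full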
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html