Description of problem:
=======================
Glusterd: volume creation fails if one of the bricks on the server is down.

Version-Release number of selected component (if applicable):
=============================================================

How reproducible:

Steps to Reproduce:
===================
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume with other, existing bricks. Volume creation fails with:

   volume create: test_123: failed: Staging failed on transformers.lab.eng.blr.redhat.com. Error: Brick: transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing or be contained by an existing brick

Actual results:

Expected results:

Additional info:
================
gdb session in glusterd_is_brickpath_available():

Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0 "Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0 "/rhs/brick11/test123") at glusterd-utils.c:1166
1166    {
(gdb) n
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1166    {
(gdb)
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178            if (!realpath (path, tmp_path)) {
(gdb)
1179                    if (errno != ENOENT) {
(gdb)
1183            strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200                            if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187                    cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192                            if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193                                    if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170            gf_boolean_t available = _gf_false;
(gdb)
1207    }
(gdb) p available
$6 = _gf_false
(gdb)

Filesystem                                     Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root              50G   20G   31G  39% /
devtmpfs                                        32G     0   32G   0% /dev
tmpfs                                           32G   36M   32G   1% /dev/shm
tmpfs                                           32G  3.4G   28G  11% /run
tmpfs                                           32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                      494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home             477G   13G  464G   3% /home
tmpfs                                          6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                    1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                    1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                    1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                    1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                    1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                    1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                    1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                    1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                    1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                  1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                  1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                  1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier       1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2    4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2    4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new  1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod  1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new   1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new
Hi Rajesh, can you verify whether the brick was already used by another volume, i.e. confirm that no .glusterfs directory is present on the brick while creating the new volume?
As Gaurav mentioned in #c2, it seems you have tried to reuse a brick which is, or was earlier, used for another gluster volume; that is exactly what the error message says. I strongly believe this is not a bug. Please confirm.
After going through the code, it looks like a bug. If the realpath() call fails with EIO (which indicates the underlying filesystem of an existing brick may have a problem), we report the candidate path as not available instead of skipping that brick path and continuing the scan.
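A minimal sketch of that fix (hypothetical names and shape, simplified from the real per-volume brick loop; the authoritative change is the upstream patch): a brick whose realpath() fails with anything other than ENOENT is skipped, and only a genuine prefix match against a healthy brick rejects the candidate path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the fixed loop body.
 * realpath_errs[i] is the errno realpath() would set for existing brick i
 * (0 = success); conflicts[i] says whether that brick's resolved path
 * would prefix-match the candidate path (_is_prefix() in the real code). */
static bool is_brickpath_available_fixed(const int realpath_errs[],
                                         const bool conflicts[], int nbricks)
{
    for (int i = 0; i < nbricks; i++) {
        /* A brick whose filesystem is unreadable (e.g. EIO after an XFS
         * crash) is skipped rather than treated as a conflict. */
        if (realpath_errs[i] != 0 && realpath_errs[i] != ENOENT)
            continue;
        if (conflicts[i])
            return false;          /* genuine overlap with the new path */
    }
    return true;                   /* no conflicting brick found */
}
```

In this model, a crashed brick no longer blocks creation, while a real path overlap on a healthy brick still does.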
Upstream patch http://review.gluster.org/13258 is posted for review.
The development team was able to re-create the problem.
The fix is now available in the rhgs-3.1.3 branch, hence moving the state to Modified.
Verified this bug using the build "glusterfs-3.7.9-1".

Steps followed:
===============
1. Created a 1x2 volume using a one-node cluster and started it.
2. Crashed the underlying XFS for one of the volume's bricks using the "godown" tool.
3. Created a new volume using bricks not part of the volume created in step 1; the new volume was created successfully.

With this fix, the reported issue works fine. Moving to the Verified state.

Note: Issues found around this fix will be tracked in separate bugs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240