+++ This bug was initially created as a clone of Bug #1299710 +++
+++ This bug was initially created as a clone of Bug #1299432 +++

Description of problem:
===============
Glusterd: Creation of a volume fails if one of the bricks on the server is down

Version-Release number of selected component (if applicable):
=========

How reproducible:

Steps to Reproduce:
============
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume with other existing bricks. Volume creation fails with:

volume create:test_123: failed: Staging failed on transformers.lab.eng.blr.redhat.com. Error: Brick: transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing or be contained by an existing brick

Actual results:

Expected results:

Additional info:
==============
Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0 "Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0 "/rhs/brick11/test123") at glusterd-utils.c:1166
1166    {
(gdb) n
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1166    {
(gdb)
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171            char tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178            if (!realpath (path, tmp_path)) {
(gdb)
1179            if (errno != ENOENT) {
(gdb)
1183            strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200            if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187            cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192            if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193            if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170            gf_boolean_t available = _gf_false;
(gdb)
1207    }
(gdb) p available
$6 = _gf_false
(gdb)

Filesystem                                      Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root               50G   20G   31G  39% /
devtmpfs                                         32G     0   32G   0% /dev
tmpfs                                            32G   36M   32G   1% /dev/shm
tmpfs                                            32G  3.4G   28G  11% /run
tmpfs                                            32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                       494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home              477G   13G  464G   3% /home
tmpfs                                           6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                     1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                     1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                     1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                     1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                     1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                     1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                     1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                     1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                     1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                   1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                   1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                   1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier        1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new   1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2     4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2     4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new   1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new   1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod   1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new    1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-18 06:22:44 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Gaurav Kumar Garg on 2016-01-18 08:15:37 EST ---

Hi Rajesh,

Can you make sure that the brick is not already used by another volume, i.e. that the .glusterfs directory is not present, when creating the new volume?

--- Additional comment from Atin Mukherjee on 2016-01-18 09:13:50 EST ---

As Gaurav mentioned in #c2, it seems like you have tried to reuse a brick which is, or was earlier, used for another gluster volume; that's exactly what the error message says. I strongly believe this is not a bug. Please confirm.

--- Additional comment from Byreddy on 2016-01-18 12:05:25 EST ---

Hi Gaurav,

The issue here is that a brick went down because of an XFS crash. After that he tried to create a new volume using other bricks in that VM (not used for any volume), and volume creation is refused with the error message mentioned in the description.

Similar XFS crash with gluster link - http://oss.sgi.com/archives/xfs/2013-01/msg00059.html

Thanks

--- Additional comment from Atin Mukherjee on 2016-01-18 23:43:18 EST ---

After going through the code, it looks like a bug.
If the realpath () call fails with EIO (which indicates that the underlying file system of an existing brick may have a problem), we report the new path as not available instead of skipping that brick path.

--- Additional comment from Vijay Bellur on 2016-01-19 00:37:06 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: Skip brickpath validation if realpath returns EIO) posted (#1) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-01-19 07:06:25 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: remove glusterd_is_brickpath_available () check) posted (#2) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-01-21 00:44:45 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available) posted (#3) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-02-05 04:54:24 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available) posted (#4) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-02-17 07:09:00 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available) posted (#5) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-02-22 04:32:32 EST ---

REVIEW: http://review.gluster.org/13258 (glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available) posted (#6) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-02-29 06:55:31 EST ---

COMMIT: http://review.gluster.org/13258 committed in master by Jeff Darcy (jdarcy)
------
commit a60c39de31e8258cb56d8db6bd8ec2491a942a4e
Author: Atin Mukherjee <amukherj>
Date:   Tue Jan 19 10:45:22 2016 +0530

    glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available

    glusterd_is_brickpath_available () used to call realpath() for checking
    whether the new brick path matches with the existing ones. The problem with
    this is if the underlying file system is bad for any one of the existing
    bricks then realpath() would fail and we wouldn't allow to create the new
    brick even if it should be allowed.

    Fix is to use string comparison with having a new field real_path in
    brickinfo to store the absolute path

    Change-Id: I1250ea5345f00fca0f6128056ebd08750d604f0a
    BUG: 1299710
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/13258
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
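
The string-comparison approach described in the commit message can be illustrated with a minimal sketch. It assumes that each existing brick's resolved path was cached when the brick was added (the role the commit gives to the new real_path field), so the availability check never touches the file system of an existing brick. The names paths_overlap and brickpath_available are hypothetical and this is not the glusterd implementation; paths are assumed to be normalized, with no trailing slash.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the two paths are equal or one is a
 * parent directory of the other, judged purely by string comparison. */
static bool paths_overlap(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t n = la < lb ? la : lb;

    if (strncmp(a, b, n) != 0)
        return false;
    if (la == lb)
        return true;

    /* The longer path must continue with '/' right after the shared
     * prefix, so "/rhs/brick1" does not match "/rhs/brick11". */
    return (la > lb ? a : b)[n] == '/';
}

/* Hypothetical stand-in for the availability check: the candidate is
 * compared against real paths cached at brick-add time, so a failed
 * file system under an existing brick cannot make the check fail. */
static bool brickpath_available(const char *candidate,
                                const char *const *stored_real_paths,
                                size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (paths_overlap(stored_real_paths[i], candidate))
            return false;
    }
    return true;
}
```

With the paths from the gdb session above, a candidate of /rhs/brick11/test123 would be reported as available against an existing brick at /rhs/brick1/afr1x2_attach_hot, regardless of whether the file system under /rhs/brick1 still responds.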
REVIEW: http://review.gluster.org/13550 (glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available) posted (#1) for review on release-3.7 by Atin Mukherjee (amukherj)
COMMIT: http://review.gluster.org/13550 committed in release-3.7 by Atin Mukherjee (amukherj)
------
commit cdc96d5d4c6247d9b0ca942eeb37338dacfe93ee
Author: Atin Mukherjee <amukherj>
Date:   Tue Jan 19 10:45:22 2016 +0530

    glusterd: use string comparison for realpath checks in glusterd_is_brickpath_available

    Backport of http://review.gluster.org/13258

    glusterd_is_brickpath_available () used to call realpath() for checking
    whether the new brick path matches with the existing ones. The problem with
    this is if the underlying file system is bad for any one of the existing
    bricks then realpath() would fail and we wouldn't allow to create the new
    brick even if it should be allowed.

    Fix is to use string comparison with having a new field real_path in
    brickinfo to store the absolute path

    Change-Id: I1250ea5345f00fca0f6128056ebd08750d604f0a
    BUG: 1312878
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/13258
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
    Reviewed-on: http://review.gluster.org/13550
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.9, please open a new bug report.

glusterfs-3.7.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user