Bug 1299432

Summary: Glusterd: Creation of volume is failing if one of the brick is down on the server
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: RajeshReddy <rmekala>
Component: glusterd
Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA
QA Contact: Byreddy <bsrirama>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: asrivast, bsrirama, mzywusko, rhinduja, rhs-bugs, rmekala, sasundar, smohan, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1299710 (view as bug list)
Environment:
Last Closed: 2016-06-23 05:02:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1299184, 1299710, 1312878

Description RajeshReddy 2016-01-18 11:22:40 UTC
Description of problem:
===============
Glusterd: Creation of volume is failing if one of the brick is down on the server 

Version-Release number of selected component (if applicable):
=========


How reproducible:


Steps to Reproduce:
============
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume with other existing bricks; volume creation fails with:

volume create:test_123: failed: Staging failed on transformers.lab.eng.blr.redhat.com. Error: Brick: transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing or be contained by an existing brick

Actual results:


Expected results:


Additional info:
==============


Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0 "Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0 "/rhs/brick11/test123") at glusterd-utils.c:1166
1166    {
(gdb) n
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1166    {
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178            if (!realpath (path, tmp_path)) {
(gdb)
1179                    if (errno != ENOENT) {
(gdb)
1183                    strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200                            if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187                    cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192                            if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193                                if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170            gf_boolean_t            available  = _gf_false;
(gdb)
1207    }
(gdb) p available
$6 = _gf_false
(gdb) 
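
In the trace above, realpath () on the existing brick path /rhs/brick1/afr1x2_attach_hot fails with errno 5, which is EIO on Linux; the check at line 1193 only handles ENOENT, so the function bails out and returns _gf_false for the unrelated new brick path. A minimal standalone sketch (hypothetical, not glusterd code) that classifies a realpath () failure the same way:

/* realpath_probe.c - hypothetical standalone sketch, not glusterd code.
 * Shows how a realpath() failure can be classified: a missing path gives
 * ENOENT, while a path on a crashed filesystem typically gives EIO (5). */
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (int argc, char *argv[])
{
        char resolved[PATH_MAX + 1] = {0};

        if (argc != 2) {
                fprintf (stderr, "usage: %s <path>\n", argv[0]);
                return 1;
        }

        if (!realpath (argv[1], resolved)) {
                if (errno == ENOENT)
                        printf ("%s: ENOENT - the path simply does not exist\n",
                                argv[1]);
                else
                        printf ("%s: %s (errno %d) - the path may exist but be "
                                "unreadable, e.g. on a crashed XFS\n",
                                argv[1], strerror (errno), errno);
                return 1;
        }

        printf ("%s -> %s\n", argv[1], resolved);
        return 0;
}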

Filesystem                                              Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root                       50G   20G   31G  39% /
devtmpfs                                                 32G     0   32G   0% /dev
tmpfs                                                    32G   36M   32G   1% /dev/shm
tmpfs                                                    32G  3.4G   28G  11% /run
tmpfs                                                    32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                               494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home                      477G   13G  464G   3% /home
tmpfs                                                   6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                             1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                             1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                             1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                             1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                             1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                             1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                             1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                             1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                             1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                           1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                           1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                           1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier                1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod           1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new            1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new

Comment 2 Gaurav Kumar Garg 2016-01-18 13:15:37 UTC
Hi Rajesh,

Can you check whether the brick is already used by another volume, i.e. confirm that a .glusterfs directory is not present on it, while creating the new volume?

Comment 3 Atin Mukherjee 2016-01-18 14:13:50 UTC
As Gaurav mentioned in #c2, it seems like you have tried to reuse a brick which is, or was earlier, used for another gluster volume; that is exactly what the error message says. I strongly believe this is not a bug. Please confirm.

Comment 5 Atin Mukherjee 2016-01-19 04:43:18 UTC
After going through the code, it looks like a bug. If the realpath() call fails with EIO (which indicates the underlying file system of an existing brick may have a problem), then we report that the path is not available instead of skipping that brick path.
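
To make the failing path concrete, here is a simplified, hypothetical sketch of the availability check; the names and structure are illustrative only and are not the actual glusterd_is_brickpath_available() code or the patch that eventually went in. The idea is that a non-ENOENT realpath () failure on an existing brick (for example EIO from a crashed XFS) should fall back to comparing the stored brick path instead of rejecting the unrelated new brick:

/* Hypothetical, simplified sketch of a brick-path availability check; it is
 * not the actual glusterd_is_brickpath_available() implementation. */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* True if 'a' is a path prefix of 'b' on a component boundary, so
 * "/rhs/brick1" prefixes "/rhs/brick1/vol" but not "/rhs/brick11/vol". */
static bool
is_path_prefix (const char *a, const char *b)
{
        size_t len = strlen (a);

        if (strncmp (a, b, len) != 0)
                return false;
        return b[len] == '\0' || b[len] == '/';
}

/* Returns true if new_path neither contains nor is contained by any of the
 * existing brick paths. */
static bool
brickpath_available (const char *new_path, const char *bricks[], int count)
{
        char tmp_new[PATH_MAX + 1]   = {0};
        char tmp_brick[PATH_MAX + 1] = {0};
        int  i;

        /* The candidate path usually does not exist yet, so fall back to the
         * literal string when realpath() cannot resolve it. */
        if (!realpath (new_path, tmp_new))
                strncpy (tmp_new, new_path, PATH_MAX);

        for (i = 0; i < count; i++) {
                if (!realpath (bricks[i], tmp_brick)) {
                        /* Pre-fix behaviour tolerated only ENOENT here; any
                         * other error, such as EIO from a crashed XFS, made
                         * the whole check report "not available". Falling
                         * back to the stored path on every failure stops one
                         * bad brick from blocking unrelated volume creation. */
                        strncpy (tmp_brick, bricks[i], PATH_MAX);
                }
                if (is_path_prefix (tmp_brick, tmp_new) ||
                    is_path_prefix (tmp_new, tmp_brick))
                        return false;
        }
        return true;
}

int
main (void)
{
        const char *bricks[] = { "/rhs/brick1/afr1x2_attach_hot" };

        printf ("/rhs/brick11/test123 is %s\n",
                brickpath_available ("/rhs/brick11/test123", bricks, 1)
                ? "available" : "not available");
        return 0;
}

Run against the paths from the trace (/rhs/brick11/test123 vs. /rhs/brick1/afr1x2_attach_hot), this sketch reports the new path as available even when realpath () fails on the existing brick.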

Comment 6 Atin Mukherjee 2016-01-19 05:38:59 UTC
Upstream patch http://review.gluster.org/13258 is posted for review

Comment 7 RajeshReddy 2016-01-19 05:45:47 UTC
The development team is able to re-create the problem.

Comment 9 Atin Mukherjee 2016-03-22 12:06:00 UTC
The fix is now available in the rhgs-3.1.3 branch, hence moving the state to Modified.

Comment 11 Byreddy 2016-04-11 04:58:39 UTC
Verified this bug using the build "glusterfs-3.7.9-1"

Steps followed:
===============
1. Created a 1x2 volume using a one-node cluster and started it.
2. Crashed the underlying XFS for one of the volume's bricks using the "godown" tool.
3. Created a new volume using bricks that are not part of the volume created in step 1; the new volume was created successfully.


With this fix, the reported issue no longer occurs.

Moving to the Verified state.


Note: Issues found around this fix will be tracked in different bugs.

Comment 13 errata-xmlrpc 2016-06-23 05:02:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240