Bug 1299432 - Glusterd: Creation of volume fails if one of the bricks is down on the server
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.3
Assigned To: Atin Mukherjee
QA Contact: Byreddy
Keywords: ZStream
Depends On:
Blocks: 1299184 1299710 1312878
Reported: 2016-01-18 06:22 EST by RajeshReddy
Modified: 2016-09-17 12:44 EDT

See Also:
Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1299710
Environment:
Last Closed: 2016-06-23 01:02:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description RajeshReddy 2016-01-18 06:22:40 EST
Description of problem:
===============
Volume creation fails in glusterd if one of the bricks on the server is down.

Version-Release number of selected component (if applicable):
=========


How reproducible:


Steps to Reproduce:
============
1. Make sure one of the bricks is down due to an XFS crash.
2. Create a new volume using other existing bricks. Volume creation fails with:

volume create:test_123: failed: Staging failed on transformers.lab.eng.blr.redhat.com. Error: Brick: transformers:/rhs/brick7/dv2-3_rajesh_22 not available. Brick may be containing or be contained by an existing brick

Actual results:


Expected results:


Additional info:
==============


Breakpoint 1, glusterd_is_brickpath_available (uuid=0x7f2b4000d8c0 "Z\323\066^n\020J\b\202\273\361\346\tQ\344\026", path=0x7f2b40009fb0 "/rhs/brick11/test123") at glusterd-utils.c:1166
1166    {
(gdb) n
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1166    {
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1171            char                    tmp_path[PATH_MAX+1] = {0};
(gdb)
1172            char                    tmp_brickpath[PATH_MAX+1] = {0};
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1174            priv = THIS->private;
(gdb)
1176            strncpy (tmp_path, path, PATH_MAX);
(gdb)
1178            if (!realpath (path, tmp_path)) {
(gdb)
1179                    if (errno != ENOENT) {
(gdb)
1183                    strncpy(tmp_path,path,PATH_MAX);
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb) p tmp_path
$1 = "/rhs/brick11/test123", '\000' <repeats 4076 times>
(gdb) n
1200                            if (_is_prefix (tmp_brickpath, tmp_path))
(gdb)
1186            cds_list_for_each_entry (volinfo, &priv->volumes, vol_list) {
(gdb)
1187                    cds_list_for_each_entry (brickinfo, &volinfo->bricks,
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb)
1189                            if (gf_uuid_compare (uuid, brickinfo->uuid))
(gdb) p brickinfo
$2 = (glusterd_brickinfo_t *) 0x7f2b6d5d1120
(gdb) p brickinfo.hostname
$3 = "transformers.lab.eng.blr.redhat.com", '\000' <repeats 988 times>
(gdb) p brickinfo.path
$4 = "/rhs/brick1/afr1x2_attach_hot", '\000' <repeats 4066 times>
(gdb) n
1192                            if (!realpath (brickinfo->path, tmp_brickpath)) {
(gdb) n
1193                                if (errno == ENOENT)
(gdb) p errno
$5 = 5
(gdb) n
1170            gf_boolean_t            available  = _gf_false;
(gdb)
1207    }
(gdb) p available
$6 = _gf_false
(gdb) 

Filesystem                                              Size  Used Avail Use% Mounted on
/dev/mapper/rhel_transformers-root                       50G   20G   31G  39% /
devtmpfs                                                 32G     0   32G   0% /dev
tmpfs                                                    32G   36M   32G   1% /dev/shm
tmpfs                                                    32G  3.4G   28G  11% /run
tmpfs                                                    32G     0   32G   0% /sys/fs/cgroup
/dev/sda1                                               494M  159M  336M  33% /boot
/dev/mapper/rhel_transformers-home                      477G   13G  464G   3% /home
tmpfs                                                   6.3G     0  6.3G   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1                             1.9T  323G  1.5T  18% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                             1.9T   57G  1.8T   4% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                             1.9T   57G  1.8T   4% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                             1.9T   57G  1.8T   4% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                             1.9T   57G  1.8T   4% /rhs/brick5
/dev/mapper/RHS_vg6-RHS_lv6                             1.9T   57G  1.8T   4% /rhs/brick6
/dev/mapper/RHS_vg7-RHS_lv7                             1.9T  1.4G  1.9T   1% /rhs/brick7
/dev/mapper/RHS_vg8-RHS_lv8                             1.9T  1.4G  1.9T   1% /rhs/brick8
/dev/mapper/RHS_vg9-RHS_lv9                             1.9T  1.4G  1.9T   1% /rhs/brick9
/dev/mapper/RHS_vg10-RHS_lv10                           1.9T  4.2G  1.8T   1% /rhs/brick10
/dev/mapper/RHS_vg11-RHS_lv11                           1.9T  4.2G  1.8T   1% /rhs/brick11
/dev/mapper/RHS_vg12-RHS_lv12                           1.9T  4.2G  1.8T   1% /rhs/brick12
ninja.lab.eng.blr.redhat.com:afr2x2_tier                1.9T  567G  1.3T  31% /mnt/glusterfs
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC
ninja.lab.eng.blr.redhat.com:/disperse_vol2             4.2T  3.6T  596G  86% /mnt/glusterfs_EC_NO
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_new           1.9T  567G  1.3T  31% /mnt/glusterfs2_new2
ninja.lab.eng.blr.redhat.com:/afr2x2_tier_mod           1.9T  564G  1.3T  31% /mnt/glusterfs2_mod
ninja.lab.eng.blr.redhat.com:afr2x2_tier_new            1.9T  567G  1.3T  31% /mnt/afr2x2_tier_new
Comment 2 Gaurav Kumar Garg 2016-01-18 08:15:37 EST
Hi Rajesh,

Can you confirm that the brick is not already used by another volume, i.e. that no .glusterfs directory exists on it when you create the new volume?
Comment 3 Atin Mukherjee 2016-01-18 09:13:50 EST
As Gaurav mentioned in #c2, it seems you tried to reuse a brick which is, or was earlier, used by another gluster volume; that is exactly what the error message says. I strongly believe this is not a bug. Please confirm.
Comment 5 Atin Mukherjee 2016-01-18 23:43:18 EST
After going through the code, this does look like a bug. If the realpath() call on an existing brick path fails with EIO (which indicates that the underlying file system of that brick may have a problem; the gdb session above shows errno = 5, which is EIO on Linux), we report the new path as not available instead of skipping that brick path.
Comment 6 Atin Mukherjee 2016-01-19 00:38:59 EST
Upstream patch http://review.gluster.org/13258 is posted for review
Comment 7 RajeshReddy 2016-01-19 00:45:47 EST
The development team was able to reproduce the problem.
Comment 9 Atin Mukherjee 2016-03-22 08:06:00 EDT
The fix is now available in rhgs-3.1.3 branch, hence moving the state to Modified.
Comment 11 Byreddy 2016-04-11 00:58:39 EDT
Verified this bug using the build "glusterfs-3.7.9-1"

Steps followed:
===============
1. Created a 1x2 volume on a one-node cluster and started it.
2. Crashed the underlying XFS file system of one of the volume's bricks using the "godown" tool.
3. Created a new volume using bricks that are not part of the volume created in step 1; the new volume was created successfully.


With this fix, the reported issue no longer occurs.

Moving to verified state.


Note: Issues found around this fix will be tracked in different bugs.
Comment 13 errata-xmlrpc 2016-06-23 01:02:49 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240
