Bug 1001577 - quota build 3: volume start failed
quota build 3: volume start failed
Status: CLOSED DUPLICATE of bug 979861
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
2.1
x86_64 Linux
Priority: medium  Severity: high
: ---
: RHGS 2.1.2
Assigned To: Kaushal
Saurabh
: ZStream
Depends On:
Blocks:
Reported: 2013-08-27 06:35 EDT by Saurabh
Modified: 2016-01-19 01:12 EST (History)
7 users

See Also:
Fixed In Version: glusterfs-3.4.0.49rhs
Doc Type: Bug Fix
Doc Text:
Previously, glusterd listened on TCP port 24007 for CLI requests, and could reject CLI requests arriving from unprivileged ports (>1024), leading to CLI command execution failures. With this fix, glusterd listens for CLI requests on a socket file (/var/run/glusterd.sock), preventing these CLI command execution failures.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-03 02:31:32 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Saurabh 2013-08-27 06:35:26 EDT
Description of problem:
the volume has quota enabled
the volume is stopped while quota is still enabled
a subsequent volume start fails

Version-Release number of selected component (if applicable):
glusterfs-fuse-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-libs-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-api-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-server-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.20rhsquota5-1.el6rhs.x86_64


How reproducible:
always

Steps to Reproduce:
1. create a volume of 6x2, start it
2. enable quota,
3. mount over nfs
4. create some directories.
5. set quota on the directories and root of the volume
6. stop the volume
7. start the volume 
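
The reproduction steps above can be sketched as a CLI sequence. This is an illustrative transcript, not the exact commands from this report: the hostnames, brick paths, mount point, and quota limits are placeholders, and only the first two brick pairs are spelled out.

```shell
# Illustrative sketch of the reproduction steps; all names and limits
# below are placeholders, not taken from this report.
gluster volume create dist-rep3 replica 2 \
    host1:/rhs/bricks/d1r1 host2:/rhs/bricks/d1r2 \
    host3:/rhs/bricks/d2r1 host4:/rhs/bricks/d2r2   # continue to 6x2 = 12 bricks
gluster volume start dist-rep3
gluster volume quota dist-rep3 enable
mount -t nfs -o vers=3 host1:/dist-rep3 /mnt/dist-rep3
mkdir /mnt/dist-rep3/dir1 /mnt/dist-rep3/dir2
gluster volume quota dist-rep3 limit-usage / 10GB
gluster volume quota dist-rep3 limit-usage /dir1 1GB
gluster volume stop dist-rep3
gluster volume start dist-rep3    # observed failure: "Commit failed on ..."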

Now, after this if I do these steps,
a. quota list --- this is a pass
b. gluster volume info <volume name> --- this shows <volume name> volume is started
c. gluster volume status  <volume name>
   shows this response string,
   "Staging failed on 10.70.37.7. Error: Volume <volume-name> is not started"

Actual results:
Result of steps 6 and 7:
-------------------------
[root@rhsauto034 ~]# gluster volume stop --mode=script dist-rep3
volume stop: dist-rep3: success
[root@rhsauto034 ~]# gluster volume info dist-rep3

Volume Name: dist-rep3
Type: Distributed-Replicate
Volume ID: b305d605-3b96-4278-9005-e8249e4bb7f7
Status: Stopped
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3
Brick2: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3
Brick3: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3
Brick4: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3
Brick5: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3
Brick6: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3
Brick7: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3
Brick8: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3
Brick9: rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3
Brick10: rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3
Brick11: rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3
Brick12: rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3
Options Reconfigured:
features.quota: on
[root@rhsauto034 ~]# ps -eaf | grep quotad
root       525     1  0 06:04 ?        00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/56f694ad321d4c09fd535f813a2aa43a.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off
root       552 15055  0 06:04 pts/2    00:00:00 grep quotad
[root@rhsauto034 ~]# gluster volume start dist-rep3
volume start: dist-rep3: failed: Commit failed on 10.70.37.7. Please check log file for details.

Expected results:
start should not fail

Additional info:

The quotad process is still running between the stop and the start because one more volume with quota enabled exists on the cluster.
Comment 3 Kaushal 2013-09-30 07:47:11 EDT
The problem here is not with quota. rhsauto032 ran out of privileged ports (which may or may not have been due to quota; most likely it was because of running gluster commands). The brick process on rhsauto032 connected to glusterd to fetch the brick volfile using an insecure port (>1024). Currently glusterd (and gluster as a whole) rejects incoming requests from insecure ports. Since the brick process couldn't get its volfile, it failed to start, and this led to the inconsistent state observed in the bug report.

The current workaround for this issue is to set the option 'management.rpc-auth-allow-insecure on' in /etc/glusterfs/glusterd.vol and restart glusterd. Setting this option allows requests from insecure ports.
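
Concretely, the workaround lands inside the management volume definition in /etc/glusterfs/glusterd.vol. The block below is a sketch of a typical stock file, and the option line shown (`option rpc-auth-allow-insecure on`) is the usual vol-file spelling; compare against the file actually shipped before editing.

```
# /etc/glusterfs/glusterd.vol (sketch of a typical stock file)
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option rpc-auth-allow-insecure on    # workaround: accept requests from insecure ports
end-volume
```

After saving the file, restart the management daemon (e.g. `service glusterd restart`) so the option takes effect.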

There have been patches posted upstream for the following downstream bugs which track the unprivileged ports issue.
1. https://bugzilla.redhat.com/show_bug.cgi?id=979926 -> upstream bug https://bugzilla.redhat.com/show_bug.cgi?id=980746
2. https://bugzilla.redhat.com/show_bug.cgi?id=979861 -> upstream bug https://bugzilla.redhat.com/show_bug.cgi?id=980754

The upstream patches haven't been accepted yet because of some regression failures. Once those patches are accepted, they can be backported downstream to the u1 branch.
Comment 4 Vivek Agarwal 2013-10-18 10:24:56 EDT
Want to let the patches soak in for U2. Removing from the u1 list.
Comment 5 Kaushal 2013-12-13 02:29:37 EST
The fix for 979861 also addresses this. Moving to ON_QA.
Comment 6 Saurabh 2013-12-19 00:08:29 EST
Didn't see the same problem again; tried several times on glusterfs-3.4.0.49rhs.
Comment 7 Pavithra 2014-01-03 01:06:15 EST
Can you please verify this doc text for technical accuracy?
Comment 8 Kaushal 2014-01-03 02:31:32 EST
Closing this bug as dup of 979861, as this bug is just a specific incarnation of it.

*** This bug has been marked as a duplicate of bug 979861 ***
