Bug 1213893

Summary: rebalance stuck at 0 byte when auth.allow is set
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Cedric Buissart <cbuissar>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED ERRATA
QA Contact: RajeshReddy <rmekala>
Severity: high
Docs Contact:
Priority: medium
Version: rhgs-3.0
CC: bhubbard, bkunal, divya, mzywusko, nbalacha, rhs-bugs, rmekala, sasundar, storage-qa-internal, vagarwal
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.1
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.7.1-12
Doc Type: Bug Fix
Doc Text:
Previously, the brick processes did not consider rebalance processes to be trusted clients. As a consequence, if the auth.allow option was set for a volume, connections from the rebalance processes for that volume were rejected by the brick processes, causing rebalance to hang. With this fix, the rebalance process is treated as a trusted client by the brick processes. Now, the rebalance works even if the auth.allow option is set for a volume.
Clone Of:
: 1248415 (view as bug list)
Last Closed: 2015-10-05 07:08:14 UTC
Type: Bug
Bug Depends On:
Bug Blocks: 1248415, 1251815, 1253542

Description Cedric Buissart 2015-04-21 13:52:26 UTC
Description of problem:

When auth.allow is set, rebalance gets stuck unless the IPs of the gluster nodes themselves are included in the list.
The rebalance stays 'in progress', but never moves past 0 bytes.



                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
                             rhs30-node3                0        0Bytes             0             0             0          in progress               0.00
                             rhs30-node4                0        0Bytes             0             0             0          in progress               0.00
                         192.168.100.206                0        0Bytes             0             0             0          in progress               0.00
volume rebalance: thingluster: success:
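
For anyone who wants to detect this state from a script, here is a rough Python sketch that parses status output laid out like the table above. The function names (parse_rebalance_status, looks_stuck) are made up for illustration, and the column layout is assumed to match this report exactly; other gluster versions may print the columns differently. A rebalance that has only just started looks the same in a single snapshot, so the check is only meaningful if it keeps returning True over several polls.

```python
SAMPLE = """\
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress               0.00
                             rhs30-node3                0        0Bytes             0             0             0          in progress               0.00
                             rhs30-node4                0        0Bytes             0             0             0          in progress               0.00
                         192.168.100.206                0        0Bytes             0             0             0          in progress               0.00
volume rebalance: thingluster: success:
"""


def parse_rebalance_status(text):
    """Turn the data rows of the status table into a list of dicts.

    Assumes the column order shown in this bug report.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have a node name followed by a numeric file count;
        # the header, the dashed separator and the trailing summary do not.
        if len(parts) < 8 or not parts[1].isdigit():
            continue
        rows.append({
            "node": parts[0],
            "files": int(parts[1]),
            "size": parts[2],
            "scanned": int(parts[3]),
            "failures": int(parts[4]),
            "skipped": int(parts[5]),
            # The status can be two words ("in progress"), so take
            # everything between the counters and the run time.
            "status": " ".join(parts[6:-1]),
            "run_time": float(parts[-1]),
        })
    return rows


def looks_stuck(rows):
    """True if every node is 'in progress' with nothing scanned or moved."""
    return bool(rows) and all(
        r["status"] == "in progress" and r["scanned"] == 0 and r["files"] == 0
        for r in rows
    )


if __name__ == "__main__":
    print(looks_stuck(parse_rebalance_status(SAMPLE)))
```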

In the brick logs, we can see the authentication being rejected:
[2015-04-21 13:43:03.131329] E [server-handshake.c:589:server_setvolume] 0-thingluster-server: Cannot authenticate client from cbuissar-rhs30-node1-8521-2015/04/21-13:42:58:108057-thingluster-client-0-0-0 3.6.0.53
[2015-04-21 13:43:08.419405] E [authenticate.c:239:gf_authenticate] 0-auth: no authentication module is interested in accepting remote-client (null)
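
To check a brick log for the same symptom, something along these lines could be used. This is only a sketch: the regular expression is inferred from the two log lines quoted above, the marker strings are copied from those messages, and find_auth_failures is a hypothetical helper name, not part of any gluster tooling.

```python
import re

# Layout inferred from the quoted lines:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<domain>\S+):\s+(?P<msg>.*)$"
)

# Message fragments taken from the brick log excerpts in this report.
AUTH_MARKERS = (
    "Cannot authenticate client",
    "no authentication module is interested",
)


def find_auth_failures(lines):
    """Return (timestamp, source, message) for error lines that match a marker."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m or m.group("level") != "E":
            continue
        if any(marker in m.group("msg") for marker in AUTH_MARKERS):
            hits.append((m.group("ts"), m.group("src"), m.group("msg")))
    return hits


if __name__ == "__main__":
    sample = [
        "[2015-04-21 13:43:08.419405] E [authenticate.c:239:gf_authenticate] "
        "0-auth: no authentication module is interested in accepting "
        "remote-client (null)",
    ]
    for ts, src, msg in find_auth_failures(sample):
        print(ts, src, msg)
```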


Version-Release number of selected component (if applicable): tested on 3.0u3 and 3.0u4


How reproducible: 100%/easy


Steps to Reproduce:
1. set auth.allow to some client IP
2. mount and move files
3. start rebalance

Actual results:
Rebalance hangs, and authentication errors are shown in the brick logs.

Expected results:
Rebalance should still work if we restrict auth.allow.


Additional info:
Workaround: add the IPs of all the gluster nodes to auth.allow.
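
As a sketch of the workaround, the snippet below merges the intended client IPs with the gluster node IPs and builds the matching "gluster volume set <volume> auth.allow <list>" command. The helper names and the 192.0.2.x addresses are hypothetical placeholders; the volume name is the one from this report. It only builds the command and does not run it.

```python
def build_auth_allow(client_ips, node_ips):
    """Return a comma-separated auth.allow value covering clients and nodes.

    Order is preserved and duplicates are dropped.
    """
    merged = []
    for ip in list(client_ips) + list(node_ips):
        if ip not in merged:
            merged.append(ip)
    return ",".join(merged)


def auth_allow_command(volume, client_ips, node_ips):
    """Build the argv for setting auth.allow on a volume (not executed here)."""
    return [
        "gluster", "volume", "set", volume,
        "auth.allow", build_auth_allow(client_ips, node_ips),
    ]


if __name__ == "__main__":
    # Placeholder addresses: one client plus two gluster nodes.
    clients = ["192.0.2.50"]
    nodes = ["192.0.2.11", "192.0.2.12"]
    print(" ".join(auth_allow_command("thingluster", clients, nodes)))
```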

Comment 1 Cedric Buissart 2015-04-21 13:55:45 UTC
And from the rebalance-<volume>.log:

[2015-04-21 13:43:08.412805] W [client-handshake.c:1108:client_setvolume_cbk] 0-thingluster-client-3: failed to set the volume (Permission denied)
[2015-04-21 13:43:08.412821] W [client-handshake.c:1134:client_setvolume_cbk] 0-thingluster-client-3: failed to get 'process-uuid' from reply dict
[2015-04-21 13:43:08.412828] E [client-handshake.c:1140:client_setvolume_cbk] 0-thingluster-client-3: SETVOLUME on remote-host failed: Authentication failed
[2015-04-21 13:43:08.412834] I [client-handshake.c:1225:client_setvolume_cbk] 0-thingluster-client-3: sending AUTH_FAILED even

Comment 6 RajeshReddy 2015-09-04 08:35:45 UTC
Tested with build "glusterfs-3.7.1-12". After setting auth.allow and nfs.rpc-auth-allow, I was able to run the rebalance job without any problem, so marking this bug as verified.

Comment 8 errata-xmlrpc 2015-10-05 07:08:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html