Bug 961703

Summary: glusterd: gluster peer status shows the node itself (the node the command is run from) in the peer list, and because of that gluster volume create/stop/delete/status commands always fail
Product: Red Hat Gluster Storage [Red Hat Storage]
Component: glusterd
Version: 2.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Rachana Patel <racpatel>
Assignee: krishnan parthasarathi <kparthas>
QA Contact: amainkar
CC: amarts, nsathyan, rhs-bugs, sdharane, vbellur
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.4.0.8rhs-1
Last Closed: 2013-09-23 22:39:43 UTC

Description Rachana Patel 2013-05-10 10:41:53 UTC
Description of problem:
glusterd: gluster peer status shows the node itself (the node the command is run from) in the peer list, and because of that gluster volume create/stop/delete/status commands always fail

Version-Release number of selected component (if applicable):
3.4.0.5rhs-1.el6rhs.x86_64

How reproducible:


Steps to Reproduce:
1. Had a cluster of 4 RHS servers: cutlass, mia, fred, and fan.
2. Updated the rpm from 3.4.0.4rhs-1.el6rhs.x86_64 to 3.4.0.5rhs-1.el6rhs.x86_64.
3. gluster volume create/stop/status/delete commands were failing on 2 of the RHS nodes.
4. Checked gluster peer status on each node and found that it shows the node itself (the node the command is run from) in the peer list, and because of that gluster volume create/stop/delete/status commands always fail. A quick check for this condition is sketched below.
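
A minimal shell sketch for spotting this condition on a node. It assumes the local glusterd UUID is stored in /var/lib/glusterd/glusterd.info as a UUID=<uuid> line (the usual glusterd store layout); treat that as an assumption rather than something taken from this report.

#!/bin/bash
# Sketch: flag the bug condition where glusterd lists the local node
# as one of its own peers.
# Assumption: the local UUID lives in /var/lib/glusterd/glusterd.info
# as a line of the form UUID=<uuid>.

local_uuid=$(awk -F= '/^UUID=/{print $2}' /var/lib/glusterd/glusterd.info)

if gluster peer status | grep -q "Uuid: ${local_uuid}"; then
    echo "BUG: local node ${local_uuid} appears in its own peer list"
else
    echo "OK: local node is not listed as its own peer"
fi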


  
Actual results:

[root@cutlass ~]# gluster v status io
Another transaction could be in progress. Please try again after sometime.

[root@cutlass ~]# gluster v stop active
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: active: failed: Another transaction could be in progress. Please try again after sometime.

[root@cutlass ~]# gluster v create abc  cutlass.lab.eng.blr.redhat.com:/rhs/brick1/d1 cutlass.lab.eng.blr.redhat.com:/rhs/brick1/d2
volume create: abc: failed: Another transaction could be in progress. Please try again after sometime.

On cutlass:
[root@cutlass ~]# gluster peer status
Number of Peers: 4

Hostname: fred.lab.eng.blr.redhat.com
Uuid: 99d81c05-17fa-4e89-8cf0-3d697d1e8ff8
State: Peer in Cluster (Connected)

Hostname: fan.lab.eng.blr.redhat.com
Uuid: 68f3d556-4372-411f-a2b9-690f121ee489
State: Peer in Cluster (Connected)

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 2e143896-21ee-485a-b2c9-cdbd3e06c654
State: Peer in Cluster (Connected)

Hostname: cutlass.lab.eng.blr.redhat.com
Uuid: 02b1ce4e-ab4e-4205-850a-f93afdc80376
State: Peer in Cluster (Connected)


On mia:
[root@mia ~]# gluster peer status
Number of Peers: 4

Hostname: fan.lab.eng.blr.redhat.com
Uuid: 68f3d556-4372-411f-a2b9-690f121ee489
State: Peer in Cluster (Connected)

Hostname: cutlass.lab.eng.blr.redhat.com
Uuid: 02b1ce4e-ab4e-4205-850a-f93afdc80376
State: Peer in Cluster (Connected)

Hostname: fred.lab.eng.blr.redhat.com
Uuid: 99d81c05-17fa-4e89-8cf0-3d697d1e8ff8
State: Peer in Cluster (Connected)

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 2e143896-21ee-485a-b2c9-cdbd3e06c654
State: Peer in Cluster (Connected)



Expected results:
Like the other two nodes, it should not show the RHS node itself in the peer list:

[root@fred ~]# hostname
fred.lab.eng.blr.redhat.com
[root@fred ~]# gluster peer status
Number of Peers: 3

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 2e143896-21ee-485a-b2c9-cdbd3e06c654
State: Peer in Cluster (Connected)

Hostname: cutlass.lab.eng.blr.redhat.com
Uuid: 02b1ce4e-ab4e-4205-850a-f93afdc80376
State: Peer in Cluster (Connected)

Hostname: fan.lab.eng.blr.redhat.com
Uuid: 68f3d556-4372-411f-a2b9-690f121ee489
State: Peer in Cluster (Connected)

[root@fan ~]# hostname
fan.lab.eng.blr.redhat.com
[root@fan ~]# gluster peer status
Number of Peers: 3

Hostname: fred.lab.eng.blr.redhat.com
Uuid: 99d81c05-17fa-4e89-8cf0-3d697d1e8ff8
State: Peer in Cluster (Connected)

Hostname: cutlass.lab.eng.blr.redhat.com
Uuid: 02b1ce4e-ab4e-4205-850a-f93afdc80376
State: Peer in Cluster (Connected)

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 2e143896-21ee-485a-b2c9-cdbd3e06c654

Additional info:

Comment 1 Rachana Patel 2013-05-10 10:52:23 UTC
sosreport @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/961703

Comment 3 Rachana Patel 2013-05-15 07:40:14 UTC
verified on 3.4.0.8rhs-1.el6rhs.x86_64

Had a cluster of 3 nodes; on one node:

[root@mia ~]# ifconfig | grep inet
          inet addr:10.70.34.92 
[root@mia ~]# hostname
mia.lab.eng.blr.redhat.com

[root@mia ~]# gluster peer status
Number of Peers: 3

Hostname: fan.lab.eng.blr.redhat.com
Uuid: c6dfd028-d46f-4d20-a9c6-17c04e7fb919
State: Peer in Cluster (Connected)

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 1698dc55-2245-4b20-9b8c-60fbe77a06ff
State: Peer in Cluster (Connected)

Hostname: 10.70.34.80
Uuid: ababf76c-a741-4e27-a6bb-93da035d8fd7
State: Peer in Cluster (Connected)

----> it shows mia in the peer list on mia itself; as a result, gluster volume create/stop/delete/status commands always fail

e.g.
[root@mia ~]# gluster volume create abc  mia.lab.eng.blr.redhat.com:/rhs/brick1/ll fred.lab.eng.blr.redhat.com:/rhs/brick1/ll
volume create: abc: failed: Another transaction could be in progress. Please try again after sometime.
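
One way to dig into this: glusterd normally keeps one file per known peer under /var/lib/glusterd/peers/, named by the peer's UUID, so a stale entry there for the node's own UUID would be consistent with the output above. A rough shell sketch of that check (the store layout is assumed, not shown in this report):

# Assumption: /var/lib/glusterd/peers/ holds one file per peer, named by UUID.
local_uuid=$(awk -F= '/^UUID=/{print $2}' /var/lib/glusterd/glusterd.info)
if [ -e "/var/lib/glusterd/peers/${local_uuid}" ]; then
    echo "Stale self entry found in the peer store:"
    cat "/var/lib/glusterd/peers/${local_uuid}"
else
    echo "No self entry under /var/lib/glusterd/peers"
fi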

Comment 5 Amar Tumballi 2013-05-17 06:34:21 UTC
Update from Kaushal on the RPM issue faced until 3.4.0.8rhs (which has all the right fixes).

------------------
Hi all,
Another small update on the steps to be taken when updating from build 7 to any newer releases.

1. Backup /var/lib/glusterd
2. Upgrade
3. Stop gluster
4. Restore /var/lib/glusterd
5. Delete the /var/lib/glusterd/options file if empty. This will be recreated by glusterd.
6. Start gluster and continue with your testing.

The /var/lib/glusterd/options file being empty causes syncing problems on glusterd restart. Build 7 cleared this file, so if you haven't done any server-quorum tests with build 7, this file is most probably still empty.

So, if anyone is facing any volume syncing issues, do step 5 and restart glusterd.

Thanks,
Kaushal

----- Original Message -----
> From: "Kaushal M" <kaushal>
> To: storage-qa
> Sent: Wednesday, May 15, 2013 12:10:08 PM
> Subject: Re: Warning on upgrade from gluster v3.4.0.7 to v3.4.0.8
>
> A small clarification. The upgrade will not delete all the files in
> /var/lib/glusterd. Only some files/directories like glusterd.info and nfs
> directory can be deleted. This is due to a packaging bug in build 7, in
> which these files/directories were a part of the package itself.
> This may be avoided by uninstalling and installing, instead of an upgrade (I
> haven't tested this). But to be on the safer side, backup and restore the
> /var/lib/glusterd directory.
>
> - Kaushal
>
> ----- Original Message -----
>> From: "Kaushal M" <kaushal>
>> To: storage-qa
>> Sent: Wednesday, May 15, 2013 11:48:05 AM
>> Subject: Warning on upgrade from gluster v3.4.0.7 to v3.4.0.8
>>
>> Hi all,
>>
>> Because of bugs in the packaging of build 7, an upgrade from build 7 to build 8
>> will cause files in /var/lib/glusterd/ to be deleted. As you can probably
>> guess
>> this will lead to all sorts of problems.
>> So, before upgrading, backup your /var/lib/glusterd directory. Follow the
>> below steps to make sure you don't break your existing setup,
>>
>> 1. Backup /var/lib/glusterd
>> 2. Upgrade
>> 3. Stop gluster
>> 4. Restore /var/lib/glusterd
>> 5. Start gluster and continue with your testing.
>>
>>
>> Regards,
>> Kaushal
>
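
A rough shell sketch of the upgrade procedure Kaushal describes above, including the step-5 check for an empty options file. The package glob and the glusterd service handling are assumptions for an RHS 2.1 / RHEL 6 setup, not commands taken from the mails:

# 1. Backup /var/lib/glusterd
cp -a /var/lib/glusterd /var/lib/glusterd.bak

# 2. Upgrade (assumed package glob)
yum -y update 'glusterfs*'

# 3. Stop gluster
service glusterd stop

# 4. Restore /var/lib/glusterd from the backup
rm -rf /var/lib/glusterd
cp -a /var/lib/glusterd.bak /var/lib/glusterd

# 5. Delete the options file only if it is empty; glusterd recreates it
if [ -f /var/lib/glusterd/options ] && [ ! -s /var/lib/glusterd/options ]; then
    rm -f /var/lib/glusterd/options
fi

# 6. Start gluster and continue with testing
service glusterd start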

Comment 6 Rachana Patel 2013-05-22 05:58:11 UTC
Verified on 3.4.0.8rhs-1.el6rhs.x86_64 without an rpm upgrade (removed the old rpms and installed the new ones), and it works fine, so changing the status to VERIFIED.

Comment 8 Scott Haines 2013-09-23 22:39:43 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
