Bug 963082 - glusterd: RPM upgrade from 3.4.0.7rhs-1.el6rhs.x86_64 to 3.4.0.8rhs-1.el6rhs.x86_64 (without stopping glusterd) is causing problems - peer is shown as disconnected, not able to add a new peer to the cluster, glusterd.info is missing
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
2.1
x86_64 Linux
Severity: urgent
Assigned To: Amar Tumballi
amainkar
Depends On:
Blocks:
Reported: 2013-05-15 02:26 EDT by Rachana Patel
Modified: 2015-04-20 07:57 EDT (History)
4 users

See Also:
Fixed In Version: glusterfs-3.4.0.8rhs-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-23 18:39:47 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Rachana Patel 2013-05-15 02:26:29 EDT
Description of problem:
glusterd: RPM upgrade from 3.4.0.7rhs-1.el6rhs.x86_64 to 3.4.0.8rhs-1.el6rhs.x86_64 (without stopping glusterd) is causing problems - peer is shown as disconnected, not able to add a new peer to the cluster, glusterd.info is missing

Version-Release number of selected component (if applicable):
3.4.0.8rhs-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Had a cluster of 3 servers. Upgraded the RPMs without stopping glusterd,
from 3.4.0.7rhs-1.el6rhs.x86_64 to 3.4.0.8rhs-1.el6rhs.x86_64

2. After that:

a) one peer is always shown as disconnected
[root@mia ~]# gluster peer status
Number of Peers: 2

Hostname: fan.lab.eng.blr.redhat.com
Uuid: 7b693192-9015-46d9-9b46-d7e5154bd9c8
State: Peer in Cluster (Connected)

Hostname: 10.70.34.80
Uuid: daa7bdb9-de87-4b6e-9f86-e8bff3d47fc0
State: Peer in Cluster (Disconnected)

[root@fan ~]# gluster peer status
Number of Peers: 2

Hostname: mia.lab.eng.blr.redhat.com
Uuid: d665808d-a42a-4eac-bf05-ca53c595486d
State: Peer in Cluster (Connected)

Hostname: 10.70.34.80
Uuid: daa7bdb9-de87-4b6e-9f86-e8bff3d47fc0
State: Peer in Cluster (Disconnected)

[root@fred ~]# gluster peer status
Number of Peers: 2

Hostname: mia.lab.eng.blr.redhat.com
Uuid: d665808d-a42a-4eac-bf05-ca53c595486d
State: Peer in Cluster (Connected)

Hostname: fan.lab.eng.blr.redhat.com
Uuid: 7b693192-9015-46d9-9b46-d7e5154bd9c8
State: Peer in Cluster (Connected)


b) not able to add new peer to cluster

[root@mia ~]# gluster peer probe cutlass.lab.eng.blr.redhat.com
peer probe: failed: Failed to get handshake ack from remote server

c) glusterd.info is missing on 2 servers

[root@cutlass ~]# ls -lh /var/lib/glusterd/
total 16K
drwxr-xr-x. 2 root root 4.0K May 14 23:16 geo-replication
drwxr-xr-x. 2 root root 4.0K May 14 00:12 groups
drwxr-xr-x. 3 root root 4.0K May 14 20:20 hooks
drwxr-xr-x. 2 root root 4.0K May 14 23:22 peers

[root@fred ~]# ls /var/lib/glusterd/
geo-replication  groups  hooks  peers  vols


d) the log is filled with the messages below
less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 
<snip>

  
Actual results:


Expected results:


Additional info:
Comment 1 Rachana Patel 2013-05-15 02:28:54 EDT
d) the log is filled with the messages below
less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 
<snip>

[2013-05-15 07:26:36.154030] E [glusterd-store.c:1690:glusterd_store_global_info] 0-management: Failed to store glusterd global-info
[2013-05-15 07:26:36.154041] E [glusterd-handshake.c:557:__glusterd_mgmt_hndsk_versions_ack] 0-management: Failed to store op-version
[2013-05-15 07:26:38.805311] I [glusterd-handshake.c:553:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-05-15 07:26:38.805357] E [glusterd-store.c:1648:glusterd_store_global_info] 0-management: chmod error for glusterd.info: No such file or directory
[2013-05-15 07:26:38.805372] E [glusterd-store.c:1690:glusterd_store_global_info] 0-management: Failed to store glusterd global-info
[2013-05-15 07:26:38.805383] E [glusterd-handshake.c:557:__glusterd_mgmt_hndsk_versions_ack] 0-management: Failed to store op-version
[2013-05-15 07:26:39.158819] I [glusterd-handshake.c:553:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-05-15 07:26:39.158854] E [glusterd-store.c:1648:glusterd_store_global_info] 0-management: chmod error for glusterd.info: No such file or directory
[2013-05-15 07:26:39.158868] E [glusterd-store.c:1690:glusterd_store_global_info] 0-management: Failed to store glusterd global-info
[2013-05-15 07:26:39.158879] E [glusterd-handshake.c:557:__glusterd_mgmt_hndsk_versions_ack] 0-management: Failed to store op-version
[2013-05-15 07:26:41.809993] I [glusterd-handshake.c:553:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-05-15 07:26:41.810071] E [glusterd-store.c:1648:glusterd_store_global_info] 0-management: chmod error for glusterd.info: No such file or directory
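How often the store failure repeats can be checked with a small grep helper; the function wrapper and its parameter are illustrative assumptions, while the default path is the log file named in this comment:

```shell
# Count the repeating store failures in a glusterd log.
# The pattern matches the error lines quoted above; the default
# path is the log file from this report.
count_store_failures() {
    grep -c 'Failed to store glusterd global-info' \
        "${1:-/var/log/glusterfs/etc-glusterfs-glusterd.vol.log}"
}
```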
Comment 3 Amar Tumballi 2013-05-16 04:43:53 EDT
Below is a summary of what was broken with the 3.4.0.7rhs RPMs. With 3.4.0.8rhs all of these are fixed (and hence all future updates will work fine).

We can't do anything about the already-shipped 3.4.0.7rhs binaries other than point to the workaround; the issue will not recur when upgrading from 3.4.0.8rhs to anything later.

--------------------
Hi all,
Another small update on the steps to be taken when upgrading from build 7 to any newer release.

1. Backup /var/lib/glusterd
2. Upgrade
3. Stop gluster
4. Restore /var/lib/glusterd
5. Delete the /var/lib/glusterd/options file if empty. This will be recreated by glusterd.
6. Start gluster and continue with your testing.

An empty /var/lib/glusterd/options file causes syncing problems on glusterd restart. Build 7 cleared this file, so if you hadn't done any server-quorum tests with build 7, this file is most probably still empty.

So, if anyone is facing any volume syncing issues, do step 5 and restart glusterd.

Thanks,
Kaushal

----- Original Message -----
> From: "Kaushal M" <kaushal@redhat.com>
> To: storage-qa@redhat.com
> Sent: Wednesday, May 15, 2013 12:10:08 PM
> Subject: Re: Warning on upgrade from gluster v3.4.0.7 to v3.4.0.8
>
> A small clarification. The upgrade will not delete all the files in
> /var/lib/glusterd. Only some files/directories like glusterd.info and nfs
> directory can be deleted. This is due to a packaging bug in build 7, in
> which these files/directories were a part of the package itself.
> This may be avoided by uninstalling and reinstalling, instead of an upgrade (I
> haven't tested this). But to be on the safe side, back up and restore the
> /var/lib/glusterd directory.
>
> - Kaushal
>
> ----- Original Message -----
>> From: "Kaushal M" <kaushal@redhat.com>
>> To: storage-qa@redhat.com
>> Sent: Wednesday, May 15, 2013 11:48:05 AM
>> Subject: Warning on upgrade from gluster v3.4.0.7 to v3.4.0.8
>>
>> Hi all,
>>
>> Because of bugs in the packaging of build 7, an upgrade from build 7 to build 8
>> will cause files in /var/lib/glusterd/ to be deleted. As you can probably
>> guess, this will lead to all sorts of problems.
>> So, before upgrading, back up your /var/lib/glusterd directory. Follow the
>> steps below to make sure you don't break your existing setup:
>>
>> 1. Backup /var/lib/glusterd
>> 2. Upgrade
>> 3. Stop gluster
>> 4. Restore /var/lib/glusterd
>> 5. Start gluster and continue with your testing.
>>
>>
>> Regards,
>> Kaushal
>
---------------
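The workaround steps in the thread above can be sketched as shell functions. The glusterd state path is as given in the thread; the backup location, variable names, and function names are illustrative assumptions, and the actual RPM upgrade and service stop/start commands are left to the operator:

```shell
# Sketch of the manual workaround from the thread above (assumptions noted).
# GLUSTERD_DIR and BACKUP_DIR are parameterised here for illustration;
# the real state directory is /var/lib/glusterd.
GLUSTERD_DIR="${GLUSTERD_DIR:-/var/lib/glusterd}"
BACKUP_DIR="${BACKUP_DIR:-/root/glusterd-backup}"

backup_glusterd() {
    # Step 1: back up the glusterd state directory before upgrading
    cp -a "$GLUSTERD_DIR" "$BACKUP_DIR"
}

restore_glusterd() {
    # Step 4: restore the backed-up state over the upgraded tree
    # (run only after the upgrade, with gluster stopped)
    cp -a "$BACKUP_DIR/." "$GLUSTERD_DIR/"
}

prune_empty_options() {
    # Step 5: delete the options file only if it is empty;
    # glusterd recreates it on the next start
    local opts="$GLUSTERD_DIR/options"
    if [ -f "$opts" ] && [ ! -s "$opts" ]; then
        rm -f "$opts"
    fi
}
```

The empty-file check in `prune_empty_options` matters because, per the thread, only an *empty* options file causes the sync problems; a populated one (e.g. from server-quorum testing) must be kept.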
Comment 4 Rachana Patel 2013-05-22 01:52:16 EDT
Going with the workaround and marking this bug as verified. If it comes up again for an RPM upgrade from 3.4.0.8rhs onwards, I will reopen it.
Comment 5 Scott Haines 2013-09-23 18:39:47 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
