| Summary: | upgrade: peer probe fails to add new machine in cluster once the ISO upgrade is performed. | | |
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | glusterd | Assignee: | Kaushal <kaushal> |
| Status: | CLOSED ERRATA | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 2.1 | CC: | amarts, kdhananj, kparthas, rhs-bugs, shaines, ssaha, surs, vbellur |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.4.0.25rhs-1 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-23 22:25:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 840810 | | |
|
Description
Rahul Hinduja
2013-08-26 09:13:02 UTC
Since "no error is reported when the system is already part of a cluster" is a different issue from the original operating-version mismatch after the upgrade, a separate bug 1001056 has been raised to track it.

I need the following test to be performed to assess the seriousness of this bug:

1. Start with a 2- or 4-node RHSS 2.0 u4/5/6 cluster.
2. Update this using the ISO method.
3. After all the servers are up, peer probe and re-create the 2/4-node cluster. Now all nodes should be at 2.1.
4. Try adding new RHSS 2.1 servers to the existing cluster and then rebalance.

Let me know the results.

(In reply to Sayan Saha from comment #5)
> I need the following test to be performed to assess the seriousness of this bug
>
> 1. Start with a 2 or 4 node RHSS 2.0 u4/5/6 cluster.

Done

> 2. Update this using the ISO method

Done

> 3. After all the servers are up, peer probe and re-create the cluster 2/4 node cluster. Now all nodes should be at 2.1.

Once all the servers are up on RHS 2.1, for the upgrade we copy back the configuration files under /var/lib/glusterd and start the volume; this automatically re-creates the cluster, and the volumes are up and running. We do not need to probe the machines again to re-create the cluster.

If we do a peer probe before copying the configuration files back, the probe works, because at that point the machines are fresh RHS 2.1 installations. But for the upgrade we need to copy back /var/lib/glusterd, and that is where it fails: after copying, the operating-version changes to 1.

> 4. Try adding new RHSS 2.1 servers to the existing cluster and then rebalance.

This step cannot be performed, as probing a new RHS 2.1 machine fails: the new system has operating-version 2.

> Let me know the results.

http://review.gluster.org/#/c/5450/3/doc/release-notes/3.4.0.md has been added to the upstream code and describes the workarounds. Should we try this and, if it works, go ahead for now? The code fix looks harder at this time.
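The mismatch described above can be observed directly in the glusterd store: each node records its cluster operating version in /var/lib/glusterd/glusterd.info, and a probe is rejected when the versions disagree. A minimal sketch of that check, using an illustrative temporary file in place of the real glusterd.info (the file format and path match the transcripts in this report; the comparison logic is an assumption for illustration, not glusterd's actual code path):

```shell
# Create a sample glusterd.info; on a real node this file lives at
# /var/lib/glusterd/glusterd.info and is written by glusterd.
tmpdir=$(mktemp -d)
cat > "$tmpdir/glusterd.info" <<'EOF'
UUID=0d06c2c6-a5dd-4179-bd83-dbdaf66233df
operating-version=1
EOF

# Extract the stored operating-version.
opver=$(awk -F= '/^operating-version/ {print $2}' "$tmpdir/glusterd.info")
echo "operating-version=$opver"

# A fresh RHS 2.1 install ships operating-version=2, so a cluster whose
# copied-back configuration says 1 cannot accept the new node's probe.
if [ "$opver" -lt 2 ]; then
    echo "cluster op-version below new node's; probe would be rejected"
fi
```

On a live node the same one-liner against the real path (`awk -F= '/^operating-version/ {print $2}' /var/lib/glusterd/glusterd.info`) shows which version each peer holds.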
Verified with the upgraded build: glusterfs-server-3.4.0.30rhs-2.el6rhs.x86_64

1. Upgraded the cluster from update 5 to RHS 2.1.
2. Tried to probe the new machine (dj) into the cluster. The probe is successful. Before the machine (dj) was probed its operating-version was 2, but the probe then reduced the operating-version to 1, as the fix describes.

Before probing:
==============
[root@dj ~]# cat /var/lib/glusterd/glusterd.info
UUID=0d06c2c6-a5dd-4179-bd83-dbdaf66233df
operating-version=2
[root@dj ~]#

After probing:
==============
[root@dj ~]# cat /var/lib/glusterd/glusterd.info
UUID=0d06c2c6-a5dd-4179-bd83-dbdaf66233df
operating-version=1

[root@upgrade-1 ~]# gluster peer probe 10.70.34.90
peer probe: success.
[root@upgrade-1 ~]#

Moving the bug to verified state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
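The linked release notes describe workarounds for the operating-version mismatch; based on the behaviour shown in this report, one plausible manual step on an affected pre-fix build would be lowering the stored operating-version on the new node before probing. The sketch below shows only the mechanical file edit against an illustrative copy, not the documented procedure: on a real node the file is /var/lib/glusterd/glusterd.info, and glusterd must be stopped before editing and restarted afterwards.

```shell
# Illustrative copy of glusterd.info for a fresh RHS 2.1 node,
# which ships with operating-version=2.
f=$(mktemp)
printf 'UUID=0d06c2c6-a5dd-4179-bd83-dbdaf66233df\noperating-version=2\n' > "$f"

# Lower the operating-version to match the upgraded cluster (1 in this
# report), so the probe from the existing cluster would be accepted.
sed -i 's/^operating-version=.*/operating-version=1/' "$f"
grep '^operating-version' "$f"
```

With the fix in glusterfs-3.4.0.25rhs-1 this manual edit is unnecessary: as the verification above shows, the probe itself reduces the new node's operating-version to 1.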