Bug 1085447
| Summary: | CTDB rebase to 2.5.x requires a cluster restart | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Abhijith Das <adas> |
| Component: | ctdb | Assignee: | Sumit Bose <sbose> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.6 | CC: | adas, dpal, fdinitto, jpayne, mnavrati, sbradley, swhiteho |
| Target Milestone: | rc | Keywords: | Rebase |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ctdb-2.5.1-1.el6 | Doc Type: | Release Note |
| Doc Text: | CTDB Upgrade: Red Hat Enterprise Linux 6.6 contains a new version of the CTDB agent, in which some internal operations have changed to improve stability and reliability. As a consequence, the new version cannot be mixed with older versions running in parallel in the same cluster. To update CTDB in an existing cluster, CTDB must be stopped on all nodes in the cluster before the upgrade starts; the nodes can then be updated one by one and started again. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-10-14 06:47:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Abhijith Das
2014-04-08 15:47:07 UTC
Here's the discussion I had with Sumit on IRC yesterday:

&lt;sbose&gt; abhi, hi, we are planning a rebase of ctdb in 6.6. I already talked with RHS and we think going to version 2.5.x would be the best move, but it will require a complete shutdown of a cluster to upgrade. Do you think this will work for you (gfs2/cluster team) as well?
&lt;abhi&gt; sbose, does it break an existing cluster if the package is upgraded and the cluster not restarted?
&lt;sbose&gt; abhi, you mean the package is replaced on-disk, but the processes are not restarted?
&lt;abhi&gt; I think it should be ok, as long as the user is aware of this cluster restart requirement before upgrading
&lt;abhi&gt; sbose, yeah
&lt;sbose&gt; abhi, I guess it would work, the old binary will call some new scripts on shutdown but this should work (but needs to be tested). Nevertheless I would recommend in the release note to shut down the cluster first, then upgrade the packages and restart the clusters. Since you have to take down all nodes, I think you won't save much time by updating the packages first.
&lt;abhi&gt; sbose, ok... that sounds reasonable to me
&lt;sbose&gt; abhi, ok, thanks. Other benefits of the rebase would be that the versions in 6.6 and 7.0 are nearly in sync and we are working on the same tree as upstream; additionally, all reported RHEL-6 issues are fixed in 2.5.x.
&lt;crh&gt; abhi: sbose: I'm in Westford, and just spoke to Ira a short while ago about going with 2.5.x. We'll be moving to 2.5 in RHS, so it seems.
&lt;sbose&gt; crh, yes, that's what I heard from Jose as well.
&lt;crh&gt; Good news if you're planning on moving that way too. It will make things easier for all, in the long run.
&lt;abhi&gt; sbose, wouldn't simply doing a 'service ctdb stop', followed by upgrade and 'service ctdb start' do the trick?
&lt;sbose&gt; abhi, 'service ctdb stop' on all nodes.
&lt;abhi&gt; yeah.
&lt;sbose&gt; abhi, then you can upgrade and start one after the other, or do all upgrades first and then restart all again.
&lt;abhi&gt; sbose, I'm thinking... 1. "service ctdb stop" on all nodes, 2. upgrade ctdb on all nodes, 3. "service ctdb start" on all nodes
&lt;sbose&gt; abhi, yes, sorry, by cluster I only meant the ctdb part of a cluster, not everything running in the cluster.
&lt;abhi&gt; sbose, out of curiosity, what's new in 2.5.x?
&lt;sbose&gt; abhi, it's mostly performance and stability improvements. There was a bit of chaos with different trees in 1.x: there was the 1.0.114.x, the 1.2, the 1.10 and some others. The new maintainer did a herculean task to merge all the good in the branches together to get 2.0. Some of the more visible improvements are better systemd integration than my crude patches, pcp support, improved man pages and self tests.
&lt;abhi&gt; sbose, having to restart ctdb is not ideal for our customers, but if it's critical for this new version to go into 6.6, we must document the upgrade procedure appropriately and make sure GSS is aware of this, because they'll be the ones fielding customer calls.
&lt;sbose&gt; abhi, yes I understand, but given the lifetime of RHEL6 I think we have to do this step sooner or later. Btw, with the next major release of samba (4.2), ctdb will be part of samba and not stand-alone anymore.
&lt;abhi&gt; sbose, yeah... crh and jarrpa told me about that (ctdb integrated into samba) last week

Sumit, in talking with Steve and Shane, it looks like this rebase is going to interfere with our ability to support rolling upgrades:
https://access.redhat.com/site/articles/40753
https://access.redhat.com/site/solutions/39903

Questions:
- Are the fixes you mentioned above critical enough to warrant this rebase?
- What would happen if a customer is unaware of the procedure and performs a rolling upgrade, i.e. upgrades one node at a time?
- What happens if one node is upgraded and rebooted (or the cluster restarted)? Will ctdb break?
- What happens if the packages are updated on all the nodes and the cluster is NOT restarted before or after the upgrade? Will ctdb still function correctly?

Verified SanityOnly in ctdb-2.5.1-1.el6 from RHEL-6.6-candidate-20140526 via upgrade from 6.5 to 6.6:

```
[root@host-033 ~]# for i in `seq 1 3`; do qarsh root@dash-0$i rpm -q ctdb; done
ctdb-1.0.114.5-3.el6.x86_64
ctdb-1.0.114.5-3.el6.x86_64
ctdb-1.0.114.5-3.el6.x86_64
[root@host-033 ~]# for i in `seq 1 3`; do qarsh root@dash-0$i wget -P /etc/yum.repos.d http://sts.lab.msp.redhat.com/dist/brewroot/repos/RHEL-6.6-candidate-20140526.repo; done
[root@host-033 ~]# for i in `seq 1 3`; do qarsh root@dash-0$i yum -y update ctdb; done
[root@host-033 ~]# for i in `seq 1 3`; do qarsh root@dash-0$i rpm -q ctdb; done
ctdb-2.5.1-1.el6.x86_64
ctdb-2.5.1-1.el6.x86_64
ctdb-2.5.1-1.el6.x86_64
[root@host-033 ~]# for i in `seq 1 3`; do qarsh root@dash-0$i clustat; done
Cluster Status for dash @ Tue May 27 12:00:40 2014
Member Status: Quorate

 Member Name    ID   Status
 ------ ----    ---- ------
 dash-01         1   Online, Local
 dash-02         2   Online
 dash-03         3   Online

Cluster Status for dash @ Tue May 27 12:00:40 2014
Member Status: Quorate

 Member Name    ID   Status
 ------ ----    ---- ------
 dash-01         1   Online
 dash-02         2   Online, Local
 dash-03         3   Online

Cluster Status for dash @ Tue May 27 12:00:40 2014
Member Status: Quorate

 Member Name    ID   Status
 ------ ----    ---- ------
 dash-01         1   Online
 dash-02         2   Online
 dash-03         3   Online, Local
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1488.html
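The stop-everywhere / upgrade / start-everywhere sequence discussed in this bug can be sketched as a small script. This is only an illustration, not a supported procedure: the node names, the ssh transport (the QA run above used qarsh), and the `DRY_RUN` switch are all assumptions added for the example.

```shell
#!/bin/sh
# Sketch of the whole-cluster CTDB upgrade from this bug's discussion.
# NODES, the ssh transport, and DRY_RUN are illustrative assumptions.

NODES="dash-01 dash-02 dash-03"

remote() {
    # Run a command on one node; in a dry run (the default here),
    # just print what would be executed instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$1: $2"
    else
        ssh "root@$1" "$2"
    fi
}

upgrade_ctdb_cluster() {
    # 1. Stop ctdb on ALL nodes first: the 2.5.x agent cannot run
    #    alongside the older 1.0.114.x agent in the same cluster.
    for n in $NODES; do remote "$n" "service ctdb stop"; done
    # 2. Upgrade the package everywhere while ctdb is down.
    for n in $NODES; do remote "$n" "yum -y update ctdb"; done
    # 3. Only then start ctdb again, node by node.
    for n in $NODES; do remote "$n" "service ctdb start"; done
}

upgrade_ctdb_cluster
```

The ordering is the whole point: a rolling upgrade that interleaves stop/upgrade/start per node would leave old and new CTDB versions running in parallel, which the release note above warns against.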