Bug 1189285
| Summary: | [RFE] Putting a RHS node into maintenance mode currently seems to do nothing. | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Harold Miller <hamiller> | |
| Component: | rhsc | Assignee: | Ramesh N <rnachimu> | |
| Status: | CLOSED ERRATA | QA Contact: | Triveni Rao <trao> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | high | |||
| Version: | rhgs-3.0 | CC: | asrivast, byarlaga, divya, knarra, mbukatov, mkalinin, nlevinki, olim, rhs-bugs, rnachimu, sabose, sankarshan, sashinde, sasundar, sgraf | |
| Target Milestone: | --- | Keywords: | FutureFeature, ZStream | |
| Target Release: | RHGS 3.1.2 | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | rhsc-3.1.2-0.68, vdsm-4.16.30-1.1 | Doc Type: | Enhancement | |
| Doc Text: |
Previously, when a Red Hat Gluster Storage node was put into maintenance mode, it did not stop all gluster-related processes as expected. Moving a Red Hat Gluster Storage node to maintenance must stop all gluster-related processes so that the node is ready for maintenance activities such as upgrade and repair. With this fix, all gluster-related processes are stopped when a node is moved to maintenance mode, and the glusterd service is restarted when the node is activated. Restarting the glusterd service starts all gluster-related processes, such as the brick, self-heal, and geo-replication processes.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1291173 (view as bug list) | Environment: | ||
| Last Closed: | 2016-03-01 06:10:13 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1230247, 1277562, 1286636, 1286638, 1289092, 1294754 | |||
| Bug Blocks: | 1260783, 1291173 | |||
|
Description
Harold Miller
2015-02-04 21:35:59 UTC
A related feature would be to detect when a RHS node is in Maintenance mode (all gluster services stopped) or faulty (any required gluster* service stopped/unresponsive). This was requested in SFDC 01357740. We would then need to update our documentation at https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html-single/Installation_Guide/index.html and in the RHS 3.x manual as well.

We now have several tickets open regarding RHS-C, RHS, and Maintenance Mode. We need:
- Support of 'Maintenance Mode' on RHS
- RHS-C to be capable of setting a RHS node into and out of Maintenance Mode
- RHS-C to be capable of detecting when a RHS node is in 'Maintenance Mode'
- RHS-C/VDSM to not start RHS nodes that are flagged in 'Maintenance Mode'

RHGS 3.1 documentation implies that maintenance mode is implemented and not only functional, but required. (https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html#Moving_Hosts_into_Maintenance_Mode) Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?

(In reply to Harold Miller from comment #6)
> RHGS 3.1 documentation implies that maintenance mode is implemented and not
> only functional, but required.
> (https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/
> Console_Administration_Guide/sect-Maintaining_Hosts.
> html#Moving_Hosts_into_Maintenance_Mode)
>
> Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?

I don't think this BZ is obsolete; the documentation from the 3.1 release is actually the same as the version from the 3.0 release. See and compare:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Console_Administration_Guide/sect-Maintaining_Hosts.html

BZ 1230247 is the RFE to stop all gluster services in a single go.
If this requirement is implemented, then it is simple for the RHEV workflow to implement MAINTENANCE mode for an RHGS node.

Yes, but meanwhile the document needs to be changed to just use one "pkill" command. That is the customer's request. Thanks & Regards, Oonkwee, Emerging Technologies, Red Hat Global Support.

This bug is verified with the fixed-in versions rhsc-3.1.2-0.69.el6.noarch and vdsm-4.16.30-1.3.el7rhgs.x86_64.
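For illustration, the "one command" stop the customer asks for could look roughly like the following. This is a hedged sketch, not the documented procedure: the process-name patterns are assumptions based on the standard gluster daemon names, and geo-replication's gsyncd worker needs its own match because its name does not start with "gluster".

```shell
#!/bin/sh
# One-shot stop of gluster-related processes (sketch only).
# '^gluster' matches glusterd (management), glusterfs (clients/self-heal),
# and glusterfsd (brick processes).
pkill '^gluster' || true   # pkill exits 1 when nothing matched
pkill -x gsyncd  || true   # geo-replication workers
echo "gluster processes signalled (if any were running)"
```

In practice glusterd should be stopped via its service unit first (as the fix does), so that it does not respawn the brick processes that pkill just terminated.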
A couple of tests were performed to make sure maintenance mode is effective from RHSC:
1. Created 2 clusters with 2 nodes each and created a few volumes.
2. Put one of the nodes into maintenance.
3. Confirmed from the backend that the glusterd process had stopped and no brick processes were running.
4. Tried with a geo-rep session between the clusters: put the master/slave node into maintenance mode and checked from the backend that it works fine, with the geo-rep session shown as faulty; once the node was activated, the status showed OK again.
5. Tried to upgrade the host to a new build of RHGS 3.1.2 by putting the node into maintenance mode; this works fine, with no upgrade-related issues.
6. Confirmed that a 3-way replica works fine by putting all the nodes into maintenance and bringing them back one after the other.
Overall, all the gluster-related processes were stopped once the node was put into maintenance mode, and the processes were online once it was activated.
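The behaviour verified above can be sketched roughly as follows. This is a hedged approximation of what the maintenance/activate flow does per the doc text, not the actual vdsm implementation; the process-name patterns are assumptions, and `DRY_RUN=1` (the default here) only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the verified maintenance/activate behaviour (not vdsm code).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

maintenance() {
    run systemctl stop glusterd   # stop the management daemon first
    run pkill '^gluster'          # bricks (glusterfsd), clients/self-heal (glusterfs)
    run pkill -x gsyncd           # geo-replication workers
}

activate() {
    # restarting glusterd respawns bricks, self-heal, and geo-rep processes
    run systemctl restart glusterd
}

maintenance
activate
```

Stopping glusterd before killing the brick processes matters: with glusterd down, nothing respawns the bricks, which matches the "Active: inactive (dead)" and empty `ps` output shown in the transcript below.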
Version:
vdsm-4.16.30-1.3.el7rhgs.x86_64 and vdsm-4.16.30-1.3.el6rhgs.x86_64
Output:
After maintenance mode:
[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Sat 2015-12-26 14:41:36 EST; 29s ago
Main PID: 31684 (code=exited, status=0/SUCCESS)
Dec 26 14:35:32 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:35:34 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopping GlusterFS, a clustered file-system server...
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopped GlusterFS, a clustered file-system server.
[root@dhcp35-215 ~]#
[root@dhcp35-215 ~]# ps aux | grep glusterfsd
root 2358 0.0 0.0 112644 960 pts/0 S+ 14:42 0:00 grep --color=auto glusterfsd
[root@dhcp35-215 ~]# ps aux | grep glusterfs
root 2360 0.0 0.0 112644 952 pts/0 S+ 14:42 0:00 grep --color=auto glusterfs
[root@dhcp35-215 ~]# ps aux | grep ssh
root 1385 0.0 0.0 82548 3628 ? Ss 06:15 0:00 /usr/sbin/sshd -D
root 2304 0.2 0.1 142860 5168 ? Ss 14:41 0:00 sshd: root@pts/0
root 2362 0.0 0.0 112648 956 pts/0 S+ 14:42 0:00 grep --color=auto ssh
[root@dhcp35-215 ~]# clear
[root@dhcp35-215 ~]# rpm -qa | grep vdsm
After activating the host/node:
[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2015-12-26 14:51:49 EST; 23s ago
Process: 2512 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 2513 (glusterd)
CGroup: /system.slice/glusterd.service
└─2513 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
Dec 26 14:51:47 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:51:49 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
[root@dhcp35-215 ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0310.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.