RFE: putting RHS nodes into maintenance mode currently has no effect on the gluster cluster or on the gluster node itself. The expected behavior would be to stop the gluster-related services on the requested node, or at least to disable glusterd. Ideally both.
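For illustration, a minimal sketch of what "stopping gluster related services" could mean on the node, assuming the standard glusterd systemd unit and the usual glusterfsd/glusterfs process names (the exact service set is an assumption, not taken from this bug):

  # Stop the management daemon and keep it from starting on reboot
  systemctl stop glusterd
  systemctl disable glusterd
  # Brick and auxiliary processes are not systemd units, so kill them directly
  pkill glusterfsd    # brick processes
  pkill glusterfs     # self-heal, NFS, and other auxiliary daemons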
A related feature would be to detect when a RHS node is in maintenance mode (all gluster services stopped) or faulty (any required gluster* service stopped or unresponsive). This was requested in SFDC 01357740. We would then need to update our documentation at https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html-single/Installation_Guide/index.html and in the RHS 3.x manual as well.
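A minimal detection sketch under the same assumptions (how RHS-C would actually poll the node, e.g. via VDSM, is left open here):

  glusterd_state=$(systemctl is-active glusterd)   # "active" or "inactive"
  brick_count=$(pgrep -cx glusterfsd)              # 0 when no bricks are running
  if [ "$glusterd_state" != "active" ] && [ "$brick_count" -eq 0 ]; then
      echo "maintenance: all gluster services stopped"
  elif [ "$glusterd_state" != "active" ] || [ "$brick_count" -eq 0 ]; then
      echo "faulty: a required gluster service is stopped"
  fi

Note that a node which legitimately hosts no bricks would be misclassified by this naive check, so a real implementation would have to compare against the expected brick list.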
We now have several tickets open regarding RHS-C, RHS, and Maintenance Mode. We need:

- Support of 'Maintenance Mode' on RHS
- RHS-C to be capable of setting a RHS node into and out of Maintenance Mode
- RHS-C to be capable of detecting when a RHS node is in 'Maintenance Mode'
- RHS-C/VDSM to not start RHS nodes that are flagged in 'Maintenance Mode' (see the sketch below)
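On that last point, a sketch of how a start-up guard might honor such a flag. The flag path /var/lib/vdsm/maintenance and the wrapper itself are hypothetical illustrations, not an existing RHS-C/VDSM mechanism:

  # Hypothetical pre-start guard; the flag file path is invented for illustration
  if [ -e /var/lib/vdsm/maintenance ]; then
      echo "node is flagged for maintenance; not starting gluster services" >&2
      exit 0
  fi
  systemctl start glusterd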
RHGS 3.1 documentation implies that maintenance mode is implemented, and is not only functional but required:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html#Moving_Hosts_into_Maintenance_Mode

Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?
(In reply to Harold Miller from comment #6)
> RHGS 3.1 documentation implies that maintenance mode is implemented and not
> only functional, but required.
> (https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html#Moving_Hosts_into_Maintenance_Mode)
>
> Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?

I don't think this BZ is obsolete; the documentation in the 3.1 release is essentially the same as in the 3.0 release. See and compare:

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Console_Administration_Guide/sect-Maintaining_Hosts.html
BZ 1230247 is the RFE to stop all gluster services in a single go. If that requirement is implemented, it becomes simple for the RHEV workflow to implement MAINTENANCE mode for a RHGS node.
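A rough sketch of such a one-shot helper, assuming the usual component names (the actual BZ 1230247 deliverable may look different):

  #!/bin/sh
  # Illustrative one-shot stop of all gluster components on a node
  systemctl stop glusterd    # management daemon
  pkill glusterfsd           # brick processes
  pkill glusterfs            # self-heal, NFS, and other auxiliary daemons
  # Warn if anything survived
  pgrep -l gluster && echo "WARNING: gluster processes still running" >&2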
Yes, but meanwhile the document needs to be changed to just use one "pkill" command. That is the customer's request.

Thanks & Regards
Oonkwee
Emerging Technologies
Red Hat Global Support
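The customer's exact command is not quoted in this bug, but presumably the requested single command is a pattern kill along these lines (an assumption on my part):

  pkill gluster    # one command; matches glusterd, glusterfsd, and glusterfs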
This bug is verified with the fixed-in versions rhsc-3.1.2-0.69.el6.noarch and vdsm-4.16.30-1.3.el7rhgs.x86_64.

A couple of tests were performed to make sure that maintenance mode is effective from RHSC:

1. Created 2 clusters with 2 nodes and created a few volumes.
2. Put one of the nodes into maintenance.
3. Confirmed from the backend that the glusterd process had stopped and no brick processes were running.
4. Tried a geo-rep session between the clusters: putting the master/slave node into maintenance mode works fine, with the backend showing the geo-rep session as faulty; once the node was activated, the status showed OK.
5. Tried to upgrade the host to a new build of RHGS 3.1.2 by putting the node into maintenance mode; this works fine, with no upgrade-related issues.
6. Confirmed that 3-way replica works fine by putting all the nodes into maintenance and bringing the nodes back one after the other.

Overall, all gluster-related processes were stopped once a node was put into maintenance mode, and the processes came back online once it was activated.

Version: vdsm-4.16.30-1.3.el7rhgs.x86_64 and vdsm-4.16.30-1.3.el6rhgs.x86_64

Output after maintenance mode:

[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Sat 2015-12-26 14:41:36 EST; 29s ago
 Main PID: 31684 (code=exited, status=0/SUCCESS)

Dec 26 14:35:32 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:35:34 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopping GlusterFS, a clustered file-system server...
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopped GlusterFS, a clustered file-system server.

[root@dhcp35-215 ~]# ps aux | grep glusterfsd
root      2358  0.0  0.0 112644   960 pts/0    S+   14:42   0:00 grep --color=auto glusterfsd
[root@dhcp35-215 ~]# ps aux | grep glusterfs
root      2360  0.0  0.0 112644   952 pts/0    S+   14:42   0:00 grep --color=auto glusterfs
[root@dhcp35-215 ~]# ps aux | grep ssh
root      1385  0.0  0.0  82548  3628 ?        Ss   06:15   0:00 /usr/sbin/sshd -D
root      2304  0.2  0.1 142860  5168 ?        Ss   14:41   0:00 sshd: root@pts/0
root      2362  0.0  0.0 112648   956 pts/0    S+   14:42   0:00 grep --color=auto ssh

Output after activating the host/node:

[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2015-12-26 14:51:49 EST; 23s ago
  Process: 2512 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2513 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─2513 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Dec 26 14:51:47 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:51:49 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
[root@dhcp35-215 ~]#
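The backend check from step 3 and the transcript above boil down to a quick sanity check; a sketch, runnable on any node:

  # Expect glusterd inactive and no gluster processes while in maintenance
  systemctl is-active glusterd
  pgrep -l gluster || echo "no gluster processes running"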
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0310.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days