RFE: putting RHS nodes into maintenance mode currently has no effect on the gluster cluster or on the gluster node itself. The expected behavior would be to stop the gluster-related services on the requested node, or at least to disable glusterd. Ideally both.
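For illustration, a minimal sketch of what "stopping gluster related services" could mean on the node, assuming the standard glusterd systemd unit and the usual glusterfsd/glusterfs process names (the exact service set is an assumption, not taken from this bug):

  # Stop the management daemon and keep it from starting on reboot
  systemctl stop glusterd
  systemctl disable glusterd
  # Brick and auxiliary processes are not systemd units, so kill them directly
  pkill glusterfsd    # brick processes
  pkill glusterfs     # self-heal, NFS, and other auxiliary daemons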
A related feature would be to detect when a RHS node is in maintenance mode (all gluster services stopped) or faulty (any required gluster* service stopped or unresponsive). This was requested in SFDC 01357740. We would then need to update our documentation at https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html-single/Installation_Guide/index.html and in the RHS 3.x manual as well.
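A minimal detection sketch under the same assumptions (how RHS-C would actually poll the node, e.g. via VDSM, is left open here):

  glusterd_state=$(systemctl is-active glusterd)   # "active" or "inactive"
  brick_count=$(pgrep -cx glusterfsd)              # 0 when no bricks are running
  if [ "$glusterd_state" != "active" ] && [ "$brick_count" -eq 0 ]; then
      echo "maintenance: all gluster services stopped"
  elif [ "$glusterd_state" != "active" ] || [ "$brick_count" -eq 0 ]; then
      echo "faulty: a required gluster service is stopped"
  fi

Note that a node which legitimately hosts no bricks would be misclassified by this naive check, so a real implementation would have to compare against the expected brick list.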
We now have several tickets open regarding RHS-C, RHS, and Maintenance Mode. We need:

- Support of 'Maintenance Mode' on RHS
- RHS-C to be capable of setting a RHS node into and out of Maintenance Mode
- RHS-C to be capable of detecting when a RHS node is in 'Maintenance Mode'
- RHS-C/VDSM to not start RHS nodes that are flagged in 'Maintenance Mode' (see the sketch below)
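On that last point, a sketch of how a start-up guard might honor such a flag. The flag path /var/lib/vdsm/maintenance and the wrapper itself are hypothetical illustrations, not an existing RHS-C/VDSM mechanism:

  # Hypothetical pre-start guard; the flag file path is invented for illustration
  if [ -e /var/lib/vdsm/maintenance ]; then
      echo "node is flagged for maintenance; not starting gluster services" >&2
      exit 0
  fi
  systemctl start glusterd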
RHGS 3.1 documentation implies that maintenance mode is implemented, and is not only functional but required:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html#Moving_Hosts_into_Maintenance_Mode

Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?
(In reply to Harold Miller from comment #6)
> RHGS 3.1 documentation implies that maintenance mode is implemented and not
> only functional, but required.
> (https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html#Moving_Hosts_into_Maintenance_Mode)
>
> Is this BZ obsolete, or is RHGS 3.1 broken/incomplete?

I don't think this BZ is obsolete; the documentation in the 3.1 release is essentially the same as in the 3.0 release. See and compare:

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Console_Administration_Guide/sect-Maintaining_Hosts.html
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Console_Administration_Guide/sect-Maintaining_Hosts.html
BZ 1230247 is the RFE to stop all gluster services in a single go. If that requirement is implemented, it becomes simple for the RHEV workflow to implement MAINTENANCE mode for a RHGS node.
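A rough sketch of such a one-shot helper, assuming the usual component names (the actual BZ 1230247 deliverable may look different):

  #!/bin/sh
  # Illustrative one-shot stop of all gluster components on a node
  systemctl stop glusterd    # management daemon
  pkill glusterfsd           # brick processes
  pkill glusterfs            # self-heal, NFS, and other auxiliary daemons
  # Warn if anything survived
  pgrep -l gluster && echo "WARNING: gluster processes still running" >&2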
Yes, but meanwhile the document needs to be changed to just use one "pkill" command. That is the customer's request.

Thanks & Regards
Oonkwee
Emerging Technologies
Red Hat Global Support
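The customer's exact command is not quoted in this bug, but presumably the requested single command is a pattern kill along these lines (an assumption on my part):

  pkill gluster    # one command; matches glusterd, glusterfsd, and glusterfs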
This bug is verified with the fixed-in versions rhsc-3.1.2-0.69.el6.noarch and vdsm-4.16.30-1.3.el7rhgs.x86_64.

A couple of tests were performed to make sure that maintenance mode is effective from RHSC:

1. Created 2 clusters with 2 nodes and created a few volumes.
2. Put one of the nodes into maintenance.
3. Confirmed from the backend that the glusterd process had stopped and no brick processes were running.
4. Tried a geo-rep session between the clusters: putting the master/slave node into maintenance mode works fine, with the backend showing the geo-rep session as faulty; once the node was activated, the status showed OK.
5. Tried to upgrade the host to a new build of RHGS 3.1.2 by putting the node into maintenance mode; this works fine, with no upgrade-related issues.
6. Confirmed that 3-way replica works fine by putting all the nodes into maintenance and bringing the nodes back one after the other.

Overall, all gluster-related processes were stopped once a node was put into maintenance mode, and the processes came back online once it was activated.

Version: vdsm-4.16.30-1.3.el7rhgs.x86_64 and vdsm-4.16.30-1.3.el6rhgs.x86_64

Output after maintenance mode:

[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Sat 2015-12-26 14:41:36 EST; 29s ago
 Main PID: 31684 (code=exited, status=0/SUCCESS)

Dec 26 14:35:32 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:35:34 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopping GlusterFS, a clustered file-system server...
Dec 26 14:41:36 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Stopped GlusterFS, a clustered file-system server.

[root@dhcp35-215 ~]# ps aux | grep glusterfsd
root      2358  0.0  0.0 112644   960 pts/0    S+   14:42   0:00 grep --color=auto glusterfsd
[root@dhcp35-215 ~]# ps aux | grep glusterfs
root      2360  0.0  0.0 112644   952 pts/0    S+   14:42   0:00 grep --color=auto glusterfs
[root@dhcp35-215 ~]# ps aux | grep ssh
root      1385  0.0  0.0  82548  3628 ?        Ss   06:15   0:00 /usr/sbin/sshd -D
root      2304  0.2  0.1 142860  5168 ?        Ss   14:41   0:00 sshd: root@pts/0
root      2362  0.0  0.0 112648   956 pts/0    S+   14:42   0:00 grep --color=auto ssh

Output after activating the host/node:

[root@dhcp35-215 ~]# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2015-12-26 14:51:49 EST; 23s ago
  Process: 2512 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2513 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─2513 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Dec 26 14:51:47 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 26 14:51:49 dhcp35-215.lab.eng.blr.redhat.com systemd[1]: Started GlusterFS, a clustered file-system server.
[root@dhcp35-215 ~]#
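The backend check from step 3 and the transcript above boil down to a quick sanity check; a sketch, runnable on any node:

  # Expect glusterd inactive and no gluster processes while in maintenance
  systemctl is-active glusterd
  pgrep -l gluster || echo "no gluster processes running"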
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0310.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days