Bug 1479355 - make sure we bring glusterd up before gluster-blockd and the rest of the services come up
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-block
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assigned To: Prasanna Kumar Kalever
QA Contact: Sweta Anandpara
Depends On:
Blocks: 1417151
 
Reported: 2017-08-08 08:20 EDT by Prasanna Kumar Kalever
Modified: 2017-09-21 00:20 EDT
CC List: 8 users

See Also:
Fixed In Version: gluster-block-0.2.1-8.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-09-21 00:20:54 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
  Tracker ID: Red Hat Product Errata RHEA-2017:2773
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: new packages: gluster-block
  Last Updated: 2017-09-21 04:16:22 EDT

Description Prasanna Kumar Kalever 2017-08-08 08:20:01 EDT
Description of problem:

In a node reboot case (for example), if glusterd has not come up before the gluster-blockd service, we see a lot of failures due to the absence of the backing storage.

More generally, this can happen in any case where gluster-blockd is brought up before glusterd.
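
As an illustration of the intended ordering, the dependency can be expressed with systemd directives in the unit chain; a minimal sketch (illustrative only; the exact directives and the unit carrying them in the shipped fix may differ):

# fragment of a [Unit] section in the gluster-block service chain (illustrative)
[Unit]
# pull glusterd in and wait for it, so the backing storage is available
# before the gluster-block services start serving requests
Requires=glusterd.service
After=glusterd.service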
Comment 8 Sweta Anandpara 2017-08-28 02:51:56 EDT
Scenario 1:
* Stop glusterd - that results in gluster-blockd service going into inactive state
* Start/restart gluster-blockd - that fails with dependency error (as expected)

Scenario 2:
* Stop gluster-block-target - which results in gluster-blockd service again going to inactive state
* Stop glusterd
* Start/restart gluster-blockd - that fails with a dependency error; tcmu-runner, gluster-block-target and glusterd all remain down
* Start glusterd - tcmu-runner, gluster-block-target, gluster-blockd all remain down
* Start gluster-blockd - all the mentioned services come up successfully

Scenario 3: 
* Stop glusterd - that results in gluster-blockd service going to inactive state
* Start/restart gluster-blockd - that fails with dependency error (as expected)
* Start glusterd - glusterd comes up. Gluster-blockd continues to remain down
* Start gluster-blockd - that gets gluster-blockd up

I have tested the above scenarios and multiple permutations of the services. All of them confirm the expected ordering: gluster-blockd depends on tcmu-runner, tcmu-runner depends on gluster-block-target, which in turn depends on glusterd.
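
For reference, the effective ordering on a node can be checked with standard systemctl queries (generic commands, nothing specific to this fix), for example:

# systemctl list-dependencies gluster-blockd.service
# systemctl show gluster-blockd.service -p After
# systemctl show gluster-blockd.service -p BindsTo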

One last question, before I move this bug to verified:

Scenario 3, step 3: when glusterd is brought up, do we expect gluster-blockd to come up automatically as well? In other words, after the start/restart in step 2, should the gluster-blockd service check for glusterd at regular intervals, so that it can bring itself back online as soon as it sees glusterd up? It presently does not.

Prasanna/Atin, please ignore the question in comment 7. Keeping the needinfo on this bug for the query mentioned above.
Comment 9 Prasanna Kumar Kalever 2017-08-28 03:58:01 EDT
Sweta,

Is it expected that gluster-blockd should be started when glusterd is brought back up?

If there is such a requirement, then we should explore the Wants= or PartOf= options in the systemd units.
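
For reference, a minimal sketch of the PartOf= variant in gluster-blockd.service (illustrative only): PartOf= propagates a stop or restart of glusterd to gluster-blockd, but it does not start gluster-blockd when glusterd is started on its own.

[Unit]
# stop/restart of glusterd is propagated to gluster-blockd
PartOf=glusterd.service
After=glusterd.service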
Comment 10 Prasanna Kumar Kalever 2017-08-28 05:28:50 EDT
Just to confirm that we understand it right, 

Add "WantedBy=glusterd.service" to the [Install] section of gluster-blockd.service.

The modified unit looks like

#cat /usr/lib/systemd/system/gluster-blockd.service
[Unit]
Description=Gluster block storage utility
BindsTo=tcmu-runner.service rpcbind.service
After=tcmu-runner.service rpcbind.service

[Service]
Type=simple
Environment="GB_GLFS_LRU_COUNT=5"
Environment="GB_LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/gluster-blockd
ExecStart=/usr/sbin/gluster-blockd --glfs-lru-count $GB_GLFS_LRU_COUNT --log-level $GB_LOG_LEVEL $GB_EXTRA_ARGS
KillMode=process

[Install]
WantedBy=multi-user.target
WantedBy=glusterd.service


This should give you what you are asking for, but we might need a justification for why we would need it.
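
One practical note, assuming the [Install] placement above: WantedBy= only takes effect through the enablement symlinks, so the unit would have to be re-enabled after the change, roughly:

# systemctl daemon-reload
# systemctl reenable gluster-blockd.service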
Comment 11 Sweta Anandpara 2017-08-28 07:32:24 EDT
My opinion - it is nice to have. At /this/ stage of the release? We can live with it for now, unless it becomes a bigger problem in CNS

Karthick/Humble, thoughts? If glusterd goes down, does the entire pod go down? If yes, then we might not hit this scenario at all. If no, then please guide/reply to comment9. 

I will be moving this bug to verified if this is acceptable in the CNS environment. A new bug can be raised (if needed) for the new change.
Comment 12 Humble Chirammal 2017-08-28 13:20:27 EDT
(In reply to Sweta Anandpara from comment #11)
> My opinion - it is nice to have. At /this/ stage of the release? We can live
> with it for now, unless it becomes a bigger problem in CNS
> 
> Karthick/Humble, thoughts? If glusterd goes down, does the entire pod go
> down? If yes, then we might not hit this scenario at all. If no, then please
> guide/reply to comment9. 

Yes, if glusterd is down, the pod is restarted.

> 
> I will be moving this bug to verified if this is acceptable in the CNS
> environment. A new bug can be raised (if needed) for the new change.
Comment 15 errata-xmlrpc 2017-09-21 00:20:54 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2773
