Bug 1349875

Summary: [ RFE ] heketi-cli should support replacement of a failed node
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Prasanth <pprakash>
Component: heketi
Assignee: Mohamed Ashiq <mliyazud>
Status: CLOSED ERRATA
QA Contact: Tejas Chaphekar <tchaphek>
Severity: high
Docs Contact:
Priority: high
Version: rhgs-3.1
CC: annair, asriram, hchiramm, madam, mliyazud, pprakash, rcyriac, rreddy, rtalur, srmukher
Target Milestone: ---
Keywords: FutureFeature
Target Release: CNS 3.6
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: heketi-5.0.0-1.el7rhgs
Doc Type: Enhancement
Doc Text:
Previously, replacing a failed node required running multiple commands against the node's devices, followed by the node delete command. With this update, a single command removes a failed node. For example: heketi-cli node remove [node-id]
Story Points: ---
Clone Of:
: 1358188
Environment:
Last Closed: 2017-10-11 07:07:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1358188, 1445444

Description Prasanth 2016-06-24 12:22:53 UTC
Description of problem:

heketi-cli currently supports only adding and deleting a node. It should also support a cleaner way to replace a failed node:

#####
# heketi-cli node -h
Heketi Node Management

Usage:
  heketi-cli node [command]

Available Commands:
  add         Add new node to be managed by Heketi
  delete      Deletes a node from Heketi management
  info        Retreives information about the node
  enable      Allows node to go online
  disable     Disallow usage of a node by placing it offline
#####

Version-Release number of selected component (if applicable):
heketi-templates-2.0.2-3.el7rhgs.x86_64
heketi-client-2.0.2-3.el7rhgs.x86_64

Comment 1 Luis Pabón 2016-06-27 12:59:04 UTC
Definitely a very important feature. This is planned for Release 3:

https://github.com/heketi/heketi/issues/161

Comment 2 Luis Pabón 2016-09-30 11:21:04 UTC
Work upstream has not started. This may or may not make it into 3.4.

Comment 3 Michael Adam 2016-10-19 13:54:25 UTC
Moving out of 3.4 to 3.5.

Comment 10 Humble Chirammal 2017-04-27 11:19:49 UTC
*** Bug 1446065 has been marked as a duplicate of this bug. ***

Comment 12 Mohamed Ashiq 2017-05-24 11:16:55 UTC
Merged upstream:

https://github.com/heketi/heketi/pull/752

Comment 13 Tejas Chaphekar 2017-08-21 08:19:24 UTC
Multiple features were added as part of the 3.6 release and tested successfully during 3.6 testing.

The logs are as follows:

[root@dhcp46-116 ~]# heketi-cli node --help
Heketi Node Management

Usage:
  heketi-cli node [command]

Available Commands:
  add         Add new node to be managed by Heketi
  delete      Deletes a node from Heketi management
  disable     Disallow usage of a node by placing it offline
  enable      Allows node to go online
  info        Retreives information about the node
  list        List all nodes in cluster
  remove      Removes a node and all its associated devices from Heketi

Flags:
  -h, --help   help for node

Global Flags:
      --json            
	Print response as JSON
      --secret string   
	Secret key for specified user.  Can also be
	set using the environment variable HEKETI_CLI_KEY
  -s, --server string   
	Heketi server. Can also be set using the
	environment variable HEKETI_CLI_SERVER (the default one is http://localhost:8080)
      --user string     
	Heketi user.  Can also be set using the
	environment variable HEKETI_CLI_USER



[root@dhcp47-20 ~]# heketi-cli node list

Id:75c151258eee05c64f16dcaa85549f7b	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:836b8213fa88164bc152804ceca0a13a	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:92f58358e92be0f010523f87ca4ba017	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:f7a3fe0efe012afa65250258b709ccbf	Cluster:83f2d91d36c5ea0076e6bd9ed165454a

[root@dhcp47-20 ~]# heketi-cli node info 92f58358e92be0f010523f87ca4ba017

Node Id: 92f58358e92be0f010523f87ca4ba017
State: online
Cluster Id: 83f2d91d36c5ea0076e6bd9ed165454a
Zone: 1
Management Hostname: dhcp47-175.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.175
Devices:
Id:dddb4b7848ba98763b469180fb7d2fc6   Name:/dev/sdd            State:online    Size (GiB):15      Used (GiB):0       Free (GiB):15      

[root@dhcp47-20 ~]# heketi-cli node disable 92f58358e92be0f010523f87ca4ba017

Node 92f58358e92be0f010523f87ca4ba017 is now offline

[root@dhcp47-20 ~]# heketi-cli node list

Id:75c151258eee05c64f16dcaa85549f7b	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:836b8213fa88164bc152804ceca0a13a	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:92f58358e92be0f010523f87ca4ba017	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:f7a3fe0efe012afa65250258b709ccbf	Cluster:83f2d91d36c5ea0076e6bd9ed165454a

[root@dhcp47-20 ~]# heketi-cli node info 92f58358e92be0f010523f87ca4ba017

Node Id: 92f58358e92be0f010523f87ca4ba017
State: offline
Cluster Id: 83f2d91d36c5ea0076e6bd9ed165454a
Zone: 1
Management Hostname: dhcp47-175.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.175
Devices:
Id:dddb4b7848ba98763b469180fb7d2fc6   Name:/dev/sdd            State:online    Size (GiB):15      Used (GiB):0       Free (GiB):15      

[root@dhcp47-20 ~]# heketi-cli node remove 92f58358e92be0f010523f87ca4ba017

Node 92f58358e92be0f010523f87ca4ba017 is now removed

[root@dhcp47-20 ~]# heketi-cli node list

Id:75c151258eee05c64f16dcaa85549f7b	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:836b8213fa88164bc152804ceca0a13a	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:92f58358e92be0f010523f87ca4ba017	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:f7a3fe0efe012afa65250258b709ccbf	Cluster:83f2d91d36c5ea0076e6bd9ed165454a

[root@dhcp47-20 ~]# heketi-cli node info 92f58358e92be0f010523f87ca4ba017

Node Id: 92f58358e92be0f010523f87ca4ba017
State: failed
Cluster Id: 83f2d91d36c5ea0076e6bd9ed165454a
Zone: 1
Management Hostname: dhcp47-175.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.175
Devices:
Id:dddb4b7848ba98763b469180fb7d2fc6   Name:/dev/sdd            State:failed    Size (GiB):15      Used (GiB):0       Free (GiB):15      

[root@dhcp47-20 ~]# heketi-cli node delete 92f58358e92be0f010523f87ca4ba017

Error: Unable to delete node [92f58358e92be0f010523f87ca4ba017] because it contains devices

[root@dhcp47-20 ~]# heketi-cli node info 92f58358e92be0f010523f87ca4ba017
Node Id: 92f58358e92be0f010523f87ca4ba017
State: failed
Cluster Id: 83f2d91d36c5ea0076e6bd9ed165454a
Zone: 1
Management Hostname: dhcp47-175.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.175
Devices:
Id:dddb4b7848ba98763b469180fb7d2fc6   Name:/dev/sdd            State:failed    Size (GiB):15      Used (GiB):0       Free (GiB):15      

[root@dhcp47-20 ~]# heketi-cli device delete dddb4b7848ba98763b469180fb7d2fc6

Device dddb4b7848ba98763b469180fb7d2fc6 deleted

[root@dhcp47-20 ~]# heketi-cli node info 92f58358e92be0f010523f87ca4ba017

Node Id: 92f58358e92be0f010523f87ca4ba017
State: failed
Cluster Id: 83f2d91d36c5ea0076e6bd9ed165454a
Zone: 1
Management Hostname: dhcp47-175.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.175
Devices:

[root@dhcp47-20 ~]# heketi-cli node delete 92f58358e92be0f010523f87ca4ba017

Node 92f58358e92be0f010523f87ca4ba017 deleted

[root@dhcp47-20 ~]# heketi-cli node list

Id:75c151258eee05c64f16dcaa85549f7b	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:836b8213fa88164bc152804ceca0a13a	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
Id:f7a3fe0efe012afa65250258b709ccbf	Cluster:83f2d91d36c5ea0076e6bd9ed165454a
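
The sequence exercised in the log above (disable the node, remove it, delete any devices still attached, then delete the node) could be scripted roughly as follows. This is a minimal sketch, not part of heketi-cli: `HEKETI_BIN`, `device_ids`, and `retire_node` are illustrative names, and the parsing assumes that device lines in `node info` output look like the `Id:<hex>   Name:/dev/...` lines shown in this log.

```shell
#!/bin/sh
# Sketch only: mirrors the disable -> remove -> device delete -> node delete
# sequence from the verification log. All helper names are hypothetical.

# Command used to invoke the CLI. Set HEKETI_BIN="echo heketi-cli" to print
# the commands instead of running them.
HEKETI_BIN="${HEKETI_BIN:-heketi-cli}"

# Read `heketi-cli node info` output on stdin and print one device ID per line.
# Assumes device lines have the form "Id:<hex>   Name:<path> ...".
device_ids() {
    sed -n 's/^Id:\([0-9a-f][0-9a-f]*\)[[:space:]][[:space:]]*Name:.*/\1/p'
}

# Take a node out of service and delete it, stopping at the first failure.
retire_node() {
    node_id="$1"
    $HEKETI_BIN node disable "$node_id" || return 1
    $HEKETI_BIN node remove "$node_id" || return 1
    # "node delete" is refused while the node still contains devices,
    # so delete each remaining device first.
    for dev in $($HEKETI_BIN node info "$node_id" | device_ids); do
        $HEKETI_BIN device delete "$dev" || return 1
    done
    $HEKETI_BIN node delete "$node_id"
}
```

With the functions loaded, `retire_node 92f58358e92be0f010523f87ca4ba017` would run the same steps as the log. With `HEKETI_BIN="echo heketi-cli"` the commands are only printed; in that mode no device IDs are discovered, so no `device delete` lines appear.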

Comment 14 Tejas Chaphekar 2017-08-21 08:20:44 UTC
The following builds were used for verification:

heketi-client-5.0.0-7.el7rhgs.x86_64
cns-deploy-5.0.0-14.el7rhgs.x86_64
Gluster - rhgs-server-rhel7:3.3.0-11
Heketi -  rhgs-volmanager-rhel7:3.3.0-9

Comment 16 Srijita Mukherjee 2017-10-03 09:52:55 UTC
Talur,

This bug has been proposed for CNS-3.6 release. Kindly review the doc text and acknowledge.

Comment 17 Raghavendra Talur 2017-10-04 09:24:58 UTC
Made a minor change to the command in the doc text. The rest looks good.

Comment 19 errata-xmlrpc 2017-10-11 07:07:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2879