Bug 1839998 - Replace host fails with gluster-maintenance ansible role
Summary: Replace host fails with gluster-maintenance ansible role
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhhi
Version: rhhiv-1.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHHI-V 1.8
Assignee: Prajith
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1840003
Blocks: RHHI-V-1.8-Engineering-Inflight-BZs
 
Reported: 2020-05-26 08:20 UTC by SATHEESARAN
Modified: 2020-08-04 14:52 UTC
CC List: 2 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1840003
Environment:
Last Closed: 2020-08-04 14:52:09 UTC
Embargoed:




Links:
Red Hat Product Errata RHEA-2020:3314, last updated 2020-08-04 14:52:26 UTC

Description SATHEESARAN 2020-05-26 08:20:52 UTC
Description of problem:
------------------------
Replacing a host with the same host fails at the volume reset task. As far as I understand, this is most likely due to the way the brick path names are extracted, using grep, from the 'gluster volume info' output.
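
For reference, the brick lines in the 'gluster volume info' output have the form below (hostnames and paths here are hypothetical); splitting on ':' with awk yields the brick path in field 3:

Brick1: host1.example.com:/gluster_bricks/vmstore/vmstore
Brick2: host2.example.com:/gluster_bricks/vmstore/vmstore
Brick3: host3.example.com:/gluster_bricks/vmstore/vmstore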

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
gluster-ansible-maintenance-1.0.1-2.el8rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
---------------------
1. Execute the playbook to replace the host with itself

Actual results:
----------------
Failure to do volume reset

Expected results:
------------------
volume reset command invocation should be successful

Comment 2 SATHEESARAN 2020-05-26 08:26:52 UTC
The fix should do 2 things:

1. Fix the existing logic that extracts the brick_path from the 'gluster volume info <vol>' output.

In this case, there is a chance of an asymmetric brick layout.

<existing_logic>
---
# Set up the volume management
- name: Fetch the directory and volume details
  block:
    - name: Get the list of volumes on the machine
      shell: ls "{{ glusterd_libdir }}/vols"
      register: dir_list

    - set_fact:
        volumes: "{{ dir_list.stdout.split() }}"

    # Find the list of bricks on the machine
    - name: Get the list of bricks corresponding to volume
      shell: >
        gluster vol info {{ item }} | grep "Brick.*{{ item }}:" |              <-------- greps for the volume name, not the hostname
        awk -F: '{ print $3 }'
      with_items: "{{ volumes }}"
      register: brick_list
</existing_logic>

So the new logic should grep for the hostname instead:

 -----> gluster volume info vmstore | grep {{ gluster_maintenance_cluster_node }} | awk -F: '{ print $3 }'

2. The above logic should also account for n x 3 replicate volumes, where there can be more than
one brick entry per volume that needs a 'volume reset' (see the sketch below).
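
A minimal sketch of the corrected task (the task name and structure here are assumptions based on the one-liner above): it reuses the 'volumes' fact from the existing logic and greps for the host rather than the volume name, so it picks up every brick path that the replaced node hosts, including multiple bricks per volume on n x 3 volumes:

<suggested_logic>
---
    # Find the bricks belonging to the replaced host, one or more per volume
    - name: Get the list of bricks hosted on the replaced node
      shell: >
        gluster vol info {{ item }} |
        grep "Brick.*{{ gluster_maintenance_cluster_node }}:" |
        awk -F: '{ print $3 }'
      with_items: "{{ volumes }}"
      register: brick_list
</suggested_logic>

Each entry of brick_list.results then carries the matching brick paths in stdout_lines, so the volume reset step can loop over all of them instead of assuming a single brick per volume.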

Comment 4 SATHEESARAN 2020-06-11 18:43:40 UTC
Verified with gluster-ansible-maintenance-1.0.1-4.el8rhgs

1. Once the host was replaced by the reinstalled host with the same FQDN,
the replace-brick operation worked correctly and the replace-brick command completed successfully.

2. Healing was triggered after this operation.

Comment 6 errata-xmlrpc 2020-08-04 14:52:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3314

