Bug 1839998

Summary: Replace host fails with gluster-maintenance ansible role
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: rhhi
Assignee: Prajith <pkesavap>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhhiv-1.8
CC: godas, rhs-bugs
Target Milestone: ---
Target Release: RHHI-V 1.8
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Clones: 1840003 (view as bug list)
Environment:
Last Closed: 2020-08-04 14:52:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1840003
Bug Blocks: 1779977

Description SATHEESARAN 2020-05-26 08:20:52 UTC
Description of problem:
------------------------
Replacing a host with the same host fails at the volume reset task. As I understand it, the likely cause is the way the brick path names are extracted, using grep, from the 'gluster volume info' output.
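
For illustration, the 'gluster volume info' output that the role parses looks
like this for a replica 3 volume (volume, host, and brick names are hypothetical):

Volume Name: vmstore
Type: Replicate
Number of Bricks: 1 x 3 = 3
Bricks:
Brick1: host1.example.com:/gluster_bricks/vmstore/vmstore
Brick2: host2.example.com:/gluster_bricks/vmstore/vmstore
Brick3: host3.example.com:/gluster_bricks/vmstore/vmstore

The task greps the Brick lines and splits on ':' to take the third field as
the brick path, so the grep pattern decides which bricks get picked up.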

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
gluster-ansible-maintenance-1.0.1-2.el8rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
---------------------
1. Execute the playbook to replace the host with itself

Actual results:
----------------
The volume reset task fails.

Expected results:
------------------
The volume reset command invocation should succeed.
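
Here 'volume reset' refers to resetting the replaced host's bricks. Assuming
the task wraps gluster's reset-brick CLI (volume, host, and brick path are
hypothetical), a successful invocation would look like:

# gluster volume reset-brick vmstore host1.example.com:/gluster_bricks/vmstore/vmstore start
# gluster volume reset-brick vmstore host1.example.com:/gluster_bricks/vmstore/vmstore host1.example.com:/gluster_bricks/vmstore/vmstore commit force

The start call takes the old brick offline and the commit force call re-adds
the same path, after which self-heal repopulates it.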

Comment 2 SATHEESARAN 2020-05-26 08:26:52 UTC
The fix should do two things:

1. Fix the existing logic that extracts the brick_path from the 'gluster volume info <vol>' output.

In this case, there is a chance of asymmetric brick paths, where the brick path does not follow the usual naming and need not contain the volume name, so matching on the volume name is unreliable.

<existing_logic>
---
# Set up the volume management
- name: Fetch the directory and volume details
  block:
    - name: Get the list of volumes on the machine
      shell: ls "{{ glusterd_libdir }}/vols"
      register: dir_list

    - set_fact:
        volumes: "{{ dir_list.stdout.split() }}"

    # Find the list of bricks on the machine
    - name: Get the list of bricks corresponding to volume
      shell: >
        gluster vol info {{ item }} | grep "Brick.*{{ item }}:" |    <-- greps on the volume name, not the hostname
        awk -F: '{ print $3 }'
      with_items: "{{ volumes }}"
      register: brick_list
</existing_logic>

So the new logic should do:

    gluster volume info vmstore | grep {{ gluster_maintenance_cluster_node }} | awk -F: '{ print $3 }'

2. The above logic should also account for n x 3 replicate volumes, where there can be more
than one brick per volume on the host that needs to be 'volume reset' (see the sketch below).
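
A minimal sketch of such a task, assuming 'gluster_maintenance_cluster_node'
holds the FQDN of the host being replaced (the variable name is taken from
the proposed logic above; the loop layout is illustrative):

<proposed_logic_sketch>
---
# Pick only the bricks that live on the replaced host, by grepping
# for the hostname instead of the volume name
- name: Get the list of bricks of the replaced host per volume
  shell: >
    gluster volume info {{ item }} |
    grep " {{ gluster_maintenance_cluster_node }}:" |
    awk -F: '{ print $3 }'
  with_items: "{{ volumes }}"
  register: brick_list

# An n x 3 volume can return several brick paths per volume, so loop
# over every stdout line instead of assuming a single entry
- name: Show each (volume, brick) pair that needs a volume reset
  debug:
    msg: "volume {{ item.0.item }} brick {{ item.1 }}"
  with_subelements:
    - "{{ brick_list.results }}"
    - stdout_lines
</proposed_logic_sketch>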

Comment 4 SATHEESARAN 2020-06-11 18:43:40 UTC
Verified with gluster-ansible-maintenance-1.0.1-4.el8rhgs

1. Once the host was replaced by the reinstalled host with the same FQDN,
the replace-brick operation completed successfully.

2. Healing was triggered after this operation.
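
For reference, heal progress after such a replacement can be checked per
volume (volume name illustrative):

# gluster volume heal vmstore info

This lists the entries still pending heal on each brick; "Number of entries: 0"
on all bricks means healing has completed.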

Comment 6 errata-xmlrpc 2020-08-04 14:52:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3314