Bug 1840003

Summary: Replace host fails with gluster-maintenance ansible role
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: gluster-ansible
Assignee: Prajith <pkesavap>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.5
CC: godas, pkesavap, pprakash, puebele, rhs-bugs, sabose, sasundar
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.5.z Batch Update 2
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: gluster-ansible-maintenance-1.0.1-4.el8rhgs
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1839998
Environment:
Last Closed: 2020-06-16 05:57:32 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1641431, 1839998

Description SATHEESARAN 2020-05-26 08:31:02 UTC
Description of problem:
------------------------
Replacing a host with the same host fails at the volume reset task. As I understand it, this is likely caused by the way the brick path names are extracted, using grep on the 'gluster volume info' output, as illustrated below.
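
For illustration, a hedged sketch of where the extraction can go wrong. The
volume name, hostnames and brick paths below are hypothetical and not taken
from the affected setup:

# 'gluster volume info vmstore' prints one "BrickN:" line per brick, e.g.
#
#   Brick1: host1.lab.example.com:/gluster_bricks/brick1/b1
#   Brick2: host2.lab.example.com:/gluster_bricks/brick1/b1
#   Brick3: host3.lab.example.com:/gluster_bricks/brick1/b1
#
# With an asymmetric brick path (the directory name does not contain the
# volume name), filtering on the volume name matches nothing:
gluster volume info vmstore | grep "Brick.*vmstore:" | awk -F: '{ print $3 }'

# Filtering on the FQDN of the host being replaced returns exactly the
# bricks hosted on that node:
gluster volume info vmstore | grep "host1.lab.example.com:" | awk -F: '{ print $3 }'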

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
gluster-ansible-maintenance-1.0.1-2.el8rhgs

How reproducible:
-----------------
Always

Steps to Reproduce:
---------------------
1. Execute the playbook to replace the host with itself

Actual results:
----------------
Failure to do volume reset

Expected results:
------------------
volume reset command invocation should be successful

Comment 1 SATHEESARAN 2020-05-26 08:31:25 UTC
The fix should do 2 things:

1. Fix the existing logic that extracts the brick_path from the 'gluster volume info <vol>' output.

In this case, there is a chance of asymmetric brick paths, i.e. the brick directory name may not contain the volume name.

<existing_logic>
---
# Set up the volume management
- name: Fetch the directory and volume details
  block:
    - name: Get the list of volumes on the machine
      shell: ls "{{ glusterd_libdir }}/vols"
      register: dir_list

    - set_fact:
        volumes: "{{ dir_list.stdout.split() }}"

    # Find the list of bricks on the machine
    - name: Get the list of bricks corresponding to volume
      shell: >
        gluster vol info {{ item }} | grep "Brick.*{{ item }}:" |              <-------- greps for the volume name, not the hostname
        awk -F: '{ print $3 }'
      with_items: "{{ volumes }}"
      register: brick_list
</existing_logic>

So the new logic should be along the lines of:

 -----> gluster volume info vmstore | grep {{gluster_maintenance_cluster_node}}  | awk -F: '{ print $3 }'

2. The above logic should also consider the possibility of an nx3 replicate volume, where there could be more than
one brick entry per volume to be 'volume reset', as sketched below.
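
A minimal sketch of a task along these lines (illustration only, not the
actual role change; 'volumes' and 'gluster_maintenance_cluster_node' come
from this report, everything else is an assumption):

<proposed_logic_sketch>
---
# Filter the Brick lines by the host being replaced and keep every match, so
# that an nx3 replicate volume resets all of its bricks on that node.
- name: Get the list of bricks on the replaced host for each volume
  shell: >
    gluster volume info {{ item }} |
    grep "^Brick.*{{ gluster_maintenance_cluster_node }}:" |
    awk -F: '{ print $3 }'
  with_items: "{{ volumes }}"
  register: brick_list

# Each result carries the volume name (item.item) and one brick path per line
# of output (item.stdout_lines), ready for the per-brick volume reset step.
- debug:
    msg: "{{ item.item }} -> {{ item.stdout_lines }}"
  with_items: "{{ brick_list.results }}"
</proposed_logic_sketch>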

Comment 2 Gobinda Das 2020-05-26 11:29:52 UTC
PR: https://github.com/gluster/gluster-ansible/pull/108

Comment 4 SATHEESARAN 2020-05-27 05:49:23 UTC
(In reply to Gobinda Das from comment #2)
> PR: https://github.com/gluster/gluster-ansible/pull/108

Gobinda,

Let's track the replace-host playbook with this bug - https://bugzilla.redhat.com/show_bug.cgi?id=1641431

This bug is specifically to fix the gluster-ansible-maintenance task that does volume restoration for the
replace-host workflow.

Comment 5 SATHEESARAN 2020-05-27 08:11:39 UTC
*** Bug 1840540 has been marked as a duplicate of this bug. ***

Comment 8 SATHEESARAN 2020-06-08 10:44:03 UTC
Tested with RHV 4.4.1 and gluster-ansible-maintenance-1.0.3.el8rhgs

There are 2 failures:

1. A syntax error caused by an unremoved "-block" left in the role (an illustrative sketch follows below)
2. A semantic error, where 'gluster_ansible_cluster_node' is used in the replace-host workflow
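
For context, one plausible shape of such a leftover wrapper (an assumed
illustration only, not the actual content of the role): a '- block:' whose
tasks were moved out but which itself was left behind is empty and fails when
the role is loaded.

# Fails to load: the tasks were hoisted out, leaving an empty block.
- name: Fetch the directory and volume details
  block:

# Loads fine: drop the empty wrapper and keep the tasks at the top level.
- name: Get the list of volumes on the machine
  shell: ls "{{ glusterd_libdir }}/vols"
  register: dir_list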

Comment 10 SATHEESARAN 2020-06-11 18:42:45 UTC
Verified with gluster-ansible-maintenance-1.0.1-4.el8rhgs

1. Once the host was replaced by the reinstalled host with the same FQDN,
replace-brick worked well and the replace-brick command was successful

2. Healing was triggered after this operation (see the command sketch below)
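
For reference, the generic command forms behind these two checks (syntax
only; no values from the verified setup are implied):

# Brick replacement as driven by the replace-host workflow:
gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force

# Pending heal entries per brick; the counts drain to zero as self-heal
# completes:
gluster volume heal <VOLNAME> info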

Comment 12 errata-xmlrpc 2020-06-16 05:57:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2575