Bug 1676921

Summary: Rollback lvm2 package to version lvm2-2.02.180-10.el7_6.2.x86_64 in rhgs-server-container
Product: Red Hat Gluster Storage Reporter: Humble Chirammal <hchiramm>
Component: rhgs-server-container Assignee: Saravanakumar <sarumuga>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: urgent Docs Contact:
Priority: urgent    
Version: ocs-3.11 CC: akrishna, hchiramm, jrivera, knarra, kramdoss, madam, ndevos, pasik, rcyriac, rgeorge, rhs-bugs, sankarshan
Target Milestone: --- Keywords: ZStream
Target Release: OCS 3.11.z Async   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: rhgs3/rhgs-server-rhel7:3.11.1-15 Doc Type: Bug Fix
Doc Text:
The lvm2-2.02.180-10.el7_6.3 package introduces a new way of detecting MDRAID and multipath devices. This new technique relies on data gathered by udev, which is not available in the container, so LVM commands take more time. As a consequence, higher level operations done by Heketi receive timeouts and creating new bricks on gluster volumes fails. With this fix, the LVM2 package and its dependencies are downgraded to the previous known working version. As a result, no connection is made to udev and the initialization of block devices by the gluster server containers works again.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-02-20 04:23:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1674475    

Description Humble Chirammal 2019-02-13 15:43:58 UTC
Description of problem:

The rhgs-server container shipped with OCS 3.11.1 GA has an issue with the latest lvm2 package (lvm2-2.02.180-10.el7_6.3.x86_64), as discussed in https://bugzilla.redhat.com/show_bug.cgi?id=1674485. To work around this issue, the lvm2 package in the rhgs-server container has to be rolled back to the version below:

lvm2-2.02.180-10.el7_6.2.x86_64

This report is a request for the same.
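
For illustration, a rough sketch of the rollback inside the container image; the exact package set and the yum invocation are assumptions, not the actual image build recipe:

# downgrade lvm2 to the known working build (lvm2-libs shown as an assumed companion package)
yum downgrade -y lvm2-2.02.180-10.el7_6.2.x86_64 lvm2-libs-2.02.180-10.el7_6.2.x86_64
# verify the resulting versions
rpm -q lvm2 lvm2-libs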

Version-Release number of selected component (if applicable):

lvm2-2.02.180-10.el7_6.3.x86_64



Additional info:

Mail:  http://post-office.corp.redhat.com/archives/cns-team/2019-February/msg00187.html

LVM 2 bug : https://bugzilla.redhat.com/show_bug.cgi?id=1676612

Comment 2 Niels de Vos 2019-02-15 09:35:36 UTC
In addition to the doc-text, the following may be useful:

The "use udev for detection of MDRAID and multipath devices" has been introduced through https://access.redhat.com/errata/RHBA-2019:0187 (bug 1657640). This new feature is currently not disabled through the 'obtain_device_list_from_udev' configuration option in /etc/lvm/lvm.conf. Once bug 1676612 has been addressed, the rhgs-gluster-server container can use the configuration option to disable the udev usage (bug 1676466).

Comment 5 Saravanakumar 2019-02-16 07:27:16 UTC
*** Bug 1677821 has been marked as a duplicate of this bug. ***

Comment 6 RamaKasturi 2019-02-16 08:38:34 UTC
Moving the bug to assigned state as I see that the lvm2 version in the latest container is lvm2-2.02.180-10.el7_6.3.x86_64, whereas it was supposed to be lvm2-2.02.180-10.el7_6.2.x86_64.

Comment 7 RamaKasturi 2019-02-16 08:42:49 UTC
Moving this back to ON_QA as I see the correct lvm2 version present in the container image.

Comment 8 Anjana KD 2019-02-18 13:22:09 UTC
Hello Niels,

I have updated the doc text; kindly review it for technical accuracy.

Comment 10 Anjana KD 2019-02-19 07:12:56 UTC
Thank you for the review, Humble.

Canceling the needinfo on Niels.

Comment 11 Anjana KD 2019-02-19 08:01:11 UTC
Hi Humble, could you please ack the doc text here?

Comment 12 Humble Chirammal 2019-02-19 08:54:52 UTC
(In reply to Anjana from comment #11)
> Hi Humble, could you please ack it here for the doc text.

LGTM. Thanks!

Comment 13 Michael Adam 2019-02-19 16:16:20 UTC
Agree with Humble: looks good in general.

Just 2 minor grammar nits:

s/operations ... receives timeout/operations ... receive timeout/
s/creating new ... fail/creating new ... fails/

Comment 14 RamaKasturi 2019-02-19 17:31:51 UTC
Below are the tests performed to validate the fix.

1) Fresh install on vmware and aws environment.
2) Upgrade on vmware and aws environment.

Image used for testing these packages is rhgs3/rhgs-server-rhel7:3.11.1-15

Below are the versions of gluster and lvm2 present in the container:
==================================================================

[ec2-user@ip-172-16-16-77 ~]$ oc rsh glusterfs-storage-mnlkh 
sh-4.2# cat /etc/redhat-storage-release 
Red Hat Gluster Storage Server 3.4.2(Container)
sh-4.2# rpm -qa | grep gluster
glusterfs-libs-3.12.2-32.el7rhgs.x86_64
glusterfs-3.12.2-32.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-32.el7rhgs.x86_64
glusterfs-server-3.12.2-32.el7rhgs.x86_64
gluster-block-0.2.1-30.el7rhgs.x86_64
glusterfs-api-3.12.2-32.el7rhgs.x86_64
glusterfs-cli-3.12.2-32.el7rhgs.x86_64
python2-gluster-3.12.2-32.el7rhgs.x86_64
glusterfs-fuse-3.12.2-32.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-32.el7rhgs.x86_64
sh-4.2# rpm -qa | grep lvm
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64

pvs & pvscan on the gluster pod:
======================================

sh-4.2# pvs
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g    0 
  /dev/xvdf  vg_558b9fb496a0c15f5d9e41bc323a114a lvm2 a--     1.95t 1.80t
sh-4.2# pvscan
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV /dev/xvdf    VG vg_558b9fb496a0c15f5d9e41bc323a114a   lvm2 [1.95 TiB / 1.80 TiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [2.05 TiB] / in use: 2 [2.05 TiB] / in no VG: 0 [0   ]


pvs & pvscan on the node where the gluster pod is running:
=======================================================
[ec2-user@ip-172-16-37-138 ~]$ sudo pvs
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g    0 
  /dev/xvdf  vg_558b9fb496a0c15f5d9e41bc323a114a lvm2 a--     1.95t 1.80t
[ec2-user@ip-172-16-37-138 ~]$ sudo pvscan
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV /dev/xvdf    VG vg_558b9fb496a0c15f5d9e41bc323a114a   lvm2 [1.95 TiB / 1.80 TiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [2.05 TiB] / in use: 2 [2.05 TiB] / in no VG: 0 [0   ]


Logs for the command 'heketi-cli server state examine gluster' are available at the link below.

http://rhsqe-repo.lab.eng.blr.redhat.com/cns/311async/

Moving this bug to verified state since no udev-related issues are seen and the container image has lvm2 version lvm2-2.02.180-10.el7_6.2.x86_64.
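
For convenience, the same version check can be repeated across all gluster pods with something like the following; the 'glusterfs=storage-pod' label selector is an assumption and may differ per deployment:

# assumed label selector; adjust to match the deployment
for pod in $(oc get pods -l glusterfs=storage-pod -o name); do
    oc rsh "$pod" rpm -q lvm2 lvm2-libs
done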

Comment 16 errata-xmlrpc 2019-02-20 04:23:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0383