Bug 1674485
| Summary: | Cannot install latest version of ocp + ocs in aws environment. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RamaKasturi <knarra> |
| Component: | cns-ansible | Assignee: | John Mulligan <jmulligan> |
| Status: | CLOSED WORKSFORME | QA Contact: | Prasanth <pprakash> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | ocs-3.11 | CC: | admin, bkunal, bward, dyocum, hchiramm, jarrpa, jocelyn.thode, knarra, kramdoss, madam, mrobson, nchilaka, ndevos, nravinas, pasik, pdwyer, puebele, rcyriac, rhs-bugs, sarumuga |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-20 03:48:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1676466, 1676612, 1684133 | | |
| Bug Blocks: | | | |
Description RamaKasturi 2019-02-11 13:12:24 UTC
Will be uploading the logs shortly.

While running LVM commands in the glusterfs pod, there are many references to udev. It is not recommended to run udev in both the host and the pod, so we configure LVM to not use udev.

sh-4.2# pvs -vvv
...
lvmcache /dev/xvdb1: VG dockervg: set VGID to QMLH4dQ53pkpyho6e7qxf1pOpUIFSbzg.
Opened /dev/xvda RO O_DIRECT
/dev/xvda: size is 209715200 sectors
Closed /dev/xvda
/dev/xvda: Skipping: Partition table signature found
dm version [ opencount flush ] [16384] (*1)
dm status (253:0) [ noopencount noflush ] [16384] (*1)
Opened /dev/dockervg/dockerlv RO O_DIRECT
/dev/dockervg/dockerlv: size is 209707008 sectors
Closed /dev/dockervg/dockerlv
/dev/dockervg/dockerlv: using cached size 209707008 sectors
Device /dev/dockervg/dockerlv not initialized in udev database (1/100, 0 microseconds).
Device /dev/dockervg/dockerlv not initialized in udev database (2/100, 100000 microseconds).
Device /dev/dockervg/dockerlv not initialized in udev database (3/100, 200000 microseconds).
Device /dev/dockervg/dockerlv not initialized in udev database (4/100, 300000 microseconds).
Device /dev/dockervg/dockerlv not initialized in udev database (5/100, 400000 microseconds).
...

All of this takes a pretty long time. There is an option in /etc/lvm/lvm.conf that we should probably disable:

obtain_device_list_from_udev = 1

Even when this option is disabled, the messages still occur. ltrace shows that libudev is still used a lot:

sh-4.2# ltrace -e '*udev*' pvs
...
libudev.so.1->udev_list_entry_get_next(0, 0x7f554bfed778, 0xffffffff, 0x55f6dcd0e000) = 0
libudev.so.1->udev_list_entry_get_next(0, 0x7f554bfed778, 0xffffffff, 0x55f6dcd0e000) = 0
libudev.so.1->udev_list_entry_get_next(0, 0x7f554bfed778, 0xffffffff, 0x55f6dcd0e000) = 0
<... udev_device_unref resumed> ) = 0
pvs->udev_device_new_from_devnum(0x55f6dccd5010, 98, 0xfd00, 0x55f6dcd0d830 <unfinished ...>
libudev.so.1->udev_device_new_from_syspath(0x55f6dccd5010, 0x7ffd81a312e0, 0x7ffd81a312f4, 20) = 0x55f6dcd0d840
<... udev_device_new_from_devnum resumed> ) = 0x55f6dcd0d840
pvs->udev_device_get_is_initialized(0x55f6dcd0d840, 0x55f6dcd0e0c2, 0x7f554bd99b20, 0 <unfinished ...>
libudev.so.1->udev_device_get_subsystem(0x55f6dcd0d840, 0, 0x7f554bd99b20, 0) = 0x55f6dcd0e0a0
libudev.so.1->udev_device_get_devnum(0x55f6dcd0d840, 0x55f6dcd0de18, 0, 0) = 0xfd00
libudev.so.1->udev_device_get_devnum(0x55f6dcd0d840, 0, 253, 0) = 0xfd00
libudev.so.1->udev_device_get_devnum(0x55f6dcd0d840, 0, 253, 0) = 0xfd00
libudev.so.1->udev_device_get_subsystem(0x55f6dcd0d840, 0, 253, 0) = 0x55f6dcd0e0a0
<... udev_device_get_is_initialized resumed> ) = 0
pvs->udev_device_unref(0x55f6dcd0d840, 0, 0, -1 <unfinished ...>
libudev.so.1->udev_list_entry_get_next(0, 0x7f554bfed768, 0xffffffff, 0x55f6dcd0e500) = 0
...

Looking at the lvm2 code, there is a recent change that pretty much describes our problem:

- https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3ebce8dbd2d9afc031e0737f8feed796ec7a8df9;hp=d19e3727951853093828b072e254e447f7d61c60

    apply obtain_device_list_from_udev to all libudev usage

    udev_dev_is_md_component and udev_dev_is_mpath_component are not used for
    obtaining the device list, but they still use libudev for device info.
    When there are problems with udev, these functions can get stuck. So, use
    the existing obtain_device_list_from_udev config setting to also control
    whether these "is component" functions are used, which gives us a way to
    avoid using libudev entirely when it's causing problems.
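For reference, the workaround discussed above boils down to one lvm.conf edit; a minimal sketch (run inside the gluster pod, assuming the stock RHEL 7 lvm.conf, where the setting lives in the "devices" section):

--- %< ---
# flip the setting so LVM scans /dev itself instead of asking udev
sed -i 's/obtain_device_list_from_udev = 1/obtain_device_list_from_udev = 0/' /etc/lvm/lvm.conf

# confirm the effective value (0 = do not consult udev for the device list)
lvmconfig devices/obtain_device_list_from_udev
--- %< ---

As the commit message above explains, with an unpatched lvm2 this alone is not sufficient, because the "is component" checks keep using libudev regardless of the setting.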
I'll run some tests to see if this fixes the problem for us:

1. edit lvm.conf and disable 'obtain_device_list_from_udev'
2. build a patched lvm2 package and install it in a running pod

In two different pods, the 1st with the original lvm2 packages and config:

--- %< ---
sh-4.2# time pvs
  WARNING: Device /dev/dockervg/dockerlv not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/vg_9beb04a3ac2e50d6ff4ca1e44fde85bc/bz1674485 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/xvda2 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/xvdb1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/xvdc not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/xvdd not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/xvdf not initialized in udev database even after waiting 10000000 microseconds.
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g       0
  /dev/xvdf  vg_9beb04a3ac2e50d6ff4ca1e44fde85bc lvm2 a--   999.87g 997.87g

real    1m10.215s
user    0m0.044s
sys     0m0.066s
--- %< ---

And the 2nd pod with updated lvm2 packages and obtain_device_list_from_udev=0 in lvm.conf:

--- %< ---
sh-4.2# time pvs
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g       0
  /dev/xvdf  vg_4f47afe74c28587a6f0b4110eae142cd lvm2 a--   999.87g 997.87g

real    0m0.031s
user    0m0.004s
sys     0m0.010s
--- %< ---

The updated packages (+src.rpm) are at https://people.redhat.com/ndevos/bz1674485/
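To reproduce this before/after timing comparison on any cluster, a sketch along these lines could be used (the pod name is an example; substitute one from your own deployment):

--- %< ---
POD=glusterfs-storage-6vstf    # example pod name

# how long does a plain scan take inside the pod?
oc exec "$POD" -- sh -c 'time pvs' 2>&1 | tail -n 4

# which lvm2 build is the pod actually running?
oc exec "$POD" -- rpm -q lvm2 lvm2-libs
--- %< ---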
kasturi, thanks for providing these details. It would also be appreciated if you could provide some more details from the host:

# rpm -qa | egrep 'lvm2|systemd|udev'

Output from the hosts where gluster pods are running:
==========================================================

[ec2-user@ip-172-31-25-77 ~]$ ansible -i inv_3.11 nodes -b -m shell -a "rpm -qa |egrep 'lvm2|systemd|udev'"
 [WARNING]: Consider using the yum, dnf or zypper module rather than running rpm. If you need to use command because yum, dnf or zypper is insufficient you can add warn=False to this command task or set command_warnings=False in ansible.cfg to get rid of this message.

ip-172-16-43-224.ap-south-1.compute.internal | SUCCESS | rc=0 >>
python-gudev-147.2-7.el7.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64
python-pyudev-0.15-9.el7.noarch
oci-systemd-hook-0.1.18-3.git8787307.el7_6.x86_64
systemd-libs-219-62.el7.x86_64
libgudev1-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64

ip-172-16-25-44.ap-south-1.compute.internal | SUCCESS | rc=0 >>
python-gudev-147.2-7.el7.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64
python-pyudev-0.15-9.el7.noarch
oci-systemd-hook-0.1.18-3.git8787307.el7_6.x86_64
systemd-libs-219-62.el7.x86_64
libgudev1-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64

ip-172-16-30-46.ap-south-1.compute.internal | SUCCESS | rc=0 >>
python-gudev-147.2-7.el7.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64
python-pyudev-0.15-9.el7.noarch
oci-systemd-hook-0.1.18-3.git8787307.el7_6.x86_64
systemd-libs-219-62.el7.x86_64
libgudev1-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64

Niels, we can disable the param (obtain_device_list_from_udev) in lvm.conf and commit an image for QE testing. That's the easiest and quickest thing we can do here to isolate the issue.

kasturi, were we able to do a successful deployment in another cloud or on a baremetal cluster? I am trying to understand what is special about the AWS instances.

(In reply to Humble Chirammal from comment #12)
> Niels, we can disable the param (obtain_device_list_from_udev) in lvm.conf
> and commit an image for QE testing. Thats the easiest or quick thing we
> can do here to isolate the issue.
>
> kasturi, Were we able to do successful deployment in other cloud or
> baremetal cluster ?

Our primary test environment has been vmware and we did not face any issues related to deployment on this environment.

> I am trying to understand whats special about AWS instance.

Setting needinfo back on ndevos as this got cleared.

(In reply to Humble Chirammal from comment #12)
> Niels, we can disable the param (obtain_device_list_from_udev) in lvm.conf
> and commit an image for QE testing. Thats the easiest or quick thing we
> can do here to isolate the issue.

Unfortunately, changing obtain_device_list_from_udev is not sufficient. LVM has a bug where it still tries to use udev for some actions (or some devices?). The patch mentioned in comment #8 is not included in a build of lvm2 yet (that is why I did a test build for comment #9). I am checking with the lvm developers whether they already have a bz for getting that patch included; otherwise we'll open a new one.

(In reply to RamaKasturi from comment #13)
> Our primary test environment has been vmware and we did not face any issues
> related to deployment on this environment.

Kasturi, could you please fetch the lvm, udev and systemd package versions from a system where we do ***not*** hit this issue? The RCA mapping between AWS and our OCS 3.11.1 container has to be established somehow, hence the question.
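One mechanical way to establish that mapping, sketched with placeholder hostnames (the package filter is the same one used above):

--- %< ---
# package sets from a host that hits the issue (aws) and one that does not (vmware)
ssh <aws-node>    "rpm -qa | egrep 'lvm2|udev|systemd' | sort" > aws.txt
ssh <vmware-node> "rpm -qa | egrep 'lvm2|udev|systemd' | sort" > vmware.txt

# any version skew between the two environments shows up here
diff -u vmware.txt aws.txt
--- %< ---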
*** Bug 1675134 has been marked as a duplicate of this bug. ***

I took a look at the vmware setup where QE didn't hit the installation issue; however, I still see the udev db fetch take a long time on this setup too.

[root@dhcp46-138 ~]# oc get pods -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP             NODE                                NOMINATED NODE
glusterblock-storage-provisioner-dc-1-79sn5   1/1     Running   0          4d    10.130.0.2     dhcp47-37.lab.eng.blr.redhat.com    <none>
glusterfs-storage-6vstf                       1/1     Running   1          4d    10.70.46.115   dhcp46-115.lab.eng.blr.redhat.com   <none>
glusterfs-storage-88jxh                       1/1     Running   0          4d    10.70.47.37    dhcp47-37.lab.eng.blr.redhat.com    <none>
glusterfs-storage-zcppk                       1/1     Running   0          4d    10.70.47.2     dhcp47-2.lab.eng.blr.redhat.com     <none>
heketi-storage-1-vzpwb                        1/1     Running   0          4d    10.130.0.7     dhcp47-37.lab.eng.blr.redhat.com    <none>

Inside the 'glusterfs-storage-6vstf' pod:

[root@dhcp46-115 /]# time pvs
  WARNING: Device /dev/rhel_dhcp47-42/root not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda2 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda3 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/vg_0f0b820a8ce7358d933692c82db220fe/brick_539df0f01872cc2d2250f8344de1d1e3 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/rhel_dhcp47-42/home not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-a15d2327feccd20bfbdec749b9ae6a8f6e3e8c84ec6317057fb0d6e470c8f743 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-0db4c3fb7c2a280f4e2e7384a7108630c4c71f13b49a00a8d8323cbf47d654fb not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-c9d00c30643dc58c88c339e3353dc115af8630fde26272e9b771c4b8d51530f3 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-ffa8367b3094fdcc953f823ee6b1f71fa9da5da436f835c8f0f2b15e4586717d not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-404e60308514ee2e68bd3460e1b597b5c09af3957ce3263dccb767a329f6b902 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-53a6ace3a8119115dfde7716553919bd07cfc9ced0ab81bf3d1c7f91a951ccaf not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-b1c63776580ea421898bd35ac5c97137015075d94e1911f49e2783471713e4da not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sdb1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/mapper/docker-8:17-11610-ae756cef22b62b9b7b810ff3e7b8102251a98592f0d074ae74a832a9ede9f476 not initialized in udev database even after waiting 10000000 microseconds.

[root@dhcp46-115 ~]# uname -r
3.10.0-957.5.1.el7.x86_64

[root@dhcp46-115 ~]# rpm -qa |egrep 'lvm2|udev|systemd'
lvm2-2.02.180-10.el7_6.3.x86_64
python-pyudev-0.15-9.el7.noarch
python-gudev-147.2-7.el7.x86_64
oci-systemd-hook-0.1.18-3.git8787307.el7_6.x86_64
systemd-libs-219-62.el7_6.3.x86_64
systemd-219-62.el7_6.3.x86_64
systemd-sysv-219-62.el7_6.3.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64
libgudev1-219-62.el7_6.3.x86_64

-----------------------------

The pending query is: how did this setup pass the installation stage? kasturi, can you please confirm the ansible installation is unchanged between this VMWARE and AWS setup?

Hello Humble & Niels, I have tested the build provided at comment 20 and I could get the install to succeed on the aws environment. Below are some of the output details.
INSTALLER STATUS ************************************************************************************************************************************************************
Initialization               : Complete (0:02:29)
Health Check                 : Complete (0:01:13)
Node Bootstrap Preparation   : Complete (0:59:03)
etcd Install                 : Complete (0:13:48)
Master Install               : Complete (0:35:00)
Master Additional Install    : Complete (0:20:59)
Node Join                    : Complete (0:02:33)
GlusterFS Install            : Complete (0:10:41)
Hosted Install               : Complete (0:05:28)
Cluster Monitoring Operator  : Complete (0:02:27)
Web Console Install          : Complete (0:01:53)
Console Install              : Complete (0:01:34)
metrics-server Install       : Complete (0:00:01)
Service Catalog Install      : Complete (0:09:42)

Images used for the test:
================================

[ec2-user@ip-172-16-28-144 ~]$ sudo docker images
REPOSITORY                                                                     TAG                 IMAGE ID       CREATED        SIZE
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7   ocs3.11-testbuild   6e57b47ca4e7   18 hours ago   268 MB
registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7                         v3.11               f4a8b6113476   11 days ago    277 MB
registry.access.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7                 v3.11               e74761279746   11 days ago    957 MB

pvs and pvscan output:
========================

[ec2-user@ip-172-16-28-144 ~]$ oc exec -it glusterfs-storage-fhwnx bash
[root@ip-172-16-19-162 /]# pvs
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g        0
  /dev/xvdf  vg_cd65d867303a3297078a165dce522092 lvm2 a--   999.87g <997.85g
[root@ip-172-16-19-162 /]# pvscan
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  PV /dev/xvdf    VG vg_cd65d867303a3297078a165dce522092   lvm2 [999.87 GiB / <997.85 GiB free]
  Total: 2 [1.07 TiB] / in use: 2 [1.07 TiB] / in no VG: 0 [0   ]

lvm version:
======================

[root@ip-172-16-19-162 /]# rpm -qa | grep lvm
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64

pvs, pvscan & lvm output from the node hosting the second pod:
==========================================================

[ec2-user@ip-172-16-28-144 ~]$ oc exec -it glusterfs-storage-nbggv bash
[root@ip-172-16-17-223 /]# pvs
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g        0
  /dev/xvdf  vg_a91d09b1b311fe03a90e4fc5d741ed06 lvm2 a--   999.87g <997.85g
[root@ip-172-16-17-223 /]# pvscan
  PV /dev/xvdf    VG vg_a91d09b1b311fe03a90e4fc5d741ed06   lvm2 [999.87 GiB / <997.85 GiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [1.07 TiB] / in use: 2 [1.07 TiB] / in no VG: 0 [0   ]
[root@ip-172-16-17-223 /]# rpm -qa | grep lvm
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64
[root@ip-172-16-17-223 /]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
[root@ip-172-16-17-223 /]# cat /etc/redhat-storage-release
Red Hat Gluster Storage Server 3.4.2(Container)

Third node:
===================

[ec2-user@ip-172-16-28-144 ~]$ oc exec -it glusterfs-storage-vmv2b bash
[root@ip-172-16-38-52 /]# pvs
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g        0
  /dev/xvdf  vg_933fad2019ed74e294c19b0635ae6d97 lvm2 a--   999.87g <997.85g
[root@ip-172-16-38-52 /]# pvscan
  PV /dev/xvdf    VG vg_933fad2019ed74e294c19b0635ae6d97   lvm2 [999.87 GiB / <997.85 GiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [1.07 TiB] / in use: 2 [1.07 TiB] / in no VG: 0 [0   ]
[root@ip-172-16-38-52 /]# rpm -qa | grep lvm
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64

Created a file and a block PVC on the default gluster storage classes, and both work fine:
========================================================================================

[ec2-user@ip-172-16-28-144 ~]$ oc get pvc
NAME         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS              AGE
blockvtest   Bound     pvc-7ec61e7f-2f93-11e9-b418-029d9c8ade0e   10Gi       RWO            glusterfs-storage-block   7s
filevtest    Bound     pvc-601cef03-2f93-11e9-b418-029d9c8ade0e   10Gi       RWO            glusterfs-storage         58s
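For reference, the file PVC used for that check boils down to something like this (a sketch; the name, size, access mode, and storage class are taken from the oc get pvc output above):

--- %< ---
oc create -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: filevtest
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: glusterfs-storage
EOF
--- %< ---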
@Humble / Niels, above are the basic validations I have performed on the aws setup. Are there any other tests we need to run before we say that the container image works fine?

I see that a needinfo on Jose has been cleared, so setting it back.

Hey guys, I also hit this issue on our OKD 3.11 cluster. I have summarized part of my findings here: https://github.com/gluster/gluster-containers/issues/128

I did not encounter this problem on an older gluster-centos image which was running gluster 4.1.6 with the following packages:

sh-4.2# rpm -qa |egrep 'lvm2|udev|systemd'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7.x86_64
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64
systemd-libs-219-62.el7.x86_64
systemd-sysv-219-62.el7.x86_64

However, when using the latest built gluster-centos image, which has 4.1.7 in it, the lvm packages were updated as well, to the following versions:

sh-4.2# rpm -qa |egrep 'lvm2|udev|systemd'
python-pyudev-0.15-9.el7.noarch
systemd-219-62.el7_6.3.x86_64
lvm2-libs-2.02.180-10.el7_6.3.x86_64
lvm2-2.02.180-10.el7_6.3.x86_64
systemd-libs-219-62.el7_6.3.x86_64
systemd-sysv-219-62.el7_6.3.x86_64

With these new versions I can always reproduce this issue on our OKD cluster.

Is this issue resolved with the fix provided by bug 1676921?

My customer deployed the latest successfully and determined this issue was resolved by the fix provided by bug 1676921. Kasturi, Jose, what do you think about closing this as a duplicate of bug 1676921?

Hello Niels, I feel that the other bug was raised to track the lvm package downgrade, so this bug cannot be closed as a duplicate of it. It would be good if you could move this to ON_QA and I will verify it. Thanks, kasturi

With the current rhgs-server container this problem should not happen anymore.
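The fix in that container is the lvm2 downgrade discussed above (the el7_6.2 build instead of el7_6.3). A sketch of what that pin amounts to in an image build follows; how the official image actually implements it is an assumption, and the versions are taken from the rpm -qa outputs in this bug:

--- %< ---
# sketch only: keep the container on the lvm2 build that does not
# hang waiting for the udev database
yum downgrade -y lvm2-2.02.180-10.el7_6.2 lvm2-libs-2.02.180-10.el7_6.2
--- %< ---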
I do not see this issue happening anymore with the rhgs-server-container build rhgs3/rhgs-server-rhel7:3.11.1-15. The fix for this was to have a downgraded lvm package in the build. Below are the tests performed to validate the bug:

1) Fresh install on vmware and aws environments.
2) Upgrade on vmware and aws environments.

The image used for these tests is rhgs3/rhgs-server-rhel7:3.11.1-15.

Below are the versions of gluster and lvm2 present in the container:
==================================================================

[ec2-user@ip-172-16-16-77 ~]$ oc rsh glusterfs-storage-mnlkh
sh-4.2# cat /etc/redhat-storage-release
Red Hat Gluster Storage Server 3.4.2(Container)
sh-4.2# rpm -qa | grep gluster
glusterfs-libs-3.12.2-32.el7rhgs.x86_64
glusterfs-3.12.2-32.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-32.el7rhgs.x86_64
glusterfs-server-3.12.2-32.el7rhgs.x86_64
gluster-block-0.2.1-30.el7rhgs.x86_64
glusterfs-api-3.12.2-32.el7rhgs.x86_64
glusterfs-cli-3.12.2-32.el7rhgs.x86_64
python2-gluster-3.12.2-32.el7rhgs.x86_64
glusterfs-fuse-3.12.2-32.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-32.el7rhgs.x86_64
sh-4.2# rpm -qa | grep lvm
lvm2-libs-2.02.180-10.el7_6.2.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64

pvs & pvscan in the gluster pod:
======================================

sh-4.2# pvs
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g     0
  /dev/xvdf  vg_558b9fb496a0c15f5d9e41bc323a114a lvm2 a--     1.95t 1.80t
sh-4.2# pvscan
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV /dev/xvdf    VG vg_558b9fb496a0c15f5d9e41bc323a114a   lvm2 [1.95 TiB / 1.80 TiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [2.05 TiB] / in use: 2 [2.05 TiB] / in no VG: 0 [0   ]

pvs & pvscan on the node where the gluster pod is running:
=======================================================

[ec2-user@ip-172-16-37-138 ~]$ sudo pvs
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/xvdb1 dockervg                            lvm2 a--  <100.00g     0
  /dev/xvdf  vg_558b9fb496a0c15f5d9e41bc323a114a lvm2 a--     1.95t 1.80t
[ec2-user@ip-172-16-37-138 ~]$ sudo pvscan
  /run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV /dev/xvdf    VG vg_558b9fb496a0c15f5d9e41bc323a114a   lvm2 [1.95 TiB / 1.80 TiB free]
  PV /dev/xvdb1   VG dockervg                              lvm2 [<100.00 GiB / 0    free]
  Total: 2 [2.05 TiB] / in use: 2 [2.05 TiB] / in no VG: 0 [0   ]

Logs for the command 'heketi-cli server state examine gluster' are available at the link below:

http://rhsqe-repo.lab.eng.blr.redhat.com/cns/311async/

Moving this bug to verified state, since no udev-related issues are seen and the container image has lvm2 version lvm2-2.02.180-10.el7_6.2.x86_64.
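For anyone re-checking an existing cluster against this bug, a quick sweep over the gluster pods (a sketch; the glusterfs-storage pod-name prefix is taken from the outputs above):

--- %< ---
# print the lvm2 build running in every gluster storage pod;
# 2.02.180-10.el7_6.3 is the problematic build, el7_6.2 the downgraded one
for p in $(oc get pods -o name | grep glusterfs-storage); do
    echo "== ${p#*/}"
    oc exec "${p#*/}" -- rpm -q lvm2 lvm2-libs
done
--- %< ---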