Created attachment 1324893 [details]
Files created by cns deployer

Description of problem:
Installing OCP 3.6 with a GlusterFS backend for the registry fails.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.6.173.0.21-2.git.0.44a4038.el7.noarch
Kernel: Linux node01 3.10.0-693.2.1.el7.x86_64 #1 SMP Fri Aug 11 04:58:43 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Docker storage is overlay2, NOT device mapper.
All VMs targeted to run Gluster have the brick device /dev/sdc available.

How reproducible:
Use the following GlusterFS-related sections in the inventory file:

[OSEv3:children]
masters
nodes
etcd
glusterfs
glusterfs_registry
...
openshift_hosted_registry_selector='region=infra'
openshift_hosted_registry_replicas=3
openshift_hosted_registry_storage_kind=glusterfs
...
[glusterfs]
svlnxocpa11 glusterfs_ip=172.30.238.124 glusterfs_devices='[ "/dev/sdc"]'
svlnxocpa12 glusterfs_ip=172.30.238.125 glusterfs_devices='[ "/dev/sdc"]'
svlnxocpa13 glusterfs_ip=172.30.238.126 glusterfs_devices='[ "/dev/sdc"]'

[glusterfs_registry]
svlnxocpi11 glusterfs_ip=172.30.238.121 glusterfs_devices='[ "/dev/sdc"]'
svlnxocpi12 glusterfs_ip=172.30.238.122 glusterfs_devices='[ "/dev/sdc"]'
svlnxocpi13 glusterfs_ip=172.30.238.123 glusterfs_devices='[ "/dev/sdc"]'

Steps to Reproduce:
1. Install OpenShift with an Ansible inventory containing the GlusterFS configuration above.

Actual results:
TASK [openshift_storage_glusterfs : Load heketi topology] *****************************************
changed: [svlnxocpm11]

TASK [openshift_storage_glusterfs : Create heketi DB volume] **************************************
fatal: [svlnxocpm11]: FAILED! => {
    "changed": true,
    "cmd": [
        "oc",
        "rsh",
        "--namespace=glusterfs",
        "deploy-heketi-storage-1-6r1mb",
        "heketi-cli",
        "-s",
        "http://localhost:8080",
        "--user",
        "admin",
        "--secret",
        "kEccFnOh+tkyM/zKcDSNynXleSUqycmZsX2tcQ4uNnM=",
        "setup-openshift-heketi-storage",
        "--listfile",
        "/tmp/heketi-storage.json"
    ],
    "delta": "0:00:10.669560",
    "end": "2017-09-12 12:04:19.411283",
    "failed": true,
    "rc": 255,
    "start": "2017-09-12 12:04:08.741723"
}

STDERR:

Error: Unable to execute command on glusterfs-storage-lbwrr:   /usr/sbin/modprobe failed: 1
  thin: Required device-mapper target(s) not detected in your kernel.
  Run `lvcreate --help' for more information.
command terminated with exit code 255

Expected results:
GlusterFS would be installed and the registry installed using CNS as persistent storage.

Additional info:
Attached file contents were created on masters[0] in /tmp/openshift-glusterfs-ansible-hUg7cw
The /tmp/heketi-storage.json file was not present on the deployer container, on masters[0], or on the bastion host from which the installation was executed.
Installed OCP without CNS and it works fine.
Tested running cns-deploy manually and got the same problem. Fixed it by running modprobe dm_thin_pool on all nodes running glusterfs pods. After that, CNS deployed and works OK.
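For anyone hitting the same error, the workaround can be sketched roughly as below. This is only a sketch under assumptions: the helper names load_modules and persist_modules and the file name glusterfs.conf are hypothetical, and the boot-time persistence through /etc/modules-load.d is an addition that was not part of the workaround reported here (only the manual modprobe was).

```shell
#!/bin/sh
# Sketch only: load_modules and persist_modules are hypothetical helper names.

# Load each named kernel module immediately. Requires root.
load_modules() {
    for mod in "$@"; do
        modprobe "$mod" || return 1
    done
}

# Write one module name per line to a modules-load.d style file so that
# systemd-modules-load loads the modules again on the next boot.
persist_modules() {
    conf_path="$1"
    shift
    : > "$conf_path" || return 1
    for mod in "$@"; do
        printf '%s\n' "$mod" >> "$conf_path" || return 1
    done
}

# Example (run as root on every node that hosts glusterfs pods):
#   load_modules dm_thin_pool
#   persist_modules /etc/modules-load.d/glusterfs.conf dm_thin_pool
```

The functions are not invoked automatically, so sourcing the file changes nothing until the example lines are run by hand.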
Okay, that's what I anticipated. At this time there is not much we can do to resolve this in cns-deploy. We print an advisory message at the beginning of the deployment stating that all nodes running GlusterFS need to have a certain set of kernel modules loaded: https://github.com/gluster/gluster-kubernetes/blob/master/deploy/gk-deploy#L535-L538

When cns-deploy is deprecated in favor of the openshift-ansible installer, this will be addressed.
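Until the installer handles this, a pre-flight check along the following lines could be run on each GlusterFS node before deploying. It is a sketch, not part of cns-deploy: missing_modules is a hypothetical helper, and the module list in the example (dm_snapshot, dm_mirror, dm_thin_pool) is an assumption based on the advisory linked above, so verify it against the script itself.

```shell
#!/bin/sh
# Sketch only: missing_modules is a hypothetical helper name.

# Print every requested module that is not listed in the given
# /proc/modules-style file. Modules compiled into the kernel do not
# appear in /proc/modules, so they would be reported as missing.
missing_modules() {
    modules_file="$1"
    shift
    for mod in "$@"; do
        # Each /proc/modules line starts with the module name and a space.
        grep -q "^${mod} " "$modules_file" || printf '%s\n' "$mod"
    done
}

# Example (module list is an assumption, see above):
#   missing_modules /proc/modules dm_snapshot dm_mirror dm_thin_pool
```

An empty result means all listed modules are currently loaded; any name printed would need a modprobe on that node.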
*** Bug 1493705 has been marked as a duplicate of this bug. ***
https://github.com/openshift/openshift-ansible/pull/5720
Verified with version openshift-ansible-3.7.0-0.189.0.git.0.d497c5e.el7; CNS deployment succeeds when Docker storage is overlay2.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188