Bug 1656897 - Unable to deploy OCP 3.11.1 when it uses a multipath device.
Summary: Unable to deploy OCP 3.11.1 when it uses a multipath device.
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: CNS-deployment
Version: ocs-3.11
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Niels de Vos
QA Contact: Prasanth
URL:
Whiteboard:
Depends On: 1651270
Blocks:
 
Reported: 2018-12-06 15:52 UTC by Steve Reichard
Modified: 2018-12-06 20:31 UTC
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-06 20:31:25 UTC
Target Upstream Version:



Description Steve Reichard 2018-12-06 15:52:02 UTC
Description of problem:

I was previously able to deploy OCS 3.11 on this same hardware.
Each of my compute nodes has an iSCSI disk attached that I use for OCS. Multipath is configured on the host, and I use the multipath device for OCS.

My install is intended for CNV, so I am configuring with CRI-O. Because of some known issues, I have been attempting to use 3.11.1. I did modify the glusterfs Ansible template to comment out the /dev mounts, as done in BZ 1651270.
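For reference, the template change mentioned above (per BZ 1651270) comments out the /dev hostPath mount in the glusterfs template. A sketch of the affected fragment (indentation approximate):

```yaml
# glusterfs-template.yml with the /dev bind mount disabled,
# as described in BZ 1651270:
#          - name: glusterfs-dev
#            mountPath: "/dev"
#        - name: glusterfs-dev
#          hostPath:
#            path: "/dev"
```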

However, GlusterFS now fails to use my disks.


Eventually you see that the device is not available inside the pod, even though it is present on the host.


TASK [openshift_storage_glusterfs : Load heketi topology] **************************************************************************
fatal: [ospha1.cloud.lab.eng.bos.redhat.com]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-llhdf4/admin.kubeconfig", "rsh", "--namespace=app-storage", "deploy-heketi-storage-1-g4nft", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "j9FIKZz4KWaDRqNdfAwoSL/r07QP3IVeAm7IwrSYx7c=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-llhdf4/topology.json", "2>&1"], "delta": "0:00:09.160740", "end": "2018-12-06 02:03:13.730799", "failed_when_result": true, "rc": 0, "start": "2018-12-06 02:03:04.570059", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: 44d0599b38ea79c4bde9003eab3ca066\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node ospha2.cloud.lab.eng.bos.redhat.com ... ID: 4ce02c81f0ae0eee2ffd303960464157\n\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.\n\tCreating node ospha3.cloud.lab.eng.bos.redhat.com ... ID: eb2df0c3b91a6eeb9279b60966f6de4a\n\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.\n\tCreating node ospha4.cloud.lab.eng.bos.redhat.com ... ID: d436d9a24b9ca56ab85f652079a3f28e\n\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.\n\tCreating node ospha5.cloud.lab.eng.bos.redhat.com ... ID: d5d38710b051f6757f34fc497965fc81\n\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.", "stdout_lines": ["Creating cluster ... ID: 44d0599b38ea79c4bde9003eab3ca066", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node ospha2.cloud.lab.eng.bos.redhat.com ... ID: 4ce02c81f0ae0eee2ffd303960464157", "\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.", "\tCreating node ospha3.cloud.lab.eng.bos.redhat.com ... 
ID: eb2df0c3b91a6eeb9279b60966f6de4a", "\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.", "\tCreating node ospha4.cloud.lab.eng.bos.redhat.com ... ID: d436d9a24b9ca56ab85f652079a3f28e", "\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found.", "\tCreating node ospha5.cloud.lab.eng.bos.redhat.com ... ID: d5d38710b051f6757f34fc497965fc81", "\t\tAdding device /dev/mapper/mpatha ... Unable to add device: Device /dev/mapper/mpatha not found."]}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

PLAY RECAP ***********


Here is my inventory file:

[ansible@spr ospha]$ cat ocp311
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd
glusterfs
# lb


# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root

# If ansible_ssh_user is not root, ansible_become must be set to true
#ansible_become=true

openshift_schedulable=true
openshift_clock_enabled=true

deployment_type=openshift-enterprise
debug_level=4

openshift_disable_check=docker_image_availability

dynamic_volumes_check=False


oreg_auth_user=se-sreichar
oreg_auth_password=<redacted>

openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_users={'spr': '$apr1$nCcl7wT.$BpxEIiV2W4uhc2dntFiRl0', 'admin': '$apr1$nCcl7wT.$BpxEIiV2W4uhc2dntFiRl0', 'developer': '$apr1$nCcl7wT.$BpxEIiV2W4uhc2dntFiRl0'}

openshift_master_cluster_method=native
openshift_master_default_subdomain=ospha.cloud.lab.eng.bos.redhat.com
openshift_master_cluster_hostname=ospha1.cloud.lab.eng.bos.redhat.com
openshift_master_cluster_public_hostname=ospha1.cloud.lab.eng.bos.redhat.com

openshift_use_openshift_sdn=true

ansible_service_broker_dev_broker=true

openshift_use_crio=true
openshift_crio_use_rpm=true
openshift_service_catalog_retries=150

openshift_console_install=true




#version="v3.10.8-1"

## internal
#openshift_web_console_image="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/cnv11-tech-preview/origin-web-console:cnv-1.1-rhel-7-candidate-46941-20180625103248"
#oreg_url="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-${component}:${version}"
#openshift_examples_modify_imagestreams=true
#system_images_registry="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"
openshift_docker_additional_registries="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"
openshift_docker_insecure_registries="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"
#ansible_service_broker_image_prefix="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-"
#openshift_service_catalog_image_prefix="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-"
#template_service_broker_prefix="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-"
#openshift_cockpit_deployer_prefix="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/"


#openshift_storage_glusterfs_image=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7
#openshift_storage_glusterfs_version=3.3.1-17
#openshift_storage_glusterfs_heketi_image=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-rhel7
#openshift_storage_glusterfs_heketi_version=3.3.1-15


openshift_storage_glusterfs_heketi_image=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-volmanager-rhel7:3.11.1-1
openshift_storage_glusterfs_block_image=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-gluster-block-prov-rhel7:3.11.1-1
openshift_storage_glusterfs_image=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-server-rhel7:3.11.1-1
 
#os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

#openshift_master_admission_plugin_config='{"PersistentVolumeClaimResize":{"configuration":{"apiVersion":"v1", "kind":"DefaultAdmissionConfig", "disable":false}}}'

openshift_master_admission_plugin_config={"ValidatingAdmissionWebhook":{"configuration":{"kind": "DefaultAdmissionConfig","apiVersion": "v1","disable": false}},"MutatingAdmissionWebhook":{"configuration":{"kind": "DefaultAdmissionConfig","apiVersion": "v1","disable": false}}}

#openshift_web_console_image="registry.access.redhat.com/cnv12-tech-preview/origin-web-console-server:1.2"

## registry
openshift_hosted_registry_storage_kind=glusterfs 
openshift_hosted_registry_storage_volume_size=15Gi  
openshift_hosted_registry_routehost="registry.ospha.cloud.lab.eng.boss.redhat.com"
openshift_storage_glusterfs_timeout=900


openshift_master_dynamic_provisioning_enabled=True


# CNS storage for applications
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_block_deploy=false    
openshift_storage_glusterfs_storageclass=true
openshift_storageclass_default=false
openshift_storage_glusterfs_storageclass_default=True

# CNS storage for OpenShift infrastructure
openshift_storage_glusterfs_registry_namespace=infra-storage  
openshift_storage_glusterfs_registry_storageclass=false       
openshift_storage_glusterfs_registry_block_deploy=true   
openshift_storage_glusterfs_registry_block_host_vol_create=true    
openshift_storage_glusterfs_registry_block_host_vol_size=200   
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=true

## metric
openshift_metrics_install_metrics=true
openshift_metrics_cassanda_pvc_storage_class_name=glusterfs-storage
openshift_metrics_storage_access_modes=['ReadWriteOnce']
#openshift_metrics_storage_volume_name=metrics
#openshift_metrics_storage_volume_size=10Gi
openshift_metrics_storage_kind=dynamic


openshift_cluster_monitoring_operator_install=true

openshift_prometheus_storage_type=pvc
openshift_prometheus_alertmanager_storage_type=pvc
openshift_prometheus_alertbuffer_storage_type=pvc


# logging
#openshift_logging_es_nodeselector='node-role.kubernetes.io/infra=true'
#openshift_logging_kibana_nodeselector='node-role.kubernetes.io/infra=true'
#openshift_logging_curator_nodeselector='node-role.kubernetes.io/infra=true'
openshift_logging_install_logging=true
openshift_logging_storage_kind=dynamic
openshift_logging_es_pvc_dynamic=true
#openshift_logging_es_pvc_size=10Gi
openshift_logging_es_pvc_storage_class_name="glusterfs-registry-block"
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_cluster_size=1

## asb
#ansible_service_broker_registry_organization=ansibleplaybookbundle
ansible_service_broker_registry_whitelist=['.*-apb$']
#ansible_service_broker_local_registry_whitelist=['.*-apb$']
openshift_hosted_etcd_storage_kind=dynamic
openshift_hosted_etcd_storage_volume_name=etcd-vol
openshift_hosted_etcd_storage_access_modes=["ReadWriteOnce"]
openshift_hosted_etcd_storage_volume_size=1G
openshift_hosted_etcd_storage_labels={'storage'='etcd'}

# [lb]
# ospha-inst.cloud.lab.eng.bos.redhat.com


# host group for masters
[masters]
ospha1.cloud.lab.eng.bos.redhat.com openshift_node_group_name="node-config-master-infra"
# ospha1.cloud.lab.eng.bos.redhat.com

# host group for etcd
[etcd]
ospha1.cloud.lab.eng.bos.redhat.com

[glusterfs]
ospha2.cloud.lab.eng.bos.redhat.com glusterfs_devices='[ "/dev/mapper/mpatha" ]'
ospha3.cloud.lab.eng.bos.redhat.com glusterfs_devices='[ "/dev/mapper/mpatha" ]'
ospha4.cloud.lab.eng.bos.redhat.com glusterfs_devices='[ "/dev/mapper/mpatha" ]'
ospha5.cloud.lab.eng.bos.redhat.com glusterfs_devices='[ "/dev/mapper/mpatha" ]'


# host group for nodes, includes region info
[nodes]
ospha1.cloud.lab.eng.bos.redhat.com 
ospha2.cloud.lab.eng.bos.redhat.com  openshift_node_group_name="node-config-compute"
ospha3.cloud.lab.eng.bos.redhat.com  openshift_node_group_name="node-config-compute"
ospha4.cloud.lab.eng.bos.redhat.com  openshift_node_group_name="node-config-compute"
ospha5.cloud.lab.eng.bos.redhat.com  openshift_node_group_name="node-config-compute"

# ospha2.cloud.lab.eng.bos.redhat.com  openshift_node_labels="{'region': 'infra', 'zone': 'default', 'node-role.kubernetes.io/compute': 'true'}"
# ospha3.cloud.lab.eng.bos.redhat.com  openshift_node_labels="{'region': 'infra', 'zone': 'default', 'node-role.kubernetes.io/compute': 'true'}"
# ospha4.cloud.lab.eng.bos.redhat.com  openshift_node_labels="{'region': 'infra', 'zone': 'default', 'node-role.kubernetes.io/compute': 'true'}"
# ospha5.cloud.lab.eng.bos.redhat.com  openshift_node_labels="{'region': 'infra', 'zone': 'default', 'node-role.kubernetes.io/compute': 'true'}"
[ansible@spr ospha]$
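For context, the failing "Load heketi topology" task builds a topology.json from the [glusterfs] group above and passes it to heketi-cli inside the deploy-heketi pod. For one of these hosts the generated entry looks roughly like the following (a sketch of heketi's topology format, not the actual file from this run; the storage address may be an IP in practice):

```json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["ospha2.cloud.lab.eng.bos.redhat.com"],
              "storage": ["ospha2.cloud.lab.eng.bos.redhat.com"]
            },
            "zone": 1
          },
          "devices": ["/dev/mapper/mpatha"]
        }
      ]
    }
  ]
}
```

Each "Adding device /dev/mapper/mpatha ... not found" line in the task output corresponds to heketi probing one of these device paths from inside the pod.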


[root@ospha1 files]# oc project app-storage
Now using project "app-storage" on server "https://ospha1.cloud.lab.eng.bos.redhat.com:8443".
[root@ospha1 files]# oc get pods
NAME                            READY     STATUS    RESTARTS   AGE
deploy-heketi-storage-1-g4nft   1/1       Running   0          12h
glusterfs-storage-92b9p         1/1       Running   0          12h
glusterfs-storage-9859f         1/1       Running   0          12h
glusterfs-storage-pm4w8         1/1       Running   0          12h
glusterfs-storage-rbvzt         1/1       Running   0          12h
[root@ospha1 files]#



[root@ospha1 files]# oc describe pod glusterfs-storage-9859f
Name:               glusterfs-storage-9859f
Namespace:          app-storage
Priority:           0
PriorityClassName:  <none>
Node:               ospha5.cloud.lab.eng.bos.redhat.com/10.19.139.35
Start Time:         Thu, 06 Dec 2018 01:59:14 +0000
Labels:             controller-revision-hash=2003017835
                    glusterfs=storage-pod
                    glusterfs-node=pod
                    pod-template-generation=1
Annotations:        openshift.io/scc=privileged
Status:             Running
IP:                 10.19.139.35
Controlled By:      DaemonSet/glusterfs-storage
Containers:
  glusterfs:
    Container ID:   cri-o://ba782c9d66b2d882b28b14dda9c65dc735d3d980160f907467a3d6ba226ca3db
    Image:          brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-server-rhel7:3.11.1-1
    Image ID:       brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-server-rhel7@sha256:6320642cf2f00171878b934f0ded696d070d1c1b3adb6bb476b971c7e645b559
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 06 Dec 2018 02:00:06 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   100Mi
    Liveness:   exec [/bin/bash -c if command -v /usr/local/bin/status-probe.sh; then /usr/local/bin/status-probe.sh liveness; else systemctl status glusterd.service; fi] delay=40s timeout=3s period=25s #success=1 #failure=50
    Readiness:  exec [/bin/bash -c if command -v /usr/local/bin/status-probe.sh; then /usr/local/bin/status-probe.sh readiness; else systemctl status glusterd.service; fi] delay=40s timeout=3s period=25s #success=1 #failure=50
    Environment:
      GLUSTER_BLOCKD_STATUS_PROBE_ENABLE:  1
      GB_GLFS_LRU_COUNT:                   15
      TCMU_LOGDIR:                         /var/log/glusterfs/gluster-block
      GB_LOGDIR:                           /var/log/glusterfs/gluster-block
    Mounts:
      /etc/glusterfs from glusterfs-etc (rw)
      /etc/ssl from glusterfs-ssl (ro)
      /etc/target from glusterfs-target (rw)
      /run from glusterfs-run (rw)
      /run/lvm from glusterfs-lvm (rw)
      /sys/fs/cgroup from glusterfs-cgroup (ro)
      /usr/lib/modules from kernel-modules (ro)
      /var/lib/glusterd from glusterfs-config (rw)
      /var/lib/heketi from glusterfs-heketi (rw)
      /var/lib/misc/glusterfsd from glusterfs-misc (rw)
      /var/log/glusterfs from glusterfs-logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qjxwg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  glusterfs-heketi:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/heketi
    HostPathType:  
  glusterfs-run:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  glusterfs-lvm:
    Type:          HostPath (bare host directory volume)
    Path:          /run/lvm
    HostPathType:  
  glusterfs-etc:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/glusterfs
    HostPathType:  
  glusterfs-logs:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/glusterfs
    HostPathType:  
  glusterfs-config:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/glusterd
    HostPathType:  
  glusterfs-misc:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/misc/glusterfsd
    HostPathType:  
  glusterfs-cgroup:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/cgroup
    HostPathType:  
  glusterfs-ssl:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl
    HostPathType:  
  kernel-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/lib/modules
    HostPathType:  
  glusterfs-target:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/target
    HostPathType:  
  default-token-qjxwg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qjxwg
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  glusterfs=storage-host
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:          <none>
[root@ospha1 files]# ssh 10.19.139.35 multipath -l
Warning: Permanently added '10.19.139.35' (ECDSA) to the list of known hosts.
mpatha (3690b11c00004e0ff000083335a97b1d6) dm-2 DELL    ,MD36xxi         
size=250G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 7:0:0:0  sdd 8:48 active undef running
| `- 8:0:0:0  sdb 8:16 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 10:0:0:0 sde 8:64 active undef running
  `- 9:0:0:0  sdc 8:32 active undef running
[root@ospha1 files]# ssh 10.19.139.35 ls -l /dev/mapper/mpatha
Warning: Permanently added '10.19.139.35' (ECDSA) to the list of known hosts.
lrwxrwxrwx. 1 root root 7 Dec  6 01:03 /dev/mapper/mpatha -> ../dm-2
[root@ospha1 files]# oc exec -it glusterfs-storage-9859f /bin/bash
[root@ospha5 /]# ls -l /dev/mapper/mpatha
ls: cannot access /dev/mapper/mpatha: No such file or directory
[root@ospha5 /]# 
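A plausible explanation (my assumption, consistent with the template change described in the bug description): with the /dev hostPath mount commented out, the container is left with a runtime-managed /dev, so device-mapper nodes created by multipathd on the host never appear inside the pod. The mount whose removal has this effect is roughly:

```yaml
# Sketch of the (now commented-out) glusterfs DaemonSet fragment; without
# this bind mount, host device nodes such as /dev/mapper/mpatha are
# invisible inside the container.
volumeMounts:
- name: glusterfs-dev
  mountPath: "/dev"
volumes:
- name: glusterfs-dev
  hostPath:
    path: "/dev"
```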

  


Version-Release number of selected component (if applicable):


How reproducible:
Every one of the several times I have attempted it.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Niels de Vos 2018-12-06 16:09:38 UTC
This looks very much like bug 1651270 (which you referenced in comment #0).

Multipath devices under /dev/mapper/ should work with upstream openshift-ansible-3.11.52-1. Which version of openshift-ansible was used for deploying?

Comment 3 Steve Reichard 2018-12-06 16:22:21 UTC

I used the currently shipping downstream version, but also made the changes that were previously suggested.

[root@ospha1 files]# yum list openshift-ansible
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Installed Packages
openshift-ansible.noarch                          3.11.43-1.git.0.fa69a02.el7                           @rhel-7-server-ose-3.11-rpms
[root@ospha1 files]# grep '^#' glusterfs-template.yml 
#          - name: glusterfs-dev
#            mountPath: "/dev"
#        - name: glusterfs-dev
#          hostPath:
#            path: "/dev"
[root@ospha1 files]# 

If you can point me to the upstream openshift-ansible RPM, I will give it a try.

Comment 5 Steve Reichard 2018-12-06 20:31:25 UTC
Using the referenced openshift-ansible version, I successfully made it past the previous error.

Closing.

