Description of problem:
GCP cluster on 3.9 is getting an error trying to mount a PVC.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Warning  FailedMount  1h  kubelet, ci-chancez-chargeback-openshift-ig-n-zsn5
  MountVolume.MountDevice failed for volume "pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005" : failed to mount the volume as "ext4", it already contains mpath_member. Mount error: mount failed: exit status 32
  Mounting command: systemd-run
  Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 --scope -- mount -t ext4 -o defaults /dev/disk/by-id/google-kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005
  Output: Running scope as unit run-50422.scope.
  mount: /dev/sdb is already mounted or /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 busy

Warning  FailedMount  1h  kubelet, ci-chancez-chargeback-openshift-ig-n-zsn5
  MountVolume.MountDevice failed for volume "pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005" : failed to mount the volume as "ext4", it already contains mpath_member. Mount error: mount failed: exit status 32
  Mounting command: systemd-run
  Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 --scope -- mount -t ext4 -o defaults /dev/disk/by-id/google-kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005
  Output: Running scope as unit run-50436.scope.
  mount: /dev/sdb is already mounted or /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 busy

Warning  FailedMount  1h  kubelet, ci-chancez-chargeback-openshift-ig-n-zsn5
  MountVolume.MountDevice failed for volume "pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005" : failed to mount the volume as "ext4", it already contains mpath_member. Mount error: mount failed: exit status 32

---
Server https://internal-api.openshift.XXXXX.team.coreos.systems:8443
openshift v3.9.0-alpha.4+1f02cb5-492
kubernetes v1.9.1+a0ce1bc657
---

[chance@ci-chancez-chargeback-openshift-ig-m-0nl0 ~]$ uname -a
Linux ci-chancez-chargeback-openshift-ig-m-0nl0 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

[chance@ci-chancez-chargeback-openshift-ig-m-0nl0 ~]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
To do this I used https://github.com/openshift/release from commit 00731fe9d6a3e970aa1dc727041de471744d28b8, using the cluster/test-deploy Makefile/instructions for deployment. The following link is a diff of my vars-origin.yaml from the original: https://gist.github.com/chancez/1c0f28eb05d8f4ab4e66e9c261e3329a. Besides that, I've run Ansible a few times to make a couple of changes to the auth settings (adding GitHub auth), but my auth settings weren't working so I reverted those. The only major thing I can think of is re-running Ansible a few times to make changes, and then again to undo those changes.
"it already contains mpath_member" is odd, that device was somehow managed by multipathd. Is the instance still available? I've never seen a multipath pd.
Indeed, the device is managed by multipath:

[root@ci-chancez-chargeback-openshift-ig-n-zsn5 ~]# multipath -ll
0Google_PersistentDisk_kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e dm-0 Google  ,PersistentDisk
size=5.0G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 0:0:2:0 sdb 8:16 active ready running

# dmsetup ls --tree
0Google_PersistentDisk_kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 (253:0)
 └─ (8:16)

# ls -l /dev/disk/by-id/google-kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005
lrwxrwxrwx. 1 root root 9 Feb 28 20:31 /dev/disk/by-id/google-kubernetes-dynamic-pvc-4dc8bbab-1cc6-11e8-be9c-42010a8e0005 -> ../../sdb
Temporarily disabled GCE PD in multipath and verified the disk was no longer managed by multipathd.

Steps:

1. Blacklist 0Google_PersistentDisk:

# cat /etc/multipath.conf
# LIO iSCSI
# TODO: Add env variables for tweaking
devices {
        device {
                vendor "LIO-ORG"
                user_friendly_names "yes"
                path_grouping_policy "failover"
                path_selector "round-robin 0"
                failback immediate
                path_checker "tur"
                prio "const"
                no_path_retry 120
                rr_weight "uniform"
        }
}
blacklist {
        wwid 0Google_PersistentDisk
}
defaults {
}

2. systemctl restart multipathd

3. Verify:

# dmsetup ls --tree
No devices found

4. Format the disk:

# mkfs -t ext4 /dev/sdb
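A quick way to confirm the path device was actually released is to check the udev-reported filesystem type on the leg (a minimal sketch using standard udev/util-linux tools; /dev/sdb is the device from the output above):

# udevadm info --query=property --name=/dev/sdb | grep ID_FS_TYPE   # should no longer print mpath_member
# lsblk -o NAME,FSTYPE /dev/sdb                                     # FSTYPE column should show ext4 after the mkfs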
I have seen the same issue on one of our vSphere clusters too, on 3.9.
Also hit this once on Azure, but after I set up a new OCP cluster I was unable to reproduce it...
Hit the issue: hawkular-cassandra failed to start because it failed to mount the PV.

Version-Release number of selected component (if applicable):
openshift v3.9.1
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
OS version: Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)
kernel: 3.10.0-855.el7.x86_64

Steps:
1. Deploy hawkular metrics on GCP, then check the status:

[root@qe-dma-master-etcd-1 test]# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                   STORAGECLASS   REASON    AGE
pvc-2c2e2208-1de2-11e8-833e-42010af0001e   10Gi       RWO            Delete           Bound     openshift-infra/metrics-cassandra-1     standard                 7m
pvc-b1a3e901-1dc3-11e8-8e76-42010af0001e   1Gi        RWO            Delete           Bound     openshift-ansible-service-broker/etcd   standard                 3h

[root@qe-dma-master-etcd-1 test]# oc get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
metrics-cassandra-1   Bound     pvc-2c2e2208-1de2-11e8-833e-42010af0001e   10Gi       RWO            standard       7m

[root@qe-dma-master-etcd-1 test]# oc get po
NAME                         READY     STATUS              RESTARTS   AGE
hawkular-cassandra-1-k5tb6   0/1       ContainerCreating   0          6m
hawkular-metrics-r8rgr       0/1       Running             9          3h
heapster-lbtgx               0/1       Running             7          3h

[root@qe-dma-master-etcd-1 test]# oc describe po hawkular-cassandra-1-k5tb6
Name:           hawkular-cassandra-1-k5tb6
Namespace:      openshift-infra
Node:           qe-dma-node-registry-router-1/10.240.0.31
Start Time:     Fri, 02 Mar 2018 01:23:55 -0500
Labels:         metrics-infra=hawkular-cassandra
                name=hawkular-cassandra-1
                type=hawkular-cassandra
Annotations:    openshift.io/scc=restricted
Status:         Pending
IP:
Controlled By:  ReplicationController/hawkular-cassandra-1
Containers:
  hawkular-cassandra-1:
    Container ID:
    Image:       registry.reg-aws.openshift.com:443/openshift3/metrics-cassandra:v3.9
    Image ID:
    Ports:       9042/TCP, 9160/TCP, 7000/TCP, 7001/TCP
    Command:
      /opt/apache-cassandra/bin/cassandra-docker.sh
      --cluster_name=hawkular-metrics
      --data_volume=/cassandra_data
      --internode_encryption=all
      --require_node_auth=true
      --enable_client_encryption=true
      --require_client_auth=true
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  2G
    Requests:
      memory:  1G
    Readiness:  exec [/opt/apache-cassandra/bin/cassandra-docker-ready.sh] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      CASSANDRA_MASTER:               true
      CASSANDRA_DATA_VOLUME:          /cassandra_data
      JVM_OPTS:                       -Dcassandra.commitlog.ignorereplayerrors=true
      ENABLE_PROMETHEUS_ENDPOINT:     True
      TRUSTSTORE_NODES_AUTHORITIES:   /hawkular-cassandra-certs/tls.peer.truststore.crt
      TRUSTSTORE_CLIENT_AUTHORITIES:  /hawkular-cassandra-certs/tls.client.truststore.crt
      POD_NAMESPACE:                  openshift-infra (v1:metadata.namespace)
      MEMORY_LIMIT:                   2000000000 (limits.memory)
      CPU_LIMIT:                      node allocatable (limits.cpu)
    Mounts:
      /cassandra_data from cassandra-data (rw)
      /hawkular-cassandra-certs from hawkular-cassandra-certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cassandra-token-tsg9f (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  cassandra-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  metrics-cassandra-1
    ReadOnly:   false
  hawkular-cassandra-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-cassandra-certs
    Optional:    false
  cassandra-token-tsg9f:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cassandra-token-tsg9f
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  6m    default-scheduler  Successfully assigned
hawkular-cassandra-1-k5tb6 to qe-dma-node-registry-router-1
  Normal   SuccessfulMountVolume  6m                 kubelet, qe-dma-node-registry-router-1  MountVolume.SetUp succeeded for volume "cassandra-token-tsg9f"
  Normal   SuccessfulMountVolume  6m                 kubelet, qe-dma-node-registry-router-1  MountVolume.SetUp succeeded for volume "hawkular-cassandra-certs"
  Warning  FailedMount            28s (x11 over 6m)  kubelet, qe-dma-node-registry-router-1  MountVolume.MountDevice failed for volume "pvc-2c2e2208-1de2-11e8-833e-42010af0001e" : failed to mount the volume as "ext4", it already contains mpath_member. Mount error: exit status 32
  Warning  FailedMount            23s (x3 over 4m)   kubelet, qe-dma-node-registry-router-1  Unable to mount volumes for pod "hawkular-cassandra-1-k5tb6_openshift-infra(489e1083-1de2-11e8-833e-42010af0001e)": timeout expired waiting for volumes to attach/mount for pod "openshift-infra"/"hawkular-cassandra-1-k5tb6". list of unattached/unmounted volumes=[cassandra-data]
I did not meet this on GCP.

Events:
  Type    Reason                 Age   From                                  Message
  ----    ------                 ----  ----                                  -------
  Normal  Scheduled              16m   default-scheduler                     Successfully assigned asb-etcd-1-hshdc to qe-wmeng391ah-master-etcd-1
  Normal  SuccessfulMountVolume  16m   kubelet, qe-wmeng391ah-master-etcd-1  MountVolume.SetUp succeeded for volume "asb-token-jts4r"
  Normal  SuccessfulMountVolume  16m   kubelet, qe-wmeng391ah-master-etcd-1  MountVolume.SetUp succeeded for volume "etcd-tls"
  Normal  SuccessfulMountVolume  16m   kubelet, qe-wmeng391ah-master-etcd-1  MountVolume.SetUp succeeded for volume "etcd-auth"
  Normal  SuccessfulMountVolume  16m   kubelet, qe-wmeng391ah-master-etcd-1  MountVolume.SetUp succeeded for volume "pvc-ea9e92d8-1de6-11e8-a434-42010af00023"
  Normal  Pulled                 16m   kubelet, qe-wmeng391ah-master-etcd-1  Container image "registry.access.redhat.com/rhel7/etcd:latest" already present on machine
  Normal  Created                16m   kubelet, qe-wmeng391ah-master-etcd-1  Created container
  Normal  Started                16m   kubelet, qe-wmeng391ah-master-etcd-1  Started container

openshift v3.9.1
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
Kernel Version: 3.10.0-855.el7.x86_64
Operating System: Red Hat Enterprise Linux Atomic Host 7.5.0
docker-1.13.1-55.rhel75.git774336d.el7.x86_64
Server Version: 1.13.1
Storage Driver: overlay2
If Huamin is correct (and it looks like he is), then each time "mpath_member" appears in the events, check the multipathd log as well. On the affected machine I can see this:

Feb 27 21:13:55 ci-chancez-chargeback-openshift-build-image-instance multipathd[288]: sda: spurious uevent, path already in pathvec
Feb 27 21:13:55 ci-chancez-chargeback-openshift-build-image-instance multipathd[288]: 0Google_PersistentDisk_persistent-disk-0: failed in domap for addition of new path sda
Feb 27 21:13:55 ci-chancez-chargeback-openshift-build-image-instance multipathd[288]: uevent trigger error

It would also mean this is not something we can fix in OpenShift (see Huamin's comment #4) -- multipathd has to be configured to ignore the GCE PD disks. Disabling multipathd altogether on machines where it's not needed should work too.

Also note: to reproduce this, multipathd must be installed and running on the system (which is not the case on Atomic Host, AFAIK). I will try to create a pod with several disks in GCE and check their WWIDs -- if there is a collision, we have the cause.
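For anyone else chasing this, a minimal sketch of how to pull those multipathd messages on a RHEL/CentOS 7 node (assuming multipathd runs as a systemd unit, as it does there):

# journalctl -u multipathd --since "1 hour ago"   # systemd journal for the multipathd unit
# grep multipathd /var/log/messages               # the same messages via syslog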
I was wrong: the workarounds I thought would work don't seem to help. Mount complains about mpath_member... This is the udev ID_FS_TYPE attribute being set by udev on the multipath "legs", and mount refuses to mount those (since it is the dm device that should be mounted instead). There might be a udev rule causing this attribute to be set for the GCE PD disks.
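To inspect the attribute in question on a suspect device (a sketch using standard udev/util-linux tools; substitute the affected /dev/sdX):

# udevadm info --query=property --name=/dev/sdb | grep ID_FS_TYPE   # prints ID_FS_TYPE=mpath_member on an affected leg
# lsblk -o NAME,FSTYPE /dev/sdb                                     # lsblk surfaces the same property in its FSTYPE column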
I've created a VM in GCE and "manually" attached a GCE PD (again, created in the console):

[root@tsmetana-mp-master-etcd-1 ~]# multipath -ll
0Google_PersistentDisk_multipath-bug-test-1 dm-16 Google  ,PersistentDisk
size=10G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 0:0:3:0 sdc 8:32 active ready running

[root@tsmetana-mp-master-etcd-1 ~]# udevadm info --name=/dev/sdc
P: /devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:3/0:0:3:0/block/sdc
N: sdc
S: disk/by-id/scsi-0Google_PersistentDisk_multipath-bug-test-1
S: disk/by-path/virtio-pci-0000:00:03.0-scsi-0:0:3:0
E: DEVLINKS=/dev/disk/by-id/scsi-0Google_PersistentDisk_multipath-bug-test-1 /dev/disk/by-path/virtio-pci-0000:00:03.0-scsi-0:0:3:0
E: DEVNAME=/dev/sdc
E: DEVPATH=/devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:3/0:0:3:0/block/sdc
E: DEVTYPE=disk
E: DM_MULTIPATH_DEVICE_PATH=1
E: DM_MULTIPATH_TIMESTAMP=1520002717
E: DM_MULTIPATH_WIPE_PARTS=1
E: ID_BUS=scsi
E: ID_FS_TYPE=mpath_member
E: ID_MODEL=PersistentDisk
E: ID_MODEL_ENC=PersistentDisk\x20\x20
E: ID_PATH=virtio-pci-0000:00:03.0-scsi-0:0:3:0
E: ID_PATH_TAG=virtio-pci-0000_00_03_0-scsi-0_0_3_0
E: ID_REVISION=1
E: ID_SCSI=1
E: ID_SERIAL=0Google_PersistentDisk_multipath-bug-test-1
E: ID_SERIAL_SHORT=multipath-bug-test-1
E: ID_TYPE=disk
E: ID_VENDOR=Google
E: ID_VENDOR_ENC=Google\x20\x20
E: MAJOR=8
E: MINOR=32
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: SYSTEMD_READY=0
E: TAGS=:systemd:
E: USEC_INITIALIZED=26212172

I think the udev rules are not OK for GCE. Obviously, this disk can't be mounted:

[root@tsmetana-mp-master-etcd-1 ~]# mkdir /mnt/test
[root@tsmetana-mp-master-etcd-1 ~]# mount -t ext4 /dev/sdc /mnt/test
mount: /dev/sdc is already mounted or /mnt/test busy

OpenShift is not involved here.
Assigning to RHEL device-mapper-multipath, based on comment 7.
Where did this multipath.conf come from? If you create a default multipath.conf file by running

# mpathconf --enable

without an already existing multipath.conf file, it automatically sets

find_multipaths yes

in the defaults section. This makes multipath only claim devices when it sees that they have multiple paths, or if it has previously claimed them. If you add that find_multipaths line to /etc/multipath.conf and run

# multipath -w /dev/sdc

(or whatever devname the Google persistent disk has), that should fix your problem.

The real issue here is that a multipath.conf file without either find_multipaths or a manual blacklist will just claim all SCSI devices. Whoever or whatever created that multipath.conf file needs to do one or the other. Like I said, the default multipath setup uses find_multipaths.
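For reference, a minimal sketch of the fixed configuration (this mirrors what mpathconf --enable generates; the LIO device stanza from comment 4 can be kept alongside it):

defaults {
        find_multipaths yes     # only claim a device once multiple paths are actually seen
        user_friendly_names yes
}

Then, on a node where the PD was already claimed, forget the WWID and restart the daemon:

# multipath -w /dev/sdc          # drop the device's WWID from /etc/multipath/wwids
# systemctl restart multipathd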
Thank you Ben, that explains the mystery. The config in question comes from this commit https://github.com/openshift/openshift-ansible/commit/2573825c06e9d3a5601b6c1492f71fd0b70b2578
ansible fix at https://github.com/openshift/openshift-ansible/pull/7367
for 3.9: https://github.com/openshift/openshift-ansible/pull/7368
Tested with the updated multipath.conf and the problem is no longer reproducible. I'll test the openshift-ansible change and use a different cloud as a regression test tomorrow. I can verify this bug now.
Faced this issue on Azure. And it does not matter whether `find_multipaths yes` is set in the config or not; multipath always claims Azure disks as multipath devices. So I need to blacklist them or disable multipath at the system level. Neither workaround is acceptable, since they will not survive a single Ansible run...
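If blacklisting turns out to be the only option on Azure, a hypothetical stanza could match the virtual-disk hardware strings (assumption: the Azure/Hyper-V disks report vendor "Msft" and product "Virtual Disk" -- verify with multipath -ll or udevadm info on the affected node before relying on this):

blacklist {
        device {
                vendor  "Msft"           # assumed Hyper-V/Azure vendor string
                product "Virtual Disk"   # assumed product string
        }
}

To survive Ansible runs this would of course have to live in whatever template openshift-ansible deploys, which is the real problem being pointed at here.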
Alex, can you post more info as in Comments 11 and 14?
3.7 fix is proposed at https://github.com/openshift/openshift-ansible/pull/8152
3.6 fix is proposed at https://github.com/openshift/openshift-ansible/pull/8151
actually 3.6 already has the fix
I have a customer who is facing this issue. multipathd is installed on the customer's nodes, and the OpenShift playbook always enables this service even though they have disabled it. Is this related to this issue? Or should I file another bug for the playbook?
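To check whether the playbook re-enabled the service on a node, the standard systemd queries should suffice (a small sketch):

# systemctl is-enabled multipathd   # prints enabled/disabled -- shows whether it will start at boot
# systemctl is-active multipathd    # prints active/inactive -- shows whether it is running now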
Hello, I am facing the same issue on OCP 3.7 deployed on GCP. I already tried the workaround Ben proposed above but still hit issues.

Log from oc describe pod:

Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007 --scope -- mount -t xfs -o defaults /dev/disk/by-id/google-kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007 /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007
Output: Running scope as unit run-88207.scope.
mount: /dev/sde is already mounted or /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007 busy

$ oc get pods -o wide
NAME                                         READY     STATUS              RESTARTS   AGE       IP             NODE
ssp-kafka-0                                  1/1       Running             0          1h        172.16.5.134   ocp-a1-node-zgzf
ssp-kafka-1                                  1/1       Running             0          1h        172.16.24.13   ocp-a1-node-rn5r
ssp-kafka-2                                  1/1       Running             0          1h        172.16.22.25   ocp-a1-node-dtjv
ssp-kafka-3                                  0/1       ContainerCreating   0          15m       <none>         ocp-a1-node-qngj
ssp-kafka-4                                  1/1       Running             0          1h        172.16.12.65   ocp-a1-node-dd0n
ssp-topic-controller-3558947362-7ljx4        1/1       Running             0          7d        172.16.18.7    ocp-a1-node-qngj
ssp-zookeeper-0                              0/1       ContainerCreating   0          1h        <none>         ocp-a1-node-qngj
ssp-zookeeper-1                              1/1       Running             0          1h        172.16.24.12   ocp-a1-node-rn5r
ssp-zookeeper-2                              1/1       Running             0          1h        172.16.22.24   ocp-a1-node-dtjv
strimzi-cluster-controller-969217113-qdrkw   1/1       Running             0          1h        172.16.10.23   ocp-a1-node-l6dt

[root@ocp-a1-node-qngj ~]# df -h | grep sde
[root@ocp-a1-node-qngj ~]# lsblk
NAME                                                                                  MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
sda                                                                                     8:0    0  25G  0 disk
└─sda1                                                                                  8:1    0  25G  0 part  /
sdb                                                                                     8:16   0  25G  0 disk  /var/lib/docker
sdc                                                                                     8:32   0  50G  0 disk  /var/lib/origin/openshift.local.volumes
sdd                                                                                     8:48   0   1G  0 disk
└─0Google_PersistentDisk_kubernetes-dynamic-pvc-3ef72dcd-449d-11e8-97d2-42010a84000a  253:0    0   1G  0 mpath
sde                                                                                     8:64   0  10G  0 disk
└─0Google_PersistentDisk_kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007  253:1    0  10G  0 mpath

Any ideas?

Cheers,
/JM
Have you tried multipath -w /dev/sdd (see comment 14)?
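For completeness, a hedged sketch of the full recovery sequence on an affected node, combining comment 14 with the earlier findings (map and device names are taken from the lsblk output above; adjust to match your node):

# 1. Ensure find_multipaths yes (or a blacklist) is present in /etc/multipath.conf.
# 2. Flush the maps that already claimed the PDs:
multipath -f 0Google_PersistentDisk_kubernetes-dynamic-pvc-3ef72dcd-449d-11e8-97d2-42010a84000a
multipath -f 0Google_PersistentDisk_kubernetes-dynamic-pvc-099654ed-5da2-11e8-9e4c-42010a840007
# 3. Forget the WWIDs so the paths are not re-claimed:
multipath -w /dev/sdd
multipath -w /dev/sde
# 4. Restart the daemon and verify the maps are gone:
systemctl restart multipathd
dmsetup ls --tree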