Bug 1594201 - Pod that uses a local block volume cannot run successfully
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Matthew Wong
QA Contact: Qin Ping
Keywords: UpcomingRelease
Depends On:
Blocks:
Reported: 2018-06-22 07:26 EDT by Qin Ping
Modified: 2018-10-11 03:21 EDT (History)

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 03:20:54 EDT
Type: Bug



External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 None None None 2018-10-11 03:21 EDT

Description Qin Ping 2018-06-22 07:26:28 EDT
Description of problem:
A Pod that uses a local block volume cannot run successfully when the kubelet is containerized.

Version-Release number of selected component (if applicable):
oc v3.10.3
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://qe-piqin-master-etcd-1:8443
openshift v3.10.3
kubernetes v1.10.0+b81c8f8

openshift-external-storage-local-provisioner-0.0.2-3.gitd3c94f0.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Deploy local block provisioner per https://github.com/openshift/openshift-docs/blob/master/install_config/configuring_local.adoc
2. Create a PVC
3. Create a Pod

Actual results:
The Pod is stuck in ContainerCreating status and reports this event:
Warning  FailedMapVolume        2s (x5 over 9s)  kubelet, qe-piqin-node-1  MapVolume.AttachFileDevice failed for volume "local-pv-8b912542" : exit status 1

Expected results:
Pod can run successfully.

Master Log:

Node Log (of failed PODs):
Jun 22 11:23:11 qe-piqin-node-1 atomic-openshift-node[5958]: I0622 11:23:11.360896    5969 volume_path_handler_linux.go:75] Failed device create command for path: /mnt/local-storage/block-devices/vdc exit status 1 losetup: /mnt/local-storage/block-devices/vdc: failed to set up loop device: No such file or directory

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
# ls -lZ /mnt/local-storage/block-devices/
lrwxrwxrwx. root root unconfined_u:object_r:container_file_t:s0 vdb -> /dev/vdb
lrwxrwxrwx. root root unconfined_u:object_r:container_file_t:s0 vdc -> /dev/vdc

# ls -lZ /dev/vdb /dev/vdc
brw-rw----. root disk unconfined_u:object_r:container_file_t:s0 /dev/vdb
brw-rw----. root disk unconfined_u:object_r:container_file_t:s0 /dev/vdc

The Pod runs successfully with a non-containerized kubelet.
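
For anyone who wants to repeat the check from "Additional info" from inside a given mount namespace, here is a minimal sketch. It assumes only that the discovery directory contains symlinks, as in the listing; the function name `describe_entries` is hypothetical and is not part of any OpenShift tooling. An entry whose target cannot be stat'ed is reported as not a block device, which is roughly what a process sees when the link or its target is not visible to it.

```python
import os
import stat


def describe_entries(directory):
    """Return (name, resolved_target, is_block_device) for each symlink in `directory`."""
    results = []
    for name in sorted(os.listdir(directory)):
        entry = os.path.join(directory, name)
        if not os.path.islink(entry):
            continue  # only symlinks are of interest here
        target = os.path.realpath(entry)
        try:
            is_block = stat.S_ISBLK(os.stat(target).st_mode)
        except OSError:
            # Dangling link: the target is not visible from this namespace.
            is_block = False
        results.append((name, target, is_block))
    return results


if __name__ == "__main__":
    # The directory below is the one used in this report; adjust as needed.
    for row in describe_entries("/mnt/local-storage/block-devices"):
        print(row)
```

Running this on the host and again inside the containerized kubelet would show whether both sides see the same links and targets.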
Comment 4 Matthew Wong 2018-06-25 11:41:21 EDT
Thank you!

So if we add something like
        {
            "type": "bind",
            "source": "/mnt/local-storage",
            "destination": "/mnt/local-storage",
            "options": [
                "rbind",
                "rw",
                "mode=755"
            ]
        }
to the kubelet mounts (that is, on the compute node, edit /var/lib/containers/atomic/atomic-openshift-node.0/config.json and then run `systemctl restart atomic-openshift-node`), then losetup succeeds and the pod starts running with the device attached, as expected.

We can add this mount to all kubelet configs upon openshift installation, but the directory is configurable by the user. We do document that the user must use /mnt/local-storage for our instructions to work, but users may have different requirements about where to mount volumes...
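
A minimal sketch of that workaround with the directory as a parameter, since it is user-configurable. It assumes the parsed config.json keeps its mount entries in a top-level "mounts" list; that key name is an assumption, and `add_bind_mount` is a hypothetical helper, not an installer API. The sketch only builds the updated structure and does not write the file or restart the service.

```python
import copy
import json


def add_bind_mount(config, path, mode="755"):
    """Return a copy of `config` with an rbind,rw mount of `path` onto itself."""
    updated = copy.deepcopy(config)
    # Assumption: mount entries live in a top-level "mounts" list.
    mounts = updated.setdefault("mounts", [])
    # Skip the append if a mount with this destination already exists.
    if any(m.get("destination") == path for m in mounts):
        return updated
    mounts.append({
        "type": "bind",
        "source": path,
        "destination": path,
        "options": ["rbind", "rw", "mode=" + mode],
    })
    return updated


if __name__ == "__main__":
    sample = {"mounts": []}  # stand-in for the parsed config.json
    print(json.dumps(add_bind_mount(sample, "/mnt/local-storage"), indent=4))
```

The duplicate check keeps the helper safe to run more than once against the same config.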

I am not sure whether we can get around this by using the /rootfs mount, similar to how we do nsenter mounting when containerized.
Comment 5 Matthew Wong 2018-06-25 11:54:43 EDT
Also, this error would not be limited to local volumes, since any Block PV whose path is inaccessible to the kubelet will hit it.
Comment 6 Matthew Wong 2018-06-25 12:42:22 EDT
Disregard the last comment :). I now think the fix is to evaluate the symlink so that we end up with a /dev device path.
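
To illustrate the idea of evaluating the symlink, here is a minimal sketch in Python. It is not the actual change, which would live in the kubelet's Go code; `resolve_device_path` is a hypothetical name, and the demo uses a temporary regular file as a stand-in for a device node such as /dev/vdc.

```python
import os
import tempfile


def resolve_device_path(path):
    """Follow every symlink in `path` and return the final absolute path."""
    return os.path.realpath(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the device node that the symlink points to.
        device = os.path.join(tmp, "vdc-device")
        open(device, "w").close()
        # Stand-in for /mnt/local-storage/block-devices/vdc.
        link = os.path.join(tmp, "vdc")
        os.symlink(device, link)
        print(resolve_device_path(link))
```

With the path resolved first, a later step would be handed the device path itself rather than a link under a directory that may not be mounted where that step runs.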
Comment 7 Qin Ping 2018-06-25 21:46:24 EDT
Yeah, the other Block volumes (e.g. GCE PD, AWS EBS) do not have this issue.
Comment 8 Matthew Wong 2018-07-03 10:53:18 EDT
PR opened against origin: https://github.com/openshift/origin/pull/20117
Comment 10 Qin Ping 2018-08-31 03:13:55 EDT
Verified in OCP:
oc v3.11.0-0.25.0
openshift v3.11.0-0.25.0
kubernetes v1.11.0+d4cacc0

# uname -a
Linux qe-piqin-master-etcd-1 3.10.0-862.11.6.el7.x86_64 #1 SMP Fri Aug 10 16:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/redhat-release 
Red Hat Enterprise Linux Atomic Host release 7.5
Comment 12 errata-xmlrpc 2018-10-11 03:20:54 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
