Bug 1657003

Summary: Pre-allocated PV not utilized on OSP
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-ansible
Version: 3.10.0
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Type: Bug
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Ben <bkincaid>
Assignee: Tzu-Mainn Chen <tzumainn>
QA Contact: Weihua Meng <wmeng>
CC: bkincaid, gpei, sponnaga, tzumainn, wmeng
Bug Blocks: 1690184, 1690186
Last Closed: 2019-04-11 05:38:23 UTC

Description Ben 2018-12-06 20:03:25 UTC
Description of problem:
When using openshift-ansible 3.11, 3.10, or 3.9 on OSP and passing in a pre-allocated PV, the volume is not used. Instead, the claim binds to a dynamically provisioned PV when a storage class is provided. This can be seen when a persistent volume claim is made for the internal registry.

Version-Release number of the following components:
OSP v12

rpm -q openshift-ansible
release-v3.11, release-v3.10, release-v3.9

rpm -q ansible 
ansible-2.7.2-1.fc28.noarch

ansible --version 
ansible 2.7.2


How reproducible:

Steps to Reproduce:
1. With a pre-defined cinder volume in OpenStack pass the variables as defined in the documentation:

openshift_hosted_registry_storage_kind=openstack
openshift_hosted_registry_storage_access_modes=['ReadWriteOnce']
openshift_hosted_registry_storage_openstack_filesystem=ext4
openshift_hosted_registry_storage_openstack_volumeID=3a650b4f-c8c5-4e0a-8ca5-eaee11f16c57
openshift_hosted_registry_storage_volume_size=10Gi

2. Observe that the pre-allocated volume is not used for the registry's persistent volume claim.

Comment 1 Tzu-Mainn Chen 2018-12-07 14:31:23 UTC
Hi! I think there may be some confusion; the listed parameters are for a Cinder-backed OpenShift registry, not for general persistent volume claims. Separate volumes are created for those: https://github.com/openshift/openshift-ansible/blob/master/playbooks/openstack/configuration.md#cinder-backed-persistent-volumes-configuration

Let me know if there's something else that's not quite working!

Comment 2 Ben 2018-12-11 13:43:23 UTC
Hi, sorry; let me explain more clearly. When we define these variables to attach Cinder storage to the internal registry, it does not use the storage we specified; instead it uses dynamically provisioned storage.

Thanks!

Comment 3 Tzu-Mainn Chen 2018-12-11 14:49:03 UTC
Odd, we've tested this successfully in the past, but maybe there's some combination of parameters we didn't account for. Could you attach your inventory files, as well as the output of 'openstack volume list' and 'openstack server list'?

Comment 4 Ben 2018-12-12 14:17:04 UTC
So this is what we have in our inventory:

openshift_hosted_registry_storage_kind: openstack
openshift_hosted_registry_storage_access_modes:
- ReadWriteOnce
openshift_hosted_registry_storage_openstack_filesystem: "ext4"
openshift_hosted_registry_storage_openstack_volumeID: "e4ae25b4-930a-4ad4-b674-2e360746f647"
openshift_hosted_registry_storage_volume_size: "20Gi"


And the commands give the following:

> openstack volume list
| ID                                   | Name                                                        | Status    | Size | Attached to                                                    |
| e4ae25b4-930a-4ad4-b674-2e360746f647 | pvbugfx.mydomain.internal-registry                          | available |   20 |                                                                |
| 4f1f0ad8-5835-4c78-b1b9-b7a44b28b5c4 | kubernetes-dynamic-pvc-da077dbd-a050-11e8-a561-fa163e2847e3 | in-use    |   20 | Attached to infra-node-0.pvbugfx.mydomain.internal on /dev/vdc |


> oc get pv | grep registry
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM                    STORAGECLASS   REASON   AGE
registry-volume                            20Gi       RWO           Retain          Available                                                    118d
pvc-da077dbd-a050-11e8-a561-fa163e2847e3   20Gi       RWO           Delete          Bound       default/registry-claim   standard                118d


> openstack server list
| ID                                   | Name                                   | Status | Networks                                                                  | Image                  | Flavor    |
| 4b4e7e2d-5a90-4d1e-9256-4714f68adb03 | infra-node-0.pvbugfx.mydomain.internal | ACTIVE | openshift-ansible-pvbugfx.mydomain.internal-net=192.168.40.7, 10.9.53.44  | rhel-server-7.6-x86_64 | m1.xlarge |
| bc88f735-6292-4221-8845-a9d3cb9dc547 | master-0.pvbugfx.mydomain.internal     | ACTIVE | openshift-ansible-pvbugfx.mydomain.internal-net=192.168.40.5, 10.9.53.123 | rhel-server-7.6-x86_64 | m1.xlarge |
| 1c889094-43fc-4f6d-adf2-d4f3afd736ea | app-node-0.pvbugfx.mydomain.internal   | ACTIVE | openshift-ansible-pvbugfx.mydomain.internal-net=192.168.40.8, 10.9.53.140 | rhel-server-7.6-x86_64 | m1.xlarge |

Comment 5 Tzu-Mainn Chen 2018-12-12 14:24:01 UTC
Interesting; could you attach the full all.yml and OSEv3.yml (with any passwords removed)? I want to make sure I have exactly the right parameters before I try to reproduce.

Comment 6 Ben 2018-12-17 11:32:25 UTC
Here's an inventory that produces this issue:

all.yml is empty


OSEv3.yml
---------

ansible_user: openshift
ansible_become: True

dns_domain: "mydomain.com"
dns_nameservers:
- 10.8.15.4

deployment_type: openshift-enterprise
openshift_deployment_type: openshift-enterprise
openshift_release: v3.11

# Enable the Multi-Tenant plugin
os_sdn_network_plugin_name: 'redhat/openshift-ovs-multitenant'

# Change default ports to use standard 443
openshift_master_api_port: 443
openshift_master_console_port: 443


# OpenShift FQDNs, DNS, App domain specific configurations
openshift_master_default_subdomain: "apps.pvbugfx.mydomain.com"
openshift_master_cluster_hostname: "master-0.pvbugfx.mydomain.com"
openshift_master_cluster_public_hostname: "console.pvbugfx.mydomain.com"

oreg_auth_user: "{{ lookup('env', 'OREG_AUTH_USER' )}}"
oreg_auth_password: "{{ lookup('env', 'OREG_AUTH_PASSWORD' )}}"

openshift_cloudprovider_kind: openstack
openshift_cloudprovider_openstack_auth_url: "{{ lookup('env','OCP_OSP_AUTH_URL') }}"
openshift_cloudprovider_openstack_username: "{{ lookup('env','OCP_OSP_USERNAME') }}"
openshift_cloudprovider_openstack_password: "{{ lookup('env','OCP_OSP_PASSWORD') }}"
openshift_cloudprovider_openstack_domain_name: Default
openshift_cloudprovider_openstack_tenant_id: "{{ lookup('env','OS_PROJECT_ID') }}"
openshift_cloudprovider_openstack_tenant_name: "{{ lookup('env','OS_PROJECT_NAME') }}"
openshift_cloudprovider_openstack_region: "{{ lookup('env','OS_REGION_NAME') }}"
openshift_cloudprovider_openstack_lb_subnet_id: subnet_id
openshift_cloudprovider_openstack_blockstorage_version: v2


openshift_hosted_registry_storage_kind: openstack
openshift_hosted_registry_storage_access_modes:
- 'ReadWriteOnce'
openshift_hosted_registry_storage_openstack_filesystem: ext4
openshift_hosted_registry_storage_openstack_volumeID: 64ff44bf-ecd1-4ca4-8fbc-f57a365a6467
openshift_hosted_registry_storage_volume_size: 20Gi


openshift_master_identity_providers:
- 'name': 'htpasswd_auth'
  'login': 'true'
  'challenge': 'true'
  'kind': 'HTPasswdPasswordIdentityProvider'

# OpenShift Node specific parameters
openshift_node_groups:
- name: node-config-master
  labels:
  - 'node-role.kubernetes.io/master=true'
  edits:
  - key: kubeletArguments.kube-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 50 }}M'
  - key: kubeletArguments.system-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 100 }}M'
- name: node-config-infra
  labels:
  - 'node-role.kubernetes.io/infra=true'
  edits:
  - key: kubeletArguments.kube-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 50 }}M'
  - key: kubeletArguments.system-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 100 }}M'
- name: node-config-compute
  labels:
  - 'node-role.kubernetes.io/compute=true'
  edits:
  - key: kubeletArguments.kube-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 50 }}M'
  - key: kubeletArguments.system-reserved
    value:
    - 'cpu={{ ansible_processor_vcpus * 50 }}m'
    - 'memory={{ ansible_processor_vcpus * 100 }}M'

Comment 7 Tzu-Mainn Chen 2018-12-18 04:50:28 UTC
Ah, I've reproduced the issue; strange! I'll dig in and let you know what I find.

Comment 8 Tzu-Mainn Chen 2018-12-18 23:26:56 UTC
Okay, I figured out the source of the issue. I need to dig deeper to understand what the right fix may be, but one thing that works is setting this in your OSEv3.yml inventory file:

openshift_storageclass_default: "false"

It looks like default storage classes were introduced relatively recently. The feature is on by default; when it is, PVCs have their storage class set to the default, but PVs do not, so the registry claim binds to a dynamically provisioned volume instead of the pre-allocated one. That explains the behavior you reported.
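To make the mismatch concrete, here is a minimal sketch (not taken from the installer; field values are filled in from the outputs earlier in this bug, and the comments mark assumptions) of how the two objects end up incompatible:

```yaml
# Illustrative manifests only, reconstructed from the bug's oc output.
# The statically created PV carries no storageClassName, so it only
# matches claims whose storage class is empty.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-volume
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  cinder:                     # in-tree Cinder volume source
    volumeID: e4ae25b4-930a-4ad4-b674-2e360746f647
    fsType: ext4
---
# The registry PVC: with a default StorageClass active, the admission
# plugin sets storageClassName to the default ("standard" here), so the
# claim is satisfied by dynamic provisioning, not by registry-volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: registry-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  # storageClassName: ""    # pinning an empty class would force binding
                            # to the classless pre-allocated PV
```

Disabling the default storage class (or pinning the claim's class to the empty string) removes the injected class, letting the binder match the claim against the pre-allocated PV.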

Comment 9 Tzu-Mainn Chen 2019-01-03 22:54:54 UTC
Fixed by https://github.com/openshift/openshift-ansible/pull/10948

Comment 10 Tzu-Mainn Chen 2019-01-21 14:38:27 UTC
The fix has been merged. Note that part of the fix includes a doc update that explains an added parameter that must be present in order for this to work.

Comment 12 Weihua Meng 2019-03-15 08:43:07 UTC
Hi, 

I still see this issue with openshift-ansible-3.11.95:

openshift_hosted_registry_storage_kind=openstack
openshift_hosted_registry_storage_provider=
openshift_hosted_registry_storage_access_modes=["ReadWriteOnce"]
openshift_hosted_registry_storage_openstack_filesystem=xfs
openshift_hosted_registry_storage_volume_size=20Gi
openshift_hosted_registry_storage_openstack_volumeID=d063fd5a-fc1a-4e6e-9524-fb6641aff7eb


# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                    STORAGECLASS   REASON    AGE
pvc-60d1d64d-46f3-11e9-9624-fa163e2e3062   20Gi       RWO            Delete           Bound       default/registry-claim   standard                 15m
registry-volume                            20Gi       RWO            Retain           Available                                                     15m
# oc get pvc
NAME             STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
registry-claim   Bound     pvc-60d1d64d-46f3-11e9-9624-fa163e2e3062   20Gi       RWO            standard       15m

Comment 16 Weihua Meng 2019-03-16 01:39:34 UTC
Thanks.

Fixed.

openshift-ansible-3.11.96-1.git.0.c2a3f89.el7

[root@preserve-wmengl6a311d2rcin-me-1 ~]# oc get pv
NAME              CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                    STORAGECLASS   REASON    AGE
registry-volume   20Gi       RWO            Retain           Bound     default/registry-claim   standard                 11m
[root@preserve-wmengl6a311d2rcin-me-1 ~]# oc get pvc
NAME             STATUS    VOLUME            CAPACITY   ACCESS MODES   STORAGECLASS   AGE
registry-claim   Bound     registry-volume   20Gi       RWO            standard       11m


We need to update the online docs for this:

https://docs.openshift.com/container-platform/3.11/install/configuring_inventory_file.html
not only v3.11 but all relevant versions.

Chen, is there a doc bug to track the online doc update? If yes, could you put the bug link here? Or do I need to file a doc bug to track this?

One more thing: we also need bugs to ensure all related versions get this fix. Should I clone this bug to track that?

Comment 17 Tzu-Mainn Chen 2019-03-18 14:17:27 UTC
Ah, I don't believe there's a doc bug, nor has this bug been cloned to previous versions. Note that this fix only applies to shiftstack installs, so we can only clone as far back as 3.10. Thanks!

Comment 18 Weihua Meng 2019-03-21 02:26:32 UTC
Fixed.

openshift-ansible-3.11.98-1.git.0.3cfa7c3.el7

# oc get pv
NAME              CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                    STORAGECLASS   REASON    AGE
registry-volume   20Gi       RWO            Retain           Bound     default/registry-claim   standard                 10m
[root@qe-wmengl6a31198d2rcin-master-etcd-1 ~]# oc get pvc
NAME             STATUS    VOLUME            CAPACITY   ACCESS MODES   STORAGECLASS   AGE
registry-claim   Bound     registry-volume   20Gi       RWO            standard       10m

Comment 20 errata-xmlrpc 2019-04-11 05:38:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0636