Bug 1923956 - [aws-c2s] Storage can not be used in the cluster
Summary: [aws-c2s] Storage can not be used in the cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Matthew Staebler
QA Contact: Yunfei Jiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-02-02 10:06 UTC by Yunfei Jiang
Modified: 2021-03-16 08:43 UTC (History)
3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
This PR concerns new functionality.
Clone Of:
Environment:
Last Closed: 2021-03-16 08:42:46 UTC
Target Upstream Version:
Embargoed:


Attachments
log-bundle-20210209100005.tar.gz (1.56 MB, application/gzip)
2021-02-09 15:44 UTC, Yunfei Jiang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:0749 0 None None None 2021-03-16 08:43:14 UTC

Description Yunfei Jiang 2021-02-02 10:06:37 UTC
After launching the cluster, one of node labels is:

{
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/instance-type": "m5.large",
"beta.kubernetes.io/os": "linux",
"failure-domain.beta.kubernetes.io/region": "us-east-1",
"failure-domain.beta.kubernetes.io/zone": "us-iso-east-1c",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "ip-10-143-1-71",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/worker": "",
"node.kubernetes.io/instance-type": "m5.large",
"node.openshift.io/os_id": "rhcos",
"topology.ebs.csi.aws.com/zone": "us-east-1c",
"topology.kubernetes.io/region": "us-east-1",
"topology.kubernetes.io/zone": "us-iso-east-1c"
}

With these labels, storage cannot be used.


The PV created by the default storage class gp2 has the following node affinity:
nodeAffinity:
   required:
     nodeSelectorTerms:
     - matchExpressions:
       - key: failure-domain.beta.kubernetes.io/zone
         operator: In
         values:
         - us-iso-east-1c
       - key: failure-domain.beta.kubernetes.io/region
         operator: In
         values:
         - us-iso-east-1


The node labels and the PV node affinity do not match, and the following error occurs:
> oc -n openshift-monitoring describe pod alertmanager-main-0
...
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  7m35s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-67fbd7e1-654c-4e5d-86ed-eb169ca38f88" node affinity doesn't match node "ip-10-143-1-13.ec2.internal": no matching NodeSelectorTerms
  Warning  FailedScheduling  7m35s  default-scheduler  0/8 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 5 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  7m9s   default-scheduler  0/8 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 5 node(s) had volume node affinity conflict.
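The mismatch can be reproduced outside the cluster with a small sketch of how the volume-binding check evaluates a PV's required nodeSelectorTerms against a node's labels. The evaluation rules follow the Kubernetes API (terms are ORed; matchExpressions within a term are ANDed); the helper functions are illustrative and only handle the In/NotIn operators:

```python
def matches_expression(labels, expr):
    """Evaluate one matchExpression against a node's label map."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    raise ValueError(f"unhandled operator: {op}")

def node_matches_pv(labels, node_selector_terms):
    # Terms are ORed; expressions within a term are ANDed.
    return any(
        all(matches_expression(labels, e) for e in term["matchExpressions"])
        for term in node_selector_terms
    )

# Labels as reported on the node (region comes from instance metadata).
node_labels = {
    "failure-domain.beta.kubernetes.io/region": "us-east-1",
    "failure-domain.beta.kubernetes.io/zone": "us-iso-east-1c",
}
# The gp2 PV's required nodeSelectorTerms from the bug report.
pv_terms = [{"matchExpressions": [
    {"key": "failure-domain.beta.kubernetes.io/zone",
     "operator": "In", "values": ["us-iso-east-1c"]},
    {"key": "failure-domain.beta.kubernetes.io/region",
     "operator": "In", "values": ["us-iso-east-1"]},
]}]

print(node_matches_pv(node_labels, pv_terms))  # False: region label mismatch
```

The zone expression matches, but the region expression fails because instance metadata reports us-east-1 instead of us-iso-east-1 — hence "no matching NodeSelectorTerms".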


Tried overriding the label value; the pod is then scheduled successfully and runs:

> oc label node ip-10-143-1-13.ec2.internal failure-domain.beta.kubernetes.io/region=us-iso-east-1 --overwrite
> oc get no ip-10-143-1-13.ec2.internal -ojson | jq .metadata
<--snip-->
  "labels": {
<--snip-->
    "failure-domain.beta.kubernetes.io/region": "us-iso-east-1",
    "failure-domain.beta.kubernetes.io/zone": "us-iso-east-1c",
<--snip-->

but the label reverts after approximately 12 hours:
> oc get no ip-10-143-1-13.ec2.internal -ojson | jq .metadata
<--snip-->
  "labels": {
<--snip-->
    "failure-domain.beta.kubernetes.io/region": "us-east-1",
<--snip-->

The gp2-csi storage class also cannot provision PVs.
It fails with the following error message:

I0202 04:06:50.375383 1 controller.go:95] CreateVolume: called with args {Name:pvc-65e29807-2dfc-4a12-a558-1810a4024112 CapacityRange:required_bytes:5368709120 VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[csi.storage.k8s.io/pv/name:pvc-65e29807-2dfc-4a12-a558-1810a4024112 csi.storage.k8s.io/pvc/name:test-pvc-5 csi.storage.k8s.io/pvc/namespace:default encrypted:true type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"topology.ebs.csi.aws.com/zone" value:"us-east-1c" > > preferred:<segments:<key:"topology.ebs.csi.aws.com/zone" value:"us-east-1c" > > XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0202 04:07:00.375108 1 driver.go:115] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled

OCP Version:
4.7.0-0.nightly-2021-01-31-031653

Comment 1 Matthew Staebler 2021-02-02 13:51:37 UTC
This is a deficiency of the simulated environment. Any component that uses the instance metadata is going to get us-east-1 as the region instead of us-iso-east-1. This would not occur in the real environment, but I don't have any way to actually test that. I cannot think of anything that we can do to work around this in the simulated environment.

Comment 2 Matthew Staebler 2021-02-05 16:12:19 UTC
I am going to look into whether we can set up a local man-in-the-middle on each machine to manipulate the instance metadata response.

Comment 3 Matthew Staebler 2021-02-09 01:09:41 UTC
@yunjiang I have a workaround for dealing with the instance metadata. I created a service [1] that runs on OpenShift nodes and emulates the instance metadata. With this, I was able to create a gp2 PV.

$ ./oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
pvc-366e9653-df2f-43d9-99ac-f2f5d2faf203   3Gi        RWO            Delete           Bound    default/task-pv-claim   gp2                     6m54s

With the service, you no longer need to have the custom endpoints for us-east-1. There is one outstanding problem where the master MachineConfigPool is out-of-sync, resulting in the machine-config operator being degraded. But all of the operators successfully install and the cluster works fine.

There is a README with usage in the repo. Let me know if you have any questions or run into any issues.

[1] https://github.com/staebler/c2s-instance-metadata

Comment 4 Yunfei Jiang 2021-02-09 15:43:08 UTC
Hello Matthew
Thanks for your quick fix and detailed instruction.

I tried to use your fix to install a disconnected cluster on C2S, but the bootstrap process failed. (log-bundle-20210209100005.tar.gz is attached)

Is there something wrong?

Following are my steps:
1. create a registry and mirror quay.io/staebler/c2s-instance-metadata:latest into the local registry.
2. create install-config.yaml, set credentialsMode to Manual, and then create manifests
3. create Secrets config files and copy them to [installdir]/openshift/
4. follow steps 3 and 4 (of the README) and create the cluster

> manifests/cloud-provider-config.yaml

apiVersion: v1
data:
  config: |
    [Global]
  ca-bundle.pem: |
    -----BEGIN CERTIFICATE-----
<--snip-->
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: cloud-provider-config
  namespace: openshift-config

> manifests/meta-master.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: c2s-instance-metadata-master
<--snip-->

> manifests/meta-worker.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: c2s-instance-metadata-worker
<--snip-->

Comment 5 Yunfei Jiang 2021-02-09 15:44:04 UTC
Created attachment 1755975 [details]
log-bundle-20210209100005.tar.gz

Comment 6 Matthew Staebler 2021-02-09 16:57:33 UTC
@yunjiang I forgot to add the part in the Usage where you need to change the image used in the MachineConfigs from quay.io/staebler/c2s-instance-metadata to the image in your local image registry.

Comment 7 Matthew Staebler 2021-02-09 18:14:11 UTC
I have updated the Usage in the README in github.com/staebler/c2s-instance-metadata.

I have also added a fix for the issue where the MCO ends up degraded. The MCO did not like the use of passwd in the ignition config. I have replaced it with calls to add the metadata user in the c2s-instance-metadata-setup service. With this change, the install is completing successfully.

Comment 8 Qin Ping 2021-02-10 08:20:12 UTC
Did some basic tests of the aws ebs in-tree plugin and the aws ebs csi driver with 4.7.0-0.nightly-2021-02-08-052658; all the tests passed. Storage can be used in this cluster.

Comment 9 Yunfei Jiang 2021-02-10 09:55:59 UTC
Matthew,

After applying your patch, the cluster installed successfully; Qin Ping has already run some tests on this cluster. Thanks.

btw, since I use a mirror registry, there was an authorization issue with it; after I changed [1] to:

```
          ExecStart=/usr/bin/sh -c 'if ! id metadata &>/dev/null; then useradd -p "*" -U -m metadata -G sudo; fi; mkdir /var/home/metadata/.docker ; cp /var/lib/kubelet/config.json /var/home/metadata/.docker/ ; chown -R metadata:metadata /var/home/metadata/.docker'
```

it works well. Let me know if there is any potential issue with this minor fix, thanks.

[1] https://github.com/staebler/c2s-instance-metadata/blob/master/config/c2s-instance-metadata-machineconfig.yaml#L22


>> issue

I noticed that the cluster has some pending CSRs:

./oc get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   CONDITION
csr-2j9km   32m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-54trl   32m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-164.ec2.internal                                    Pending
csr-58bg8   32m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-14.ec2.internal                                     Pending
csr-5vdwx   17m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-164.ec2.internal                                    Pending
csr-7kkc5   52m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-161.ec2.internal                                    Approved,Issued
csr-8grsf   32m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-8w5tk   52m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-dfsbc   17m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-127.ec2.internal                                    Pending
csr-fdccc   2m6s   kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-127.ec2.internal                                    Pending
csr-mj25q   52m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-px29f   52m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-qpzhr   17m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-14.ec2.internal                                     Pending
csr-rh8tn   32m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-sdgmp   52m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-139.ec2.internal                                    Approved,Issued
csr-ssgz5   2m9s   kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-164.ec2.internal                                    Pending
csr-xp8s2   32m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-127.ec2.internal                                    Pending
csr-zkd2j   2m8s   kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-14.ec2.internal                                     Pending
csr-zxdl4   52m    kubernetes.io/kubelet-serving                 system:node:ip-10-143-1-239.ec2.internal                                    Approved,Issued

Is this the expected status?

Comment 11 Matthew Staebler 2021-02-11 01:32:38 UTC
(In reply to Yunfei Jiang from comment #9)
> I noticed that the cluster have some pending CSRs:

This is caused by the nodes not getting a valid ExternalIP or ExternalDNS. I am not sure how that is supposed to get filled out.

> $ oc get nodes ip-10-143-1-127.ec2.internal -ojson | jq -r '.status.addresses[] | select(.type == "ExternalIP").address'
> <?xml version="1.0" encoding="iso-8859-1"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> 		 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
>  <head>
>   <title>404 - Not Found</title>
>  </head>
>  <body>
>   <h1>404 - Not Found</h1>
>  </body>
> </html>

The machine approver is then choking on that weird DNS name.

> $ oc logs -n openshift-cluster-machine-approver machine-approver-6979d5cdff-vq64c -c machine-approver-controller
> <snip>
> I0211 01:29:41.112695       1 main.go:218] Error syncing csr csr-f4xlb: DNS name '<?xml version="1.0" encoding="iso-8859-1"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> 		 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
>  <head>
>   <title>404 - Not Found</title>
>  </head>
>  <body>
>   <h1>404 - Not Found</h1>
>  </body>
> </html>
> ' not in machine names: ip-10-143-1-127.ec2.internal ip-10-143-1-127.ec2.internal

Comment 12 Matthew Staebler 2021-02-11 01:37:33 UTC
(In reply to Yunfei Jiang from comment #9)
> btw, since I use a mirror registry, there is an authorization issue on it,
> after I changed [1] to:
> 
> ```
>           ExecStart=/usr/bin/sh -c 'if ! id metadata &>/dev/null; then
> useradd -p "*" -U -m metadata -G sudo; fi; mkdir /var/home/metadata/.docker
> ; cp /var/lib/kubelet/config.json /var/home/metadata/.docker/ ; chown -R
> metadata:metadata /var/home/metadata/.docker'
> ```
> 
> it works well, let me know if there is any potential issue against this
> minor fix, thanks.
> 
> [1]
> https://github.com/staebler/c2s-instance-metadata/blob/master/config/c2s-
> instance-metadata-machineconfig.yaml#L22

That change makes sense. Thanks. I will update the repo with it.

If you are satisfied that storage works, could you move the bug to VERIFIED or CLOSED?

If the only outstanding issue is the CSRs, then I think we should track that in a separate BZ. Do you recall seeing that CSR behavior when we were using the custom endpoints for us-east-1 instead of using the c2s-instance-metadata interceptor?

Comment 13 Matthew Staebler 2021-02-11 05:44:27 UTC
I figured out the problem with the pending CSRs. The instance metadata interceptor is not preserving the status code returned by the real server. The interceptor is always returning 200. So, for example, when a call to http://169.254.169.254/latest/meta-data/public-ipv4 is made on an instance without a public IP, the status code should be 404, but when going through the interceptor it is 200 instead. I'll work on a fix for the interceptor.
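The interceptor fix can be illustrated with a minimal, hypothetical forwarding proxy (this is not the actual c2s-instance-metadata code): the handler must propagate the upstream status code instead of unconditionally replying 200, so that a missing public-ipv4 still surfaces as a 404:

```python
import http.server
import threading
import urllib.error
import urllib.request

UPSTREAM_PORT = None  # set once the stub upstream is bound

class StubMetadata(http.server.BaseHTTPRequestHandler):
    """Stands in for the real instance-metadata endpoint."""
    def do_GET(self):
        # An instance with no public IP answers 404 for this path.
        self.send_response(404 if self.path == "/latest/meta-data/public-ipv4" else 200)
        self.end_headers()
    def log_message(self, *_):  # silence request logging
        pass

class Interceptor(http.server.BaseHTTPRequestHandler):
    """Forwards the request upstream, preserving the upstream status code."""
    def do_GET(self):
        try:
            resp = urllib.request.urlopen(
                f"http://127.0.0.1:{UPSTREAM_PORT}{self.path}")
            status, body = resp.status, resp.read()
        except urllib.error.HTTPError as e:
            status, body = e.code, e.read()  # the fix: keep the 404, don't mask it as 200
        self.send_response(status)
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *_):
        pass

def start(handler):
    srv = http.server.HTTPServer(("127.0.0.1", 0), handler)  # port 0 = any free port
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv

def status_via_interceptor(path):
    """Start stub upstream + interceptor, request path through the interceptor."""
    global UPSTREAM_PORT
    upstream = start(StubMetadata)
    UPSTREAM_PORT = upstream.server_address[1]
    proxy = start(Interceptor)
    try:
        urllib.request.urlopen(f"http://127.0.0.1:{proxy.server_address[1]}{path}")
        return 200
    except urllib.error.HTTPError as e:
        return e.code
    finally:
        for s in (upstream, proxy):
            s.shutdown()
            s.server_close()

print(status_via_interceptor("/latest/meta-data/public-ipv4"))  # 404, not a masked 200
```

The bug described above corresponds to replacing `status` with a hard-coded 200 in the handler, which is why the machine approver received an HTML error page body with a 200 status instead of a 404.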

Comment 14 Matthew Staebler 2021-02-11 18:25:10 UTC
The c2s-instance-metadata repo has been updated with a fix for the issue of the pending CSRs.

Comment 15 Yunfei Jiang 2021-02-18 09:25:55 UTC
Thanks Matthew, I retried with your latest fix; it works well now.

Comment 18 errata-xmlrpc 2021-03-16 08:42:46 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.2 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0749

