Bug 1911016
| Summary: | Prometheus unable to mount NFS volumes after upgrading to 4.6 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Yash Chouksey <ychoukse> |
| Component: | Node | Assignee: | Peter Hunt <pehunt> |
| Node sub component: | CRI-O | QA Contact: | MinLi <minmli> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | alegrand, anpicker, aos-bugs, dkulkarn, erooth, hekumar, kakkoyun, ksathe, lcosic, minmli, pehunt, pkrupa, rbost, rdomnu, rphillips, schoudha, surbania, wking, xingli |
| Version: | 4.6 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | runc-1.0.0-82.rhaos4.6.git086e841.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-02-24 15:48:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Yash Chouksey
2020-12-27 01:19:46 UTC
The upgrade from 4.6.16 to 4.7.0-0.nightly-2021-02-09-192846 succeeded, and after the upgrade the prometheus pods run well and mount the NFS volume normally.
@Peter Hunt, I am not sure whether we can verify the fix by upgrading from 4.6 to 4.7, because in 4.6 the user id in the prometheus pod is already 65534, and after the upgrade to 4.7 the user id stays the same. In the original bug, however, the user id in 4.5 was 0 (root), and on upgrade to 4.6 it changed to 65534.
Can you confirm this? If yes, I think this bug is verified.
before upgrade ================
volumeMounts:
- mountPath: /prometheus
name: prometheus-k8s-db
subPath: prometheus-db
volumes:
- name: prometheus-k8s-db
persistentVolumeClaim:
claimName: prometheus-k8s-db-prometheus-k8s-0
$ oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
prometheus-k8s-db-prometheus-k8s-0 Bound nfspv6 20Gi RWO,ROX,RWX nfs215 82s
prometheus-k8s-db-prometheus-k8s-1 Bound nfspv5 20Gi RWO,ROX,RWX nfs215 82s
$ oc rsh prometheus-k8s-0
Defaulting container name to prometheus.
Use 'oc describe pod/prometheus-k8s-0 -n openshift-monitoring' to see all of the containers in this pod.
sh-4.4$ id
uid=65534(nobody) gid=65534(nobody) groups=65534(nobody)
sh-4.4$ pwd
/prometheus
sh-4.4$ ls
chunks_head queries.active wal
sh-4.4$ ls -lR
.:
total 20
drwxr-xr-x. 2 nobody nobody 6 Feb 10 09:22 chunks_head
-rw-r--r--. 1 nobody nobody 20001 Feb 10 09:27 queries.active
drwxr-xr-x. 2 nobody nobody 22 Feb 10 09:22 wal
after upgrade ============
prometheus-k8s-0 7/7 Running 1 37m
prometheus-k8s-1 7/7 Running 1 42m
$ oc rsh prometheus-k8s-0
Defaulting container name to prometheus.
Use 'oc describe pod/prometheus-k8s-0 -n openshift-monitoring' to see all of the containers in this pod.
sh-4.4$ id
uid=65534(nobody) gid=65534(nobody) groups=65534(nobody)
sh-4.4$ pwd
/prometheus
sh-4.4$ ls
chunks_head queries.active wal
sh-4.4$ ls -lR
.:
total 20
drwxr-xr-x. 2 nobody nobody 48 Feb 10 10:20 chunks_head
-rw-r--r--. 1 nobody nobody 20001 Feb 10 10:52 queries.active
drwxr-xr-x. 2 nobody nobody 54 Feb 10 10:15 wal
I believe the only reason it was failing in the upgrade from 4.5 to 4.6 was that the directory permissions weren't correctly handled for the new ID, not necessarily that there was a switch in ID. As in, the old ID (root) had permission but the new one did not. Thus, I think the upgrade from 4.6 to 4.7 succeeding verifies the bug is fixed, as this ID does have the correct permissions.

Thanks Peter for the confirmation, marking it verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
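The directory-permission failure mode described above can be sketched with a small local demo (paths and modes here are hypothetical, not taken from the cluster): a data directory whose mode was set up so that only its owner can write is fine for the old uid, but fails the same writability check for any other unprivileged uid, which is the situation the pod hit when it moved from uid 0 to uid 65534.

```shell
# Hypothetical local illustration (not cluster state): a data directory
# writable only by its owner, like a volume prepared for the old uid.
demo=$(mktemp -d)
mkdir "$demo/prometheus-db"
chmod 700 "$demo/prometheus-db"     # only the owner may read/write/enter

# The mode shows 700: a process running under a different unprivileged
# uid would fail the write check Prometheus performs on startup.
stat -c '%a' "$demo/prometheus-db"
```

The fix referenced in "Fixed In Version" addresses this by making the permissions correct for the uid the container actually runs as, rather than relying on the old uid's ownership.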