Bug 2275977 - Pods created with NFS PVCs provisioned from NFS-restricted StorageClasses are stuck at 'ContainerCreating' [NEEDINFO]
Summary: Pods created with NFS PVCs provisioned from NFS-restricted StorageClasses are stuck at 'ContainerCreating'
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Niels de Vos
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-04-18 19:30 UTC by Amrita Mahapatra
Modified: 2024-09-12 12:20 UTC
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
muagarwa: needinfo? (ammahapa)



Description Amrita Mahapatra 2024-04-18 19:30:51 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Pods created with NFS PVCs provisioned from NFS-restricted StorageClasses are stuck at 'ContainerCreating'.
The error message is:

"Generated from kubelet on ip-10-0-85-151.us-east-2.compute.internal

MountVolume.SetUp failed for volume "pvc-406b305d-3731-41e1-8603-e44b03d23130" : rpc error: code = Internal desc = nfs: failed to mount "ocs-storagecluster-cephnfs-service:/0001-0011-openshift-storage-0000000000000001-0b58eb88-db27-4f3d-b8dc-3e232864a06b" to "/var/lib/kubelet/pods/58785246-977c-4746-a800-92cd2a337c84/volumes/kubernetes.io~csi/pvc-406b305d-3731-41e1-8603-e44b03d23130/mount" : mount failed: exit status 32 Mounting command: mount Mounting arguments: -t nfs ocs-storagecluster-cephnfs-service:/0001-0011-openshift-storage-0000000000000001-0b58eb88-db27-4f3d-b8dc-3e232864a06b /var/lib/kubelet/pods/58785246-977c-4746-a800-92cd2a337c84/volumes/kubernetes.io~csi/pvc-406b305d-3731-41e1-8603-e44b03d23130/mount Output: mount.nfs: mounting ocs-storagecluster-cephnfs-service:/0001-0011-openshift-storage-0000000000000001-0b58eb88-db27-4f3d-b8dc-3e232864a06b failed, reason given by server: No such file or directory stderr: ""


nfs-ganesha log:

18/04/2024 19:24:53 : epoch 66216cf8 : openshift-storage-ocs-storagecluster-cephnfs : nfs-ganesha-1[svc_95] nfs4_export_check_access :NFS4 :INFO :Access not allowed on Export_Id 1 /0001-0011-openshift-storage-0000000000000001-45e2a520-1af9-4684-90f7-21afbbb798a3 for client ::ffff:100.64.0.5
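To confirm what NFS-Ganesha has recorded as the allowed clients for this export, the export definition can be dumped from the Ceph side. A minimal sketch, assuming the rook-ceph toolbox is enabled in openshift-storage and the CephNFS cluster is named ocs-storagecluster-cephnfs (the pseudo-path placeholder must be replaced with a value from the 'export ls' output; these subcommands exist on recent Ceph releases):

# Open a shell in the rook-ceph toolbox pod
oc -n openshift-storage rsh deploy/rook-ceph-tools

# List the exports for the CephNFS cluster, then dump a single export,
# including its access/client list
ceph nfs export ls ocs-storagecluster-cephnfs
ceph nfs export info ocs-storagecluster-cephnfs <pseudo-path-from-ls-output>

If the export's client list does not contain the address the server sees the node connecting from (100.64.0.5 in the log above), the mount is rejected exactly as logged.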


Version of all relevant components (if applicable):
OCP: 4.16.0-0.nightly-2024-04-15-184947
ODF: 4.16.0-75.stable

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? Yes


Is there any workaround available to the best of your knowledge? NA


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)? 3


Is this issue reproducible? Yes


Can this issue be reproduced from the UI? Yes


If this is a regression, please provide more details to justify this: NA


Steps to Reproduce:
1. Create an ODF cluster with the NFS feature enabled
2. Create an NFS-restricted StorageClass with clients: <supported hosts> (see the example manifests after this list)
3. Create a PVC using the restricted NFS StorageClass
4. Create a pod that mounts the NFS PVC
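Example manifests for steps 2-4, as a minimal sketch: the provisioner name is the usual ODF NFS one, the StorageClass parameters other than clients are assumed to be copied from the default ODF NFS StorageClass (typically ocs-storagecluster-ceph-nfs), and the object names and client IPs below are illustrative only.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-restricted-sc                          # hypothetical name
provisioner: openshift-storage.nfs.csi.ceph.com    # assumed default ODF NFS provisioner
parameters:
  # clusterID, nfsCluster, server, fsName and the csi.storage.k8s.io/*
  # secret parameters are copied from the default ODF NFS StorageClass.
  clients: "10.0.85.151,10.0.90.12"                # illustrative worker node IPs
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-restricted-pvc
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs-restricted-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-test-pod
spec:
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /mnt/nfs
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-restricted-pvc

With the restriction in place, the pod stays in ContainerCreating and kubelet reports the MountVolume.SetUp failure shown in the description.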


Actual results:
Pods created with NFS PVCs provisioned from NFS-restricted StorageClasses are stuck at 'ContainerCreating'.

Expected results:
Pods should be Running successfully.

Additional info:

Comment 6 Niels de Vos 2024-04-19 12:05:30 UTC
The NFS-Ganesha logs contain "Access not allowed" for "client ::ffff:100.64.0.5". This is the IPv4-mapped IPv6 notation for the IPv4 address 100.64.0.5.

The worker node is expected to be able to mount the NFS export when 100.64.0.5 is included in the "clients:" parameter of the StorageClass.

It is unclear where IPv4 100.64.0.5 comes from when the node is connecting. The nodes (and Ceph-CSI pods with host-networking) are part of a different IP-range.

The main question is why that IP address is used instead of the real IP address of the worker node.
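One way to check which source address NFS-Ganesha actually sees is to list the established NFS connections from inside the Ganesha pod. A sketch, assuming the usual Rook deployment name for this CephNFS cluster and that ss is available in the container image:

# Peer addresses on port 2049 are what Ganesha matches against the
# export's client list
oc -n openshift-storage exec deploy/rook-ceph-nfs-ocs-storagecluster-cephnfs-a -- \
  ss -tn state established '( sport = :2049 )'

If the peer address shows up as 100.64.0.5 rather than the node's own address, the traffic is being source-translated somewhere between the node and the NFS service, which would explain why a clients list containing node addresses never matches.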

Comment 15 Niels de Vos 2024-05-07 11:37:35 UTC
Hi Blaine!

Is it possible that the Rook-configured NFS-Ganesha server does not have the `HAProxy_Hosts` configuration option set? Without that option, NFS-Ganesha might not try to detect/parse the HAProxy header, which causes the "Permission Denied" errors.

The option is documented here:
https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/doc/man/ganesha-core-config.rst#nfs_core_param-
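For reference, the option goes in the NFS_CORE_PARAM block of ganesha.conf. A minimal sketch of what the generated configuration would need; the network ranges below are placeholders, not values verified on this cluster:

NFS_CORE_PARAM {
    # Hosts/networks from which NFS-Ganesha should accept and parse the
    # proxy header, so that export access checks use the original client
    # address rather than the proxy's address.
    HAProxy_Hosts = 10.128.0.0/14, 100.64.0.0/16;   # placeholder CIDRs
}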

