Bug 2089167 - [RFE] Enable cephadm to provide one virtual IP per ganesha instance of the NFS service
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 7.1
Assignee: Adam King
QA Contact: Manisha Saini
Docs Contact: Akash Raj
URL:
Whiteboard:
Duplicates: 2089166
Depends On:
Blocks: 2160010 2267614 2298578 2298579
 
Reported: 2022-05-23 07:11 UTC by Francesco Pantano
Modified: 2024-07-18 07:59 UTC
CC List: 9 users

Fixed In Version: ceph-18.2.1-2.el9cp
Doc Type: Enhancement
Doc Text:
.Ingress service with an NFS backend can now be set up to use only `keepalived` to create a virtual IP (VIP) for the NFS daemon to bind to, without the HAProxy layer involved

With this enhancement, an ingress service with an NFS backend can be set up to use only `keepalived` to create a virtual IP for the NFS daemon to bind to, without the HAProxy layer involved. This is useful in cases where the NFS daemon is moved around, since clients do not need to use a different IP to connect to it. Cephadm deploys `keepalived` to set up the VIP and then has the NFS daemon bind to that VIP. This can also be set up through the NFS module via the `ceph nfs cluster create` command, using the flags `--ingress --ingress-mode keepalive-only --virtual-ip <VIP>`.

The ingress specification, which includes the `keepalive_only: true` setting, looks as follows:

----
service_type: ingress
service_id: nfs.nfsganesha
service_name: ingress.nfs.nfsganesha
placement:
  count: 1
  label: foo
spec:
  backend_service: nfs.nfsganesha
  frontend_port: 12049
  monitor_port: 9049
  virtual_ip: 10.8.128.234/24
  virtual_interface_networks: 10.8.128.0/24
  keepalive_only: true
----

The corresponding NFS specification, which includes a `virtual_ip` field that should match the VIP in the ingress specification, looks as below:

----
networks:
  - 10.8.128.0/21
service_type: nfs
service_id: nfsganesha
placement:
  count: 1
  label: foo
spec:
  virtual_ip: 10.8.128.234
  port: 2049
----
Clone Of:
Environment:
Last Closed: 2024-06-13 14:19:21 UTC
Embargoed:




Links
Ceph Project Bug Tracker 55663 (last updated 2022-05-23 07:12:57 UTC)
Red Hat Issue Tracker RHCEPH-4344 (last updated 2022-05-23 07:26:44 UTC)
Red Hat Product Errata RHSA-2024:3925 (last updated 2024-06-13 14:19:27 UTC)

Description Francesco Pantano 2022-05-23 07:11:32 UTC
Description of problem:

cephadm allows an ingress service (haproxy + keepalived) to be deployed in front of the NFS-ganesha service (a cluster of NFS-ganesha daemons) to provide an HA, stable virtual IP to NFS clients; see https://docs.ceph.com/en/quincy/mgr/nfs/#ingress.
One of the downsides of this HA NFS setup is that the backend NFS servers cannot see the source client IPs; they only see the proxy server's IP.
So use cases such as OpenStack manila, where NFS-ganesha enforces client-IP-based authorization for export access, cannot be met.

An HA model for the NFS service was discussed earlier in https://pad.ceph.com/p/cephadm-nfs-ha (options 2 and 2a) and https://www.spinics.net/lists/dev-ceph/msg03442.html (options 5 and 6), where cephadm internally manages the virtual IPs of the ganesha servers.
cephadm would set up one virtual IP for each ganesha daemon it deploys.
It would add or remove the virtual IP when creating or removing a ganesha server.
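
As eventually implemented (see the Doc Text above and comment 14 below), this keepalive-only mode can be requested from the NFS module along these lines, with placeholder values:

ceph nfs cluster create <cluster_id> --placement "<host>" --ingress --ingress-mode keepalive-only --virtual-ip <VIP/prefix>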

From https://www.spinics.net/lists/dev-ceph/msg03442.html,

"
single ganesha + single virtual IP
- 1 ganesha daemon
- 1 virtual IP that follows the ganesha daemon
- on failure, cephadm would deploy ganesha elsewhere + move virtual IP
- not implemented

multiple ganesha + multiple virtual IPs
- N ganesha daemons
- N virtual IPs
- requires ganesha changes to (1) make ganesha aware of peers and (2)
instruct clients to move around
- on failure, cephadm would deploy failed ganesha elsewhere + move
that virtual IP
- not implemented (in cephadm or ganesha)
"

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Francesco Pantano 2022-05-24 16:44:11 UTC
*** Bug 2089166 has been marked as a duplicate of this bug. ***

Comment 14 Manisha Saini 2024-03-28 00:18:00 UTC
Build and test details-->

# rpm -qa | grep nfs
libnfsidmap-2.5.4-20.el9.x86_64
nfs-utils-2.5.4-20.el9.x86_64
nfs-ganesha-selinux-5.7-2.el9cp.noarch
nfs-ganesha-5.7-2.el9cp.x86_64
nfs-ganesha-rgw-5.7-2.el9cp.x86_64
nfs-ganesha-ceph-5.7-2.el9cp.x86_64
nfs-ganesha-rados-grace-5.7-2.el9cp.x86_64
nfs-ganesha-rados-urls-5.7-2.el9cp.x86_64


# ceph --version
ceph version 18.2.1-89.el9cp (926619fe7135cbd6d305b46782ee7ecc7be199a3) reef (stable)


Steps performed - 

Scenario 1:
===========
[ceph: root@cali013 /]# ceph orch ps | grep nfs
[ceph: root@cali013 /]# ceph orch ps | grep ha
[ceph: root@cali013 /]# ceph nfs cluster ls
[]


[ceph: root@cali013 /]# ceph nfs cluster create foo --placement "cali013" --ingress --ingress-mode keepalive-only --virtual-ip 10.8.130.236/21
[ceph: root@cali013 /]# ceph nfs cluster ls
[
  "foo"
]
[ceph: root@cali013 /]# ceph nfs cluster info foo
{
  "foo": {
    "backend": [
      {
        "hostname": "cali013",
        "ip": "10.8.130.13",
        "port": 2049
      }
    ],
    "port": 9049,
    "virtual_ip": "10.8.130.236"
  }
}
[ceph: root@cali013 /]# ceph orch ps | grep nfs
keepalived.nfs.foo.cali013.oqsazj  cali013  *:9049       running (23s)    19s ago  23s    3980k        -  2.2.8            f6f3a07d6384  92fbe905bdef
nfs.foo.0.0.cali013.krakiu         cali013  *:2049       running (24s)    19s ago  24s    51.7M        -  5.7              2abcbe3816d6  6ac675f3f2f8


[ceph: root@cali013 /]# ceph orch ps | grep ha


[ceph: root@cali013 /]# ceph orch ls
NAME                       PORTS              RUNNING  REFRESHED  AGE  PLACEMENT
ingress.nfs.foo            10.8.130.236:9049      1/1  31s ago    47s  cali013;count:1
mds.cephfs                                        2/2  40s ago    5w   count:2
mgr                                               3/3  43s ago    5M   cali013;cali015;cali016;count:3
mon                                               1/5  31s ago    5M   <unmanaged>
nfs.foo                    ?:2049                 1/1  31s ago    47s  cali013;count:1
node-proxy                                        0/6  -          5w   *
osd.all-available-devices                          35  2m ago     5M   *
rgw.rgw.1                  ?:80                   2/2  2m ago     5M   label:rgw
[ceph: root@cali013 /]#

Mount the NFS share on client via VIP
====================================

[ceph: root@cali013 /]#  ceph nfs export create cephfs foo /ganesha1 cephfs --path=/volumes/subgroup0/sub0/536d7252-d1bf-45ba-93d0-f15649a1e002
{
  "bind": "/ganesha1",
  "cluster": "foo",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/volumes/subgroup0/sub0/536d7252-d1bf-45ba-93d0-f15649a1e002"
}


[root@ceph-msaini-faptco-node7 mnt]# mount -t nfs -o vers=4.1 10.8.130.236:/ganesha1 /mnt/ganesha/
[root@ceph-msaini-faptco-node7 mnt]# cd /mnt/ganesha/
[root@ceph-msaini-faptco-node7 ganesha]# ls
dir1  dir2  f1  file_ops  pynfs  tmp  tree
[root@ceph-msaini-faptco-node7 ganesha]#
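
An additional check that could be run here (not part of the captured output above) is to confirm on the ganesha host that keepalived has actually bound the VIP, for example:

[root@cali013 ~]# ip addr show | grep 10.8.130.236

The VIP should appear on one of the host's interfaces while the keepalived.nfs.foo daemon is running.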




Scenario 2: Using a spec file and more than 1 NFS daemon
========================



[root@cali013 ~]# cat nfs_ha.yaml
service_type: ingress
service_id: nfs.nfsganesha
service_name: ingress.nfs.nfsganesha
placement:
  count: 1
  hosts:
  - cali015
  - cali016
spec:
  backend_service: nfs.nfsganesha
  frontend_port: 12049
  keepalive_only: true
  monitor_port: 9049
  virtual_ip: 10.8.130.236/24


[root@cali013 ~]# cephadm shell --mount nfs_ha.yaml:/var/lib/ceph/nfs_ha.yaml
Inferring fsid 4e687a60-638e-11ee-8772-b49691cee574
Inferring config /var/lib/ceph/4e687a60-638e-11ee-8772-b49691cee574/mon.cali013/config
Using ceph image with id '2abcbe3816d6' and tag 'ceph-7.1-rhel-9-containers-candidate-63457-20240326021251' created on 2024-03-26 02:15:29 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:358fc7e11068221bbe1a0172e0f056bfd47cf7f1a983bbb8d6d238d3be21f5eb


[ceph: root@cali013 /]# ceph orch apply -i  /var/lib/ceph/nfs_ha.yaml
Scheduled ingress.nfs.nfsganesha update...


[ceph: root@cali013 /]# ceph orch ls
NAME                       PORTS                    RUNNING  REFRESHED  AGE  PLACEMENT
ingress.nfs.nfsganesha     10.8.130.236:9049,12049      1/1  13s ago    26s  cali015;cali016;count:1
mds.cephfs                                              2/2  6m ago     5w   count:2
mgr                                                     3/3  2m ago     5M   cali013;cali015;cali016;count:3
mon                                                     1/5  2m ago     5M   <unmanaged>
node-proxy                                              0/6  -          5w   *
osd.all-available-devices                                35  7m ago     5M   *
rgw.rgw.1                  ?:80                         2/2  7m ago     5M   label:rgw


[ceph: root@cali013 /]# ceph orch ps | grep nfs
keepalived.nfs.nfsganesha.cali015.narbtn  cali015  *:12049,9049  running (28s)    21s ago  28s    3976k        -  2.2.4            b79b516c07ed  a0b9e21cd815
[ceph: root@cali013 /]#

[ceph: root@cali013 /]# ceph nfs cluster ls
[]
[ceph: root@cali013 /]#



[root@cali013 ~]# cat nfs.yaml
networks:
    - 10.8.128.0/21
service_type: nfs
service_id: nfsganesha
placement:
  count: 1
  hosts:
    - cali015
    - cali016
spec:
  virtual_ip: 10.8.130.236
  port: 2049


[root@cali013 ~]# cephadm shell --mount nfs.yaml:/var/lib/ceph/nfs.yaml
Inferring fsid 4e687a60-638e-11ee-8772-b49691cee574
Inferring config /var/lib/ceph/4e687a60-638e-11ee-8772-b49691cee574/mon.cali013/config
Using ceph image with id '2abcbe3816d6' and tag 'ceph-7.1-rhel-9-containers-candidate-63457-20240326021251' created on 2024-03-26 02:15:29 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:358fc7e11068221bbe1a0172e0f056bfd47cf7f1a983bbb8d6d238d3be21f5eb


[ceph: root@cali013 /]# ceph orch apply -i  /var/lib/ceph/nfs.yaml
Scheduled nfs.nfsganesha update...


[ceph: root@cali013 /]# ceph nfs cluster ls
[
  "nfsganesha"
]


[ceph: root@cali013 /]# ceph nfs cluster info nfsganesha
{
  "nfsganesha": {
    "backend": [
      {
        "hostname": "cali015",
        "ip": "10.8.130.15",
        "port": 2049
      }
    ],
    "monitor_port": 9049,
    "port": 12049,
    "virtual_ip": "10.8.130.236"
  }
}



[ceph: root@cali013 /]# ceph orch ps | grep nfs
keepalived.nfs.nfsganesha.cali015.narbtn  cali015  *:12049,9049      running (2m)     44s ago   2m    3976k        -  2.2.4            b79b516c07ed  a0b9e21cd815
nfs.nfsganesha.0.0.cali015.fqtexu         cali015  10.8.130.15:2049  running (52s)    44s ago  52s    18.3M        -  5.7              2abcbe3816d6  6cc6c70653e3

[ceph: root@cali013 /]#  ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/volumes/subgroup0/sub0/536d7252-d1bf-45ba-93d0-f15649a1e002
{
  "bind": "/ganesha1",
  "cluster": "nfsganesha",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/volumes/subgroup0/sub0/536d7252-d1bf-45ba-93d0-f15649a1e002"
}


On client
========

[root@cali022 mnt]# mount -t nfs -o vers=4 10.8.130.236:/ganesha1 /mnt/ganesha/
[root@cali022 mnt]# cd /mnt/ganesha/
[root@cali022 ganesha]# ls
dir1  dir2  f1  file_ops  pynfs  tmp  tree
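
Because keepalive-only mode removes the HAProxy layer, the ganesha daemon sees the real client address, which is what the client-IP-based export authorization use case from the description (OpenStack manila) relies on. A sketch of such a restricted export, assuming the reef keyword form of ceph nfs export create and its --client_addr option, with placeholder values:

[ceph: root@cali013 /]# ceph nfs export create cephfs --cluster-id nfsganesha --pseudo-path <pseudo_path> --fsname cephfs --path=<path_in_cephfs> --client_addr <client_ip>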

Comment 15 Manisha Saini 2024-03-28 00:21:52 UTC
Hi Adam,

As part of this BZ verification, we tested two scenarios outlined in comment #14. 

Can you please provide clarity on the scope of this BZ? Are there any particular scenarios that require testing?

Comment 16 Adam King 2024-03-28 01:34:44 UTC
(In reply to Manisha Saini from comment #15)
> Hi Adam,
> 
> As part of this BZ verification, we tested two scenarios outlined in comment
> #14. 
> 
> Can you please provide clarity on the scope of this BZ? Are there any
> particular scenarios that require testing?

What you did is what I would have done for testing. Theoretically, you could also verify that taking the host the nfs or keepalive daemons are on offline gets them moved to another host. Keep in mind, though, that the placement must be such that there are other hosts available that match. For example:

placement:
  count: 1
  label: foo

with multiple hosts having label foo.
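
For instance, the label could be applied to both hosts used in scenario 2 before applying such a placement (a sketch using the standard orchestrator label command):

[ceph: root@cali013 /]# ceph orch host label add cali015 foo
[ceph: root@cali013 /]# ceph orch host label add cali016 foo

Then, if the host running the nfs and keepalived daemons goes offline, cephadm can redeploy them on the other labeled host and the virtual IP follows.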

But personally, I think what you did is roughly what I would do for verifying the feature and am okay with it unless Francesco has some particular case he'd like tested with this setup. Leaving needinfo on Francesco to answer if such a particular case exists.

Comment 24 errata-xmlrpc 2024-06-13 14:19:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

