
Who When What Removed Added
Red Hat One Jira (issues.redhat.com) 2023-07-21 10:49:26 UTC Link ID Red Hat Issue Tracker OSP-26808
Francesco Pantano 2023-07-21 11:17:39 UTC CC alfrgarc, gfidente, johfulto
Francesco Pantano 2023-07-21 11:20:14 UTC Doc Type If docs needed, set a value Known Issue
Doc Text Cause:
Due to bz#2224351, after the cephadm adoption, RGW instances are not bound to the storage network.

Consequence:
The FFU execution fails and HAProxy is unable to recover.

Workaround (if any):

The following procedure is the workaround. It assumes that the Ceph cluster has already been upgraded from Red Hat Ceph Storage 4 to 5 and adopted by cephadm.


````
1. Log in to the undercloud host as the stack user.
2. Source the stackrc undercloud credentials file:

$ source ~/stackrc

3. Log in to a Controller node and create the following file:

$ cat <<EOF > rgw_spec
---
service_type: rgw
service_id: rgw
service_name: rgw.rgw
placement:
  hosts:
    - controller-0
    - controller-1
    - controller-2
networks:
  - 172.17.3.0/24
spec:
  rgw_frontend_port: 8080
  rgw_realm: default
  rgw_zone: default
EOF

Replace the network 172.17.3.0/24 with the subnet assigned to the Storage
network.
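
# Hedged helper, not part of the original workaround: if the Storage network
# subnet is managed as a Neutron subnet on the undercloud, its CIDR can be
# looked up with the command below. The subnet name "storage" is an assumption
# and may differ in your deployment.
$ openstack subnet show storage -c cidr -f value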

4. As the root user, run the cephadm shell with the spec file mounted, and apply
the spec that you created in the previous step:

$ cephadm shell -m rgw_spec
$ ceph orch apply -i /mnt/rgw_spec
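
# Optional check, an assumption not listed in the original steps: confirm that
# the orchestrator now reports the new rgw.rgw service from the applied spec.
$ cephadm shell -- ceph orch ls rgw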

5. Remove the adopted RGW daemons from the Ceph cluster:

$ for i in 0 1 2; do
ceph orch rm rgw.controller-$i;
done


6. As the root user, stop HAProxy so that it can point to the new Ceph RGW daemons:

$ pcs resource unmanage haproxy-bundle
$ pcs resource disable haproxy-bundle
$ pcs resource manage haproxy-bundle
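
# Optional check, an assumption not part of the original steps: confirm that
# the haproxy-bundle resource is stopped before continuing.
$ pcs status | grep -i haproxy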

7. Verify that the three RGW instances are up and running:

$ cephadm shell -- ceph orch ps | grep rgw
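
# Additional check, an assumption: on each Controller node, confirm that an RGW
# daemon is listening on the Storage network address and on the frontend port
# defined in the spec (8080 in this example).
$ ss -tlnp | grep 8080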

8. As the root user, re-enable HAProxy through Pacemaker:

$ pcs resource enable haproxy-bundle
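
# Final check, an assumption: run pcs status and confirm that the haproxy-bundle
# containers report a Started state on the Controller nodes.
$ pcs status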
````
Result:
Francesco Pantano 2023-07-21 11:23:25 UTC Flags needinfo?(kgilliga)
CC kgilliga
Giulio Fidente 2023-07-21 11:59:24 UTC Blocks 2160009
Depends On 2224351
Giulio Fidente 2023-07-21 12:00:03 UTC Priority unspecified high
Jenny-Anne Lynch 2023-07-26 09:44:26 UTC Keywords Documentation, Triaged
CC jelynch
Status NEW ON_DEV
Assignee rhos-docs jelynch
Khomesh Thakre 2023-07-26 10:01:57 UTC Flags needinfo?(fpantano)
Giulio Fidente 2023-07-26 12:08:22 UTC Target Milestone --- ga
Target Release --- 17.1
Francesco Pantano 2023-07-31 07:30:42 UTC Flags needinfo?(fpantano)
Francesco Pantano 2023-07-31 07:37:03 UTC Flags needinfo?(kgilliga)
Jenny-Anne Lynch 2023-08-02 10:43:53 UTC Doc Text Cause:
Due to bz#2224351, after the cephadm adoption, RGW instances are not bound to the storage network.

Consequence:
The FFU execution fails and HAProxy is unable to recover.

Workaround (if any):

The following procedure is the workaround. It assumes that the Ceph cluster has already been upgraded from Red Hat Ceph Storage 4 to 5 and adopted by cephadm.


````
1. Log in to the undercloud host as the stack user.
2. Source the stackrc undercloud credentials file:

$ source ~/stackrc

3. Log in to a Controller node and create the following file:

$ cat <<EOF > rgw_spec
---
service_type: rgw
service_id: rgw
service_name: rgw.rgw
placement:
  hosts:
    - controller-0
    - controller-1
    - controller-2
networks:
  - 172.17.3.0/24
spec:
  rgw_frontend_port: 8080
  rgw_realm: default
  rgw_zone: default
EOF

Replace the network 172.17.3.0/24 with the subnet assigned to the Storage
network.

4. As the root user, run the cephadm shell with the spec file mounted, and apply
the spec that you created in the previous step:

$ cephadm shell -m rgw_spec
$ ceph orch apply -i /mnt/rgw_spec

5. Remove the adopted RGW daemons from the Ceph cluster:

$ for i in 0 1 2; do
ceph orch rm rgw.controller-$i;
done


6. As the root user, stop HAProxy so that it can point to the new Ceph RGW daemons:

$ pcs resource unmanage haproxy-bundle
$ pcs resource disable haproxy-bundle
$ pcs resource manage haproxy-bundle

7. Verify that the three RGW instances are up and running:

$ cephadm shell -- ceph orch ps | grep rgw

8. As the root user, re-enable HAProxy through Pacemaker:

$ pcs resource enable haproxy-bundle
````
Result:
There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RGW is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat Knowledge-Centered Service (KCS) solution 7025985 - link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled]
Jenny-Anne Lynch 2023-08-03 10:11:36 UTC Summary After cephadm adoption, haproxy fails to start when RGW is deployed After cephadm adoption, HAProxy fails to start when RGW is deployed
Doc Text There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RGW is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat Knowledge-Centered Service (KCS) solution 7025985 - link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled] There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat Knowledge-Centered Service (KCS) solution 7025985 - link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled]
Jenny-Anne Lynch 2023-08-09 10:25:04 UTC Status ON_DEV RELEASE_PENDING
Jenny-Anne Lynch 2023-08-16 12:11:00 UTC Doc Text There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat Knowledge-Centered Service (KCS) solution 7025985 - link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled] There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from KCS solution 7025985: link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled]
Jenny-Anne Lynch 2023-08-16 12:12:36 UTC Doc Text There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from KCS solution 7025985: link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled] There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat KCS solution 7025985: link:https://access.redhat.com/solutions/7025985[HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled]
