Bug 2255324
Summary: | After OSP update or FFU (16.2->17.1) manila CephFS NFS shares are not accessible | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Itzik Brown <itbrown> |
Component: | tripleo-ansible | Assignee: | OpenStack Manila Bugzilla Bot <openstack-manila-bugs> |
Status: | CLOSED ERRATA | QA Contact: | Alfredo <alfrgarc> |
Severity: | urgent | Docs Contact: | RHOS Documentation Team <rhos-docs> |
Priority: | high | ||
Version: | 17.1 (Wallaby) | CC: | alfrgarc, anbs, ashrodri, astupnik, dhughes, fpantano, gouthamr, gregraka, imatza, jamsmith, jbadiapa, jelynch, jjoyce, jjung, lsvaty, mariel, mburns, mgarciac, pgrist |
Target Milestone: | z3 | Keywords: | AutomationBlocker, Triaged |
Target Release: | 17.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | tripleo-ansible-3.3.1-17.1.20231101230824.el9ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2024-01-29 14:36:43 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Itzik Brown
2023-12-20 06:15:46 UTC
The setup is using HCI.

Adding some more comments here: we were considering whether this needs to be a blocker bug. After the failure occurred, I saw that the RADOS object that contains all the export object entries (%url rados://manila_data/ganesha-export-index) was erased/reset. We have some code that deals with this object:
https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/roles/tripleo_cephadm/tasks/nfs.yaml#L26-L44

I see that the tripleo step that checks for the object omits "become: true" (I suspect this command could fail silently). We'll try to capture logs from this playbook in a re-run of the test.

Adding the steps to apply the workaround:

Prior to running the overcloud FFU (https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html-single/framework_for_upgrades_16.2_to_17.1/index#upgrading-a-standard-overcloud_upgrading-overcloud-standard), run the following on the controller node that contains the "ceph-nfs-pacemaker" service:

# podman exec ceph-nfs-pacemaker rados -n client.manila -p manila_data get ganesha-export-index export_index_backup.txt

You may inspect the data in the "export_index_backup.txt" file. If you had manila shares created, you will have one or more lines in this file, each containing a RADOS URL to export information. This export information exists on RADOS and is not affected by this bug.

Once the FFU is complete, and prior to proceeding to the system upgrade steps, ensure that the ganesha-export-index is recreated:

# podman exec ceph-nfs-pacemaker rados -n client.manila -p manila_data put ganesha-export-index export_index_backup.txt

Verify that the object exists and that its contents match the backup:

# podman exec ceph-nfs-pacemaker rados -n client.manila -p manila_data get ganesha-export-index -

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2024:0547
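
For anyone applying the workaround above by hand, the final "contents match" check can be scripted. The following is only a sketch and is not part of the fix or of any Red Hat tooling. The function names and the host-side file path (`BACKUP_FILE`) are hypothetical, and it assumes the `rados ... get ganesha-export-index -` form from the verification step prints the object to standard output, so that a second copy can be kept on the host and compared later.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the commands quoted in the workaround.
# CONTAINER comes from the bug report; BACKUP_FILE is an illustrative
# host-side path and is NOT the in-container export_index_backup.txt.

CONTAINER="ceph-nfs-pacemaker"
BACKUP_FILE="${BACKUP_FILE:-./export_index_backup.host.txt}"

# Print the current contents of the ganesha-export-index object to stdout
# (same command as the verification step in the workaround).
read_export_index() {
    podman exec "$CONTAINER" rados -n client.manila -p manila_data \
        get ganesha-export-index -
}

# Before the FFU: keep an extra copy of the index on the host.
# The temporary file avoids leaving a truncated copy if the read fails.
save_host_copy() {
    read_export_index > "${BACKUP_FILE}.tmp" || return 1
    mv "${BACKUP_FILE}.tmp" "$BACKUP_FILE"
}

# After the "rados put" step: succeed only if the object can be read
# and its contents equal the host-side copy.
verify_export_index() {
    local current saved
    current="$(read_export_index)" || return 1
    saved="$(cat "$BACKUP_FILE")" || return 1
    [ "$current" = "$saved" ]
}
```

Run `save_host_copy` alongside the backup step before the FFU, then `verify_export_index && echo "index restored"` after the `put` step. A non-zero exit status means the object is missing or its contents differ from the copy taken before the upgrade.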