Bug 2307256

Summary: [FFU] Ceph noout/norecover/etc flags are set during step 1, but were not unset at step-5
Product: Red Hat OpenStack
Reporter: Alex Stupnikov <astupnik>
Component: openstack-tripleo-heat-templates
Assignee: Manoj Katari <mkatari>
Status: CLOSED ERRATA
QA Contact: Alfredo <alfrgarc>
Severity: medium
Priority: medium
Version: 17.1 (Wallaby)
CC: fpantano, jbadiapa, lbezdick, mburns, mkatari, ramishra
Target Milestone: z5
Keywords: Triaged
Target Release: 17.1
Hardware: All
OS: All
Fixed In Version: openstack-tripleo-heat-templates-14.3.1-17.1.20240909080753.e7c7ce3.el9ost
Doc Type: No Doc Update
Last Closed: 2024-11-21 09:30:52 UTC
Type: Bug

Description Alex Stupnikov 2024-08-22 11:26:07 UTC
Description of problem:
A customer hit several problems during the upgrade and ended up with several Ceph flags still set [1] because TripleO had not removed them. We investigated and found logs [2] in the provided /home/stack/overcloud-deploy/ folder: the flags were set during step 1, but the tasks that unset them were skipped during step 5.

Could engineering take a look and help us understand the root cause? overcloud-deploy.tar.gz is attached to the case.


[1]
health: HEALTH_WARN
    noout,nobackfill,norebalance,norecover,nodeep-scrub flag(s) set
    Degraded data redundancy: 8/36663183 objects degraded (0.000%), 3 pgs degraded, 1 pg undersized
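
For triage, the flag list in the health output above can be extracted and turned into the matching unset commands. This is a minimal Python sketch, not a TripleO tool: the helper names flags_from_health and unset_commands are hypothetical, and it assumes the health text has the same "<flags> flag(s) set" shape as the excerpt. It only builds `ceph osd unset <flag>` argument lists and does not run them; whether unsetting flags by hand is appropriate, and how the ceph CLI is reached on a given deployment, is left to the operator.

```python
import re


def flags_from_health(text):
    """Return the flag names from a '<flags> flag(s) set' health line."""
    # Flags are comma-separated and may contain hyphens (e.g. nodeep-scrub).
    m = re.search(r"^\s*([\w,-]+) flag\(s\) set\s*$", text, re.MULTILINE)
    return m.group(1).split(",") if m else []


def unset_commands(flags):
    """Build one 'ceph osd unset <flag>' argument list per flag (not executed)."""
    return [["ceph", "osd", "unset", flag] for flag in flags]
```

Passing the text of [1] to flags_from_health and the result to unset_commands gives one command per flag that is still set.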

[2]
config-download/stdout:2024-08-19 10:16:48.713903 | b02628ea-a882-1146-5e25-000000000249 |       TASK | Set noout flag
config-download/stdout:2024-08-19 10:16:51.155642 | b02628ea-a882-1146-5e25-000000000249 |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=noout
config-download/stdout:2024-08-19 10:16:53.169812 | b02628ea-a882-1146-5e25-000000000249 |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=norecover
config-download/stdout:2024-08-19 10:16:55.181602 | b02628ea-a882-1146-5e25-000000000249 |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=nobackfill
config-download/stdout:2024-08-19 10:16:57.199063 | b02628ea-a882-1146-5e25-000000000249 |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=norebalance
config-download/stdout:2024-08-19 10:16:59.198725 | b02628ea-a882-1146-5e25-000000000249 |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=nodeep-scrub
config-download/stdout:2024-08-19 10:16:59.211422 | b02628ea-a882-1146-5e25-000000000249 |     TIMING | Set noout flag | HOSTNAME | 0:00:13.578213 | 10.50s
config-download/stdout:2024-08-19 10:38:25.155952 | b02628ea-a882-1146-5e25-00000000036d |       TASK | Unset noout flag
config-download/stdout:2024-08-19 10:38:25.186908 | b02628ea-a882-1146-5e25-00000000036d |    SKIPPED | Unset noout flag | HOSTNAME | item=noout
config-download/stdout:2024-08-19 10:38:25.201769 | b02628ea-a882-1146-5e25-00000000036d |    SKIPPED | Unset noout flag | HOSTNAME | item=norecover
config-download/stdout:2024-08-19 10:38:25.216525 | b02628ea-a882-1146-5e25-00000000036d |    SKIPPED | Unset noout flag | HOSTNAME | item=nobackfill
config-download/stdout:2024-08-19 10:38:25.232053 | b02628ea-a882-1146-5e25-00000000036d |    SKIPPED | Unset noout flag | HOSTNAME | item=norebalance
config-download/stdout:2024-08-19 10:38:25.242730 | b02628ea-a882-1146-5e25-00000000036d |    SKIPPED | Unset noout flag | HOSTNAME | item=nodeep-scrub
config-download/stdout:2024-08-19 10:38:25.246093 | b02628ea-a882-1146-5e25-00000000036d |     TIMING | Unset noout flag | HOSTNAME | 0:21:39.612885 | 0.09s
config-download/stdout:2024-08-19 10:38:30.116665 | b02628ea-a882-1146-5e25-000000000249 |    SUMMARY |    HOSTNAME | Set noout flag | 10.50s
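
The pattern in [2] can be checked mechanically. The following is a minimal Python sketch, not part of TripleO, that scans config-download/stdout lines and reports flags that were set but never unset. The function name leftover_flags is hypothetical, and treating CHANGED or OK as "the task ran" is an assumption; only CHANGED and SKIPPED appear in the excerpt above.

```python
import re

# Matches per-item result lines such as:
#   ... |    CHANGED | Set noout flag | HOSTNAME -> 10.128.8.171 | item=noout
#   ... |    SKIPPED | Unset noout flag | HOSTNAME | item=noout
LINE_RE = re.compile(
    r"\|\s+(?P<status>[A-Z]+)\s+\|\s+"
    r"(?P<task>Set|Unset) noout flag\s+\|.*\|\s+item=(?P<flag>[\w-]+)\s*$"
)

# Assumption: these statuses mean the ceph command was actually executed.
APPLIED = {"CHANGED", "OK"}


def leftover_flags(lines):
    """Return, sorted, the flags whose last applied action was 'Set'."""
    state = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m["status"] not in APPLIED:
            continue  # TASK/TIMING/SUMMARY lines and skipped items change nothing
        state[m["flag"]] = m["task"] == "Set"
    return sorted(flag for flag, is_set in state.items() if is_set)
```

Run against the lines in [2], this reports all five flags as left over, because every "Unset noout flag" item was SKIPPED.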


Version-Release number of selected component (if applicable): RHOSP 17


How reproducible: see the description


Actual results: After FFU, the Ceph cluster still has several flags set that prevent rebalancing


Expected results: The Ceph cluster is in a fully functional state after FFU

Comment 15 errata-xmlrpc 2024-11-21 09:30:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHOSP 17.1.4 (openstack-tripleo-heat-templates) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:9978