Bug 1853275 - When doing FFU with Ceph, we need to set noout,norebalance flags on the cluster before LEAPP reboots the node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Francesco Pantano
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-02 10:31 UTC by Giulio Fidente
Modified: 2020-10-27 15:47 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200616081536.396affd.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, director did not set the `noout` flag on Red Hat Ceph Storage OSDs before running a Leapp upgrade. As a result, additional time was required for the OSDs to rebalance after the upgrade.
With this update, director sets the `noout` flag before the Leapp upgrade, which accelerates the upgrade process. Director also unsets the `noout` flag after the Leapp upgrade.
Clone Of:
Environment:
Last Closed: 2020-08-27 15:19:10 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 739232 | 0 | None | MERGED | Set and then unset Ceph's noout flag before/after node is rebooted | 2021-02-05 15:38:01 UTC
OpenStack gerrit | 744018 | 0 | None | MERGED | [TRAIN-ONLY] Set the right container_client when set/unset noout | 2021-02-05 15:38:01 UTC
Red Hat Bugzilla | 1858278 | 0 | urgent | CLOSED | document setting Ceph OSD servers (and HCI) noout prior to LEAPP | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHBA-2020:3542 | 0 | None | None | None | 2020-08-27 15:19:28 UTC

Description Giulio Fidente 2020-07-02 10:31:29 UTC
When doing FFU with Ceph, operators need to set the noout and norebalance flags on the cluster manually before upgrading the OSD nodes, because LEAPP will reboot the nodes.

The flags need to be cleared before running the *converge* step.

To set the flags:

- log in to one of the controllers (before starting the cephstorage nodes upgrade or the computehci nodes upgrade)

- log into the ceph-mon container

  e.g. podman exec -ti ceph-mon-controller-0 /bin/bash

- set the flags

  ceph osd set noout
  ceph osd set norebalance
  exit (to exit from the container)
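
The same steps can also be run without an interactive shell in the container. The following is a minimal sketch, not the director implementation: it assumes the monitor container is named ceph-mon-controller-0 as in the example above (the real name depends on the controller hostname), and the MON_CONTAINER variable and the set_ceph_flags function are hypothetical names introduced here for illustration.

```shell
# Name of the ceph-mon container; assumption taken from the example above,
# override it to match the controller you are logged in to.
MON_CONTAINER="${MON_CONTAINER:-ceph-mon-controller-0}"

# Set noout and norebalance on the cluster, stopping at the first failure.
set_ceph_flags() {
    for flag in noout norebalance; do
        podman exec "$MON_CONTAINER" ceph osd set "$flag" || return 1
    done
}
```

Calling set_ceph_flags on a controller, before the first cephstorage or computehci node is upgraded, would then replace the manual login to the container.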

Comment 1 Giulio Fidente 2020-07-02 10:33:52 UTC
To clear (remove) the flags after the OSD nodes are upgraded and before converge:

- log in to one of the controllers (after the cephstorage nodes or the computehci nodes are upgraded, and before the converge step)

- log into the ceph-mon container

  e.g. podman exec -ti ceph-mon-controller-0 /bin/bash

- unset the flags

  ceph osd unset noout
  ceph osd unset norebalance
  exit (to exit from the container)
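
Clearing the flags can be scripted in the same way, with a check that they are really gone before converge is started. This is a hedged sketch under stated assumptions: the container name ceph-mon-controller-0 comes from the example above, the function and variable names are hypothetical, and the check assumes that `ceph osd dump` prints a line of the form "flags <comma-separated list>".

```shell
# Name of the ceph-mon container; assumption taken from the example above.
MON_CONTAINER="${MON_CONTAINER:-ceph-mon-controller-0}"

# Unset noout and norebalance, stopping at the first failure.
unset_ceph_flags() {
    for flag in noout norebalance; do
        podman exec "$MON_CONTAINER" ceph osd unset "$flag" || return 1
    done
}

# Return 0 if neither flag is set, 1 if one is still set,
# 2 if the cluster could not be queried.
ceph_flags_cleared() {
    dump=$(podman exec "$MON_CONTAINER" ceph osd dump) || return 2
    # Assumed format: a line such as "flags noout,sortbitwise,..."
    flags=$(printf '%s\n' "$dump" | awk '$1 == "flags" {print $2}')
    case ",$flags," in
        *,noout,*|*,norebalance,*) return 1 ;;
    esac
    return 0
}
```

Running unset_ceph_flags followed by ceph_flags_cleared, and starting converge only when the second call succeeds, guards against a flag being left set by mistake.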

Comment 26 Yogev Rabl 2020-08-07 16:23:52 UTC
Verified

Comment 31 errata-xmlrpc 2020-08-27 15:19:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3542

