Bug 2007298 - cephadm mds upgrade procedure is incomplete
Summary: cephadm mds upgrade procedure is incomplete
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 5.1
Assignee: Patrick Donnelly
QA Contact: Amarnath
Docs Contact: Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-09-23 14:13 UTC by Patrick Donnelly
Modified: 2022-04-04 10:22 UTC
CC List: 3 users

Fixed In Version: ceph-16.2.7-4.el8cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-04 10:21:43 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 52654 0 None None None 2021-09-23 14:13:59 UTC
Ceph Project Bug Tracker 53150 0 None None None 2021-11-05 19:06:10 UTC
Ceph Project Bug Tracker 53155 0 None None None 2021-11-05 19:06:10 UTC
Github ceph ceph pull 43214 0 None Draft pybind/mgr/cephadm: set allow_standby_replay during CephFS upgrade 2021-09-27 12:54:03 UTC
Red Hat Issue Tracker RHCEPH-1887 0 None None None 2021-09-25 08:12:53 UTC
Red Hat Product Errata RHSA-2022:1174 0 None None None 2022-04-04 10:22:07 UTC

Description Patrick Donnelly 2021-09-23 14:13:59 UTC
Description of problem:

See: https://tracker.ceph.com/issues/52654

In particular: standby_replay is not disabled, and the MDS pre-upgrade procedure is not followed for minor releases (which will be the case for 5.0->5.1).
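
For context, the manual pre-upgrade steps from the upstream CephFS upgrade procedure, which cephadm is expected to automate, look roughly like the following sketch ("cephfs" and max_mds=2 are example values, not taken from this report):

    # 1. Reduce the file system to a single active MDS and disable standby-replay
    ceph fs set cephfs max_mds 1
    ceph fs set cephfs allow_standby_replay false

    # 2. Wait for the cluster to deactivate any non-zero ranks
    ceph status

    # 3. Upgrade and restart the MDS daemons, then restore the original settings
    ceph fs set cephfs max_mds 2
    ceph fs set cephfs allow_standby_replay true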

Comment 1 Sebastian Wagner 2021-11-04 16:20:58 UTC
https://github.com/ceph/ceph/pull/43800 is related, right?

Comment 2 Patrick Donnelly 2021-11-05 19:05:31 UTC
Yes, it is.

Comment 7 Amarnath 2022-01-11 09:02:25 UTC
Hi @pdonnell 

The steps given in the doc below are now performed by cephadm itself.

https://docs.ceph.com/en/pacific/cephfs/upgrading/


I had one filesystem (cephfs) with 2 active MDS daemons and 1 standby.

I triggered the upgrade without manually scaling down the MDS daemons.

Do I need to validate anything else as part of this BZ?
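
One possible additional check (a suggestion only, not part of the original test) would be to confirm that cephadm restored max_mds and allow_standby_replay after the upgrade, for example:

    ceph fs get cephfs          # shows max_mds and flags for the file system
    ceph fs status cephfs       # shows active and standby-replay daemons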

Cephadm watch log snippet:
2022-01-11T03:26:08.351387-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Scaling down filesystem cephfs
2022-01-11T03:26:12.400376-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:12.400467-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:12.400515-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:12.434772-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Waiting for fs cephfs to scale down to reach 1 MDS
2022-01-11T03:26:25.446514-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:25.446563-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:25.446602-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:25.466194-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Waiting for fs cephfs to scale down to reach 1 MDS
2022-01-11T03:26:38.376810-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:38.376859-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:38.376898-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:26:38.403195-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: It appears safe to stop mds.cephfs.ceph-amk5-0-x4iy0t-node4.oznwkx
2022-01-11T03:26:38.851637-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Pulling registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:160faad614cca1a59e4277cb2390fe6f06590e8a2a30a9dcdcda066c5b13ce42 on ceph-amk5-0-x4iy0t-node4
2022-01-11T03:27:07.768027-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Updating mds.cephfs.ceph-amk5-0-x4iy0t-node4.oznwkx
2022-01-11T03:27:07.797611-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Deploying daemon mds.cephfs.ceph-amk5-0-x4iy0t-node4.oznwkx on ceph-amk5-0-x4iy0t-node4
2022-01-11T03:27:15.749383-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:15.749453-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:15.749496-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:15.774618-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: It appears safe to stop mds.cephfs.ceph-amk5-0-x4iy0t-node5.oqwcen
2022-01-11T03:27:17.221240-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Updating mds.cephfs.ceph-amk5-0-x4iy0t-node5.oqwcen
2022-01-11T03:27:17.252551-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Deploying daemon mds.cephfs.ceph-amk5-0-x4iy0t-node5.oqwcen on ceph-amk5-0-x4iy0t-node5
2022-01-11T03:27:29.348173-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:29.348330-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:29.348389-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:27:29.382209-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: It appears safe to stop mds.cephfs.ceph-amk5-0-x4iy0t-node6.qtibap
2022-01-11T03:27:29.949726-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Pulling registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:160faad614cca1a59e4277cb2390fe6f06590e8a2a30a9dcdcda066c5b13ce42 on ceph-amk5-0-x4iy0t-node6
2022-01-11T03:28:06.548533-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Updating mds.cephfs.ceph-amk5-0-x4iy0t-node6.qtibap
2022-01-11T03:28:06.581508-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Deploying daemon mds.cephfs.ceph-amk5-0-x4iy0t-node6.qtibap on ceph-amk5-0-x4iy0t-node6
2022-01-11T03:28:12.053903-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node1-installer: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:28:12.054020-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node3: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:28:12.054068-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Filtered out host ceph-amk5-0-x4iy0t-node2: does not belong to mon public_network (10.0.208.0/22)
2022-01-11T03:28:12.079396-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Setting container_image for all mds
2022-01-11T03:28:12.165909-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Scaling up filesystem cephfs max_mds to 2
2022-01-11T03:28:13.221406-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Setting container_image for all rgw
2022-01-11T03:28:13.244813-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Setting container_image for all rbd-mirror
2022-01-11T03:28:13.272128-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Setting container_image for all iscsi
2022-01-11T03:28:13.321711-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Setting container_image for all nfs
2022-01-11T03:28:13.353968-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Finalizing container_image settings
2022-01-11T03:28:13.638490-0500 mgr.ceph-amk5-0-x4iy0t-node1-installer.alzftf [INF] Upgrade: Complete!

Ceph version before upgrade:
[root@ceph-amk5-0-x4iy0t-node7 ~]# ceph versions
{
    "mon": {
        "ceph version 16.2.0-150.el8cp (a95dd77b7d968852ee415f1b7dfe08c89414c96a) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.0-150.el8cp (a95dd77b7d968852ee415f1b7dfe08c89414c96a) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.0-150.el8cp (a95dd77b7d968852ee415f1b7dfe08c89414c96a) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.0-150.el8cp (a95dd77b7d968852ee415f1b7dfe08c89414c96a) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.0-150.el8cp (a95dd77b7d968852ee415f1b7dfe08c89414c96a) pacific (stable)": 20
    }
}
[root@ceph-amk5-0-x4iy0t-node7 ~]# ceph -s
  cluster:
    id:     1e55c34c-72ad-11ec-8aa5-fa163e0355b1
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-amk5-0-x4iy0t-node1-installer,ceph-amk5-0-x4iy0t-node3,ceph-amk5-0-x4iy0t-node2 (age 67m)
    mgr: ceph-amk5-0-x4iy0t-node1-installer.alzftf(active, since 70m), standbys: ceph-amk5-0-x4iy0t-node2.sdlbaf
    mds: 2/2 daemons up, 1 standby
    osd: 12 osds: 12 up (since 65m), 12 in (since 66m)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 41 objects, 3.6 KiB
    usage:   70 MiB used, 180 GiB / 180 GiB avail
    pgs:     65 active+clean

Ceph version after upgrade:

[root@ceph-amk5-0-x4iy0t-node7 ~]# ceph versions
{
    "mon": {
        "ceph version 16.2.7-18.el8cp (86cfa49b08da370afb3f98be618f2c3c1eae71fb) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-18.el8cp (86cfa49b08da370afb3f98be618f2c3c1eae71fb) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.7-18.el8cp (86cfa49b08da370afb3f98be618f2c3c1eae71fb) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.7-18.el8cp (86cfa49b08da370afb3f98be618f2c3c1eae71fb) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.7-18.el8cp (86cfa49b08da370afb3f98be618f2c3c1eae71fb) pacific (stable)": 20
    }
}
[root@ceph-amk5-0-x4iy0t-node7 ~]# ceph -s
  cluster:
    id:     1e55c34c-72ad-11ec-8aa5-fa163e0355b1
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-amk5-0-x4iy0t-node1-installer,ceph-amk5-0-x4iy0t-node3,ceph-amk5-0-x4iy0t-node2 (age 108s)
    mgr: ceph-amk5-0-x4iy0t-node1-installer.alzftf(active, since 24m), standbys: ceph-amk5-0-x4iy0t-node2.sdlbaf
    mds: 2/2 daemons up, 1 standby
    osd: 12 osds: 12 up (since 21m), 12 in (since 94m)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 41 objects, 7.5 KiB
    usage:   89 MiB used, 180 GiB / 180 GiB avail
    pgs:     65 active+clean
 
[root@ceph-amk5-0-x4iy0t-node7 ~]#

Comment 11 errata-xmlrpc 2022-04-04 10:21:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1174

