Bug 1892206 - [GSS] Ceph image/version mismatch
Summary: [GSS] Ceph image/version mismatch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: build
Version: 4.6
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: OCS 4.6.0
Assignee: Boris Ranto
QA Contact: Oded
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-28 07:58 UTC by Bipin Kunal
Modified: 2021-08-17 17:40 UTC
CC List: 11 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-17 06:25:20 UTC
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2020:5605  0        None      None    None     2020-12-17 06:25:41 UTC

Description Bipin Kunal 2020-10-28 07:58:51 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The Ceph version, and the corresponding image, differs between the rook-ceph-operator/toolbox pods and the OSD/MON/MGR pods.


Version of all relevant components (if applicable):
OCS-4.5.1
OCP-4.5.16


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
No, but this can lead to confusion while debugging issues.


Is there any workaround available to the best of your knowledge?
Nothing I am aware of.


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1 - very simple

Is this issue reproducible?
Yes. Deploy a fresh OCS-4.5 cluster.


Can this issue reproduce from the UI?
No


If this is a regression, please provide more details to justify this:
Not sure

Steps to Reproduce:
1. Install OCP-4.5
2. Install OCS-4.5
3. Check the Ceph version and Ceph image for rook-ceph-operator, toolbox, OSD, MON, MGR, etc.


Actual results:
--------------
$ oc rsh rook-ceph-mon-a-69fdf4544b-4965x
sh-4.4# ceph -v
ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)
--------------
$ oc rsh rook-ceph-mgr-a-68544f48bc-4qscx
sh-4.4# ceph -v
ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)
--------------
$ oc rsh rook-ceph-osd-0-f79777f9-hp9p6
sh-4.4# ceph -v
ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)
--------------
$ oc rsh rook-ceph-tools-6658bc55fb-gqb2w
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
--------------
$ oc rsh rook-ceph-operator-677cfd7cf8-67pjt
sh-4.4$ ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
--------------

Here rook-ceph-operator and rook-ceph-tools pods show the version 14.2.8-111.el8cp whereas MON/OSD/MGR pods show the version 14.2.8-91.el8cp
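
A minimal sketch of how the transcript above could be summarized, assuming the `ceph -v` lines have already been collected per pod (for example by hand from `oc rsh`). The helper name `group_by_version` and the regular expression are illustrative assumptions, not part of any OCS or Rook tooling:

```python
import re
from collections import defaultdict

# Captures the version-release token in `ceph -v` output, e.g. "14.2.8-91.el8cp".
VERSION_RE = re.compile(r"ceph version (\S+)")


def group_by_version(outputs):
    """Map each reported Ceph version to the sorted list of pods reporting it."""
    groups = defaultdict(list)
    for pod, line in outputs.items():
        match = VERSION_RE.search(line)
        if match is None:
            raise ValueError(f"unrecognized `ceph -v` output for {pod}: {line!r}")
        groups[match.group(1)].append(pod)
    return {version: sorted(pods) for version, pods in groups.items()}


# Output copied from the transcript above, keyed by pod name.
observed = {
    "rook-ceph-mon-a-69fdf4544b-4965x": "ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)",
    "rook-ceph-mgr-a-68544f48bc-4qscx": "ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)",
    "rook-ceph-osd-0-f79777f9-hp9p6": "ceph version 14.2.8-91.el8cp (75b4845da7d469665bd48d1a49badcc3677bf5cd) nautilus (stable)",
    "rook-ceph-tools-6658bc55fb-gqb2w": "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)",
    "rook-ceph-operator-677cfd7cf8-67pjt": "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)",
}

groups = group_by_version(observed)
for version, pods in groups.items():
    print(version, pods)
if len(groups) > 1:
    print("Mismatch: pods report", len(groups), "different Ceph versions")
```

More than one key in `groups` means the pods disagree, which is the condition this bug describes.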


Expected results:
All the pods should display the same ceph version.

Additional info:

Image information:
--------------

  *MON*
    
    Container ID:  cri-o://83e1a1dc2624af1aab5d86c97968ad7e6f1ffb4b50fc89a947db80326db4bc81
    Image:         registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8
    Image ID:      registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:3def885ad9e8440c5bd6d5c830dafdd59edf9c9e8cce0042b0f44a5396b5b0f6
--------------

  *MGR*
  
    Container ID:  cri-o://7c2ef4c663cdb573d9c7475139779a55024e7252dede53502ef73e22b7dd3e7b
    Image:         registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8
    Image ID:      registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:3def885ad9e8440c5bd6d5c830dafdd59edf9c9e8cce0042b0f44a5396b5b0f6
--------------

  *OSD*

    Container ID:  cri-o://ad61eaa5b7ef0df256f16305adf9b3899297944ae96c1c5bf5692a727e02ef14
    Image:         registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8
    Image ID:      registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:3def885ad9e8440c5bd6d5c830dafdd59edf9c9e8cce0042b0f44a5396b5b0f6
--------------

  *rook-ceph-tools:*
    Container ID:  cri-o://d3b356466a0a2e6af16a52555845dadf21e30b9968ca7f7c6a617023fb6ae3b1
    Image:         registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4
    Image ID:      registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4
--------------

  *rook-ceph-operator*
  
    Container ID:  cri-o://81f4a11afdbb4ec6ea751c4fe6a89a868d72f37bdc5401bb543100251a34593c
    Image:         registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4
    Image ID:      registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4
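
The image information above can be grouped by repository to make the split visible. This is only a sketch: `split_image_ref` is a hypothetical helper, and it assumes every reference is digest-pinned in the `repository@sha256:...` form shown in the Image fields:

```python
def split_image_ref(ref):
    """Split 'repository@sha256:<hex>' into (repository, digest)."""
    repository, sep, digest = ref.partition("@")
    if not sep or not digest.startswith("sha256:"):
        raise ValueError(f"not a digest-pinned image reference: {ref!r}")
    return repository, digest


# "Image:" values copied from the listing above, keyed by component.
images = {
    "MON": "registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8",
    "MGR": "registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8",
    "OSD": "registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:eafd1acb0ada5d7cf93699056118aca19ed7a22e4938411d307ef94048746cc8",
    "rook-ceph-tools": "registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4",
    "rook-ceph-operator": "registry.redhat.io/ocs4/rook-ceph-rhel8-operator@sha256:b9fd6c06423d6fbe213837089cac9a68cc0fb431c1b5c2b4fcf2cf6d19a910a4",
}

# Group components by the repository their image comes from.
by_repo = {}
for component, ref in images.items():
    repository, _digest = split_image_ref(ref)
    by_repo.setdefault(repository, []).append(component)

for repository, components in by_repo.items():
    print(repository, components)
```

The grouping matches the version split: the components on the rhceph-4-rhel8 image report one Ceph build, and the components on the rook-ceph-rhel8-operator image report the other.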

Comment 2 Raz Tamir 2020-10-28 08:06:42 UTC
Marking as a regression - https://bugzilla.redhat.com/show_bug.cgi?id=1754892

Comment 4 Yaniv Kaul 2020-10-28 08:08:03 UTC
To the best of my knowledge, this has zero user impact, so I am lowering the severity. Please correct me if you see any impact.

Comment 7 Jose A. Rivera 2020-10-29 15:55:50 UTC
The Rook-Ceph Pods use a different image than the Ceph Pods: registry.redhat.io/ocs4/rook-ceph-rhel8-operator vs registry.redhat.io/rhceph/rhceph-4-rhel8. Any discrepancy in Ceph versions between the two is down to the DS build process and what dependencies the Rook-Ceph project pulls in.

That said, if there is no technical problem, then I don't think this matters. If there were a significant difference, we would probably already have made sure both were at versions recent enough to remove the problem.

Moving this to the rook component and OCS 4.7 in case there's further need for discussion.

Comment 8 Sébastien Han 2020-10-30 13:14:04 UTC
Just like Jose said, the mismatch is on the DS build.
Nothing Rook can do at this point.

Also, I don't know how the build can always guarantee the exact same version between the Ceph image and Operator image Ceph packages.
If you update the operator image, you might just get newer Ceph packages and the running cluster might have "older" pin-point release packages.

Typically, newer Ceph packages are not an issue, since they are compatible with earlier versions.
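
To check which side carries the newer packages, the two version strings from the report can be compared numerically. This is a rough sketch, assuming the `<upstream>-<release>.<dist>` shape seen in this bug; `parse_ceph_version` is a hypothetical helper and is not a full implementation of RPM version comparison:

```python
import re


def parse_ceph_version(version):
    """Turn '14.2.8-91.el8cp' into ((14, 2, 8), 91) so it can be ordered."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-(\d+)\.(\S+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    upstream = tuple(int(part) for part in match.group(1).split("."))
    release = int(match.group(2))
    return upstream, release


daemons = parse_ceph_version("14.2.8-91.el8cp")    # MON/MGR/OSD pods
operator = parse_ceph_version("14.2.8-111.el8cp")  # operator and toolbox pods

same_upstream = daemons[0] == operator[0]
operator_is_newer = operator > daemons

print("same upstream version:", same_upstream)
print("operator image has the newer build:", operator_is_newer)
```

Comparing as integers matters here: a plain string comparison would place "111" before "91" and report the wrong side as newer.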

Comment 9 Michael Adam 2020-11-03 16:40:34 UTC
@Boris: We suspect a minor issue with the downstream builds: the rook-ceph-operator shows a different ceph build version (14.2.8-111) than the ceph component containers (osd/mon/mgr...) (14.2.8-91).

Can you comment?

Comment 10 Boris Ranto 2020-11-03 17:17:14 UTC
We run dnf update -y in the rook Dockerfile to pull in any security fixes. However, the rook build also consumes the rhceph-4 repos, so that command updates the ceph packages in that container as well. We can remove that line, but then we won't get security fixes for the other packages anymore. Maybe we could add some --exclude (or --disablerepo) flags to the command to avoid updating the ceph packages.

Comment 11 Boris Ranto 2020-11-03 23:30:54 UTC
I pushed a fix for this issue to ocs-4.4 (and onwards) dist-git branches. Do we want to do another RC of OCS 4.5.2 for this? It would be a relatively simple rebuild (we would just have to rebuild rook and the operator bundle).

btw: We are still doing security updates in rook, I just disabled the rhceph repos to prevent the ceph packages from being updated.
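
For illustration only, the shape of such an update command can be sketched as follows. The actual dist-git change is not shown in this bug, so the repo id below is a placeholder and `build_update_command` is a hypothetical helper; only the `dnf update -y`, `--disablerepo` and `--exclude` options themselves are taken from the discussion above:

```python
import shlex


def build_update_command(disabled_repos=(), excluded_packages=()):
    """Build a `dnf update -y` argv that skips the given repos or packages."""
    cmd = ["dnf", "update", "-y"]
    cmd += [f"--disablerepo={repo}" for repo in disabled_repos]
    cmd += [f"--exclude={pkg}" for pkg in excluded_packages]
    return cmd


# "example-rhceph-4-repo" is a placeholder repo id, not the real one used in the build.
cmd = build_update_command(disabled_repos=["example-rhceph-4-repo"])
print(shlex.join(cmd))
```

Disabling the repos keeps security updates flowing for every other package, while the ceph packages stay at whatever version the base image already ships.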

Comment 12 Boris Ranto 2020-11-04 11:42:35 UTC
This should already be fixed in 4.6.0 in the latest build (it did rebuild rook).

Do we know if we want to target 4.5.2, too? Anyway, this would be fixed by any 4.5 rebuild (i.e. even in 4.5.3 if we ever release).

Comment 14 Raz Tamir 2020-11-12 11:36:03 UTC
If the fix is already in 4.6, can I retarget this bug to 4.6?
Or is another RC needed? If so, the fix will go in only if one more RC is required.

Comment 21 Oded 2020-11-22 10:42:47 UTC
Bug fixed: all the pods display the same Ceph version.


Setup:
Provider: Vmware_Dynamic
OCP Version: 4.6.0-0.nightly-2020-11-21-194817
OCS Version: ocs-operator.v4.6.0-160.ci


Test Process:
1. Check the Ceph version on all relevant components.
============================================================================================
$ oc rsh rook-ceph-mon-a-74b5fcb97d-s972f
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-mgr-a-5f8695cc48-7j7l7
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-mon-b-54fdc9c889-wj47l
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-mon-c-8bbd8f9d4-rdp7l
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-osd-0-75cb6d6fc8-6gvnk
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-osd-1-55545b4fcc-4m6t2
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-osd-2-5bbd9d949b-7fvzj
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-tools-85dc5f7bc8-tnlqj
sh-4.4# ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================
$ oc rsh rook-ceph-operator-54f449df55-bzwrq
sh-4.4$ ceph -v
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
============================================================================================

Comment 23 errata-xmlrpc 2020-12-17 06:25:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605

