Bug 1405881 - yum update failed on controller and compute nodes when the ceph-osd repo is not enabled
Summary: yum update failed on controller and compute nodes when the ceph-osd repo is not enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z5
Target Release: 10.0 (Newton)
Assignee: Alan Bishop
QA Contact: Yogev Rabl
Docs Contact: Don Domingo
URL:
Whiteboard:
Duplicates: 1421303 1619472
Depends On:
Blocks: 1421303 1481685 1493744
 
Reported: 2016-12-19 03:35 UTC by Chen
Modified: 2021-12-10 15:12 UTC
CC List: 33 users

Fixed In Version: openstack-tripleo-heat-templates-5.3.0-6.el7ost
Doc Type: Bug Fix
Doc Text:
The ceph-osd package is only available in a repository that requires a special entitlement that is otherwise not required on OpenStack controller and compute nodes. However, the ceph-osd package is part of the common overcloud image, and its presence creates an RPM dependency problem when it cannot be updated along with the rest of the Ceph packages. As such, yum updates fail on nodes that do not have the ceph-osd entitlement, even though they do not require the ceph-osd package. Now, prior to performing the yum update, the ceph-osd package is removed from overcloud nodes that do not require the package. The ceph-osd package is only required on Ceph storage nodes (including hyperconverged nodes running Ceph OSD and Compute services). Yum updates succeed on nodes that do not require the ceph-osd package. Ceph storage and hyperconverged nodes that require the ceph-osd package will still require the necessary Ceph OSD entitlement.
Clone Of:
Clones: 1489484 1489490 1493744
Environment:
Last Closed: 2017-09-28 16:35:22 UTC
Target Upstream Version:
Embargoed:


Attachments
proposed heat template for deleting ceph-osd from compute and controller (608 bytes, text/plain)
2017-07-13 13:52 UTC, Don Domingo


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1713292 0 None None None 2017-08-28 13:37:54 UTC
OpenStack gerrit 496921 0 None MERGED Maintain ceph-osd package only on nodes hosting CephOSD service 2020-11-22 17:20:11 UTC
Red Hat Issue Tracker OSP-4587 0 None None None 2021-12-10 15:12:58 UTC
Red Hat Knowledge Base (Solution) 2196011 0 None None None 2017-07-13 14:40:25 UTC
Red Hat Product Errata RHBA-2017:2825 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 director Bug Fix Advisory 2017-09-28 20:33:35 UTC

Description Chen 2016-12-19 03:35:27 UTC
Description of problem:

yum update fails on controller and compute nodes when the ceph-osd repo is not enabled.

Because we use the same image for compute, controller, and ceph nodes, the ceph-osd package is pre-installed on compute and controller nodes from the beginning. When we update the system with "yum update", the ceph-osd package needs to be updated along with the rest of the Ceph packages due to RPM dependencies. However, the OpenStack subscription does not provide the Ceph OSD entitlement, so ceph-osd cannot be updated. As a result, the whole yum update fails.
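
A quick way to confirm the condition on an affected controller or compute node (a minimal sketch using standard RHEL 7 commands; nothing here is specific to the eventual fix):

  # ceph-osd is pre-installed from the common overcloud image
  rpm -q ceph-osd

  # but none of the enabled repositories on these nodes provides it
  subscription-manager repos --list-enabled | grep -i ceph

  # so a full update hits unresolvable dependencies on ceph-osd
  yum update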

Version-Release number of selected component (if applicable):

OSP7

How reproducible:

100%

Steps to Reproduce:
1. Deploy an overcloud
2. Subscribe the node with OpenStack subscription
3. Run yum update
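
A minimal command sketch of steps 2 and 3 on an overcloud node (the pool ID is a placeholder and the repo IDs are illustrative OSP 7 ones; the Ceph OSD repo is deliberately left disabled, which is what triggers the failure):

  subscription-manager register --username <user> --password <password>
  subscription-manager attach --pool=<openstack_pool_id>
  subscription-manager repos --disable='*' \
      --enable=rhel-7-server-rpms \
      --enable=rhel-7-server-openstack-7.0-rpms
  yum update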

Actual results:

Due to the lack of the ceph-osd repo, yum update fails.

Expected results:

yum update should succeed

Additional info:

Comment 14 Jeremy 2017-05-19 20:12:42 UTC
Another customer is seeing this in OSP10. ceph-osd is installed on the controller image; however, it is provided by the rhel-7-server-rhceph-2-osd-rpms repository, and the docs say that repo should be enabled on Ceph nodes, not controller nodes. So the overcloud update fails unless the ceph-osd package is removed from the controller, or the ceph-osd repo is added. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/upgrading_red_hat_openstack_platform/#sect-Repository_Requirements
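
The two remedies mentioned above, as commands on the affected controller (the repo ID is the one named in this comment; enabling it requires the Ceph OSD entitlement, which is exactly what these customers lack):

  # option 1: drop the unneeded package before updating
  yum -y remove ceph-osd

  # option 2: enable the repository that provides ceph-osd updates
  subscription-manager repos --enable=rhel-7-server-rhceph-2-osd-rpms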

Comment 16 Pablo Iranzo Gómez 2017-07-12 06:32:58 UTC
Adding another case on OSP10, same as https://bugzilla.redhat.com/show_bug.cgi?id=1405881#c14.

The customer cannot see the Ceph OSD repository, so they cannot proceed as the guide suggests.

Comment 20 Sofer Athlan-Guyot 2017-07-13 11:17:56 UTC
Hi Giulio,

well, the ideal would be to have a set of specific images for this, but given the complexity of maintaining those[1], I think the best course of action is to document the whole thing. Something along the lines of:

"If you don't use Ceph, you still need a Ceph registration before update and upgrade. You can obtain a temporary registration from ?"

I don't know the details here, but we need some warning like this before upgrade and update.

I think the consensus will be to get this documented. Giulio, do you want to create the associated BZ, or should I do it? (I'm not that familiar with the registration process, so I may not be the right person here.)

Thanks,

[1] Mike Burns may have more to say here.

Comment 21 Giulio Fidente 2017-07-13 11:23:57 UTC
Sofer, yeah, in theory each image should only bundle what is needed, but with composable roles we would need to install packages on the fly on any given node while the overcloud deployment is in progress, which I think we don't want to do. As a result we end up with everything everywhere; for example, haproxy is installed on the compute nodes, but that happens to work because the required repos are active on the compute node by default.

I think we might change this bug into a doc bug, adding Don Domingo to see if he has ideas.

Comment 23 Don Domingo 2017-07-13 13:52:35 UTC
Created attachment 1297622 [details]
proposed heat template for deleting ceph-osd from compute and controller

Add this via:

resource_registry:
  OS::TripleO::NodeExtraConfigPost: /home/stack/templates/bz1405881-delete-ceph-osd.yaml
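
The attachment itself is not reproduced here. A sketch of the kind of script such an OS::TripleO::NodeExtraConfigPost template would wrap in an OS::Heat::SoftwareConfig (group: script); the hostname check mirrors the one discussed in comment 24 and is illustrative only:

  #!/bin/bash
  # Remove ceph-osd from nodes that do not host the Ceph OSD service.
  # Hyperconverged deployments should drop the 'compute' match (see comment 24).
  if hostname | grep -qE 'controller|compute'; then
      yum -y remove ceph-osd
  fi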

Comment 24 Don Domingo 2017-07-13 13:54:45 UTC
Hi Giulio,

From what I can tell, it's the unnecessary ceph-osd package on the Compute and Controller nodes that is causing the problem. As a workaround, can we just instruct users to delete the package from those nodes via a heat template during post-deployment? Something like what I've attached in comment 23, would that work?

If that's the way we want to go about solving this, then it makes sense to also document this in the Director guide. AFAICT users will never need ceph-osd on the Controller and Compute nodes anyway (unless they're deploying hyperconverged compute, in which case we just ask them to remove the hostname check for 'compute'). As such, removing the package should make those nodes a bit cleaner, and doing it via director means it'll be persistent throughout overcloud updates too. Does that sound like a fair assessment?

Comment 38 Alan Bishop 2017-08-14 12:19:35 UTC
Hi Irina,

This is still a work in progress, and the final resolution has not yet been determined.

Comment 52 Irina Petrova 2017-09-04 09:30:34 UTC
Hi Alan,

Can you provide a build? We have an HF request already.

Comment 53 Alan Bishop 2017-09-05 13:11:49 UTC
Hi Irina,

I won't be the one to decide when the solution is ready for release. Right now I'm concentrating on getting the change approved upstream, after which I can begin a series of downstream backports. QE has done some initial testing, but possibly not on the version that you require. What version are you interested in?

Comment 54 Irina Petrova 2017-09-06 08:57:15 UTC
Hi Alan.

Thank you for the update. 

> What version are you interested in?
It's Ocata / OSP-11.

Comment 58 Yogev Rabl 2017-09-26 02:04:08 UTC
Verified on update & upgrade of OSP 10.

Comment 60 errata-xmlrpc 2017-09-28 16:35:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2825

Comment 61 Sofer Athlan-Guyot 2018-03-28 13:51:09 UTC
*** Bug 1421303 has been marked as a duplicate of this bug. ***

Comment 62 Alan Bishop 2018-09-17 17:17:02 UTC
*** Bug 1619472 has been marked as a duplicate of this bug. ***

