Description of problem:
yum update fails on controller and compute nodes when the ceph-osd repo is not enabled. Because the same image is used for compute, controller, and ceph nodes, the ceph-osd package is pre-installed on compute and controller nodes from the beginning. When we update the system with "yum update", ceph-osd needs to be updated because of a dependency. However, the OpenStack subscription doesn't provide the Ceph OSD entitlement, so ceph-osd cannot be updated. As a result, the whole yum update fails.

Version-Release number of selected component (if applicable):
OSP7

How reproducible:
100%

Steps to Reproduce:
1. Deploy an overcloud
2. Subscribe the node with the OpenStack subscription
3. Run yum update

Actual results:
yum update fails because the ceph-osd repo is not available

Expected results:
yum update should succeed

Additional info:
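The condition described in this report (ceph-osd installed, no OSD repo enabled) can be checked on a node with a small sketch like the one below. The function names are hypothetical, and the pattern 'ceph.*osd' is only an assumption about how the OSD repo id is named on a given release.

```shell
#!/bin/sh
# Sketch only: detect the failing condition on one node.
# Function names are hypothetical; 'ceph.*osd' is an assumed repo id pattern.

# Reads `yum repolist enabled` style text on stdin.
# Succeeds if a line that looks like a Ceph OSD repo is present.
osd_repo_enabled() {
    grep -Eq 'ceph.*osd'
}

# Prints a verdict from two facts:
#   $1 = ceph-osd installed (yes/no), $2 = OSD repo enabled (yes/no)
update_verdict() {
    if [ "$1" = "yes" ] && [ "$2" = "no" ]; then
        echo "at risk: ceph-osd installed but no OSD repo enabled"
    else
        echo "ok"
    fi
}

# Example use on a node (commented out so the sketch has no side effects):
#   installed=no; rpm -q ceph-osd >/dev/null 2>&1 && installed=yes
#   enabled=no;   yum repolist enabled 2>/dev/null | osd_repo_enabled && enabled=yes
#   update_verdict "$installed" "$enabled"
```

A node that prints the "at risk" verdict is one where "yum update" would be expected to hit the problem reported here.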
Another customer is seeing this in OSP10. ceph-osd is installed on the controller image, but it is provided by the rhel-7-server-rhceph-2-osd-rpms repository. The docs say that this repo should be enabled on Ceph nodes, not on controller nodes. So the overcloud update fails unless the ceph-osd package is removed from the controller or the ceph-osd repo is added. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/upgrading_red_hat_openstack_platform/#sect-Repository_Requirements
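The two options mentioned in this comment (remove the package, or enable the OSD repo) might look roughly like this on an affected node. This is a sketch, not a tested procedure: the function names and mode strings are hypothetical, and enabling the repo can only work if the attached subscription actually provides it.

```shell
#!/bin/sh
# Sketch only: the two manual workarounds for an affected node.
# workaround_cmd / run_workaround and the mode names are hypothetical.

OSD_REPO="rhel-7-server-rhceph-2-osd-rpms"   # repo id named in this comment (OSP10)

# Print the command for the chosen workaround without running it.
workaround_cmd() {
    case "$1" in
        remove)      echo "yum -y remove ceph-osd" ;;
        enable-repo) echo "subscription-manager repos --enable=${OSD_REPO}" ;;
        *)           echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Run the chosen workaround (needs root on the node).
run_workaround() {
    cmd="$(workaround_cmd "$1")" || return 1
    echo "running: ${cmd}"
    $cmd
}
```

For example, `run_workaround remove` on a controller removes the package so that a later "yum update" no longer needs the OSD repo. A manual removal like this is a one-off fix on that node only.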
Adding another case on OSP10, as in https://bugzilla.redhat.com/show_bug.cgi?id=1405881#c14. The customer cannot see the Ceph OSD repository, so they cannot proceed as the guide suggests.
Hi Giulio, well, the ideal solution would be to have a set of specific images for this, but given the complexity of maintaining those[1], I think the best course of action is to document the whole thing. Something along the lines of: "If you don't use Ceph, you still need a Ceph registration before update and upgrade. You can get a temporary registration from ?" I don't know the details here, but we need some warning like this before upgrade and update. I think the consensus will be to get those docs in. Giulio, do you want to create the associated bz, or should I do it? (I'm not that familiar with the registration process, so I may not be the right person here.) Thanks, [1] Mike Burns may have more to say here.
Sofer, yeah, in theory each image should only bundle what is needed, but for composable roles we would then need to install packages on the fly on any given node while the overcloud deployment is in progress, which I think we don't want to do. As a result we end up with everything everywhere; for example, haproxy is installed on the compute nodes. That happens to work because the required repos are active on the compute nodes by default. I think we might change this bug into a doc bug. Adding Don Domingo to see if he has ideas.
Created attachment 1297622 [details]
proposed heat template for deleting osd-ceph from compute and controller

Add this via:

resource_registry:
  OS::TripleO::NodeExtraConfigPost: /home/stack/templates/bz1405881-delete-ceph-osd.yaml
Hi Giulio, From what I can tell, it's the unnecessary ceph-osd package on the Compute and Controller nodes that is causing the problem. As a workaround, can we just instruct users to delete the package from those nodes via heat template during post? Something like what I've attached in Comment#23, would that work? If that's the way we want to go about solving this, then it makes sense to also document this in the Director guide. AFAICT users will never need ceph-osd on the Controller and Compute nodes anyway (unless they're deploying hyperconverged compute, in which case we just ask them to remove the hostname check for 'compute'). As such, removing the package should make those nodes a bit cleaner, and doing it via director means it'll be persistent throughout overcloud updates too. Does that sound like a fair assessment?
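The attached template itself is not reproduced in this report, so the following is only a hypothetical sketch of the kind of script such a post-deployment hook might run on each node. The function names and the hostname substrings are assumptions; the 'compute' check corresponds to the one that hyperconverged deployments would drop.

```shell
#!/bin/sh
# Hypothetical sketch of a post-deploy script; NOT the attached template.
# Function names and hostname patterns are assumptions for illustration.

# Succeeds when the hostname looks like a Controller or Compute node.
# For hyperconverged compute, drop the *compute* pattern.
is_target_node() {
    case "$1" in
        *controller*|*compute*) return 0 ;;
        *) return 1 ;;
    esac
}

# Remove ceph-osd only on target nodes where it is actually installed.
# $1 = hostname to test (defaults to this node's short hostname).
remove_ceph_osd_if_unneeded() {
    node="${1:-$(hostname -s)}"
    if is_target_node "$node" && rpm -q ceph-osd >/dev/null 2>&1; then
        yum -y remove ceph-osd
    fi
}
```

Calling `remove_ceph_osd_if_unneeded` with no argument on each node would leave Ceph storage nodes untouched, since their hostnames match neither pattern under this assumption.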
Hi Irina, This is still a work in progress, and the final resolution has not yet been determined.
Hi Alan, Can you provide a build? We have a HF request already.
Hi Irina, I won't be the one to decide when the solution is ready for release. Right now I'm concentrating on getting the change approved upstream, after which I can begin a series of downstream backports. QE has done some initial testing, but possibly not on the version you require. What version are you interested in?
Hi Alan. Thank you for the update.

> What version are you interested in?

It's Ocata / OSP-11.
verified on update & upgrade of OSP 10
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2825
*** Bug 1421303 has been marked as a duplicate of this bug. ***
*** Bug 1619472 has been marked as a duplicate of this bug. ***