Bug 1908266
| Summary: | RHOSP 16.1 minor update fails because of release lock enforcement on Ceph nodes | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Bernd Zehrfuchs <bzehrfuc> |
| Component: | tripleo-ansible | Assignee: | Sofer Athlan-Guyot <sathlang> |
| Status: | CLOSED ERRATA | QA Contact: | Jason Grosso <jgrosso> |
| Severity: | urgent | Docs Contact: | Andy Stillman <astillma> |
| Priority: | urgent | ||
| Version: | 16.1 (Train) | CC: | astillma, dmacpher, dmcphers, gfidente, jamsmith, jelle.hoylaerts.ext, jjoyce, jpretori, jschluet, mburns, michal.vasko, msufiyan, sathlang, sgolovat, slinaber, tvignaud, vgrosu |
| Target Milestone: | z6 | Keywords: | Triaged |
| Target Release: | 16.1 (Train on RHEL 8.2) | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | tripleo-ansible-0.5.1-1.20210310113105.902c3c8.el8ost openstack-tripleo-heat-templates-11.3.2-1.20210310113344.29a02c1.el8ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-26 11:43:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Bernd Zehrfuchs
2020-12-16 09:25:38 UTC
Hi,

I think this is a documentation issue, as I don't think a Ceph OSD node on OSP 16.1 should run on anything but RHEL 8.2, as suggested in [1]. I'm asking for confirmation from the Ceph team and the rhos-delivery team.

Teams: is there any reason why we shouldn't run "subscription-manager release --set=8.2" on ceph-osd for OSP 16.1?

RHOS-DELIVERY: more generally, is there a specific subscription for ceph-osd, and if so, what is it?

Note that there are many ways to get past this error, as it's a configuration option (that can be deactivated), but we should get confirmation on whether ceph-osd needs a different RHEL release pinned. If 8.2 is indeed needed (as I think it is), then omitting it could lead to problems.

Setting this to urgent, as the answer should be straightforward and the documentation should be updated as quickly as possible if there's an issue.

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated

Hi,

so thanks to the explanation on the internal mailing list, I think we do have an issue here. Ceph nodes are not bound to the EUS constraint and can be on any version of RHEL [1]. That means that the assumption that all overcloud nodes should follow the EUS stream constraints for every 16 release is wrong. We should be able to compose roles where this check is disabled.

Please let me know if those assertions are correct.

Thanks,

[1] referring to https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/installation_guide/requirements-for-installing-rhcs#enabling-the-red-hat-ceph-storage-repositories-install

Hi,

I started implementing a solution for this, where one can disable enforcement on a per-role basis by adding "rhsm_enforce: false" to the role definition.

Hi @dmcphers,

this is a new role definition parameter, called "rhsm_enforce", that should be set to false in the role definition. This is useful for Ceph roles using overcloud-minimal, where RHEL is not necessarily pinned to a specific version.
Do you think it's worth adding some more documentation in some specific Ceph/OSP documentation?

@sathlang -- Sure, although we'd probably need to be clear on the specific steps as to what customers need to do and where to put rhsm_enforce. For example, do we set it here:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated/preparing-for-a-minor-update#locking-the-environment-to-a-red-hat-enterprise-linux-release_keeping-updated

And/or here:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated/preparing-for-a-minor-update

Hi @dmacpher,

so the way I see it, there should be a warning here:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated/assembly-updating_the_overcloud#running-the-overcloud-update-preparation_keeping-updated

just before we run the overcloud update prepare command.

The warning section should point to this:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using-the-overcloud-minimal-image-to-avoid-using-a-Red-Hat-subscription-entitlement

This is where we have to add that the role file should set rhsm_enforce to false when using the overcloud-minimal image.

Basically, 'for all Ceph OSD roles which are using the overcloud-minimal image, the role should have rhsm_enforce set to false' to avoid the RHOSP version enforcement check.

Hi Sofer,

I can see that by default the roles_data.yaml file has set rhsm_enforce: False [1].

Is the expectation that they've unset that parameter?

What actions are we required to take to ensure that this parameter is set to false? Is it to inspect the roles_data.yaml file and edit it if needed?

Also, if we're making a change in the Deploying an overcloud with containerized Red Hat Ceph guide, section 2.6. Using the overcloud-minimal image to avoid using a Red Hat subscription entitlement [2], does having this parameter set to false affect any other scenarios except for updates/upgrades?

I will create a draft for these changes shortly and update this ticket with the details.

Many thanks,
Vlada

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/roles_data.yaml
[2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using-the-overcloud-minimal-image-to-avoid-using-a-Red-Hat-subscription-entitlement

(In reply to Sofer Athlan-Guyot from comment #16)
> Hi @dmacpher ,
>
> so the way I see it should be a warning there :
>
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated/assembly-updating_the_overcloud#running-the-overcloud-update-preparation_keeping-updated
>
> Just before we run the overcloud update prepare command.
>
> The warning section should point to this
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using-the-overcloud-minimal-image-to-avoid-using-a-Red-Hat-subscription-entitlement
>
> This is where we have to add that the role file should set the rhsm_enforce
> to false when using overcloud-minimal-image.
>
> Basically, 'for all Ceph osd "role" which are using overcloud-minimal image,
> their role should have rhsm_enforce set to false' to avoid checking rhosp
> version enforcement.

Hi,

made sure that we were enforcing the RHEL check:
(undercloud) [stack@undercloud-0 ~]$ openstack stack environment show qe-Cloud-0 > env.txt
(undercloud) [stack@undercloud-0 ~]$ grep Enforce env.txt
SkipRhelEnforcement: false
The role data for CephStorage has rhsm_enforce set to false:
- name: CephStorage
  description: |
    Ceph OSD Storage node role
  networks:
    Storage:
      subnet: storage_subnet
    StorageMgmt:
      subnet: storage_mgmt_subnet
  uses_deprecated_params: False
  deprecated_nic_config_name: 'ceph-storage.yaml'
  # CephOSD present so serial has to be 1
  update_serial: 1
  rhsm_enforce: False
  ...
Ceph-1 has a subscription, but no release is set:
[root@ceph-1 ~]# sudo subscription-manager release --show
Release not set
Compute-0 has no subscription whatsoever (used to prove that the check is indeed enabled there):
[heat-admin@compute-0 ~]$ sudo subscription-manager release --show
This system is not yet registered. Try 'subscription-manager register --help' for more information.
Now when I update ceph-1, the check is skipped:
TASK [tripleo-redhat-enforce : Enforce RHEL/OSP version pair] ******************
Wednesday 05 May 2021 13:20:09 +0000 (0:00:00.069) 0:00:17.324 *********
skipping: [ceph-1] => {"changed": false, "skip_reason": "Conditional result was False"}
while for compute-0:
TASK [tripleo-redhat-enforce : Enforce RHEL/OSP version pair] ******************
Wednesday 05 May 2021 13:34:12 +0000 (0:00:00.068) 0:00:17.875 *********
included: /usr/share/ansible/roles/tripleo-redhat-enforce/tasks/enforce_release.yml for compute-0
TASK [tripleo-redhat-enforce : get current release settings] *******************
Wednesday 05 May 2021 13:34:12 +0000 (0:00:00.088) 0:00:17.964 *********
fatal: [compute-0]: FAILED! => {"attempts": 1, "changed": true, "cmd": ["subscription-manager", "release", "--show"], "delta": "0:00:01.162262", "end": "2021-05-05 13:34:14.583573", "msg": "non-zero return code", "rc": 1, "start": "2021-05-05 13:34:13.421311", "stderr": "This system is not yet registered. Try 'subscription-manager register --help' for more information.", "stderr_lines": ["This system is not yet registered. Try 'subscription-manager register --help' for more information."], "stdout": "", "stdout_lines": []}
...ignoring
TASK [tripleo-redhat-enforce : fails if not registered] ************************
Wednesday 05 May 2021 13:34:14 +0000 (0:00:01.701) 0:00:19.666 *********
fatal: [compute-0]: FAILED! => {"changed": false, "msg": "Your environment is not subscribed! If it is expected, please set SkipRhelEnforcement to true. For Director the documentation is there https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html-single/ director_installation_and_usage/index#configuring-the-undercloud-with-environment-files, for the Overcloud you need to add a new parameter file to your deploy command with that parameter set. You can also disable it in the role, see rhsm_enforce role parameter. If this is unexpected, you have to subscribe this node and ensure that RHEL is pinned to 8.2 as this is the only version supported for 16.1."}
Failure happens as expected.
Verified.
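
For readers following the verification above, the observed behaviour can be modelled roughly as below. This is only a sketch in Python, not the code of the tripleo-redhat-enforce role: the function names and return strings are hypothetical, and the order of the checks is an assumption inferred from the task output and the error message quoted in this comment.

```python
from typing import Optional

# Per the error message above, 8.2 is the only RHEL release supported for 16.1.
REQUIRED_RELEASE = "8.2"


def parse_release(stdout: str) -> Optional[str]:
    """Extract the pinned release from `subscription-manager release --show` output.

    Returns None when no release is pinned (output "Release not set").
    """
    for line in stdout.splitlines():
        if line.startswith("Release:"):
            return line.split(":", 1)[1].strip()
    return None


def enforcement_outcome(rhsm_enforce: bool, skip_rhel_enforcement: bool,
                        registered: bool, release_stdout: str) -> str:
    """Hypothetical model of the per-node outcome seen in the logs above."""
    # Either the global SkipRhelEnforcement parameter or the per-role
    # rhsm_enforce: false setting disables the check (assumed equivalent here).
    if skip_rhel_enforcement or not rhsm_enforce:
        return "skipped"
    # An unregistered node fails first, as compute-0 did.
    if not registered:
        return "failed: not subscribed"
    if parse_release(release_stdout) != REQUIRED_RELEASE:
        return "failed: release not pinned to " + REQUIRED_RELEASE
    return "ok"


# ceph-1: role has rhsm_enforce: False, release not set -> check is skipped.
print(enforcement_outcome(False, False, True, "Release not set"))
# compute-0: enforced and not registered -> failure.
print(enforcement_outcome(True, False, False, ""))
```

With rhsm_enforce left enabled, a registered node reporting "Release not set" would fall through to the release comparison and fail, which is consistent with the failure this bug was opened for.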
(In reply to Vlada Grosu from comment #25)
> Hi Sofer,
>
> I can see that by default the roles_data.yaml file has set rhsm_enforce:
> False [1].
>
> Is the expectation that they've unset that parameter?

So roles_data.yaml is an "example", and if you look closely, it's set to false only for the CephStorage role, which is where this parameter should make sense.

> What actions are we required to ensure that this parameter is set to false?
> Is it to inspect the roles_data.yaml file and edit it if needed?

Yes.

> Also, if we're making a change in the Deploying an overcloud with
> containerized Red Hat Ceph guide, section 2.6. Using the overcloud-minimal
> image to avoid using a Red Hat subscription entitlement [2], does having
> this parameter set to false affect any other scenarios except for
> updates/upgrades?

The check happens only during update, but it would be a good idea to properly set this from the deployment.

> I will create a draft for these changes shortly and update this ticket with
> the details.

Thanks.

> Many thanks,
> Vlada
>
> [1] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/roles_data.yaml
>
> [2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using-the-overcloud-minimal-image-to-avoid-using-a-Red-Hat-subscription-entitlement
>
> (In reply to Sofer Athlan-Guyot from comment #16)
> > Hi @dmacpher ,
> >
> > so the way I see it should be a warning there :
> >
> > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/keeping_red_hat_openstack_platform_updated/assembly-updating_the_overcloud#running-the-overcloud-update-preparation_keeping-updated
> >
> > Just before we run the overcloud update prepare command.
> >
> > The warning section should point to this
> > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using-the-overcloud-minimal-image-to-avoid-using-a-Red-Hat-subscription-entitlement
> >
> > This is where we have to add that the role file should set the rhsm_enforce
> > to false when using overcloud-minimal-image.
> >
> > Basically, 'for all Ceph osd "role" which are using overcloud-minimal image,
> > their role should have rhsm_enforce set to false' to avoid checking rhosp
> > version enforcement.

Thanks, Sofer! The changes will be reflected in the Director installation and usage guide, Deploying an overcloud with containerized Red Hat Ceph, and Keeping Red Hat OpenStack Platform updated. They will be published for 16.1.6.

I've added the docs Jira tracker for this. Thank you!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenStack Platform 16.1.6 (tripleo-ansible) security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2119
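
On the question in this thread about inspecting roles_data.yaml by hand, a small helper along these lines could list the roles that would still be subject to the check. It is a sketch only: roles_still_enforced is a hypothetical name, the input is assumed to be the roles file already parsed into a list of dictionaries by a YAML parser of your choice, and treating a role with no rhsm_enforce key as enforced is an assumption, not something confirmed in this bug.

```python
from typing import Any, Dict, List


def roles_still_enforced(roles: List[Dict[str, Any]]) -> List[str]:
    """Return the names of roles where rhsm_enforce is not set to false."""
    enforced = []
    for role in roles:
        # Assumption: a role that omits rhsm_enforce is enforced.
        if role.get("rhsm_enforce", True):
            enforced.append(role["name"])
    return enforced


# Hypothetical parsed roles data; the CephStorage entry mirrors the
# excerpt shown in the verification comment.
roles = [
    {"name": "Compute"},
    {"name": "CephStorage", "update_serial": 1, "rhsm_enforce": False},
]
print(roles_still_enforced(roles))  # prints ['Compute']
```

Any Ceph OSD role using the overcloud-minimal image that appears in the output would need rhsm_enforce set to false in its role definition, as discussed above.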