Bug 1658258

Summary: [rhsm] Upgrade fails when using rhsm if the overcloud nodes are already registered in Satellite
Product: Red Hat OpenStack
Reporter: Jose Luis Franco <jfrancoa>
Component: openstack-tripleo-heat-templates
Assignee: Jose Luis Franco <jfrancoa>
Status: CLOSED ERRATA
QA Contact: Ronnie Rasouli <rrasouli>
Severity: high
Docs Contact:
Priority: high
Version: 14.0 (Rocky)
CC: ccamacho, jschluet, lbezdick, m.andre, mbracho, mburns, pgrist, rheslop, sclewis, slinaber
Target Milestone: z3
Keywords: Reopened, Triaged, ZStream
Target Release: 14.0 (Rocky)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
Story Points: ---
Clone Of:
: 1715536
Environment:
Last Closed: 2019-05-30 15:18:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jose Luis Franco 2018-12-11 16:07:43 UTC
Description of problem:

When upgrading from osp13 to osp14 with an overcloud that is already registered via Satellite (the old rhel-registration service was used during deployment), the overcloud upgrade run command fails because it can't find some puppet classes. This is usually a sign that the repositories on the overcloud nodes were not updated to the latest version.
This can be observed in the upgrade_tasks when the rhsm service invokes the redhat-subscription ansible role:

u'TASK [redhat-subscription : SATELLITE 6 | Install katello-ca-consumer] *********',
 u'Tuesday 11 December 2018  07:15:18 -0500 (0:00:00.916)       0:01:05.081 ****** ',
 u'ok: [controller-0] => {"changed": false, "msg": "", "rc": 0, "results": ["katello-ca-consumer-rhos-compute-node-08.lab.eng.rdu2.redhat.com-1.0-2.noarch providing /tmp/katello-ca-consumer-latest.noarchK641hh.rpm is already installed"]}',
 u'',
 u'TASK [redhat-subscription : SATELLITE 6 | Execute katello-rhsm-consumer] *******',
 u'Tuesday 11 December 2018  07:15:23 -0500 (0:00:05.216)       0:01:10.298 ****** ',
 u'skipping: [controller-0] => {"changed": false, "skip_reason": "Conditional result was False"}',
 u'',
 u'TASK [redhat-subscription : Manage Red Hat subscription] ***********************',
 u'Tuesday 11 December 2018  07:15:24 -0500 (0:00:00.392)       0:01:10.690 ****** ',
 u'ok: [controller-0] => {"changed": false, "msg": "System already registered."}',
 u'',
 u'TASK [redhat-subscription : Configure repository subscriptions] ****************',
 u'Tuesday 11 December 2018  07:15:26 -0500 (0:00:01.948)       0:01:12.639 ****** ',
 u'skipping: [controller-0] => {"changed": false, "skip_reason": "Conditional result was False"}',
 u'',
 u'TASK [Check if swift-proxy or swift-object-expirer are deployed] ***************',
 u'Tuesday 11 December 2018  07:15:26 -0500 (0:00:00.285)       0:01:12.924 ****** ',

The role does not register the environment with the new osp14 activation key provided because the system was already registered.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy the overcloud using rhel-registration to connect to a Satellite server (via activation key)
cat rhel-registration/environment-rhel-registration.yaml
parameter_defaults:
  rhel_reg_activation_key: "rhos13-dev"
  rhel_reg_auto_attach: ""
  rhel_reg_base_url: ""
  rhel_reg_environment: ""
  rhel_reg_force: ""
  rhel_reg_machine_name: ""
  rhel_reg_org: "Default_Organization"
(some parameters are not included on purpose)
overcloud deploy command used:
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/enable-tls.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/extra_templates.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /home/stack/rhel-registration/rhel-registration-resource-registry.yaml \
-e /home/stack/rhel-registration/environment-rhel-registration.yaml \
--log-file overcloud_deployment_40.log
2. Upgrade the undercloud to osp 14
3. Prepare rhsm.yaml environment file: 
cat rhsm.yaml 
resource_registry:
  OS::TripleO::Services::Rhsm: /usr/share/openstack-tripleo-heat-templates/extraconfig/services/rhsm.yaml
parameter_defaults:
  RhsmVars:
    rhsm_activation_key: "osp14-dev"
    rhsm_org_id: "Default_Organization"
    rhsm_method: satellite
    rhsm_insecure: yes
(some parameters are omitted on purpose)

And run overcloud_upgrade_prepare:
cat overcloud_prepare.sh 
#!/bin/bash

openstack overcloud upgrade prepare \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/enable-tls.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/extra_templates.yaml \
-e /home/stack/containers-prepare-parameter-oc.yaml \
-e /home/stack/rhsm.yaml

4. Run "overcloud external-upgrade run --tags container_image_prepare"
5. Run "overcloud upgrade run --roles Controller --playbook all"

Actual results:


Expected results:


Additional info:

Comment 1 Jose Luis Franco 2018-12-11 16:10:18 UTC
The main issue is in the rhsm.yaml service in tht: https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/services/rhsm.yaml#L81. We should also be passing the rhsm_force_register: True option, because the redhat-subscription role does not detect when the activation key provided differs from the one the system is already registered with. For that reason we need to force the registration, which means unregistering and registering again with the new information provided in the rhsm environment file.
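
The behaviour described in this comment can be illustrated with a simplified Ansible task. This is only a sketch, not the role's actual source: it assumes the role wraps the Ansible redhat_subscription module and maps rhsm_force_register to the module's force_register option.

```yaml
# Simplified illustration only; not copied from the redhat-subscription role.
- name: Manage Red Hat subscription
  redhat_subscription:
    state: present
    activationkey: "{{ rhsm_activation_key }}"
    org_id: "{{ rhsm_org_id }}"
    # With force_register false, an already registered system is left as-is
    # and the module reports "System already registered.", even when the
    # activation key has changed. With force_register true, the system is
    # registered again using the values above.
    force_register: "{{ rhsm_force_register | default(false) }}"
```

Under that assumption, the "ok ... System already registered." result in the log from the description is the expected outcome when the option is left unset.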

Comment 3 Carlos Camacho 2018-12-13 14:49:26 UTC
We need a docs fix for this. NOT A BLOCKER: we will wait for the fix to be available in zstream (z1).

Comment 7 Carlos Camacho 2018-12-13 15:40:09 UTC
We discussed this in the daily meeting, and since we have workarounds we don't need this for GA. We will deliver the fix in Z1.

Comment 12 Jose Luis Franco 2019-04-11 07:26:51 UTC
Hi Roger,

The solution implemented doesn't need documentation in the end.

Thanks

Comment 14 Jose Luis Franco 2019-04-29 09:47:46 UTC
To work around this issue, set the rhsm parameter:

rhsm_force_register: True

Either in the rhsm.yaml environment file or as an environment parameter, passing -e rhsm_force_register=True during the "overcloud upgrade prepare" step.

For more context, here is the link to the rhsm service configuration steps during the overcloud upgrade process:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/html/upgrading_red_hat_openstack_platform/assembly-preparing_for_overcloud_upgrade#switching-to-the-rhsm-composable-service

Comment 15 Jose Luis Franco 2019-04-29 11:32:34 UTC
Correcting a small nit from the previous comment: the right way to pass the parameter is via the rhsm.yaml environment file. There is no way to set a value for a parameter via the -e option (that is only possible in Ansible).
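
As a sketch of the workaround, the rhsm.yaml environment file from the reproduction steps would look roughly like this with the extra parameter added. Placing rhsm_force_register under RhsmVars is an assumption based on it being a variable of the redhat-subscription role; the other values are the ones from the original report, and some parameters are still omitted.

```yaml
resource_registry:
  OS::TripleO::Services::Rhsm: /usr/share/openstack-tripleo-heat-templates/extraconfig/services/rhsm.yaml
parameter_defaults:
  RhsmVars:
    rhsm_activation_key: "osp14-dev"
    rhsm_org_id: "Default_Organization"
    rhsm_method: satellite
    rhsm_insecure: yes
    # Workaround: force re-registration so the new activation key is used
    # (assumed to belong under RhsmVars with the other role variables)
    rhsm_force_register: True
```

The file is then passed with -e /home/stack/rhsm.yaml to "overcloud upgrade prepare", as in step 3 of the description.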

Comment 18 errata-xmlrpc 2019-04-30 17:51:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0878