Bug 1903120

Summary: Configuration steps are started before all tasks defined in NodeUserData are completed by cloud-init
Product: Red Hat OpenStack
Reporter: Takashi Kajinami <tkajinam>
Component: openstack-tripleo-heat-templates
Assignee: Rabi Mishra <ramishra>
Status: CLOSED ERRATA
QA Contact: David Rosenfeld <drosenfe>
Severity: high
Docs Contact:
Priority: medium
Version: 16.1 (Train)
CC: aschultz, mburns, ramishra
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210104205658.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-17 15:36:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Takashi Kajinami 2020-12-01 11:46:43 UTC
Description of problem:

In RHOSP13, the OS::TripleO::NodeUserData resource [1] (and its role-specific version, OS::TripleO::{{role.name}}NodeUserData) is executed after bare metal instances are created and before any other configuration steps are executed.
 [1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/advanced_overcloud_customization/index#sect-Customizing_Configuration_on_First_Boot

This is because we use os-collect-config to apply the configuration defined in Heat resources, and the os-collect-config daemon starts only after cloud-init has completed all of its configuration, due to the ordering constraint defined in its systemd service file.

~~~
$ sudo cat /usr/lib/systemd/system/os-collect-config.service
[Unit]
Description=Collect metadata and run hook commands.
After=cloud-final.service
Before=crond.service

...
~~~

However, this behavior changed when we replaced os-collect-config with config-download in RHOSP14.
Currently there is no dependency between the cloud-init run and the actual deployment tasks, so deployment tasks can be started BEFORE cloud-init completes all defined tasks.

This change makes the NodeUserData resource very difficult to use because of unexpected ordering and collisions between configurations. For example, if any subsequent configuration (such as ExtraConfigPre) depends on something implemented in NodeUserData, that configuration does not work as expected.

IMO we should enforce the ordering to restore the previous behavior (or require users to use ExtraConfigPre instead).
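
One way such an ordering could be enforced is to block at the start of the deployment until cloud-init reports completion. The following is a minimal sketch, not necessarily the fix shipped for this bug. It assumes the cloud-init version on the node supports `cloud-init status --wait`; the function name `wait_for_cloud_init` and the `CLOUD_INIT_BIN` override are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Sketch only: block until cloud-init has finished all of its stages.
# CLOUD_INIT_BIN is a hypothetical override (not a cloud-init setting).
CLOUD_INIT_BIN="${CLOUD_INIT_BIN:-cloud-init}"

wait_for_cloud_init() {
    # Nothing to wait for if cloud-init is not installed on this node.
    if ! command -v "$CLOUD_INIT_BIN" >/dev/null 2>&1; then
        echo "cloud-init not found, skipping wait"
        return 0
    fi
    # "status --wait" blocks until cloud-init is done; a non-zero exit
    # status indicates that cloud-init did not finish cleanly.
    "$CLOUD_INIT_BIN" status --wait
}
```

A script that depends on the results of NodeUserData could call `wait_for_cloud_init` as its first command and abort if it returns non-zero.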


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy the overcloud with the OS::TripleO::NodeUserData resource mapped to the following template:
~~~
heat_template_version: 2014-10-16

description: >
  Extra hostname configuration

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: sleep_config}

  sleep_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        sleep 300

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
~~~
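
For reference, a template like the one above is typically wired into a deployment through the `resource_registry` section of an environment file. The sketch below generates such a file. The template path `/home/stack/templates/sleep-userdata.yaml`, the file name `firstboot-env.yaml`, and the `ENV_DIR` variable are hypothetical placeholders, not values taken from this report.

```shell
#!/bin/bash
# Sketch only: write an environment file that maps NodeUserData to the
# first-boot template. ENV_DIR and both file names are hypothetical.
ENV_DIR="${ENV_DIR:-$(mktemp -d)}"
ENV_FILE="$ENV_DIR/firstboot-env.yaml"

cat > "$ENV_FILE" <<'EOF'
resource_registry:
  OS::TripleO::NodeUserData: /home/stack/templates/sleep-userdata.yaml
EOF

echo "wrote $ENV_FILE"
# The file would then be passed to the deploy command, for example:
#   openstack overcloud deploy --templates -e "$ENV_FILE"
```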

Actual results:
Deployment steps start before cloud-init completes applying all settings

Expected results:
Deployment steps start after cloud-init completes applying all settings
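
A rough way to tell which of the two outcomes occurred on a node is to check whether cloud-init had already finished when the first deployment task ran. By default, cloud-init writes a `boot-finished` marker under `/var/lib/cloud/instance/` when its final stage completes. The sketch below checks for that marker; the helper name `cloud_init_finished` and the `MARKER` override are hypothetical.

```shell
#!/bin/bash
# Sketch only: report whether cloud-init has completed its final stage.
# MARKER is a hypothetical override; the default is the marker file that
# cloud-init creates by default at the end of the boot.
MARKER="${MARKER:-/var/lib/cloud/instance/boot-finished}"

cloud_init_finished() {
    if [ -e "$MARKER" ]; then
        echo "cloud-init finished"
        return 0
    fi
    echo "cloud-init still running or not started"
    return 1
}
```

With the `sleep 300` user data above, running this check during the first deployment step would be expected to report that cloud-init is still running on an affected system.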


Additional info:

Comment 2 Alex Schultz 2020-12-01 16:55:51 UTC
Does this also address the issues in https://bugzilla.redhat.com/show_bug.cgi?id=1903146?

Comment 3 Rabi Mishra 2020-12-02 04:33:26 UTC
> Does this also address the issues in https://bugzilla.redhat.com/show_bug.cgi?id=1903146?

I could not reproduce/understand the issue mentioned. If the issue is about NetworkManager configuration being overridden by cloud-init for some reason, I don't think this patch would address that.

Comment 13 errata-xmlrpc 2021-03-17 15:36:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817