Bug 1903120 - Configuration steps are started before all tasks defined in NodeUserData are completed by cloud-init
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rabi Mishra
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-01 11:46 UTC by Takashi Kajinami
Modified: 2024-03-25 17:18 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210104205658.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-17 15:36:09 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 764943 0 None MERGED Ensure cloud-init has finished before puppet run 2021-02-04 13:52:15 UTC
Red Hat Product Errata RHBA-2021:0817 0 None None None 2021-03-17 15:36:38 UTC

Description Takashi Kajinami 2020-12-01 11:46:43 UTC
Description of problem:

In RHOSP13, the OS::TripleO::NodeUserData resource [1] (and its role-specific version, OS::TripleO::{{role.name}}NodeUserData) is executed after the baremetal instances are created and before any other configuration steps are executed.
 [1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/advanced_overcloud_customization/index#sect-Customizing_Configuration_on_First_Boot

This is because we use os-collect-config to apply the configuration defined in heat resources, and the os-collect-config daemon starts only after cloud-init has completed all of its configuration, due to the ordering constraint defined in its systemd service file.

~~~
$ sudo cat /usr/lib/systemd/system/os-collect-config.service
[Unit]
Description=Collect metadata and run hook commands.
After=cloud-final.service
Before=crond.service

...
~~~
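
The ordering constraint in the unit file above can also be checked mechanically. The following is a minimal sketch in Python, assuming the unit file is plain `Key=Value` text; the helper name `ordered_after` is hypothetical and is not part of TripleO or systemd, and the sketch does not distinguish between unit file sections.

```python
def ordered_after(unit_text, dependency):
    """Return True if the unit text lists `dependency` in an After= directive."""
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and section headers such as [Unit].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        # After= takes a space-separated list of unit names.
        if sep and key.strip() == "After" and dependency in value.split():
            return True
    return False


# Sample text copied from the os-collect-config.service excerpt above.
UNIT = """\
[Unit]
Description=Collect metadata and run hook commands.
After=cloud-final.service
Before=crond.service
"""

print(ordered_after(UNIT, "cloud-final.service"))
```

With config-download there is no equivalent unit on the node, so a check like this has nothing to inspect, which is the gap this report describes.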

However, this behavior changed when we replaced os-collect-config with config-download in RHOSP14.
Currently no dependency is set between the cloud-init invocation and the actual deployment tasks, so any deployment task can be started BEFORE cloud-init completes all defined tasks.

This change makes the NodeUserData resource very difficult to use, because configurations can be applied in an unexpected order or collide with each other. For example, if any subsequent configuration (such as ExtraConfigPre) depends on something implemented in NodeUserData, that configuration does not work as expected.

IMO we should enforce the ordering to restore the previous behavior (or require users to use ExtraConfigPre instead).
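
One way to picture such an ordering enforcement is a wait step that blocks until cloud-init reports completion. The sketch below is only an illustration under stated assumptions, not the fix that was merged for this bug: it assumes cloud-init's final stage creates the marker file `/var/lib/cloud/instance/boot-finished`, and the helper name `wait_for_marker` and its timeout defaults are hypothetical.

```python
import os
import time


def wait_for_marker(path, timeout=600.0, interval=5.0):
    """Poll until `path` exists; return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Assumption: cloud-init's final stage creates this file when it is done.
BOOT_FINISHED = "/var/lib/cloud/instance/boot-finished"

# Hypothetical usage on an overcloud node, before any deployment step runs:
#   if not wait_for_marker(BOOT_FINISHED):
#       raise SystemExit("timed out waiting for cloud-init")
```

Recent cloud-init versions also provide a `cloud-init status --wait` command that blocks in a similar way; polling a file is used here only to keep the sketch self-contained.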


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy the overcloud with an OS::TripleO::NodeUserData resource such as the following:
~~~
heat_template_version: 2014-10-16

description: >
  Extra hostname configuration

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: sleep_config}

  sleep_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        sleep 300

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
~~~

Actual results:
Deployment steps start before cloud-init completes applying all settings

Expected results:
Deployment steps start after cloud-init completes applying all settings


Additional info:

Comment 2 Alex Schultz 2020-12-01 16:55:51 UTC
Does this also address the issues in https://bugzilla.redhat.com/show_bug.cgi?id=1903146?

Comment 3 Rabi Mishra 2020-12-02 04:33:26 UTC
> Does this also address the issues in https://bugzilla.redhat.com/show_bug.cgi?id=1903146?

I could not reproduce/understand the issue mentioned. If the issue is about NetworkManager configuration being overridden by cloud-init for some reason, I don't think this patch would address that.

Comment 13 errata-xmlrpc 2021-03-17 15:36:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817

