Bug 1570938

Summary: [OSP13] Overcloud deployment fails when heat-templates are called from a path other than the default (/usr/share/openstack-tripleo-heat-templates).
Product: Red Hat OpenStack
Reporter: Omri Hochman <ohochman>
Component: documentation
Assignee: OSP Team <rhos-maint>
Status: CLOSED WONTFIX
QA Contact: Gurenko Alex <agurenko>
Severity: high
Docs Contact:
Priority: high
Version: 13.0 (Queens)
CC: apevec, aschultz, bdobreli, dmacpher, emacchi, hbrock, jslagle, lhh, mburns, mlehmann, ohochman, ramishra, shardy, sreichar, srevivo, tcarlin, ukalifon
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-26 05:15:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Omri Hochman 2018-04-23 19:17:04 UTC
[OSP13] Overcloud deployment fails when attempting to call heat-templates
from a path other than the default (/usr/share/openstack-tripleo-heat-templates).


Description:
--------------
When the deployment command uses a custom path for the heat-templates directory, the overcloud deployment searches for certain scripts, such as run-os-net-config.sh, in the wrong path, which causes the overcloud deployment to fail.

 
Steps: 
--------------
(*) Attempt to deploy the overcloud with templates that are not in their default path.

For example, change the template path by adding this to the deploy command:
openstack overcloud deploy --templates openstack-tripleo-heat-templates/


The results: 
--------------
- Overcloud deployment fails.

It attempts to look for scripts in the wrong path:

Cannot find : openstack-tripleo-heat-templates/usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh

Comment 2 Emilien Macchi 2018-04-24 14:09:43 UTC
Omri, could you please give the full command used please? We need to see which templates / network config you're trying to use.

Thanks!

Comment 3 Omri Hochman 2018-04-25 00:39:49 UTC
(In reply to Emilien Macchi from comment #2)
> Omri, could you please give the full command used please? We need to see
> which templates / network config you're trying to use.
> 
> Thanks!

Sure, actually this deploy_command was used by the UI:DFG team to deploy OSP13 on their BM environment:

[stack@puma01 ~]$ cat deploy_command
openstack overcloud deploy --templates openstack-tripleo-heat-templates/ -e openstack-tripleo-heat-templates/environments/network-isolation.yaml -e openstack-tripleo-heat-templates/environments/network-environment.yaml -e openstack-tripleo-heat-templates/environments/ssl/enable-tls.yaml -e openstack-tripleo-heat-templates/environments/ssl/inject-trust-anchor.yaml -e /home/stack/virt/docker-images.yaml -e custom.yaml

Comment 4 Emilien Macchi 2018-04-25 00:59:13 UTC
(In reply to Omri Hochman from comment #3)
> [stack@puma01 ~]$ cat deploy_command
> openstack overcloud deploy --templates openstack-tripleo-heat-templates/ -e
> openstack-tripleo-heat-templates/environments/network-isolation.yaml -e
> openstack-tripleo-heat-templates/environments/network-environment.yaml -e
> openstack-tripleo-heat-templates/environments/ssl/enable-tls.yaml -e
> openstack-tripleo-heat-templates/environments/ssl/inject-trust-anchor.yaml
> -e /home/stack/virt/docker-images.yaml -e custom.yaml

Could you please show the content of custom.yaml?

Comment 5 Omri Hochman 2018-04-30 21:01:01 UTC
Udi, can you please provide the content of the custom.yaml you had on puma01 for the OSP13 UI deployment?
Thanks,

Comment 6 Udi Kalifon 2018-05-01 12:54:17 UTC
custom.yaml was just this:

parameter_defaults:
  ControllerCount: 3
  ComputeCount: 2
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute

By the way, I also opened a bug on this same issue and it was closed as "not a bug" - see bug #1572990. I think it's still a bug, because paths should always be relative and not point to places outside the plans.

Comment 7 Bogdan Dobrelya 2018-05-07 14:26:10 UTC
This seems also related to https://bugs.launchpad.net/tripleo/+bug/1762403

Comment 8 David Peacock 2018-05-18 19:06:33 UTC
I've been beating on this scenario for a while against upstream tip, and I cannot reproduce the above results with a minimal environment, nor can I see any logic errors in the related code in overcloud_deploy.py.

When I introduce a debugger I see the variables looking correct, and if allowed to continue this finishes with a successful overcloud deployment.

"""
* `import pdb; pdb.set_trace()` on line 361
* Ran deploy:
    `openstack overcloud deploy --templates openstack-tripleo-heat-templates/ -e custom.yaml`
* Result:
(undercloud) [stack@undercloud ~]$ openstack overcloud deploy --templates openstack-tripleo-heat-templates/ -e custom.yaml 
Waiting for messages on queue 'tripleo' with no timeout.
> /usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py(365)_deploy_tripleo_heat_templates()
-> plans = plan_management.list_deployment_plans(self.clients)
(Pdb) tht_root
'/tmp/tripleoclient-p_aDL4/tripleo-heat-templates'
(Pdb) user_tht_root
u'/home/stack/openstack-tripleo-heat-templates'
(Pdb) parsed_args
Namespace(answers_file=None, block_storage_flavor=None, block_storage_scale=None, ceph_storage_flavor=None, ceph_storage_scale=None, compute_flavor=None, compute_scale=None, config_download=True, control_flavor=None, control_scale=None, deployed_server=False, disable_password_generation=False, disable_validations=False, dry_run=False, environment_directories=['/home/stack/.tripleo/environments'], environment_files=[u'custom.yaml'], force_postconfig=False, libvirt_type=None, networks_file=None, no_cleanup=False, no_proxy='', ntp_server=None, output_dir=None, overcloud_ssh_key=None, overcloud_ssh_user='heat-admin', plan_environment_file=None, reg_activation_key='', reg_force=False, reg_method='satellite', reg_org='', reg_sat_url='', rhel_reg=False, roles_file=None, run_validations=False, skip_deploy_identifier=False, skip_postconfig=False, stack='overcloud', swift_storage_flavor=None, swift_storage_scale=None, templates=u'openstack-tripleo-heat-templates/', timeout=240, update_plan_only=False, validation_errors_fatal=True, validation_warnings_fatal=False)
(Pdb) 
"""

My custom.yaml contents:

"""
parameter_defaults:
  ControllerCount: 1
  ComputeCount: 1
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
"""

I'm planning to close this as NOTABUG unless someone can show me an error in my test or reproduce this in a pristine environment.

Thanks,
David

Comment 9 Udi Kalifon 2018-05-21 06:40:58 UTC
I think that there is no problem as long as the templates don't point to a directory on the filesystem that is outside the templates' directory tree (for example, pointing to /usr/share when the templates are in the user's home directory). Even if there is such a pointer, the deployment will pass if you run it from the CLI, and only GUI users will hit a problem (because the referenced yaml is not in the container). Omri - can you confirm?
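
For illustration, the "points outside the directory tree" condition described here can be checked with a few lines of Python. This is only a minimal sketch: `is_inside_tree` is a hypothetical helper, not part of tripleoclient, and it assumes a reference is resolved relative to the file that contains it.

```python
import os


def is_inside_tree(reference, referencing_file, templates_root):
    """Return True if `reference` resolves to a path under `templates_root`.

    Assumption: a relative reference is resolved against the directory of
    the file that contains it; an absolute reference is taken as-is.
    """
    root = os.path.realpath(templates_root)
    base = os.path.dirname(os.path.realpath(referencing_file))
    # os.path.join discards `base` when `reference` is absolute.
    target = os.path.realpath(os.path.join(base, reference))
    return os.path.commonpath([root, target]) == root


root = "/home/stack/openstack-tripleo-heat-templates"
env_file = root + "/environments/network-environment.yaml"

# A relative reference that stays inside the custom tree.
print(is_inside_tree("../network/scripts/run-os-net-config.sh", env_file, root))

# An absolute reference to the default location, outside the custom tree.
print(is_inside_tree(
    "/usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh",
    env_file, root))
```

The first call reports a reference that stays inside the tree; the second reports one that escapes it, which is the case expected to cause trouble when the tree is uploaded as a plan.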

Comment 10 Bogdan Dobrelya 2018-05-21 09:07:15 UTC
I can confirm that references to -e files outside the --templates path never get their paths processed the way they normally are for files sitting inside the --templates path.

A few unit tests were added upstream [0] that illustrate this behavior. I think that is a design limitation we cannot handle. Is it documented?

[0] http://git.openstack.org/cgit/openstack/python-tripleoclient/tree/tripleoclient/tests/test_utils.py?id=0b3b55288b85cdf158b90c895dd4c17937d81f3c#n685

Comment 11 Omri Hochman 2018-05-21 14:28:24 UTC
Sounds like it was confirmed as an issue in comment #10 - adding a request for release notes for OSP13.

Comment 12 David Peacock 2018-05-24 12:33:15 UTC
@bogdan, Can you please elaborate, preferably with output how you confirmed this?

My non-standard path templates were under ~/openstack-tripleo-heat-templates/ and my custom.yaml was under ~/ directly, not inside the non-standard path provided.

I would like to learn what's going on here and you seem to know more about this than I do. :-)

Thanks,
David

Comment 13 Bogdan Dobrelya 2018-05-24 15:02:35 UTC
@David, there is a utility [0] used with both undercloud and overcloud Heat-based installations. Env files containing a resource registry, when referenced from outside the templates dir, never get their content properly rewritten [1], nor their paths redirected relative to the templates path - compare [2] vs [3]. The links to unit tests only illustrate that. In the end, though, this may work in some cases and may equally fail in others. I think Steven Hardy can comment on that better than I can.

[0] http://git.openstack.org/cgit/openstack/python-tripleoclient/tree/tripleoclient/utils.py#n860
[1] http://git.openstack.org/cgit/openstack/python-tripleoclient/tree/tripleoclient/tests/test_utils.py?id=0b3b55288b85cdf158b90c895dd4c17937d81f3c#n718
[2] http://git.openstack.org/cgit/openstack/python-tripleoclient/tree/tripleoclient/tests/test_utils.py?id=0b3b55288b85cdf158b90c895dd4c17937d81f3c#n685
[3] http://git.openstack.org/cgit/openstack/python-tripleoclient/tree/tripleoclient/tests/test_utils.py?id=0b3b55288b85cdf158b90c895dd4c17937d81f3c#n649

Comment 14 Bogdan Dobrelya 2018-05-24 15:13:28 UTC
Specifically, regarding the unit tests linked above, let me explain the issue as I understand it. There is

 'OS::Foo::Baz': '/twd/templates/inside.yaml'

rewritten correctly from the original in-tree path of

  'OS::Foo::Baz': './inside.yaml',
(the relative path assumes the file originally sits at /tmp/thtroot/inside.yaml and then
gets copied into /twd/templates as the utility logic processes templates under their new paths)

The *relative* outside paths, however, become broken. See the original

'OS::Foo::Bar': '../outside.yaml' # == /tmp/thtroot/../outside.yaml (correct)

vs. the rewritten

'OS::Foo::Bar': '/twd/outside.yaml'

which is broken because the file does not exist there - we do not copy files outside of the templates directory tree!

The absolute paths, like the one in your example @David, seem to work fine.
The absolute outside paths are illustrated via the /tmp/thtroot42/notouch.yaml fixture.
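
The three cases above can be modelled in a few lines of Python. This is a deliberately simplified sketch of the behavior described in this comment, not the tripleoclient implementation: `rewrite_registry_path` and `was_copied` are hypothetical names, and the environment file is assumed to sit at the root of the original tree (/tmp/thtroot).

```python
import os


def rewrite_registry_path(path, new_root):
    """Model of the rewrite: re-anchor relative paths under `new_root`.

    Absolute paths are left untouched (the notouch.yaml case).
    """
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(new_root, path))


def was_copied(path, old_root):
    """True if a relative `path` resolved inside `old_root`.

    Only files inside the tree are copied to the new location, so this
    tells whether the rewritten path will actually exist.
    """
    root = os.path.normpath(old_root)
    original = os.path.normpath(os.path.join(root, path))
    return os.path.commonpath([root, original]) == root


old_root, new_root = "/tmp/thtroot", "/twd/templates"
for ref in ("./inside.yaml", "../outside.yaml", "/tmp/thtroot42/notouch.yaml"):
    rewritten = rewrite_registry_path(ref, new_root)
    if os.path.isabs(ref):
        status = "absolute, untouched"
    elif was_copied(ref, old_root):
        status = "copied, rewritten path is valid"
    else:
        status = "not copied, rewritten path is broken"
    print(ref, "->", rewritten, "(" + status + ")")
```

In this model './inside.yaml' maps to '/twd/templates/inside.yaml', '../outside.yaml' maps to '/twd/outside.yaml' (a file that was never copied), and the absolute path is returned unchanged.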

Comment 15 Steven Hardy 2018-05-24 15:24:25 UTC
(In reply to David Peacock from comment #12)
> @bogdan, Can you please elaborate, preferably with output how you confirmed
> this?
> 
> My non-standard path templates were under
> ~/openstack-tripleo-heat-templates/ and my custom.yaml was under ~/
> directly, not inside the non-standard path provided.
> 
> I would like to learn what's going on here and you seem to know more about
> this than I do. :-)

Been a while since I looked at this, but IIRC the following summary should be correct, and aligns with the tests referenced by Bogdan:

1. For any environment files which are j2 rendered (there are several in the tripleo-heat-templates tree from recent releases, e.g. to support custom networks), any -e reference *must* point to the same tree as --templates, e.g. ~/openstack-tripleo-heat-templates/ or whatever.

2. If your -e custom.yaml is outside t-h-t, we try to handle it by translating any resource_registry paths to relative, because the file must be copied into the root of the plan container, e.g. the same as ~/openstack-tripleo-heat-templates/.

The (1) case may be missing some docs and validation, but it's a known limitation of the design - we can't handle j2 rendering files in arbitrary locations.

(2) is an attempt to maintain backwards compatibility for the cases where folks have custom environment files that are outside the t-h-t tree, but historically there have been some bugs in this area, so if you encounter issues I'd say the first thing to confirm is if the same environment works in the --templates directory.

https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-simple-bridge.yaml#L45 was mentioned above, and I agree that's probably a bug that would likely result in this kind of problem: all the paths inside t-h-t should be relative, or the path mangling for (2) is likely to break when the tree is moved. It'd be good to confirm whether that's the issue here (from the CLI in comment #3 it doesn't look like that file is in use, but it'd be good to confirm).
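
One way to confirm whether a moved tree still contains hardcoded references to the default location is to scan it. The following is a minimal sketch, assuming that a plain substring match on the default path is good enough and that only .yaml and .j2 files matter; `find_hardcoded_paths` is a hypothetical helper, not an existing tool.

```python
import os

# Default install location named in this bug's summary.
DEFAULT_THT = "/usr/share/openstack-tripleo-heat-templates"


def find_hardcoded_paths(templates_root, needle=DEFAULT_THT):
    """Yield (file, line number, line) for every line that mentions `needle`.

    Assumption: a substring match is sufficient; the YAML is not parsed.
    """
    for dirpath, _dirnames, filenames in os.walk(templates_root):
        for name in sorted(filenames):
            if not name.endswith((".yaml", ".j2")):
                continue
            full_path = os.path.join(dirpath, name)
            with open(full_path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield full_path, number, line.strip()


if __name__ == "__main__":
    # Path taken from the debugger output in comment #8; adjust as needed.
    for hit in find_hardcoded_paths("/home/stack/openstack-tripleo-heat-templates"):
        print("%s:%d: %s" % hit)
```

Any hit is a candidate for the kind of absolute path that breaks once the tree is copied elsewhere; an empty result suggests the problem lies in an environment file outside the tree instead.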

Comment 16 David Peacock 2018-05-24 18:59:00 UTC
Thank you both, Bogdan and Steven, for the enlightenment. :-)

Comment 17 Steve Reichard 2018-05-25 22:18:25 UTC
I believe the issue is the same. My concern is that the use of the templates apparently conflicts with a feature that has been in use for a while.

Comment 23 Dan Macpherson 2021-08-26 05:15:15 UTC
This bug has been closed as WONTFIX due to limited activity, although the issue in comment #15 around jinja2-rendered files has been explained in the documentation:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/advanced_overcloud_customization/sect-understanding_heat_templates#jinja2-rendering

If you believe this has been closed in error, please re-open and provide a comment telling us why this bug is important to you, and provide a link to your active support case. Thank you for your help!