Description of problem:
Customized deployment with OS::TripleO::NodeUserData fails

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-0.20160929150845.4cdc4fc.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
  -e $THT/environments/network-isolation.yaml \
  -e $THT/environments/network-management.yaml \
  -e ~/templates/network-environment.yaml \
  -e $THT/environments/storage-environment.yaml \
  -e ~/templates/disk-layout.yaml \
  -e ~/templates/wipe-disk-env.yaml \
  -e ~/templates/enable-tls.yaml \
  -e ~/templates/inject-trust-anchor.yaml \
  -e ~/templates/tls-endpoints-public-ip.yaml \
  -e ~/templates/ssl-ports.yaml \
  --control-scale 3 \
  --control-flavor controller \
  --compute-scale 1 \
  --compute-flavor compute \
  --ceph-storage-scale 1 \
  --ceph-storage-flavor ceph \
  --ntp-server ntp.server.com

Actual results:
Deployment fails with:
u'message': u"No connection adapters were found for 'file:///home/stack/templates/wipe-disk.sh'", u'status': u'FAILED'}

Expected results:
Deployment succeeds.

Additional info:

cat ~/templates/wipe-disk-env.yaml
resource_registry:
  OS::TripleO::NodeUserData: /home/stack/templates/wipe-disks.yaml

cat /home/stack/templates/wipe-disks.yaml
heat_template_version: 2014-10-16

description: >
  Wipe and convert all disks to GPT (except the disk containing the root file system)

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: wipe_disk}

  wipe_disk:
    type: OS::Heat::SoftwareConfig
    properties:
      config: {get_file: wipe-disk.sh}

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
Steve, can you triage this one? I guess it might be tripleoclient/tripleo-common related, in which case we could send it over to Dougal.
jistr, can you take a look at this one? Steve already has a few others on his plate.
Sure thing.
Marius, can you please post wipe-disk.sh too (or a link)?
Actually, I can hopefully try with a no-op .sh as well; I just wanted to reproduce the issue as closely as possible.
wipe-disk.sh:

#!/bin/bash
if [[ `hostname` = *"stor"* ]]
then
  echo "Number of disks detected: $(lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}' | wc -l)"
  for DEVICE in `lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}'`
  do
    ROOTFOUND=0
    echo "Checking /dev/$DEVICE..."
    echo "Number of partitions on /dev/$DEVICE: $(expr $(lsblk -n /dev/$DEVICE | awk '{print $7}' | wc -l) - 1)"
    for MOUNTS in `lsblk -n /dev/$DEVICE | awk '{print $7}'`
    do
      if [ "$MOUNTS" = "/" ]
      then
        ROOTFOUND=1
      fi
    done
    if [ $ROOTFOUND = 0 ]
    then
      echo "Root not found in /dev/${DEVICE}"
      echo "Wiping disk /dev/${DEVICE}"
      sgdisk -Z /dev/${DEVICE}
      sgdisk -g /dev/${DEVICE}
    else
      echo "Root found in /dev/${DEVICE}"
    fi
  done
fi
Indeed, this should probably be fixed in tripleoclient as suggested in #1. Documenting some rationale below w/r/t the problem and the fix.

The problem
-----------

I was able to reproduce and debug. This is due to a conceptual problem: we now deploy from Swift rather than from local files, so not all the files are available at deploy time. We already have special processing in place because of this, but it doesn't go all the way. (Full feature parity with the previous state, incl. handling of absolute path links in get_file, is likely not achievable, especially while keeping the same CLI interface.)

All externally referenced files already get uploaded into Swift as `user-files/<hash(original path)>-<file name>`. This naming scheme changes file names and relative paths. Our current solution amends the resource registry to work with the new names/paths, but we don't scan and edit the heat templates themselves for all { get_file: some_file_name } references.

Workaround
----------

The immediate workaround is to move the ~/templates directory into a subdirectory of the location passed as --templates. Since we generally recommend not modifying that directory, this is not an ideal solution.

Proposed solution
-----------------

I'd like to avoid scanning and editing the templates w/r/t get_file references. I think we could make relative get_file links work by changing the naming scheme in Swift to `user-files/<full file path>`, which would preserve both names and relative paths between files.

Unfortunately, I still don't have success even with this approach; I'm getting the same error.
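The difference between the two naming schemes can be sketched as follows. This is only an illustration of the idea: the function names are hypothetical, and SHA-256 stands in for whatever hash the upload code actually uses, which is not stated in this report.

```python
import hashlib
import posixpath


def hashed_name(path):
    """Current scheme: user-files/<hash(original path)>-<file name>.

    The hash function is an assumption made for this sketch only.
    """
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return "user-files/%s-%s" % (digest, posixpath.basename(path))


def full_path_name(path):
    """Proposed scheme: user-files/<full file path>."""
    return "user-files/" + path.lstrip("/")


def resolve_sibling(object_name, relative_ref):
    """Resolve a relative get_file reference next to an uploaded template."""
    return posixpath.join(posixpath.dirname(object_name), relative_ref)


template = "/home/stack/templates/wipe-disks.yaml"
script = "/home/stack/templates/wipe-disk.sh"

# Hashed scheme: the relative lookup does not land on the uploaded script.
hashed_ok = resolve_sibling(hashed_name(template), "wipe-disk.sh") == hashed_name(script)
# Full-path scheme: the relative reference still points at the script.
full_ok = resolve_sibling(full_path_name(template), "wipe-disk.sh") == full_path_name(script)
print(hashed_ok, full_ok)
```

Preserving the directory layout keeps sibling references valid only as long as the references in the uploaded templates are still relative.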
When I download wipe-disk.yaml from Swift, it looks like this:

{"outputs": {"OS::stack_id": {"value": {"get_resource": "userdata"}}}, "heat_template_version": "2014-10-16", "description": "Wipe and convert all disks to GPT (except the disk containing the root file system)\n", "resources": {"userdata": {"type": "OS::Heat::MultipartMime", "properties": {"parts": [{"config": {"get_resource": "wipe_disk"}}]}}, "wipe_disk": {"type": "OS::Heat::SoftwareConfig", "properties": {"config": {"get_file": "file:///home/stack/userdata-bz/wipe-disk.sh"}}}}}

Obviously something is processing the template and replacing the relative link with an absolute one. I suspect this is done by heatclient itself when processing the passed-in environment files. I'm debugging further and looking at how we can solve this.
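The rewritten link is what you would get if the relative reference were resolved against the template's own file:// URL. Here is a minimal sketch of that resolution using only the Python standard library. It is not heatclient's code, and the template URL is assumed from the directory and file name mentioned in this comment.

```python
from urllib.parse import urljoin

# Assumed location of the template; only used to illustrate the resolution.
template_url = "file:///home/stack/userdata-bz/wipe-disk.yaml"

# A relative get_file value resolved against the template's URL.
absolute = urljoin(template_url, "wipe-disk.sh")
print(absolute)
```

Once the reference has this form, it points to a path on the machine that ran the client, not to anything stored alongside the template in Swift.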
Unfortunately, heatclient does indeed replace the relative links with absolute ones when processing the passed-in environment files and other files referenced from them. It looks like we can't avoid parsing and editing the external templates in the end. Working on a patch to parse through the templates and fix the links.
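A rough sketch of what such a pass could look like (this is not the merged patch): walk the parsed template and rewrite every get_file value that has a known replacement. The function name, the replacements mapping, and the target link are all hypothetical.

```python
import json


def rewrite_get_file(node, replacements):
    """Return a copy of a parsed template with get_file values remapped.

    replacements maps an original link to the link that should be used
    instead. Links without an entry are left unchanged.
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "get_file" and isinstance(value, str):
                result[key] = replacements.get(value, value)
            else:
                result[key] = rewrite_get_file(value, replacements)
        return result
    if isinstance(node, list):
        return [rewrite_get_file(item, replacements) for item in node]
    return node


template = json.loads("""
{"resources": {"wipe_disk": {"type": "OS::Heat::SoftwareConfig",
  "properties": {"config":
    {"get_file": "file:///home/stack/userdata-bz/wipe-disk.sh"}}}}}
""")

# Hypothetical target: wherever the uploaded copy of the script can be found.
replacements = {
    "file:///home/stack/userdata-bz/wipe-disk.sh":
        "user-files/home/stack/userdata-bz/wipe-disk.sh",
}

fixed = rewrite_get_file(template, replacements)
print(fixed["resources"]["wipe_disk"]["properties"]["config"]["get_file"])
```

The function returns a new structure rather than editing in place, so the original parsed template stays available for comparison.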
Wrong DFG by mistake, returning back to DF.
Merged to master and stable/newton.
*** Bug 1385153 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html