Bug 1380902
Summary: | Customized deployment with OS::TripleO::NodeUserData fails | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | python-tripleoclient | Assignee: | Jiri Stransky <jstransk> |
Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
Severity: | urgent | Docs Contact: | |
Priority: | medium | ||
Version: | 10.0 (Newton) | CC: | dbecker, dtantsur, hbrock, jcoufal, jjoyce, jslagle, jstransk, mburns, mcornea, morazi, rhel-osp-director-maint, sasha, shardy |
Target Milestone: | rc | Keywords: | Regression, Triaged |
Target Release: | 10.0 (Newton) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | python-tripleoclient-5.3.0-1.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-12-14 16:06:55 UTC | Type: | Bug |
Description
Marius Cornea
2016-10-01 09:49:54 UTC
steve, can you triage this one? I guess it might be tripleoclient/tripleo-common related, in which case we could send it over to dougal.

jistr, can you take a look at this one? steve already has a few others on his plate.

Sure thing. Marius, can you please post wipe-disk.sh too (or a link)? Actually, I can hopefully try with a no-op .sh as well; I just wanted to reproduce the issue as closely as possible.

wipe-disk.sh:

    #!/bin/bash
    if [[ `hostname` = *"stor"* ]]
    then
      echo "Number of disks detected: $(lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}' | wc -l)"
      for DEVICE in `lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}'`
      do
        ROOTFOUND=0
        echo "Checking /dev/$DEVICE..."
        echo "Number of partitions on /dev/$DEVICE: $(expr $(lsblk -n /dev/$DEVICE | awk '{print $7}' | wc -l) - 1)"
        for MOUNTS in `lsblk -n /dev/$DEVICE | awk '{print $7}'`
        do
          if [ "$MOUNTS" = "/" ]
          then
            ROOTFOUND=1
          fi
        done
        if [ $ROOTFOUND = 0 ]
        then
          echo "Root not found in /dev/${DEVICE}"
          echo "Wiping disk /dev/${DEVICE}"
          sgdisk -Z /dev/${DEVICE}
          sgdisk -g /dev/${DEVICE}
        else
          echo "Root found in /dev/${DEVICE}"
        fi
      done
    fi

Indeed this should probably be fixed in tripleoclient as suggested in #1. Documenting some rationale below w/r/t the problem and the fix.

The problem
-----------

I was able to reproduce and debug. This is due to the conceptual problem that we now deploy from Swift rather than locally, so we don't have all the files available. We already have special processing in place because of this, but it doesn't go all the way. (Full feature parity with the previous state, incl. handling of absolute path links in get_file, is likely not achievable, especially while keeping the same CLI interface.)

All externally referenced files already get uploaded into Swift as `user-files/<hash(original path)>-<file name>`. This naming scheme changes file names and relative paths.
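A minimal sketch of why that naming scheme breaks relative links between uploaded files. It is illustrative only and is not code from tripleoclient. The hash function (MD5), both helper names, and the assumption that wipe-disk.yaml sits in the same directory as wipe-disk.sh are hypothetical; the report does not specify them.

```python
import hashlib
import posixpath


def hashed_object_name(path):
    """Build user-files/<hash(original path)>-<file name>.

    MD5 is an assumption; the report does not say which hash is used.
    """
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return "user-files/{}-{}".format(digest, posixpath.basename(path))


def resolve_relative(base_object, link):
    """Resolve a relative get_file link against the object that contains it."""
    base_dir = posixpath.dirname(base_object)
    return posixpath.normpath(posixpath.join(base_dir, link))


# Assumed layout: template and script live side by side on the undercloud.
template = "/home/stack/userdata-bz/wipe-disk.yaml"
script = "/home/stack/userdata-bz/wipe-disk.sh"

# Where a relative link "wipe-disk.sh" inside the uploaded template points to.
wanted = resolve_relative(hashed_object_name(template), "wipe-disk.sh")
# Where the script was actually uploaded.
actual = hashed_object_name(script)

print(wanted)            # user-files/wipe-disk.sh
print(wanted == actual)  # False: the hash prefix changed the file name
```

Because each object name carries a hash of its own original path, a link that was valid on disk no longer names an existing object once both files are in Swift.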
Our current solution amends the resource registry to work well with the new names/paths, but we don't scan and edit the heat templates themselves for all { get_file: some_file_name } references.

Workaround
----------

The immediate workaround is to move the ~/templates directory so that it is a subdirectory of what is passed as the --templates location. Since we generally recommend not modifying that directory, this is not an ideal solution.

Proposed solution
-----------------

I'd like to avoid scanning and editing the templates w/r/t get_file references. I think we could make relative get_file links work by changing the naming scheme in Swift to `user-files/<full file path>`, which would preserve both names and relative paths between files.

Unfortunately, I still don't have success even with this approach; I'm getting the same error. When I download wipe-disk.yaml from Swift, it looks like this:

{"outputs": {"OS::stack_id": {"value": {"get_resource": "userdata"}}}, "heat_template_version": "2014-10-16", "description": "Wipe and convert all disks to GPT (except the disk containing the root file system)\n", "resources": {"userdata": {"type": "OS::Heat::MultipartMime", "properties": {"parts": [{"config": {"get_resource": "wipe_disk"}}]}}, "wipe_disk": {"type": "OS::Heat::SoftwareConfig", "properties": {"config": {"get_file": "file:///home/stack/userdata-bz/wipe-disk.sh"}}}}}

Obviously something is processing the template and replacing the relative link with an absolute one. I suspect this is done by heatclient itself when processing the passed-in environment files. I'm debugging further and looking at how we can solve this.

Unfortunately, heatclient does indeed replace the relative links with absolute ones when processing the passed-in environment files and other files referenced from them. Looks like we can't avoid parsing and editing the external templates in the end. Working on a patch to parse through the templates and fix the links.

Wrong DFG by mistake, returning back to DF.
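A rough sketch of what "parse through the templates and fix the links" could look like, assuming the `user-files/<full file path>` naming from the proposal. It is not the merged patch, and this report does not confirm how the patch works. The function names, the choice to rewrite absolute links back into relative ones, and the location of wipe-disk.yaml are all assumptions made for illustration.

```python
import json
import posixpath

FILE_PREFIX = "file://"


def swift_object_name(file_url):
    """Map file:///abs/path to user-files/abs/path (hypothetical helper)."""
    return "user-files/" + file_url[len(FILE_PREFIX):].lstrip("/")


def relativize_links(node, template_url):
    """Return a copy of a parsed template in which absolute get_file links
    are rewritten relative to the template's own Swift location."""
    base_dir = posixpath.dirname(swift_object_name(template_url))
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            is_abs_link = (key == "get_file" and isinstance(value, str)
                           and value.startswith(FILE_PREFIX))
            if is_abs_link:
                result[key] = posixpath.relpath(swift_object_name(value), base_dir)
            else:
                result[key] = relativize_links(value, template_url)
        return result
    if isinstance(node, list):
        return [relativize_links(item, template_url) for item in node]
    return node


# Trimmed copy of the template as downloaded from Swift in this report.
template = json.loads(
    '{"resources": {"wipe_disk": {"type": "OS::Heat::SoftwareConfig", '
    '"properties": {"config": {"get_file": '
    '"file:///home/stack/userdata-bz/wipe-disk.sh"}}}}}'
)
# Assumed original location of the template itself.
fixed = relativize_links(template, "file:///home/stack/userdata-bz/wipe-disk.yaml")
print(fixed["resources"]["wipe_disk"]["properties"]["config"]["get_file"])
# wipe-disk.sh
```

The walk returns a new structure and leaves the input untouched, so the original file contents can still be compared against the rewritten copy before anything is uploaded.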
Merged to master and stable/newton.

*** Bug 1385153 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html