Bug 1309066

Summary: ospd-8 poodle 2016-02-16.1 fails to deploy due to "RHEL_MAJ_VER: unbound variable"
Product: Red Hat OpenStack
Reporter: wes hayutin <whayutin>
Component: diskimage-builder
Assignee: Ben Nemec <bnemec>
Status: CLOSED ERRATA
QA Contact: yeylon <yeylon>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 8.0 (Liberty)
CC: bnemec, dbecker, jschluet, kbasil, mandreou, mburns, morazi, rhel-osp-director-maint, srevivo, yeylon
Target Milestone: ga
Keywords: Automation, AutomationBlocker
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: diskimage-builder-1.11.1-1.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-07 21:29:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description wes hayutin 2016-02-16 18:34:53 UTC
Description of problem:

The os-collect-config logs on all of the overcloud nodes show the following error:


-- Logs begin at Tue 2016-02-16 10:17:54 EST, end at Tue 2016-02-16 13:16:20 EST. --
Feb 16 10:18:07 overcloud-novacompute-0.localdomain systemd[1]: Started Collect metadata and run hook commands..
Feb 16 10:18:07 overcloud-novacompute-0.localdomain systemd[1]: Starting Collect metadata and run hook commands....
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.968 2354 WARNING os-collect-config [-] Source [request] Unavailable.
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.local [-] /var/lib/os-collect-config/local-d
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.local [-] No local metadata found (['/var/li
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.zaqar [-] No auth_url configured.
Feb 16 10:18:09 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:09,009] (os-refresh-config) [INFO] Starting phase pre-configure
Feb 16 10:18:09 overcloud-novacompute-0.localdomain os-collect-config[2354]: dib-run-parts Tue Feb 16 10:18:09 EST 2016 Running /usr/libexec/os-refresh-config/pre-configure.d/0
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: /usr/libexec/os-refresh-config/pre-configure.d/06-rhel-registration: line 34: RHEL_MAJ_VER: unbound
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] during pre-configure phase. [Command '['dib-r
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] Aborting...
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:11.341 2354 ERROR os-collect-config [-] Command failed,

Comment 3 Marios Andreou 2016-02-17 07:34:58 UTC
Poked at this a little; some info that might be useful to anyone else looking:

* the bug title isn't entirely accurate: as far as I can see, "auth_url configured" isn't the actual error but only a warning, i.e. "WARNING os_collect_config.zaqar [-] No auth_url configured"

* I believe the error is the "RHEL_MAJ_VER: unbound variable" from the trace in the description above: 

Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: /usr/libexec/os-refresh-config/pre-configure.d/06-rhel-registration: line 34: RHEL_MAJ_VER: unbound
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] during pre-configure phase. [Command '['dib-r
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] Aborting...

git blame leads me to https://review.openstack.org/#/c/266016/ "Correct rhel-common for rhel6", which, as far as I can see, is where the RHEL_MAJ_VER variable was introduced. It looks like it *should* be exported (but seemingly isn't in the env at deploy time):

elements/rhel7/environment.d/10-rhel7-distro-name.bash:2:export RHEL_MAJ_VER=7
elements/rhel/environment.d/10-rhel-distro-name.bash:2:export RHEL_MAJ_VER=6

Since the 06-rhel-registration script where this error happens [1] has "set -u" you get an error about the unset variable.


[1] https://github.com/openstack/diskimage-builder/blob/master/elements/rhel-common/os-refresh-config/pre-configure.d/06-rhel-registration
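
For anyone who wants to see the "set -u" behaviour in isolation, here is a minimal sketch. It is not the contents of 06-rhel-registration; the helper name run_without_var and the echoed message are made up for illustration. It only assumes bash and an env that supports -u.

```shell
#!/bin/bash
# Hypothetical reproduction of the failure mode; NOT the real
# 06-rhel-registration script.

# Run a snippet in a child bash with RHEL_MAJ_VER removed from the
# environment, capturing stdout and stderr together.
run_without_var() {
    env -u RHEL_MAJ_VER bash -c "$1" 2>&1
}

# With "set -u", expanding the unset variable aborts the child shell.
strict_output=$(run_without_var 'set -u; echo "major version: $RHEL_MAJ_VER"')
strict_status=$?

# Without "set -u", the same expansion silently yields an empty string.
lax_output=$(run_without_var 'echo "major version: $RHEL_MAJ_VER"')
lax_status=$?

echo "strict: status=$strict_status output=$strict_output"
echo "lax:    status=$lax_status output=$lax_output"
```

The strict run should exit non-zero with an "unbound variable" message, which is what os-refresh-config then reports as a failed pre-configure phase.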

Comment 5 Ben Nemec 2016-02-17 22:15:32 UTC
I believe this is because os-refresh-config scripts don't get environment.d entries, so this probably works for registration during image building but not for deployment.
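
A rough simulation of that difference, in case it helps. Everything here is hypothetical: the temporary directory, the file names and the "registering" message are stand-ins, and the real diskimage-builder and os-refresh-config machinery is not invoked. It just contrasts a run where an environment.d-style file is sourced first with a run where it is not.

```shell
#!/bin/bash
# Illustrative simulation only; paths and file names are invented.
workdir=$(mktemp -d)
mkdir -p "$workdir/environment.d"

# Stand-in for an element's environment.d entry.
echo 'export RHEL_MAJ_VER=7' > "$workdir/environment.d/10-distro-name.bash"

# Stand-in for a script that relies on the variable under "set -u".
cat > "$workdir/registration.sh" <<'EOF'
#!/bin/bash
set -u
echo "registering RHEL $RHEL_MAJ_VER"
EOF

# Image-build style: the environment.d entry is sourced before the script runs.
build_output=$(env -u RHEL_MAJ_VER bash -c \
    "source '$workdir/environment.d/10-distro-name.bash'; bash '$workdir/registration.sh'" 2>&1)
build_status=$?

# Deployment style: the script runs without the environment.d entry.
deploy_output=$(env -u RHEL_MAJ_VER bash "$workdir/registration.sh" 2>&1)
deploy_status=$?

echo "build:  status=$build_status output=$build_output"
echo "deploy: status=$deploy_status output=$deploy_output"

rm -rf "$workdir"
```

The first run succeeds because the variable is exported before the script starts; the second fails in the same way as the log in the description.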

I wasn't crazy about the change in the first place (there's a separate script for registering rhel 6), so I'm probably just going to revert and if they want it back in badly enough they can fix it so it works.

Comment 6 Ben Nemec 2016-02-17 22:18:46 UTC
Revert submitted.

Comment 7 Ben Nemec 2016-02-24 23:29:24 UTC
The revert upstream has landed, so this should be available in the next build.  I'm also changing the title so I can remember what this bug actually was when I get notifications about it. :-)

Comment 9 Jon Schlueter 2016-03-30 12:04:09 UTC
The job that was failing due to this issue is now passing cleanly.

Comment 10 errata-xmlrpc 2016-04-07 21:29:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html