Description of problem:
On all the overcloud nodes, the os-collect-config logs contain the following error:

-- Logs begin at Tue 2016-02-16 10:17:54 EST, end at Tue 2016-02-16 13:16:20 EST. --
Feb 16 10:18:07 overcloud-novacompute-0.localdomain systemd[1]: Started Collect metadata and run hook commands..
Feb 16 10:18:07 overcloud-novacompute-0.localdomain systemd[1]: Starting Collect metadata and run hook commands....
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.968 2354 WARNING os-collect-config [-] Source [request] Unavailable.
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.local [-] /var/lib/os-collect-config/local-d
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.local [-] No local metadata found (['/var/li
Feb 16 10:18:08 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:08.969 2354 WARNING os_collect_config.zaqar [-] No auth_url configured.
Feb 16 10:18:09 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:09,009] (os-refresh-config) [INFO] Starting phase pre-configure
Feb 16 10:18:09 overcloud-novacompute-0.localdomain os-collect-config[2354]: dib-run-parts Tue Feb 16 10:18:09 EST 2016 Running /usr/libexec/os-refresh-config/pre-configure.d/0
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: /usr/libexec/os-refresh-config/pre-configure.d/06-rhel-registration: line 34: RHEL_MAJ_VER: unbound
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] during pre-configure phase. [Command '['dib-r
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] Aborting...
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: 2016-02-16 10:18:11.341 2354 ERROR os-collect-config [-] Command failed,
Poked a little here; some info that might be useful to anyone else looking:

* The bug title isn't entirely accurate: afaics "auth_url configured" isn't the actual error but rather a warning, i.e. "WARNING os_collect_config.zaqar [-] No auth_url configured".

* I believe the actual error is the "RHEL_MAJ_VER: unbound variable" from the trace in the description above:

Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: /usr/libexec/os-refresh-config/pre-configure.d/06-rhel-registration: line 34: RHEL_MAJ_VER: unbound
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] during pre-configure phase. [Command '['dib-r
Feb 16 10:18:11 overcloud-novacompute-0.localdomain os-collect-config[2354]: [2016-02-16 10:18:11,337] (os-refresh-config) [ERROR] Aborting...

git blame leads me to https://review.openstack.org/#/c/266016/ "Correct rhel-common for rhel6", where afaics the RHEL_MAJ_VER variable is introduced. It looks like it *should* be exported (but seemingly isn't in the env):

elements/rhel7/environment.d/10-rhel7-distro-name.bash:2:export RHEL_MAJ_VER=7
elements/rhel/environment.d/10-rhel-distro-name.bash:2:export RHEL_MAJ_VER=6

Since the 06-rhel-registration script where this error happens [1] has "set -u", you get an error about the unset variable.

[1] https://github.com/openstack/diskimage-builder/blob/master/elements/rhel-common/os-refresh-config/pre-configure.d/06-rhel-registration
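For anyone who wants to see the "set -u" behaviour in isolation, here is a minimal sketch. It is not taken from the 06-rhel-registration script; check_strict and check_guarded are hypothetical helper names, and the "${RHEL_MAJ_VER:-unknown}" default is only one possible way to guard such a reference, not the fix that was applied here.

```shell
#!/bin/bash
# Sketch only: shows how "set -u" turns a missing variable into a hard failure.
# check_strict / check_guarded are hypothetical names used for illustration.

# Reference the variable directly in a child bash where it is not set.
check_strict() {
    env -u RHEL_MAJ_VER bash -c 'set -u; echo "version=$RHEL_MAJ_VER"' 2>&1
}

# Same reference, but with a default so the expansion is never unbound.
check_guarded() {
    env -u RHEL_MAJ_VER bash -c 'set -u; echo "version=${RHEL_MAJ_VER:-unknown}"' 2>&1
}

# Capture output and exit status without tripping "set -e" in a caller.
strict_out=$(check_strict) && strict_rc=0 || strict_rc=$?
guarded_out=$(check_guarded) && guarded_rc=0 || guarded_rc=$?

echo "strict:  rc=$strict_rc $strict_out"
echo "guarded: rc=$guarded_rc $guarded_out"
```

The strict variant should exit non-zero with an "unbound variable" message, which is what os-refresh-config then reports as a failed pre-configure phase; the guarded variant carries on with the fallback value.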
I believe this is because os-refresh-config scripts don't get environment.d entries, so this probably works for registration during image building but not for deployment. I wasn't crazy about the change in the first place (there's a separate script for registering rhel 6), so I'm probably just going to revert and if they want it back in badly enough they can fix it so it works.
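A rough simulation of that build-versus-deploy difference, assuming (as suggested above) that environment.d files are sourced before hooks at image build time but are not sourced when os-refresh-config runs its scripts at deployment. This is not the real diskimage-builder or os-refresh-config code; the temporary directory layout, the 06-demo hook, and the run_with_env / run_without_env helpers are all made up for illustration.

```shell
#!/bin/bash
# Illustrative simulation only; not the real dib or os-refresh-config logic.
workdir=$(mktemp -d)
mkdir -p "$workdir/environment.d" "$workdir/pre-configure.d"

# Same export line as the rhel7 environment.d file quoted earlier.
echo 'export RHEL_MAJ_VER=7' > "$workdir/environment.d/10-rhel7-distro-name.bash"

# Hypothetical stand-in for a hook that reads the variable under "set -u".
cat > "$workdir/pre-configure.d/06-demo" <<'EOF'
#!/bin/bash
set -u
echo "RHEL_MAJ_VER=$RHEL_MAJ_VER"
EOF

# Build-like run: source every environment.d file, then run the hook.
run_with_env() {
    env -u RHEL_MAJ_VER bash -c '
        for f in "$1"/environment.d/*; do . "$f"; done
        bash "$1/pre-configure.d/06-demo"' _ "$workdir" 2>&1
}

# Deploy-like run: the hook is started without environment.d being sourced.
run_without_env() {
    env -u RHEL_MAJ_VER bash "$workdir/pre-configure.d/06-demo" 2>&1
}

with_out=$(run_with_env) && with_rc=0 || with_rc=$?
without_out=$(run_without_env) && without_rc=0 || without_rc=$?

echo "with environment.d:    rc=$with_rc $with_out"
echo "without environment.d: rc=$without_rc $without_out"

rm -rf "$workdir"
```

The second run fails the same way the deployment did, while the first succeeds, which matches the registration working during image building but not on the overcloud nodes.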
Revert submitted.
The revert upstream has landed, so this should be available in the next build. I'm also changing the title so I can remember what this bug actually was when I get notifications about it. :-)
Job that was failing due to this issue is now passing cleanly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0603.html