Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1734156

Summary: configuration drive isn't returning the same information as the metadata services and nova-join fails to enroll new nodes
Product: Red Hat OpenStack
Reporter: David Hill <dhill>
Component: python-novajoin
Assignee: Grzegorz Grasza <ggrasza>
Status: CLOSED ERRATA
QA Contact: Jeremy Agee <jagee>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 13.0 (Queens)
CC: acanan, emacchi, ggrasza, hjensas, jschluet, mburns, rcritten, rmascena, sbaker
Target Milestone: z9
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: python-novajoin-1.2.0-1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-11-07 14:04:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description David Hill 2019-07-29 19:28:02 UTC
Description of problem:
The configuration drive does not return the same information as the metadata service, and nova-join fails to enroll new nodes because of the following code logic:

     if ! get_metadata_config_drive; then
        if ! get_metadata_network; then
            echo "FATAL: No metadata available"
            exit 1
        fi
     fi

as the following two outputs show. The first is the vendor data from the configuration drive:

{
  "static": {
    "cloud-init": "#cloud-config\npackages:\n - python-simplejson\n - ipa-client\n - ipa-admintools\n - openldap-clients\n - hostname\nwrite_files:\n - content: |\n     #!/bin/sh\n     \n     function get_metadata_config_drive {\n         if [ -f /run/cloud-init/status.json ]; then\n             # Get metadata from config drive\n             data=`cat /run/cloud-init/status.json`\n             config_drive=`echo $data | python -c 'import json,re,sys;obj=json.load(sys.stdin);ds=obj.get(\"v1\", {}).get(\"datasource\"); print(re.findall(r\"source=(.*)]\", ds)[0])'`\n             if [[ -b $config_drive ]]; then\n                 temp_dir=`mktemp -d`\n                 mount $config_drive $temp_dir\n                 if [ -f $temp_dir/openstack/latest/vendor_data2.json ]; then\n                     data=`cat $temp_dir/openstack/latest/vendor_data2.json`\n                     umount $config_drive\n                     rmdir $temp_dir\n                 else\n                     umount $config_drive\n                     rmdir $temp_dir\n                 fi\n             else \n                 echo \"Unable to retrieve metadata from config drive.\"\n                 return 1\n             fi\n         else\n             echo \"Unable to retrieve metadata from config drive.\"\n             return 1\n         fi\n     \n         return 0\n     }\n     \n     function get_metadata_network {\n         # Get metadata over the network\n         data=$(timeout 300 /bin/bash -c 'data=\"\"; while [ -z \"$data\" ]; do sleep $[ ( $RANDOM % 10 )  + 1 ]s; data=`curl -s http://169.254.169.254/openstack/2016-10-06/vendor_data2.json 2>/dev/null`; done; echo $data')\n     \n         if [[ $? != 0 ]] ; then\n             echo \"Unable to retrieve metadata from metadata service.\"\n             return 1\n         fi\n     }\n     \n     \n     if ! get_metadata_config_drive; then\n        if ! get_metadata_network; then\n            echo \"FATAL: No metadata available\"\n            exit 1\n        fi\n     fi\n     \n     # Get the instance hostname out of the metadata\n     fqdn=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"hostname\", \"\"))'`\n      \n     if [ -z \"$fqdn\" ]; then\n         echo \"Unable to determine hostname\"\n         exit 1\n     fi\n      \n     realm=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"krb_realm\", \"\"))'`\n     otp=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"ipaotp\", \"\"))'`\n     \n     hostname=`/bin/hostname -f`\n      \n     # run ipa-client-install\n     OPTS=\"-U -w $otp\"\n     if [ $hostname != $fqdn ]; then\n         OPTS=\"$OPTS --hostname $fqdn\"\n     fi\n     if [ -n \"$realm\" ]; then\n         OPTS=\"$OPTS --realm=$realm\"\n     fi\n     ipa-client-install $OPTS\n   path: /root/setup-ipa-client.sh\n   permissions: '0700'\n   owner: root:root\nruncmd:\n- sh -x /root/setup-ipa-client.sh > /var/log/setup-ipa-client.log 2>&1"
  },
  "join": {
    "krb_realm": "IDM.localdomain",
    "hostname": "overcloud-controller-1.idm.localdomain"
  }
}

versus:


[root@overcloud-controller-1 ~]# curl -s http://169.254.169.254/openstack/2016-10-06/vendor_data2.json | jq .
{
  "static": {
    "cloud-init": "#cloud-config\npackages:\n - python-simplejson\n - ipa-client\n - ipa-admintools\n - openldap-clients\n - hostname\nwrite_files:\n - content: |\n     #!/bin/sh\n     \n     function get_metadata_config_drive {\n         if [ -f /run/cloud-init/status.json ]; then\n             # Get metadata from config drive\n             data=`cat /run/cloud-init/status.json`\n             config_drive=`echo $data | python -c 'import json,re,sys;obj=json.load(sys.stdin);ds=obj.get(\"v1\", {}).get(\"datasource\"); print(re.findall(r\"source=(.*)]\", ds)[0])'`\n             if [[ -b $config_drive ]]; then\n                 temp_dir=`mktemp -d`\n                 mount $config_drive $temp_dir\n                 if [ -f $temp_dir/openstack/latest/vendor_data2.json ]; then\n                     data=`cat $temp_dir/openstack/latest/vendor_data2.json`\n                     umount $config_drive\n                     rmdir $temp_dir\n                 else\n                     umount $config_drive\n                     rmdir $temp_dir\n                 fi\n             else \n                 echo \"Unable to retrieve metadata from config drive.\"\n                 return 1\n             fi\n         else\n             echo \"Unable to retrieve metadata from config drive.\"\n             return 1\n         fi\n     \n         return 0\n     }\n     \n     function get_metadata_network {\n         # Get metadata over the network\n         data=$(timeout 300 /bin/bash -c 'data=\"\"; while [ -z \"$data\" ]; do sleep $[ ( $RANDOM % 10 )  + 1 ]s; data=`curl -s http://169.254.169.254/openstack/2016-10-06/vendor_data2.json 2>/dev/null`; done; echo $data')\n     \n         if [[ $? != 0 ]] ; then\n             echo \"Unable to retrieve metadata from metadata service.\"\n             return 1\n         fi\n     }\n     \n     \n     if ! get_metadata_config_drive; then\n        if ! get_metadata_network; then\n            echo \"FATAL: No metadata available\"\n            exit 1\n        fi\n     fi\n     \n     # Get the instance hostname out of the metadata\n     fqdn=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"hostname\", \"\"))'`\n      \n     if [ -z \"$fqdn\" ]; then\n         echo \"Unable to determine hostname\"\n         exit 1\n     fi\n      \n     realm=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"krb_realm\", \"\"))'`\n     otp=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get(\"join\", {}).get(\"ipaotp\", \"\"))'`\n     \n     hostname=`/bin/hostname -f`\n      \n     # run ipa-client-install\n     OPTS=\"-U -w $otp\"\n     if [ $hostname != $fqdn ]; then\n         OPTS=\"$OPTS --hostname $fqdn\"\n     fi\n     if [ -n \"$realm\" ]; then\n         OPTS=\"$OPTS --realm=$realm\"\n     fi\n     ipa-client-install $OPTS\n   path: /root/setup-ipa-client.sh\n   permissions: '0700'\n   owner: root:root\nruncmd:\n- sh -x /root/setup-ipa-client.sh > /var/log/setup-ipa-client.log 2>&1"
  },
  "join": {
    "krb_realm": "IDM.localdomain",
    "ipaotp": "41cca7447d054e5eaa06b100dac38629", 
    "hostname": "overcloud-controller-1.idm.localdomain"
  }
}


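The ipaotp value is missing from the first output (the configuration drive); only the metadata service returns it. Because get_metadata_config_drive still returns 0 whenever the config drive block device exists, get_metadata_network is never called, and the script runs ipa-client-install with an empty OTP.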
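
To confirm which source is incomplete, each JSON document can be tested for a non-empty "ipaotp" key. The sketch below is only a rough diagnostic and is not part of novajoin: the function names has_ipaotp and report are hypothetical, the JSON samples are trimmed to the "join" section shown above, and the pattern match is a crude stand-in for real JSON parsing with jq or python.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the JSON text carries a non-empty "ipaotp".
# This is a crude string match, not a JSON parser.
has_ipaotp() {
    case $1 in
        *'"ipaotp": ""'*) return 1 ;;   # key present but empty
        *'"ipaotp":'*)    return 0 ;;   # key present with some value
        *)                return 1 ;;   # key absent
    esac
}

# Hypothetical helper: print a one-line verdict for a named source.
report() {
    if has_ipaotp "$2"; then
        echo "$1: ipaotp present"
    else
        echo "$1: ipaotp missing"
    fi
}

# "join" sections copied from the two outputs in this report.
config_drive='{"join": {"krb_realm": "IDM.localdomain", "hostname": "overcloud-controller-1.idm.localdomain"}}'
metadata_service='{"join": {"krb_realm": "IDM.localdomain", "ipaotp": "41cca7447d054e5eaa06b100dac38629", "hostname": "overcloud-controller-1.idm.localdomain"}}'

report "config drive" "$config_drive"
report "metadata service" "$metadata_service"
```

On a real node the two arguments would come from the mounted config drive file and from the curl command shown above, rather than from inline strings.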





Version-Release number of selected component (if applicable):
Latest

How reproducible:
All nodes

Steps to Reproduce:
1. Deploy this overcloud in this environment.

Actual results:
Nodes fail to enroll with IPA

Expected results:
No issues

Additional info:

Comment 7 Grzegorz Grasza 2019-09-04 14:31:04 UTC
My interpretation of the original report is that there are two issues:
 * Metadata is not read from the network when it is missing from the config drive. This is fixed in https://review.opendev.org/#/c/677765/
 * The OTP token is missing from the config-drive metadata. We suspect this happens because a host with the same name was already enrolled in IPA when the node was provisioned, and was only deleted afterwards. We are adding better logging for this case in https://review.opendev.org/#/c/677455/

The host may have been deleted too late, while a new one was already being provisioned. Hosts are deleted from IPA asynchronously, triggered by notifications that nova sends and the novajoin-notifier service receives. Processing of these notifications can be delayed and retried, depending on connectivity between nova <-> rabbitmq <-> novajoin <-> IPA.
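
The first issue can be illustrated with a fallback that accepts config-drive data only when it actually contains an OTP. This is a minimal sketch under stated assumptions, not the code from the linked review: both getters are stubs that set canned data (no drive is mounted and no HTTP request is made), and metadata_is_usable and load_metadata are hypothetical names.

```shell
#!/bin/sh
# Stub standing in for get_metadata_config_drive: it "succeeds" but yields
# metadata without an OTP, as in the original report.
get_metadata_config_drive() {
    data='{"join": {"hostname": "overcloud-controller-1.idm.localdomain"}}'
    return 0
}

# Stub standing in for get_metadata_network; "example-otp" is a placeholder value.
get_metadata_network() {
    data='{"join": {"ipaotp": "example-otp", "hostname": "overcloud-controller-1.idm.localdomain"}}'
    return 0
}

# Hypothetical check: crude string match for a non-empty "ipaotp" value in $data.
metadata_is_usable() {
    case $data in
        *'"ipaotp": ""'*) return 1 ;;
        *'"ipaotp":'*)    return 0 ;;
        *)                return 1 ;;
    esac
}

# Use the config drive only if it delivered usable data; otherwise try the network.
load_metadata() {
    if get_metadata_config_drive && metadata_is_usable; then
        source_used="config drive"
    elif get_metadata_network && metadata_is_usable; then
        source_used="network"
    else
        echo "FATAL: No metadata available"
        return 1
    fi
}

load_metadata && echo "metadata source: $source_used"
```

The design point is that a zero exit status from the config-drive getter is not treated as proof that the data is complete; the content is checked before the network fallback is skipped.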

Comment 16 Alex McLeod 2019-10-31 11:29:01 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 19 errata-xmlrpc 2019-11-07 14:04:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3791