Bug 1361339 - [RFE] improve domain detection method with a more robust one in the bootstrap.py
Status: CLOSED ERRATA
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Bootstrap
Version: 6.2.0
Hardware: x86_64 All
Priority: unspecified  Severity: low
Target Milestone: 6.2.9
Target Release: Unused
Assigned To: Rich Jerrido
QA Contact: jcallaha
Keywords: FutureFeature, Triaged
Depends On:
Blocks: 1426387
Reported: 2016-07-28 17:05 EDT by Reartes Guillermo
Modified: 2017-05-01 09:53 EDT
Fixed In Version: katello-client-bootstrap-1.3.0-1
Clones: 1426387
Last Closed: 2017-05-01 09:53:35 EDT
Type: Bug


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Github | Katello/katello-client-bootstrap/issues/126 | None | None | None | 2016-11-16 17:34 EST
Red Hat Product Errata | RHBA-2017:1191 | normal | SHIPPED_LIVE | Satellite 6.2.9 Async Bug Release | 2017-05-01 13:49:42 EDT

Description Reartes Guillermo 2016-07-28 17:05:30 EDT
Description of problem:

I installed a fresh Sat 6.2 to experiment, since it was released today/yesterday.
I tried to register the first system with the new bootstrap.py program, but it failed because it queried the API for the wrong domain.

Version-Release number of selected component (if applicable):
6.2.0

How reproducible:
Always (but requires a typo in /etc/hosts)

Steps to Reproduce:
[root@rhevm1 ~]# ./bootstrap.py -l admin -s testsat3.example.com -o 'FakeCorp' -L 'Central' -a RHEL6-RHEV-M -g RHEL6-RHEV-M
admin's password:
Foreman Bootstrap Script
This script is designed to register new systems or to migrate an existing system to a Foreman server with Katello
[NOTIFICATION], [2016-07-28 14:55:27], [This system is not registered to RHN. Attempting to register via subscription-manager] 
[NOTIFICATION], [2016-07-28 14:55:27], [Retrieving Client CA Certificate RPMs] 
[RUNNING], [2016-07-28 14:55:27], [rpm -Uvh http://testsat3.example.com/pub/katello-ca-consumer-latest.noarch.rpm] 
Recuperando http://testsat3.example.com/pub/katello-ca-consumer-latest.noarch.rpm
Preparando...               ##################################################
katello-ca-consumer-testsat3##################################################
[SUCCESS], [2016-07-28 14:55:30], [rpm -Uvh http://testsat3.example.com/pub/katello-ca-consumer-latest.noarch.rpm], completed successfully.

[ERROR], [2016-07-28 14:55:31], EXITING: [0 element in array for search key 'name="exmaple.com"' in API '/api/v2/domains'. Please note that all searches are case-sensitive. Fatal error.] failed to execute properly.

Wow, I got confused by this: name="exmaple.com"

I checked and rechecked the Sat 6.2 instance, thinking I had a typo there somewhere. But no, everything was OK on the Sat 6.2 side.
So I modified bootstrap.py, adding some print() statements as a debug measure.

That let me trace it to "FQDN = socket.getfqdn()" in the bootstrap.py program.

And then I read: https://github.com/ansible/ansible/issues/9972

The bootstrap error message is confusing, because:
 * The script had already logged in to the Sat 6.2 instance and could get the domain list via the API.

 * I specified a Host Group in the bootstrap call, and that Host Group has both the correct DOMAIN and SUBNET set.
   So I did not expect bootstrap to go looking for the domain outside of the Sat 6.2 instance (and get a wrong value).
 
 * I was not aware of BZ#1343585: the domain must exist. Now I am. (But in this case the domain exists in Sat 6.2 anyway.)

 * It does not ask for the domain explicitly. (I am not claiming that it should.)

So I went to the system that I was trying to register:

[root@rhevm1 ~]# hostname
rhevm1.example.com
[root@rhevm1 ~]# hostname -f
rhevm1.example.com
[root@rhevm1 etc]# hostname -d
example.com

The hostname is ok, so let's test DNS resolution:

[root@rhevm1 ~]# dig testsat3.example.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.5 <<>> testsat3.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36896
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;testsat3.example.com.          IN      A

;; ANSWER SECTION:
testsat3.example.com.   0       IN      A       192.168.207.93

;; Query time: 0 msec
;; SERVER: 192.168.207.25#53(192.168.207.25)
;; WHEN: Thu Jul 28 16:12:36 2016
;; MSG SIZE  rcvd: 54

# dig rhevm1.example.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.5 <<>> rhevm1.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36379
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;rhevm1.example.com.            IN      A

;; ANSWER SECTION:
rhevm1.example.com.     0       IN      A       192.168.207.47

;; Query time: 0 msec
;; SERVER: 192.168.207.25#53(192.168.207.25)
;; WHEN: Thu Jul 28 16:15:26 2016
;; MSG SIZE  rcvd: 52

Well, DNS resolution does work.
Even uname reports the correct nodename.

[root@rhevm1 ~]# uname -a
Linux rhevm1.example.com 2.6.32-573.12.1.el6.x86_64 #1 SMP Mon Nov 23 12:55:32 EST 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@rhevm1 ~]# uname -n
rhevm1.example.com

Even the config files are ok:

[root@rhevm1 ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=rhevm1.example.com


In the end I found the typo:

[root@rhevm1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.207.47  rhevm1.exmaple.com      rhevm1
192.168.207.46  hyper1.example.com

Well, I never noticed it because everything else (RHEV-M, CFME, other Sat 6.1) was using DNS requests and getting the correct FQDN.

The interesting thing is that "rhevm1" is neither the hostname nor the FQDN of the system (it is the output of 'hostname -s', though).

So I am puzzled by the heuristic/logic that "FQDN = socket.getfqdn()" uses, which seems to purposely avoid the hostname and FQDN and hit an alias.

"rhevm1.example.com" is the output of 'hostname', 'hostname -f', 'uname -a' and 'uname -n', and it is what the config files contain.
"rhevm1.exmaple.com" is not the output of any of those commands and is not in the config files; it appears only in /etc/hosts.

Did bootstrap ever check whatever it thought the FQDN was against a DNS request? ('hostname -d' returns the correct domain.)
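
A likely explanation, based on how CPython implements socket.getfqdn(): it passes the hostname to socket.gethostbyaddr(), which resolves the name to an address and then does a reverse lookup on that address, and it returns the first name in the result that contains a dot. Assuming the usual "files dns" lookup order on the client, the reverse lookup of 192.168.207.47 is answered from /etc/hosts, so the misspelled entry wins. The Python 3 sketch below mirrors that selection logic. first_dotted_name and explain_getfqdn are hypothetical names, and the resolver is a parameter so the behaviour can be reproduced without touching /etc/hosts.

```python
import socket


def first_dotted_name(primary, aliases):
    """Return the first candidate that contains a dot, else the primary name."""
    for candidate in [primary] + list(aliases):
        if "." in candidate:
            return candidate
    return primary


def explain_getfqdn(name=None, reverse_lookup=socket.gethostbyaddr):
    """Approximate socket.getfqdn() with a pluggable resolver (an assumption
    made for illustration; the real function always uses gethostbyaddr)."""
    name = name or socket.gethostname()
    try:
        # gethostbyaddr returns (primary hostname, alias list, address list)
        primary, aliases, _addresses = reverse_lookup(name)
    except OSError:
        # Lookup failed: fall back to the name that was passed in
        return name
    return first_dotted_name(primary, aliases)
```

Fed the hosts entry from this report as (primary "rhevm1.exmaple.com", alias "rhevm1"), the function returns "rhevm1.exmaple.com", even though the name passed in was "rhevm1.example.com".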

Granted, the hosts file is wrong (so I opened this as an RFE and not a bug), but can the reported error message be improved somehow?

I think that a proper DNS sanity check should be performed, or maybe something other than "socket.getfqdn()" should be used.

Isn't it better to use something else instead of "socket.getfqdn()"? Are there more robust alternatives (I mean, ones that are able to catch this or something worse)?
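
One way such a sanity check could look, as a rough sketch rather than what bootstrap.py does: gather the name from several sources and stop when the dotted names disagree. gather_candidates and distinct_fqdns are made-up helper names, and `hostname -f` is assumed to be available on the client.

```python
import socket
import subprocess


def gather_candidates():
    """Collect the host name as reported by several independent sources."""
    candidates = {
        "socket.getfqdn": socket.getfqdn(),
        "socket.gethostname": socket.gethostname(),
    }
    try:
        output = subprocess.check_output(["hostname", "-f"])
        candidates["hostname -f"] = output.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        # hostname binary missing or it failed: just skip this source
        pass
    return candidates


def distinct_fqdns(candidates):
    """Return the sorted set of dotted names; more than one entry means
    the sources disagree and the result should not be trusted blindly."""
    return sorted({name.lower() for name in candidates.values() if "." in name})


if __name__ == "__main__":
    names = distinct_fqdns(gather_candidates())
    if len(names) != 1:
        print("Cannot determine a single FQDN, candidates: %s" % names)
```

On the system from this report the candidates would have been "rhevm1.exmaple.com" (from socket.getfqdn()) and "rhevm1.example.com" (from the hostname commands), so the mismatch would have been visible before any API call was made.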


Actual results:
The error message from the bootstrap script is not very clear; it reads like a problem on the Sat 6.2 side rather than on the client.

Expected results:
A robust method to get the FQDN of the system to be registered.

Additional info:
I tried a quick lookup in the documentation ("Administration Guide" and "Installation Guide") but found no references to bootstrap.py. In the "Administration Guide" the old method is still listed.
I opened both PDFs and searched for the string 'bootstrap' and found no results.


Is it possible that the usage of the bootstrap.py script is not currently documented, or did I just miss it?
If it is already documented, a note emphasizing to check the hosts file for typos should probably be added. (Just testing proper DNS resolution would not be enough.)

Cheers.
Comment 1 Rich Jerrido 2017-02-23 14:33:00 EST
Fixed in this upstream commit - https://github.com/Katello/katello-client-bootstrap/commit/49ec1ae44a7463f5f615ad6d59fccd21d9da6bda
Comment 2 pm-sat@redhat.com 2017-02-23 16:08:55 EST
Please add verification steps for this bug to help QE verify.
Comment 3 Rich Jerrido 2017-02-27 05:36:53 EST
Verification steps for this bug are the same as those here - https://bugzilla.redhat.com/show_bug.cgi?id=1425606#c4
Comment 4 Rich Jerrido 2017-03-13 19:45:20 EDT
We may want to rebase to katello-client-bootstrap-1.3.0 (https://github.com/Katello/katello-client-bootstrap/releases/tag/1.3.0) to address this.
Comment 5 jcallaha 2017-04-07 16:47:32 EDT
Verified in Satellite 6.2.9 Snap 2

The script now catches both a short hostname and a short name passed as the FQDN.

-bash-4.1# docker run -it -h shawty ch-d:bootstrap /bin/bash

[root@shawty ~]# ./bootstrap.py -s mgmt5.rhq.lab.eng.bos.redhat.com -o 'Default Organization' -g basic -a basickey -L 'Default Location'
Foreman Bootstrap Script
This script is designed to register new systems or to migrate an existing system to a Foreman server with Katello
We could not determine the domain of this machine, most probably `hostname -f` does not return the FQDN.
This can lead to Puppet missbehaviour and thus the script will terminate now.
You can override this by passing one of the following
	--force - to disable all checking
	--skip-puppet - to omit installing the puppet agent

[root@shawty ~]# hostname
shawty

[root@shawty ~]# ./bootstrap.py -s mgmt5.rhq.lab.eng.bos.redhat.com -o 'Default Organization' -g basic -a basickey -L 'Default Location' --fqdn $(hostname)
Foreman Bootstrap Script
This script is designed to register new systems or to migrate an existing system to a Foreman server with Katello
We could not determine the domain of this machine, most probably `hostname -f` does not return the FQDN.
This can lead to Puppet missbehaviour and thus the script will terminate now.
You can override this by passing one of the following
	--force - to disable all checking
	--skip-puppet - to omit installing the puppet agent
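
The check exercised above amounts to refusing a name that has no domain part. A minimal sketch of that idea follows; it is not the code from katello-client-bootstrap, and domain_of is a hypothetical helper.

```python
def domain_of(fqdn):
    """Return the domain part of an FQDN, or None for a short name."""
    # Drop surrounding whitespace and a trailing root dot, then split once
    host, dot, domain = fqdn.strip().rstrip(".").partition(".")
    if not host or not dot or not domain:
        return None
    return domain
```

With this helper, domain_of("shawty") gives None, which is the case where a script would stop, while domain_of("rhevm1.example.com") gives "example.com".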
Comment 7 errata-xmlrpc 2017-05-01 09:53:35 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1191
