1770030 – [4.4.0-5] after deploy of HE the defined fqdn on the host changed to localhost.localdomain

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	ovirt-ansible-collection
Classification:	oVirt
Component:	hosted-engine-setup
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	ovirt-4.4.0
Target Release:	1.0.34
Assignee:	Dominik Holler
QA Contact:	Nikolai Sednev
Docs Contact:
URL:
Whiteboard:
Depends On:	1779182
Blocks:	1701490

Reported:	2019-11-07 23:12 UTC by Kobi Hakimi
Modified:	2019-12-23 08:05 UTC (History)
CC List:	9 users (show)
Fixed In Version:	ovirt-ansible-hosted-engine-setup-1.0.34
Clone Of:
Environment:
Last Closed:	2019-12-23 08:05:10 UTC
oVirt Team:	Network
Embargoed:
Flags:	sbonazzo: ovirt-4.4?

Attachments
host logs (270.26 KB, application/gzip) 2019-11-07 23:12 UTC, Kobi Hakimi

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	oVirt ovirt-ansible-hosted-engine-setup pull 275	0	'None'	closed	Avoid using TLS access on libvirtd	2020-09-08 13:00:29 UTC

Description Kobi Hakimi 2019-11-07 23:12:45 UTC

Created attachment 1633813 [details]
host logs

Description of problem:
[4.4.0-4] after deploy of HE the defined fqdn on the host changed to localhost.localdomain

Version-Release number of selected component (if applicable):
ovirt-ansible-engine-setup-1.1.9-1.el8ev.noarch
python3-ovirt-setup-lib-1.3.0-0.0.master.20190419120545.gitfbe1cbd.el8ev.noarch
ovirt-hosted-engine-setup-2.4.0-0.1.master.20191104160243.git0c51343.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Run the command "hostname" to make sure the host FQDN is defined as expected.
2. Deploy the HE over iSCSI.


Actual results:
the defined fqdn on the host changed to localhost.localdomain

Expected results:
leave the fqdn as it was

Additional info:
also happened on deploy of HE over NFS

Comment 1 Martin Perina 2019-11-14 09:28:21 UTC

host deploy flow doesn't alter the hostname in any way

Comment 2 Sandro Bonazzola 2019-11-20 08:15:42 UTC

How did you set host name? just plain hostname command or hostnamectl?

Comment 3 Kobi Hakimi 2019-11-20 14:49:34 UTC

we reprovision the host with foreman and when I run hostnamectl before the deploy I got:
[root@caracal01 ~]# hostnamectl
   Static hostname: localhost.localdomain
Transient hostname: caracal01.lab.eng.tlv2.redhat.com
         Icon name: computer-server
           Chassis: server
        Machine ID: f7cff39c87324de9a5e42ceb60139e7f
           Boot ID: c16d8f55216f4a38ab39779c052be6cb
  Operating System: Red Hat Enterprise Linux 8.1 (Ootpa)
       CPE OS Name: cpe:/o:redhat:enterprise_linux:8.1:GA
            Kernel: Linux 4.18.0-147.el8.x86_64
      Architecture: x86-64
[root@caracal01 ~]#

after hosted engine deploy:
[root@caracal01 ~]# hostnamectl
   Static hostname: localhost.localdomain
         Icon name: computer-server
           Chassis: server
        Machine ID: f7cff39c87324de9a5e42ceb60139e7f
           Boot ID: c16d8f55216f4a38ab39779c052be6cb
  Operating System: Red Hat Enterprise Linux 8.1 (Ootpa)
       CPE OS Name: cpe:/o:redhat:enterprise_linux:8.1:GA
            Kernel: Linux 4.18.0-147.el8.x86_64
      Architecture: x86-64
[root@caracal01 ~]#

So I think the transient hostname was set/updated by DHCP in the beginning.
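
The before/after comparison above can be checked mechanically. A minimal sketch, assuming the `hostnamectl` output layout shown in this comment; the function names `static_hostname` and `is_placeholder` are hypothetical and used for illustration only.

```shell
# Extract the "Static hostname" value from `hostnamectl` status output read on stdin.
static_hostname() {
    awk -F': ' '/^[[:space:]]*Static hostname:/ { print $2; exit }'
}

# Succeeds when the given name is empty or a localhost placeholder.
is_placeholder() {
    case "$1" in
        ""|localhost|localhost.localdomain) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a host (not executed here):
#   name=$(hostnamectl | static_hostname)
#   is_placeholder "$name" && echo "static hostname is not set: '$name'"
```

Run before and after the deploy, this would flag a host whose only real name is the transient one, which appears to be the situation shown in both outputs above.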

Comment 4 Kobi Hakimi 2019-11-21 15:45:01 UTC

according to @simone maybe as a side effect of this issue the hosted engine deploy failed with the following error:
16:42:38 TASK [ovirt.hosted_engine_setup : Shutdown local VM] ***************************
16:42:39 An exception occurred during task execution. To see the full traceback, use -vvv. The error was: libvirt.libvirtError: unable to connect to server at 'lynx22.lab.eng.tlv2.redhat.com:16514': Connection refused
16:42:39 fatal: [lynx22.lab.eng.tlv2.redhat.com]: FAILED! => {"changed": false, "msg": "unable to connect to server at 'lynx22.lab.eng.tlv2.redhat.com:16514': Connection refused"}

in code it failed in:
https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/blob/master/tasks/create_target_vm/03_hosted_engine_final_tasks.yml#L90

as you can see in:
https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-ge-deploy-4.4/123/console

Comment 5 Evgeny Slutsky 2019-11-25 16:26:30 UTC

it seems the hosted engine deployment failed early,
libvirt tls socket activation failed on port 16514,
looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1752837

we need to work with libvirt > 5.6.0-6

Comment 6 Evgeny Slutsky 2019-11-26 09:28:51 UTC

after upgrading to package from https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1013923

libvirt-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64


tls socket activation is working:
virsh -c qemu+tls://lynx01.lab.eng.tlv2.redhat.com:16514/system list
 Id   Name                State
-----------------------------------
 1    HostedEngineLocal   running
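
Before running `virsh` over TLS as above, a plain TCP probe can help tell a "Connection refused" apart from a TLS or libvirt-level failure. A minimal sketch, assuming `bash` and coreutils `timeout` are installed; `port_open` is a hypothetical helper name, and 16514 is the port from the error in comment 4.

```shell
# Succeeds if a TCP connection to host $1, port $2 can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so it is invoked through bash explicitly.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (host name taken from this comment; not executed here):
#   port_open lynx01.lab.eng.tlv2.redhat.com 16514 && echo "libvirt TLS port is reachable"
```

A failing probe only shows that nothing accepted the connection; it does not say why socket activation failed.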

Comment 7 Sandro Bonazzola 2019-11-26 09:33:30 UTC

Seems fixed by https://access.redhat.com/errata/RHBA-2019:3723.
Moving to QA to verify it.

Comment 8 Lukas Svaty 2019-11-29 12:16:15 UTC

Failed in libvirt-5.6.0-6.1.module+el8.1.0+4754+8d38b36b.x86_64

12:41:03 < tiraboschi> dfodor: sbonazzo: the same even with libvirt-5.6.0-6.1.module+el8.1.0+4754+8d38b36b.x86_64

Comment 9 Jiri Macku 2019-12-02 15:56:28 UTC

The following combination 

libvirt 5.6.0-6.1
and
ovirt-ansible-hosted-engine-setup-1.0.34-1.el8ev.noarch.rpm

fixes this issue.

Comment 10 Michael Burman 2019-12-03 09:58:13 UTC

The issue seems to reproduce without HE at all.
Add host to rhv 4.4.0-6 over hostname, the hostname is overridden and it's now localhost.localdomain
Should I open a separate bug?

Comment 11 Evgeny Slutsky 2019-12-03 10:06:26 UTC

(In reply to Michael Burman from comment #10)
> The issue seems to reproduce without HE at all.
> Add host to rhv 4.4.0-6 over hostname, the hostname is overridden and it's
> now localhost.localdomain
> Should I open a separate bug?

Yes please,
can you also run hostnamectl, before and after the host deployment.

Comment 12 Michael Burman 2019-12-03 13:19:21 UTC

(In reply to Evgeny Slutsky from comment #11)
> (In reply to Michael Burman from comment #10)
> > The issue seems to reproduce without HE at all.
> > Add host to rhv 4.4.0-6 over hostname, the hostname is overridden and it's
> > now localhost.localdomain
> > Should I open a separate bug?
> 
> Yes please,
> can you also run hostnamectl, before and after the host deployment.

New bug for the non HE scenario. All info is there. tnx

Comment 13 Michael Burman 2019-12-03 13:21:09 UTC

(In reply to Michael Burman from comment #12)
> (In reply to Evgeny Slutsky from comment #11)
> > (In reply to Michael Burman from comment #10)
> > > The issue seems to reproduce without HE at all.
> > > Add host to rhv 4.4.0-6 over hostname, the hostname is overridden and it's
> > > now localhost.localdomain
> > > Should I open a separate bug?
> > 
> > Yes please,
> > can you also run hostnamectl, before and after the host deployment.
> 

New bug for the non HE scenario. All info is there. tnx
BZ 1779182

Comment 14 Dominik Holler 2019-12-04 14:42:36 UTC

Would setting the static host name to the transient host name be a workaround for you, until nmstate/NetworkManager is enabled by default?

Comment 15 Kobi Hakimi 2019-12-04 16:15:30 UTC

(In reply to Dominik Holler from comment #14)
> Would setting the static host name to the transient host name be a
> workaround for you, until nmstate/NetworkManager is enabled by default?

After https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/pull/275 was merged and we added the following task before deploy:
shell: hostnamectl set-hostname {{ inventory_hostname }}
 
I don't have this issue anymore.
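
The pre-deploy task above could be made conditional so that a correctly provisioned host is left untouched. A rough sketch in shell, assuming the desired FQDN is supplied by the caller (in the task above it comes from `inventory_hostname`); `hostname_fix_cmd` is a hypothetical helper that only prints the command it would run.

```shell
# Print the hostnamectl command needed to repair the static hostname,
# or print nothing if the current static name is already a real name.
#   $1 = current static hostname, $2 = desired FQDN
hostname_fix_cmd() {
    case "$1" in
        ""|localhost|localhost.localdomain)
            printf 'hostnamectl set-hostname %s\n' "$2" ;;
    esac
}

# Usage on a host (not executed here):
#   cmd=$(hostname_fix_cmd "$(hostnamectl --static)" "caracal01.lab.eng.tlv2.redhat.com")
#   [ -n "$cmd" ] && sh -c "$cmd"
```

Printing the command instead of running it keeps the decision easy to review in CI logs before anything on the host is changed.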

Comment 16 Dominik Holler 2019-12-11 08:30:50 UTC

Kobi, can you please check if the static hostname is set in the kickstart file used to deploy the host?
The kickstart file may contain a line like 
network --bootproto dhcp --hostname xxx.xx.xxx.tlv.redhat.com --device=xxx:xx:xx:xx:xx:xx

If such a line or a similar one is not in the kickstart file (or the line is commented out), how is the static hostname set?

Comment 17 Jiri Macku 2019-12-11 09:16:58 UTC

I checked all our kickstart templates used to provision host based on RHEL 8.1 (including rhvh-4.4).

We do have this line in all our kickstart files and this line is not commented out.

Comment 18 Dominik Holler 2019-12-11 09:52:23 UTC

(In reply to Jiri Macku from comment #17)
> I checked all our kickstart templates used to provision host based on RHEL
> 8.1 (including rhvh-4.4).
> 
> We do have this line in all our kickstart files and this line is not
> commented out.

I understood that this answer is a generic one, not about the affected host.
Can you please check if the issue reproduces if the /etc/hosts file contains
the expected hostname before starting the RHV deployment?

Comment 19 Dominik Holler 2019-12-11 10:31:45 UTC

(In reply to Dominik Holler from comment #18)
> (In reply to Jiri Macku from comment #17)
> > I checked all our kickstart templates used to provision host based on RHEL
> > 8.1 (including rhvh-4.4).
> > 
> > We do have this line in all our kickstart files and this line is not
> > commented out.
> 
> I understood that this answer is a generic one, not about the affected host.
> Can you please check if the issue reproduces if the /etc/hosts file contains

should be /etc/hostname

> the expected hostname before starting the RHV deployment?

Comment 20 Kobi Hakimi 2019-12-11 12:10:34 UTC

after mburman and I checked it, we found out that indeed the following line:
network --bootproto dhcp --hostname xxx.xx.xxx.tlv.redhat.com --device=xxx:xx:xx:xx:xx:xx

was commented out in the rhel-8.1 kickstart file.
The kickstart was fixed, and according to mburman we don't see this issue anymore (regular env).
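
A check along these lines could catch this kind of kickstart regression before a host is provisioned. A minimal sketch, assuming the kickstart file is available locally and uses the `network ... --hostname` form quoted above; `ks_sets_hostname` is a hypothetical helper name.

```shell
# Succeeds if the kickstart file $1 has an active (not commented out)
# "network" line that carries a --hostname option.
ks_sets_hostname() {
    grep -Eq '^[[:space:]]*network[[:space:]].*--hostname([[:space:]=])' "$1"
}

# Usage (the path is a placeholder; not executed here):
#   ks_sets_hostname /path/to/rhel-8.1.ks || echo "kickstart does not set a static hostname"
```

The pattern is anchored at the start of the line, so a line beginning with `#` does not match.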

Comment 21 Dominik Holler 2019-12-19 10:03:53 UTC

Kobi, can this bug be closed now?

Comment 22 Kobi Hakimi 2019-12-22 09:40:48 UTC

(In reply to Dominik Holler from comment #21)
> Kobi, can this bug be closed now?

Yep, from our side, you can close it.
Thanks!!

Comment 23 Dominik Holler 2019-12-23 08:05:10 UTC

Kobi, thanks for checking.
