Bug 1419906

Summary: 4.1 RHEVH hostname changes to localhost.localdomain after Vdsm acquires an NM configured interface
Product: [oVirt] vdsm Reporter: Nikolai Sednev <nsednev>
Component: General Assignee: Edward Haas <edwardh>
Status: CLOSED CURRENTRELEASE QA Contact: Nikolai Sednev <nsednev>
Severity: medium Docs Contact:
Priority: medium    
Version: --- CC: bsanford, bugs, cshao, danken, didi, edwardh, mburman, nsednev, rbarry, usurse, weiwang, ycui, ylavi, yzhao
Target Milestone: ovirt-4.2.0 Keywords: TestOnly, Triaged
Target Release: --- Flags: ylavi: ovirt-4.2+
cshao: testing_plan_complete+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1422610 (view as bug list) Environment:
Last Closed: 2017-12-20 11:11:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1422610    
Bug Blocks: 132679, 1475943    
Attachments:
Description Flags
sosreport-localhost.localdomain-20170207132736.tar.xz none

Description Nikolai Sednev 2017-02-07 11:30:09 UTC
Description of problem:
4.1 RHEVH hostname changes to localhost.localdomain after HE successful deployment.

[root@alma03 ~]# hostname
localhost.localdomain
[root@alma03 ~]# hostnamectl status
   Static hostname: localhost.localdomain
         Icon name: computer
        Machine ID: 7d6810590b6f4bcb9899c396fac42315
           Boot ID: 50e260a380d645869099b385346070ca
  Operating System: Red Hat Virtualization Host 4.1 (el7.3)
       CPE OS Name: cpe:/o:redhat:enterprise_linux:7.3:GA:hypervisor
            Kernel: Linux 3.10.0-514.6.1.el7.x86_64
      Architecture: x86-64


Version-Release number of selected component (if applicable):
Host:
rhvm-appliance-4.1.20170126.0-1.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-hosted-engine-ha-2.1.0.1-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0.1-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-host-deploy-1.6.0-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-node-ng-nodectl-4.1.0-0.20170104.1.el7.noarch
libvirt-client-2.0.0-10.el7_3.4.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.3.x86_64
vdsm-4.19.4-1.el7ev.x86_64
sanlock-3.4.0-1.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-setup-lib-1.1.0-1.el7ev.noarch
Linux version 3.10.0-514.6.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Sat Dec 10 11:15:38 EST 2016
Linux localhost.localdomain 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 7.3



How reproducible:
100%

Steps to Reproduce:
1.Deploy HE on RHEVH.

Actual results:
Once HE successfully deployed, hostname changed to localhost.localdomain.

Expected results:
The hostname should be inherited from DNS resolution once the IP address has been received from DHCP.

Additional info:

Further to our conversation with Simone:
alma03 NetworkManager[1024]: <info>  [1486462328.2852] policy: setting system hostname to 'alma03.qa.lab.tlv.redhat.com' (from address lookup)
alma03 nm-dispatcher: req:3 'up' [p1p1]: new request (4 scripts)
alma03 systemd-hostnamed: Changed host name to 'alma03.qa.lab.tlv.redhat.com'
alma03 nm-dispatcher: req:4 'hostname': new request (4 scripts)
alma03 nm-dispatcher: req:3 'up' [p1p1]: start running ordered scripts...
    ....
alma03 systemd: Started Hostname Service.
alma03 NetworkManager[988]: <info>  [1486462495.6054] settings: hostname: using hostnamed
alma03 NetworkManager[988]: <info>  [1486462495.6054] settings: hostname changed from (none) to "localhost.localdomain"

Initially NetworkManager set the hostname from an address lookup,
but that no longer works, so it falls back to "localhost.localdomain".
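
The hostname transitions in logs like the ones above can be pulled out mechanically. The following is a minimal sketch, assuming the "policy: setting system hostname to ..." message format that appears in the NetworkManager logs in this report; other NetworkManager versions may word the message differently. The function name and the regular expression are illustrative, not part of any NetworkManager tooling.

```python
import re

# Assumed message format, taken from the log lines quoted in this report:
#   policy: setting system hostname to 'NAME' (REASON)
POLICY_RE = re.compile(r"policy: setting system hostname to '([^']+)' \(([^)]+)\)")


def hostname_policy_events(log_text):
    """Return a (hostname, reason) tuple for each hostname policy line."""
    return [match.groups() for match in POLICY_RE.finditer(log_text)]


# Two lines copied from the logs in this report.
SAMPLE = """\
alma03 NetworkManager[1024]: <info>  [1486462328.2852] policy: setting system hostname to 'alma03.qa.lab.tlv.redhat.com' (from address lookup)
Feb 07 12:26:06 alma03.qa.lab.tlv.redhat.com NetworkManager[988]: <info>  [1486463166.8100] policy: setting system hostname to 'localhost.localdomain' (no default device)
"""

for name, reason in hostname_policy_events(SAMPLE):
    print(f"{name}: {reason}")
```

Listing the reasons side by side makes the switch from "from address lookup" to "no default device" easy to spot in a long log.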

Comment 1 Nikolai Sednev 2017-02-07 11:34:48 UTC
Adding more detailed reproduction steps here:
1)Reprovisioned to fresh 4.1 RHEVH.
2)This is the defaults on host:
[root@alma03 ~]# cat /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf 
[environment:init]
VDSM/disableNetworkManager=bool:True
3)Changed the setting so that NM is not turned off, and rebooted:
[root@alma03 ~]# vi /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf 
[root@alma03 ~]# cat /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf 
[environment:init]
VDSM/disableNetworkManager=bool:False
[root@alma03 ~]# reboot
Connection to alma03.qa.lab.tlv.redhat.com closed by remote host.
Connection to alma03.qa.lab.tlv.redhat.com closed.
4)Once rebooted, checked that value stayed False:
# cat /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf 
[environment:init]
VDSM/disableNetworkManager=bool:False
5)Manually installed rhvm-appliance-4.1.20170126.0-1.el7ev.noarch on host.
6)Started the HE deployment over NFS from Cockpit.
7)Hosted Engine Setup successfully completed!
# cat /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf
[environment:init]
VDSM/disableNetworkManager=bool:False
You have mail in /var/spool/mail/root
# systemctl status NetworkManager -l
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-02-07 12:14:55 IST; 24min ago
     Docs: man:NetworkManager(8)
 Main PID: 988 (NetworkManager)
   CGroup: /system.slice/NetworkManager.service
           └─988 /usr/sbin/NetworkManager --no-daemon

Feb 07 12:26:06 alma03.qa.lab.tlv.redhat.com NetworkManager[988]: <info>  [1486463166.8094] manager: NetworkManager state is now DISCONNECTED
Feb 07 12:26:06 alma03.qa.lab.tlv.redhat.com NetworkManager[988]: <info>  [1486463166.8100] policy: setting system hostname to 'localhost.localdomain' (no default device)
Feb 07 12:26:06 alma03.qa.lab.tlv.redhat.com NetworkManager[988]: <info>  [1486463166.8144] policy: setting system hostname to 'localhost.localdomain' (no default device)
Feb 07 12:26:08 localhost.localdomain NetworkManager[988]: <info>  [1486463168.5258] manager: (ovirtmgmt): new Bridge device (/org/freedesktop/NetworkManager/Devices/6)
Feb 07 12:26:11 localhost.localdomain NetworkManager[988]: <info>  [1486463171.5281] device (p1p1): link connected
Feb 07 12:26:12 localhost.localdomain NetworkManager[988]: <info>  [1486463172.0090] device (ovirtmgmt): link connected
Feb 07 12:30:59 localhost.localdomain NetworkManager[988]: <info>  [1486463459.7623] manager: (vnet0): new Tun device (/org/freedesktop/NetworkManager/Devices/7)
Feb 07 12:30:59 localhost.localdomain NetworkManager[988]: <info>  [1486463459.8038] device (vnet0): state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
Feb 07 12:30:59 localhost.localdomain NetworkManager[988]: <info>  [1486463459.8047] device (vnet0): state change: unavailable -> disconnected (reason 'none') [20 30 0]
Feb 07 12:38:29 localhost.localdomain NetworkManager[988]: <info>  [1486463909.7126] device (vnet0): state change: disconnected -> unmanaged (reason 'unmanaged') [30 10 3]
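
The value checked in steps 2 to 4 can also be read programmatically. This is a rough sketch that assumes the file can be treated as INI text with "=" as the only delimiter and a "type:value" payload, as in the contents shown above; the real host-deploy tooling has its own parser, and the function name here is hypothetical.

```python
import configparser

# Contents as shown in this comment.
SAMPLE = """\
[environment:init]
VDSM/disableNetworkManager=bool:False
"""


def network_manager_disabled(conf_text):
    """Return True if the config asks for NetworkManager to be disabled."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the key's original case
    parser.read_string(conf_text)
    raw = parser.get("environment:init", "VDSM/disableNetworkManager")
    # The value carries a type prefix, e.g. "bool:False".
    type_name, _, value = raw.partition(":")
    if type_name != "bool":
        raise ValueError(f"unexpected type prefix: {type_name!r}")
    return value == "True"


print(network_manager_disabled(SAMPLE))
```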

Comment 2 Nikolai Sednev 2017-02-07 11:35:45 UTC
Created attachment 1248371 [details]
sosreport-localhost.localdomain-20170207132736.tar.xz

Comment 3 Nikolai Sednev 2017-02-07 13:08:41 UTC
In the UI, "localhost.localdomain" also appears under the "Name" of the host.

Comment 4 Red Hat Bugzilla Rules Engine 2017-02-07 15:24:41 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 5 Yaniv Lavi 2017-02-08 09:45:24 UTC
Did you ever encounter this issue?

Comment 6 Nikolai Sednev 2017-02-08 13:50:13 UTC
This issue does not happen on 3.6 RHEVH or on 4.0 RHEVH, only on 4.1 RHEVH, where it reproduces on all four hosts.
I don't see any infrastructure issue here.

Comment 7 Edward Haas 2017-02-15 11:45:09 UTC
I suspect that NM is setting the hostname based on the DHCP response but we do not support that using ifcfg.

Please collect and provide the following information from the host before deploying it:
- ifcfg files.
- nmcli con show <connection name on which the management IP is set>

After the deploy, we need the same ifcfg files, to check the differences.

Comment 8 Michael Burman 2017-02-15 12:22:12 UTC
The hostname changes to localhost much earlier; there is no need to wait for the HE setup to end.
It changes to localhost as soon as vdsm configures the management bridge.
After restarting the network service, the correct hostname is back.

[root@orchid-vds1 ~]# nmcli c s
NAME    UUID                                  TYPE            DEVICE 
enp4s0  d3cf5b62-2164-4100-9008-f227ab96f978  802-3-ethernet  enp4s0 
enp6s0  ded8ca78-089d-4433-b5a3-bc6eb54b9d1d  802-3-ethernet  --     
ens1f0  70b219fa-91cb-4281-a2e1-f63321a8c046  802-3-ethernet  --     
ens1f1  70935ce0-18db-4557-b30a-ae5403fd4b5f  802-3-ethernet  --     
[root@orchid-vds1 ~]# nmcli c s id enp4s0
connection.id:                          enp4s0
connection.uuid:                        d3cf5b62-2164-4100-9008-f227ab96f978
connection.stable-id:                   --
connection.interface-name:              enp4s0
connection.type:                        802-3-ethernet
connection.autoconnect:                 yes
connection.autoconnect-priority:        0
connection.timestamp:                   1487158384
connection.read-only:                   no
connection.permissions:                 
connection.zone:                        --
connection.master:                      --
connection.slave-type:                  --
connection.autoconnect-slaves:          -1 (default)
connection.secondaries:                 
connection.gateway-ping-timeout:        0
connection.metered:                     unknown
connection.lldp:                        -1 (default)
802-3-ethernet.port:                    --
802-3-ethernet.speed:                   0
802-3-ethernet.duplex:                  --
802-3-ethernet.auto-negotiate:          yes
802-3-ethernet.mac-address:             --
802-3-ethernet.cloned-mac-address:      --
802-3-ethernet.generate-mac-address-mask:--
802-3-ethernet.mac-address-blacklist:   
802-3-ethernet.mtu:                     auto
802-3-ethernet.s390-subchannels:        
802-3-ethernet.s390-nettype:            --
802-3-ethernet.s390-options:            
802-3-ethernet.wake-on-lan:             1 (default)
802-3-ethernet.wake-on-lan-password:    --
ipv4.method:                            auto
ipv4.dns:                               
ipv4.dns-search:                        
ipv4.dns-options:                       (default)
ipv4.dns-priority:                      0
ipv4.addresses:                         
ipv4.gateway:                           --
ipv4.routes:                            
ipv4.route-metric:                      -1
ipv4.ignore-auto-routes:                no
ipv4.ignore-auto-dns:                   no
ipv4.dhcp-client-id:                    --
ipv4.dhcp-timeout:                      0
ipv4.dhcp-send-hostname:                yes
ipv4.dhcp-hostname:                     --
ipv4.dhcp-fqdn:                         --
ipv4.never-default:                     no
ipv4.may-fail:                          yes
ipv4.dad-timeout:                       -1 (default)
ipv6.method:                            auto
ipv6.dns:                               
ipv6.dns-search:                        
ipv6.dns-options:                       (default)
ipv6.dns-priority:                      0
ipv6.addresses:                         
ipv6.gateway:                           --
ipv6.routes:                            
ipv6.route-metric:                      -1
ipv6.ignore-auto-routes:                no
ipv6.ignore-auto-dns:                   no
ipv6.never-default:                     no
ipv6.may-fail:                          yes
ipv6.ip6-privacy:                       -1 (unknown)
ipv6.addr-gen-mode:                     eui64
ipv6.dhcp-send-hostname:                yes
ipv6.dhcp-hostname:                     --
ipv6.token:                             --
GENERAL.NAME:                           enp4s0
GENERAL.UUID:                           d3cf5b62-2164-4100-9008-f227ab96f978
GENERAL.DEVICES:                        enp4s0
GENERAL.STATE:                          activated
GENERAL.DEFAULT:                        yes
GENERAL.DEFAULT6:                       yes
GENERAL.VPN:                            no
GENERAL.ZONE:                           --
GENERAL.DBUS-PATH:                      /org/freedesktop/NetworkManager/ActiveConnection/1
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/Settings/1
GENERAL.SPEC-OBJECT:                    /
GENERAL.MASTER-PATH:                    --
IP4.ADDRESS[1]:                         10.35.128.22/24
IP4.GATEWAY:                            10.35.128.254
IP4.ROUTE[1]:                           dst = 10.35.28.1/32, nh = 10.35.128.254, mt = 100
IP4.DNS[1]:                             10.35.64.1
IP4.DOMAIN[1]:                          qa.lab.tlv.redhat.com
IP4.DOMAIN[2]:                          lab.tlv.redhat.com
IP4.DOMAIN[3]:                          tlv.redhat.com
IP4.DOMAIN[4]:                          redhat.com
DHCP4.OPTION[1]:                        requested_classless_static_routes = 1
DHCP4.OPTION[2]:                        requested_rfc3442_classless_static_routes = 1
DHCP4.OPTION[3]:                        subnet_mask = 255.255.255.0
DHCP4.OPTION[4]:                        requested_subnet_mask = 1
DHCP4.OPTION[5]:                        domain_name_servers = 10.35.64.1
DHCP4.OPTION[6]:                        ip_address = 10.35.128.22
DHCP4.OPTION[7]:                        filename = pxelinux.0
DHCP4.OPTION[8]:                        requested_static_routes = 1
DHCP4.OPTION[9]:                        dhcp_server_identifier = 10.35.28.1
DHCP4.OPTION[10]:                       requested_nis_servers = 1
DHCP4.OPTION[11]:                       requested_time_offset = 1
DHCP4.OPTION[12]:                       time_offset = 1
DHCP4.OPTION[13]:                       broadcast_address = 10.35.128.255
DHCP4.OPTION[14]:                       requested_interface_mtu = 1
DHCP4.OPTION[15]:                       requested_domain_name_servers = 1
DHCP4.OPTION[16]:                       dhcp_message_type = 5
DHCP4.OPTION[17]:                       requested_broadcast_address = 1
DHCP4.OPTION[18]:                       routers = 10.35.128.254
DHCP4.OPTION[19]:                       requested_domain_name = 1
DHCP4.OPTION[20]:                       domain_name = qa.lab.tlv.redhat.com lab.tlv.redhat.com tlv.redhat.com redhat.com
DHCP4.OPTION[21]:                       requested_routers = 1
DHCP4.OPTION[22]:                       expiry = 1487194733
DHCP4.OPTION[23]:                       requested_wpad = 1
DHCP4.OPTION[24]:                       requested_nis_domain = 1
DHCP4.OPTION[25]:                       requested_ms_classless_static_routes = 1
DHCP4.OPTION[26]:                       network_number = 10.35.128.0
DHCP4.OPTION[27]:                       requested_domain_search = 1
DHCP4.OPTION[28]:                       next_server = 10.35.70.30
DHCP4.OPTION[29]:                       requested_ntp_servers = 1
DHCP4.OPTION[30]:                       ntp_servers = 10.35.28.1 10.35.255.6
DHCP4.OPTION[31]:                       dhcp_lease_time = 43200
DHCP4.OPTION[32]:                       requested_host_name = 1
IP6.ADDRESS[1]:                         2620:52:0:2380:214:5eff:fe17:d5b0/64
IP6.ADDRESS[2]:                         fe80::214:5eff:fe17:d5b0/64
IP6.GATEWAY:                            fe80:52:0:2380::fe
IP6.ROUTE[1]:                           dst = 2620:52:0:2380::/64, nh = ::, mt = 100
[root@orchid-vds1 ~]# hostnamectl 
   Static hostname: localhost.localdomain
Transient hostname: orchid-vds1.qa.lab.tlv.redhat.com
         Icon name: computer-server
           Chassis: server
        Machine ID: 9a1ff3992c28470d992270e45ccda5ca
           Boot ID: 6e5149f0a522471cbe793ede1bd83f63
  Operating System: Red Hat Virtualization Host 4.1 (el7.3)
       CPE OS Name: cpe:/o:redhat:enterprise_linux:7.3:GA:hypervisor
            Kernel: Linux 3.10.0-514.6.1.el7.x86_64
      Architecture: x86-64

Edy, if you were correct, then restarting the network service wouldn't bring the hostname back.
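
The state shown by hostnamectl above (a placeholder static hostname with a real transient one) can be detected with a short script. This is only a sketch: it assumes the "Key: value" layout of the hostnamectl output shown in this report, and the helper names and the set of placeholder names are illustrative choices.

```python
# Names treated as "no real static hostname"; an assumption for this sketch.
PLACEHOLDERS = {"", "localhost", "localhost.localdomain"}


def parse_hostnamectl(text):
    """Parse 'Key: value' lines from hostnamectl output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def relies_on_transient_hostname(fields):
    """True if only the transient hostname carries a real name."""
    static = fields.get("Static hostname", "")
    transient = fields.get("Transient hostname", static)
    return static in PLACEHOLDERS and transient not in PLACEHOLDERS


# Lines copied from the hostnamectl output in this comment.
SAMPLE = """\
   Static hostname: localhost.localdomain
Transient hostname: orchid-vds1.qa.lab.tlv.redhat.com
       CPE OS Name: cpe:/o:redhat:enterprise_linux:7.3:GA:hypervisor
"""

print(relies_on_transient_hostname(parse_hostnamectl(SAMPLE)))
```

A host for which this check is true has nothing to fall back on if the transient name is dropped.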

Comment 9 Edward Haas 2017-02-15 16:29:01 UTC
A few seconds after "systemctl restart network", it looks like NM overrides the hostname.

From the messages log:

Feb 15 17:58:01 localhost systemd: Started LSB: Bring up/down networking.
Feb 15 17:58:15 localhost dbus-daemon: dbus[862]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Feb 15 17:58:15 localhost dbus[862]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Feb 15 17:58:15 localhost systemd: Starting Hostname Service...
Feb 15 17:58:15 localhost dbus-daemon: dbus[862]: [system] Successfully activated service 'org.freedesktop.hostname1'
Feb 15 17:58:15 localhost dbus[862]: [system] Successfully activated service 'org.freedesktop.hostname1'
Feb 15 17:58:15 localhost systemd: Started Hostname Service.
Feb 15 17:58:16 localhost NetworkManager[953]: <info>  [1487174296.4969] policy: setting system hostname to 'localhost.localdomain' (no default device)
Feb 15 17:58:16 localhost systemd-hostnamed: Changed host name to 'localhost.localdomain'
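
To put a number on "a few seconds", the delay between the network service start and the NM override can be computed from the syslog timestamps. A minimal sketch, assuming the classic syslog timestamp prefix used in the excerpt above; the year is not present in the log lines and is passed in explicitly (2017, from the date of this comment).

```python
from datetime import datetime


def syslog_time(line, year=2017):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp of a log line."""
    stamp = " ".join(line.split()[:3])  # e.g. "Feb 15 17:58:01"
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


# First and second-to-last lines of the excerpt above.
START = "Feb 15 17:58:01 localhost systemd: Started LSB: Bring up/down networking."
OVERRIDE = (
    "Feb 15 17:58:16 localhost NetworkManager[953]: <info>  [1487174296.4969] "
    "policy: setting system hostname to 'localhost.localdomain' (no default device)"
)

delay = syslog_time(OVERRIDE) - syslog_time(START)
print(delay.total_seconds())  # 15.0
```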

Comment 10 Edward Haas 2017-02-16 10:32:24 UTC
On RHEL, the hostname which is retrieved from DHCP is stored as static on the host.
But on RHEVH it seems this is not the case, triggering this problem.

I think we should check if this can be changed in parallel to the problem fix needed from NM.

Comment 11 Ryan Barry 2017-02-16 14:41:57 UTC
(In reply to Nikolai Sednev from comment #6)
> This issue does not happen not for 3.6RHEVH, not for 4.0RHEVH, but only for
> 4.1RHEVHs on all four hosts.
> I don't see here any infrastructure issue.

The only change in 4.1 RHV-H is that NetworkManager is on. I'd expect the root cause to be somewhere inside NM (and for this to be reproducible on RHEL-H hosts with NM running).

Comment 12 Ryan Barry 2017-02-16 15:07:07 UTC
RHV-H does not touch/change hostnamectl (or anything related to networking), and we haven't since 4.x.

Unfortunately, I don't have time to try to get a reproducer today. But I would expect this bug to be elsewhere.

Can you please post the contents of /etc/hostname on RHEL and RHV-H systems?

If they differ, how is /etc/hostname being set on the RHEL systems? Kickstart? Manually?

Comment 13 Ying Cui 2017-02-16 15:44:14 UTC
Yihui, could you help on comment 5?

Comment 14 cshao 2017-02-20 07:19:55 UTC
Weiwang, could you help with comment 5, since yzhao is on PTO these days?

Comment 15 Yihui Zhao 2017-02-20 15:50:24 UTC
My reproduction steps:
1. Install RHVH4.1 via PXE
2. Run "hostname" and "cat /etc/hostname":
  #hostname
dell-per730-34.lab.eng.pek2.redhat.com
  #cat /etc/hostname
localhost.localdomain

3. Cockpit displays the hostname "localhost.localdomain", so I changed the hostname to "dell-per730-34.lab.eng.pek2.redhat.com" via cockpit.

4. vi /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf 
cat /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf
[environment:init]
VDSM/disableNetworkManager=bool:False

5. Reboot

6. Deploy HE successfully

After step 6:
[root@dell-per730-34 ~]#  systemctl status NetworkManager -l
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2017-02-20 10:01:59 EST; 35min ago
     Docs: man:NetworkManager(8)
 Main PID: 1335 (NetworkManager)
   CGroup: /system.slice/NetworkManager.service
           └─1335 /usr/sbin/NetworkManager --no-daemon

Feb 20 10:06:21 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603181.1225] manager: (ovirtmgmt): new Bridge device (/org/freedesktop/NetworkManager/Devices/9)
Feb 20 10:06:24 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603184.6198] device (em1): link connected
Feb 20 10:06:25 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603185.1455] device (ovirtmgmt): link connected
Feb 20 10:11:36 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603496.6653] manager: (vnet0): new Tun device (/org/freedesktop/NetworkManager/Devices/10)
Feb 20 10:11:36 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603496.6874] device (vnet0): state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
Feb 20 10:11:36 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487603496.6886] device (vnet0): state change: unavailable -> disconnected (reason 'none') [20 30 0]
Feb 20 10:31:00 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487604660.5679] device (vnet0): state change: disconnected -> unmanaged (reason 'unmanaged') [30 10 3]
Feb 20 10:32:13 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487604733.6722] manager: (vnet0): new Tun device (/org/freedesktop/NetworkManager/Devices/11)
Feb 20 10:32:13 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487604733.6833] device (vnet0): state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
Feb 20 10:32:13 dell-per730-34.lab.eng.pek2.redhat.com NetworkManager[1335]: <info>  [1487604733.6840] device (vnet0): state change: unavailable -> disconnected (reason 'none') [20 30 0]


Version-Release number of selected component (if applicable):
ovirt-node-ng-nodectl-4.1.0-0.20170104.1.el7.noarch
cockpit-ovirt-dashboard-0.10.7-0.0.6.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-setup-lib-1.1.0-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-hosted-engine-ha-2.1.0.1-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0.1-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-host-deploy-1.6.0-1.el7ev.noarch
rhvm-appliance-4.1.20170126.0-1.el7ev.4.1.rpm
cockpit-ovirt-dashboard-0.10.7-0.0.6.el7ev.noarch
cockpit-bridge-126-1.el7.x86_64
cockpit-ws-126-1.el7.x86_64
cockpit-storaged-126-1.el7.noarch
cockpit-shell-126-1.el7.noarch
rhvh-4.1-0.20170208.0+1

So, following my reproduction steps, I can't reproduce the bug.

Comment 16 Wei Wang 2017-02-21 02:14:34 UTC
(In reply to Yaniv Dary from comment #5)
> Did you ever encounter this issue?

According to comment #15, we could not reproduce this issue.

Comment 17 Edward Haas 2017-02-21 07:08:55 UTC
(In reply to Wei Wang from comment #16)
> (In reply to Yaniv Dary from comment #5)
> > Did you ever encounter this issue?
> 
> According to comment #15, we cannot encounter this issue again.

Comment #15 has set the hostname statically, which is a workaround to the problem reported in this BZ.
A user who expects the hostname to be retrieved dynamically from DHCP would not set the hostname statically in cockpit (or elsewhere).

Comment 18 Edward Haas 2017-02-21 07:23:28 UTC
Summarizing this BZ status and proposing a workaround to lower severity:

The problem has been detected in the steps where VDSM acquires interfaces from NetworkManager.
When NM unmanages an interface, it cleans up all of the interface's settings, and VDSM then takes over through the ifcfg initscripts and redefines it. NM has the unwanted behaviour of overwriting the hostname, even though the hostname has been retrieved by the ifcfg initscripts on an interface that is not managed by NM.
This only occurs if the static hostname has not been set, so that the host depends solely on the transient hostname (from DHCP).
A host with no static hostname has been seen only on RHV-H.

An NM BZ has been opened for this: https://bugzilla.redhat.com/show_bug.cgi?id=1422610

As a *workaround*, the user can set the static hostname manually (from cockpit for example, see comment #15).

Please confirm that the workaround works for you so we can suggest documenting it (in case an NM fix is not available).
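
The workaround can be expressed as a small check: if the static hostname is still a placeholder, set it explicitly. The sketch below only builds the command and does not run it. Using `hostnamectl set-hostname` in place of cockpit is an assumption of this example, as are the function name and the list of placeholder names.

```python
# Names treated as "static hostname not set"; an assumption for this sketch.
PLACEHOLDERS = {"", "localhost", "localhost.localdomain"}


def workaround_command(etc_hostname_text, fqdn):
    """Return the command that sets the static hostname, or None if a
    real static hostname is already present in /etc/hostname."""
    static = etc_hostname_text.strip()
    if static not in PLACEHOLDERS:
        return None
    return ["hostnamectl", "set-hostname", fqdn]


# /etc/hostname content as reported on the affected hosts; the FQDN is a
# placeholder to be replaced with the host's real name.
print(workaround_command("localhost.localdomain\n", "host.example.com"))
```

The returned list can be passed to `subprocess.run` by a caller that actually wants to apply the change, which has to happen before the Hosted Engine deployment.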

Comment 19 Nikolai Sednev 2017-02-22 12:41:08 UTC
1)alma04.qa.lab.tlv.redhat.com's password: 

  node status: OK
  See `nodectl check` for more information

Admin Console: https://10.35.117.26:9090/ or https://2620:52:0:2340:a236:9fff:fe3b:167c:9090/

2)[root@alma04 ~]# hostname
alma04.qa.lab.tlv.redhat.com
3)[root@alma04 ~]# cat /etc/hostname
localhost.localdomain
4)Cockpit displayed the hostname "localhost.localdomain", so I changed the hostname to "alma04.qa.lab.tlv.redhat.com" via cockpit.
5)Restarted from cockpit.
6)Before HE deployment, once the host had rebooted, "systemctl status NetworkManager -l" showed:
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2017-02-22 14:34:42 IST; 1min 58s ago
     Docs: man:NetworkManager(8)
 Main PID: 974 (NetworkManager)
   CGroup: /system.slice/NetworkManager.service
           ├─ 974 /usr/sbin/NetworkManager --no-daemon
           └─1089 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-p1p1.pid -lf /var/lib/NetworkManager/dhclient-81e02010-5b23-46d0-a51c-58e7c6ebde5b-p1p1.lease -cf /var/lib/NetworkManager/dhclient-p1p1.conf p1p1

Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   gateway 10.35.117.254
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   server identifier 10.35.64.1
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   lease time 86400
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   nameserver '10.35.64.1'
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   nameserver '10.35.255.6'
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4422] dhcp4 (p1p1):   domain name 'qa.lab.tlv.redhat.com'
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4423] dhcp4 (p1p1): state changed unknown -> bound
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4444] policy: set 'p1p1' (p1p1) as default for IPv4 routing and DNS
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com dhclient[1089]: bound to 10.35.117.26 -- renewal in 32727 seconds.
Feb 22 14:34:55 alma04.qa.lab.tlv.redhat.com NetworkManager[974]: <info>  [1487766895.4463] manager: startup complete

7)After rebooting the host, the FQDN remains as set in step 4, both in cockpit and in the CLI.
alma04.qa.lab.tlv.redhat.com's password: 
Last login: Wed Feb 22 14:36:36 2017 from ovpn-116-36.ams2.redhat.com

  node status: OK
  See `nodectl check` for more information

Admin Console: https://10.35.117.26:9090/ or https://2620:52:0:2340:a236:9fff:fe3b:167c:9090/

[root@alma04 ~]# cat /etc/hostname
alma04.qa.lab.tlv.redhat.com
[root@alma04 ~]# hostname
alma04.qa.lab.tlv.redhat.com

Comment 20 Dan Kenigsberg 2017-02-22 16:34:47 UTC
Workaround for this bug: before deploying Hosted Engine, modify the hostname to fully-qualified domain name via cockpit.

Given the simple workaround, this should not be a 4.1-blocker.

Comment 21 Red Hat Bugzilla Rules Engine 2017-02-22 16:34:55 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 22 Yaniv Kaul 2017-02-23 11:18:24 UTC
(In reply to Dan Kenigsberg from comment #20)
> Workaround for this bug: before deploying Hosted Engine, modify the hostname
> to fully-qualified domain name via cockpit.
> 
> Given the simple workaround, this should not be a 4.1-blocker.

So can we move it to 4.1.2? 4.1.3?

Comment 23 Dan Kenigsberg 2017-02-23 11:34:33 UTC
I'm happy with keeping this as a 4.1.1 exception, so we give it higher immediate priority. But nothing horrible would happen if we solve this only in 4.1.3.

(I attempted to drop the blocker flag in comment 21, but the robots were stronger than me)

Comment 25 Yihui Zhao 2017-03-09 07:45:42 UTC
Update:

Can reproduce the issue. 

After HE is deployed successfully, the RHVH host has to be rebooted,
and then the hostname changes to "localhost.localdomain".

[root@localhost ~]# hostname 
localhost.localdomain



Additional info:

If we install RHVH4.1, add the host to the engine, and then reboot the RHVH host, the hostname doesn't change.

Comment 28 Dan Kenigsberg 2017-03-22 08:54:02 UTC
When kickstart is asked to set the host name, we are not affected by this, hence reducing severity.

Comment 29 Ulhas Surse 2017-08-16 01:35:52 UTC
Hi, 

I tried to install RHVH 4.1 and after installation, the hostname was localhost.localdomain though the host was getting IP and hostname from the DNS at the time of installation. 

After installation completed, the hostname was localhost.localdomain, and I observed that the network was not up because the ONBOOT parameter was set to "no", so the interface did not come up to get an IP / hostname assigned. After setting ONBOOT=yes and rebooting, the host was able to take the IP and hostname correctly.
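
A quick way to spot this condition is to parse the ifcfg file and look at ONBOOT. This is a minimal sketch that treats the file as plain KEY=value lines; the sample contents are hypothetical (the actual file is not included in this report), and the helper names are illustrative.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines from an ifcfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def disabled_on_boot(settings):
    """True only if ONBOOT is explicitly set to 'no'."""
    return settings.get("ONBOOT", "").lower() == "no"


# Hypothetical ifcfg contents, for illustration only.
SAMPLE = """\
# example interface config
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=no
"""

print(disabled_on_boot(parse_ifcfg(SAMPLE)))
```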

This is before adding the host to RHVM. 

(I am not sure if this is relevant, Just wanted to share my observations)

Comment 30 Dan Kenigsberg 2017-08-16 10:52:34 UTC
Michael, now that rhel-7.4 and NetworkManager-1.8.0-0.4.rc1.el7 are out, can you re-verify this bug?

Comment 31 Dan Kenigsberg 2017-08-16 10:58:25 UTC
Since this showed up in a HostedEngine-related flow, would you try it on RHV-H-4.1.5-RHEL-7.4 ?

Comment 32 Nikolai Sednev 2017-08-16 13:06:26 UTC
(In reply to Dan Kenigsberg from comment #31)
> Since this showed up in a HostedEngine-related flow, would you try it on
> RHV-H-4.1.5-RHEL-7.4 ?

I can try this as soon as we get 4.1.5 NGN available for testing.

Comment 33 Ryan Barry 2017-08-16 13:09:11 UTC
4.1.5 NGN was delivered yesterday, so it should be possible to test now.

Comment 34 Bill Sanford 2017-08-16 13:32:32 UTC
*** Bug 1478453 has been marked as a duplicate of this bug. ***

Comment 35 Nikolai Sednev 2017-08-17 14:01:42 UTC
1
Tested on latest rhvh-4.1-0.20170815.0+1 with NetworkManager-1.8.0-9.el7.x86_64 on latest downstream 4.1.5 with rhvm-appliance-4.1.20170811.0-1.el7.noarch.
Result:
Hostname remains the same before and after deployment of SHE over NFS. 
I've used DHCP reservations for my environment, not static IP settings.
If NGN host is rebooted after deployment, hostname remains the same as expected.

2
Tested the same scenario on master for upstream ovirt-engine-appliance.noarch 4.2-20170815.1.el7.centos on rhvh-4.1-0.20170815.0+1 with NetworkManager-1.8.0-9.el7.x86_64.
Result:
Hostname remains the same before and after deployment of SHE over NFS. 
I've used DHCP reservations for my environment, not static IP settings.
If NGN host is rebooted after deployment, hostname remains the same as expected.

Comment 36 Edward Haas 2017-11-08 14:22:45 UTC
(In reply to Nikolai Sednev from comment #35)
> 1
> Tested on latest rhvh-4.1-0.20170815.0+1 with
> NetworkManager-1.8.0-9.el7.x86_64 on latest downstream 4.1.5 with
> rhvm-appliance-4.1.20170811.0-1.el7.noarch.
> Result:
> Hostname remains the same before and after deployment of SHE over NFS. 
> I've used DHCP reservations for my environment, not static IP settings.
> If NGN host is rebooted after deployment, hostname remains the same as
> expected.
> 
> 2
> Tested the same scenario on master for upstream
> ovirt-engine-appliance.noarch 4.2-20170815.1.el7.centos on
> rhvh-4.1-0.20170815.0+1 with NetworkManager-1.8.0-9.el7.x86_64.
> Result:
> Hostname remains the same before and after deployment of SHE over NFS. 
> I've used DHCP reservations for my environment, not static IP settings.
> If NGN host is rebooted after deployment, hostname remains the same as
> expected.

Based on your description, this BZ is then resolved.
Can you please confirm?

Comment 37 Nikolai Sednev 2017-11-09 06:14:35 UTC
(In reply to Edward Haas from comment #36)
> (In reply to Nikolai Sednev from comment #35)
> > 1
> > Tested on latest rhvh-4.1-0.20170815.0+1 with
> > NetworkManager-1.8.0-9.el7.x86_64 on latest downstream 4.1.5 with
> > rhvm-appliance-4.1.20170811.0-1.el7.noarch.
> > Result:
> > Hostname remains the same before and after deployment of SHE over NFS. 
> > I've used DHCP reservations for my environment, not static IP settings.
> > If NGN host is rebooted after deployment, hostname remains the same as
> > expected.
> > 
> > 2
> > Tested the same scenario on master for upstream
> > ovirt-engine-appliance.noarch 4.2-20170815.1.el7.centos on
> > rhvh-4.1-0.20170815.0+1 with NetworkManager-1.8.0-9.el7.x86_64.
> > Result:
> > Hostname remains the same before and after deployment of SHE over NFS. 
> > I've used DHCP reservations for my environment, not static IP settings.
> > If NGN host is rebooted after deployment, hostname remains the same as
> > expected.
> 
> Based on your description, this BZ is then resolved.
> Can you please confirm?

Yes it is, but it should be verified by its QA assignee, once its status changes to ON_QA.

Comment 38 Sandro Bonazzola 2017-12-20 11:11:26 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 39 Yedidyah Bar David 2020-11-19 10:55:01 UTC
This now happened to me again, on latest 4.3, I think (I don't have engine logs anymore, only host):

Nov 18 12:00:46 vm-10-174 NetworkManager[735]: <info>  [1605693646.8934] manager: NetworkManager state is now DISCONNECTED
Nov 18 12:00:46 vm-10-174 NetworkManager[735]: <info>  [1605693646.8937] policy: set-hostname: set hostname to 'localhost.localdomain' (no default device)

Is this a regression? Should I open a new bug?

I didn't fully read all the discussion in current bug or in bug 1422610, so didn't understand what was the fix and whether it's applicable now (when in 4.3, IIUC, NM does manage all devices).

Comment 40 Edward Haas 2020-11-30 12:16:57 UTC
I think it deserves a new BZ with a reference to this one. It may be a regression or a new bug with the same symptoms.