Bug 1255533 - Hosts losing ifcfg-eth0 networking in 20150603.0.el6ev [NEEDINFO]
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: rhev-hypervisor
x86_64 Linux
Priority: urgent, Severity: urgent
Assigned To: Fabian Deutsch
Chaofeng Wu
Depends On:
Reported: 2015-08-20 16:32 EDT by Robert McSwain
Modified: 2016-02-10 14:47 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2015-10-21 05:16:39 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ibarkan: needinfo? (rmcswain)

Attachments: None
Description Robert McSwain 2015-08-20 16:32:48 EDT
Description of problem:
After updating the firmware on all hosts and installing fresh hypervisors, we are losing network connectivity. We added net_persistence = ifcfg and migration_timeout = 600 to every host, rebooted, and observed that ifcfg-eth0 was missing.

Version-Release number of selected component (if applicable):
RHEV H 20150603.0.el6ev

How reproducible:

Steps to Reproduce:
1. Set net_persistence = ifcfg and migration_timeout = 600
2. Configure ifcfg-eth0 in the admin TUI of RHEV-H
3. Reboot
4. Observe that ifcfg-eth0 is missing from both the TUI and /etc/sysconfig/network-scripts
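
The check in step 4 can be scripted so it is easy to repeat on every host after a reboot. This is only a minimal sketch: the function name missing_ifcfg is hypothetical, and the device list is an assumption (only eth0 is named in this report).

```python
import os


def missing_ifcfg(devices, scripts_dir="/etc/sysconfig/network-scripts"):
    """Return the devices that have no ifcfg-<device> file in scripts_dir."""
    return [
        dev for dev in devices
        if not os.path.isfile(os.path.join(scripts_dir, "ifcfg-" + dev))
    ]


if __name__ == "__main__":
    # eth0 is the device from this report; extend the list for other NICs.
    for dev in missing_ifcfg(["eth0"]):
        print("missing: ifcfg-%s" % dev)
```

Running it once before the reboot and once after gives a simple before/after comparison for the affected devices.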

Actual results:
Networking is missing

Expected results:
All network devices are configured as they were before the reboot.

Additional info:
blade2nonetwork1.PNG (26 KB) 
screenshot inside admin showing network configured after a reboot

blade2nonetwork2.PNG (19 KB) 
screenshot showing config of eth0 after reboot

blade2nonetwork3.PNG (18 KB) 
screenshot showing missing ifcfg-eth0 after reboot
Comment 2 Fabian Deutsch 2015-09-02 13:39:05 EDT
From the description, these look like symptoms related to network persistence.

Ido, does this look like the networking issue we fix in 3.5.4?
Comment 3 Ido Barkan 2015-09-22 03:31:17 EDT
Hi, there are tons of logs here, which makes this challenging. I found the last setupNetworks command in /hp-enc1-blade2-2015081913151439990114/var/log/vdsm/supervds.log

I guess this is the call from the TUI since it configures the management network.
2015-08-13 21:37:50,835::api::631::setupNetworks::(setupNetworks) Setting up network according to configuration: networks:{'rhevm': {'vlan': '319', 'ipaddr': '', 'bonding': 'bond0', 'netmask': '', 'STP': 'no', 'bridged': 'true', 'gateway': '', 'defaultRoute': True}}, bondings:{}, options:{'connectivityCheck': 'true', 'connectivityTimeout': 120}

... and then, during execution, vdsm writes the updated ifcfg-eth0:

2015-08-13 21:37:51,607::ifcfg::550::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-eth0 configuration:
# Generated by VDSM version 4.16.20-1.el6ev

There is no further reference to ifcfg-eth0 later in the log.
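
To double-check for later writes across a large set of logs, a short script can list every logged write of a given file. This is a sketch that assumes the log line format seen in the writeConfFile entry quoted above; the names WRITE_RE and conf_file_writes are hypothetical and not part of vdsm.

```python
import re

# Pattern inferred from the single writeConfFile line quoted in this bug.
WRITE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+)::.*"
    r"\(writeConfFile\) Writing to file (?P<path>\S+) configuration:"
)


def conf_file_writes(lines, name):
    """Yield (timestamp, path) for each logged write whose file name is `name`."""
    for line in lines:
        m = WRITE_RE.match(line)
        if m and m.group("path").rsplit("/", 1)[-1] == name:
            yield m.group("ts"), m.group("path")


sample = [
    "2015-08-13 21:37:51,607::ifcfg::550::root::(writeConfFile) Writing to "
    "file /etc/sysconfig/network-scripts/ifcfg-eth0 configuration:",
    "# Generated by VDSM version 4.16.20-1.el6ev",
]
writes = list(conf_file_writes(sample, "ifcfg-eth0"))
for ts, path in writes:
    print(ts, path)
```

Passing an open log file instead of the sample list works the same way, since the function only iterates over lines.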

Later, I also see another call to setupNetworks, which may hint that this is the wrong server:

2015-08-13 21:41:49,497::api::631::setupNetworks::(setupNetworks) Setting up network according to configuration: networks:{'iSCSI_1': {'nic': 'eth8', 'netmask': '', 'ipaddr': '', 'bridged': 'true', 'STP': 'no'}, 'iSCSI_2': {'nic': 'eth9', 'netmask': '', 'ipaddr': '', 'bridged': 'true', 'STP': 'no'}, 'RAILS_205': {'bonding': 'bond0', 'vlan': '205', 'STP': 'no', 'bridged': 'true'}, 'RAILS_204': {'bonding': 'bond0', 'vlan': '204', 'STP': 'no', 'bridged': 'true'}, 'RAILS_220': {'bonding': 'bond0', 'vlan': '220', 'STP': 'no', 'bridged': 'true'}, 'VLAN300': {'bonding': 'bond0', 'vlan': '300', 'STP': 'no', 'bridged': 'true'}, 'VLAN902': {'vlan': '902', 'ipaddr': '', 'bonding': 'bond0', 'netmask': '', 'STP': 'no', 'bridged': 'true'}, 'P2000A': {'nic': 'eth2', 'netmask': '', 'ipaddr': '', 'bridged': 'true', 'STP': 'no'}, 'P2000B': {'nic': 'eth3', 'netmask': '', 'ipaddr': '', 'bridged': 'true', 'STP': 'no'}, 'zimbra_private': {'bonding': 'bond0', 'vlan': '4001', 'STP': 'no', 'bridged': 'true'}, 'web_private': {'bonding': 'bond0', 'vlan': '4000', 'STP': 'no', 'bridged': 'true'}, 'moodle_private': {'bonding': 'bond0', 'vlan': '4002', 'STP': 'no', 'bridged': 'true'}, 'RAILS_233': {'bonding': 'bond0', 'vlan': '233', 'STP': 'no', 'bridged': 'true'}, 'RAILS_232': {'bonding': 'bond0', 'vlan': '232', 'STP': 'no', 'bridged': 'true'}}, bondings:{}, options:{'connectivityCheck': 'true', 'connectivityTimeout': 120}
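
Long setupNetworks lines like these are easier to compare once the embedded dictionaries are parsed. The following is a rough sketch, assuming the "networks:{...}, bondings:{...}, options:{...}" layout seen in the lines quoted in this comment and assuming no braces appear inside string values; parse_setup_networks and _balanced are hypothetical helper names, not vdsm functions.

```python
import ast
import re


def _balanced(text, start):
    """Return the {...} substring of text that begins at index start."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    raise ValueError("unbalanced braces")


def parse_setup_networks(line):
    """Extract the networks, bondings and options dicts from one log line."""
    result = {}
    for key in ("networks", "bondings", "options"):
        match = re.search(key + r":\{", line)
        if match is None:
            raise ValueError("no %s section in line" % key)
        # literal_eval safely evaluates the Python-literal dict text.
        result[key] = ast.literal_eval(_balanced(line, match.end() - 1))
    return result


# The first setupNetworks line quoted in this comment.
line = (
    "2015-08-13 21:37:50,835::api::631::setupNetworks::(setupNetworks) "
    "Setting up network according to configuration: "
    "networks:{'rhevm': {'vlan': '319', 'ipaddr': '', 'bonding': 'bond0', "
    "'netmask': '', 'STP': 'no', 'bridged': 'true', 'gateway': '', "
    "'defaultRoute': True}}, bondings:{}, "
    "options:{'connectivityCheck': 'true', 'connectivityTimeout': 120}"
)
parsed = parse_setup_networks(line)
for name, attrs in sorted(parsed["networks"].items()):
    # Show which NIC or bond, and which VLAN, each network sits on.
    print(name, attrs.get("nic") or attrs.get("bonding"), attrs.get("vlan"))
```

Listing the NIC or bond per network this way makes it quicker to see whether a given call touches eth0 at all.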

Robert, maybe I am looking at the wrong log file? Can you point me to a more relevant log?
Comment 4 Robert McSwain 2015-10-07 10:35:25 EDT
The customer is testing out RHEL 7.1 hosts, so I've asked if there's a need to keep this bug open. Leaving the NEEDINFO on myself for now while we await answers.
Comment 5 Yaniv Lavi 2015-10-21 05:16:39 EDT
Please reopen when you can provide the info.
