Bug 1189837

Summary: [RHEVH 7.0] RHEVH is losing connectivity when connected to a single interface - dhclient is not running
Product: Red Hat Enterprise Virtualization Manager
Reporter: Gil Klein <gklein>
Component: ovirt-node
Assignee: Fabian Deutsch <fdeutsch>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.5.0
CC: adahms, cshao, danken, ecohen, fdeutsch, gklein, hadong, huiwa, iheim, istein, leiwang, lsurette, lvernia, pstehlik, rbarry, yaniwang, ycui, yeylon
Target Milestone: ---
Keywords: Regression
Target Release: 3.5.0
Hardware: x86_64
OS: Linux
Whiteboard: node
Fixed In Version:
Doc Type: Known Issue
Doc Text:
Assigning an IP configuration to a Red Hat Enterprise Virtualization Hypervisor 7.0 host or Red Hat Enterprise Linux 7 host using DHCP succeeds initially, but the DHCP client fails. As a result, after approximately one half of the DHCP lease time has passed in the local network, the host loses its IP address. As a workaround, you must set the valid_lft parameter of the interface's address to "forever":
ip addr change <address> dev <interface>
This prevents the IP address from being lost, but also prevents the IP address from being renewed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-02-11 10:16:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1164311    
Attachments:
Description         Flags
journalctl output   none
auditd logs         none
vdsm log            none

Description Gil Klein 2015-02-05 15:15:18 UTC
Description of problem:

RHEVH 7.0 hosts are losing connectivity after ~1 day, even when only 1 interface is connected.

Checking the host right after reboot shows that an IP address has been obtained via DHCP, but dhclient is not running (as it would be on a native RHEL 7.0 system).

Opening another BZ based on https://bugzilla.redhat.com/show_bug.cgi?id=1183751#c7 because the issue does not seem to be related to the multiple-interface scenario.


Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor 7.0 (20150127.0.el7ev)

How reproducible:
100% (Reproduced by both RHEV QE + Virt QE)


Steps to Reproduce:
1. Install RHEVH 7 (20150127.0.el7ev)
2. Configure it to use DHCP
3. Wait ~1 day

Actual results:
Host is losing connectivity after ~1 day. 
IP is not present on the interface when running ifconfig / ip a
dhclient is not running
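
The "dhclient is not running" observation can be checked with a small shell helper. This is a minimal sketch and not part of the original report: the function name has_dhclient is hypothetical, and it assumes the DHCP client's process name is exactly "dhclient".

```shell
# Hypothetical helper (not from the report): reads one command name per line
# on stdin and succeeds if a process named exactly "dhclient" is listed.
has_dhclient() {
    grep -qx 'dhclient'
}

# Usage on a live host (ps -e -o comm= prints one command name per process):
#   if ps -e -o comm= | has_dhclient; then
#       echo "dhclient is running"
#   else
#       echo "dhclient is NOT running"
#   fi
```

Reading the process list from stdin keeps the check independent of how the list is obtained, so the same helper also works on a saved process listing.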


Expected results:
Host should not lose connectivity


Additional info:

cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor 7.0 (20150127.0.el7ev)

uname -a
Linux coda-vdsb.tlv.redhat.com 3.10.0-123.20.1.el7.x86_64 #1 SMP Wed Jan 21 09:45:55 EST 2015 x86_64 x86_64 x86_64 GNU/Linux

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master rhevm state UP qlen 1000
    link/ether f8:b1:56:e2:87:3f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fab1:56ff:fee2:873f/64 scope link 
       valid_lft forever preferred_lft forever
3: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether de:22:ce:37:12:0b brd ff:ff:ff:ff:ff:ff
4: rhevm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether f8:b1:56:e2:87:3f brd ff:ff:ff:ff:ff:ff
    inet 10.35.5.21/23 brd 10.35.5.255 scope global dynamic rhevm
       valid_lft 76677sec preferred_lft 76677sec
    inet6 fe80::fab1:56ff:fee2:873f/64 scope link 
       valid_lft forever preferred_lft forever

# brctl show
bridge name	bridge id		STP enabled	interfaces
;vdsmdummy;		8000.000000000000	no		
rhevm		8000.f8b156e2873f	no		em1

# rpm -qa|grep dhc
dhcp-libs-4.2.5-27.el7_0.2.x86_64
dhcp-common-4.2.5-27.el7_0.2.x86_64
dhclient-4.2.5-27.el7_0.2.x86_64
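
The "ip a" output above shows the rhevm address with a finite lifetime (valid_lft 76677sec, roughly 21.3 hours), which matches the ~1 day until connectivity is lost if nothing renews the lease. The following sketch extracts such addresses from "ip a" output; the function name finite_lifetimes is hypothetical, and the parsing assumes the two-line "inet ... / valid_lft ..." layout shown in this report.

```shell
# Hypothetical helper (not from the report): reads `ip a` output on stdin and
# prints "<address> <seconds>" for every IPv4 address whose valid_lft is
# finite, i.e. not "forever".
finite_lifetimes() {
    awk '
        $1 == "inet"  { addr = $2; next }   # remember the latest IPv4 address
        $1 == "inet6" { addr = "";  next }  # IPv6 lifetimes are ignored here
        $1 == "valid_lft" && addr != "" {
            if ($2 != "forever") {
                secs = $2
                sub(/sec$/, "", secs)       # "76677sec" -> "76677"
                printf "%s %d\n", addr, secs
            }
            addr = ""
        }
    '
}

# Usage on a live host:
#   ip a | finite_lifetimes
```

An address listed by this helper will be dropped by the kernel once the printed number of seconds elapses, unless a DHCP client renews the lease in the meantime.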

Comment 1 Gil Klein 2015-02-05 15:18:17 UTC
Created attachment 988536 [details]
journalctl output

Comment 2 Gil Klein 2015-02-05 15:21:13 UTC
Created attachment 988537 [details]
auditd logs

Comment 3 Gil Klein 2015-02-05 15:23:39 UTC
Created attachment 988538 [details]
vdsm log

Comment 4 Gil Klein 2015-02-08 06:12:22 UTC
I've left 2 RHEVH 7.0 servers running for the weekend.
Both have a single NIC connected.

For one host I've set a static address using:
"ip addr change <address> dev <interface>" - no connectivity issues or IP loss for 48 hours. The host is running stably.

The other host was left running without any change, and lost its connectivity after ~24h.
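
The workaround used on the first host can be sketched as a helper that prints the command instead of running it, so it can be reviewed first. This is an illustration only: the function name pin_lifetime_cmd is hypothetical, and spelling out "valid_lft forever preferred_lft forever" is an assumption made to state the intent explicitly; the command as used in this comment omits those arguments.

```shell
# Hypothetical helper (not from the report): print, but do not execute, the
# workaround command for a given address and interface. The explicit lifetime
# arguments are an assumption added for clarity.
pin_lifetime_cmd() {
    printf 'ip addr change %s dev %s valid_lft forever preferred_lft forever\n' "$1" "$2"
}

# Usage, with the values from the "ip a" output in the description:
#   pin_lifetime_cmd 10.35.5.21/23 rhevm
# Review the printed command, then run it as root on the affected host.
```

As the Doc Text notes, an address pinned this way is no longer lost, but it is also no longer renewed through DHCP.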

Comment 5 Ying Cui 2015-02-08 21:59:03 UTC
Test scenario 1 and scenario 2, for your reference.
Test scenario 1:
RHEVH alone on a single network, without the rhevm bridge (not registered to rhevm). After running for 48h+, the IP is still present and connectivity is _not_ lost.
1. Installed rhevh el7 successfully.
2. Set up the DHCP network via the rhevh TUI.
3. After running for 48h+, the network has not lost connectivity.

Test scenario 2:
RHEVH alone with two NICs up. Both NICs were configured manually, not through the rhevh TUI; the rhevm bridge was also created manually, _not_ by vdsm.
After running for 3 days, all networks still have connectivity; it is _not_ lost.
We cannot share this machine: it is our SVVP test machine and holds important data.

Test version for above:
rhev-hypervisor7-7.0-20150127.0
ovirt-node-3.2.1-6.el7.noarch
dhcp-common-4.2.5-27.el7_0.2.x86_64
dhcp-libs-4.2.5-27.el7_0.2.x86_64
dhclient-4.2.5-27.el7_0.2.x86_64

Comment 7 cshao 2015-02-09 04:20:24 UTC
Hi Fabian, Danken,

For this bug, is there anything QE can provide?

Thanks!

Comment 8 Ying Cui 2015-02-09 04:47:20 UTC
Hi Dan,
  Chen has provided, in comment 6, the info you may want about when dhclient is not running.
  See Chen's step 4 in comment 6: there is no pid for dhclient right after registering to rhevm, without any waiting.
  Please reach out to cshao@ for help with this bug while I am out of the office and cannot respond in a timely manner.

Thanks
Ying

Comment 9 Ying Cui 2015-02-09 05:31:50 UTC
Chen, if you add rhevh via rhevm, is dhclient running then? The steps above register rhevh to rhevm.

If adding rhevh via rhevm works, we then need to run it for more than 1 day to check.

Comment 12 Fabian Deutsch 2015-02-10 19:45:57 UTC
*** Bug 1183751 has been marked as a duplicate of this bug. ***

Comment 13 Lior Vernia 2015-02-11 10:16:25 UTC

*** This bug has been marked as a duplicate of bug 1187244 ***