Bug 1162844 - Network connectivity drops after vdsm runs
Summary: Network connectivity drops after vdsm runs
Keywords:
Status: CLOSED DUPLICATE of bug 1116004
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.1
Assignee: Ido Barkan
QA Contact: Gil Klein
URL:
Whiteboard: network
Depends On:
Blocks:
 
Reported: 2014-11-11 20:35 UTC by Adam Litke
Modified: 2016-02-10 19:36 UTC
CC List: 15 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-11-18 15:30:11 UTC
oVirt Team: Network
Embargoed:


Attachments
supervdsm.log (96.62 KB, text/plain)
2014-11-11 20:35 UTC, Adam Litke
vdsm.log (15.04 MB, text/plain)
2014-11-11 20:36 UTC, Adam Litke
engine log (771.76 KB, application/x-gzip)
2014-11-12 19:16 UTC, Adam Litke

Description Adam Litke 2014-11-11 20:35:04 UTC
Description of problem: 
I've been experiencing peculiar and annoying networking behavior on my
oVirt development hosts and I'm hoping someone familiar with vdsm
networking configuration can help me get to the bottom of it.

My setup is two mini-Dells acting as virt hosts and oVirt Engine
running on my laptop.  The Dells get their network config from a
cobbler instance running on my laptop which also provides PXE
services.

After freshly installing the Dells, I get a nice, stable network
connection.  After installing vdsm, the connection seems to drop
occasionally.  I have to visit the machine, log into the console, and
execute 'dhclient ovirtmgmt'.  This fixes the problem again for
a while.
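
To pin down when the address disappears, a small check like the one below could be run periodically on the host. This is only a sketch, not part of vdsm: it assumes the iproute2 `ip` command is installed, and the function names `ipv4_addresses` and `interface_has_ipv4` are made up for illustration.

```python
import subprocess
from typing import List


def ipv4_addresses(ip_output: str) -> List[str]:
    """Extract IPv4 addresses from `ip -4 -o addr show dev <iface>` output."""
    addresses = []
    for line in ip_output.splitlines():
        fields = line.split()
        # One-line format: "<index>: <iface> inet <addr>/<prefix> ..."
        if "inet" in fields:
            addresses.append(fields[fields.index("inet") + 1])
    return addresses


def interface_has_ipv4(iface: str = "ovirtmgmt") -> bool:
    """Return True if the interface currently holds at least one IPv4 address."""
    result = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", iface],
        capture_output=True,
        text=True,
        check=False,  # a missing interface should read as "no address"
    )
    return bool(ipv4_addresses(result.stdout))
```

Calling `interface_has_ipv4()` from a cron job or a loop and logging the time of the first `False` result would give a timestamp to compare against vdsm.log and supervdsm.log.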

Version-Release number of selected component (if applicable):
vdsm-4.16.0-522.git4a3768f.fc20.x86_64

How reproducible: For me, always (after a non-deterministic period of time elapses)

Steps to Reproduce:
1. Start vdsm
2. Use vdsm for a while


Actual results:
The ovirtmgmt interface loses its IP address

Expected results:
Connectivity is not interrupted

Additional info:

Comment 1 Adam Litke 2014-11-11 20:35:38 UTC
Created attachment 956412
supervdsm.log

Comment 2 Adam Litke 2014-11-11 20:36:23 UTC
Created attachment 956413
vdsm.log

Comment 3 Adam Litke 2014-11-11 20:37:12 UTC
As suggested by Ondřej Svoboda, I tried disabling NetworkManager and that did not seem to resolve the problem.

Comment 4 Ondřej Svoboda 2014-11-11 22:16:03 UTC
Adam,

did you lose connection after VDSM and superVDSM restarted (not necessarily)? Could you look in engine logs?

supervdsm.log
  MainThread::DEBUG::2014-11-11 14:19:31,399::supervdsmServer::451::SuperVdsm.Server::(main) Terminated normally

vdsm.log
  MainThread::DEBUG::2014-11-11 14:19:26,600::vdsm::58::vds::(sigtermHandler) Received signal 15

There are a couple of not really nice warnings in supervdsm.log when VDSM creates the management network (bridge expected too early -- looks harmless; libvirt network not there -- I don't like this) and also further on (sourceroutethread trying to add the same route over and over again).

What puzzles me though is that lines such as the one below indicate some kind of DHCP activity.

  sourceRoute::DEBUG::2014-11-11 15:24:28,939::sourceroutethread::38::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1415737468

I CC'd Toni and Ido. Guys, can you see anything in the logs?
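
For anyone else sifting through the attached logs, the lines quoted above can be split into fields with a short script. This is a sketch that assumes every line follows the `::`-separated layout seen in the three samples in this comment; the names `LogEntry` and `parse_vdsm_line` are hypothetical and not part of vdsm.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    thread: str
    level: str
    timestamp: datetime
    module: str
    lineno: int
    logger: str
    function: str
    message: str


# The last field looks like "(functionName) free-form message".
_FUNC_RE = re.compile(r"\((?P<function>[^)]*)\)\s*(?P<message>.*)", re.DOTALL)


def parse_vdsm_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the layout."""
    parts = line.strip().split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, stamp, module, lineno, logger, rest = parts
    match = _FUNC_RE.match(rest)
    if match is None:
        return None
    try:
        timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        number = int(lineno)
    except ValueError:
        return None
    return LogEntry(thread, level, timestamp, module, number, logger,
                    match["function"], match["message"])
```

Filtering the parsed entries for messages containing "DHCP" and listing their timestamps would show whether the DHCP activity lines up with the moments the address was lost or restored.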

Comment 5 Ondřej Svoboda 2014-11-11 22:18:31 UTC
I mean, the last observation seems to contradict the connection loss.

Comment 6 Adam Litke 2014-11-12 19:16:30 UTC
Created attachment 956855
engine log

I didn't see anything fishy in the engine.log, but here it is for completeness.

Comment 7 Adam Litke 2014-11-12 19:18:37 UTC
(In reply to Ondřej Svoboda from comment #4)
> Adam,
> 
> did you lose connection after VDSM and superVDSM restarted (not
> necessarily)? Could you look in engine logs?

Unless I'm missing something, could this be the new soft-fencing in response to storage connection failures?

> 
> supervdsm.log
>   MainThread::DEBUG::2014-11-11
> 14:19:31,399::supervdsmServer::451::SuperVdsm.Server::(main) Terminated
> normally
> 
> vdsm.log
>   MainThread::DEBUG::2014-11-11
> 14:19:26,600::vdsm::58::vds::(sigtermHandler) Received signal 15
> 
> There are a couple of not really nice warnings in supervdsm.log when VDSM
> creates the management network (bridge expected too early -- looks harmless;
> libvirt network not there -- I don't like this) and also further on
> (sourceroutethread trying to add the same route over and over again).
> 
> What puzzles me though is that lines such as the one below indicate some
> kind of DHCP activity.
> 
>   sourceRoute::DEBUG::2014-11-11
> 15:24:28,939::sourceroutethread::38::root::(process_IN_CLOSE_WRITE_filePath)
> Responding to DHCP response in /var/run/vdsm/sourceRoutes/1415737468

At this point I may have logged into the box and executed "dhclient ovirtmgmt" in order to rescue the connection.

Comment 8 Antoni Segura Puimedon 2014-11-14 23:59:19 UTC
I talked with Adam and asked him to try the patch for https://bugzilla.redhat.com/1142082; since then the issue has not happened again. Adam, if the host still keeps its address on Monday, please mark this bug as a duplicate of the one above.

Comment 9 Barak 2014-11-16 13:17:52 UTC
Adam - Just to be on the safe side what OS did you run on your mini-dells ?

Bug 1116004 was fixed for RHEL 7.1 and 7.0.z (Bug 1148345)

Comment 10 Adam Litke 2014-11-18 15:30:11 UTC
(In reply to Barak from comment #9)
> Adam - Just to be on the safe side what OS did you run on your mini-dells ?

I tried with CentOS 7 and Fedora 20.

> Bug 1116004 was fixed for RHEL 7.1 and 7.0.z (Bug 1148345)

*** This bug has been marked as a duplicate of bug 1116004 ***

