Bug 449139 - RH5.2 Xen Windows GOS Network failure
Summary: RH5.2 Xen Windows GOS Network failure
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.2
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Michal Novotny
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-05-30 18:24 UTC by Hector Arteaga
Modified: 2016-04-26 13:46 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-02 13:44:45 UTC
Target Upstream Version:
Embargoed:


Attachments
Image with the IP config information (72.67 KB, image/jpeg)
2008-06-17 18:33 UTC, Hector Arteaga

Description Hector Arteaga 2008-05-30 18:24:15 UTC
Description of problem:
My Windows 2003 guest OS loses network communication.  My Xen server and Linux
guest OSes do not show these symptoms.  One of the differences between my
Windows and Linux GOS is that Windows is fully virtualized while Linux is
running as a paravirtualized VM.  I am not sure how to get more information on
this issue, and the Windows event logs don't show much.

Version-Release number of selected component (if applicable):
Found on Red Hat Enterprise Linux 5.2 snapshot 7

How reproducible:
Happens sporadically from time to time (usually a couple of times a day)

Steps to Reproduce:
1. Start an application that sends network traffic to the Windows VM.  I also
had traffic being sent to my Linux VM, so that may have had something to do
with it.

  
Actual results:


Expected results:


Additional info:

Comment 4 Bill Burns 2008-06-13 17:52:52 UTC
Does this still happen with the GA version? Can you supply the following:
output of ipconfig /all from the guest and the dom0 dmesg output when this
happens.


Comment 5 Hector Arteaga 2008-06-17 18:33:01 UTC
Created attachment 309655
Image with the IP config information

Here is the requested ipconfig information after the issue has happened.  I was
not able to capture the image right after the problem since it happened at 5 in
the morning.
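
One way to avoid missing the failure window is to capture the diagnostics automatically. The following is a minimal sketch, not something used in this report: it assumes a Python 3.7+ interpreter is available on the machine where it runs, and the function name `capture` and the output directory are hypothetical. It runs a command and saves its output to a UTC-timestamped file.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def capture(cmd, out_dir, label):
    """Run cmd and save its output to a UTC-timestamped file in out_dir.

    `capture`, `out_dir` and `label` are illustrative names, not part of
    any Xen or Red Hat tool.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The timestamp in the file name records when the snapshot was taken.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # check=False: keep whatever output exists even if the command fails.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    path = out_dir / f"{label}-{stamp}.log"
    path.write_text(result.stdout + result.stderr)
    return path
```

On dom0 this could be called as `capture(["dmesg"], "/var/tmp/bug449139", "dom0-dmesg")` from a scheduled job or from whatever detects the outage. The same function could collect `ipconfig /all` if it is run inside the Windows guest.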

Comment 6 Hector Arteaga 2008-06-20 21:04:41 UTC
My guest OS is Windows 2003 R2 Enterprise Edition Service Pack 2 and I have 
not seen the issue with a fully virtualized Linux Guest OS.

Comment 9 ashok 2009-03-18 11:19:11 UTC
Can you capture the dmesg output as suggested by Bill?

Comment 11 ashok 2009-03-25 13:10:52 UTC
Does "sending traffic" mean pinging?
If so, I tried that with Windows 2003 R2 SP2, and over a period of 3 hours there was no packet loss and no network drop.

The host is: 2.6.18-128.1.1.el5xen #1 SMP Mon Jan 26 14:19:09 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

Comment 12 Hector Arteaga 2009-04-01 01:50:13 UTC
By "sending traffic" I meant that the link is tested to ensure that it remains up.  Our test application continuously monitors the VM through the network while generating I/O to the storage on the VM.  I would think that pinging would suffice as a test, although it may not detect all cases.
Also, I believe your host version "2.6.18-128.1.1.el5xen" is from a 5.3 install, and I do not recall seeing this issue with a 5.3 host (although I had the paravirtualized drivers loaded).  I saw this on a 5.2 Xen server setup.
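
For anyone trying to reproduce this without the proprietary testbench, a link check of the kind described above can be approximated with a short script. This is only a sketch under stated assumptions: it is not the HP test application, the names `tcp_probe` and `watch` are hypothetical, and it uses a TCP connection attempt rather than ICMP ping because raw ICMP sockets usually need elevated privileges. It records only the moments when the link changes state.

```python
import socket
import time
from datetime import datetime, timezone


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def watch(probe, checks, interval=10.0, sleep=time.sleep):
    """Call probe() `checks` times and return the state changes seen.

    Each entry is (UTC timestamp, "UP" or "DOWN"). Only transitions are
    recorded, so a run of many hours still yields a short log.
    """
    transitions = []
    last = None
    for i in range(checks):
        up = bool(probe())
        if up != last:
            state = "UP" if up else "DOWN"
            transitions.append((datetime.now(timezone.utc), state))
            last = up
        if i < checks - 1:
            sleep(interval)
    return transitions
```

A run covering 24 hours at the default 10-second interval might look like `watch(lambda: tcp_probe("192.0.2.10", 3389), checks=8640)`. The address is a placeholder, and port 3389 assumes Remote Desktop is enabled in the guest. The timestamps of any "DOWN" entries can then be matched against the dom0 dmesg output.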

Comment 13 Michal Novotny 2009-05-13 08:54:34 UTC
Well, I tried it and had no luck reproducing it. You wrote something about your test application - could you please send it to us so that we can test with it? (Just a binary is fine, it's a Windows system anyway :) You are talking about an application running in the Windows guest, right?

Anyway, is that test application designed to send traffic all the time by itself, with no need for user interaction and with some logs being created? It would help a lot to have this application and try it ourselves.

Thanks,
Michal

Comment 14 Hector Arteaga 2009-05-15 00:20:27 UTC
The testbench that we use is internal and HP proprietary.  Also, it requires a separate Linux host and is not too straightforward.  The primary focus of the application is to generate storage traffic, but it also keeps tabs on the servers/VMs to see how things are going.  This is the part that is failing.  The test application starts generating its storage traffic, and then a few hours down the road it loses communication with the VM.

Comment 15 Michal Novotny 2009-05-15 07:07:32 UTC
Well, what do you mean by generating storage traffic? What sort of packets are sent there? Just for clarification, is this the application that is running inside the Windows VM? Also, what do you mean by a few hours - how long had the test been running when this happened? Five hours? Eight hours? More?

Comment 18 Hector Arteaga 2009-05-15 17:29:20 UTC
Basically, it's a bunch of reads and writes going down to the storage device (virtual disks presented to the VM).  The main application runs on the Linux server that I mentioned above, but there are client processes that are started on the Windows VM itself.  The main process then checks on the client processes through the network, but there isn't a lot of network traffic generated.  It's mostly status messages sent back and forth.  The test usually fails after about 5 hours.

Comment 19 Michal Novotny 2009-06-01 13:53:16 UTC
Could you please try to reproduce it with Jirka's RPMs from:

http://people.redhat.com/jdenemar/xen.el5/

and tell if this is the issue?

Thanks,
Michal

Comment 20 Michal Novotny 2009-09-18 02:15:34 UTC
Hi Hector, any luck with this one? Did you try this with the RPMs from comment #19? You could also try with http://people.redhat.com/jdenemar/xen/. Could you please report which RPMs work and which do not? I was unable to reproduce it at all.

Thanks,
Michal

Comment 22 Michal Novotny 2009-10-09 08:27:24 UTC
Hi Hector,
since I am unable to reproduce it, could you please try with the RPMs from http://people.redhat.com/minovotn/xen and provide us with the test results? Also, could you give us the exact steps to reproduce and tell us what exactly your Windows application in the guest is doing?

Thanks,
Michal

Comment 23 Michal Novotny 2009-11-02 08:00:41 UTC
Hector,
I guess no testing was done with the RPMs described in comment #22, so I need to ask you more about the application used for testing. You wrote that it's basically a bunch of read/write operations going down to the VM's disks, with the main application running on a Linux server. Should I consider this client/server communication, where the client resides in the guest (VM) and the server is a standard Linux application? What's the communication protocol? Or does this not matter, and it's just sending random data from the guest (VM) to the host machine (server)? You also wrote that the main process checks on the client processes on the Windows VM itself - by main process, do you mean the Linux server application? And by client process, you mean the one inside the Windows VM, i.e. a Windows application, right?

Thanks,
Michal

Comment 24 Michal Novotny 2009-12-02 13:44:45 UTC
Since the reporter is no longer available to help with debugging and we were unable to reproduce the problem at all with RHEL 5.4, I am going to close this bug. If you run into this issue again, please feel free to reopen it with more information.

Comment 25 Paolo Bonzini 2010-04-08 15:43:46 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).

