Bug 591548 - netback does not properly get to the Connected state after it's been Closed
Summary: netback does not properly get to the Connected state after it's been Closed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 518435 526393 643345
 
Reported: 2010-05-12 14:32 UTC by Paolo Bonzini
Modified: 2011-01-13 21:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-13 21:31:39 UTC
Target Upstream Version:
Embargoed:


Attachments
patch (584 bytes, patch)
2010-05-12 14:35 UTC, Paolo Bonzini


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:0017 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update 2011-01-13 10:37:42 UTC

Description Paolo Bonzini 2010-05-12 14:32:48 UTC
The netback driver fails to transition from InitWait to Connected after it has
been closed once.  The reason is that, at the moment netdev_state_change is
called, the interface is still down, so the NETDEV_CHANGE event is never sent.
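
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not the driver code or the attached patch: ToyNetdev, ToyBackend and reconnect are hypothetical names, and the model assumes (as the description implies) that the backend only moves from InitWait to Connected when it sees NETDEV_CHANGE. The state values follow the XenbusState enumeration from the Xen public headers.

```python
from enum import IntEnum


class XenbusState(IntEnum):
    """Xenbus connection states, numbered as in Xen's public xenbus.h."""
    UNKNOWN = 0
    INITIALISING = 1
    INIT_WAIT = 2
    INITIALISED = 3
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6


class ToyNetdev:
    """Hypothetical stand-in for a network device with a notifier chain."""

    def __init__(self):
        self.up = False
        self.notifiers = []

    def state_change(self):
        # Mimics the behaviour described in the report: the change
        # notification is only delivered while the interface is up.
        if not self.up:
            return False
        for callback in self.notifiers:
            callback("NETDEV_CHANGE")
        return True


class ToyBackend:
    """Hypothetical backend that waits for NETDEV_CHANGE in InitWait."""

    def __init__(self, dev):
        self.state = XenbusState.INIT_WAIT
        dev.notifiers.append(self.on_event)

    def on_event(self, event):
        if event == "NETDEV_CHANGE" and self.state == XenbusState.INIT_WAIT:
            self.state = XenbusState.CONNECTED


def reconnect(bring_up_first):
    """Run one reconnect and return the backend state it ends in."""
    dev = ToyNetdev()
    backend = ToyBackend(dev)
    if bring_up_first:
        dev.up = True
    dev.state_change()   # the notification attempt
    dev.up = True        # the interface comes up eventually either way
    return backend.state
```

In this model, reconnect(False) leaves the backend in INIT_WAIT because the notification attempt happened while the interface was down, while reconnect(True) reaches CONNECTED.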

This is visible with the xenpv-win drivers by disabling and enabling the 
adapters repeatedly.  Without the patch, the drivers hang about 1 in 50 times
(and that is because of some hacks in the drivers; if I make the drivers talk
the correct xenbus protocol they will hang 100% of the time).
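
As a rough guide to how many disable/enable cycles a test needs at the 1-in-50 rate, here is a minimal sketch. It assumes each cycle hangs independently with the same probability, which the report does not state; the function names are made up for illustration.

```python
import math


def p_hang_within(cycles, p_per_cycle):
    """Probability of at least one hang in the given number of cycles,
    assuming independent cycles with a fixed per-cycle hang probability."""
    return 1.0 - (1.0 - p_per_cycle) ** cycles


def cycles_for_confidence(confidence, p_per_cycle):
    """Smallest number of cycles that shows at least one hang with the
    requested confidence, under the same independence assumption."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_per_cycle))


if __name__ == "__main__":
    rate = 1 / 50  # "about 1 in 50 times" from the description
    print(p_hang_within(50, rate))
    print(cycles_for_confidence(0.95, rate))
```

Under that assumption, 50 cycles show a hang only about 64% of the time, and roughly 149 cycles are needed for 95% confidence, so a short test run is only conclusive against a setup that hangs reliably.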

Upstream ties the Connected transition to the completion of the hotplug scripts, so it doesn't have this issue.

Comment 1 Paolo Bonzini 2010-05-12 14:35:53 UTC
Created attachment 413444
patch

Comment 2 RHEL Program Management 2010-05-20 12:41:54 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 Jarod Wilson 2010-06-29 13:35:48 UTC
in kernel-2.6.18-205.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 6 Jarod Wilson 2010-06-29 13:39:59 UTC
Not sure yet what went wrong w/the release script, but that should have been "in kernel-2.6.18-204.el5" (in build 204, not 205).

Comment 10 Binbin Yu 2010-12-22 08:42:36 UTC
Tested with:
i386 and x86_64 host
Win2008-32 guest
Win2003-64 guest

Component version:
xen-3.0.3-120.el5
xenpv-win-1.3.1-1.el5

Steps:
1. Install xenpv-win-1.3.1-1 on the Windows guest
2. Disable, then enable, the PV NIC from Device Manager
3. Repeat step 2

Reproduced the bug with kernel-xen-2.6.18-194.el5:
For both the Win2008-32 and Win2003-64 guests, a single disable/enable cycle is enough to hang the guest.

Verified the bug with kernel-xen-2.6.18-231.el5:
For both guests, disable/enable works smoothly, and after 6 disable/enable
cycles the guests still work fine without hanging.

According to the test results above, setting the bug to VERIFIED.

The steps above are taken from https://bugzilla.redhat.com/show_bug.cgi?id=643345

Comment 11 Binbin Yu 2010-12-24 08:18:56 UTC
Also verified with kernel-xen-2.6.18-238.el5.

Comment 13 errata-xmlrpc 2011-01-13 21:31:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

