Bug 875309 - A Hyper-V RHEL6.3 Guest is unreachable from the network after live migration
Summary: A Hyper-V RHEL6.3 Guest is unreachable from the network after live migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: jason wang
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-11-10 11:41 UTC by Claudio Latini
Modified: 2018-12-03 17:57 UTC (History)
CC List: 13 users

Fixed In Version: kernel-2.6.32-347.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 06:56:45 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0496 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 6 kernel update 2013-02-20 21:40:54 UTC

Description Claudio Latini 2012-11-10 11:41:56 UTC
Description of problem:
A RHEL 6.3 Hyper-V VM guest is unreachable from the network after live migration. This happens because the VM doesn't send a gratuitous ARP (GARP) to inform the underlying network of the node change. The VM is configured with MSFT Linux Integration Components (LIC) 3.4 and the hv_netvsc synthetic NIC.
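
For context, here is a minimal illustrative sketch (not actual hv_netvsc source; the function name below is made up) of the driver-side call that should trigger the GARP once a migration completes. netif_notify_peers() raises the NETDEV_NOTIFY_PEERS notifier event, which net/ipv4/devinet.c handles as shown under "Additional info" below:

---
/* Illustrative sketch only -- not hv_netvsc code.  In the 2.6.32-era kernel
 * the helper is netif_notify_peers(); later kernels rename it to
 * netdev_notify_peers(). */
#include <linux/netdevice.h>

static void example_migration_complete(struct net_device *ndev)
{
    /* Raise NETDEV_NOTIFY_PEERS so the IPv4 stack can send a gratuitous
     * ARP announcing this NIC's MAC/IP from the new host. */
    netif_notify_peers(ndev);
}
---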

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 6.3 from installation media (kernel 2.6.32-279) and later.

How reproducible:
100%

Steps to Reproduce:
1. Obtain an at-least two-node 2008 R2 Hyper-V cluster;
2. (IMPORTANT) Connect the cluster to a layer-2 network switch;
3. Create a RHEL 6.3 Hyper-V Linux guest with LIC 3.4 and place it on a cluster node;
4. Configure the VM to use the hv_netvsc synthetic driver and a static MAC;
5. Live migrate the VM to the other node.
 
Actual results:
The VM doesn't send the GARP and is unreachable after the migration because the switch's MAC address table is unchanged.

Expected results:
The VM must send a GARP to notify the underlying network of its new location.

Additional info:
The Hyper-V network driver calls netif_notify_peers() after live migration to perform the GARP task. However, the RHEL 6.3 kernel code doesn't do this work unconditionally:

from net/ipv4/devinet.c:
---
case NETDEV_NOTIFY_PEERS:
case NETDEV_CHANGEADDR:
    /* Send gratuitous ARP to notify of link change */
    if (IN_DEV_ARP_NOTIFY(in_dev)) {
        struct in_ifaddr *ifa = in_dev->ifa_list;
        
        if (ifa)
            arp_send(ARPOP_REQUEST, ETH_P_ARP,
                    ifa->ifa_address, dev,
                    ifa->ifa_address, NULL,
                    dev->dev_addr, NULL);
    }
    break;
---

If IN_DEV_ARP_NOTIFY() returns false, the GARP is never sent. It reflects the per-interface arp_notify sysctl (net.ipv4.conf.<interface>.arp_notify), which defaults to 0, so with a default configuration the GARP is skipped.

The issue has been resolved upstream (see https://lkml.org/lkml/2011/3/30/536) and enhanced to also cover secondary IP addresses (see https://lkml.org/lkml/2011/7/24/152).
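
For reference, a rough sketch (a paraphrase, not the exact upstream diff or a RHEL backport) of the shape those two patches give this handler in net/ipv4/devinet.c: NETDEV_NOTIFY_PEERS no longer depends on arp_notify, and the GARP is sent for every address configured on the interface rather than only the first one:

---
/* Sketch of the post-patch logic; names follow upstream, details may differ. */
static void inetdev_send_gratuitous_arp(struct net_device *dev,
                                        struct in_device *in_dev)
{
    struct in_ifaddr *ifa;

    /* Announce every address on the interface, not just the head of ifa_list. */
    for (ifa = in_dev->ifa_list; ifa; ifa = ifa->ifa_next)
        arp_send(ARPOP_REQUEST, ETH_P_ARP,
                 ifa->ifa_address, dev,
                 ifa->ifa_address, NULL,
                 dev->dev_addr, NULL);
}

/* ... in inetdev_event(): ... */
case NETDEV_CHANGEADDR:
    if (!IN_DEV_ARP_NOTIFY(in_dev))
        break;
    /* fall through: a NOTIFY_PEERS event sends the GARP unconditionally */
case NETDEV_NOTIFY_PEERS:
    inetdev_send_gratuitous_arp(dev, in_dev);
    break;
---

With that shape, the NETDEV_NOTIFY_PEERS event raised by the Hyper-V driver always results in a GARP, regardless of the arp_notify setting.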

So Red Hat should apply these patches to its kernel as well.

Comment 8 Dor Laor 2012-11-26 13:02:04 UTC
Arr, other hypervisors (such as KVM) send the gratuitous packet from the hypervisor and keep the guest OS out of the picture. If that's upstream, we can still fix it.

Comment 9 jason wang 2012-11-27 06:51:29 UTC
(In reply to comment #8)
> Arr, other hypervisors (such as KVM) send the gratuitous packet from the
> hypervisor and keep the guest OS out of the picture. If that's upstream, we
> can still fix it.

Btw, we plan to let the guest (virtio-net) send the GARP in the future (the guest driver part is already upstream).

Comment 11 RHEL Program Management 2012-11-28 20:11:55 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 12 Jarod Wilson 2012-12-07 18:36:01 UTC
Patch(es)

Comment 15 Jarod Wilson 2012-12-11 20:32:40 UTC
Patch(es)

Comment 17 Shengnan Wang 2013-02-01 10:16:01 UTC
Hi K.Y.,
Could you please help to check the testing steps and results of the limitation testing? 

Tested live migration on two Hyper-V hosts with one network adapter each. Due to this environment limitation, there is still a slight packet drop during the process (reference: 'Networking considerations for live migration', http://technet.microsoft.com/en-us/library/ff428137%28WS.10%29.aspx).

Before the fix (testing with RHEL6.3 LIC guests), the RHEL6.3 LIC guest lost network connectivity after live migration. Pinging the guest continuously, more than 400 packets were lost on average.

Testing with the fixed kernel (RHEL6.4 snapshot5 guest, kernel-2.6.32-356.el6), only about 60 packets are lost during live migration on average.

Comment 18 Shengnan Wang 2013-02-01 10:42:36 UTC
(In reply to comment #17)
> Hi K.Y.,
> Could you please help to check the testing steps and results of the
> limitation testing? 
> 

It should be 'live migration testing', not 'limitation testing'.


> Tested live migration on two Hyper-V hosts with one network adapter each.
> Due to this environment limitation, there is still a slight packet drop
> during the process (reference: 'Networking considerations for live
> migration',
> http://technet.microsoft.com/en-us/library/ff428137%28WS.10%29.aspx).
> 

There is a table listing 'Host configuration' and live migration bandwidth in the 'Networking considerations for live migration' section. From that table, you can see that some packets will be lost in a test environment with one network adapter.

> Before the fix (testing with RHEL6.3 LIC guests), the RHEL6.3 LIC guest
> lost network connectivity after live migration. Pinging the guest
> continuously, more than 400 packets were lost on average.
> 
> Testing with the fixed kernel (RHEL6.4 snapshot5 guest,
> kernel-2.6.32-356.el6), only about 60 packets are lost during live
> migration on average.

Are the steps and results enough to verify the bug? Or could you help test the package if there is a more suitable environment on your side?

Thanks!

Comment 19 K. Y. Srinivasan 2013-02-03 23:38:47 UTC
I think some packet loss is to be expected. I am copying Haiyang and Hashir. They can shed some additional light on this.

Comment 20 Haiyang Zhang 2013-02-04 15:41:42 UTC
(In reply to comment #19)
> I think some packet loss is to be expected. I am copying Haiyang and Hashir.
> They can shed some additional light on this.

I agree that some packet loss during the transition is expected, as long as the VM is reachable after the migration.

Comment 21 Shengnan Wang 2013-02-05 10:28:14 UTC

Verified this problem with a RHEL6.4 guest (kernel-2.6.32-356.el6).

Build version:
Host: Microsoft Hyper-V Server 2012
Guest: RHEL6.4 (kernel-2.6.32-356.el6)

Steps:
1. Obtain a two-node 2012 Hyper-V Cluster. (There is one network adapter for each host.)
2. Connect the cluster to a layer-2 network switch;
3. Create a RHEL6.4 guest with the hv_netvsc driver on one host and configure the guest to use a static MAC.
4. Check the guest network via ping from the other machine. 
5. Live migrate the RHEL6.4 guest to the other host via SCVMM.
6. Check the output of the ping.

Results:
Due to the environment limitation, there is still a slight packet drop during the process. For details, please see comment 17 and comment 18. The test steps and results were confirmed with the MS side. Some packet loss is to be expected, as mentioned in comment 19 and comment 20.


So the status of the bug is changed to 'verified'.

Comment 23 errata-xmlrpc 2013-02-21 06:56:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html

