Bug 662649

Summary: VM network does not work after suspend/resume laptop
Product: Red Hat Enterprise Linux 6 Reporter: Mauricio Teixeira <mteixeira>
Component: libvirt Assignee: Laine Stump <laine>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: bsarathy, dallan, eblake, jyang, xen-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-16 22:22:37 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 756082    
Attachments:
Description Flags
Files requested by comment #4 none
Another round of tests none

Description Mauricio Teixeira 2010-12-13 13:59:08 UTC
Description of problem:

I run VMs on my laptop with RHEL 6 Workstation for testing purposes. After every suspend/resume cycle, the VMs lose their ability to access the network. I am using NAT only. The VMs get an IP address and can ping the NAT gateway (the hypervisor) and the host IP, but can't ping anything else.

Version-Release number of selected component (if applicable):

libvirt-0.8.1-27.el6.x86_64
kernel-2.6.32-71.7.1.el6.x86_64

How reproducible:

Every time.

Steps to Reproduce:
1. Boot host.
2. Use VM.
3. Shutdown VM.
4. Suspend host.
5. Bring host back.
6. Power up VM.
7. Test network.
  
Actual results:

Network not accessible.

Expected results:

Should work. :)

Additional info:

Comment 4 Laine Stump 2011-01-06 21:32:03 UTC
At the moment I don't have a machine running RHEL6 that I can get to suspend and resume properly *at all*. I have checked with Fedora 13, and guests there are able to access the external network with no problem after a suspend/resume.

I will work on getting my RHEL box doing basic suspend/resume. In the meantime, can you please attach the output of the following (from the host, not the guest), both pre-suspend and post-resume:

    iptables -S -t nat
    iptables -S
    ifconfig
    brctl show

Maybe something in there will provide a clue.
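A minimal sketch of how the four outputs requested above could be collected in one go, once before the suspend and once after the resume. The `capture` and `snapshot` helpers, the `SNAPSHOT_DIR` variable, and the short file names are assumptions made for illustration; only the four commands themselves come from the request.

```shell
#!/bin/sh
# capture PREFIX NAME COMMAND...
# Run COMMAND and save its stdout and stderr to PREFIX-NAME.txt.
# SNAPSHOT_DIR is a hypothetical variable; it defaults to the current directory.
capture() {
    prefix=$1
    name=$2
    shift 2
    "$@" > "${SNAPSHOT_DIR:-.}/${prefix}-${name}.txt" 2>&1
}

# snapshot PREFIX
# Collect the four host-side outputs. Must be run as root on the host.
# The file name suffixes are assumed, not prescribed by the request.
snapshot() {
    capture "$1" iptables-nat    iptables -S -t nat
    capture "$1" iptables-filter iptables -S
    capture "$1" ifconfig        ifconfig
    capture "$1" brctl           brctl show
}
```

Under these assumptions, `snapshot before` would be run ahead of the suspend and `snapshot after` once the host is back, leaving two sets of files that can be attached to the bug.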

Also, just to confirm - the *host* is able to contact the external network after resume, correct?

Comment 5 Mauricio Teixeira 2011-01-13 10:28:35 UTC
Ok, now this does not make sense to me...

First, note there have been some updates after I opened the ticket:
libvirt-0.8.1-27.el6.x86_64
kernel-2.6.32-71.14.1.el6.x86_64

1 - I initially generated the before-*.txt files, with Windows 7 + RHEL 5.5 VMs running.
2 - Hibernate
3 - Resume
4 - Bring up Windows 7 VM, verified there is no network
5 - Generate the after-*.txt files
6 - Bring up RHEL 5.5 VM, verified that network now works (odd)
7 - Reboot Windows 7 VM, and now network works (very odd)
8 - Generate the later-*.txt files

Now I don't understand. :)

Comment 6 Mauricio Teixeira 2011-01-13 10:29:55 UTC
Created attachment 473287 [details]
Files requested by comment #4

Comment 7 Laine Stump 2011-01-13 21:23:20 UTC
The thing I suspected wasn't the case - I had thought possibly the iptables rules that do the NAT were getting stomped by something during the resume, but those seem to be fine.
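For anyone repeating this check, here is a rough sketch of comparing two sets of attached files by name. It assumes the files follow a PREFIX-NAME.txt pattern such as before-brctl.txt and after-brctl.txt; the `compare_snapshots` function is a hypothetical helper, not something shipped with libvirt.

```shell
#!/bin/sh
# compare_snapshots DIR OLD NEW
# For every DIR/OLD-*.txt, report whether the matching DIR/NEW-*.txt
# is identical ("same"), different ("CHANGED"), or absent ("MISSING").
compare_snapshots() {
    dir=$1
    old=$2
    new=$3
    for f in "$dir/$old"-*.txt; do
        [ -e "$f" ] || continue          # glob matched nothing
        name=${f##*/}                    # strip the directory part
        name=${name#"$old"-}             # strip the OLD- prefix
        other="$dir/$new-$name"
        if [ ! -e "$other" ]; then
            echo "MISSING $name"
        elif cmp -s "$f" "$other"; then
            echo "same $name"
        else
            echo "CHANGED $name"
        fi
    done
}
```

Any file reported as CHANGED can then be inspected by hand with `diff -u` on the two versions.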

The one item that changes is the output of brctl show, but not in a way that would imply what you're seeing:

before-brctl.txt: shows that there are no interfaces connected to virbr0, which would imply either that no guests are running, or at least that those guests are not connected to the network, and so they wouldn't be able to connect to *anywhere* (including the host).

after-brctl.txt: shows that two interfaces are connected to the bridge, presumably one for each guest, but you say that at the time you collected the after-* files, you had only one guest running, and it had no network connectivity.

later-brctl.txt: again shows no interfaces connected to virbr0, implying that there are no guests connected to the network, yet you say that at this point both guests are able to connect to the network.

Can you verify this is really the case? The evidence doesn't fit the description...
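To make the brctl comparison less error-prone, the interfaces attached to a given bridge can be pulled out of the `brctl show` output mechanically. The following is only a sketch: `bridge_ports` is a hypothetical helper, and it assumes the usual tab-separated layout in which the first row of each bridge carries the name, id, STP flag and first interface, and further interfaces follow on indented rows.

```shell
#!/bin/sh
# bridge_ports BRIDGE
# Read `brctl show` output on stdin and print the interfaces attached to BRIDGE,
# one per line. Prints nothing if the bridge has no interfaces.
bridge_ports() {
    awk -v target="$1" '
        NR == 1 { next }                 # skip the header row
        { iface = "" }                   # reset for every input row
        /^[^ \t]/ {                      # a bridge row starts in column 1
            bridge = $1
            if (NF >= 4) iface = $4      # fourth field is the first interface
        }
        /^[ \t]/ && NF >= 1 {            # indented row: one more interface
            iface = $1
        }
        bridge == target && iface != "" { print iface }
    '
}
```

For example, `brctl show | bridge_ports virbr0` should print nothing while no guest is attached, and one vnetN line per attached guest interface otherwise.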

(The "vnet0" and "vnet1" are tap devices connected to the guests' emulated network interfaces; you can learn which tap device is in use for which guest by doing a "virsh dumpxml" of each running domain and looking in the <interface> section. Speaking of that, the output of "virsh dumpxml" for both of the guests, as well as the output of "virsh net-dumpxml default", may (but probably won't) provide a hint.)
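The tap-device lookup described above can be sketched as a small filter over the domain XML. `tap_devices` and `guest_taps` are hypothetical helpers; the sketch assumes that `virsh dumpxml` prints one element per line, as it normally does, and that the installed virsh supports `virsh list --name`, which older releases may not.

```shell
#!/bin/sh
# tap_devices
# Read domain XML on stdin and print the dev attribute of every <target>
# found inside an <interface> element (disk targets are ignored).
tap_devices() {
    sed -n "/<interface/,/<\/interface>/s/.*<target dev=['\"]\([^'\"]*\)['\"].*/\1/p"
}

# guest_taps
# Print "DOMAIN TAPDEV" for every running domain.
# Assumes `virsh list --name` is available and domain names contain no spaces.
guest_taps() {
    for dom in $(virsh list --name); do
        virsh dumpxml "$dom" | tap_devices | while read -r dev; do
            echo "$dom $dev"
        done
    done
}
```

Where `virsh list --name` is not available, the same filter can be applied by hand, for instance `virsh dumpxml GUESTNAME | tap_devices` with GUESTNAME replaced by the actual domain name.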

Comment 8 Mauricio Teixeira 2011-01-14 12:18:32 UTC
Created attachment 473515 [details]
Another round of tests

I ran another round of tests. The file 00-steps.txt explains what I did and when; each of the other files contains the output of the chain of commands run during each step.