Bug 663143

Summary: All VMs lost network connectivity
Product: Red Hat Enterprise Linux 5
Component: xen
Version: 5.5
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: low
Target Milestone: rc
Reporter: Simon Gao <gao>
Assignee: Laszlo Ersek <lersek>
QA Contact: Virtualization Bugs <virt-bugs>
CC: drjones, lersek, xen-maint
Doc Type: Bug Fix
Last Closed: 2010-12-15 20:07:41 UTC
Attachments:
Xen log (flags: none)
egrep 'Balloon: [0-9]{1,5} KiB free' (flags: none)
Syslog containing memory squeeze messages (flags: none)

Description Simon Gao 2010-12-14 19:56:07 UTC
Description of problem:

Xen VMs lost network connectivity

Version-Release number of selected component (if applicable):

RHEL 5.5 2.6.18-194.26.1.el5xen

How reproducible:


Steps to Reproduce:
1. Install VMs 
2. Start more than 12 VMs
3. Add an additional VM
  
Actual results:

All VMs lost network connectivity

Expected results:

There is plenty of free physical memory, and dom0 does not run out of memory.
Network connectivity should work on all VMs.

Additional info:

Logged messages:

xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.

Comment 1 Andrew Jones 2010-12-14 22:04:08 UTC
This is likely a dup of bug 653262.

Comment 2 Laszlo Ersek 2010-12-15 11:52:26 UTC
(In reply to comment #1)
> This is likely a dup of bug 653262.

Simon, can you please post the host's "/proc/xen/balloon" and xend's DEBUG log? Thanks.

Comment 3 Simon Gao 2010-12-15 17:44:44 UTC
Created attachment 468916
Xen log

Comment 4 Simon Gao 2010-12-15 17:45:25 UTC
# cat /proc/xen/balloon 
Current allocation: 14086144 kB
Requested target:   14086144 kB
Low-mem balloon:    18405040 kB
High-mem balloon:          0 kB
Driver pages:           1024 kB
Xen hard limit:          ??? kB

Comment 5 Laszlo Ersek 2010-12-15 19:04:38 UTC
Created attachment 468939
egrep 'Balloon: [0-9]{1,5} KiB free'

From the /proc/xen/balloon contents in comment 4 it seems that there is plenty of free memory; Current allocation == Requested target. However, this may not have been the case while all those guests were starting up. Several lines in the xend log suggest that the balloon was running low; see the attachment.
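
For repeated checks, the "Current allocation == Requested target" comparison can be scripted. The following is a minimal sketch, assuming the /proc/xen/balloon layout shown in comment 4; the function names `parse_balloon` and `at_target` are made up for illustration, and the sample text is copied from that comment rather than read from a live host.

```python
import re

# Sample text copied from comment 4; on a host it would be read from
# /proc/xen/balloon instead.
SAMPLE = """\
Current allocation: 14086144 kB
Requested target:   14086144 kB
Low-mem balloon:    18405040 kB
High-mem balloon:          0 kB
Driver pages:           1024 kB
Xen hard limit:          ??? kB
"""


def parse_balloon(text):
    """Return {field name: size in kB, or None if the value is not numeric}."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+):\s+(\S+)\s+kB\s*$", line)
        if not m:
            continue
        name, value = m.group(1), m.group(2)
        fields[name] = int(value) if value.isdigit() else None
    return fields


def at_target(fields):
    """True if the current allocation has reached the requested target."""
    cur = fields.get("Current allocation")
    tgt = fields.get("Requested target")
    return cur is not None and cur == tgt


fields = parse_balloon(SAMPLE)
print(at_target(fields))
```

A single snapshot only describes the moment it was taken, so sampling this in a loop while the guests start up would be more telling than one reading afterwards.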

It would be interesting to correlate (by timestamp) these xend.log entries with the netback memory pressure lines in the syslog.
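
One way to do that correlation is sketched below. It is an illustration under several assumptions, not a tested tool: each xend.log record is taken to sit on a single line, the syslog timestamps (which carry no year) are assumed to belong to the year passed in, month names are parsed in the C/English locale, and the 30-second window and all function names are arbitrary choices. The sample lines are taken from the logs quoted in this report.

```python
import re
from datetime import datetime, timedelta

# xend.log balloon record, e.g.
# [2010-12-14 10:33:22 xend 6413] DEBUG (balloon:151) Balloon: 1032 KiB free; ...
BALLOON_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) xend \d+\]\s+"
    r"DEBUG \(balloon:\d+\) Balloon: (\d+) KiB free"
)
# syslog record, e.g.
# Dec 14 10:33:31 xenserver kernel: xen_net: Memory squeeze in netback driver.
SQUEEZE_RE = re.compile(
    r"(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ kernel:\s*xen_net: Memory squeeze"
)


def balloon_events(lines, max_free_kib=99999):
    """Return (timestamp, free KiB) for balloon records at or below the limit.

    The default limit mirrors the egrep pattern [0-9]{1,5} (at most 5 digits).
    """
    events = []
    for line in lines:
        m = BALLOON_RE.search(line)
        if m and int(m.group(2)) <= max_free_kib:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((ts, int(m.group(2))))
    return events


def squeeze_events(lines, year):
    """Return timestamps of netback memory squeeze messages (year is assumed)."""
    events = []
    for line in lines:
        m = SQUEEZE_RE.search(line)
        if m:
            events.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return events


def correlate(balloons, squeezes, window=timedelta(seconds=30)):
    """Pair each squeeze with balloon records logged at most `window` before it."""
    pairs = []
    for s in squeezes:
        near = [b for b in balloons if timedelta(0) <= s - b[0] <= window]
        pairs.append((s, near))
    return pairs


xend_lines = [
    "[2010-12-14 10:33:22 xend 6413] DEBUG (balloon:151) Balloon: 1032 KiB free; 0 to scrub; need 4096; retries: 20.",
    "[2010-12-14 10:33:22 xend 6413] DEBUG (balloon:145) Balloon: 4104 KiB free; need 4096; done.",
    "[2010-12-14 10:33:22 xend 6413] DEBUG (balloon:151) Balloon: 4096 KiB free; 0 to scrub; need 2097152; retries: 25.",
]
syslog_lines = [
    "Dec 14 10:33:31 xenserver kernel: xen_net: Memory squeeze in netback driver.",
    "Dec 14 10:33:36 xenserver kernel: xen_net: Memory squeeze in netback driver.",
]

balloons = balloon_events(xend_lines)
squeezes = squeeze_events(syslog_lines, year=2010)
pairs = correlate(balloons, squeezes)
for when, near in pairs:
    print(when, [free for _, free in near])
```

The window only looks backwards from each squeeze message, on the assumption that a low balloon would precede the netback complaint rather than follow it.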

Is auto-ballooning turned on?

  # grep -B3 auto-balloon /etc/xen/xend-config.sxp
  
  # Automatically balloon dom0 down if try to balloon domU up for more memory
  # that is free.
  (auto-balloon-dom0 no)

Whatever causes the memory pressure, ballooning or not, the fixes for bug 653262 and bug 653501 take page reassignment between guest and host (flipping) out of the netfront-netback communication. Is it possible to repeat the many-guests test with those fixes applied?

Comment 6 Simon Gao 2010-12-15 19:17:30 UTC
Created attachment 468941
Syslog containing memory squeeze messages

Comment 7 Simon Gao 2010-12-15 19:18:53 UTC
If auto-ballooning is turned on by default, then yes. 

# grep -i balloon /etc/xen/xend-config.sxp
# Dom0 will balloon out when needed to free memory for domU.
# If dom0-min-mem=0, dom0 will never balloon out.
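
The grep above matches only comment lines, because an active setting line would not contain the word "balloon". A minimal sketch for listing the uncommented settings instead follows; it assumes settings are written one per line as `(name value)`, as in the `(auto-balloon-dom0 no)` line quoted in comment 5. The function name `active_settings` and the value 256 in the sample are hypothetical and are not taken from this host.

```python
import re

# One setting per line, written as "(name value)".
SETTING_RE = re.compile(r"^\s*\(([\w-]+)\s+([^)]*)\)\s*$")


def active_settings(text, names):
    """Return {name: value} for uncommented settings whose name is in `names`."""
    found = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines, including commented-out settings
        m = SETTING_RE.match(line)
        if m and m.group(1) in names:
            found[m.group(1)] = m.group(2).strip()
    return found


# Sample text: the two comment lines are from the grep output above;
# the setting line and its value are made up for illustration.
SAMPLE = """\
# Dom0 will balloon out when needed to free memory for domU.
# If dom0-min-mem=0, dom0 will never balloon out.
(dom0-min-mem 256)
"""

print(active_settings(SAMPLE, {"dom0-min-mem", "auto-balloon-dom0"}))
```

On the host, the text would come from /etc/xen/xend-config.sxp; an empty result would mean that neither setting is present as an active line.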

Comment 8 Laszlo Ersek 2010-12-15 20:07:41 UTC
For example these are very close to each other:

[2010-12-14 10:33:22 xend 6413]
DEBUG (balloon:151) Balloon: 1032 KiB free; 0 to scrub; need 4096; retries: 20.

[2010-12-14 10:33:22 xend 6413]
DEBUG (balloon:145) Balloon: 4104 KiB free; need 4096; done.

[2010-12-14 10:33:22 xend 6413]
DEBUG (balloon:151) Balloon: 4096 KiB free; 0 to scrub; need 2097152;
retries: 25.

and

messages:Dec 14 10:33:31 xenserver kernel: 
xen_net: Memory squeeze in netback driver.

messages:Dec 14 10:33:36 xenserver kernel:
xen_net: Memory squeeze in netback driver.

messages:Dec 14 10:33:43 xenserver kernel:
xen_net: Memory squeeze in netback driver.

messages:Dec 14 10:33:46 xenserver kernel:
xen_net: Memory squeeze in netback driver.

The fixes for bug 653262 and bug 653501 make the memory squeeze path in netback unreachable, so I'm closing this as a duplicate for now.

*** This bug has been marked as a duplicate of bug 653262 ***