Bug 542756

Summary: Backporting MSI-X mask bit acceleration
Product: Red Hat Enterprise Linux 5
Reporter: Don Dugger <ddugger>
Component: xen
Assignee: Don Dugger <ddugger>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: low
Version: 5.5
CC: clalance, donald.d.dugger, jdenemar, knoel, mrezanin, qing.he, rwu, xen-maint, xin.li, yuzhang
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: xen-3.0.3-104.el5
Doc Type: Bug Fix
Clone Of: 537734
Bug Depends On: 537734
Last Closed: 2010-03-30 08:58:56 UTC
Attachments:
  tools changes (flags: none)
  MSI-X mask bit acceleration (flags: none)

Description Don Dugger 2009-11-30 18:03:25 UTC
+++ This bug was initially created as a clone of Bug #537734 +++
(This BZ is for the Tools changes needed for MSI-X mask bit acceleration)

Description of problem:

Guest MSI-X mask/unmask operations on kernel 2.6.18 currently go to Dom0 and QEMU, which has a significant impact on overall CPU utilization and throughput.
Xen-unstable already has a feature called mask bit acceleration that skips QEMU; it needs to be backported to improve SR-IOV performance for 2.6.18 guests.
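(For context on the operation being accelerated: the guest masks or unmasks an MSI-X vector by writing the Vector Control dword of that vector's MSI-X table entry. The sketch below is illustrative only and uses hypothetical names; it follows the PCI MSI-X table layout and is not taken from the Xen or qemu-xen sources.)

/* Illustrative sketch: per the PCI spec, each MSI-X table entry is 16
 * bytes and bit 0 of its Vector Control dword is the per-vector mask
 * bit.  In an HVM guest every write like this traps; without mask bit
 * acceleration the trap is forwarded through Dom0 to QEMU, with the
 * acceleration QEMU is skipped. */
#include <stdint.h>

#define PCI_MSIX_ENTRY_SIZE        16
#define PCI_MSIX_ENTRY_VECTOR_CTRL 0xC   /* offset of Vector Control dword */
#define PCI_MSIX_VECTOR_CTRL_MASK  0x1   /* per-vector mask bit */

static inline void msix_set_mask(volatile uint8_t *msix_table, int vec, int mask)
{
    volatile uint32_t *ctrl = (volatile uint32_t *)
        (msix_table + vec * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL);

    if (mask)
        *ctrl |= PCI_MSIX_VECTOR_CTRL_MASK;
    else
        *ctrl &= ~PCI_MSIX_VECTOR_CTRL_MASK;
}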

Comment 1 Don Dugger 2009-11-30 18:21:28 UTC
Created attachment 374839 [details]
tools changes

tools: MSI-X mask bit acceleration

from qemu-xen-unstable commit f00f286226e425c355a73fecc492b6e11bac419a

    passthrough: MSI-X mask bit acceleration

    Read MSI-X mask bit directly from the device, since buffered version
    may not be up-to-date when MSI-X mask bit interception is working.
    Also rebind every MSI-X vector on guest PCI BAR rebalancing so that
    MSI-X mask bit intercept handler can get the correct gpa

    [ Also, fix declaration of pt_msix_update_remap in pt-msi.h, which
      was misspelled pt_msi_update_remap. -iwj ]

    Signed-off-by: Qing He <qing.he>
    Signed-off-by: Ian Jackson <ian.jackson.com>

Signed-off-by: Qing He <qing.he>
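
(A rough sketch of the first point in the commit message above, i.e. reading the mask bit from the device's mapped MSI-X table rather than a buffered copy. This is illustrative only; the struct and field names are hypothetical and this is not the actual qemu-xen passthrough code.)

#include <stdint.h>

#define PCI_MSIX_ENTRY_SIZE        16
#define PCI_MSIX_ENTRY_VECTOR_CTRL 0xC
#define PCI_MSIX_VECTOR_CTRL_MASK  0x1

/* Hypothetical per-vector state: a cached copy of the entry plus a
 * mapping of the device's real MSI-X table. */
struct msix_vector {
    uint32_t cached_ctrl;          /* buffered copy, may be stale while
                                      mask bit interception is active */
    volatile uint8_t *dev_table;   /* mapping of the device's MSI-X table */
    int index;                     /* vector number */
};

/* Read the mask state from the device itself rather than trusting the
 * buffered copy. */
static int msix_vector_masked(struct msix_vector *v)
{
    volatile uint32_t *ctrl = (volatile uint32_t *)
        (v->dev_table + v->index * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL);

    return (*ctrl & PCI_MSIX_VECTOR_CTRL_MASK) != 0;
}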

Comment 2 Jiri Denemark 2009-12-21 08:59:40 UTC
Kernel patches required for this were reverted. Changing back to ASSIGNED until a new version of the patch is provided. Also be sure the patch will not break compatibility with older kernels.

Comment 8 Rita Wu 2010-01-22 11:11:48 UTC
Cannot use a VF with kernel-2.6.18-185.el5, which was downloaded from http://people.redhat.com/jwilson/el5


[root@intel-x5550-12-1 ~]# dmesg |grep mmconfig
PCI: Cannot map mmconfig aperture for segment 0
=============
[root@intel-x5550-12-1 ~]# xm dmesg |grep -i vt
(XEN) Intel VT-d has been enabled
(XEN) Intel VT-d snoop control disabled

=============
[root@intel-x5550-12-1 ~]#cat /etc/grub.conf 
...
title Red Hat Enterprise Linux Server (2.6.18-185.el5xen)
	root (hd0,0)
	kernel /xen.gz-2.6.18-185.el5 iommu=force
	module /vmlinuz-2.6.18-185.el5xen ro root=/dev/VolGroup01/LogVol00 pci_pt_e820_access=on
	module /initrd-2.6.18-185.el5xen.img
...
==============

#xm dmesg
...
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000095800 (usable)
(XEN)  0000000000095800 - 00000000000a0000 (reserved)
(XEN)  00000000000e8000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000cefa5800 (usable)
(XEN)  00000000cefa5800 - 00000000d0000000 (reserved)
(XEN)  00000000f0000000 - 00000000f8000000 (reserved)
(XEN)  00000000fec00000 - 00000000fed40000 (reserved)
(XEN)  00000000fed45000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000330000000 (usable)
...
(XEN) mm.c:630:d0 Non-privileged (0) attempt to map I/O space 000fec00

Comment 9 Rita Wu 2010-01-22 11:13:21 UTC
Created attachment 386121 [details]
/var/log/messages

Comment 10 Rita Wu 2010-01-22 11:14:15 UTC
Created attachment 386123 [details]
the output from dmidecode

Comment 11 Bill Burns 2010-01-27 17:47:34 UTC
Opened up some comments that should not have been private.

Comment 12 Miroslav Rezanina 2010-01-29 20:43:20 UTC
A kernel supporting VF and MSI-X acceleration can be found at

https://brewweb.devel.redhat.com/taskinfo?taskID=2235072

Comment 13 Rita Wu 2010-02-01 08:47:17 UTC
Tested on the kernel-xen mentioned in comment 12. I'm wondering whether these are the expected steps and results.

Steps:
1. Create 14 VFs
2. Assign these 14 VFs to 7 guests
3. Install netperf on the host and the 7 guests
4. Use the host as the netserver
5. From each of the 7 guests, run #netperf -H $host_ip -l 20000
6. On the host, run #top to watch the CPU usage over time

Results:
The highest CPU usage is less than 20% and it is always near 10%.

Comment 14 Yufang Zhang 2010-02-01 09:39:31 UTC
(In reply to comment #13)
> Test on the kernel-xen mentioned in comment 12. I'm wondering is it the
> expected steps and results?
> 
> Steps:
> 1. Create 14 VF
> 2.Assign these 14 VF to 7 guests
> 3.Install netperf on host and 7 guests
> 4.let host as netserver

Wouldn't it be better to choose another host as the netserver? We should make sure domain0 doesn't waste too many CPU cycles on network processing.

> 5.From each of the 7 guests, run #netperf -H $host_ip -l 20000
> 6.From host, run #top to see the used CPU in time.
> 
> Results:
> The highest CPU usage is less than 20% and it's always near 10% .

Comment 15 Rita Wu 2010-02-01 11:26:38 UTC
I also used the kernel-xen downloaded from http://people.redhat.com/~ddutile/rhel5/bz547980/.
The highest CPU usage is about 30% and it is always above 22%; I cannot see it increase dramatically.

Comment 16 Don Dugger 2010-02-01 16:45:59 UTC
Qing-

Is this the expected behavior and test methodology for this issue?

Comment 17 Qing He 2010-02-02 01:54:12 UTC
(In reply to comment #15)
> And I used kerner-xen downloaded from
> http://people.redhat.com/~ddutile/rhel5/bz547980/.
> The highest CPU usage is about 30% and it's always above 22%. I cannot see it
> increase dramatically.    

The expected result is a decrease in dom0 overhead; domU CPU utilization should not vary much until the dom0 bottleneck is reached. So please put the netserver on another host and use `xm top' to see the domU CPU utilization. The following is expected:
    (1) without this patch:
        Dom0 CPU usage is high and proportional to the number of working VFs
    (2) with this patch:
        Dom0 CPU usage is low and remains constant as the number of VFs increases.

Comment 18 Rita Wu 2010-02-02 05:44:23 UTC
(In reply to comment #17)
> (In reply to comment #15)
> > And I used kerner-xen downloaded from
> > http://people.redhat.com/~ddutile/rhel5/bz547980/.
> > The highest CPU usage is about 30% and it's always above 22%. I cannot see it
> > increase dramatically.    
> 
> The expected result is an decrease of dom0 overhead, domU CPU utilization
> should not vary much before dom0 bottleneck is reached. So please put netserver
> on another host, and use `xm top' to see the domU CPU utilization, the
> following is expected:
>     (1) without this patch:
>         Dom0 CPU usage is high and proportional to the number of working VFs
>     (2) with this patch:
>         Dom0 CPU usage is low and remains constant as the number of VFs
> increases.    

Should I generate the network load through the VF link or the qemu-emulated NIC?

Comment 19 Qing He 2010-02-02 05:50:22 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > (In reply to comment #15)
> > > And I used kerner-xen downloaded from
> > > http://people.redhat.com/~ddutile/rhel5/bz547980/.
> > > The highest CPU usage is about 30% and it's always above 22%. I cannot see it
> > > increase dramatically.    
> > 
> > The expected result is an decrease of dom0 overhead, domU CPU utilization
> > should not vary much before dom0 bottleneck is reached. So please put netserver
> > on another host, and use `xm top' to see the domU CPU utilization, the
> > following is expected:
> >     (1) without this patch:
> >         Dom0 CPU usage is high and proportional to the number of working VFs
> >     (2) with this patch:
> >         Dom0 CPU usage is low and remains constant as the number of VFs
> > increases.    
> 
> Should I load the network through the VF link or qemu emulated NIC?    

Please use the VF links; only the MSI-X of an assigned device can benefit from this patch.

Comment 20 Rita Wu 2010-02-02 05:57:59 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > (In reply to comment #15)
> > > > And I used kerner-xen downloaded from
> > > > http://people.redhat.com/~ddutile/rhel5/bz547980/.
> > > > The highest CPU usage is about 30% and it's always above 22%. I cannot see it
> > > > increase dramatically.    
> > > 
> > > The expected result is an decrease of dom0 overhead, domU CPU utilization
> > > should not vary much before dom0 bottleneck is reached. So please put netserver
> > > on another host, and use `xm top' to see the domU CPU utilization, the
> > > following is expected:
> > >     (1) without this patch:
> > >         Dom0 CPU usage is high and proportional to the number of working VFs
> > >     (2) with this patch:
> > >         Dom0 CPU usage is low and remains constant as the number of VFs
> > > increases.    
> > 
> > Should I load the network through the VF link or qemu emulated NIC?    
> 
> Please use VF links, only MSI-X of assigned device can benefit from this patch    


Thanks, Qing He.
But right now I cannot use VF links, due to Bug 552348.

Comment 21 Qing He 2010-02-02 06:26:19 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > Please use VF links, only MSI-X of assigned device can benefit from this patch    
> 
> 
> Thanks Qing He.
> But now I cannot use VF links, due to Bug 552348    

It seems I don't have access to that bug.

Anyway, if you just want to test the correctness of this patch, you can also use an ordinary PCI device with MSI-X interrupts; the patch also functions in that situation.

Comment 22 Rita Wu 2010-02-03 06:11:25 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #19)
> > > Please use VF links, only MSI-X of assigned device can benefit from this patch    
> > 
> > 
> > Thanks Qing He.
> > But now I cannot use VF links, due to Bug 552348    
> 
> Seems I don't have the access to that bug.
> 
> Anyway, if you just want to test the correctness of this patch, you can also
> use ordinary PCI devices with MSI-X interrupts, the patch will also function in
> this situation.    

Hi Qing He,

Can you check whether my results are as expected?

1. Create 2 guests with 1 PCI device
2. Run #netperf -H $another_host_ip -l 2000 -- -m 64 -M 64 in the guests one by one

Results:
Without the patch: the CPU utilization (#xm top) is about 8% => 50% => 70%
With the patch: the CPU utilization (#xm top) is constant (16%~23% => 18%~22% => 18%~22%)

Comment 23 Qing He 2010-02-03 06:24:03 UTC
(In reply to comment #22)
> Hi Qing He,
> 
> Can you check whether my results is expected?
> 
> 1.Create 2 guests with 1 PCI device

You mean you have 2 PCI devices and assigned them to the 2 guests, respectively, is that right?

> 2.run #netperf -H $another_host_ip -l 2000 -- -m 64 -M 64 in guest one by one
> 
> Results:
> Without the patch: the cpu utility(#xm top) is about 8% =>50%=>70% 
> With the patch: the cpu utility(#xm top) is constant. (16%~23%
> =>18%~22%=>18%~22%)    

Are these numbers the total CPU utilization or only the utilization of Dom0?
I expect similar results; however, there's a question: is this 8% the CPU utilization when idle (no netperf load)? When both domains are idle, there shouldn't be a significant difference in CPU utilization, since the devices aren't working at that time, but 16%~23% is far higher than 8%.

Comment 24 Rita Wu 2010-02-03 08:03:33 UTC
(In reply to comment #23)
> (In reply to comment #22)
> > Hi Qing He,
> > 
> > Can you check whether my results is expected?
> > 
> > 1.Create 2 guests with 1 PCI device
> 
> you mean you have 2 PCI devices and assign them to 2 guests, respectively, is
> that right?

Yes, each guest has its own PCI device.
> 
> > 2.run #netperf -H $another_host_ip -l 2000 -- -m 64 -M 64 in guest one by one
> > 
> > Results:
> > Without the patch: the cpu utility(#xm top) is about 8% =>50%=>70% 
> > With the patch: the cpu utility(#xm top) is constant. (16%~23%
> > =>18%~22%=>18%~22%)    
> 
> Is these numbers the total CPU utilization or only the utilization of Dom0?

These numbers are the utilization of Dom0, which I got from "xm top".

> I expect similar results, however, there's a question: This 8%, is it the cpu
> utilization when idle (no netperf load)? When two domains are both idle, there
> shouldn't be significant difference on CPU utilization, since the devices
> aren't working at the time, but 16%-23% is far higher than 8%    

Sorry, I recorded that number while the 2 guests were booting; that's why it was 16%~23%. I retested: with the new kernel-xen + new xen, the utilization of Dom0 stays around 10% from idle to loading the network. Even when the utilization of DomU is high (about 50%), the utilization of Dom0 is still relatively low (10%).

Comment 25 Qing He 2010-02-03 08:22:22 UTC
(In reply to comment #24)
> And I retest again. With new kernel-xen +new xen, the utilization
> of Dom0 stays about 10% from idle to loading network. And even the utilization
> of DomU is hign(about 50%), then utilization of Dom0 is still relative low
> (10%).    

The dom0 part is exactly what we want to see from the patch: it releases dom0 from handling MSI-X interrupts and thus removes this bottleneck.

For the domU, the CPU utilization should remain at a similar level to the unpatched case. What CPU percentage do you observe for the domU without the patch?

Comment 26 Rita Wu 2010-02-03 08:47:46 UTC
(In reply to comment #25)
> (In reply to comment #24)
> > And I retest again. With new kernel-xen +new xen, the utilization
> > of Dom0 stays about 10% from idle to loading network. And even the utilization
> > of DomU is hign(about 50%), then utilization of Dom0 is still relative low
> > (10%).    
> 
> The dom0 part is exactly what we want to see out of the patch, to release dom0
> from handling MSI-X interrupts thus remove this bottleneck.
> 
> For the domU, the cpu utilization should remain at a similar level as the
> unpatched case, what CPU percentage do you observe for domU without the patch?    

Yes, without the new patch, the utilization of DomU is also about 40%~50% when loading the network. That is the expected result, right?

Comment 27 Qing He 2010-02-03 08:55:40 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > (In reply to comment #24)
> > > And I retest again. With new kernel-xen +new xen, the utilization
> > > of Dom0 stays about 10% from idle to loading network. And even the utilization
> > > of DomU is hign(about 50%), then utilization of Dom0 is still relative low
> > > (10%).    
> > 
> > The dom0 part is exactly what we want to see out of the patch, to release dom0
> > from handling MSI-X interrupts thus remove this bottleneck.
> > 
> > For the domU, the cpu utilization should remain at a similar level as the
> > unpatched case, what CPU percentage do you observe for domU without the patch?    
> 
> Yes,without the new patch, the utilization of DomU is also about 40%~50% when
> loading network. It is the expected results, right?    

Yes, it is.

Comment 28 Rita Wu 2010-02-03 09:00:32 UTC
Thanks Qing He. :)

Verified on kernel-xen from https://brewweb.devel.redhat.com/taskinfo?taskID=2235072  + xen-104

Comment 30 errata-xmlrpc 2010-03-30 08:58:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0294.html

Comment 31 Paolo Bonzini 2010-04-08 15:51:29 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).