Bug 625368 - xend should give warning message to prevent creating when guest vcpus >32 on ia64 platform
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.6
Hardware: ia64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Michal Novotny
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 514500
 
Reported: 2010-08-19 08:01 UTC by XinSun
Modified: 2014-02-02 22:38 UTC
10 users

Fixed In Version: xen-3.0.3-126.el5
Doc Type: Bug Fix
Doc Text:
On the Itanium platform, it was possible to create a HVM (Hardware Virtual Machine) guest with more than 32 virtual CPUs (VCPUs) without any warning messages. Consequently, the "xm list" command reported only one VCPU on that machine. With this update, a patch has been provided to disallow HVM guest creation with more than 32 VCPUs on the Itanium platform, as this is not supported, thus preventing this bug.
Clone Of:
Environment:
Last Closed: 2011-07-21 09:15:04 UTC
Target Upstream Version:
Embargoed:


Attachments
xend_log_pv.txt (9.12 KB, text/plain)
2010-08-19 08:01 UTC, XinSun
xend_log_hvm.txt (11.35 KB, text/plain)
2010-08-19 08:02 UTC, XinSun
xm_dmesg.txt (15.71 KB, text/plain)
2010-08-19 08:03 UTC, XinSun
rhel5u5-ia64-hvm (604 bytes, text/plain)
2010-08-19 08:05 UTC, XinSun
rhel5u5-ia64-pv (579 bytes, text/plain)
2010-08-19 08:06 UTC, XinSun
xm dmesg, create hvm guest with 33 vcpus (5.44 KB, text/plain)
2011-03-09 12:33 UTC, Qixiang Wan
xend.log, create hvm guest with 33 vcpus (10.80 KB, text/plain)
2011-03-09 12:34 UTC, Qixiang Wan
Patch to disallow HVM guest creation with more than 32 VCPUs on ia64 platform (4.50 KB, patch)
2011-03-10 16:14 UTC, Michal Novotny
Patch to disallow *any* guest creation with more than 32 VCPUs (4.06 KB, patch)
2011-03-11 13:30 UTC, Michal Novotny
Patch to move VCPU count check to user-space component (3.07 KB, patch)
2011-03-14 11:52 UTC, Michal Novotny


Links
Red Hat Product Errata RHBA-2011:1070 (normal, SHIPPED_LIVE): xen bug fix and enhancement update, last updated 2011-07-21 09:12:56 UTC

Description XinSun 2010-08-19 08:01:10 UTC
Created attachment 439612 [details]
xend_log_pv.txt

Description of problem:
On the x86_64 and i386 platforms, when you try to create a guest with vcpus > 32, xend gives a warning and prevents the create operation. On the ia64 platform, however, you can create the guest even with vcpus > 32. If you then run "xm list", you will find only 1 VCPU listed and the state is not normal.

Version-Release number of selected component (if applicable):
xen-3.0.3-115.el5
kernel-xen-2.6.18-211.el5

How reproducible:
Always

Steps to Reproduce:
1.Create a hvm guest with vcpus=33
# xm create rhel5-ia64-hvm vcpus=33
2.Create a pv guest with vcpus=33
# xm create rhel5-ia64-pv vcpus=33
3.Run "xm list" to check
  
Actual results:
1. After step 1, the HVM guest is created without any error.
2. After step 2, the PV guest is also created.
3. After step 3, "xm list" shows:
[root@dhcp-66-82-130 xen]# xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----  13942.5
rhel5u5-ia64-hvm                          42      527     1 ------      0.0
rhel5u5-ia64-pv                           41      511    33 -b----    150.6


Expected results:
xend should give warning messages, like:
Using config file ".rhel5-ia64-hvm".
Error: (22, 'Invalid argument')

Additional info:
Attachments added: xend_log_hvm.txt, xend_log_pv.txt,
                   xm_dmesg.txt,
                   rhel5u5-ia64-hvm (guest config file),
                   rhel5u5-ia64-pv

Comment 1 XinSun 2010-08-19 08:02:03 UTC
Created attachment 439613 [details]
xend_log_hvm.txt

Add xend_log_hvm.txt

Comment 2 XinSun 2010-08-19 08:03:59 UTC
Created attachment 439614 [details]
xm_dmesg.txt

Add xm_dmesg.txt

Comment 3 XinSun 2010-08-19 08:05:54 UTC
Created attachment 439615 [details]
rhel5u5-ia64-hvm

Add rhel5u5-ia64-hvm

Comment 4 XinSun 2010-08-19 08:06:35 UTC
Created attachment 439616 [details]
rhel5u5-ia64-pv

Add rhel5u5-ia64-pv

Comment 6 YangGuang 2010-11-08 09:10:46 UTC
Version-Release number of selected component (if applicable):
xen-3.0.3-117.el5
kernel-xen-2.6.18-230.el5


Steps to Reproduce:
1.Create a hvm guest with vcpus=33
# xm create rhel5-ia64-hvm vcpus=33
2.Run "xm list" to check

Actual results:
1. After step 1, the HVM guest is created without any error.
2. After step 2, "xm list" shows:
[root@dhcp-66-83-230 xen]# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----   1935.9
rhel5u5-64hvm-ia64                        13     1039     1 ------      0.0

Expected results:
xend should give warning messages, like:
Using config file ".rhel5-ia64-hvm".
Error: (22, 'Invalid argument')

Comment 7 Michal Novotny 2011-03-02 11:29:50 UTC
Well, I've been investigating this a little, and AFAIK the problem comes from elsewhere: the -EINVAL error is returned directly by the xen hypervisor, not by the user-space stack.

I was looking through the xen hypervisor sources and found the following defines:

include/public/arch-x86/xen.h:#define MAX_VIRT_CPUS 32
include/public/arch-ia64.h:#define MAX_VIRT_CPUS 64

So the theoretical VCPU limit on ia64 is 64, not 32 as on the i386 and x86_64 platforms.

The xen hypervisor code has:

        ret = -EINVAL;
        if ( max > MAX_VIRT_CPUS )
            break;

where MAX_VIRT_CPUS comes from the public header files, e.g. include/public/arch-ia64.h for the ia64 platform. As you can see, this check does not fail on ia64 because the hypervisor thinks it supports 64 vCPUs, not just 32. Why `xm list` doesn't show 33 vCPUs on ia64 for your steps is a separate issue, however, since this failure path is in the hypervisor, and libxc just issues a hypercall with the appropriate domctl (i.e. XEN_DOMCTL_max_vcpus: first the maximum vcpu count is set, then the real number of VCPUs, AFAIK).
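To illustrate the per-architecture behavior, here is a minimal Python sketch of the check the hypervisor effectively performs; the table mirrors the defines quoted above, and the function name is made up for illustration, not actual xen/libxc code:

# Minimal sketch, not actual xen/libxc code: the constants mirror the
# MAX_VIRT_CPUS defines quoted above; hypervisor_accepts() is made up.
MAX_VIRT_CPUS = {
    'i386':   32,  # include/public/arch-x86/xen.h
    'x86_64': 32,  # include/public/arch-x86/xen.h
    'ia64':   64,  # include/public/arch-ia64.h
}

def hypervisor_accepts(arch, max_vcpus):
    """Model of the hypervisor check: XEN_DOMCTL_max_vcpus fails with
    -EINVAL only when the requested maximum exceeds MAX_VIRT_CPUS."""
    return max_vcpus <= MAX_VIRT_CPUS[arch]

# This is why ia64 accepts vcpus=33 silently but rejects vcpus=65:
assert hypervisor_accepts('ia64', 33)
assert not hypervisor_accepts('ia64', 65)
assert not hypervisor_accepts('x86_64', 33)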

I'm changing the component to kernel-xen, since the xen hypervisor is part of the kernel-xen component.

Michal

Comment 8 Andrew Jones 2011-03-02 12:20:20 UTC
I'm a bit confused. The description makes it sound like the -EINVAL isn't actually occurring, but rather that it's what the report "expects" (it's under a heading of 'expected results'), and the actual result is that 'xm list' shows the listing without any errors, but with the wrong vcpu count.

'xm dmesg' shows that the hvm guest got all 33 vcpus created. Does the ia64-hvm guest work? I see no sign from these logs that it doesn't.

The only thing I see wrong is that 'xm list' is reporting 1 instead of 33, which sounds like a problem in xend (it's assuming 32 for ia64-hvm guests, or something). So the expected result should be 33 instead of 1, not an error.

Can the reporter please clarify things? I don't have an ia64 box set up right now to play with it myself.

Comment 9 Andrew Jones 2011-03-03 16:03:44 UTC
Adding the needinfo flag for my questions in the previous comment.

Comment 10 Andrew Jones 2011-03-07 15:39:12 UTC
Please also try listing/getting domain info with virsh, to run it through libvirt and see what happens.

Comment 11 Qixiang Wan 2011-03-09 12:32:19 UTC
[1] HVM guest:

If vcpus > 32, qemu-dm becomes a zombie after the guest is created, so of course the guest can't boot up and work.

$ xm cr rhel5u5-ia64fv 
Using config file "./rhel5u5-ia64fv".
Started domain rhel5u5_ia64fv

$ ps aux | grep qemu
root     26075  0.7  0.0      0     0 ?        Z    15:15   0:01 [qemu-dm] <defunct>
root     26316  0.0  0.0  61408  1808 pts/2    S+   15:18   0:00 grep qemu

$ xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----  54988.0
rhel5u5_ia64fv                            40     1039     1 ------      0.0

$ virsh dominfo rhel5u5_ia64fv
Id:             40
Name:           rhel5u5_ia64fv
UUID:           d4d849bc-5778-0635-9d03-f55ace306d40
OS Type:        hvm
State:          no state
CPU(s):         1
CPU time:       0.0s
Max memory:     1065024 kB
Used memory:    1064896 kB
Persistent:     no
Autostart:      disable

[2] PV guest:

The domain boots up with 33 vcpus; you can see all 33 both inside the guest (cat /proc/cpuinfo) and outside (xm list). The guest works well after boot.

$ xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----  54992.3
rhel5u5-ia64pv                            41      511    33 -b----    344.4

[3] Additional Info:

(1) Creating an HVM/PV guest with vcpus > 64 fails with an 'Invalid argument' error.

$ xm create rhel5u5-ia64fv vcpus=65
Using config file "./rhel5u5-ia64fv".
Error: (22, 'Invalid argument')

Comment 12 Qixiang Wan 2011-03-09 12:33:25 UTC
Created attachment 483197 [details]
xm dmesg, create hvm guest with 33 vcpus

Comment 13 Qixiang Wan 2011-03-09 12:34:07 UTC
Created attachment 483198 [details]
xend.log, create hvm guest with 33 vcpus

Comment 15 Andrew Jones 2011-03-09 18:17:49 UTC
Thanks for the clarification and machine access. And thanks for helping pinpoint the failure to qemu with the ps output. Taking the next step and looking in the qemu log, we see:

Fatal error while trying to get io event!

Which points us to cpu_get_ioreq() and this code

...
        for ( i = 0; i < vcpus; i++ )
            if ( ioreq_local_port[i] == port )
                break;

        if ( i == vcpus ) {
            fprintf(logfile, "Fatal error while trying to get io event!\n");
            exit(1);
        }
...

pointing us to a place where qemu assumes max-vcpus=32:

//the evtchn port for polling the notification,
#define NR_CPUS 32
evtchn_port_t ioreq_local_port[NR_CPUS];

Bumping NR_CPUS to 64 fixed qemu, but 'xm list' still showed 1 vcpu. Using virt-viewer I connected and saw we were at the EFI prompt (no autoboot setup). So I booted linux from fs0. This woke up more vcpus; I got messages like these in 'xm dmesg':

(XEN) vpd base: 0xf000000007bc0000, vpd size:65536
(XEN) Allocate domain vhpt at 0xf000000137000000(16MB)
(XEN) Allocate domain vtlb at 0xf000000189230000(32KB)
(XEN) ivt_base: 0xf000000004010000
(XEN) vlsapic.c:297: VLSAPIC inservice base=f000000007a983c0
(XEN) arch_boot_vcpu: vcpu 1 awaken 000000003e29f530!
...
(XEN) vpd base: 0xf0000000078e0000, vpd size:65536
(XEN) Allocate domain vhpt at 0xf000000121000000(16MB)
(XEN) Allocate domain vtlb at 0xf000000189140000(32KB)
(XEN) ivt_base: 0xf000000004010000
(XEN) vlsapic.c:297: VLSAPIC inservice base=f000000007be83c0
(XEN) arch_boot_vcpu: vcpu 31 awaken 000000003e29f530!

However, it only woke up another 31, so 'xm list' showed 32, and after it booted up 'grep proc /proc/cpuinfo | wc -l' was also 32.

It appears other hard-coded assumptions that max-vcpus is 32 are being hit/enforced. At this point in xen's and ia64's life, I agree with qwan: just make it an error for a config to specify more. Returning this to xen userspace to add the condition. hvm-ia64 guests should not have > 32 vcpus.

Comment 16 Michal Novotny 2011-03-10 16:00:08 UTC
Since we want to have this check in user-space, I'm working on it. The patch will be coming soon.

Michal

Comment 17 Michal Novotny 2011-03-10 16:14:17 UTC
Created attachment 483504 [details]
Patch to disallow HVM guest creation with more than 32 VCPUs on ia64 platform

Hi,
this is the patch for BZ #625368 to disallow creation of an HVM
guest with more than 32 VCPUs on the ia64 platform. In addition, a
method has been added to determine directly from the configuration
whether a guest is HVM or PV, and the check for read-only IDE devices
has been altered to use this function for HVM detection instead of
detecting it on its own. The check for whether a guest exceeds the
allowed number of VCPUs for the platform is implemented as a method
that accepts a mask argument, so a developer can specify the guest
type (pv or hvm) and platform (i386, x86_64, ia64) in a simple string
such as "hvm-ia64" to cap the number of VCPUs for HVM ia64 guests.
The keyword 'all' is also accepted for both the platform and the
domain (guest) type, so the mask 'all-all' matches everything; this
is equivalent to omitting the mask, since the mask argument is
optional.
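To make the mask idea concrete, here is a minimal sketch of such a check; the function name, signature, and wording are illustrative, not the actual patch code:

# Illustrative sketch only; the real patch's names and structure differ.
def check_vcpu_limit(guest_type, platform, vcpus, limit, mask='all-all'):
    """Raise an error when `vcpus` exceeds `limit` and the guest's
    type/platform matches `mask`, e.g. 'hvm-ia64' or 'all-ia64'."""
    mask_type, mask_platform = mask.split('-')
    if (mask_type in ('all', guest_type)
            and mask_platform in ('all', platform)
            and vcpus > limit):
        raise ValueError("%s domain on %s platform cannot use more than "
                         "%d VCPUs" % (guest_type.upper(), platform, limit))

# An 'hvm-ia64' mask caps only HVM guests on ia64; PV guests pass:
check_vcpu_limit('pv', 'ia64', 33, 32, 'hvm-ia64')       # no error
try:
    check_vcpu_limit('hvm', 'ia64', 33, 32, 'hvm-ia64')  # raises
except ValueError as err:
    print(err)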

This makes it easy for the user-space code to limit the number of
VCPUs for each supported platform using a simple definition.

The code has been tested on a RHEL-5 ia64 host, creating both PV
and HVM guests with more and with fewer than 32 VCPUs, and
everything works fine.

Michal

Comment 18 Michal Novotny 2011-03-11 13:30:22 UTC
Created attachment 483716 [details]
Patch to disallow *any* guest creation with more than 32 VCPUs

Hi,
this is the patch for BZ #625368 to disallow creation of *any* guest
with more than 32 VCPUs. The check for whether a guest exceeds the
allowed number of VCPUs for the platform is implemented as a method
that accepts a mask argument, so a developer can specify the guest
type (pv or hvm) and platform (i386, x86_64, ia64) in a simple string
such as "hvm-ia64" to cap the number of VCPUs for HVM ia64 guests.
The keyword 'all' is also accepted for both the platform and the
domain (guest) type, so the mask 'all-all' matches everything; this
is equivalent to omitting the mask, since the mask argument is
optional.

This makes it easy for the user-space code to limit the number of
VCPUs for each supported platform, with the limit for both vcpus
and maxvcpus set to 32 for everything.

The code has been tested on RHEL-5 x86_64 and ia64 hosts, creating
both PV and HVM guests with more and with fewer than 32 VCPUs, and
everything works fine.

Michal

Comment 19 Michal Novotny 2011-03-14 11:52:24 UTC
Created attachment 484152 [details]
Patch to move VCPU count check to user-space component

Hi,
this is the patch for BZ #625368 to disallow creation of any domain
that tries to use more than the supported/allowed number of VCPUs.
The patch limits PV ia64 domains to 64 VCPUs and every other domain
type/platform combination to 32, as those are the limits the
platforms actually enforce.
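For illustration, the resulting limits could be modeled as a small lookup table; the names here are hypothetical, not the actual patch code:

# Hypothetical model of the limits described above, not the patch itself.
VCPU_LIMITS = {('pv', 'ia64'): 64}  # PV on ia64 allows up to 64 VCPUs
DEFAULT_VCPU_LIMIT = 32             # every other type/platform combination

def max_allowed_vcpus(guest_type, platform):
    return VCPU_LIMITS.get((guest_type, platform), DEFAULT_VCPU_LIMIT)

assert max_allowed_vcpus('pv', 'ia64') == 64
assert max_allowed_vcpus('hvm', 'ia64') == 32
assert max_allowed_vcpus('pv', 'x86_64') == 32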

The code has been tested on RHEL-5 x86_64 with both fewer and more
than 32 VCPUs to confirm it works, and on ia64 against both fewer
and more than 64 VCPUs (PV) and 32 VCPUs (HVM). Everything worked
fine.

Michal

Comment 21 Miroslav Rezanina 2011-03-17 09:55:20 UTC
Fix built into xen-3.0.3-126.el5

Comment 24 Qixiang Wan 2011-03-30 09:54:21 UTC
VERIFIED with xen-3.0.3-126.el5. Creating an HVM guest with vcpus > 32 is no longer allowed after the fix is applied.

$ xm create rhel5u5-ia64fv vcpus=33
Using config file "./rhel5u5-ia64fv".
Error: HVM domain on ia64 platform cannot use more than 32 VCPUs (defined in vcpus)

Comment 25 Tomas Capek 2011-07-13 13:18:40 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
On the Itanium platform, it was possible to create a HVM (Hardware Virtual Machine) guest with more than 32 virtual CPUs (VCPUs) without any warning messages. Consequently, the "xm list" command reported only one VCPU on that machine. With this update, a patch has been provided to disallow HVM guest creation with more than 32 VCPUs on the Itanium platform, as this is not supported, thus preventing this bug.

Comment 26 errata-xmlrpc 2011-07-21 09:15:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1070.html

Comment 27 errata-xmlrpc 2011-07-21 11:58:26 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1070.html

