Bug 509122 - RFE: RHEL 5.5 Resource monitor - monitor Host from Guest
Summary: RFE: RHEL 5.5 Resource monitor - monitor Host from Guest
Keywords:
Status: CLOSED DUPLICATE of bug 514579
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
: 5.5
Assignee: Richard W.M. Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 241231
Blocks:
Reported: 2009-07-01 13:15 UTC by Hannes Kuehnemund
Modified: 2009-12-15 04:15 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 241231
Environment:
Last Closed: 2009-12-15 04:15:42 UTC
Target Upstream Version:
Embargoed:


Description Hannes Kuehnemund 2009-07-01 13:15:59 UTC
+++ This bug was initially created as a clone of Bug #241231 +++

At SAP we want to push XEN/KVM.
One important requirement for running SAP virtualized is to be able to monitor
the real resource consumption/allocation.
 
Linux on Power, for example, provides a nice proc interface to get all the
information we need.
Can XEN/KVM provide something similar as well?
One important point is that it should be possible to get this kind of
information from _inside_ a VM (e.g. as root) without the need to provide
additional authentication.
 
ls3679:~ # cat /proc/ppc64/lparcfg
lparcfg 1.6
serial_number=IBM,021004C1A
system_type=IBM,9124-720
partition_id=2
R4=0x96
R5=0x0
R6=0x80020000
R7=0x400000040004
BoundThrds=1
CapInc=1
DisWheRotPer=2070000
MinEntCap=100
MinEntCapPerVP=10
MinMem=11520
MinProcs=2
partition_max_entitled_capacity=250
system_potential_processors=4
DesEntCap=150
DesMem=11520
DesProcs=4
DesVarCapWt=64
 
partition_entitled_capacity=150
group=32770
system_active_processors=4
pool=0
pool_capacity=400
pool_idle_time=1313857586998491
pool_num_procs=4
unallocated_capacity_weight=0
capacity_weight=64
capped=0
unallocated_capacity=0
purr=648228675343980
partition_active_processors=4
partition_potential_processors=4
shared_processor_mode=1

 
The values above are currently parsed by a small SAP tool (SAPOSCOL, a small
program which runs with root privileges) and then displayed as in the
following picture.
(The important parts are the values displayed under "virtual system" and "CPU".)
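
For illustration only, the kind of parsing described above could look like the following minimal sketch. It is not SAPOSCOL's actual code; the function name `parse_lparcfg` and the shortened sample text are assumptions, and the ratio printed at the end is purely illustrative.

```python
def parse_lparcfg(text):
    """Parse key=value lines into a dict, converting numeric values to int."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().partition("=")
        if not sep:
            continue  # skip lines without "=", e.g. the "lparcfg 1.6" banner
        try:
            # Base 0 accepts plain decimal and 0x-prefixed hexadecimal.
            values[key] = int(raw, 0)
        except ValueError:
            # Keep non-numeric values such as "IBM,9124-720" as strings.
            values[key] = raw
    return values


# Shortened sample in the same format as /proc/ppc64/lparcfg.
SAMPLE = """lparcfg 1.6
system_type=IBM,9124-720
partition_id=2
R4=0x96
partition_entitled_capacity=150
pool_capacity=400
capped=0
"""

info = parse_lparcfg(SAMPLE)
# Illustrative ratio only: entitled capacity relative to pool capacity.
entitled_share = info["partition_entitled_capacity"] / info["pool_capacity"]
print(info["system_type"], info["partition_id"], entitled_share)
```

On a real Linux on Power system the text would come from reading /proc/ppc64/lparcfg instead of the embedded sample.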

--- Additional comment from rdoty on 2007-05-24 11:05:44 EDT ---

Created an attachment (id=155353)
Screen capture showing desired information - example of potential user interface


--- Additional comment from helge.deller on 2007-05-31 10:14:34 EDT ---

See xen-devel mailing list for some more info as well: 
http://lists.xensource.com/archives/html/xen-devel/2007-05/msg00908.html

--- Additional comment from rdoty on 2007-07-26 10:17:22 EDT ---

Too late for RHEL 5.1. Moved to RHEL 5.2 and elevated priority.

--- Additional comment from riek on 2007-09-17 13:41:30 EDT ---

Adjusting priority due to our new priority inclusion criteria as outlined in
http://intranet.corp.redhat.com/ic/intranet/RHELInclusionCriteria.html



--- Additional comment from rjones on 2007-11-14 10:04:20 EDT ---

Libvirt already offers a way for domU's to securely access the
domain information from dom0 / the hypervisor, using the remote
support (http://libvirt.org/remote.html).  Remote support went
in to libvirt >= 0.3.0 in RHEL 5.1.  If you would like help using
or developing with libvirt + remote support, please contact me.

In addition libvirt covers the type of information which is
apparently needed, such as # CPUs, CPU pinning, hostname, amount
of real RAM and so forth.
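
As a rough sketch of what such a query might look like with the libvirt Python bindings (the URI below is a placeholder and the helper names `summarize_host` and `main` are made up for this example; the remote transport must be enabled on the host for it to work):

```python
def summarize_host(conn):
    """Collect a few host facts from an open libvirt connection object."""
    # virConnect.getInfo() returns
    # [model, memory_mb, cpus, mhz, numa_nodes, sockets, cores, threads].
    model, memory_mb, cpus, mhz, nodes, sockets, cores, threads = conn.getInfo()
    return {
        "hostname": conn.getHostname(),
        "cpu_model": model,
        "memory_mb": memory_mb,
        "cpus": cpus,
        "cpu_mhz": mhz,
    }


def main(uri="xen+tcp://dom0.example.com/"):  # placeholder URI (assumption)
    # Imported here so summarize_host() can be exercised without libvirt installed.
    import libvirt

    conn = libvirt.openReadOnly(uri)
    try:
        for key, value in summarize_host(conn).items():
            print(f"{key}={value}")
    finally:
        conn.close()
```

Calling `main()` from inside a guest would open a read-only connection to the host's libvirtd and print the values one per line.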

So I'm closing this as WONTFIX.

--- Additional comment from helge.deller on 2007-11-14 14:01:22 EDT ---

Hi Richard,

Thanks a lot for the feedback and offer to help us with libvirt.
I just have a few comments/questions before that. It would be great if you could 
comment on them; of course you may still close it WONTFIX afterwards.

Since we just want read access to "uncritical" values, I personally 
think secured access with libvirt is quite a lot of overhead.

Let's take a hosting service provider. He runs one physical server, and gives 
different customers a virtual machine each. Of course one customer should not 
see configurations of the other customers. 
Now one customer runs SAP software in this specific VM, and SAP support 
organization is responsible for this customer in this specific VM.
SAP service & support organizations need to monitor basic physical 
configurations of this VM to be able to find bottlenecks. E.g. did the amount 
of physical memory for this VM change in the background?

So, SAP running in this VM needs some read-only values. On Linux/ppc64 we can 
easily read this from the /proc file. It's fast, clean and easy. On XEN, with 
libvirt, the hosting provider would need to give a (different?) authentication 
for libvirt to each customer. Furthermore, since we ask for values quite often 
(e.g. 20 times per second), I'm not sure whether we would run into performance 
problems with libvirt, esp. if many VMs run on a single box.
Do you have any performance tests or use cases for this?

Best regards,
Helge

--- Additional comment from hannes.kuehnemund on 2007-11-16 12:23:46 EDT ---

Just as additional information: the DMTF (http://www.dmtf.org) published a CIM
standard for extended monitoring of virtual environments. We, as SAP, implemented
this standard in our monitoring tool. The information metrics of this CIM
standard are (the values shown are arbitrary):

ResourceMemoryReservation : 18000
ResourceMemoryLimit : 18000
ResourceProcessorReservation : 0
ResourceProcessorLimit : 0
Host_NumberOfPhysicalCPUsUtilized : 1.00
Host_TotalCPUTime : 1
Host_HypervisorManagementTime : 1
Host_MemoryAllocatedToVirtualServers : 1536
Host_HyperVisorMemory : 34
Host_SharedMemory : 0
Host_PagedOutMemory : 0
Host_PageInRate : 0
Host_FreeVirtualMemory : 65117
Host_UsedVirtualMemory : 2432
Host_FreePhysicalMemory : 44638
VM_NumberOfPhysicalCPUsUtilized : 0
VM_TotalCPUTime : 0
VM_StealTime : 0
VM_ActiveVirtualProcessors : 0
VM_PhysicalMemoryAllocatedToVirtualSystem : 17388
VM_WorkingSetSize : 0
VM_SharedMemory : 0
VM_TargetMemorySize : 0
VM_PageInRate : 0
VM_HostMemoryPercentage : 36.94 

The way this information is provided is important for us. Our monitoring
tools read the values from the zOS and Linux PPC64 /proc interface. Thus we
think that IBM's approach (using the /proc interface) is the one to go for.

If you need the DMTF CIM document, please let me know.

--- Additional comment from berrange on 2007-11-16 14:10:04 EDT ---

We are aware of the CIM standards. There was a recently announced project which
implements a generic CIM provider on top of the base libvirt library API which
is in an early, but active stage of development. Thus any stats capabilities in
libvirt can be exposed as CIM metrics.

--- Additional comment from veillard on 2007-11-19 07:23:14 EDT ---

For the record, Libvirt CIM project is at http://libvirt.org/CIM/
and mostly worked on by IBM staff at the moment,

Daniel

--- Additional comment from rjones on 2007-11-19 08:18:43 EDT ---

There are several different issues which Helge raised in comment 7
which I want to talk about separately.

(1) Secure access / libvirt protocol overhead / speed

You don't have to use SSL.  libvirt also supports unencrypted TCP
connections (although you have to turn them on in a configuration
file: http://libvirt.org/remote.html#Remote_libvirtd_configuration)

The libvirt protocol uses XDR and in our testing this was much more
efficient than alternatives such as XML-RPC.
http://en.wikipedia.org/wiki/External_Data_Representation

Nevertheless we haven't particularly tested how well the remote
protocol scales when you have lots of requests, from lots of
clients.  It's possible, likely even, that the libvirtd server
has performance problems, and if you find them then please
raise them as bugs (in Bugzilla and/or upstream) and they will
get fixed.

Our use of XDR also allows (but does not yet implement) pipelining
of requests and the possibility of sending multiple asynchronous
updates.

(2) Authentication & separate access for domains.

Helge wrote: "Let's take a hosting service provider. He runs
one physical server, and gives different customers a virtual
machine each. Of course one customer should not see
configurations of the other customers."

This has been discussed on upstream libvir-list a bit, and
the current state of thinking is summed up in the second
paragraph in this email:
https://www.redhat.com/archives/libvir-list/2007-August/msg00030.html
Which is to say, we are not very far along understanding the
best way to implement this, and there are several possible
approaches, all with significant downsides.

At the moment I have been recommending to ISPs that they put
a wrapper around libvirt and expose a subset of the API which
(eg in your case) would only allow a customer to see their own
domain.  You could even patch libvirt / libvirtd to do this, if
you are willing to carry such a patch.
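
A minimal sketch of the wrapper idea, assuming the libvirt Python bindings; the class name `SingleDomainView` and the returned key names are invented for this example. Note that a Python class cannot enforce isolation by itself: the wrapper would have to run as a service on the host side, with only its restricted interface reachable from the guest.

```python
class SingleDomainView:
    """Read-only facade exposing one named domain from a libvirt connection."""

    def __init__(self, conn, domain_name):
        self._conn = conn
        self._domain_name = domain_name  # fixed at construction, never caller-supplied

    def stats(self):
        """Return basic statistics for the single permitted domain."""
        dom = self._conn.lookupByName(self._domain_name)
        # virDomain.info() returns [state, max_mem_kb, mem_kb, vcpus, cpu_time_ns].
        state, max_mem_kb, mem_kb, vcpus, cpu_time_ns = dom.info()
        return {
            "state": state,
            "max_mem_kb": max_mem_kb,
            "mem_kb": mem_kb,
            "vcpus": vcpus,
            "cpu_time_s": cpu_time_ns / 1e9,
        }
```

Because the domain name is bound when the view is created, a customer-facing service built on it has no code path that looks up or lists any other domain.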

In any case, please join libvir-list and open a new discussion
on this issue so that whatever solution we do choose in the end
will suit you.

(3) Reply to comment 8 (CIM).

You'll want to join upstream libvirt-cim list too, to make sure
that IBM are implementing all those stats counters you need.
Some of them will require additional support from libvirt.

 - - -

I'm still going to close this as WONTFIX because I just don't
think there is any chance of getting a /proc-like interface into
RHEL 5.2.

Rich.

--- Additional comment from rjones on 2008-01-17 06:57:12 EDT ---

Reopening for possible consideration in RHEL 5.3 timeframe.

--- Additional comment from rjones on 2008-01-17 07:05:00 EDT ---

Summary of discussions with SAP:

(1) They would like to monitor aspects of the guest, from within the guest.

(2) For list of things to monitor, please see document in comment 12.

(3) Guest may be running on a physical host shared with guests from other
customers, so guest should not be able to see status of other guests.

(4) May not be acceptable for the ISP running the physical host to install any
software on the dom0 (eg. libvirtd).

(5) Transport over (something like) XenBus may be preferable to transport over a
network connection.

(6) SAP have discussed this with XenSource, who recommended using xenstore.

(7) Libvirt security model isn't compatible with requirement (3).

(8) Typical request frequency - per second or per minute.

--- Additional comment from rpacheco on 2008-01-21 12:08:12 EDT ---

Moving to RHEL 5.3 per comment #13.

--- Additional comment from rdoty on 2008-04-11 11:19:24 EDT ---

Moved to SAP RHEL 5.3 tracker.

--- Additional comment from andriusb on 2008-09-10 15:20:05 EDT ---

Greetings - This bugzilla is being deferred to RHEL 5.4 due to upstream and/or design-specific concerns. Deferring to rdoty if this needs to be cloned as a RHEL 6 feature.

--- Additional comment from rdoty on 2008-09-12 12:39:20 EDT ---

Moved to RHEL 6. Changed component to "distribution" as there is no generic virtualization component; we will want this capability to be hypervisor agnostic, not Xen specific. This will need to be implemented upstream first, and then evaluated for backport to RHEL 5; that would need to be a separate request.

This request was discussed in a phone call on 9/12/08. SAP provided more information and a use case.

The basic requirement is to be able to get - read only - certain information about the physical host from a guest. SAP often has to troubleshoot customer performance problems. Their experience is that performance problems in guests may be due to changes in the physical host. 

Their monitoring utility records performance related information over time, and can be used for trend analysis.

--- Additional comment from rdoty on 2008-09-12 13:22:05 EDT ---

Updated title (should have done this when moving BZ to RHEL 6.0)

--- Additional comment from rjones on 2008-10-10 11:34:15 EDT ---

This isn't a libvirt bug, but anyway, assigning it back to me.

--- Additional comment from rdoty on 2008-10-23 11:23:52 EDT ---

SAP is emphasizing that they won't be able to certify SAP on Ovirt without the ability to do at least some of the monitoring.

This is a major need for them. Although the original request is against Xen, this is a general Red Hat Virtualization requirement. SAP indicated that they aren't happy with the Xen upstream implementation.

--- Additional comment from hannes.kuehnemund on 2008-11-21 03:22:59 EDT ---

As agreed in the SAP/Red Hat call, here is the content of the email sent out to all virtualization partners of SAP:

---------------------------------------------------------
As mentioned in the "LinuxLab Virtualization Certification Workshop 2009" preparation meetings, we urgently need to resolve all outstanding monitoring issues in virtualized environments.

Why is SAP requesting monitoring information:
There have been several customer escalations in the past, where SAP was running on virtualized hardware without having a chance to know the physical environment. To ensure optimal support for our joint customers on virtual machines, we need to have access to the most important virtualization information.

Which kind of "monitoring information" does SAP request:
We have assembled a list of data which must be available (read-only is sufficient) to the virtual machine.
At minimum, the following information - if applicable to your virtualization technology - needs to be accessible directly from inside a virtual environment:

Host related metrics:

    * Number of physical CPUs utilized
    * Total CPU time (s)
    * Paged out memory (KB/s)
    * Page in rate (KB/s)
    * Free physical memory (KB)
    * Disk and network traffic (KB/s) and status (shared/exclusive)
    * Time spent in hypervisor (s)

VM related metrics:

    * Number of physical CPUs configured
    * Number of physical CPUs utilized
    * CPU time spent for this virtual machine (s)
    * Minimum physical memory available/configured (KB)
    * Maximum physical memory available/configured (KB)

How does SAP collect this information:
SAP uses a small C-program (saposcol) which acts as a collector for this information.
You are invited to work together with us to implement an interface to your virtualization technology to retrieve this information.

Our preferred ways are either

    * that you provide a shared library/DLL which can be utilized by saposcol,
    * or provide the information in a text file in the /proc or /sys filesystem (on Linux).

In case you prefer a shared library, it should not connect via network to the hypervisor, as this introduces high latencies and needs to be runtime-configurable.
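
As a hedged illustration of the text-file option, the metrics listed above could be modelled and rendered as key=value lines roughly as follows. All field names, the `render` helper, and the prefix convention are assumptions made for this sketch, not an agreed format; disk and network traffic are omitted for brevity.

```python
from dataclasses import dataclass, fields


@dataclass
class HostMetrics:
    physical_cpus_utilized: float
    total_cpu_time_s: float
    paged_out_memory_kb_s: float
    page_in_rate_kb_s: float
    free_physical_memory_kb: int
    hypervisor_time_s: float


@dataclass
class VMMetrics:
    physical_cpus_configured: int
    physical_cpus_utilized: float
    cpu_time_s: float
    min_physical_memory_kb: int
    max_physical_memory_kb: int


def render(prefix, metrics):
    """Render a metrics object as one 'prefix_name=value' line per field."""
    return "".join(
        f"{prefix}_{f.name}={getattr(metrics, f.name)}\n" for f in fields(metrics)
    )


# Arbitrary example values.
vm_text = render("vm", VMMetrics(2, 1.5, 10.0, 1024, 2048))
print(vm_text, end="")
```

A flat key=value layout like this can be read by a collector with a single file read and no network round trip.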

When does SAP need this interface:
As soon as possible.
It is not a requirement that this interface be available when the virtualization workshop starts, but it is a requirement for productive usage of the virtualization technology with SAP.

We cannot certify any virtualization technology which does not provide this information.

Does this requirement affect already available/certified virtualization technologies (e.g. RHEL5 Xen)?

The already certified technologies are not affected in their current versions, but we would like them to provide the information as soon as possible as well.

New versions of those technologies (e.g. all new Red Hat Virtualization technologies which should be used with SAP) are required to provide the information prior to certification by SAP.

How to proceed?
We suggest that you get in contact with us. Let's take a look together at your virtualization technology and work out a schedule for when you can provide the interface.

--- Additional comment from rdoty on 2009-01-08 11:36:01 EDT ---

Created an attachment (id=328475)
SAP Virtual Server Monitoring document, V1.1

--- Additional comment from mwaite on 2009-03-20 10:51:51 EDT ---

Hello Russ, Richard and sly and all.
I see that it is targeted for RHEL6 at this point. 
Is Andy C. familiar with this bugzilla?
Who has been involved from a business impact perspective so far?

--- Additional comment from rjones on 2009-06-12 09:59:03 EDT ---

This is the proposed feature for Fedora 12 / RHEL 6.

https://fedoraproject.org/wiki/Features/Hostinfo

Please read it, and if you have any comments, please
add them here (or email me directly if you wish).

The list of host information hasn't been fully thrashed
out yet, but it will include at least all of the items
that SAP have requested.

--- Additional comment from hannes.kuehnemund on 2009-06-15 08:52:22 EDT ---

Thanks a lot for the update! I've read the page and I have the impression that, in its current state, it looks very good. Decoupling from the underlying hypervisor makes perfect sense, as does the ease of use (if I understood correctly that nothing special has to be implemented inside the VM).

SAP is looking forward to see the first prototype. We would be pleased to test such packages at any point in time to ensure that all requirements are met.

Thanks

Comment 2 Subhendu Ghosh 2009-12-15 04:15:42 UTC

*** This bug has been marked as a duplicate of bug 514579 ***

