+++ This bug was initially created as a clone of Bug #241231 +++

At SAP we want to push XEN/KVM. One important requirement for running SAP virtualized is to be able to monitor the real resource consumption/allocation. Linux on Power, for example, provides a nice proc interface to get all the information we need. Can XEN/KVM provide something similar as well? One important thing is that it should be possible to get this kind of information from _inside_ a VM (e.g. as root) without the need to provide additional authentication.

ls3679:~ # cat /proc/ppc64/lparcfg
lparcfg 1.6
serial_number=IBM,021004C1A
system_type=IBM,9124-720
partition_id=2
R4=0x96
R5=0x0
R6=0x80020000
R7=0x400000040004
BoundThrds=1
CapInc=1
DisWheRotPer=2070000
MinEntCap=100
MinEntCapPerVP=10
MinMem=11520
MinProcs=2
partition_max_entitled_capacity=250
system_potential_processors=4
DesEntCap=150
DesMem=11520
DesProcs=4
DesVarCapWt=64
partition_entitled_capacity=150
group=32770
system_active_processors=4
pool=0
pool_capacity=400
pool_idle_time=1313857586998491
pool_num_procs=4
unallocated_capacity_weight=0
capacity_weight=64
capped=0
unallocated_capacity=0
purr=648228675343980
partition_active_processors=4
partition_potential_processors=4
shared_processor_mode=1

The values above are currently parsed by a small SAP tool (SAPOSCOL, a small program which runs with root privileges), and then displayed as in the following picture. (The important parts are the values displayed under "virtual system" and "CPU".)

--- Additional comment from rdoty on 2007-05-24 11:05:44 EDT ---

Created an attachment (id=155353)
Screen capture showing desired information - example of potential user interface

--- Additional comment from helge.deller on 2007-05-31 10:14:34 EDT ---

See xen-devel mailing list for some more info as well:
http://lists.xensource.com/archives/html/xen-devel/2007-05/msg00908.html

--- Additional comment from rdoty on 2007-07-26 10:17:22 EDT ---

Too late for RHEL 5.1.
Moved to RHEL 5.2 and elevated priority.

--- Additional comment from riek on 2007-09-17 13:41:30 EDT ---

Adjusting priority due to our new priority inclusion criteria as outlined in
http://intranet.corp.redhat.com/ic/intranet/RHELInclusionCriteria.html

--- Additional comment from rjones on 2007-11-14 10:04:20 EDT ---

Libvirt already offers a way for domUs to securely access the domain information from dom0 / the hypervisor, using the remote support (http://libvirt.org/remote.html). Remote support went into libvirt >= 0.3.0 in RHEL 5.1. If you would like help using or developing with libvirt + remote support, please contact me.

In addition, libvirt covers the type of information which is apparently needed, such as number of CPUs, CPU pinning, hostname, amount of real RAM and so forth. So I'm closing this as WONTFIX.

--- Additional comment from helge.deller on 2007-11-14 14:01:22 EDT ---

Hi Richard,

Thanks a lot for the feedback and the offer to help us with libvirt. I just have a few comments/questions before that. It would be great if you could comment on them, and of course you may still close it as WONTFIX afterwards.

Since we just want read access to "uncritical" values, I personally think secured access with libvirt is quite a lot of overhead.

Let's take a hosting service provider. He runs one physical server, and gives different customers a virtual machine each. Of course one customer should not see the configurations of the other customers. Now one customer runs SAP software in this specific VM, and the SAP support organization is responsible for this customer in this specific VM. SAP service & support organizations need to monitor the basic physical configuration of this VM to be able to find bottlenecks. E.g. did the amount of physical memory for this VM change in the background? So, SAP running in this VM needs some read-only values. On Linux/ppc64 we can easily read this from the /proc file. It's fast, clean and easy.
On XEN, with libvirt, the hosting provider would need to give a (different?) authentication for libvirt to each customer. Furthermore, since we ask for values quite often (e.g. 20 times per second), I'm not sure whether we would run into performance problems with libvirt, especially if many VMs run on a single box. Do you have any performance tests or use cases on this?

Best regards,
Helge

--- Additional comment from hannes.kuehnemund on 2007-11-16 12:23:46 EDT ---

Just as additional information: the DMTF (http://www.dmtf.org) published a CIM standard for extended monitoring of virtual environments. We, as SAP, implemented this standard in our monitoring tool. The information metrics of this CIM standard are (using random values, never mind them):

ResourceMemoryReservation : 18000
ResourceMemoryLimit : 18000
ResourceProcessorReservation : 0
ResourceProcessorLimit : 0
Host_NumberOfPhysicalCPUsUtilized : 1.00
Host_TotalCPUTime : 1
Host_HypervisorManagementTime : 1
Host_MemoryAllocatedToVirtualServers : 1536
Host_HyperVisorMemory : 34
Host_SharedMemory : 0
Host_PagedOutMemory : 0
Host_PageInRate : 0
Host_FreeVirtualMemory : 65117
Host_UsedVirtualMemory : 2432
Host_FreePhysicalMemory : 44638
VM_NumberOfPhysicalCPUsUtilized : 0
VM_TotalCPUTime : 0
VM_StealTime : 0
VM_ActiveVirtualProcessors : 0
VM_PhysicalMemoryAllocatedToVirtualSystem : 17388
VM_WorkingSetSize : 0
VM_SharedMemory : 0
VM_TargetMemorySize : 0
VM_PageInRate : 0
VM_HostMemoryPercentage : 36.94

The way this information is provided is important for us. Our monitoring tools read the values from the zOS and Linux PPC64 /proc interface. Thus we think that IBM's approach (using the /proc interface) is the one to go for. If you need the DMTF CIM document, please let me know.

--- Additional comment from berrange on 2007-11-16 14:10:04 EDT ---

We are aware of the CIM standards.
There was a recently announced project which implements a generic CIM provider on top of the base libvirt library API, and which is in an early but active stage of development. Thus any stats capabilities in libvirt can be exposed as CIM metrics.

--- Additional comment from veillard on 2007-11-19 07:23:14 EDT ---

For the record, the Libvirt CIM project is at http://libvirt.org/CIM/ and is mostly worked on by IBM staff at the moment,

Daniel

--- Additional comment from rjones on 2007-11-19 08:18:43 EDT ---

There are several different issues which Helge raised in comment 7 which I want to talk about separately.

(1) Secure access / libvirt protocol overhead / speed

You don't have to use SSL. libvirt also supports unencrypted TCP connections (although you have to turn them on in a configuration file: http://libvirt.org/remote.html#Remote_libvirtd_configuration).

The libvirt protocol uses XDR and in our testing this was much more efficient than alternatives such as XML-RPC.
http://en.wikipedia.org/wiki/External_Data_Representation

Nevertheless we haven't particularly tested how well the remote protocol scales when you have lots of requests from lots of clients. It's possible, likely even, that the libvirtd server has performance problems, and if you find them then please raise them as bugs (in Bugzilla and/or upstream) and they will get fixed.

Our use of XDR also allows (but does not yet implement) pipelining of requests and the possibility of sending multiple asynchronous updates.

(2) Authentication & separate access for domains.

Helge wrote: "Let's take a hosting service provider. He runs one physical server, and gives different customers a virtual machine each. Of course one customer should not see configurations of the other customers."
This has been discussed on upstream libvir-list a bit, and the current state of thinking is summed up in the second paragraph of this email:
https://www.redhat.com/archives/libvir-list/2007-August/msg00030.html

Which is to say, we are not very far along in understanding the best way to implement this, and there are several possible approaches, all with significant downsides.

At the moment I have been recommending to ISPs that they put a wrapper around libvirt and expose a subset of the API which (e.g. in your case) would only allow a customer to see their own domain. You could even patch libvirt / libvirtd to do this, if you are willing to carry such a patch.

In any case, please join libvir-list and open a new discussion on this issue so that whatever solution we do choose in the end will suit you.

(3) Reply to comment 8 (CIM).

You'll want to join the upstream libvirt-cim list too, to make sure that IBM are implementing all those stats counters you need. Some of them will require additional support from libvirt.

- - -

I'm still going to close this as WONTFIX because I just don't think there is any chance of getting a /proc-like interface into RHEL 5.2.

Rich.

--- Additional comment from rjones on 2008-01-17 06:57:12 EDT ---

Reopening for possible consideration in RHEL 5.3 timeframe.

--- Additional comment from rjones on 2008-01-17 07:05:00 EDT ---

Summary of discussions with SAP:

(1) They would like to monitor aspects of the guest, from within the guest.
(2) For the list of things to monitor, please see the document in comment 12.
(3) Guest may be running on a physical host shared with guests from other customers, so guest should not be able to see status of other guests.
(4) May not be acceptable for the ISP running the physical host to install any software on the dom0 (e.g. libvirtd).
(5) Transport over (something like) XenBus may be preferable to transport over a network connection.
(6) SAP have discussed this with XenSource, who recommended using xenstore.
(7) Libvirt security model isn't compatible with requirement (3).
(8) Typical request frequency - per second or per minute.

--- Additional comment from rpacheco on 2008-01-21 12:08:12 EDT ---

Moving to RHEL 5.3 per comment #13.

--- Additional comment from rdoty on 2008-04-11 11:19:24 EDT ---

Moved to SAP RHEL 5.3 tracker.

--- Additional comment from andriusb on 2008-09-10 15:20:05 EDT ---

Greetings - This bugzilla is being deferred to RHEL 5.4 due to upstream and/or design-specific concerns. Deferring to rdoty if this needs to be cloned as a RHEL 6 feature.

--- Additional comment from rdoty on 2008-09-12 12:39:20 EDT ---

Moved to RHEL 6. Changed component to "distribution" as there is no generic virtualization component; we will want this capability to be hypervisor agnostic, not Xen specific.

This will need to be implemented upstream first, and then evaluated for backport to RHEL 5; that would need to be a separate request.

This request was discussed in a phone call on 9/12/08. SAP provided more information and a use case. The basic requirement is to be able to get - read only - certain information about the physical host from a guest.

SAP often has to troubleshoot customer performance problems. Their experience is that performance problems in guests may be due to changes in the physical host. Their monitoring utility records performance-related information over time, and can be used for trend analysis.

--- Additional comment from rdoty on 2008-09-12 13:22:05 EDT ---

Updated title (should have done this when moving BZ to RHEL 6.0)

--- Additional comment from rjones on 2008-10-10 11:34:15 EDT ---

This isn't a libvirt bug, but anyway, assigning it back to me.

--- Additional comment from rdoty on 2008-10-23 11:23:52 EDT ---

SAP is emphasizing that they won't be able to certify SAP on Ovirt without the ability to do at least some of the monitoring. This is a major need for them.
Although the original request is against Xen, this is a general Red Hat Virtualization requirement.

SAP indicated that they aren't happy with the Xen upstream implementation.

--- Additional comment from hannes.kuehnemund on 2008-11-21 03:22:59 EDT ---

As agreed in the SAP/Red Hat call, here is the content of the email sent out to all virtualization partners of SAP:

---------------------------------------------------------

As mentioned in the "LinuxLab Virtualization Certification Workshop 2009" preparation meetings, we urgently need to resolve all outstanding monitoring issues in virtualized environments.

Why is SAP requesting monitoring information:
There have been several customer escalations in the past, where SAP was running on virtualized hardware without having a chance to know the physical environment. To ensure optimal support for our joint customers on virtual machines, we need to have access to the most important virtualization information.

Which kind of "monitoring information" does SAP request:
We have assembled a list of data which must be available (read-only is sufficient) to the virtual machine. At minimum, the following information - if applicable to your virtualization technology - needs to be accessible directly from inside a virtual environment:

Host related metrics:
* Number of physical CPUs utilized
* Total CPU time (s)
* Paged out memory (KB/s)
* Page in rate (KB/s)
* Free physical memory (KB)
* Disk and network traffic (KB/s) and status (shared/exclusive)
* Time spent in Hypervisor (s)

VM related metrics:
* Number of physical CPUs configured
* Number of physical CPUs utilized
* CPU time spent for this virtual machine (s)
* Minimum physical memory available/configured (KB)
* Maximum physical memory available/configured (KB)

How does SAP collect this information:
SAP uses a small C-program (saposcol) which acts as a collector for this information.
You are invited to work together with us to implement an interface to your virtualization technology to retrieve this information. Our preferred ways are either
* that you provide a shared library/DLL which can be utilized by saposcol,
* or that you provide the information in a text file in the /proc or /sys filesystem (on Linux).
In case you prefer a shared library, it should not connect via network to the hypervisor, as this introduces high latencies and needs to be runtime-configurable.

When does SAP need this interface:
As soon as possible. It's not a requirement that this interface is available when the virtualization workshop starts, but it is a requirement for productive usage of the virtualization technology with SAP. We cannot certify any virtualization technology which does not provide this information.

Does this requirement affect already available/certified virtualization technologies (e.g. RHEL5 Xen)?
The already certified technologies are not affected in their current versions, but we would like to have them providing the information as soon as possible as well. New versions of those technologies (e.g. all new Red Hat Virtualization technologies which should be used with SAP) are required to provide the information prior to certification by SAP.

How to proceed?
We suggest that you get in contact with us. Let's take a look together at your virtualization technology and work on a schedule for when you can provide the interface.

--- Additional comment from rdoty on 2009-01-08 11:36:01 EDT ---

Created an attachment (id=328475)
SAP Virtual Server Monitoring document, V1.1

--- Additional comment from mwaite on 2009-03-20 10:51:51 EDT ---

Hello Russ, Richard and sly and all. I see that it is targeted for RHEL6 at this point. Is Andy C. familiar with this bugzilla? Who has been involved from a business impact perspective so far?

--- Additional comment from rjones on 2009-06-12 09:59:03 EDT ---

This is the proposed feature for Fedora 12 / RHEL 6.
https://fedoraproject.org/wiki/Features/Hostinfo

Please read it, and if you have any comments, please add them here (or email me directly if you wish). The list of host information hasn't been fully thrashed out yet, but it will include at least all of the items that SAP have requested.

--- Additional comment from hannes.kuehnemund on 2009-06-15 08:52:22 EDT ---

Thanks a lot for the update! I've read the page and my impression is that, in its current state, it looks very good. Decoupling from the underlying hypervisor makes perfect sense, as does the ease-of-use factor (if I understood correctly that nothing special has to be implemented inside the VM).

SAP is looking forward to seeing the first prototype. We would be pleased to test such packages at any point in time to ensure that all requirements are met.

Thanks
*** This bug has been marked as a duplicate of bug 514579 ***
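
For illustration only: a minimal Python sketch of how a collector might read a key=value text file in the style of the /proc/ppc64/lparcfg output quoted in the original report. The function names (parse_lparcfg, to_int, read_lparcfg) are hypothetical and are not taken from saposcol, which this thread describes as a C program. The sample text is a subset of the values quoted above. Nothing in this thread establishes that an equivalent file exists for Xen or KVM guests.

```python
# Sketch only: parse "key=value" lines as seen in the lparcfg dump above.
# All names here are hypothetical; this is not SAP's or Red Hat's code.

SAMPLE = """\
lparcfg 1.6
serial_number=IBM,021004C1A
partition_id=2
R4=0x96
partition_entitled_capacity=150
pool_capacity=400
capped=0
"""


def parse_lparcfg(text):
    """Return a dict of raw string values keyed by field name."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            # Skips blank lines and the "lparcfg 1.6" version header.
            continue
        key, _, raw = line.partition("=")
        values[key] = raw
    return values


def to_int(raw):
    """Convert decimal or 0x-prefixed text to int; None if not numeric."""
    try:
        return int(raw, 0)  # base 0 accepts "150" and "0x96"
    except ValueError:
        return None


def read_lparcfg(path="/proc/ppc64/lparcfg"):
    """Read and parse the file; the path only exists on Linux on Power."""
    with open(path) as handle:
        return parse_lparcfg(handle.read())


if __name__ == "__main__":
    cfg = parse_lparcfg(SAMPLE)
    print(cfg["partition_id"], to_int(cfg["partition_entitled_capacity"]))
```

Values are kept as strings and converted on demand because some fields in the dump, such as serial_number, are not numeric.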