Bug 1112684
| Summary: | [scale] Start VM on host is slow (max = 12 min) under load of 400 running vms | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Yuri Obshansky <yobshans> |
| Component: | ovirt-engine-restapi | Assignee: | Liran Zelkha <lzelkha> |
| Status: | CLOSED CANTFIX | QA Contact: | Yuri Obshansky <yobshans> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.4.0 | CC: | bazulay, ecohen, gklein, iheim, lpeer, michal.skrivanek, nsoffer, oourfali, oramraz, rbalakri, Rhev-m-bugs, yeylon, yobshans |
| Target Milestone: | --- | | |
| Target Release: | 3.6.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | infra | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-11-04 07:50:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Yuri Obshansky
2014-06-24 13:26:52 UTC
Created attachment 911737 [details]
RHEVM-3.4-UserPortal-Performance-Test-400TH-2014-06-17 report
Created attachment 911738 [details]
JMeter script
Created attachment 911974 [details]
Host 24 vdsm log
Created attachment 911975 [details]
Host 25 vdsm log
Created attachment 911977 [details]
Host 20 vdsm log
Created attachment 911978 [details]
Host 30 vdsm log
Created attachment 911979 [details]
Host 29 vdsm log
We need more info:
- vdsm cpu graph: what is 40? 40% of a single core? 40% of all cores?
- How many cores are in the host?
- What was the CPU usage of other processes? We have 400 qemu processes running, right?
- What is the memory usage of other processes on the host?
- Is the host overloaded and swapping to disk?
- The vdsm memory graph does not make sense. Where is the sample data? How was it collected?
- If I'm reading the tcp graph correctly, we seem to have about 800 connections near the end of the test. Can you explain what each color in the graph is, and how this data was collected?
- What VMs are running? Idle?
- What type of storage is used?
- How many storage domains are used?

All the information can be found in the attached "RHEVM-3.4-UserPortal-Performance-Test-400TH-2014-06-17 report". Anyway, here it is:
- vdsm cpu graph: the results are for vdsm running on the engine machine, and the CPU usage graph shows the process CPU usage percentage for all cores
- 24 x Intel(R) Xeon(R) CPU E5-2630 @ 2.00GHz, RAM 64 GB; CPU Sockets: 2, CPU Cores per Socket: 6, CPU Threads per Core: 2
- See the attached report, 1.4 Summary statistics table. I didn't check how many qemu processes were running.
- I didn't monitor hosts.
- I didn't monitor hosts.
- It was collected using the JMeter Perfmon plugin (based on the Sigar API). Monitor_MEM.csv with sample data is attached to the bug
- BLUE - CLOSE_WAIT, PINK - EST (established), RED - TIME_WAIT. Collected with the JMeter Perfmon plugin
- Physical VMs with kernel and guest-agent installed
- NFS
- 2 SDs

Created attachment 913148 [details]
Memory Usage sample data
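The "40% of a single core or of all cores?" question above matters because the two readings differ by a factor equal to the number of logical CPUs. The following is a minimal sketch of that arithmetic, using the topology reported in the answer (2 sockets, 6 cores per socket, 2 threads per core). The function names are hypothetical, and the value 40 is simply the figure quoted in the question, not a measured result.

```python
def logical_cpus(sockets, cores_per_socket, threads_per_core):
    """Number of logical CPUs the OS sees for a given topology."""
    return sockets * cores_per_socket * threads_per_core


def as_single_core_percent(percent_of_all_cpus, cpu_count):
    """Convert a percentage of total CPU capacity into single-core terms."""
    return percent_of_all_cpus * cpu_count


# Topology taken from the comment above.
cpus = logical_cpus(sockets=2, cores_per_socket=6, threads_per_core=2)

# If 40 means "40% of all cores", the process uses the equivalent of
# 960% of one core (about 9.6 logical CPUs kept fully busy).
single_core_equivalent = as_single_core_percent(40, cpus)

print(cpus, single_core_equivalent)  # prints: 24 960
```

If the graph instead meant 40% of one core, the same process would be using under half of a single logical CPU, which is why the monitoring tool's convention needs to be stated with the graph.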
Hi Yuri, how quickly are you starting the VMs? From the logs it looks like it's sequential, with a delay of 80s between each create call?

It is not a static value. Each thread performs the following calls in a cycle: different getInfo calls, shut down the VM, wait until the VM is down, start the VM, and wait until it is Up. So it is quite random; it depends on the response time of the previous calls. What is static is the thread ramp-up delay of 10 sec, i.e. each thread starts after 10 sec.

Yuri - can you provide more info?
1. Enclose jstack and logs of the engine
2. Specify how many VMs are running on each host?

1. I don't have the environment right now. I'll prepare the jstack and engine logs when the environment is ready.
2. 400 VMs were running on 6 hosts, so ~66 VMs per host.

Hi Yuri - any updates?

The environment is still not ready, since I encountered several new issues/bugs. I hope the environment will be ready by the end of the week.

Yuri - any additional info on this? Otherwise, let's close this bug.

Let's close this bug. We cannot reproduce it on rhev-m 3.5 because we detected performance degradation on the first test iteration (50 threads). https://bugzilla.redhat.com/show_bug.cgi?id=1155146
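For readers trying to reproduce the load pattern, the per-thread cycle described in the comments above can be sketched roughly as follows. This is not the attached JMeter script: the client object and its method names (`get_info`, `shutdown`, `start`, `wait_for_status`) are hypothetical placeholders for the REST calls, and the sketch assumes that the 10 sec ramp-up delay means thread i starts i * 10 sec after the first thread.

```python
import time

RAMP_UP_DELAY_SEC = 10  # static ramp-up delay mentioned in the thread


def thread_start_offsets(num_threads, delay_sec=RAMP_UP_DELAY_SEC):
    """Start offset in seconds for each thread (assumed staggered ramp-up)."""
    return [i * delay_sec for i in range(num_threads)]


def run_cycle(client, vm_id):
    """Run one iteration of the cycle; return the start-VM duration in seconds."""
    client.get_info(vm_id)                  # stands in for the getInfo calls
    client.shutdown(vm_id)
    client.wait_for_status(vm_id, "down")
    started = time.monotonic()
    client.start(vm_id)
    client.wait_for_status(vm_id, "up")     # the slow step reported in this bug
    return time.monotonic() - started


class RecordingClient:
    """Hypothetical stand-in that only records the order of calls."""

    def __init__(self):
        self.calls = []

    def get_info(self, vm_id):
        self.calls.append("get_info")

    def shutdown(self, vm_id):
        self.calls.append("shutdown")

    def start(self, vm_id):
        self.calls.append("start")

    def wait_for_status(self, vm_id, status):
        self.calls.append("wait_" + status)


client = RecordingClient()
duration = run_cycle(client, "vm-1")
print(client.calls)
```

Because each iteration waits on the previous call's response, the interval between start requests is not fixed, which matches the explanation that the apparent 80s spacing is not a static value.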