Bug 970179
| Summary: | After hypervisor upgrade creation time of virtual machines is longer than expected. |
|---|---|
| Product: | Red Hat Enterprise Virtualization Manager |
| Reporter: | Josh Carter <jocarter> |
| Component: | ovirt-engine |
| Assignee: | Nobody's working on this, feel free to take it <nobody> |
| Status: | CLOSED CURRENTRELEASE |
| QA Contact: | |
| Severity: | high |
| Docs Contact: | |
| Priority: | urgent |
| Version: | 3.1.2 |
| CC: | acathrow, amureini, dyasny, iheim, jocarter, lpeer, michal.skrivanek, mkalinin, pstehlik, Rhev-m-bugs, yeylon, ykaul |
| Target Milestone: | --- |
| Target Release: | 3.2.0 |
| Hardware: | x86_64 |
| OS: | Linux |
| Whiteboard: | storage |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2013-06-13 18:35:38 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Storage |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Attachments: | spm-vdsm-log (756543), rhevm-engine (756544), iostat output (756885), watcher script output (757251) |
Description
Josh Carter
2013-06-03 15:40:34 UTC
Created attachment 756543 [details]
spm-vdsm-log
Created attachment 756544 [details]
rhevm-engine
There seems to be some storage responsiveness issue - pending customer test results. Other than that, the VM creation seems to work OK. Moving to storage.

Created attachment 756885 [details]
iostat output
Looking at iostat, we can see that svctm (the device service time) gets high. The awk one-liner below prints the iostat header lines plus any data line whose 11th field (svctm) exceeds 5 ms.
[mku@mku-ws]$ awk '($11 > 5) {print $0}; /util/ {print $0}' iostat.out
... some output ....
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
dm-380 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.53 5.34 0.00
dm-381 0.00 0.00 0.00 0.00 0.00 0.00 8.24 0.00 12.16 10.57 0.00
dm-382 0.00 0.00 6.70 10.65 127.03 135.52 30.27 0.32 18.65 5.51 9.56
dm-385 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 11.13 10.95 0.00
dm-416 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.39 5.15 0.00
dm-417 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.20 5.04 0.00
dm-418 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 6.99 6.80 0.00
dm-419 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 6.32 6.05 0.00
dm-421 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 6.37 6.13 0.00
dm-422 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 7.17 6.74 0.00
dm-424 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 6.18 5.94 0.00
dm-425 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.29 5.10 0.00
dm-17 0.00 0.00 0.30 0.00 0.50 0.00 3.33 0.01 25.00 25.00 0.75
dm-20 0.00 0.00 0.80 0.00 200.20 0.00 500.00 0.03 35.38 12.75 1.02
.....
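For reference, a minimal sketch of how extended per-device statistics like the output above might have been collected; the sampling interval, sample count, and output file name are assumptions rather than values taken from this case:

# Hypothetical collection command (interval, count, and output name are assumptions):
# -d per-device stats, -x extended columns (await/svctm/%util), -k report throughput in kB/s.
iostat -dxk 5 60 > iostat.out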
After working with the Storage team, we confirmed this is not a storage issue. Also, the vdsm flow here: http://pastebin.test.redhat.com/145578 shows that all lvcreate and qemu-img create commands complete really fast. The question is why vdsm itself takes so much time.

We asked the customer to run the script introduced in this article: https://access.redhat.com/site/articles/279063 while he runs the API create script, and we are waiting for his results. We are now checking whether their DNS setup is configured correctly and whether vdsClient and sudo commands run fast enough outside vdsm: https://access.redhat.com/site/solutions/35304

DNS confirmed working properly. Attaching the script results for the performance tests. A hedged sketch of such external-command timing checks follows the attachment below.

Created attachment 757251 [details]
watcher script output
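A minimal sketch of how the external commands could be timed outside vdsm to compare against the slow flow; the volume group, LV name, image path, and sizes are hypothetical placeholders, not values from this case:

# Hypothetical timing checks (all names and sizes below are placeholders):
time sudo lvcreate -L 1G -n timing_test <vg_name>             # raw LVM creation speed
time sudo qemu-img create -f qcow2 /tmp/timing_test.qcow2 1G  # image creation speed
time vdsClient -s 0 list                                      # vdsm API responsiveness
sudo lvremove -f <vg_name>/timing_test                        # clean up the test LV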
I don't see anything exceptional in the setup. Maybe the memory consumption of vdsm is high - I don't know if that is normal - but even if not, the CPU is fine, so there does not seem to be an issue in the vdsm code. Did you try to run those storage-related external commands separately, and did they indeed return quickly enough? Can we attach strace to the vdsm process and check whether it is blocked in some syscall? (A hedged sketch of such an strace invocation appears at the end of this report.)

After upgrading to 3.2 GA all the issues were resolved. Closing the case.

Unfortunately, the issue was not resolved. But it seems we narrowed it down to BZ#979193, so I decided to leave this one closed and open a new one.
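A minimal sketch of the strace attachment suggested above; the PID lookup and output path are assumptions and may need adjusting (pgrep -f vdsm can also match supervdsm or the respawn wrapper):

# Hypothetical strace attachment (PID lookup and output path are assumptions):
# -f follows forked children/threads, -tt prints timestamps, -T shows time spent in each syscall.
sudo strace -f -tt -T -p "$(pgrep -f vdsm | head -1)" -o /tmp/vdsm-strace.out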