Bug 1265672

| Field | Value |
|---|---|
| Summary | [SCALE] Disk performance is really slow |
| Product | Red Hat Enterprise Virtualization Manager |
| Component | ovirt-engine |
| Version | 3.6.0 |
| Status | CLOSED INSUFFICIENT_DATA |
| Reporter | Carlos Mestre González <cmestreg> |
| Assignee | Nobody <nobody> |
| Severity | unspecified |
| Priority | unspecified |
| Target Milestone | ovirt-3.6.3 |
| Target Release | 3.6.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | storage |
| oVirt Team | Storage |
| Doc Type | Bug Fix |
| Type | Bug |
| CC | amureini, cmestreg, ecohen, gklein, lsurette, rbalakri, Rhev-m-bugs, yeylon |
| Last Closed | 2015-09-24 16:54:51 UTC |
**Description (Carlos Mestre González, 2015-09-23 13:05:26 UTC)**

Created attachment 1076232 [details]: qemu log
Adding to the storage whiteboard for the moment. This seems fairly reproducible, so I'm attaching the qemu log just in case; I couldn't find any errors in the other logs. The only error I've seen in engine.log is this one, not sure if it's relevant:

2015-09-23 13:40:07,046 ERROR [org.ovirt.engine.core.vdsbroker.VmsMonitoring] (DefaultQuartzScheduler_Worker-77) [a013bb9] VM '071562dc-591c-4c5d-8ee0-644bb51fe820' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='c2d7a067-55a9-4e9b-a5c6-e516a3efbf15', vmId='071562dc-591c-4c5d-8ee0-644bb51fe820'}', device='spice', type='GRAPHICS', bootOrder='0', specParams='[]', address='', managed='true', plugged='false', readOnly='false', deviceAlias='', customProperties='[]', snapshotId='null', logicalName='null', usingScsiReservation='false'}'

**Comment 4 (Yaniv Kaul)**

Why is it a RHEV bug and not QEMU/KVM/libvirt? Do you suspect anything wrong in the way RHEV launches the VM?

5 MB/s is also a joke. It should be 50-500 MB/s, depending on your storage.

**Comment 5 (Yaniv Kaul)**

A few more questions:

1. There aren't clear instructions on how to reproduce the issue. Specifically, what is your storage server?
2. I've noticed you are using a VM with 16 sockets. Is that on purpose? Can you try with 2 or so?
3. Why try to use ext4? Just dd on the raw partition. What is your 'dd' command? Did you verify it's not running slowly? (Did you look at ddpt, for example?)

**Yaniv Kaul**

(In reply to Yaniv Kaul from comment #5)
> 2. I've noticed you are using a VM with 16 sockets? Is that on purpose? Can you try with 2 or so?

Sorry, with 1 CPU. Please test with more. Also, why `-cpu Nehalem`? (Again, I doubt any of these are related; you have a more severe issue in that your whole I/O is quite slow for some reason.)

> 3. Why try to use ext4? just dd on the raw partition. What is your 'dd' command? Did you verify it's not running slowly?
> (did you look at ddpt for example?)

---

Also, with the same hardware, how does this stack up against oVirt 3.5's performance? Any noticeable difference?

**Carlos Mestre González**

Just to clarify: our NFS storage server is really slow right now (getting an 8 MB/s transfer rate with dd with virtio, for example). The issue is that performance with IDE is almost 10 times slower (<1 MB/s). If the server weren't this slow, I probably wouldn't have caught the problem at all; it only surfaced through timeouts in our tests, since we don't test performance in general.

**Carlos Mestre González**

(In reply to Yaniv Kaul from comment #4)
> Why is it a RHEV bug and not QEMU/KVM/libvirt?
> do you suspect anything wrong in the way RHEV launches the VM?
>
> 5MB/s is also a joke. It should be 50-500MB/sec, depending on your storage.

Normally I assign it to RHEV so the devel team can investigate first and assign it accordingly.

(In reply to Yaniv Kaul from comment #5)
> 1. There aren't clear instructions on how to reproduce the issue. Specifically, what is your storage server?

I'm checking all the issues with our server now with the team; I'll update with a private comment.

> 3. Why try to use ext4? just dd on the raw partition. What is your 'dd' command? Did you verify it's not running slowly? (did you look at ddpt for example?)

ext4 is part of our test suite; I just checked the dd command to see the speed: `dd if=/dev/zero of=test2 bs=1M count=100`. I haven't checked ddpt; I'll take a look.

---

1. Please fix your storage server. There is no point in testing with such issues. (Make sure your network connection is not 100 Mbps; that can explain some of it.)
2. RHEV devel will be lacking a lot of data here, especially around the QEMU/KVM issues. I don't see what RHEV has to do with this at the moment.
3. You are missing the flag to perform direct I/O in the 'dd' command. Without it, you might be writing into the cache, and 100M is not a lot; you need to bypass the VM cache. Why not use 'fio' or some other reasonable tool?
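For reference, a minimal sketch of the difference being pointed out in item 3: without a direct or sync flag, a 100M `dd` write can complete entirely in the guest page cache, so the reported rate can be far higher than the storage can actually sustain. The file name and sizes below are the ones from the bug; the `conv=fdatasync` variant is an extra illustration not mentioned in the thread.

```shell
#!/bin/sh
# The command from the bug report: with no direct/sync flag, the write may
# land only in the page cache, so the reported rate is inflated.
dd if=/dev/zero of=test2 bs=1M count=100 2>&1 | tail -n1

# With oflag=direct, dd opens the file with O_DIRECT and the rate reflects
# the actual storage path. (O_DIRECT needs filesystem support; tmpfs, for
# example, lacks it, so this line may fail on some filesystems.)
dd if=/dev/zero of=test2 bs=1M count=100 oflag=direct 2>&1 | tail -n1

# Alternative: conv=fdatasync keeps the cache but includes a final flush to
# stable storage in the timing, which is closer to real throughput.
dd if=/dev/zero of=test2 bs=1M count=100 conv=fdatasync 2>&1 | tail -n1

rm -f test2
```

Reading from `/dev/urandom` instead of `/dev/zero` additionally sidesteps arrays that detect and discard all-zero writes, which is the separate caveat raised in the thread.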
Note that with some storage arrays (XtremIO, for example), writing zeros doesn't write anything at all, so again you are 'cheating', so to speak.

**Carlos Mestre González**

Tested in another environment with the same rhevm build, and I don't see any performance issue with IDE disks there. From my quick test, 3.5 seems to have no issues either. I'm not sure what is happening in my environment; it could be infrastructure, or the fact that the nodes are hosted-engine(?). Anyway, closing this bug; I'll reopen it if I can get a clearer picture of what is going on.
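As a footnote, the "'fio' or some other reasonable tool" suggestion from the thread could look like the sketch below. Every parameter (job name, file name, size) is illustrative and not taken from the bug, and the run is guarded so the sketch degrades gracefully on hosts without fio installed.

```shell
#!/bin/sh
# Hypothetical fio sequential-write job with direct I/O. All parameter
# values here are illustrative assumptions, not from the bug report.
if command -v fio >/dev/null 2>&1; then
    fio --name=seqwrite --filename=test2 --rw=write --bs=1M --size=256m \
        --ioengine=libaio --direct=1 --group_reporting \
        || echo "fio run failed (e.g. no O_DIRECT support on this filesystem)"
    rm -f test2
else
    echo "fio not installed"
fi
```

Unlike a bare `dd`, fio reports latency percentiles and IOPS alongside bandwidth, which helps distinguish a slow storage server from a slow virtual disk path.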