| Summary: | takes between 1-3 minutes to start a VM with a non-bootable disk | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Haim <hateya> |
| Component: | libvirt | Assignee: | Osier Yang <jyang> |
| Status: | CLOSED WORKSFORME | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.2 | CC: | abaron, bazulay, dallan, danken, dnaori, hateya, iheim, mgoldboi, yeylon, ykaul |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-07-07 14:00:06 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 682015 | | |
| Attachments: | libvirt logs (attachment 502004); vdsm logs (attachment 500587) | | |
Created attachment 502004 [details]
libvirt logs
Haim, which version of libvirt were you running? Log inspection shows we wait too long on file descriptors to become available for I/O, even though they already are, e.g.:
17:45:48.577: 1961: debug : virExecWithHook:761 : LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?
17:45:48.578: 1961: debug : virCommandRunAsync:1214 : Command result 0, with PID 5486
....
17:46:01.303: 1961: debug : virCommandRun:1084 : Result status 0, stdout: '' stderr: 'name "pci-bridge", bus PCI
17:46:01.316: 1961: debug : virExecWithHook:761 : LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?
17:46:01.316: 1961: debug : virCommandRunAsync:1214 : Command result 0, with PID 5694
...
17:46:19.330: 1961: debug : virCommandRun:1084 : Result status 0, stdout: '' stderr: 'name "pci-bridge", bus PCI
17:46:19.626: 1961: debug : virJSONValueToString:1062 : result={"execute":"qmp_capabilities"}
17:46:19.626: 1961: debug : virEventPollUpdateHandle:144 : Update handle w=31 e=15
17:46:19.626: 1961: debug : virEventPollInterruptLocked:690 : Interrupting
...
17:46:27.300: 1961: debug : virEventPollInterruptLocked:690 : Interrupting
17:46:27.300: 1961: debug : virJSONValueFromString:933 : string={"return": {}}
Moreover, notice (it is not shown here, but it is in the attached logs) that when qemu writes to stdout we pick up the output immediately.
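To make the mechanism concrete, below is a minimal Python analogue of this kind of fd-polling loop. This is an illustrative sketch only, not libvirt's actual implementation (the real code is the C in virCommandRun/virEventPoll referenced above): the caller cannot return until both of the child's pipes reach EOF, so a child that is slow to write its output and exit inflates the caller's elapsed time one-for-one.

```python
import os
import select
import subprocess
import time

def run_and_collect(argv):
    """Spawn a child and poll its stdout/stderr until both hit EOF,
    analogous to how a virCommandRun-style caller waits on the
    child's pipes before it can report a result."""
    start = time.time()
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    poller = select.poll()
    open_fds = {proc.stdout.fileno(), proc.stderr.fileno()}
    for fd in open_fds:
        poller.register(fd, select.POLLIN)
    while open_fds:
        for fd, _event in poller.poll():
            if not os.read(fd, 4096):   # b'' means EOF on this pipe
                poller.unregister(fd)
                open_fds.discard(fd)
    proc.wait()
    return time.time() - start

# A child that is slow to write and exit (standing in here for a
# slow 'qemu-kvm -device ?' probe) directly delays the caller.
print("elapsed: %.2fs" % run_and_collect(["sh", "-c", "sleep 2; echo done"]))
```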
(In reply to comment #5)
> Haim, which version of libvirt were you running? Log inspection shows we
> wait too long on file descriptors to become available for I/O, even though
> they already are [log excerpt trimmed; see comment #5].

libvirt-0.9.1-1.el6.x86_64

Well, I can't see any problem in the code, but I might be wrong. What I still suspect is that qemu itself might be slow: under some circumstances it can take qemu a long time to write its output and quit, which in turn makes libvirt slow. So Haim, would you mind reproducing this, and when you reach the point where everything is that sluggish, running qemu manually:

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?

This writes something on stderr. If the command runs for more than, say, 5 seconds, I believe this is a qemu issue. Thanks.
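A hedged Python 3 sketch of that timing check (the command line and environment are the ones quoted above; the snippet itself is illustrative and not part of the bug report):

```python
import subprocess
import time

# The same capability probe libvirt runs, per the log excerpt above.
PROBE = ["/usr/libexec/qemu-kvm",
         "-device", "?",
         "-device", "pci-assign,?",
         "-device", "virtio-blk-pci,?",
         "-device", "virtio-net-pci,?"]

start = time.time()
# The device list is printed on stderr, so capture it there.
result = subprocess.run(PROBE, capture_output=True, text=True,
                        env={"LC_ALL": "C",
                             "PATH": "/sbin:/usr/sbin:/bin:/usr/bin",
                             "HOME": "/root",
                             "USER": "root",
                             "LOGNAME": "root"})
elapsed = time.time() - start
print("probe took %.2fs" % elapsed)
if elapsed > 5:
    print("qemu itself is slow; libvirt is likely not at fault")
```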
Created attachment 500587 [details]
vdsm logs

Description of problem:

Noticed that it takes longer than usual to start a new VM: between 1 and 3 minutes, on very high-performance machines. From the investigation I did, it appears that the thread goes to sleep for a minute after performing the memTestAndCommit check, and only after almost a minute continues to the VM wrapper thread, which prepares the volume, creates the XML, and sends it to libvirt.

I also performed some code modifications in order to isolate this issue; the results follow.

1) Disabled resource locking in the memTestAndCommit function under /usr/share/vdsm/clientIf.py:

```python
def memTestAndCommit(self, newVm):
    """
    Test if enough memory is available for new VM (currently unused)
    """
    #self._memLock.acquire()
```

Using a script, started a bunch of disk-less VMs; it took a minute to start each VM. Please refer to 'vdsm-disable_lock.log' (attached).

2) Disabled the relevant 'if' that calls this function in /usr/share/vdsm/vm.py:

```python
def _startUnderlyingVm(self):
    try:
        if 'recover' not in self.conf:
            #if not self.cif.memTestAndCommit(self):
            #    self.setDownStatus(ERROR,
            #                       'Out of memory - machine not created')
            #    return
            rc, err = self._prepostVmScript('pre_vm')
```

Using a script, started a bunch of disk-less VMs; it took a minute to start 10 VMs. Please refer to 'vdsm-disable_if.log' (attached).

Not sure if it's a regression.
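For illustration, here is a minimal, hypothetical Python sketch of the serialization pattern described above (names and timings are made up; this is not actual vdsm code): when a slow admission check runs under a process-wide lock, each caller waits for every earlier caller's full check, so N concurrent starts take roughly N times the check duration.

```python
import threading
import time

_mem_lock = threading.Lock()          # stands in for vdsm's _memLock

def mem_test_and_commit(check_seconds):
    """Hypothetical stand-in for memTestAndCommit: a check that is
    slow by itself and is serialized by a process-wide lock."""
    with _mem_lock:
        time.sleep(check_seconds)     # the slow part observed in the bug

def start_vm(name):
    t0 = time.time()
    mem_test_and_commit(check_seconds=1)   # scaled down from ~60s
    print("%s entered startup after %.1fs" % (name, time.time() - t0))

threads = [threading.Thread(target=start_vm, args=("vm%d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Output: VM starts complete roughly 1s apart -- the lock, not the
# per-VM work, sets the pace, matching the minute-per-VM pattern above.
```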