Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 707197

Summary: takes between 1-3 minutes to start a VM with a non-bootable disk
Product: Red Hat Enterprise Linux 6
Reporter: Haim <hateya>
Component: libvirt
Assignee: Osier Yang <jyang>
Status: CLOSED WORKSFORME
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 6.2
CC: abaron, bazulay, dallan, danken, dnaori, hateya, iheim, mgoldboi, yeylon, ykaul
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-07 14:00:06 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 682015    
Attachments:
Description (Flags):
vdsm logs. (none)
libvirt logs (none)

Description Haim 2011-05-24 11:37:21 UTC
Created attachment 500587 [details]
vdsm logs.

Description of problem:

Noticed that starting a new VM takes longer than usual, between 1 and 3 minutes, even on very high-performance machines.
From the investigation I did, it appears that the thread sleeps for about a minute after performing the memTestAndCommit check; after almost a minute it continues to the VM wrapper thread, which prepares the volume, creates the XML, and sends it to libvirt.

I also made some code modifications in order to isolate this issue; the results follow:

1) disabled resource locking in memTestAndCommit function under /usr/share/vdsm/clientIf.py


    def memTestAndCommit(self, newVm):
        """
        Test if enough memory is available for new VM (currently unused)
        """
        #self._memLock.acquire()

Using a script, I started a bunch of disk-less VMs; it took a minute to start each VM.

Please refer to 'vdsm-disable_lock.log', attached.
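Measurements like the ones in these experiments can be scripted. A minimal sketch of such a timing harness follows; the start command itself is a placeholder, not the exact vdsm client invocation used in the runs above:

```python
import subprocess
import time

def time_vm_start(start_cmd):
    """Run one VM-start command and return the elapsed wall-clock seconds."""
    begin = time.monotonic()
    subprocess.run(start_cmd, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.monotonic() - begin

def time_batch(start_cmds):
    """Time each start command in turn and return the list of durations."""
    return [time_vm_start(cmd) for cmd in start_cmds]
```

In the runs described here, each start took roughly a minute, so a batch of ten placeholder start commands would make the per-VM delay visible in the returned durations.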

2) disabled the relevant 'if' that calls this function in /usr/share/vdsm/vm.py


    def _startUnderlyingVm(self):
        try:
            if 'recover' not in self.conf:
                #if not self.cif.memTestAndCommit(self):
                #    self.setDownStatus(ERROR,
                #                       'Out of memory - machine not created')
                #    return

                rc, err = self._prepostVmScript('pre_vm')

Using a script, I started a bunch of disk-less VMs; it took a minute to start 10 VMs.

Please refer to 'vdsm-disable_if.log', attached.

Not sure if it's a regression.

Comment 3 Haim 2011-05-31 13:51:07 UTC
Created attachment 502004 [details]
libvirt logs

Comment 5 Michal Privoznik 2011-06-07 09:47:30 UTC
Haim, which version of libvirt were you running? Log inspection shows we wait too long for file descriptors to become available for I/O, even though they already are, e.g.:

17:45:48.577: 1961: debug : virExecWithHook:761 : LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?
17:45:48.578: 1961: debug : virCommandRunAsync:1214 : Command result 0, with PID 5486
....
17:46:01.303: 1961: debug : virCommandRun:1084 : Result status 0, stdout: '' stderr: 'name "pci-bridge", bus PCI


17:46:01.316: 1961: debug : virExecWithHook:761 : LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?
17:46:01.316: 1961: debug : virCommandRunAsync:1214 : Command result 0, with PID 5694
...
17:46:19.330: 1961: debug : virCommandRun:1084 : Result status 0, stdout: '' stderr: 'name "pci-bridge", bus PCI


17:46:19.626: 1961: debug : virJSONValueToString:1062 : result={"execute":"qmp_capabilities"}
17:46:19.626: 1961: debug : virEventPollUpdateHandle:144 : Update handle w=31 e=15
17:46:19.626: 1961: debug : virEventPollInterruptLocked:690 : Interrupting
...
17:46:27.300: 1961: debug : virEventPollInterruptLocked:690 : Interrupting
17:46:27.300: 1961: debug : virJSONValueFromString:933 : string={"return": {}}


Moreover, notice (it is not shown here, but in the attached logs) that when qemu writes to stdout we pick up the output immediately.
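The behavior described here, poll returning as soon as data is actually present on a descriptor, can be illustrated outside libvirt. A minimal sketch using Python's select.poll on a pipe (illustrative only, not libvirt's event loop):

```python
import os
import select
import time

# Create a pipe: until the write end is written to, poll() on the
# read end should time out; once data arrives it returns immediately.
r, w = os.pipe()
poller = select.poll()
poller.register(r, select.POLLIN)

# No data yet: a 100 ms poll expires with no ready descriptors.
assert poller.poll(100) == []

os.write(w, b"ready")
begin = time.monotonic()
events = poller.poll(10000)          # up to 10 s, but returns at once
elapsed = time.monotonic() - begin

assert events and events[0][0] == r  # read end reported readable
assert elapsed < 1.0                 # returned well before the timeout
os.close(r)
os.close(w)
```

A delay of 13-18 seconds between spawning qemu and collecting its output, as in the log excerpts above, is therefore not explained by poll semantics alone; either the data arrives late or something else holds up the loop.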

Comment 6 Haim 2011-06-07 20:36:45 UTC
(In reply to comment #5)
> Haim, which version of libvirt you were running? Log inspection shows, we wait
> too long on file descriptors to became available for I/O, although they are [...]


libvirt-0.9.1-1.el6.x86_64

Comment 7 Michal Privoznik 2011-06-08 09:19:38 UTC
Well, I can't see any problem in the code, but I might be wrong. What I still suspect is that qemu might be slow. I mean, under some circumstances it can take qemu a long time to write its output and quit, which results in slow libvirt. So Haim, would you mind reproducing this, and when you reach the point where everything is so sluggish, just run qemu manually:

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root /usr/libexec/qemu-kvm -device ? -device pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?

This writes something to stderr. If this command runs for more than, say, 5 seconds, I believe this is a qemu issue. Thanks.
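The suggested check can be wrapped in a short timer. A minimal sketch, where QEMU_PROBE repeats the device-probe command from this comment (swap in the correct qemu path for the affected host):

```python
import subprocess
import time

# Device-probe command from the comment above; each "-device X,?" asks
# qemu to print that device's properties on stderr and exit.
QEMU_PROBE = ["/usr/libexec/qemu-kvm",
              "-device", "?",
              "-device", "pci-assign,?",
              "-device", "virtio-blk-pci,?",
              "-device", "virtio-net-pci,?"]

def time_probe(cmd):
    """Run the probe command, returning (elapsed seconds, stderr text)."""
    begin = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True)
    return time.monotonic() - begin, result.stderr

# On the affected host:
#   elapsed, err = time_probe(QEMU_PROBE)
# If elapsed exceeds roughly 5 seconds while the machine is in the slow
# state, the delay is in qemu itself, per the reasoning above.
```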