Bug 1834233 - While Start or Reboot VM - It takes a long time till VM status is 'UP'
Summary: While Start or Reboot VM - It takes a long time till VM status is 'UP'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.13
Hardware: Unspecified
OS: Unspecified
unspecified
low
Target Milestone: ovirt-4.4.3
: ---
Assignee: Tomáš Golembiovský
QA Contact: David Vaanunu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-05-11 11:02 UTC by David Vaanunu
Modified: 2020-11-11 06:42 UTC (History)
5 users (show)

Fixed In Version: vdsm-4.40.33
Clone Of:
Environment:
Last Closed: 2020-11-11 06:42:11 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 111398 0 master MERGED qga: bring the VM up in engine after boot 2020-11-01 08:17:25 UTC
oVirt gerrit 111514 0 master MERGED qga: treat None in capabilities same in all places 2020-11-01 08:17:41 UTC
oVirt gerrit 111520 0 master MERGED qga: add is_active method 2020-11-01 08:17:25 UTC
oVirt gerrit 111521 0 master MERGED qga: mark guest-info check from single location 2020-11-01 08:17:25 UTC
oVirt gerrit 111522 0 master MERGED qga: invalidate qemu-ga capabilities on reboot 2020-11-01 08:17:25 UTC

Description David Vaanunu 2020-05-11 11:02:57 UTC
Description of problem:
While Start or Reboot VM - It takes a long time till status is 'UP'
The results are double compare to 4.3.6 / 4.3.9 


Version-Release number of selected component (if applicable):

redhat-release-8.2-1.0.el8.x86_64
rhv-release-4.4.0-31-001.noarch
vdsm-4.40.13-1.el8ev.x86_64

How reproducible:


Steps to Reproduce:
1. Start / Reboot VM
2. Wait VM status is 'powering-up'
3. ssh to VM / Open VM console and login


Actual results:
Start VM - takes ~70 seconds
Reboot VM - takes ~80 seconds

Expected results:
As 4.3.6 / 4.3.9 results
   Start VM - takes ~30 seconds
   Reboot VM - takes ~40 second

Additional info:

VM name - vm_NewMemory_18__L0_Group_0
VM id - f590f324-7f4d-487a-be5b-62f3e803e771


The log files contain 'Start VM'

2020-05-11 05:34:48,346-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (EE-ManagedThreadFactory-engine-Thread-76) [eb61412f-cf0f-4220-bbe7-fba0263cb7c3] START, CreateBrokerVDSCommand(HostName = f01-h06-000-r620.rdu2.scalelab.redhat.com, CreateVDSCommandParameters:{hostId='67c8c1fb-bd3d-48b6-ae01-3384b8faa0bd', vmId='f590f324-7f4d-487a-be5b-62f3e803e771', vm='VM [vm_NewMemory_18__L0_Group_0]'}), log id: 547e69de


2020-05-11 05:35:56,692-04 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-90) [] VM 'f590f324-7f4d-487a-be5b-62f3e803e771'(vm_NewMemory_18__L0_Group_0) moved from 'PoweringUp' --> 'Up'

Comment 1 Michal Skrivanek 2020-05-12 07:16:38 UTC
probably related to guest agent polling. is that _always_ ~70s or are they cases when it's less than 60s?

Comment 2 David Vaanunu 2020-05-13 07:07:36 UTC
Have no results less than 60 seconds. (Min results are 61 seconds)


Min - 60.7 sec
Avg - 70 sec
Max - 81 Sec
Per.95 - 76 sec

Comment 3 Michal Skrivanek 2020-05-13 07:24:31 UTC
Is it only this one VM, what type is it? can you please verify the status of guest agent in those VMs? Is it installed and running? ovirt-guest-agent or qemu-guest-agent or both? Which version(s)?
Also please attach vdsm.log from the host where the VM was started (and include name and approx time frame)

Comment 4 David Vaanunu 2020-05-13 10:56:22 UTC
Tests are running over 1350 VMs


[root@basic_population ~]# rpm -qa | grep -i guest
qemu-guest-agent-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64

[root@basic_population ~]# rpm -qa | grep -i rel
redhat-release-eula-8.2-1.0.el8.x86_64
redhat-release-8.2-1.0.el8.x86_64
rhv-release-4.4.0-31-001.noarch
epel-release-8-8.el8.noarch


agent is running


Log File:
Name= vm_106__L0_Group_0
vmId='1200269f-319d-4a5d-ba00-050981dc539b'

Timestamp:
2020-05-13 10:40 (VM Start)
              
https://drive.google.com/open?id=1l0thr_NocDcCAmgqSDyAd7z724-KlPqC

Comment 5 Arik 2020-06-28 10:02:02 UTC
It seems like VDSM stopped changing the VM status to UP when it manages to connect to the guest agent - I see for instance the following in the log after ~10 sec:

2020-06-28 12:53:46,520+0300 INFO  (qgapoller/3) [vds] New QEMU-GA capabilities for vm_id=437eeeed-eeb3-4b62-8f3d-6a26b99f6f60, qemu-ga=2.12.0, commands={'guest-set-vcpus', 'guest-get-users', 'guest-get-vcpus', 'guest-network-get-interfaces', 'guest-get-memory-block-info', 'guest-get-timezone', 'guest-get-memory-blocks', 'guest-sync-delimited', 'guest-suspend-hybrid', 'guest-fsfreeze-freeze', 'guest-get-fsinfo', 'guest-fsfreeze-freeze-list', 'guest-suspend-disk', 'guest-info', 'guest-fsfreeze-thaw', 'guest-fstrim', 'guest-set-memory-blocks', 'guest-set-user-password', 'guest-suspend-ram', 'guest-get-host-name', 'guest-get-time', 'guest-sync', 'guest-fsfreeze-status', 'guest-get-osinfo', 'guest-set-time', 'guest-shutdown', 'guest-ping'} (qemuguestagent:216)

And the VM is set with IP and FQDN.
Yet, VDSM keep reporting the status as powering up for 60 sec.

Comment 8 David Vaanunu 2020-11-01 10:46:56 UTC
Verified on version:
redhat-release-8.3-1.0
rhv-release-4.4.3-9-001
vdsm-4.40.34-1


Actual results:
Start VM - takes ~35 seconds
Reboot VM - takes ~470 seconds


4.3.6 / 4.3.9 results
   Start VM - takes ~30 seconds
   Reboot VM - takes ~40 second

Comment 9 Sandro Bonazzola 2020-11-11 06:42:11 UTC
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.