Bug 2030119

Summary: [aarch64]: virsh xml operation slow down on libvirt-7.10.0-1
Product: Red Hat Enterprise Linux 8
Reporter: Yiding Liu (Fujitsu) <yidliu>
Component: libvirt
Assignee: Ján Tomko <jtomko>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 8.6
CC: jdenemar, jtomko, lcapitulino, nilal, virt-maint, weizhan
Target Milestone: rc
Keywords: Regression, Triaged
Target Release: ---
Hardware: aarch64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-8.0.0-0rc1.1.module+el8.6.0+13853+e8cd34b9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-05-10 13:24:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 8.0.0
Embargoed:
Bug Depends On:
Bug Blocks: 1929792, 1885765
Attachments:
Description         Flags
libvirtd.log        none
new libvirtd log    none

Description Yiding Liu (Fujitsu) 2021-12-08 02:44:46 UTC
Description of problem:
virsh XML-related operations take much longer than with libvirt-7.9.0-1,
e.g. virsh define and virsh edit.

Version-Release number of selected component (if applicable):
libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.aarch64


How reproducible: 100%


Steps to Reproduce:
1. time virsh define test.xml

Actual results:
# time virsh define test.xml 
Domain 'fj-kvm-vm-debug' defined from test.xml


real	0m8.689s
user	0m0.030s
sys	0m0.030s


Expected results:
# rpm -q libvirt 
libvirt-7.9.0-1.module+el8.6.0+13150+28339563.aarch64
# time virsh define test.xml 
Domain 'fj-kvm-vm-debug' defined from test.xml


real	0m0.049s
user	0m0.028s
sys	0m0.010s



Additional info:
A lot of libvirt CI cases fail because of this issue.
The virsh command takes longer than the TIMEOUT setting, so the case fails.
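
The timeout failure described above can be checked outside the CI with a small timing wrapper. This is a minimal sketch only: the helper name `timed_run` and any timeout value passed to it are assumptions for illustration, since the actual CI TIMEOUT setting is not stated in this report.

```python
import subprocess
import time


def timed_run(cmd, timeout_s):
    """Run cmd and return (elapsed_seconds, timed_out).

    timed_out is True when the command did not finish within timeout_s,
    which is the condition that makes a CI case fail in this report.
    """
    start = time.perf_counter()
    try:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failing command is not silently counted as a fast success.
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout_s)
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    return time.perf_counter() - start, timed_out
```

For example, `timed_run(["virsh", "define", "test.xml"], timeout_s=5.0)` would measure the same wall-clock time that `time` reports as `real` above, with 5.0 being a placeholder budget rather than the real CI value.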

Comment 1 Yiding Liu (Fujitsu) 2021-12-08 03:49:42 UTC
Created attachment 1845177 [details]
libvirtd.log

log_filters="1:libvirt 1:event"

Comment 2 Yiding Liu (Fujitsu) 2021-12-08 07:06:47 UTC
aarch64 only. I can't reproduce the problem on x86_64.

# rpm -q libvirt
libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.x86_64
# time virsh define auto_test_tool/guest.xml 
Domain 'fj-kvm-vm' defined from auto_test_tool/guest.xml


real	0m0.045s
user	0m0.011s
sys	0m0.007s

Comment 3 Peter Krempa 2021-12-08 08:45:22 UTC
The debug log doesn't contain logs from the internals so it doesn't really show what's taking so long.

Please re-capture the log with the following log filter setting:

1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util

https://www.libvirt.org/kbase/debuglogs.html#targeted-logging-for-debugging-qemu-vms

Comment 4 Yiding Liu (Fujitsu) 2021-12-08 09:01:36 UTC
Created attachment 1845198 [details]
new libvirtd log

1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util

Comment 5 Ján Tomko 2021-12-08 12:30:17 UTC
It looks like
commit 3bc6f46d305ed82f7314ffc4c2a66847b831a6bd
    qemu: Invalidate capabilities cache on host cpuid mismatch

considers architectures where we don't query host cpuid as mismatched,
so the capabilities are probed every time (~60 in the attached log file).

I've sent a patch upstream to only do the check if we queried host cpuid.
Will share a link once it appears in the archives.

Comment 6 Ján Tomko 2021-12-08 13:07:45 UTC
Proposed upstream patch:
https://listman.redhat.com/archives/libvir-list/2021-December/msg00218.html

Comment 7 Luiz Capitulino 2021-12-08 13:18:25 UTC
Jan, huge thanks for jumping in. This looks like a high priority regression for FJ. Should we set ITR=8.6?

Great catch, Yiding!

Comment 8 Ján Tomko 2021-12-08 14:35:07 UTC
Pushed upstream as:
commit 33538bc46b7446525387b5555c58ea298c198c83
Author:     Ján Tomko <jtomko>
CommitDate: 2021-12-08 15:27:58 +0100

    qemu: do not compare missing cpu data
    
    For x86, we invalidate qemu caps cache if the host CPUID changed.
    However other cpu drivers do not have the 'getHostData' function
    implemented.
    
    Skip the comparison if we do not have host CPUData available,
    since virCPUDataIsIdentical always returns an error in that case.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=2030119
    
    Fixes: 3bc6f46d305ed82f7314ffc4c2a66847b831a6bd
    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Jiri Denemark <jdenemar>

git describe: v7.10.0-136-g33538bc46b
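
The logic change described in the commit message can be illustrated with a simplified model. This is a sketch under stated assumptions, not libvirt's C code: `cpu_data_is_identical` only mimics the behaviour the commit message attributes to virCPUDataIsIdentical (an error when data is missing), and the names `CPUDataError`, `cache_valid_before_fix` and `cache_valid_after_fix` are hypothetical.

```python
from typing import Optional


class CPUDataError(Exception):
    """Stands in for the error return of the comparison (hypothetical)."""


def cpu_data_is_identical(a: Optional[bytes], b: Optional[bytes]) -> bool:
    # Assumption taken from the commit message: comparing against missing
    # CPU data is an error, not a clean "identical" or "different" answer.
    if a is None or b is None:
        raise CPUDataError("missing CPU data")
    return a == b


def cache_valid_before_fix(cached: Optional[bytes], host: Optional[bytes]) -> bool:
    # Regression: the error is treated as a mismatch, so on architectures
    # without host CPU data the cache is invalidated on every check.
    try:
        return cpu_data_is_identical(cached, host)
    except CPUDataError:
        return False


def cache_valid_after_fix(cached: Optional[bytes], host: Optional[bytes]) -> bool:
    # Fix: skip the comparison entirely when no host CPU data is available.
    if host is None:
        return True
    try:
        return cpu_data_is_identical(cached, host)
    except CPUDataError:
        return False
```

In this model, a host without CPU data (`host=None`, as on aarch64 here) always fails the pre-fix check and always passes the post-fix check, while a host with CPU data is still compared against the cached value in both versions.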

Comment 13 Yiding Liu (Fujitsu) 2022-01-14 06:29:06 UTC
Verified on libvirt-8.0.0-0rc1.1.el8.aarch64. The fix works, thanks a lot.
```
# time virsh define guest.xml 
Domain 'fj-kvm-vm' defined from guest.xml


real	0m0.090s
user	0m0.048s
sys	0m0.016s
```

Comment 16 errata-xmlrpc 2022-05-10 13:24:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759