Bug 2225667

Summary: libvirtd hangs when deploying VM with virt-install
Product: Red Hat Enterprise Linux 9
Component: systemd
Version: 9.3
Hardware: x86_64
OS: Linux
Status: VERIFIED
Severity: high
Priority: urgent
Reporter: Mario Cattamo <mcattamo>
Assignee: Jan Macku <jamacku>
QA Contact: Frantisek Sumsal <fsumsal>
CC: carlosrodrifernandez, hhan, hongzliu, jamacku, juzhou, liali, lvivier, mxie, pkrempa, qren, rpittau, smitterl, systemd-maint, thuth, tli, virt-maint, xiaofwan, yalzhang, yanghliu, ykarel, zhguo
Keywords: TestBlocker, Triaged
Target Milestone: rc
Fixed In Version: systemd-252-17.el9
Type: Bug
Attachments: libvirtd log

Description Mario Cattamo 2023-07-25 15:59:14 UTC
Created attachment 1977520 [details]
libvirtd log

Description of problem:
When deploying a VM with virt-install, the command hangs and cannot be stopped, even with ^C. The only way to make the command work again is to restart the libvirtd service, which is a workaround rather than a fix.

Running virt-install in debug mode shows that it hangs at "Requesting libvirt URI default".

While the command is hung, the status of the libvirtd service shows:
libvirtd[52740]: cannot parse process status data
More details are in the attached log.

Version-Release number of selected component (if applicable):
RHEL 9.3 compose RHEL-9.3.0-20230725.27
Date: 2023-07-25
osbuild-90-1.el9.noarch.rpm
osbuild-composer-85-1.el9.x86_64.rpm
weldr-client-35.9-1.el9.x86_64.rpm
KERNEL: 5.14.0-344.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Deploy RHEL-9.3 VM in OpenStack
2. git clone https://github.com/virt-s1/rhel-edge.git
3. ./ostree.sh

Actual results:
virt-install hangs and VM deployment fails due to timeout.

Expected results:
virt-install deploys the VM successfully.

Additional info:

Comment 1 Xiaofeng Wang 2023-07-26 01:02:45 UTC
While virt-install is hung, the command "sudo virsh list --all" hangs as well, even though libvirtd.service is reported as active.
Restarting the libvirtd service resolves the hang.
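
The hang described above can be detected non-interactively by running the command under a timeout. The following is a minimal Python sketch and not part of the original report; the helper name `command_hangs` and the 30-second default are illustrative assumptions.

```python
import subprocess


def command_hangs(cmd, timeout_s=30.0):
    """Return True if cmd does not finish within timeout_s seconds.

    subprocess.run() kills the child process when the timeout expires,
    so the caller is not left blocked the way an interactive shell is.
    A command that exits in time (with any exit code) counts as not hung.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

On a host showing this problem, `command_hangs(["sudo", "virsh", "list", "--all"])` would be expected to return True, assuming virsh is installed and sudo does not prompt for a password.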

systemd and libvirt version:
systemd-252-16.el9.x86_64
libvirt-daemon-kvm-9.3.0-2.el9.x86_64
libvirt-daemon-9.3.0-2.el9.x86_64
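
To compare a build string like the ones listed above against the "Fixed In Version" field of this bug (systemd-252-17.el9), a simplified check could look like the sketch below. It is not a full RPM version comparison: it only reads the numeric version and the leading release number of el9 builds, and it assumes that any later build keeps the fix. The names `parse_build` and `has_fix` are hypothetical.

```python
import re

# Taken from the "Fixed In Version" field of this bug.
FIXED_BUILD = "systemd-252-17.el9"

# Matches e.g. "systemd-252-16.el9.x86_64"; only el9 builds are handled.
_BUILD_RE = re.compile(r"^systemd-(\d+)-(\d+)(?:\.\d+)*\.el9")


def parse_build(nvr):
    """Return (version, release) as integers from a systemd build string."""
    match = _BUILD_RE.match(nvr)
    if match is None:
        raise ValueError(f"unrecognised systemd build string: {nvr!r}")
    return int(match.group(1)), int(match.group(2))


def has_fix(nvr):
    """True if nvr is at or after the build named in 'Fixed In Version'."""
    return parse_build(nvr) >= parse_build(FIXED_BUILD)
```

With this sketch, `has_fix("systemd-252-16.el9.x86_64")` is False and `has_fix("systemd-252-17.el9")` is True.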

Comment 3 Xiaofeng Wang 2023-07-26 01:04:13 UTC
I found bug https://bugzilla.redhat.com/show_bug.cgi?id=2213660, which is related to this one.

Comment 5 Han Han 2023-07-27 08:22:48 UTC
I think it is the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=2213660
The buggy commit for systemd is: ff32060f2ed37b68dc26256b05e2e69013b0ecfe
core/service: when resetting PID also reset known flag

And systemd-252-16.el9.x86_64 contains this buggy commit:
➜  ~ rpm -q --changelog systemd-252-16.el9.x86_64 |grep 'core/service: when resetting PID also reset known flag' 
- core/service: when resetting PID also reset known flag (#2210237)
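
The same changelog check can be scripted. The sketch below is an illustration only: the helper names are hypothetical, and `installed_changelog` assumes `rpm` is on PATH and the package is installed.

```python
import subprocess

# Changelog subject quoted in this comment.
SUSPECT_SUBJECT = "core/service: when resetting PID also reset known flag"


def changelog_mentions(changelog_text, subject=SUSPECT_SUBJECT):
    """True if any line of the changelog text contains subject."""
    return any(subject in line for line in changelog_text.splitlines())


def installed_changelog(package="systemd"):
    """Return the output of `rpm -q --changelog <package>`."""
    result = subprocess.run(
        ["rpm", "-q", "--changelog", package],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Because RPM changelogs are cumulative, a later build that carries a fix would still list this entry, so a match only shows that the commit was included at some point, not that the installed build is still affected.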

Comment 6 Han Han 2023-07-27 08:24:10 UTC
See also https://bugzilla.redhat.com/show_bug.cgi?id=2213660 and https://github.com/systemd/systemd/pull/28000
Please backport the fix.

Comment 7 Peter Krempa 2023-07-27 08:28:22 UTC
(In reply to Han Han from comment #5)
> I think it is the same issue as
> https://bugzilla.redhat.com/show_bug.cgi?id=2213660
> The buggy commit for systemd is: ff32060f2ed37b68dc26256b05e2e69013b0ecfe
> core/service: when resetting PID also reset known flag 
> 
> And systemd-252-16.el9.x86_64 contains this buggy commit:
> ➜  ~ rpm -q --changelog systemd-252-16.el9.x86_64 |grep 'core/service: when
> resetting PID also reset known flag' 
> - core/service: when resetting PID also reset known flag (#2210237)

Yes, I've come to the same conclusion. There's another duplicate of this filed with libvirt just now.

Comment 8 Peter Krempa 2023-07-27 08:28:26 UTC
*** Bug 2226916 has been marked as a duplicate of this bug. ***

Comment 9 Peter Krempa 2023-08-01 07:00:14 UTC
*** Bug 2227980 has been marked as a duplicate of this bug. ***

Comment 10 Yatin Karel 2023-08-02 07:49:42 UTC
We are also hitting the same issue with CentOS Stream 9 in OpenStack upstream CI: https://bugs.launchpad.net/neutron/+bug/2029335. It would be good to get the revert patch included in the systemd rpm to clear this issue.

Comment 11 Frantisek Sumsal 2023-08-04 08:26:45 UTC
*** Bug 2229106 has been marked as a duplicate of this bug. ***

Comment 15 Peter Krempa 2023-08-08 08:35:48 UTC
*** Bug 2229859 has been marked as a duplicate of this bug. ***

Comment 16 David Tardon 2023-08-16 08:36:12 UTC
*** Bug 2231983 has been marked as a duplicate of this bug. ***