Bug 2043579

Summary: virt-install: "ERROR internal error: cannot parse process status data for pid" on guest reboot
Product: Red Hat Enterprise Linux 9 Reporter: Michal Privoznik <mprivozn>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
libvirt sub component: General QA Contact: Hongzhou Liu <hongzliu>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: abologna, drjones, eric.auger, gshan, hongzliu, jdenemar, juzhou, lcapitulino, lijin, pkrempa, rjones, tyan, tzheng, virt-maint, yidliu
Version: unspecified Keywords: Regression, Triaged, Upstream
Target Milestone: rc   
Target Release: ---   
Hardware: aarch64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-8.0.0-2.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2041610 Environment:
Last Closed: 2022-05-17 12:46:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1924294    

Description Michal Privoznik 2022-01-21 14:47:44 UTC
+++ This bug was initially created as a clone of Bug #2041610 +++

When virt-install of an 8.6 aarch64 guest completes and the installer asks whether to reboot, answering 'yes' produces:

[  OK  ] Started Reboot.
[  OK  ] Reached target Reboot.
dracut Warning: Killing all remaining processes
Rebooting.
[ 7478.584400] reboot: Restarting system

ERROR    internal error: cannot parse process status data for pid '99994/0'

The VM gets stopped.

The VM can then be started again manually, but the guest should reboot without user involvement.
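
The error text carries a 'pid/tid' pair (the format string in the commit quoted later in this report prints pid, then tid). A minimal Python sketch for pulling that pair out of captured virt-install output and checking whether the PID still exists; the helper names are hypothetical, and the /proc lookup assumes a Linux host:

```python
import os
import re

# Matches the message shown in the log above; the two groups are pid and tid.
ERROR_RE = re.compile(r"cannot parse process status data for pid '(\d+)/(\d+)'")


def extract_pid_tid(output):
    """Return (pid, tid) from the error message, or None if it is absent."""
    match = ERROR_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def pid_exists(pid):
    """Assumption: on Linux, a live process has a /proc/<pid> directory."""
    return os.path.exists(f"/proc/{pid}")
```

If `extract_pid_tid()` returns a pair and `pid_exists()` is false for that PID, the QEMU process had most likely already exited when libvirt tried to read its status.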

--- Additional comment from Eric Auger on 2022-01-17 22:10:52 CET ---

On a Fujitsu machine installed with RHEL-8.6.0-20220116.d.0 BaseOS aarch64

start:

sudo virt-install --name aarch64-vm0-rhel8.6 --ram 32768 --accelerate --virt-type kvm --arch=aarch64 --vcpus=48 --disk size=40,pool=vm --location http://download.eng.bos.redhat.com/rhel-8/composes/RHEL-8/RHEL-8.6.0-20220116.d.0/compose/BaseOS/aarch64/os --os-type linux  --os-variant rhl8.0 --check-cpu --network default

Host has:

Linux fujitsu-fx700-01-n01.2a2m.lab.eng.bos.redhat.com 4.18.0-359.el8.kpq0.aarch64
Name         : libvirt
Version      : 8.0.0
Release      : 1.module+el8.6.0+13888+55157bfb

Name         : virt-install
Version      : 3.2.0
Release      : 2.el8

qemu-kvm.aarch64   15:6.2.0-2.module+el8.6.0+13738+17338784

--- Additional comment from Peter Krempa on 2022-01-18 09:16:27 CET ---

Seems to be caused by:

commit 938382b60ae5bd1f83b5cb09e1ce68b9a88f679a
Author: Ani Sinha <ani>
Date:   Tue Jan 11 15:50:43 2022 +0530

    report error when virProcessGetStatInfo() is unable to parse data
    
    Currently virProcessGetStatInfo() always returns success and only logs error
    when it is unable to parse the data. Make this function actually report the
    error and return a negative value in this error scenario.
    
    Fix the callers so that they do not override the error generated.
    Also fix non-linux implementation of this function so as to report error.
    
    Signed-off-by: Ani Sinha <ani>
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Michal Privoznik <mprivozn>

[...]

diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index b559a4257e..85d8c8e747 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -1784,7 +1784,10 @@ virProcessGetStatInfo(unsigned long long *cpuTime,
         virStrToLong_ullp(proc_stat[VIR_PROCESS_STAT_STIME], NULL, 10, &systime) < 0 ||
         virStrToLong_l(proc_stat[VIR_PROCESS_STAT_RSS], NULL, 10, &rss) < 0 ||
         virStrToLong_i(proc_stat[VIR_PROCESS_STAT_PROCESSOR], NULL, 10, &cpu) < 0) {
-        VIR_WARN("cannot parse process status data");
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("cannot parse process status data for pid '%d/%d'"),
+                       (int) pid, (int) tid);
+        return -1;
     }

     /* We got jiffies
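
For context, the four values parsed in the hunk above come from the Linux per-process stat file. The following is a minimal Python sketch, not libvirt's code, of that kind of parsing under the stricter behaviour the commit introduced: any missing or malformed field becomes an error. Field numbers follow proc(5). The names `parse_stat_line`, `read_stat` and `StatParseError` are hypothetical, and treating a tid of 0 as "use the process-level file" is an assumption based on the '99994/0' error text.

```python
# 1-based field numbers from proc(5) for the four values parsed in the diff.
FIELD_UTIME, FIELD_STIME, FIELD_RSS, FIELD_PROCESSOR = 14, 15, 24, 39


class StatParseError(Exception):
    """Raised when the stat data is missing or malformed."""


def parse_stat_line(line):
    """Parse one stat line into utime, stime, rss and processor."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')' characters, so split on the LAST ')'.
    _, sep, tail = line.rpartition(")")
    if not sep:
        raise StatParseError("no command field in stat data")
    # tail starts at field 3 (state), so field N lives at index N - 3.
    fields = tail.split()
    try:
        return {
            "utime": int(fields[FIELD_UTIME - 3]),
            "stime": int(fields[FIELD_STIME - 3]),
            "rss": int(fields[FIELD_RSS - 3]),
            "processor": int(fields[FIELD_PROCESSOR - 3]),
        }
    except (IndexError, ValueError) as exc:
        raise StatParseError("cannot parse process status data") from exc


def read_stat(pid, tid=0):
    """Read and parse the stat file; assumption: tid 0 means the whole process."""
    path = f"/proc/{pid}/task/{tid}/stat" if tid else f"/proc/{pid}/stat"
    try:
        with open(path) as f:
            line = f.read()
    except OSError as exc:
        # A process that has already exited has no stat file at all.
        raise StatParseError(
            f"cannot read process status data for pid '{pid}/{tid}'"
        ) from exc
    return parse_stat_line(line)
```

With this strict variant, querying a process that has just exited raises an error instead of returning empty data, which is the behaviour change that surfaced during the guest reboot.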

--- Additional comment from Michal Privoznik on 2022-01-18 12:46:00 CET ---

Ooops, revert proposed on the list:

https://listman.redhat.com/archives/libvir-list/2022-January/msg00778.html

--- Additional comment from Yiding Liu (Fujitsu) on 2022-01-20 09:42:56 CET ---

Pre-Verify: PASS
I backported the fix to libvirt-8.0.0-1.module+el8.6.0+13888+55157bfb.src.rpm and the error is gone.

```
# virt-install --connect qemu:///system -l /var/lib/libvirt/boot/RHEL-8.6.0-20220108.3-BaseOS-aarch64-boot.iso  --os-variant rhel8-unknown -n debug --memory 4096 --machine virt --vcpus 4 --disk /var/lib/libvirt/images/debug.qcow2
[snip]

dracut Warning: Killing all remaining processes
Rebooting.
[  141.619939] reboot: Restarting system

Domain creation completed.
Restarting guest.
Running text console command: virsh --connect qemu:///system console debug
Connected to domain 'debug'
Escape character is ^] (Ctrl + ])

```

--- Additional comment from Michal Privoznik on 2022-01-20 17:54:49 CET ---

Pushed into master as:

commit 105dace22cc7b5b18d72a4dcad4a2cf386ce5c99
Author:     Michal Prívozník <mprivozn>
AuthorDate: Tue Jan 18 12:40:09 2022 +0100
Commit:     Michal Prívozník <mprivozn>
CommitDate: Thu Jan 20 17:51:07 2022 +0100

    Revert "report error when virProcessGetStatInfo() is unable to parse data"
    
    This reverts commit 938382b60ae5bd1f83b5cb09e1ce68b9a88f679a.
    
    Turns out, the commit did more harm than good. It changed
    semantics on some public APIs. For instance, while
    qemuDomainGetInfo() previously did not return an error, it does
    now. While the calls to virProcessGetStatInfo() are guarded with
    virDomainObjIsActive(), that doesn't necessarily mean that QEMU's
    PID is still alive. QEMU might be gone but we just haven't
    realized it (e.g. because the eof handler thread is waiting for a
    job).
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2041610
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Andrea Bolognani <abologna>

v8.0.0-129-g105dace22c
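
The race described in the revert can be illustrated with a tolerant reader. This is a Python sketch under stated assumptions, not the libvirt implementation: the function name `cpu_ticks_or_zero` is hypothetical, and it mimics the pre-regression behaviour of logging a warning and still returning success when the stat data cannot be read.

```python
import logging

log = logging.getLogger("stat-sketch")


def cpu_ticks_or_zero(pid):
    """Return utime + stime (in clock ticks) for pid, or 0 if unavailable."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # Skip past the parenthesised command name; the remainder
            # starts at field 3, so utime (14) and stime (15) sit at
            # indexes 11 and 12.
            fields = f.read().rpartition(")")[2].split()
        return int(fields[11]) + int(fields[12])
    except (OSError, IndexError, ValueError) as exc:
        # The process may already be gone even though the caller still
        # believes it is running: warn and report "no data" instead of
        # failing the whole query.
        log.warning("cannot read process status data for pid %d: %s", pid, exc)
        return 0
```

A caller that only checked an "active" flag before calling this function still gets an answer when QEMU has exited in the meantime, so an informational query does not fail during a reboot.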

Comment 1 Michal Privoznik 2022-01-21 14:52:53 UTC
To POST:

https://gitlab.com/redhat/rhel/src/libvirt/-/merge_requests/4

Comment 2 Hongzhou Liu 2022-01-25 13:01:35 UTC
This bug can also be reproduced on x86_64:

Packages:
virt-install-3.2.0-12.el9.noarch
libvirt-8.0.0-1.el9.x86_64

Step 1:

Install a vm using virt-install
# virt-install --name=rhel8.6-1 --memory=4096 --vcpus=2   --location http://download.eng.pek2.redhat.com/released/rhel-6-7-8/rhel-8/RHEL-8/8.4.0/BaseOS/x86_64/os/

Step 2:

Click reboot system to finish install

Result: The VM shuts off and the following message is returned:
(virt-viewer:254047): virt-viewer-WARNING **: 19:56:42.017: vnc-session: got vnc error Server closed the connection
ERROR    internal error: cannot parse process status data for pid '254002/0'
Domain installation does not appear to have been successful.
If it was, you can restart your domain by running:
  virsh --connect qemu:///system start rhel8.6-1
otherwise, please restart your installation.

Step 3:

Check the status for vm
# virsh domstate rhel8.6-1 
shut off

Step 4:

Start the vm and connect with virt-viewer

Result: The display is correct and the installation has finished.

Comment: When I add the option --graphics type=vnc, the VM reboots successfully.

Comment 3 Hongzhou Liu 2022-01-26 02:56:43 UTC
Pre-verified with libvirt-8.0.0-2.el9.x86_64


Step 1:

Install a vm using virt-install
# virt-install --name=rhel8.6-1 --memory=4096 --vcpus=2   --location http://download.eng.pek2.redhat.com/released/rhel-6-7-8/rhel-8/RHEL-8/8.4.0/BaseOS/x86_64/os/

Step 2:

Click reboot system to finish install

Result: The VM reboots successfully.

Step 3:
Check the status for vm
# virsh domstate rhel8.6-1 
running

Based on this result, I changed the verified status to tested. Thanks.

Comment 6 Hongzhou Liu 2022-01-28 02:43:16 UTC
Verified this bug with libvirt-8.0.0-2.el9.x86_64


Step 1:

Install a vm using virt-install
# virt-install --name=rhel8.6-1 --memory=4096 --vcpus=2   --location http://download.eng.pek2.redhat.com/released/rhel-6-7-8/rhel-8/RHEL-8/8.4.0/BaseOS/x86_64/os/

Step 2:

Click reboot system to finish install

Result: The VM reboots successfully. The display is correct.

Step 3:
Check the status for vm
# virsh domstate rhel8.6-1 
running

Based on this result, I changed the verified status to tested. Thanks.

Comment 9 errata-xmlrpc 2022-05-17 12:46:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: libvirt), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2390