Bug 2041610

Summary: virt-install: "ERROR internal error: cannot parse process status data for pid" on guest reboot
Product: Red Hat Enterprise Linux 8
Reporter: Eric Auger <eric.auger>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Hongzhou Liu <hongzliu>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 8.6
CC: abologna, akoutsou, atodorov, drjones, gshan, hongzliu, jdenemar, jrusz, juzhou, laine, lcapitulino, lijin, mprivozn, pkotvan, pkrempa, rjones, tzheng, virt-maint
Target Milestone: rc
Keywords: Regression, TestBlocker, Triaged, Upstream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: libvirt-8.0.0-2.module+el8.6.0+14025+ca131e0a
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2043579
Environment:
Last Closed: 2022-05-10 13:25:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1929792, 1885765

Description Eric Auger 2022-01-17 21:06:34 UTC
When virt-install of an 8.6 aarch64 guest completes and the installer asks whether to reboot, replying 'yes' produces:

[  OK  ] Started Reboot.
[  OK  ] Reached target Reboot.
dracut Warning: Killing all remaining processes
Rebooting.
[ 7478.584400] reboot: Restarting system

ERROR    internal error: cannot parse process status data for pid '99994/0'

The VM is stopped.

It can then be started again manually, but the guest should reboot without any user involvement.

Comment 2 Peter Krempa 2022-01-18 08:16:27 UTC
Seems to be caused by:

commit 938382b60ae5bd1f83b5cb09e1ce68b9a88f679a
Author: Ani Sinha <ani>
Date:   Tue Jan 11 15:50:43 2022 +0530

    report error when virProcessGetStatInfo() is unable to parse data
    
    Currently virProcessGetStatInfo() always returns success and only logs error
    when it is unable to parse the data. Make this function actually report the
    error and return a negative value in this error scenario.
    
    Fix the callers so that they do not override the error generated.
    Also fix non-linux implementation of this function so as to report error.
    
    Signed-off-by: Ani Sinha <ani>
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Michal Privoznik <mprivozn>

[...]

diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index b559a4257e..85d8c8e747 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -1784,7 +1784,10 @@ virProcessGetStatInfo(unsigned long long *cpuTime,
         virStrToLong_ullp(proc_stat[VIR_PROCESS_STAT_STIME], NULL, 10, &systime) < 0 ||
         virStrToLong_l(proc_stat[VIR_PROCESS_STAT_RSS], NULL, 10, &rss) < 0 ||
         virStrToLong_i(proc_stat[VIR_PROCESS_STAT_PROCESSOR], NULL, 10, &cpu) < 0) {
-        VIR_WARN("cannot parse process status data");
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("cannot parse process status data for pid '%d/%d'"),
+                       (int) pid, (int) tid);
+        return -1;
     }

     /* We got jiffies
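
For illustration, here is a minimal sketch of the strict behaviour the patch above introduces: any field that fails to parse makes the whole call fail. It is not libvirt code. The names stat_fields and parse_stat_line are hypothetical, and the field positions (utime 14, stime 15, rss 24, processor 39) are taken from proc(5).

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four values the patched code parses. */
struct stat_fields {
    unsigned long long utime; /* field 14, clock ticks */
    unsigned long long stime; /* field 15, clock ticks */
    long rss;                 /* field 24, pages */
    int processor;            /* field 39 */
};

/* Parse one /proc/<pid>/stat line.
 * Returns 0 on success, -1 if the line is truncated or a field is malformed. */
int parse_stat_line(const char *line, struct stat_fields *out)
{
    /* comm (field 2) may contain spaces and ')', so resume after the last ')'. */
    const char *p = strrchr(line, ')');

    if (p == NULL)
        return -1;
    p++;

    /* The first token after comm is field 3 (state). */
    for (int field = 3; field <= 39; field++) {
        char *end;

        while (*p == ' ')
            p++;
        if (*p == '\0' || *p == '\n')
            return -1; /* line ended before field 39 */

        errno = 0;
        if (field == 14 || field == 15) {
            unsigned long long v = strtoull(p, &end, 10);
            if (end == p || errno != 0)
                return -1;
            if (field == 14)
                out->utime = v;
            else
                out->stime = v;
        } else if (field == 24 || field == 39) {
            long v = strtol(p, &end, 10);
            if (end == p || errno != 0)
                return -1;
            if (field == 24)
                out->rss = v;
            else
                out->processor = (int)v;
        } else {
            /* Not a field of interest: skip the token unparsed. */
            end = (char *)p + strcspn(p, " \n");
        }

        /* A parsed number must be followed by a separator or end of line. */
        if (*end != ' ' && *end != '\n' && *end != '\0')
            return -1;
        p = end;
    }
    return 0;
}
```

With this shape, a caller that propagates the -1 turns a stat line that is missing or cut short (for example because the process has just exited) into a hard error, which matches the symptom reported above.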

Comment 3 Michal Privoznik 2022-01-18 11:46:00 UTC
Ooops, revert proposed on the list:

https://listman.redhat.com/archives/libvir-list/2022-January/msg00778.html

Comment 5 Michal Privoznik 2022-01-20 16:54:49 UTC
Pushed into master as:

commit 105dace22cc7b5b18d72a4dcad4a2cf386ce5c99
Author:     Michal Prívozník <mprivozn>
AuthorDate: Tue Jan 18 12:40:09 2022 +0100
Commit:     Michal Prívozník <mprivozn>
CommitDate: Thu Jan 20 17:51:07 2022 +0100

    Revert "report error when virProcessGetStatInfo() is unable to parse data"
    
    This reverts commit 938382b60ae5bd1f83b5cb09e1ce68b9a88f679a.
    
    Turns out, the commit did more harm than good. It changed
    semantics on some public APIs. For instance, while
    qemuDomainGetInfo() previously did not returned an error it does
    now. While the calls to virProcessGetStatInfo() is guarded with
    virDomainObjIsActive() it doesn't necessarily mean that QEMU's
    PID is still alive. QEMU might be gone but we just haven't
    realized it (e.g. because the eof handler thread is waiting for a
    job).
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2041610
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Andrea Bolognani <abologna>

v8.0.0-129-g105dace22c
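
To illustrate the semantics the revert restores, here is a rough sketch of a tolerant reader: if the process is already gone or its stat line cannot be parsed, it warns, reports zero CPU time, and still returns success. This is not the libvirt implementation. The function name get_cpu_ticks_tolerant is hypothetical, and the layout of /proc/<pid>/stat is assumed from proc(5).

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: sum of utime and stime (fields 14 and 15) in clock ticks.
 * Always returns 0; on any read or parse failure *ticks is left at 0 and a
 * warning is printed, so a vanished PID does not become a caller-visible error. */
int get_cpu_ticks_tolerant(pid_t pid, unsigned long long *ticks)
{
    char path[64];
    char buf[1024];
    unsigned long long utime = 0, stime = 0;
    const char *p;
    FILE *fp;

    *ticks = 0;
    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);

    fp = fopen(path, "r");
    if (fp == NULL) {
        /* Typical when the process exited between the liveness check and here. */
        fprintf(stderr, "warning: cannot read %s, reporting zero CPU time\n", path);
        return 0;
    }
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        fprintf(stderr, "warning: empty %s, reporting zero CPU time\n", path);
        return 0;
    }
    fclose(fp);

    /* Skip past comm, then skip fields 3..13 and read fields 14 and 15. */
    p = strrchr(buf, ')');
    if (p == NULL ||
        sscanf(p + 1, "%*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %llu %llu",
               &utime, &stime) != 2) {
        fprintf(stderr, "warning: cannot parse %s, reporting zero CPU time\n", path);
        return 0;
    }

    *ticks = utime + stime;
    return 0;
}
```

Calling it with a PID that no longer exists yields a return value of 0 and *ticks == 0, so an info query issued while the guest is rebooting does not fail merely because QEMU's PID has disappeared.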

Comment 10 Alexander Todorov 2022-01-25 08:13:11 UTC
We're seeing this on x86_64 as well and it is a test blocker for osbuild-composer:
https://gitlab.com/osbuild/ci/osbuild-composer/-/jobs/2011663415

Comment 18 Hongzhou Liu 2022-01-28 02:30:45 UTC
Verified this bug with:

libvirt-8.0.0-2.module+el8.6.0+14025+ca131e0a

Step 1:

Install a VM using virt-install:
# virt-install --name=rhel8.6-1 --memory=4096 --vcpus=2   --location http://download.eng.pek2.redhat.com/released/rhel-6-7-8/rhel-8/RHEL-8/8.4.0/BaseOS/x86_64/os/

Step 2:

Click "Reboot System" to finish the installation.

Result: the VM reboots successfully.

Step 3:
Check the state of the VM:
# virsh domstate rhel8.6-1 
running

Based on this result, I am changing the verified status to tested. Thanks.

Comment 22 errata-xmlrpc 2022-05-10 13:25:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759