Bug 1327010

Summary: libvirt shows guest state 'running' while the guest is still booting (and might never boot)
Product: Red Hat Enterprise Linux 7
Reporter: pagupta
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: unspecified
Version: 7.2
CC: jdenemar, mkletzan, rbalakri
Target Milestone: rc
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2016-04-14 06:44:30 UTC
Type: Bug

Description pagupta 2016-04-14 04:59:30 UTC
Description of problem:

If the emulator threads never get a chance to run, the guest will not boot, but libvirt still reports its state as 'running', and management layers show the same state.

There should be some sort of heartbeat mechanism between guest and host to confirm
that the guest is actually running. The status could be derived from flags that change in data already shared between guest and host (virtio, kvm-clock, etc.), treating those changes as a heartbeat.

The major problem here is that the management layer will always think the guest is running
even when it is not.

Version-Release number of selected component (if applicable):

libvirt-1.2.17-13.el7_2.4.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.7.x86_64
kernel-rt-kvm-3.10.0-306.0.1.rt56.179.el7.x86_64

Steps to Reproduce:
1. Isolate physical cores to run the vCPU threads.
2. Pin the vCPU threads to these isolated cores with RT priority.
3. Configure the emulator threads to run on the same cores as well.
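
For illustration only, steps 2 and 3 could be expressed through libvirt's <cputune> domain XML elements (vcpupin, emulatorpin, vcpusched). The report does not say how the pinning was actually configured, so this is a sketch under that assumption. The helper name build_cputune, the core numbers, and the priority value are hypothetical. Step 1 (isolating the cores) happens on the host and is not covered here.

```python
import xml.etree.ElementTree as ET


def build_cputune(vcpu_to_core, priority=1):
    """Build a <cputune> fragment that pins vCPUs and emulator threads to the same cores.

    vcpu_to_core maps a vCPU index to a host core (hypothetical values).
    """
    cputune = ET.Element("cputune")

    # Step 2: pin each vCPU thread to its own isolated core.
    for vcpu, core in sorted(vcpu_to_core.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(core))

    # Step 3: emulator threads share those same cores with the vCPU threads.
    cores = ",".join(str(c) for c in sorted(set(vcpu_to_core.values())))
    ET.SubElement(cputune, "emulatorpin", cpuset=cores)

    # Step 2 (RT priority): run the vCPU threads under the FIFO real-time scheduler.
    vcpus = ",".join(str(v) for v in sorted(vcpu_to_core))
    ET.SubElement(cputune, "vcpusched", vcpus=vcpus,
                  scheduler="fifo", priority=str(priority))

    return ET.tostring(cputune, encoding="unicode")


# Hypothetical layout: vCPU 0 on core 2, vCPU 1 on core 3.
print(build_cputune({0: 2, 1: 3}))
```

With such a layout the emulator threads, which run at normal priority, compete for the same cores as the real-time vCPU threads and may never be scheduled.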

Actual results:
1. virsh list shows the guest as 'running'.

Expected results:
1] virsh list should show:
  a] 'guest is booting' while the guest is still booting.
  b] 'guest is hung' if the guest remains in booting mode for a long time.
  c] An error should also be logged.
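
The requested behaviour can be sketched as a small state function driven by heartbeat timestamps. This is only an illustration of the proposal, not existing libvirt functionality. Where the heartbeat comes from is left open, and the function name classify_guest, the state strings, and both timeout values are assumptions.

```python
import logging

log = logging.getLogger("guest-state")


def classify_guest(started_at, last_heartbeat_at, now,
                   boot_timeout=120.0, heartbeat_timeout=30.0):
    """Map heartbeat timestamps (in seconds) to 'booting', 'running' or 'hung'.

    last_heartbeat_at is None until the first heartbeat has been seen.
    """
    if last_heartbeat_at is None:
        # No heartbeat yet: still booting unless the boot deadline has passed.
        if now - started_at <= boot_timeout:
            return "booting"
        log.error("guest sent no heartbeat within %.0f s of start", boot_timeout)
        return "hung"

    # At least one heartbeat was seen: check that it is recent enough.
    if now - last_heartbeat_at <= heartbeat_timeout:
        return "running"
    log.error("guest heartbeat is stale by %.0f s", now - last_heartbeat_at)
    return "hung"


print(classify_guest(started_at=0.0, last_heartbeat_at=None, now=10.0))    # booting
print(classify_guest(started_at=0.0, last_heartbeat_at=None, now=500.0))   # hung
print(classify_guest(started_at=0.0, last_heartbeat_at=495.0, now=500.0))  # running
```

Any such scheme depends on the guest cooperating with the heartbeat, so the timeouts would have to be tuned per guest.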

Additional info:

Comment 2 Jiri Denemark 2016-04-14 06:44:30 UTC
First, the reproducer steps rely on a bug; once that is fixed, this issue will no longer be reproducible. Even if it were, there is no generic way of checking whether a guest is already running or not without detailed knowledge about the guest itself (which libvirt doesn't have). If a management application wants to list domains as running only when the guest OS is up and running, it needs to incorporate its own techniques to check that. From libvirt's point of view, a domain is perfectly running even if it's just showing a boot menu, for example.
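
As one illustration of such a management-side check (not something libvirt provides), a management application that knows a service the guest OS is expected to start could probe it over TCP. The function name guest_service_reachable, the address, and the port below are hypothetical, and a successful connection only shows that this one service answers, not that the whole guest is healthy.

```python
import socket


def guest_service_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the guest OS as not (yet) up.
        return False


# Hypothetical usage: treat the guest as "up" once its SSH port answers.
#   guest_service_reachable("192.0.2.10", 22)
```

This relies on exactly the guest-specific knowledge (which service, which address) that libvirt itself does not have.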

Comment 3 Martin Kletzander 2016-04-19 06:22:34 UTC
I agree here that the problem is with the pinning configuration. Also, libvirt cannot provide a universal heartbeat mechanism that could reliably tell whether the guest has booted or not. That's one of the reasons for not differentiating 'running' from 'booting'.

Comment 4 pagupta 2016-04-19 07:37:24 UTC
Hi Martin, Jiri,

I am not worried about the cause of this issue. What worries me more is that the
guest is not booting, and how will our customer know what the guest is doing?

If we show the guest as 'running', anyone's first impression will be that the guest is running fine, unless an expert looks into it or tries to debug it.

I understand this scenario is specific to certain configurations, but I am looking at the bigger picture: could we think about designing a better solution that provides
better guest stats? That would definitely improve the customer experience.

Best regards,
Pankaj