Bug 1404627

Summary:

OpenStack instances fail to boot with qemu-kvm-rhev 2.6.0 when virt_type=qemu and cpu_mode=host-model in nova.conf

Product:

Red Hat Enterprise Linux 7

Reporter:

Javier Peña <jpena>

Component:

libvirt

Assignee:

Jiri Denemark <jdenemar>

Status:

CLOSED ERRATA

QA Contact:

chhu

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

7.3

CC:

apevec, chayang, dmsimard, dyuan, ehabkost, jdenemar, jpena, juzhang, knoel, lhuang, michen, mtessun, pbonzini, qzhang, rbalakri, sgordon, tshefi, virt-maint, xuzhang, zhang.lei.fly

Target Milestone:

Keywords:

Upstream

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

libvirt-3.2.0-1.el7

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2017-08-01 17:19:14 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Libvirt logs from cirros (002.log) and centos (004.log) instances	none

Description Javier Peña 2016-12-14 09:57:44 UTC

Created attachment 1231568 [details]
Libvirt logs from cirros (*002.log) and centos (*004.log) instances

Description of problem:
Trying to boot an instance on an OpenStack environment set up using qemu emulation (not KVM) in a system running RHEL 7.3 results in a non-bootable instance.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-27.el7.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up RHEL 7.3 and the Red Hat OpenStack repos on a VM
2. packstack --allinone
3. Boot a VM using the Cirros image

Actual results:
The instance gets stuck: the console shows "Starting up ..." and never finishes. A CentOS 7 image manages to get to the bootloader and boot the kernel, which then hangs with the message "Probing EDD (edd=off to disable)... ok"

Expected results:
Instances boot successfully.

Additional info:
This won't be an issue for most users. However, for the RDO and upstream OpenStack CIs it is a real problem, because our CI jobs run on public clouds without nested virtualization.

Setting cpu_mode=none in nova.conf works as a workaround; however, this is not the default value for Nova.

Attached libvirt logs from two failed instances.
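
For anyone scripting the cpu_mode=none workaround in a CI job, here is a minimal sketch. It assumes that virt_type and cpu_mode live in the [libvirt] section of nova.conf; the function name and the sample config are made up for illustration. It only rewrites the config text, so the nova compute service still has to be restarted afterwards.

```python
import configparser
import io

# Hypothetical nova.conf fragment used only to exercise the sketch.
SAMPLE = """\
[DEFAULT]
debug = true

[libvirt]
virt_type = qemu
cpu_mode = host-model
"""


def apply_tcg_workaround(conf_text: str) -> str:
    """Return nova.conf text with cpu_mode=none if [libvirt] virt_type is qemu."""
    # strict=False tolerates repeated options; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Only touch the setting when emulation (TCG) is explicitly requested.
    if parser.get("libvirt", "virt_type", fallback=None) == "qemu":
        parser.set("libvirt", "cpu_mode", "none")
    out = io.StringIO()
    parser.write(out)  # note: comments from the input are not preserved
    return out.getvalue()


if __name__ == "__main__":
    print(apply_tcg_workaround(SAMPLE))
```

Because configparser drops comments when writing, this is better suited to throwaway CI hosts than to hand-maintained configuration files.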

Comment 3 Eduardo Habkost 2016-12-16 16:58:42 UTC

Using "host-model" with TCG is very likely to cause problems. QEMU is being started with -cpu SandyBridge,+osxsave,+hypervisor,+xsaveopt but TCG doesn't support most of the SandyBridge features. Does it work if using the default (qemu64) CPU model?

I don't see any valid reason to use cpu="host-model" with TCG, as TCG CPU features don't depend on the host CPU model at all.

That said, I don't know what should be the right solution. I don't know if there's an existing libvirt interface that you can use to find out that "host-model" is not reliable in the existing host, or if libvirt can change the behavior of "host-model" when in TCG mode.
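
To check what a given instance was actually started with, the -cpu value from the libvirt log can be split into the model name and the requested feature flags. This is a small sketch, not a full parser: it handles only the "+feature" / "-feature" form shown above, and the function name is made up for illustration.

```python
def parse_cpu_option(value: str):
    """Split a QEMU -cpu value into (model, enabled_flags, disabled_flags)."""
    model, *flags = value.split(",")
    # Only the +flag / -flag spelling is handled; other spellings are ignored.
    enabled = [f[1:] for f in flags if f.startswith("+")]
    disabled = [f[1:] for f in flags if f.startswith("-")]
    return model, enabled, disabled


if __name__ == "__main__":
    model, enabled, disabled = parse_cpu_option(
        "SandyBridge,+osxsave,+hypervisor,+xsaveopt"
    )
    print(model)     # SandyBridge
    print(enabled)   # ['osxsave', 'hypervisor', 'xsaveopt']
    print(disabled)  # []
```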

Comment 4 Jiri Denemark 2016-12-16 19:52:46 UTC

(In reply to Eduardo Habkost from comment #3)
> Using "host-model" with TCG is very likely to cause problems. QEMU is being
> started with -cpu SandyBridge,+osxsave,+hypervisor,+xsaveopt but TCG doesn't
> support most of the SandyBridge features. Does it work if using the default
> (qemu64) CPU model?
> 
> I don't see any valid reason to use cpu="host-model" with TCG, as TCG CPU
> features doesn't depend on the host CPU model at all. 
> 
> That said, I don't know what should be the right solution. I don't know if
> there's an existing libvirt interface that you can use to find out that
> "host-model" is not reliable in the existing host

Partially yes, but it's only upstream anyway.

> if libvirt can change the behavior of "host-model" when in TCG mode.

I was thinking of changing it to just use the best model runnable in TCG, which would be Westmere, but I'm not sure it's a good idea.

In any case, is qemu-kvm in RHEL actually supported to be used in TCG mode?

Comment 5 Karen Noel 2016-12-16 21:04:09 UTC

(In reply to Jiri Denemark from comment #4)
> (In reply to Eduardo Habkost from comment #3)
> > Using "host-model" with TCG is very likely to cause problems. QEMU is being
> > started with -cpu SandyBridge,+osxsave,+hypervisor,+xsaveopt but TCG doesn't
> > support most of the SandyBridge features. Does it work if using the default
> > (qemu64) CPU model?
> > 
> > I don't see any valid reason to use cpu="host-model" with TCG, as TCG CPU
> > features doesn't depend on the host CPU model at all. 
> > 
> > That said, I don't know what should be the right solution. I don't know if
> > there's an existing libvirt interface that you can use to find out that
> > "host-model" is not reliable in the existing host
> 
> Partially yes, but it's only upstream anyway.
> 
> > if libvirt can change the behavior of "host-model" when in TCG mode.
> 
> I was thinking of changing it to just use the best model runnable in TCG,
> which would be Westmere, but I'm not sure it's a good idea.
> 
> In any case, is qemu-kvm in RHEL actually supported to be used in TCG mode?

No, TCG is not supported in RHEL or other Red Hat products. 

That said, if people are using TCG for useful purposes in the community, we should fix issues that help them. Especially if the change is relatively simple and low risk. Thanks.

Comment 7 Jiri Denemark 2016-12-16 21:28:07 UTC

(In reply to Karen Noel from comment #5)
> Especially if the change is relatively simple and low risk.

A proper fix for this will likely require the work we did for QEMU 2.8.0 and libvirt 2.5.0 which doesn't really fall into the "simple and low risk" category.

Comment 9 Stephen Gordon 2016-12-17 02:07:38 UTC

(In reply to Karen Noel from comment #5)
> (In reply to Jiri Denemark from comment #4)
> > (In reply to Eduardo Habkost from comment #3)
> > > Using "host-model" with TCG is very likely to cause problems. QEMU is being
> > > started with -cpu SandyBridge,+osxsave,+hypervisor,+xsaveopt but TCG doesn't
> > > support most of the SandyBridge features. Does it work if using the default
> > > (qemu64) CPU model?
> > > 
> > > I don't see any valid reason to use cpu="host-model" with TCG, as TCG CPU
> > > features doesn't depend on the host CPU model at all. 
> > > 
> > > That said, I don't know what should be the right solution. I don't know if
> > > there's an existing libvirt interface that you can use to find out that
> > > "host-model" is not reliable in the existing host
> > 
> > Partially yes, but it's only upstream anyway.
> > 
> > > if libvirt can change the behavior of "host-model" when in TCG mode.
> > 
> > I was thinking of changing it to just use the best model runnable in TCG,
> > which would be Westmere, but I'm not sure it's a good idea.
> > 
> > In any case, is qemu-kvm in RHEL actually supported to be used in TCG mode?
> 
> No, TCG is not supported in RHEL or other Red Hat products. 

Do we have a release note or knowledge base article calling that out?

> That said, if people are using TCG for useful purposes in the community, we
> should fix issues that help them. Especially if the change is relatively
> simple and low risk. Thanks.

Similarly, does upstream documentation exist calling out that using host-model with TCG can't be expected to work (even though it has until recently...)?

Comment 10 Stephen Gordon 2016-12-17 02:20:15 UTC

(In reply to Jiri Denemark from comment #4)
> (In reply to Eduardo Habkost from comment #3)
> > Using "host-model" with TCG is very likely to cause problems. QEMU is being
> > started with -cpu SandyBridge,+osxsave,+hypervisor,+xsaveopt but TCG doesn't
> > support most of the SandyBridge features. Does it work if using the default
> > (qemu64) CPU model?
> > 
> > I don't see any valid reason to use cpu="host-model" with TCG, as TCG CPU
> > features doesn't depend on the host CPU model at all. 

My understanding of the way this came to be is that they are trying to use configurations as close as possible to what a "real" user would use (host-model is the default in Nova), but nested virtualization isn't available in the clouds used for OpenStack CI so they have to go without KVM.

This only arose after qemu-kvm-ev 2.6.0 was released to CentOS Virt SIG testing because RHOSP 10 testing occurred on internal clouds that have nested virtualization enabled and physical bare-metal systems.

> > That said, I don't know what should be the right solution. I don't know if
> > there's an existing libvirt interface that you can use to find out that
> > "host-model" is not reliable in the existing host
> 
> Partially yes, but it's only upstream anyway.
> 
> > if libvirt can change the behavior of "host-model" when in TCG mode.
> 
> I was thinking of changing it to just use the best model runnable in TCG,
> which would be Westmere, but I'm not sure it's a good idea.

I actually think that would be pretty reasonable for the use cases we're talking about here as long as it's done in a way that is future proof enough if someone expands TCG support in the future upstream (for whatever reason). Basically, expose the best model that TCG can support.

All of that said, this would seem to be a lower priority than the work Jiri refers to in https://bugzilla.redhat.com/show_bug.cgi?id=1371617#c27 which would make handling of missing support for a flag more graceful in the KVM case - especially since we're going to hack out the host-model stuff in the CI systems when virt_type="qemu" in the meantime regardless.

Comment 13 Jiri Denemark 2016-12-20 20:44:00 UTC

CPU configuration as it currently works between libvirt and QEMU is not ideal,
and host-model is probably the worst part of it. Since there is no way of
asking QEMU what CPU features it can enable on the host CPU, libvirt just
checks the host CPU via CPUID and asks QEMU to enable all features it detected
on a CPU model that seems to be the best match for the host CPU. Naturally,
QEMU/KVM is not able to provide all the features it was asked for to a guest
and filters some of them out. While this creates a guest CPU which may
change when QEMU or KVM is upgraded, it mostly works fine because both QEMU
and KVM adopt new CPU features rather quickly.

However, when QEMU is run in TCG mode the virtual CPU instructions are
emulated by QEMU. Thus adding support for new features and instructions in TCG
mode is usually more complicated than in KVM mode and as a result of this (and
likely also because developers focus more on KVM) TCG just lacks most of the
new features of modern CPUs. So when libvirt asks for all the features it
detected in the host CPU, QEMU is going to filter a lot of them and mainly the
most recent ones. So it may easily happen that a guest is presented with a CPU
which looks like a modern CPU (according to family and model numbers), but
without a lot of features. In other words, a guest running in TCG mode on a
Skylake host CPU will see a CPU which claims to be Skylake but whose feature
set is similar to Westmere. So if the guest can't cope with such a strange
CPU, host-model will not work for it in TCG mode. It might have been working
with an older guest, older libvirt, or older QEMU because the guest didn't
know about the new CPU models and thus didn't want to use their features, or
simply because the virtual CPU looked different and claimed to be something
older.
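
The mismatch described above can be illustrated with a toy model. Everything here is a placeholder: the model name and feature names are invented and do not correspond to real CPUID flags or to anything QEMU reports. The point is only that the model name survives while the newer features are silently dropped.

```python
def filter_features(requested, supported):
    """Return (kept, dropped) feature lists after the emulator filters a request."""
    kept = sorted(f for f in requested if f in supported)
    dropped = sorted(f for f in requested if f not in supported)
    return kept, dropped


if __name__ == "__main__":
    # Placeholder data, invented for illustration only.
    host_model = "ModernCPU"
    requested = {"feat-old-1", "feat-old-2", "feat-new-1", "feat-new-2"}
    tcg_supported = {"feat-old-1", "feat-old-2"}

    kept, dropped = filter_features(requested, tcg_supported)
    # The guest still sees the modern model name, but only the old features.
    print(f"guest CPU claims to be {host_model}")
    print("features actually provided:", kept)
    print("features silently dropped:", dropped)
```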

And since TCG has never been supported on RHEL, it doesn't get a lot of
testing, not to mention that the results differ with host CPU, libvirt, QEMU,
and guest, which makes the problem space pretty big.

That said, we're working on fixing these issues upstream: QEMU is getting a new
interface for querying the host's CPU, libvirt is being modified to use this so
that it doesn't ask for something QEMU can't virtualize, etc.

Another good point Eduardo mentioned is the strangeness of using host-model in
TCG mode. When the guest and host architectures do not match, host-model does
not even make logical sense. But even if the architectures match, host-model
is a bit strange since the virtual CPU is completely emulated anyway. However,
it's not nonsense, libvirt never refused such a configuration, and it may
even work sometimes, so we should just make it work reliably. And I think a
good way of doing so is choosing one of the CPU models supported by QEMU in
TCG mode.

So fixing host-model in KVM and TCG mode are two separate things. The fix for
KVM is a bit more complicated since even QEMU 2.8.0 is not new enough for
that, but TCG mode can be fixed in libvirt with QEMU 2.8.0 (or newer) and I'll
work on the patches for libvirt 3.0.0. Unfortunately QEMU 2.8.0 is the first
one which is able to tell what CPU models are usable in TCG mode so this fix
will not work with anything older than that.
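
As a rough sketch of how "which CPU models are usable" could be read from QEMU: this assumes the QMP query-cpu-definitions reply carries an "unavailable-features" list per model (which is my understanding of what QEMU 2.8.0 added) and treats an empty list as "usable". The sample reply below is made up for illustration and was not captured from a real QEMU.

```python
import json

# Made-up reply, shaped like a QMP query-cpu-definitions result.
SAMPLE_REPLY = json.dumps({
    "return": [
        {"name": "qemu64", "unavailable-features": []},
        {"name": "Westmere", "unavailable-features": []},
        {"name": "SandyBridge", "unavailable-features": ["avx", "xsaveopt"]},
        {"name": "Legacy"},  # no list at all: usability is unknown
    ]
})


def usable_models(reply_text: str):
    """Return names of CPU models whose unavailable-features list is empty."""
    reply = json.loads(reply_text)
    usable = []
    for cpu in reply["return"]:
        # A missing key means "unknown", so only an explicit empty list counts.
        if cpu.get("unavailable-features") == []:
            usable.append(cpu["name"])
    return usable


if __name__ == "__main__":
    print(usable_models(SAMPLE_REPLY))  # ['qemu64', 'Westmere']
```

Picking the "best" entry from that list still needs some ordering of models, which this sketch deliberately leaves out.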

Comment 15 Paolo Bonzini 2016-12-23 15:02:51 UTC

Please open a bug in the upstream QEMU bug tracker (https://bugs.launchpad.net/qemu/) including where to download the cirros guest image and the kernel version of the image.  Also include the QEMU command line.

Please write the bug as if openstack didn't even exist :) as that would not provide any useful hints to the upstream developers.  Thanks!

Comment 16 Javier Peña 2016-12-23 15:27:32 UTC

Done: https://bugs.launchpad.net/qemu/+bug/1652333

Although using a Cirros image will surely count as a hint in itself :)

Comment 17 Jiri Denemark 2017-03-03 19:28:09 UTC

This should be finally fixed by (in combination with QEMU 2.9.0):

commit 2a586b4402a7637e0bef9a2876d065c0ce6bfef1
Refs: v3.1.0-9-g2a586b440
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:10:22 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemucapstest: Update test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar>

commit 0bde051f3de02b1be25ea4a4d9f062abfa3d1397
Refs: v3.1.0-10-g0bde051f3
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:10:49 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    domaincapstest: Add test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar>

commit d2f8f3052d48f284d56e27c98ce7a2ce6c656e59
Refs: v3.1.0-11-gd2f8f3052
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 15 10:18:53 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    docs: Update description of the host-model CPU mode

    Signed-off-by: Jiri Denemark <jdenemar>

commit 4c0723a1d75b981e8939c4c5b6bde7607fc7301e
Refs: v3.1.0-12-g4c0723a1d
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:30:13 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Rename hostCPU/feature element in capabilities cache

    The element will be generalized in the following commits.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 03a34f6b84da009291e8651aba71df8a6761d081
Refs: v3.1.0-13-g03a34f6b8
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 15:46:47 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Prepare for more types in qemuMonitorCPUModelInfo

    Signed-off-by: Jiri Denemark <jdenemar>

commit 2fc215dd2ad4b88c1054da804c4c45b3d4e5c2fa
Refs: v3.1.0-14-g2fc215dd2
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 16:01:30 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Store more types in qemuMonitorCPUModelInfo

    While query-cpu-model-expansion returns only boolean features on s390,
    but x86_64 reports some integer and string properties which we are
    interested in.

    Signed-off-by: Jiri Denemark <jdenemar>

commit d7f054a512a911a386d9bbeec51379e4bb843ca5
Refs: v3.1.0-15-gd7f054a51
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 16:51:50 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Probe "max" CPU model in TCG

    Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0
    introduces a new "max" CPU model which can be used to ask QEMU what the
    best CPU it can provide to a TCG domain is.

    Signed-off-by: Jiri Denemark <jdenemar>

commit f0138289920d5204c1654bc9b17115d1a315d62e
Refs: v3.1.0-16-gf01382899
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Jan 11 14:36:34 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Get host CPU model from QEMU on x86_64

    Until now host-model CPU mode tried to enable all CPU features supported
    by the host CPU even if QEMU/KVM did not support them. This caused a
    number of issues and made host-model quite unreliable. Asking QEMU for
    the CPU it can provide and the current host makes host-model much more
    robust.

    This commit fixes the following bugs:

        https://bugzilla.redhat.com/show_bug.cgi?id=1018251
        https://bugzilla.redhat.com/show_bug.cgi?id=1371617
        https://bugzilla.redhat.com/show_bug.cgi?id=1372581
        https://bugzilla.redhat.com/show_bug.cgi?id=1404627
        https://bugzilla.redhat.com/show_bug.cgi?id=870071

    In addition to that, the following bug should be mostly limited to cases
    when an unsupported feature is explicitly requested:

        https://bugzilla.redhat.com/show_bug.cgi?id=1335534

    Signed-off-by: Jiri Denemark <jdenemar>

commit be3d59754b1a1da174ff1796882a0ceb35e198e8
Refs: v3.1.0-17-gbe3d59754
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Jan 31 13:44:00 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use enum for CPU model expansion type

    Signed-off-by: Jiri Denemark <jdenemar>

commit bb3363c90b5b19c37f8e5b8f512eb00014d2dae4
Refs: v3.1.0-18-gbb3363c90
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Feb 23 13:53:51 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use full CPU model expansion on x86

    The static CPU model expansion is designed to return only canonical
    names of all CPU properties. To maintain backwards compatibility libvirt
    is stuck with different spelling of some of the features, but we need to
    use the full expansion to get the additional spellings. In addition to
    returning all spelling variants for all properties the full expansion
    will contain properties which are not guaranteed to be migration
    compatible. Thus, we need to combine both expansions. First we need to
    call the static expansion to limit the result to migratable properties.
    Then we can use the result of the static expansion as an input to the
    full expansion to get both canonical names and their aliases.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 2f882dbfa92c14d585a786a42d284b63ffdca4e3
Refs: v3.1.0-19-g2f882dbfa
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Feb 23 14:31:23 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Make virQEMUCapsInitCPUModel testable

    Signed-off-by: Jiri Denemark <jdenemar>

commit d065934cd07c01fbb29f25bbb223eb4ce126a90e
Refs: v3.1.0-20-gd065934cd
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 1 17:48:41 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Switch host CPU data scripts to model expansion

    Instantiating "host" CPU and querying it using qom-get has been the only
    way of probing host CPU via QEMU until 2.9.0 implemented
    query-cpu-model-expansion for x86_64. Even though libvirt never really
    used the old way its result can be easily converted into the one
    produced by query-cpu-model-expansion. Thus we can reuse the original
    test data and possible get new data from hosts where QEMU does not
    support the new QMP command.

    Signed-off-by: Jiri Denemark <jdenemar>

commit d46a1aa4d8caafe977cc41a80ef86af1d10e60b7
Refs: v3.1.0-21-gd46a1aa4d
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 14:59:42 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Convert all json data files to query-cpu-model-expansion

    Converted by running the following command, renaming the files as
    *.new, and committing only the *.new files.

        (cd tests/cputestdata; ./cpu-convert.py *.json)

    Signed-off-by: Jiri Denemark <jdenemar>

commit a19696b5924e7512dcca4f30d15147036708389e
Refs: v3.1.0-22-ga19696b59
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 10:33:52 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Test virQEMUCapsInitCPUModel

    The original test didn't use family/model numbers to make better
    decisions about the CPU model and thus mis-detected the model in the two
    cases which are modified in this commit. The detected CPU models now
    match those obtained from raw CPUID data.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 5e4fc2ef993343643587f2b079b63f2c9f038e6f
Refs: v3.1.0-23-g5e4fc2ef9
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 15:04:38 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop obsolete CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar>

commit 8907204cd83f0ca29c48d19bbf2778132d8578a2
Refs: v3.1.0-24-g8907204cd
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 15:06:35 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop .new suffix from CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar>
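
For readers who want to see the probing sequence from the commits above in one place, here is a sketch that only builds the QMP messages; it does not talk to QEMU. It assumes the usual QMP envelope ("execute" plus "arguments") for query-cpu-model-expansion. The helper names are made up, and the model passed into the second step is a hypothetical stand-in for whatever the static expansion returns.

```python
import json


def probe_model_name(virt_type: str) -> str:
    """'host' only makes sense for KVM; TCG is probed with the 'max' model."""
    return "host" if virt_type == "kvm" else "max"


def expansion_command(expansion_type: str, model: dict) -> dict:
    """Build a query-cpu-model-expansion QMP command (not sent anywhere)."""
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": model},
    }


def plan_expansion(virt_type: str, static_result_model: dict):
    """Static expansion first (migratable properties), then full expansion
    of that result to pick up the alternative property spellings."""
    first = expansion_command("static", {"name": probe_model_name(virt_type)})
    second = expansion_command("full", static_result_model)
    return first, second


if __name__ == "__main__":
    # Hypothetical stand-in for the model returned by the static step.
    static_model = {"name": "base", "props": {"sse2": True}}
    for cmd in plan_expansion("qemu", static_model):
        print(json.dumps(cmd))
```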

Comment 18 Tzach Shefi 2017-03-29 10:26:40 UTC

If it's of any help, I also hit this
On a nested OPS9 (-p 2017-03-28.1 )
RHEL7.3 

[root@compute-0 ~]# rpm -qa | grep qemu-kvm
qemu-kvm-common-rhev-2.6.0-28.el7_3.6.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64

This bit from nova compute logs helped me reach this bug

   if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
ERROR nova.compute.manager [instance: 092bdc0e-742e-4324-8b2f-2def3c376a14] libvirtError: internal error: client socket is closed

The suggested workaround helped:
#cpu_mode=host-model  <- was set, changed to none
cpu_mode=none

After restarting the nova compute service, the instance booted up fine.

Comment 20 chhu 2017-06-16 02:16:40 UTC

Reproduced on packages:
libvirt-client-2.0.0-10.el7_3.9.x86_64
qemu-kvm-1.5.3-126.el7.x86_64

Steps:
1. Edit the /etc/nova/nova.conf, and restart OpenStack services.
virt_type=qemu
cpu_mode=host-model
#service openstack-nova-api restart
#service openstack-nova-compute restart

Then, try to start a VM with the cirros image. The VM hangs.
SeaBIOS....
iPXE ...PCI2.10 PnP PMM ...

2. Change the cpu_mode=none
#service openstack-nova-api restart
#service openstack-nova-compute restart
The VM still hangs.

3. Change back:
virt_type=kvm
cpu_mode=host-model
#service openstack-nova-api restart
#service openstack-nova-compute restart

The VM starts successfully.

Comment 21 chhu 2017-06-16 02:38:55 UTC

Reproduced on packages:
libvirt-client-2.0.0-10.el7_3.9.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64

Steps:
1. Edit the /etc/nova/nova.conf, and restart OpenStack services.
virt_type=qemu
cpu_mode=host-model
#service openstack-nova-api restart
#service openstack-nova-compute restart

Then, try to start a VM with the cirros image. The VM hangs.
-----------------------------------------------------
Starting up....

-----------------------------------------------------

2. Change the cpu_mode=none
#service openstack-nova-api restart
#service openstack-nova-compute restart
The VM starts successfully.

Comment 22 chhu 2017-06-16 03:20:17 UTC

Tried to verify with packages:
libvirt-client-3.2.0-10.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64

Steps:
1. Edit the /etc/nova/nova.conf, and restart OpenStack services.
virt_type=qemu
cpu_mode=host-model
#service openstack-nova-api restart
#service openstack-nova-compute restart

Then, try to start a VM with the cirros image. The VM hangs.
-----------------------------------------------------
Starting up....

-----------------------------------------------------

Comment 23 chhu 2017-06-16 03:25:39 UTC

Verified on RHOS11 + RHEL7.3.

with packages:
libvirt-client-3.2.0-10.el7.x86_64
qemu-kvm-rhev-2.9.0-10.el7.x86_64

Steps:
1. Edit the /etc/nova/nova.conf, and restart OpenStack services.
virt_type=qemu
cpu_mode=host-model
#service openstack-nova-api restart
#service openstack-nova-compute restart

Then, try to start a VM with the cirros image.
The VM starts successfully. No hang.

2. Edit the /etc/nova/nova.conf, and restart OpenStack services.
virt_type=kvm
cpu_mode=host-model

#service openstack-nova-api restart
#service openstack-nova-compute restart

Then, try to start a VM with the cirros image.
The VM starts successfully. No hang.
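
The verification results in comments 21 to 23 can be turned into a simple pre-flight check. This is a sketch under the assumption that the thresholds are exactly the ones named in this report (libvirt 3.2.0 from "Fixed In Version" and QEMU 2.9.0 from comment 17); it is not an authoritative compatibility matrix, and the function names are made up.

```python
MIN_LIBVIRT = (3, 2, 0)  # from "Fixed In Version: libvirt-3.2.0-1.el7"
MIN_QEMU = (2, 9, 0)     # from comment 17: "in combination with QEMU 2.9.0"


def parse_version(text: str):
    """Turn an upstream version like '2.9.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def host_model_tcg_fix_present(libvirt_version: str, qemu_version: str) -> bool:
    """True only if both components are new enough for the host-model fix."""
    return (
        parse_version(libvirt_version) >= MIN_LIBVIRT
        and parse_version(qemu_version) >= MIN_QEMU
    )


if __name__ == "__main__":
    # Version pairs taken from comments 21, 22 and 23.
    for libvirt, qemu in [("2.0.0", "2.6.0"), ("3.2.0", "2.6.0"), ("3.2.0", "2.9.0")]:
        print(libvirt, qemu, host_model_tcg_fix_present(libvirt, qemu))
```

Only the last pair reports the fix as present, which matches the outcome of the manual tests above.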

Comment 24 errata-xmlrpc 2017-08-01 17:19:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
