Bug 1238615 - virsh fails randomly with "Domain not found: xenUnifiedDomainLookupByName"
Summary: virsh fails randomly with "Domain not found: xenUnifiedDomainLookupByName"
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 5.9
Assignee: Daniel Berrangé
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
Reported: 2015-07-02 09:42 UTC by tingting zheng
Modified: 2017-04-18 21:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-18 21:53:20 UTC
Target Upstream Version:
Embargoed:


Attachments
virsh command log with debugging (30.96 KB, text/plain)
2015-09-10 21:50 UTC, Richard W.M. Jones

Comment 6 Richard W.M. Jones 2015-09-10 21:50:38 UTC
Created attachment 1072371
virsh command log with debugging

Dan's log file.

Comment 7 Daniel Berrangé 2015-09-11 11:21:57 UTC
I've not figured out why this is broken, but I have narrowed it down to the Xen XM driver code. The guest is not running, so Xend and XenStore report no info and the name lookup moves on to the XM driver code. I can see the guest being initially loaded, but when it comes to doing the lookup it fails non-deterministically. I can't tell what's broken in the XM driver code, though.

Comment 8 Daniel Berrangé 2015-09-11 13:42:35 UTC
This was indeed a nasty issue, only apparent because this test machine has a hell of a lot of guests defined in /etc/xen and is quite slow at loading them.

https://www.redhat.com/archives/libvir-list/2015-September/msg00432.html

commit 4e7028a83d9932e89fb552b40221ecd844cbd690
Author: Daniel P. Berrange <berrange>
Date:   Fri Sep 11 14:15:50 2015 +0100

    xen: fix race in refresh of config cache

    The xenXMConfigCacheRefresh method scans /etc/xen and loads
    all config files it finds. It then scans its internal hash
    table and purges any (previously) loaded config files whose
    refresh timestamp does not match the timestamp recorded at
    the start of xenXMConfigCacheRefresh(). There is unfortunately
    a subtle flaw in this, because if loading the config files
    takes longer than 1 second, some of the config files will
    have a refresh timestamp that is 1 or more seconds different
    (newer) than is checked for. So we immediately purge a bunch
    of valid config files we just loaded.

    To avoid this flaw, we must pass the timestamp we record at
    the start of xenXMConfigCacheRefresh() into the
    xenXMConfigCacheAddFile() method, instead of letting the
    latter call time(NULL) again.

    Signed-off-by: Daniel P. Berrange <berrange redhat com>
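
The race described in the commit message can be reproduced with a small toy model. This is only a sketch in Python, not the libvirt C code: the ConfigCache class, its method names, and the slow_clock helper are all hypothetical, and the clock is exaggerated so that it advances one second on every call, standing in for a slow scan of /etc/xen.

```python
import itertools


class ConfigCache:
    """Toy model of a config cache keyed by file name (not libvirt code)."""

    def __init__(self, clock):
        self.clock = clock    # callable returning whole seconds, like time(NULL)
        self.entries = {}     # file name -> refresh timestamp

    def add_file_buggy(self, name):
        # Reads the clock again for each file, so a slow scan
        # stamps later files with a newer time.
        self.entries[name] = self.clock()

    def add_file_fixed(self, name, now):
        # Uses the timestamp captured once at the start of the refresh.
        self.entries[name] = now

    def refresh(self, names, fixed):
        now = self.clock()
        for name in names:
            if fixed:
                self.add_file_fixed(name, now)
            else:
                self.add_file_buggy(name)
        # Purge every entry whose timestamp differs from the start time.
        self.entries = {n: t for n, t in self.entries.items() if t == now}
        return sorted(self.entries)


def slow_clock():
    """Hypothetical clock that advances one second on every call."""
    counter = itertools.count(100)
    return lambda: next(counter)


names = ["guest-a", "guest-b", "guest-c"]
buggy = ConfigCache(slow_clock()).refresh(names, fixed=False)
fixed = ConfigCache(slow_clock()).refresh(names, fixed=True)
print(buggy)  # every freshly loaded entry was purged
print(fixed)  # all three entries survive
```

In the buggy path every entry carries a timestamp newer than the one the purge step compares against, so a lookup by name right after the refresh finds nothing. Passing the start timestamp down keeps the comparison stable however long the scan takes.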

Comment 12 Richard W.M. Jones 2015-09-16 13:04:02 UTC
This bug would affect virt-v2v users on RHEL 7.2 who are
converting guests from RHEL 5 Xen legacy systems (to modern KVM).

It only affects slower systems with lots of Xen guests, and can
usually be worked around by repeating the v2v command.
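
The "repeat the command" workaround could be scripted roughly as follows. This is a hedged sketch, not part of virt-v2v or libvirt: run_with_retry, RACE_MARKER, and virsh_dominfo are hypothetical names, and using `virsh dominfo` as the command to retry is an assumption chosen for illustration.

```python
import subprocess

# Substring of the error message reported in this bug.
RACE_MARKER = "Domain not found"


def run_with_retry(run, attempts=3):
    """Call run() until it succeeds, fails for another reason, or attempts run out.

    run is any callable returning (exit_status, stderr_text).
    """
    status, stderr = run()
    for _ in range(attempts - 1):
        if status == 0 or RACE_MARKER not in stderr:
            break
        status, stderr = run()
    return status, stderr


def virsh_dominfo(name):
    """Build a runner for 'virsh dominfo NAME' (illustrative choice of command)."""
    def run():
        proc = subprocess.run(
            ["virsh", "dominfo", name], capture_output=True, text=True
        )
        return proc.returncode, proc.stderr
    return run
```

A caller would use something like `run_with_retry(virsh_dominfo("guest-a"))`. Only the "Domain not found" failure is retried, so unrelated errors still surface on the first attempt.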

I think we should wait to see if a customer hits this bug.  So
far we have only seen it on our own machines, and to my knowledge
no customer has hit it.

Comment 17 Chris Williams 2017-04-18 21:53:20 UTC
Red Hat Enterprise Linux 5 shipped its last minor release, 5.11, on September 14th, 2014. On March 31st, 2017, RHEL 5 exited Production Phase 3 and entered Extended Life Phase. For RHEL releases in the Extended Life Phase, Red Hat will provide limited ongoing technical support. No bug fixes, security fixes, hardware enablement or root-cause analysis will be available during this phase, and support will be provided on existing installations only. If the customer purchases the Extended Life-cycle Support (ELS), certain critical-impact security fixes and selected urgent priority bug fixes for the last minor release will be provided. For more details please consult the Red Hat Enterprise Linux Life Cycle Page:
https://access.redhat.com/support/policy/updates/errata

This BZ does not appear to meet ELS criteria so is being closed WONTFIX. If this BZ is critical for your environment and you have an Extended Life-cycle Support Add-on entitlement, please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification, and ask that the BZ be re-opened for consideration of an erratum. Please note, only certain critical-impact security fixes and selected urgent priority bug fixes for the last minor release can be considered.