Red Hat Bugzilla – Bug 869361
UV: KVM virt-manager fails to launch on large memory systems (>8TB)
Last modified: 2013-10-15 05:49:21 EDT
Description of problem:
The current problem occurs when trying to launch virt-manager on a
large memory system. Smallest I've seen so far, was uv48-sys with
When running virt-manager you'll see the error:
libvirtError: Unable to encode message payload.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Does this bug have the wrong component?
What is "UV"?
What is the output of the following commands when run as root?
virsh list --all
UV = Ultra Violet -- SGI's x86_64 supercomputers.
There are actually two systems where this fails,
UV1000 w/ 2048cpus and 8TB of memory
UV2000 w/ 2048cpus and 16TB of memory
In both cases that is 2048 cores -- we are not currently
I will add the requested information soon.
I have the output from 'strace -o virsh-out -ff virsh capabilities'
if that is helpful.
# virsh list --all
Id Name State
# virsh capabilities
error: failed to get capabilities
error: Unable to encode message payload
error: Reconnected to the hypervisor
[root@uv48-sys ~]# topology
System type: UV100/1000
System name: uv48-sys
Serial number: UV-00000048
Partition number: 0
9084.85 GB Memory Total
128.00 GB Max Memory on any Node
1 BASE I/O Riser
2 Network Controllers
1 Storage Controller
8 USB Controllers
1 VGA GPU
This has the wrong component, which is why no one was looking at it.
From the info in comment #4 I'd guess it is probably not the amount of RAM that's the trigger, but rather the size of the NUMA topology, which makes the capabilities XML very large.
Please provide the version of the libvirt RPM that is installed when seeing this behaviour.
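To illustrate that guess, here is a minimal sketch of how a capabilities-like document grows with the NUMA topology rather than with RAM. The XML layout is a simplified stand-in for the real capabilities output, and the function name and cell/CPU counts are hypothetical, chosen only to contrast a small box with a 2048-CPU machine.

```python
import xml.etree.ElementTree as ET


def build_topology_xml(num_cells: int, cpus_per_cell: int) -> bytes:
    """Build a simplified, capabilities-like NUMA topology document.

    This is NOT the exact libvirt schema, only an approximation with
    one <cell> per NUMA node and one <cpu> element per logical CPU.
    """
    caps = ET.Element("capabilities")
    host = ET.SubElement(caps, "host")
    topology = ET.SubElement(host, "topology")
    cells = ET.SubElement(topology, "cells", num=str(num_cells))
    cpu_id = 0
    for cell_id in range(num_cells):
        cell = ET.SubElement(cells, "cell", id=str(cell_id))
        cpus = ET.SubElement(cell, "cpus", num=str(cpus_per_cell))
        for _ in range(cpus_per_cell):
            ET.SubElement(cpus, "cpu", id=str(cpu_id))
            cpu_id += 1
    return ET.tostring(caps, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical layouts: a small two-node box vs. 128 nodes x 16 CPUs.
    for num_cells, cpus_per_cell in [(2, 8), (128, 16)]:
        size = len(build_topology_xml(num_cells, cpus_per_cell))
        total = num_cells * cpus_per_cell
        print(f"{num_cells} cells, {total} CPUs: {size} bytes of XML")
```

The document size scales with the number of cells and CPUs, and the amount of memory does not appear in it at all, which is consistent with the topology being the trigger.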
George, I think this is the very same bug that we've chased a while ago. Let me find it.
Move to rhel6.5 tracker.
Can you please provide both server and client side debug logs, as well as the version requested in comment 7?
Michael, that info is in BZ 960683.
This BZ should get closed out as replaced by BZ 960683.
Sorry for the confusion.
*** This bug has been marked as a duplicate of bug 960683 ***
Hi Michal Privoznik,
I'm verifying this bug in the latest libvirt 6.5 build.
First, I need to reproduce this bug in the old build; if it can be reproduced, we then test the latest build to verify the fix. The problem is that we don't have a large machine with more than 8 TB of memory.
Since this bug is a duplicate of bug 960683, and that bug has an attachment that simulates huge CPU counts on small boxes, I applied that patch to the old build and tried to reproduce this bug. The result is:
This bug (869361) can't be reproduced with the simulation patch.
Bug 960683 can be reproduced with the simulation patch.
PS. Here is the link to the simulation patch: https://bugzilla.redhat.com/attachment.cgi?id=756168
Would you please give me some advice on how I can simulate an environment to reproduce this bug, or how I can verify this bug in the latest build? Thanks very much.
Well, I don't think this one needs to be reproduced. It is a duplicate. The original problem for this bug was encoding the NUMA topology into the capabilities XML. The encoded XML was too big for a libvirt packet. However, we've fixed that in the meantime, and now even huge XML can be sent through.
OK, I got it. Thanks for your quick reply.
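As a rough illustration of that failure mode, the sketch below encodes a payload as an XDR-style string (4-byte big-endian length, data, zero padding to a 4-byte boundary) and refuses anything over a size bound. libvirt's real RPC layer is written in C; the names `ASSUMED_STRING_MAX`, `EncodeError` and `encode_bounded_string`, and the 64 KiB value, are assumptions for illustration only, not libvirt's actual code or limits.

```python
import struct

# Hypothetical per-string bound in bytes, used only for this illustration.
ASSUMED_STRING_MAX = 65536


class EncodeError(Exception):
    """Raised when a payload does not fit within the string bound."""


def encode_bounded_string(payload: bytes,
                          max_len: int = ASSUMED_STRING_MAX) -> bytes:
    """Encode payload as an XDR-style string, enforcing a maximum length."""
    if len(payload) > max_len:
        # The sender gives up here, so nothing is put on the wire.
        raise EncodeError("Unable to encode message payload")
    padding = (-len(payload)) % 4  # pad the data to a multiple of 4 bytes
    return struct.pack(">I", len(payload)) + payload + b"\x00" * padding


if __name__ == "__main__":
    print(len(encode_bounded_string(b"<capabilities/>")))
    try:
        encode_bounded_string(b"x" * (ASSUMED_STRING_MAX + 1))
    except EncodeError as err:
        print("error:", err)
```

Under this model the fix amounts to raising the bound, so a larger document passes the same check unchanged.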