Bug 819284 - RHEL5 Xen: hypervisor conring_size too small when loglevel=all, missing logs
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Xen Maintainance List
QA Contact: Virtualization Bugs
Reported: 2012-05-06 05:29 EDT by Pasi Karkkainen
Modified: 2012-05-09 02:52 EDT
Doc Type: Bug Fix
Last Closed: 2012-05-09 02:52:54 EDT
Type: Bug
Description Pasi Karkkainen 2012-05-06 05:29:06 EDT
Description of problem:

With the following Xen hypervisor (xen.gz) boot command-line options, lines go missing from the Xen dmesg: "dom0_mem=2048M loglvl=all guest_loglvl=all iommu=1"

The default conring buffer is too small to hold all the lines logged during hypervisor boot, so "xm dmesg" no longer shows the messages from the beginning of the boot. This happens on systems with many CPUs and/or with an IOMMU.
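To illustrate the mechanism, here is a minimal sketch of a fixed-size console ring that keeps only the newest bytes. It is not the Xen implementation: the class name, the 16-byte size, and the sample messages are made up for the illustration.

```python
class ConsoleRing:
    """Hypothetical fixed-size ring that keeps only the newest `size` bytes."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.prod = 0  # total number of bytes ever written

    def write(self, data):
        # Each byte lands at prod modulo size, overwriting the oldest byte
        # once the ring has filled up.
        for b in data:
            self.buf[self.prod % self.size] = b
            self.prod += 1

    def read_all(self):
        # Only the last `size` bytes are still available to a reader.
        start = max(0, self.prod - self.size)
        return bytes(self.buf[i % self.size] for i in range(start, self.prod))


# Deliberately tiny ring and made-up messages, to make the wrap visible.
ring = ConsoleRing(16)
ring.write(b"(XEN) line 1\n")
ring.write(b"(XEN) line 2\n")
print(ring.read_all())
```

After the wrap, the reader sees only the tail of the first message followed by the second one, which is the same shape as "xm dmesg" output that begins in the middle of a line.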

Version-Release number of selected component (if applicable):
RHEL 5.8 / 2.6.18-308.1.1.el5.

How reproducible:
Always.

Steps to Reproduce:
1. Add the following options to grub.conf: "dom0_mem=2048M loglvl=all guest_loglvl=all iommu=1".
2. Reboot the system.
3. Check "xm dmesg" and notice that a lot of messages are missing from the beginning. You can compare the "xm dmesg" output to the serial console output.
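The comparison in step 3 can be sketched as a small script. This assumes the serial console output and the "xm dmesg" output have each been saved as a list of lines with identical line formatting (no extra timestamps on the serial side). The function name, the `window` parameter, and the sample lines are hypothetical.

```python
def first_retained_index(serial_lines, dmesg_lines, window=3):
    """Return the index in the serial log of the first complete line that
    "xm dmesg" still holds, or None if no match is found.

    The first dmesg line may be cut mid-line by the buffer wrap, so it is
    skipped and the next `window` lines are matched as a block.
    """
    serial = [s.rstrip() for s in serial_lines]
    tail = [s.rstrip() for s in dmesg_lines[1:1 + window]]
    if not tail:
        return None
    for idx in range(len(serial) - len(tail) + 1):
        if serial[idx:idx + len(tail)] == tail:
            return idx
    return None


# Made-up placeholder lines, not real hypervisor output.
serial = ["line A", "line B", "line C", "line D", "line E"]
dmesg = ["ne B", "line C", "line D", "line E"]
print(first_retained_index(serial, dmesg))
```

Every serial line before the returned index is either lost entirely or, for the one immediately before it, retained only in part.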
  
Actual results:
"xm dmesg" begins with the following lines:

301 base: 0xfed00000
(XEN) [VT-D]dmar.c:468: Host address width 40
(XEN) [VT-D]dmar.c:477: found ACPI_DMAR_DRHD
(XEN) [VT-D]dmar.c:336: dmaru->address = fed90000
(XEN) [VT-D]dmar.c:293: found IOAPIC: bdf = 0:1e.1
(XEN) [VT-D]dmar.c:293: found IOAPIC: bdf = 0:13.0
(XEN) [VT-D]dmar.c:345: found INCLUDE_ALL
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.7
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.7
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.0
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.1
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.2
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.0
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.1
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.2
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.0
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.1
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.0
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.1
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.2
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1a.7
(XEN) [VT-D]dmar.c:481: found ACPI_DMAR_RMRR
(XEN) [VT-D]dmar.c:287: found endpoint: bdf = 0:1d.7
(XEN) [VT-D]dmar.c:485: found ACPI_DMAR_ATSR
(XEN) [VT-D]dmar.c:274: found bridge: bdf = 0:1.0  sec = 1  sub = 1
(XEN) [VT-D]dmar.c:274: found bridge: bdf = 0:3.0  sec = 2  sub = 2
(XEN) [VT-D]dmar.c:274: found bridge: bdf = 0:7.0  sec = 3  sub = 5
(XEN) [VT-D]dmar.c:274: found bridge: bdf = 0:9.0  sec = 6  sub = 6
(XEN) [VT-D]dmar.c:274: found bridge: bdf = 0:a.0  sec = 7  sub = 7
(XEN) Intel VT-d has been enabled
...

So clearly the messages from the beginning of the boot are missing.


Expected results:
All the information should be shown, including the messages from the beginning of the boot.

Additional info:

This happens because the Xen conring_size is too small in the RHEL5 Xen hypervisor. Upstream Xen has an option called "conring_size=" which can be used to set (grow) the conring buffer size, so this problem doesn't happen there.
Comment 1 Laszlo Ersek 2012-05-07 04:00:49 EDT
Generally, whenever we want to do anything with the Xen dmesg, we capture it over the serial console (sometimes with "sync_console" on the hv command line), and ask customers to do the same, since that's the only "sure" way to save it (e.g. if there are boot problems).

If you redirect the messages to "/var/log/xen/console/hypervisor.log" by setting XENCONSOLED_LOG_HYPERVISOR=yes in "/etc/sysconfig/xend", does the truncation still occur? (I would guess so; by the time xend starts in the boot process we must have lost the first messages.)

On the surface this seems to be an easy convenience backport, but a quick hg blame + grep identified the following patches:

http://xenbits.xensource.com/hg/xen-unstable.hg/rev/19543
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/20130
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/20133
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/20374
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/21038
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/21225

These patches appear a bit turbulent for functionality I'd expect to be "simple" and "convenience".

Xen team, thoughts?
Comment 2 Andrew Jones 2012-05-09 02:52:54 EDT
I thought about doubling the ring size in the HV once because it appears that the tools already support 32k. However, we dropped the idea since the benefit wasn't worth the risk. As Laszlo said, we can already capture all logs by using serial. I'm closing this as WONTFIX. It's too late in RHEL5's lifecycle to churn much code for minimal feature gain.