Description of problem: When using a qemu/kvm guest, currently virsh dump is unimplemented. That's mostly because we lack a mechanism in qemu-kvm itself to generate a proper corefile. Once we have that mechanism, implementing virsh dump should be very straightforward.
With respect to the newly-added blocker and rhel-5.4.0 flags, this was discussed earlier in the original BZ #505527:

------------------------------------------------------------------
https://bugzilla.redhat.com/show_bug.cgi?id=505527#c18
Comment #18 From CAI Qian (caiqian) 2009-06-16 18:02:34 EDT

Raise severity/priority to high, and set RHEL5.4 Beta blocker flags. The reason behind this is that it is a feature that should be in for beta testing.

------------------------------------------------------------------
https://bugzilla.redhat.com/show_bug.cgi?id=505527#c19
Comment #19 From Chris Lalancette (clalance) 2009-06-17 04:23:47 EDT

(In reply to comment #18)
> Raise severity/priority to high, and set RHEL5.4 Beta blocker flags. The reason
> behind this is that it is a feature that should be in for beta testing.

You have to be careful what you are talking about. Getting "virsh dumpcore" working is a significant amount of work that runs across several components, including qemu, libvirt, and crash. A reasonable target for that might be RHEL-6, but certainly not 5.4.

We may, however, be able to do enough bugfixing on KVM to get kdump working in 5.4. That will have to be looked at in detail.

Chris Lalancette

------------------------------------------------------------------
https://bugzilla.redhat.com/show_bug.cgi?id=505527#c21
Comment #21 From CAI Qian (caiqian) 2009-06-17 05:28:42 EDT

Yes, I was talking about getting kdump working inside KVM guests, not "virsh dump".
------------------------------------------------------------------
Don't hold your breath, but I'm looking at qemu. :-)
Generating a proper corefile from qemu-kvm is not as easy as it sounds, and does not completely make sense in a FV environment. For example, qemu-kvm has no idea of the kernel memory map, or the startup arguments of the kernel. I wrote something that would generate an ELF file for a running qemu image, vaguely resembling /proc/kcore, but it needs to get the kernel memory map from the user (it's not hard for the user to extract it, but that's not the point). It may make more sense to have some kind of helper kernel module and then to implement the generation of the core file purely in libvirt, with no need for qemu help.
(In reply to comment #3)
> Generating a proper corefile from qemu-kvm is not as easy as it sounds, and
> does not completely make sense in a FV environment.
>
> For example, qemu-kvm has no idea of the kernel memory map, or the startup
> arguments of the kernel. I wrote something that would generate an ELF file for

But why would qemu-kvm want to know anything about the guest kernel memory map? The whole idea is to have qemu-kvm dump out a "raw" version of memory; from there, the crash utility knows (or can be taught) what to do with it. It's basically equivalent to what we can do for Xen FV guests. Or am I missing something?

Chris Lalancette
> But why would qemu-kvm want to know anything about the guest kernel memory map?

If you wanted to debug the core file with gdb, then yes, you'd need virtual address references in the PT_LOAD segments. But the crash utility only really cares about physical memory.

That being said, crash does need to know any physical base address relocations being done on architectures like x86_64 where kernel unity mapped addresses do not directly yield the physical address by just stripping PAGE_OFFSET.

From kdump ELF vmcore files, crash calculates the physical base address from the virtual address references in the vmcore's PT_LOAD segments. And when kdump vmcores are transformed by makedumpfile into its unique "compressed kdump" format, the physical base address is stored in the header used by that format.

For xen kernels, though, the x86_64 physical base address is hardwired to 0x200000 because the xen dumpfile headers have no such information. So far, the hard-wiring has "held"...

So Paolo's statement that it "does not completely make sense in a FV environment" is valid in that respect.

When Chris Smith did his KVM/dump prototype, he generated a vmcore file from a KVM guest that ran just fine with the crash utility -- but he "cheated" in that he did know the particulars of the FV kernel that was running, and did put the virtual address particulars in the PT_LOAD segments. (BTW, he never responded to my query as to the status of his KVM dumpfile prototype that he posted on the qemu-devel list -- maybe he doesn't work at HP any more?)
Yes, the advantage of doing it in qemu is that you leverage all the arch-dependent code to translate kernel addresses to physical addresses. But actually the exact same information that qemu uses is available in a saved VM file.
> That being said, crash does need to know any physical base address
> relocations being done on architectures like x86_64 where kernel unity
> mapped addresses do not directly yield the physical address by just
> stripping PAGE_OFFSET.

Sorry -- I misspoke above -- I was referring to the x86_64 __START_KERNEL_map region above, not the PAGE_OFFSET/unity-mapped region.

The x86_64 kernel has two primary virtual mappings of physical memory, one that is PAGE_OFFSET (0xffff880000000000) based, where:

  physaddr = virtaddr - 0xffff880000000000

and a second mapping of the kernel text and static data, which is based at __START_KERNEL_map (0xffffffff80000000) -- but has not been "unity-mapped" since x86_64 kernels became relocatable. For translating kernel/static-data virtual addresses into their physical addresses, the physical base address must also be applied like so:

  physaddr = virtaddr - __START_KERNEL_map + physical_base

So for example, a RHEL5 kdump vmcore has one mapping that references the __START_KERNEL_map region at 0xffffffff80000000, which maps to physical address 0x0000000000200000, and then a bunch of PAGE_OFFSET-based unity-mapped regions starting at 0xffff810000000000:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000270 0x0000000000000000 0x0000000000000000
                 0x0000000000000f20 0x0000000000000f20         0
  LOAD           0x0000000000001190 0xffffffff80000000 0x0000000000200000
                 0x00000000004e6000 0x00000000004e6000  RWE    0
  LOAD           0x00000000004e7190 0xffff810000000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x0000000000587190 0xffff810000100000 0x0000000000100000
                 0x0000000000f00000 0x0000000000f00000  RWE    0
  LOAD           0x0000000001487190 0xffff810009000000 0x0000000009000000
                 0x00000000953bf000 0x00000000953bf000  RWE    0
  LOAD           0x0000000096846190 0xffff81009e486000 0x000000009e486000
                 0x00000000015ac000 0x00000000015ac000  RWE    0
  LOAD           0x0000000097df2190 0xffff81009fa9a000 0x000000009fa9a000
                 0x000000000000f000 0x000000000000f000  RWE    0
  LOAD           0x0000000097e01190 0xffff81009fb1a000 0x000000009fb1a000
                 0x000000000000b000 0x000000000000b000  RWE    0
  LOAD           0x0000000097e0c190 0xffff81009fb3a000 0x000000009fb3a000
                 0x00000000000c6000 0x00000000000c6000  RWE    0
  LOAD           0x0000000097ed2190 0xffff810100000000 0x0000000100000000
                 0x0000000ee0000000 0x0000000ee0000000  RWE    0

So presuming that KVM FV kernels are given a contiguous block of "pseudo-physical" memory (probably an invalid assumption), a KVM dump could be expressed in two PT_LOAD segments, one for the __START_KERNEL_map, and one for the PAGE_OFFSET unity-mapped region.

> But actually the exact same information that qemu uses, is available in
> a saved VM file.

I'm wondering whether the __START_KERNEL_map-to-physical_base relationship can be gleaned from a saved-VM file? It sounds like the answer is no, but that's the one missing piece of the puzzle.
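The two translations described above can be sketched in a few lines of Python. This is only an illustration, not code from crash or qemu. The function name is made up, the unity-mapped base is the RHEL5 value 0xffff810000000000 seen in the program headers, and the upper bounds of the two regions (0xffffc20000000000 and 0xffffffff88000000) are assumptions read off the RHEL5 /proc/kcore listing posted later in this bug. Addresses outside both regions (vmalloc, modules) would need a real page table walk.

```python
# Region bases for a RHEL5 x86_64 kernel, as quoted in this bug.
START_KERNEL_MAP = 0xffffffff80000000   # kernel text and static data
PAGE_OFFSET = 0xffff810000000000        # unity-mapped region (RHEL5 value)

# Assumed region limits, taken from the /proc/kcore listing in this bug.
MODULES_VADDR = 0xffffffff88000000      # assumption: end of the text mapping
VMALLOC_START = 0xffffc20000000000      # assumption: end of the unity mapping


def virt_to_phys(vaddr, phys_base, page_offset=PAGE_OFFSET):
    """Translate a kernel virtual address to a physical address.

    phys_base is the relocation offset of the kernel image (0x200000 in
    the RHEL5 example). Hypothetical helper, for illustration only.
    """
    if START_KERNEL_MAP <= vaddr < MODULES_VADDR:
        # physaddr = virtaddr - __START_KERNEL_map + physical_base
        return vaddr - START_KERNEL_MAP + phys_base
    if page_offset <= vaddr < VMALLOC_START:
        # physaddr = virtaddr - PAGE_OFFSET
        return vaddr - page_offset
    raise ValueError("address 0x%x needs a page table walk" % vaddr)


if __name__ == "__main__":
    print(hex(virt_to_phys(0xffffffff80001000, 0x200000)))
    print(hex(virt_to_phys(0xffff810000100000, 0x200000)))
```

With phys_base = 0x200000, the first call corresponds to the 0xffffffff80001000 -> 0x201000 mapping discussed in this bug, and the second to the third PT_LOAD segment in the listing above.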
> I wondering whether the __START_KERNEL_map-to-physical_base relationship
> can be gleaned from a saved-VM file? It sounds like the answer is no,
> but that's the one missing piece of the puzzle.

The saved VM file should have all the necessary pieces of the puzzle: CR3, the GDT base (which actually should not be needed), the page table. With the kernel debug info, you could also get the kcore_list and regenerate /proc/kcore.
But /proc/kcore doesn't help much with the phys_base determination. On my RHEL5 machine, with a __START_KERNEL_map of ffffffff80000000, and which has a phys_base of 0x200000 (2MB), here's its /proc/kcore:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000190 0x0000000000000000 0x0000000000000000
                 0x0000000000000974 0x0000000000000000         0
  LOAD           0x0000ffffff601000 0xffffffffff600000 0x0000000000000000
                 0x0000000000800000 0x0000000000800000  RWE    1000
  LOAD           0x0000ffff88001000 0xffffffff88000000 0x0000000000000000
                 0x0000000077f00000 0x0000000077f00000  RWE    1000
  LOAD           0x0000ffff80002000 0xffffffff80001000 0x0000000000000000
                 0x00000000004e4cec 0x00000000004e4cec  RWE    1000
  LOAD           0x0000c20000001000 0xffffc20000000000 0x0000000000000000
                 0x00001fffffffffff 0x00001fffffffffff  RWE    1000
  LOAD           0x0000810000001000 0xffff810000000000 0x0000000000000000
                 0x000000003fe0a000 0x000000003fe0a000  RWE    1000

There is that one PT_LOAD segment at 0xffffffff80001000, which seemingly says that it maps to physical address 0, which is clearly not the case. In reality, virtual address 0xffffffff80001000 maps to physical address 0x201000.

Kdump vmcores create a PT_LOAD segment for the __START_KERNEL_map region that factors in the phys_base offset.
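To make the difference concrete, here is a rough Python sketch of how a phys_base could be derived from PT_LOAD entries. It is not crash's actual code; the function name and the (vaddr, paddr) tuple layout are hypothetical, and the 0xffffffff88000000 upper bound for the kernel text mapping is an assumption based on the listing above. Fed the kdump segment it yields 0x200000; fed the /proc/kcore segment it yields nothing usable, because the PhysAddr field is 0.

```python
START_KERNEL_MAP = 0xffffffff80000000
MODULES_VADDR = 0xffffffff88000000  # assumption: end of the kernel text mapping


def phys_base_from_segments(segments):
    """Derive phys_base from (p_vaddr, p_paddr) pairs of PT_LOAD segments.

    Returns None when no segment in the __START_KERNEL_map region carries
    a plausible physical address. Hypothetical helper, illustration only.
    """
    for vaddr, paddr in segments:
        if START_KERNEL_MAP <= vaddr < MODULES_VADDR:
            # Invert: paddr = vaddr - __START_KERNEL_map + phys_base
            phys_base = paddr - (vaddr - START_KERNEL_MAP)
            if phys_base >= 0:
                return phys_base
    return None


# (VirtAddr, PhysAddr) pairs copied from the two listings in this bug.
kdump_segments = [
    (0xffffffff80000000, 0x0000000000200000),
    (0xffff810000000000, 0x0000000000000000),
    (0xffff810000100000, 0x0000000000100000),
]
kcore_segments = [
    (0xffffffffff600000, 0x0),
    (0xffffffff88000000, 0x0),
    (0xffffffff80001000, 0x0),
    (0xffffc20000000000, 0x0),
    (0xffff810000000000, 0x0),
]

if __name__ == "__main__":
    print(phys_base_from_segments(kdump_segments))
    print(phys_base_from_segments(kcore_segments))
```

The plausibility check is deliberately crude: a PhysAddr of 0 for a segment starting 0x1000 above __START_KERNEL_map would imply a negative phys_base, so it is rejected.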
...and for that matter, even if the kcore_list was useful, accessing *it* requires the phys_base.
No, the phys_base would be taken from the page table. I was saying that the page table (whose physical address is in CR3) would be the missing piece, and with that you can make something like /proc/kcore.
(In reply to comment #11)
> No, the phys_base would be taken from the page table. I was saying that with
> the page table (whose physical address is in CR3) would be the missing piece,
> and with that you can make something like /proc/kcore.

Can you whip up a proof-of-concept? I don't need to "make something like /proc/kcore" -- I just need to be able to translate __START_KERNEL_map-based virtual addresses into their (pseudo) physical addresses -- as seen by the FV kernel -- and then know how to find those physical addresses in the saved-VM file.
Created attachment 350861 [details]
proof of concept

Here it is. The files in the tarball are:

- qemu-load.c: Generic library to load QEMU save VM files. Users need to know if the host OS was 32- or 64-bit; tested only for 64-bit host.

- qemu-load.h: Matching header file.

- test.c: Example of how to use the library, plus code to actually map virtual addresses to physical.

In the proof of concept, instead of using the __START_KERNEL_map I round the address of the IDT down by 1GB. The mapping code however is totally independent from this part.
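For readers without the tarball, the general technique can be sketched as follows. This is not the attached test.c; it is a minimal Python illustration of a standard x86_64 4-level page table walk starting from CR3, plus the "round the IDT address down by 1GB" trick. All names are hypothetical, read_u64 stands in for whatever reads guest-physical memory out of the saved-VM file, and the tiny in-memory page table and the IDT address are invented for the example.

```python
ADDR_MASK = 0x000ffffffffff000  # bits 51:12 of a page table entry
PRESENT = 1 << 0                # entry is valid
PAGE_SIZE_BIT = 1 << 7          # large page at the PDPT (1GB) or PD (2MB) level


def walk_page_table(cr3, vaddr, read_u64):
    """Translate vaddr via 4-level paging; returns None if not mapped.

    read_u64(paddr) must return the 64-bit value stored at guest-physical
    address paddr (hypothetical callback).
    """
    table = cr3 & ADDR_MASK
    for shift in (39, 30, 21, 12):      # PML4, PDPT, PD, PT
        index = (vaddr >> shift) & 0x1ff
        entry = read_u64(table + index * 8)
        if not entry & PRESENT:
            return None
        is_large = shift in (30, 21) and entry & PAGE_SIZE_BIT
        if shift == 12 or is_large:
            page_mask = (1 << shift) - 1
            return (entry & ADDR_MASK & ~page_mask) | (vaddr & page_mask)
        table = entry & ADDR_MASK


def guess_kernel_map_base(idt_base):
    """Round the IDT's virtual address down to a 1GB boundary."""
    return idt_base & ~((1 << 30) - 1)


# Invented example: kernel text mapped with one 2MB page at physical 0x200000.
memory = {
    0x1000 + 511 * 8: 0x2000 | PRESENT,                     # PML4[511]
    0x2000 + 510 * 8: 0x3000 | PRESENT,                     # PDPT[510]
    0x3000 + 0 * 8: 0x200000 | PRESENT | PAGE_SIZE_BIT,     # PD[0], 2MB page
}


def read_u64(paddr):
    return memory.get(paddr, 0)


if __name__ == "__main__":
    base = guess_kernel_map_base(0xffffffff80434000)  # made-up IDT address
    phys = walk_page_table(0x1000, base, read_u64)
    print(hex(base), hex(phys))
```

In this toy setup, walking the guessed base address yields the physical address the kernel image was loaded at, which is the phys_base that crash needs.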
Can I get a pointer to a vmlinux/saved-vm-file pair to work with?
Created attachment 350915 [details] qemu bits
Created attachment 350918 [details] libvirt bits (for upstream)
Created attachment 350920 [details] libvirt bits (for RHEL)
Comment on attachment 350915 [details]
qemu bits

Created bug 510244 to track the qemu bits; moved the qemu-rhel.patch attachment there.
This wasn't pushed upstream in time for Update 4, so the best we can do at this point is reassign this for Update 5 and push the bits upstream where that hasn't been done yet. I don't see why this wouldn't be accepted in libvirt.

Daniel
Waiting for upstream qemu to accept at least the idea, and decide on the name for the dump command.
Created attachment 351100 [details] new, simpler patch for upstream
Created attachment 351101 [details] new, simpler patch for RHEL
Committed upstream at http://libvirt.org/git/?p=libvirt.git;a=commit;h=e1abc448143d83db8aad8962fc24b13465dbc69b Should this be CLOSED/UPSTREAM?
No, the patch should be backported to the RHEL5 version of libvirt, attached to this BZ, and this bug put into POST state ready for RHEL-5.5.
Created attachment 357661 [details] libvirt-0.7.0-qemud-dump.patch Backport of upstream e1abc44.
libvirt-0.6.3-24.el5 has been built in dist-5E-qu-candidate with the fixes.

Daniel
This bug has been verified with libvirt 0.6.3-24.el5 on RHEL-5.5. Already fixed, set status to VERIFIED.

Steps to Verify:

[root@dhcp-66-70-62 libvirt]# ll -h /tmp
total 76K
drwx------ 3 root root 4.0K Dec 30 11:45 gconfd-root
drwx------ 2 root root 4.0K Dec 30 11:45 keyring-4qWfI1
srwxr-xr-x 1 root root    0 Dec 30 11:45 mapping-root
drwx------ 2 root root 4.0K Dec 30 15:00 orbit-root
drwx------ 2 root root 4.0K Dec 30 11:45 ssh-EEyqiF3551
drwx------ 2 root root 4.0K Dec 30 11:45 virtual-root.CZEllr

[root@dhcp-66-70-62 libvirt]# virsh start rhel5u5
Domain rhel5u5 started

[root@dhcp-66-70-62 libvirt]# virsh dump rhel5u5 /tmp/dump
Domain rhel5u5 dumped to /tmp/dump

[root@dhcp-66-70-62 libvirt]# ll -h /tmp
total 445M
-rw-r--r-- 1 root root 445M Dec 30 15:24 dump
drwx------ 3 root root 4.0K Dec 30 11:45 gconfd-root
drwx------ 2 root root 4.0K Dec 30 11:45 keyring-4qWfI1
srwxr-xr-x 1 root root    0 Dec 30 11:45 mapping-root
drwx------ 2 root root 4.0K Dec 30 15:00 orbit-root
drwx------ 2 root root 4.0K Dec 30 11:45 ssh-EEyqiF3551
drwx------ 2 root root 4.0K Dec 30 11:45 virtual-root.CZEllr

[root@dhcp-66-70-62 libvirt]# ssh 192.168.122.77 uname -r
The authenticity of host '192.168.122.77 (192.168.122.77)' can't be established.
RSA key fingerprint is f0:91:5c:a1:88:47:45:87:52:ba:48:75:9c:8e:80:52.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.122.77' (RSA) to the list of known hosts.
root@192.168.122.77's password:
2.6.18-183.el5

[root@dhcp-66-70-62 libvirt]# rpm -ivh kernel-debuginfo-2.6.18-183.el5.x86_64.rpm kernel-debuginfo-common-2.6.18-183.el5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:kernel-debuginfo-common########################################### [ 50%]
   2:kernel-debuginfo       ########################################### [100%]

[root@dhcp-66-70-62 libvirt]# crash /usr/lib/debug/lib/modules/2.6.18-183.el5/vmlinux /tmp/dump

crash 4.1.2-1.el5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/2.6.18-183.el5/vmlinux
    DUMPFILE: /tmp/dump
        CPUS: 1
        DATE: Wed Dec 30 15:24:09 2009
      UPTIME: 00:22:14
LOAD AVERAGE: 0.02, 0.05, 0.23
       TASKS: 91
    NODENAME: localhost.localdomain
     RELEASE: 2.6.18-183.el5
     VERSION: #1 SMP Mon Dec 21 18:37:42 EST 2009
     MACHINE: x86_64 (2992 Mhz)
      MEMORY: 1 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff80308b60 [THREAD_INFO: ffffffff803fa000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
     WARNING: panic task not found

crash> bt
PID: 0 TASK: ffffffff80308b60 CPU: 0 COMMAND: "swapper"
 #0 [ffffffff803fbeb8] schedule at ffffffff80063f96
 #1 [ffffffff803fbec0] thread_return at ffffffff80063ff8
 #2 [ffffffff803fbf68] default_idle at ffffffff8006c3a5
 #3 [ffffffff803fbf90] cpu_idle at ffffffff800497b7
crash>

Version-Release number of selected component (if applicable):

[root@dhcp-66-70-62 libvirt]# uname -a
Linux dhcp-66-70-62.nay.redhat.com 2.6.18-183.el5 #1 SMP Mon Dec 21 18:37:42 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

[root@dhcp-66-70-62 libvirt]# lsmod|grep kvm
kvm_intel 86664 0
kvm 223648 2 ksm,kvm_intel

[root@dhcp-66-70-62 libvirt]# rpm -qa|grep libvirt
libvirt-0.6.3-24.el5
libvirt-debuginfo-0.6.3-24.el5
libvirt-python-0.6.3-24.el5

[root@dhcp-66-70-62 libvirt]# rpm -q kvm
kvm-83-140.el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html