Bug 1785126
| Summary: | dump-guest-memory failed due to Python Exception <class 'gdb.MemoryError'> | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Xueqiang Wei <xuwei> | ||||
| Component: | gdb | Assignee: | Kevin Buettner <kevinb> | ||||
| gdb sub component: | system-version | QA Contact: | Michal Kolar <mkolar> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | medium | ||||||
| Priority: | medium | CC: | chayang, coli, dsmith, gdb-bugs, jinzhao, juzhang, keiths, kevinb, lersek, marcandre.lureau, mcermak, ohudlick, qe-baseos-tools-bugs, virt-maint | ||||
| Version: | 8.2 | Keywords: | Bugfix, Triaged | ||||
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | ||||
| Target Release: | 8.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | gdb-8.2-13.el8 | Doc Type: | No Doc Update | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | 1777751 | ||||||
| : | 1847164 1886602 | Environment: | |||||
| Last Closed: | 2021-05-18 15:46:02 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1777751, 1842691 | ||||||
| Bug Blocks: | 1847164 | ||||||
| Attachments: | ||||||

Comment 1
Xueqiang Wei
2019-12-19 08:45:18 UTC
With the same environment mentioned in Comment 1, tested with dump-guest-core=off.

If set dump-guest-core=off:
1. hit this issue with seabios
2. hit this issue with edk2

    (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
    (gdb) set height 0
    (gdb) dump-guest-memory /tmp/vmcore X86_64
    guest RAM blocks:
    target_start     target_end       host_addr        message count
    ---------------- ---------------- ---------------- ------- -----
    0000000000000000 00000000000a0000 00007f9613e00000 added       1
    00000000000c0000 00000000000ca000 00007f9613ec0000 added       2
    00000000000ca000 00000000000cd000 00007f9613eca000 joined      2
    00000000000cd000 00000000000e8000 00007f9613ecd000 joined      2
    00000000000e8000 00000000000f0000 00007f9613ee8000 joined      2
    00000000000f0000 0000000000100000 00007f9613ef0000 joined      2
    0000000000100000 0000000080000000 00007f9613f00000 joined      2
    00000000f4000000 00000000f8000000 00007f960fc00000 added       3
    00000000f8000000 00000000fc000000 00007f960ba00000 added       4
    00000000fd010000 00000000fd012000 00007f971ba00000 added       5
    00000000fffc0000 0000000100000000 00007f971be00000 added       6
    0000000100000000 0000000180000000 00007f9693e00000 added       7
    Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7f970f87c000:
    Error occurred in Python command: Cannot access memory at address 0x7f970f87c000
    (gdb) bt
    #0  0x00007f9732829016 in ppoll () at /lib64/libc.so.6
    #1  0x0000560d590351c5 in ppoll (__ss=0x0, __timeout=0x7ffd99034da0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
    #2  0x0000560d590351c5 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=1675788861) at util/qemu-timer.c:334
    #3  0x0000560d590360d5 in os_host_main_loop_wait (timeout=1675788861) at util/main-loop.c:233
    #4  0x0000560d590360d5 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
    #5  0x0000560d58cfda57 in main_loop () at vl.c:1981
    #6  0x0000560d58cfda57 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4820
    (gdb)

Created attachment 1662350
Patch for dump-guest-memory.py

I was unable to create an environment for reproducing the bug on either RHEL 8.1 or RHEL 8.2. That said, I believe the problem exists in both 8.1 and 8.2; I just wasn't able to use virt-manager to create suitable VMs for testing.

I WAS able to reproduce the problem using Fedora 31. Using virt-manager on Fedora 31, I created two VMs running RHEL 8.1, one with firmware set to BIOS and the other with firmware set to OVMF (UEFI x86_64: /usr/share/edk2/ovmf/OVMF_CODE.fd). I then made these VMs run stand-alone, without the libvirt framework.

With that in place, I followed the instructions in the "Steps to reproduce" section of the bug report. I observed that the dump-guest-memory.py script was able to create a guest core dump from within GDB when given a core dump of QEMU running the RHEL 8.1 BIOS VM. When I tried the same thing using a core dump obtained from QEMU running the RHEL 8.1 OVMF VM, I observed the Python exception noted in the bug report. However, when I attempted to dump the guest memory from a GDB attached to the running QEMU process (which was running the RHEL 8.1 OVMF VM), the dump-guest-memory script worked as expected.

After poking around some more, I confirmed that the memory region in question is absent from the core file, but was originally accessible when the process was running.

Again using GDB, I debugged the Linux kernel on the virtualization host. This kernel was responsible for making the core dump of QEMU running the RHEL 8.1 OVMF VM. Specifically, I looked at vma_dump_size() in fs/binfmt_elf.c to see why the memory region in question was not being dumped. When I stepped through this chunk of code for the problematic memory region...

    /* Dump segments that have been written to. */
    if (vma->anon_vma && FILTER(ANON_PRIVATE))
            goto whole;
    if (vma->vm_file == NULL)
            return 0;

...I found that the "return 0" statement was being taken, which in turn causes the region in question to not be dumped.

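For illustration only, the two statements quoted above can be modeled in a few lines of Python. This is a simplified sketch, not the kernel's implementation: `Vma`, its fields, and `dump_size_for_excerpt` are hypothetical names, and every check in vma_dump_size() outside the quoted excerpt is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vma:
    """Hypothetical stand-in for the few vm_area_struct fields used in the excerpt."""
    size: int                 # length of the mapping in bytes
    anon_vma: bool            # models "vma->anon_vma != NULL" (segment was written to)
    vm_file: Optional[str]    # backing file, or None for anonymous memory


def dump_size_for_excerpt(vma: Vma, filter_anon_private: bool = True) -> Optional[int]:
    """Model only the quoted excerpt; return None when it does not decide."""
    # "Dump segments that have been written to."
    if vma.anon_vma and filter_anon_private:
        return vma.size       # kernel: goto whole
    if vma.vm_file is None:
        return 0              # kernel: return 0, region is left out of the core
    return None               # later checks in the real function are not modeled


never_written = Vma(size=0x40000000, anon_vma=False, vm_file=None)
written = Vma(size=0x1000, anon_vma=True, vm_file=None)
print(dump_size_for_excerpt(never_written))  # 0
print(dump_size_for_excerpt(written))        # 4096
```

Under these assumptions, an anonymous region that was allocated but never written falls through to the "return 0" branch, which matches the behavior observed while stepping through the kernel.
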
I took another look at the memory region (from within GDB attached to the running QEMU process running RHEL 8.1 OVMF) and found that it was all 0. It seems likely that the region in question had been allocated, but never written to, so I think it makes sense for the kernel to not dump this region.

That being the case, the memory region in question is not available to GDB or to the dump-guest-memory.py script running within GDB. Thus, the behavior of GDB, throwing the exception for inaccessible memory, is correct.

I've made a minor modification to dump-guest-memory.py which causes it to write out zeroes for inaccessible memory. As noted above, in this case, the memory is inaccessible due to the kernel choosing to not include it in the QEMU core dump. I've added this patch as an attachment.

In conclusion, this problem is not a GDB bug. It might be argued that it's a kernel bug, but as described earlier I think the kernel's behavior in this case makes sense. Therefore, I think the bug is actually in the dump-guest-memory.py script, which doesn't handle the case of the allocated (zeroed), but never-written-to region not being placed in the core dump.

(In reply to Xueqiang Wei from comment #2)
> With the same environment mentioned in Comment 1, tested with
> dump-guest-core=off.
>
> If set dump-guest-core=off: 1. hit this issue with seabios 2. hit this
> issue with edk2

FWIW, dump-guest-memory.py won't work correctly with "dump-guest-core=off". When the "off" setting is used, QEMU calls madvise(...,...,MADV_DONTDUMP) for the guest memory regions. So, when you cause a core dump to occur, those pages won't be dumped, leaving all of the guest's memory inaccessible.

Per Kevin Buettner's analysis in <https://bugzilla.redhat.com/show_bug.cgi?id=1785126#c4>, the host kernel's process coredump logic is what decides to omit the never-written-to region from the qemu process coredump. The Python script should expect and handle this possibility.

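One way a script could tolerate such a gap is to substitute zeroes whenever a read fails. The sketch below is not the attached patch; it is a minimal illustration that runs outside GDB, in which `read_or_zero`, `FakeMemoryError`, and `fake_reader` are hypothetical names. Inside GDB, the reader would presumably be `gdb.selected_inferior().read_memory` and the exception type `gdb.MemoryError`.

```python
class FakeMemoryError(Exception):
    """Hypothetical stand-in for gdb.MemoryError so the sketch runs outside GDB."""


def read_or_zero(read_memory, addr, length, error_type=FakeMemoryError):
    """Return `length` bytes read at `addr`, or `length` zero bytes if the read fails."""
    try:
        return bytes(read_memory(addr, length))
    except error_type:
        # The region is missing from the core file; assume it was never written.
        return bytes(length)


def fake_reader(addr, length):
    # Pretend only the first 0x1000 bytes of the address space are in the core file.
    if addr + length > 0x1000:
        raise FakeMemoryError("Cannot access memory at address 0x%x" % addr)
    return b"\xaa" * length


print(read_or_zero(fake_reader, 0x0, 4))     # b'\xaa\xaa\xaa\xaa'
print(read_or_zero(fake_reader, 0x2000, 4))  # b'\x00\x00\x00\x00'
```

A fallback like this cannot distinguish a never-written region from pages excluded for other reasons (for example with dump-guest-core=off), so the resulting dump would silently contain zeroes in both cases.
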
Kevin proposed a patch in <https://bugzilla.redhat.com/show_bug.cgi?id=1785126#c3>; we should upstream it and backport it. Moving this BZ back to qemu-kvm.

Kevin, can you send the patch to qemu-devel? thanks

(In reply to Marc-Andre Lureau from comment #7)
> Kevin, can you send the patch to qemu-devel? thanks

Yes, I can (and will) do that.

(In reply to Marc-Andre Lureau from comment #7)
> Kevin, can you send the patch to qemu-devel? thanks

Done. See: https://lists.nongnu.org/archive/html/qemu-devel/2020-02/msg03875.html

I've done further investigations and have found that there is a bug in BFD and GDB. I've filed an upstream bug which may be found here: https://sourceware.org/bugzilla/show_bug.cgi?id=25631

A patch series fixing this bug can be found starting here: https://sourceware.org/ml/gdb-patches/2020-03/msg00106.html (Though it's likely that several iterations will be necessary for the series to be approved.)

I've tested a GDB build using my patches against a QEMU core file and found that the original dump-guest-memory script works as expected.

Thank you, Kevin -- does that mean we should move this BZ (and bug 1777751) to a different component? Thanks.

(In reply to Laszlo Ersek from comment #13)
> Thank you, Kevin -- does that mean we should move this BZ (and bug 1777751)
> to a different component? Thanks.

Yes, they should both be moved to gdb.

Moving to gdb per <https://bugzilla.redhat.com/show_bug.cgi?id=1785126#c14>. Thanks!

Moving to 8.4.

Reproduced against gdb-8.2-12.el8 and verified against both gdb-8.2-13.el8 and gdb-8.2-14.el8.

Verified against gdb-8.2-14.el8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (gdb bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1836