Bug 233938
| Field | Value |
|---|---|
| Summary | x86_64 crash session on RHEL5 fails with read error during initialization |
| Product | Red Hat Enterprise Linux 5 |
| Component | crash |
| Version | 5.0 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED NOTABUG |
| Severity | medium |
| Priority | medium |
| Reporter | Eugene Teo (Security Response) <eteo> |
| Assignee | Dave Anderson <anderson> |
| CC | eteo |
| Doc Type | Bug Fix |
| Last Closed | 2007-03-26 15:17:22 UTC |

Description
Eugene Teo (Security Response)
2007-03-26 04:06:35 UTC
Created attachment 150868
crash -d7 ./usr/lib/debug/lib/modules/2.6.9-22.0.1.ELsmp/vmlinux vmcore
Thanks for the "-d7" log -- that's usually my first request...

If we strip out just the dumpfile memory accesses, we see this:

    <readmem: ffffffff804d51d0, KVADDR, "xtime", 16, (FOE), 9ef570>
    <readmem: ffffffff803cc1a0, KVADDR, "system_utsname", 390, (ROE), 9efb5c>
    <readmem: ffffffff803cc180, KVADDR, "linux_banner", 8, (FOE), 7fff58fe6c48>
    <readmem: ffffffff80315dc2, KVADDR, "accessible check", 8, (ROE|Q), 7fff58fe68c8>
    <readmem: ffffffff80315dc2, KVADDR, "readstring characters", 574, (ROE|Q), 7fff58fe58b0>
    <readmem: ffffffff804d3080, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3100, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3180, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3200, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3280, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3300, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3380, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: ffffffff804d3400, KVADDR, "cpu_pda entry", 128, (FOE), a20540>
    <readmem: 10010000084, KVADDR, "tss_struct ist array", 56, (FOE), 9fb090>
    <readmem: 1020385a004, KVADDR, "tss_struct ist array", 56, (FOE), 9fb0c8>
    crash: read error: kernel virtual address: 1020385a004 type: "tss_struct ist array"

The last kernel virtual address access, at 1020385a004, failed.

The x86_64 has two "unity-mapped" virtual address spaces, one beginning at ffffffff80000000 (__START_KERNEL_map) and the second one beginning at 10000000000. The first one maps the kernel's static text and data, and the second one maps all physical memory into virtual memory. In both cases, the base address can be stripped off, and that leaves the physical memory address. So the largest kernel text/data virtual address read was at ffffffff804d51d0 ("xtime"), or 4d51d0 physical.
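The base-stripping step described above can be sketched in a few lines. This is a minimal illustration, not code from the crash utility: the function name `kvaddr_to_phys` and the constant names are assumptions made for the example, and the sketch only handles the two unity-mapped regions of this 2.6.9-era layout. It does not bounds-check against vmalloc or module addresses, which cannot be translated this way.

```python
# Bases of the two unity-mapped regions discussed above (2.6.9-era x86_64).
# The constant names are illustrative, not taken from the crash sources.
START_KERNEL_MAP = 0xFFFFFFFF80000000  # kernel static text and data
PAGE_OFFSET = 0x10000000000            # mapping of all physical memory


def kvaddr_to_phys(vaddr: int) -> int:
    """Strip the region base from a unity-mapped kernel virtual address.

    Assumption: the address really is unity-mapped. vmalloc and module
    addresses need a page-table walk and are not handled here.
    """
    if vaddr >= START_KERNEL_MAP:
        return vaddr - START_KERNEL_MAP
    if vaddr >= PAGE_OFFSET:
        return vaddr - PAGE_OFFSET
    raise ValueError(f"{vaddr:#x} is not in a unity-mapped kernel region")


if __name__ == "__main__":
    # Addresses copied from the readmem log above.
    for label, vaddr in [
        ("xtime", 0xFFFFFFFF804D51D0),
        ("tss_struct ist array (read ok)", 0x10010000084),
        ("tss_struct ist array (read failed)", 0x1020385A004),
    ]:
        print(f"{label}: {vaddr:x} -> {kvaddr_to_phys(vaddr):x} physical")
```

Running it against the addresses in the log gives 4d51d0, 10000084 and 20385a004, which match the physical addresses quoted in the analysis.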
The last two reads were generic virtual address accesses: the first one, at 10010000084 (10000084 physical), was read successfully, while the second one, at 1020385a004 (20385a004 physical), failed.

The netdump format is as simple as it gets -- it contains a page-sized ELF header, followed by the contents of physical memory. So the size of the dumpfile should equal the size of physical memory plus a page for the ELF header data. Since the last, fatal read attempt was at 20385a004 physical, the dumpfile would have to be over 8GB (0x200000000) in length. The other addresses shown for the "level4_pgt" page table addresses are all in the 15GB region, so I'm guessing that this system is ~16GB.

So I'm presuming that the vmcore-incomplete is too small -- just do an "ls -l" on it.

Thanks for the analysis. I learnt a lot. Yes, the incomplete vmcore is only 4.8GB and I was expecting a 16GB vmcore.

Yep, that's unfortunate... Even if the crash code was hacked to skip the "ist" (interrupt stack) initialization, it's doubtful that it would get too far beyond that given that it's only got a quarter of the physical memory.

For 32-bit x86 systems, you can often analyze vmcore-incomplete files as long as they at least contain all of "lowmem", i.e., at least 896MB. You wouldn't be able to access module data since that typically gets vmalloc'd out of highmem, but the crash session will initialize, and, since all kernel stacks are in lowmem, you could get backtraces for all tasks. In fact, most commands work just fine since kernel static data, slab memory, etc. come out of lowmem. Highmem will only contain user memory and vmalloc'd kernel memory (mostly for modules).

But for 64-bit systems, stuff gets allocated from all over the physical memory map, and despite this just being "ist" related, it would invariably bump into another piece of critical data if that were ignored.
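The size argument above can be checked mechanically. The sketch below assumes the simple netdump layout described in the comments (one header page followed by raw physical memory, with no holes) and a 4096-byte page. The function names are hypothetical, and the 4.8GB figure is only the approximate size reported in this bug.

```python
import os

PAGE_SIZE = 4096  # assumed page size; the ELF header occupies one page


def file_offset(phys_addr: int) -> int:
    """Offset of a physical address in a netdump-style vmcore.

    Assumption: header page followed directly by physical memory.
    """
    return PAGE_SIZE + phys_addr


def read_fits(file_size: int, phys_addr: int, length: int) -> bool:
    """True if `length` bytes at `phys_addr` lie entirely inside the file."""
    return file_offset(phys_addr) + length <= file_size


def dumpfile_can_read(path: str, phys_addr: int, length: int) -> bool:
    """Same check against a real file, using its size on disk."""
    return read_fits(os.path.getsize(path), phys_addr, length)


if __name__ == "__main__":
    incomplete = int(4.8 * 2**30)  # approximate size of the truncated vmcore
    for phys in (0x10000084, 0x20385A004):  # the two "ist array" reads
        ok = read_fits(incomplete, phys, 56)
        print(f"{phys:x} physical: {'readable' if ok else 'past end of file'}")
```

With the truncated size, the read at 10000084 physical fits and the one at 20385a004 physical falls past the end of the file, which is consistent with the read error that crash reported.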