| Summary: | gdb cannot "handle" a large data structure | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Dave Anderson <anderson> | ||||
| Component: | gdb | Assignee: | Jan Kratochvil <jan.kratochvil> | ||||
| Status: | CLOSED ERRATA | QA Contact: | qe-baseos-tools-bugs | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 6.0 | CC: | pmuller, tromey | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | gdb-7.2-49.el6 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-12-06 17:33:50 UTC | Type: | --- | ||||
Here are links to the two vmlinux files referenced above:
http://people.redhat.com/anderson/vmlinux-2.6.38-rc4.gz
http://people.redhat.com/anderson/vmlinux-2.6.38.2-9.fc15.gz

Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

The DWARF is definitely correct, but (IMO) odd, e.g.:
[ 4428] member
name (strp) "nr_zones"
decl_file (data1) 47
decl_line (data2) 615
type (ref4) [ ed]
data_member_location (data4) location list [ 13e40]
The earlier version doesn't have a location list here, just a constant:
[ 447e] member
name (strp) "nr_zones"
decl_file (data1) 47
decl_line (data2) 615
type (ref4) [ c7]
data_member_location (sdata) 13e40
The bug is that gdb gives up on this kind of member location.
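
To make the difference between the two dumps concrete, here is a minimal Python sketch of how a DWARF reader might turn a DW_AT_data_member_location attribute into a byte offset. It is not gdb's code and not the patch discussed in this report; the names `member_offset`, `CONSTANT_FORMS` and `BLOCK_FORMS` are hypothetical. Treating `data4` as a plain constant is an assumption of the sketch: older DWARF versions could also use `data4`/`data8` for a location-list pointer, which is presumably why the dump above prints "location list".

```python
DW_OP_plus_uconst = 0x23  # DWARF opcode: add an unsigned LEB128 constant

# Hypothetical groupings of attribute forms, for illustration only.
CONSTANT_FORMS = {"data1", "data2", "data4", "data8", "sdata", "udata"}
BLOCK_FORMS = {"block1", "block2", "block4", "block", "exprloc"}


def decode_uleb128(data):
    """Decode one unsigned LEB128 value from the start of a byte sequence."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7
    raise ValueError("truncated ULEB128 value")


def member_offset(form, value):
    """Return the member's byte offset, or None if the form is not handled.

    Assumption: constant forms (including data4) carry the offset directly,
    and block forms hold a single DW_OP_plus_uconst expression.
    """
    if form in CONSTANT_FORMS:
        return value
    if form in BLOCK_FORMS and len(value) > 1 and value[0] == DW_OP_plus_uconst:
        return decode_uleb128(value[1:])
    # A reader that returns "unknown" here and then falls back to 0 would
    # produce the kind of symptom described in this report.
    return None
```

With this sketch, `member_offset("sdata", 0x13e40)` and `member_offset("data4", 0x13e40)` both yield the same offset, while an unrecognized form yields `None` instead of a silent 0.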
I'm testing a patch.

(In reply to comment #5)
> I'm testing a patch.
Awesome Tom -- I appreciate the quick response.
(and I'm praying that your patch will apply to the crash utility's embedded gdb-7.0...)

> (and I'm praying that your patch will apply to the crash utility's
> embedded gdb-7.0...)
I took a quick look and I think it will.
It would be better if crash did not embed gdb in this way.
Stuff like this is why Fedora forbids private copies of libraries and the like.
E.g., I believe there have been important debuginfo-reading additions
since 7.0. That means that there are probably cases where gdb can access
values (or maybe even unwind -- not sure) but where crash cannot.

Yeah, well that's not going to happen... The embedded gdb is the perfect one-stop-shop for data structure deconstruction, text disassembly, line-numbers, pretty-printing, add-symbol-file capability for throwing in all the kernel module objects into the session, etc. Actually, unwinding is one thing that gdb is *not* used for. It's just that the pglist_data structure is a key focal point, and the failure here was obvious. But gdb failures have been *very* rare over the years, and incorporating an updated version proved to be fairly easy the last time we went over this issue.

http://fedoraproject.org/wiki/Packaging:No_Bundled_Libraries
(that GDB is not a library is not accepted)

Tom, can you attach the patch to this BZ?

Created attachment 497441
the patch

OK, thanks... It looked good until the handle_data_member_location() call, which doesn't exist. But I don't see it in gdb-7.2-48.el6 either -- what are you patching against?

Sorry it's right there in the patch, but the last stanza doesn't apply to either gdb-7.0 or gdb-7.2-48.el6. But I should be able to shoe-horn it in and test it out. Thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2011-1699.html

Description of problem:

When running gdb against the attached 2.6.38.2-9.fc15.x86_64 Fedora 15 vmlinux kernel, it cannot properly "handle" this particular, fairly large, data structure:

(gdb) ptype struct pglist_data
type = struct pglist_data {
    struct zone node_zones[4];
    struct zonelist node_zonelists[2];
    int nr_zones;
    spinlock_t node_size_lock;
    long unsigned int node_start_pfn;
    long unsigned int node_present_pages;
    long unsigned int node_spanned_pages;
    int node_id;
    wait_queue_head_t kswapd_wait;
    struct task_struct *kswapd;
    int kswapd_max_order;
    enum zone_type classzone_idx;
}

I noticed this when using an older gdb-7.0 version that is embedded in the crash utility, which relies upon gdb being able to determine structure member offsets. And in the data structure above, the offsets of all members beyond the node_zonelists[2] member return 0.

But taking the crash utility out of the picture, the problem can also be seen simply by running "gdb vmlinux" using either gdb-7.2.50.20110328-31.fc15 or gdb-7.2-48.el6. Here's what happens:

# gdb vmlinux-2.6.38.2-9.fc15
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/vmlinux-2.6.38.2-9.fc15...done.
(gdb) ptype struct pglist_data
type = struct pglist_data {
    struct zone node_zones[4];
    struct zonelist node_zonelists[2];
    int nr_zones;
    spinlock_t node_size_lock;
    long unsigned int node_start_pfn;
    long unsigned int node_present_pages;
    long unsigned int node_spanned_pages;
    int node_id;
    wait_queue_head_t kswapd_wait;
    struct task_struct *kswapd;
    int kswapd_max_order;
    enum zone_type classzone_idx;
}
(gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
$1 = (struct zonelist *) 0x1c00
(gdb) p &((struct pglist_data *)(0x0)).nr_zones
$2 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_size_lock
$3 = (spinlock_t *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
$4 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_present_pages
$5 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
$6 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_id
$7 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd_wait
$8 = (wait_queue_head_t *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd
$9 = (struct task_struct **) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
$10 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).classzone_idx
$11 = (enum zone_type *) 0x0
(gdb)

Interestingly enough, if I run against a slightly earlier 2.6.38-rc4 kernel, the problem does not happen:

# gdb vmlinux-2.6.38-rc4
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/vmlinux-2.6.38-rc4...done.
(gdb) ptype struct pglist_data
type = struct pglist_data {
    struct zone node_zones[4];
    struct zonelist node_zonelists[2];
    int nr_zones;
    spinlock_t node_size_lock;
    long unsigned int node_start_pfn;
    long unsigned int node_present_pages;
    long unsigned int node_spanned_pages;
    int node_id;
    wait_queue_head_t kswapd_wait;
    struct task_struct *kswapd;
    int kswapd_max_order;
    enum zone_type classzone_idx;
}
(gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
$1 = (struct zonelist *) 0x1c00
(gdb) p &((struct pglist_data *)(0x0)).nr_zones
$2 = (int *) 0x13e40
(gdb) p &((struct pglist_data *)(0x0)).node_size_lock
$3 = (spinlock_t *) 0x13e44
(gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
$4 = (long unsigned int *) 0x13e48
(gdb) p &((struct pglist_data *)(0x0)).node_present_pages
$5 = (long unsigned int *) 0x13e50
(gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
$6 = (long unsigned int *) 0x13e58
(gdb) p &((struct pglist_data *)(0x0)).node_id
$7 = (int *) 0x13e60
(gdb) p &((struct pglist_data *)(0x0)).kswapd
$8 = (struct task_struct **) 0x13e80
(gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
$9 = (int *) 0x13e88
(gdb) p &((struct pglist_data *)(0x0)).classzone_idx
$10 = (enum zone_type *) 0x13e8c
(gdb)

Of interest is that the earlier kernel was compiled with gcc 4.5.1, whereas the newer one was compiled with gcc 4.6.0-1.

Version-Release number of selected component (if applicable):
gdb-7.2-48.el6

How reproducible:
Always

Steps to Reproduce:
1. gdb vmlinux-2.6.38.2-9.fc15
2. calculate offset of pglist_data structure members as shown above

Actual results:
Member offsets beyond pglist_data.node_zonelists[] are miscalculated to be 0.

Expected results:
Should be the same offset as seen when running against "earlier" vmlinux-2.6.38-rc4 kernel.

Additional info:
vmlinux-2.6.38-rc4: gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC)
vmlinux-2.6.38.2-9.fc15: gcc version 4.6.0 20110329 (Red Hat 4.6.0-1) (GCC)
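
The check described under "Actual results" can be automated. The following is a rough Python sketch, assuming the gdb session has been captured as text with one command or result per line; the helper names `parse_offsets` and `suspicious_members` and the `BROKEN` sample are hypothetical and only mirror the output format shown above.

```python
import re

# Matches the member name in "(gdb) p &((struct pglist_data *)(0x0)).nr_zones".
_COMMAND = re.compile(r"^\(gdb\) p &\(\(struct \w+ \*\)\(0x0\)\)\.(\w+)")
# Matches gdb print results such as "$2 = (int *) 0x13e40".
_RESULT = re.compile(r"^\$\d+ = \(.*\) (0x[0-9a-fA-F]+)$")

# Shortened excerpt of the failing session, used as sample input.
BROKEN = """\
(gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
$1 = (struct zonelist *) 0x1c00
(gdb) p &((struct pglist_data *)(0x0)).nr_zones
$2 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_size_lock
$3 = (spinlock_t *) 0x0
"""


def parse_offsets(session):
    """Map member name to reported offset, in the order the commands were run."""
    offsets = {}
    member = None
    for line in session.splitlines():
        line = line.strip()
        command = _COMMAND.match(line)
        if command:
            member = command.group(1)
            continue
        result = _RESULT.match(line)
        if result and member is not None:
            offsets[member] = int(result.group(1), 16)
            member = None
    return offsets


def suspicious_members(offsets):
    """List members reported at offset 0 after an earlier member had a nonzero offset."""
    seen_nonzero = False
    flagged = []
    for name, offset in offsets.items():
        if offset != 0:
            seen_nonzero = True
        elif seen_nonzero:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(suspicious_members(parse_offsets(BROKEN)))
```

The heuristic relies on members being queried in declaration order, as in the sessions above; a later member at offset 0 cannot be correct once an earlier member sits at a nonzero offset.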