896741 – parameter location-list debuginfo fails to cover -mfentry instrumentation


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	gcc
Sub Component:
Version:	18
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Jakub Jelinek
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-01-17 21:51 UTC by Alexander Kurtakov
Modified:	2013-12-21 16:54 UTC
CC List:	12 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-12-21 16:54:38 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
The script (237 bytes, text/plain) 2013-01-17 21:51 UTC, Alexander Kurtakov
Error output. (622 bytes, text/plain) 2013-01-17 21:52 UTC, Alexander Kurtakov

Links
System	ID	Private	Priority	Status	Summary	Last Updated
GNU Compiler Collection	54793	0	None	None	None	Never

Description Alexander Kurtakov 2013-01-17 21:51:26 UTC

Created attachment 680556 [details]
The script

Comment 1 Alexander Kurtakov 2013-01-17 21:52:04 UTC

Created attachment 680557 [details]
Error output.

Comment 2 Josh Stone 2013-01-18 00:49:25 UTC

From eu-readelf -N -w, that die is:

 [1517249]      formal_parameter
               name                 (strp) "count"
               decl_file            (data1) 1
               decl_line            (data2) 468
               type                 (ref4) [15095d8]
               location             (sec_offset) location list [577acc]

And that location list is:

 [577acc]  0x0000000000000b85..0x0000000000000bab [   0] reg1
           0x0000000000000bab..0x0000000000000bfb [   0] reg13
           0x0000000000000bfb..0x0000000000000c01 [   0] GNU_entry_value:
       [   0] reg1
                                                  [   3] stack_value
           0x0000000000000c01..0x0000000000000c11 [   0] reg13

I'm not sure why those address ranges are so low.  With regular readelf, it reads:

    00577acc ffffffff81195705 ffffffff8119572b (DW_OP_reg1 (rdx))
    00577acc ffffffff8119572b ffffffff8119577b (DW_OP_reg13 (r13))
    00577acc ffffffff8119577b ffffffff81195781 (DW_OP_GNU_entry_value: (DW_OP_reg1 (rdx)); DW_OP_stack_value)
    00577acc ffffffff81195781 ffffffff81195791 (DW_OP_reg13 (r13))
    00577acc <End of list>

The disassembly for the start of sys_write is:

    0xffffffff81195700 <+0>: callq  0xffffffff8163d680 <__fentry__>
    0xffffffff81195705 <+5>: push   %rbp

So the debuginfo apparently doesn't start until after the -mfentry addition.

Running stap -P (for prologue detection) doesn't change its attempt to probe at
the very start, 0xffffffff81195700, which the location list doesn't cover.
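
To make the gap concrete, here is a minimal Python sketch that checks whether a given PC falls inside any range of the location list quoted above. The ranges and addresses are copied from the readelf and disassembly output in this comment; the helper name covering_entry is made up for illustration and is not part of systemtap or elfutils.

```python
# Location list for "count" in sys_write, copied from the readelf output above.
# Each entry is (begin, end, expression); in DWARF the end address is exclusive.
LOCLIST = [
    (0xffffffff81195705, 0xffffffff8119572b, "DW_OP_reg1 (rdx)"),
    (0xffffffff8119572b, 0xffffffff8119577b, "DW_OP_reg13 (r13)"),
    (0xffffffff8119577b, 0xffffffff81195781,
     "DW_OP_GNU_entry_value: (DW_OP_reg1 (rdx)); DW_OP_stack_value"),
    (0xffffffff81195781, 0xffffffff81195791, "DW_OP_reg13 (r13)"),
]

def covering_entry(pc, loclist=LOCLIST):
    """Return the expression valid at pc, or None if no range covers it.

    Hypothetical helper for illustration only.
    """
    for begin, end, expr in loclist:
        if begin <= pc < end:
            return expr
    return None

FUNC_ENTRY = 0xffffffff81195700    # callq __fentry__
AFTER_FENTRY = 0xffffffff81195705  # push %rbp

print(hex(FUNC_ENTRY), covering_entry(FUNC_ENTRY))
print(hex(AFTER_FENTRY), covering_entry(AFTER_FENTRY))
```

The function entry address yields no entry at all, while the address right after the __fentry__ call resolves to rdx, which matches the failure seen when probing at the very start of the function.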

Frank also asked me if other lines in that function can resolve variables, and
it seems so:

$ stap -L 'kernel.statement("sys_write@fs/read_write.c:*")'
kernel.statement("sys_write@fs/read_write.c:471") $buf:char const* $count:size_t $f:struct fd $ret:ssize_t
kernel.statement("sys_write@fs/read_write.c:481") $buf:char const* $count:size_t $f:struct fd $ret:ssize_t

Comment 3 Frank Ch. Eigler 2013-01-18 00:52:28 UTC

The kernel has started using CFLAGS=-mfentry for its nefarious purposes, which generates DWARF that is more restricted than necessary.  The location list for function parameters cover include the -mfentry-caused callq/push pair.

Comment 4 Frank Ch. Eigler 2013-01-18 00:53:20 UTC

s/cover include/should cover/.

Comment 5 Mark Wielaard 2013-01-18 11:39:12 UTC

(In reply to comment #2)
> From eu-readelf -N -w, that die is:
> 
>  [1517249]      formal_parameter
>                name                 (strp) "count"
>                decl_file            (data1) 1
>                decl_line            (data2) 468
>                type                 (ref4) [15095d8]
>                location             (sec_offset) location list [577acc]
> 
> And that location list is:
> 
>  [577acc]  0x0000000000000b85..0x0000000000000bab [   0] reg1
>            0x0000000000000bab..0x0000000000000bfb [   0] reg13
>            0x0000000000000bfb..0x0000000000000c01 [   0] GNU_entry_value:
>        [   0] reg1
>                                                   [   3] stack_value
>            0x0000000000000c01..0x0000000000000c11 [   0] reg13
> 
> I'm not sure why those address ranges are so low.  With regular readelf, it
> reads:
> 
>     00577acc ffffffff81195705 ffffffff8119572b (DW_OP_reg1 (rdx))
>     00577acc ffffffff8119572b ffffffff8119577b (DW_OP_reg13 (r13))
>     00577acc ffffffff8119577b ffffffff81195781 (DW_OP_GNU_entry_value:
> (DW_OP_reg1 (rdx)); DW_OP_stack_value)
>     00577acc ffffffff81195781 ffffffff81195791 (DW_OP_reg13 (r13))
>     00577acc <End of list>

eu-readelf prints the "raw" location list entries, which are begin and end address offsets from the base address. The base address is defined either by an earlier base address selection entry in the list or (more likely) by the base address of the compilation unit from which the location list entry is referenced.
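
As a rough illustration of that conversion, the Python sketch below adds a base address to the raw offsets from the eu-readelf output. The base value is not printed anywhere in this report; it is inferred here by subtracting the first raw offset (0xb85) from the first absolute address that binutils readelf shows (0xffffffff81195705), so treat it as an assumption rather than a value read from the CU.

```python
# Raw (begin, end) offsets as printed by eu-readelf for location list [577acc].
RAW_ENTRIES = [(0xb85, 0xbab), (0xbab, 0xbfb), (0xbfb, 0xc01), (0xc01, 0xc11)]

# Assumed CU base address: inferred from the two listings in this report,
# not read from the compilation unit's low_pc attribute.
CU_BASE = 0xffffffff81195705 - 0xb85

def absolute_ranges(raw_entries, base):
    """Turn base-relative offsets into absolute address ranges."""
    return [(base + begin, base + end) for begin, end in raw_entries]

for begin, end in absolute_ranges(RAW_ENTRIES, CU_BASE):
    print(f"{begin:016x} {end:016x}")
```

With this inferred base, the four computed ranges line up with the addresses in the binutils readelf listing quoted above.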

Comment 6 Frank Ch. Eigler 2013-01-18 16:05:59 UTC

Jakub advises this is a known problem in GCC 4.7, reported 2010-10-03.
Further testing indicates that this affects gdb too.

Comment 7 Josh Stone 2013-01-18 17:46:45 UTC

(In reply to comment #5)
> eu-readelf prints the "raw" location list entries, which are begin and end
> address offsets from the base address. The base address is defined either by
> an earlier selection entry or (more likely) the base address of the
> compilation unit from which the location list entry is referenced.

Hmm, OK.  IMO that's not very helpful to have raw numbers -- while we're off topic, is there a way to coax elfutils to expand those the way binutils does?

Comment 8 Mark Wielaard 2013-01-18 18:06:09 UTC

(In reply to comment #7)
> Hmm, OK.  IMO that's not very helpful to have raw numbers -- while we're off
> topic, is there a way to coax elfutils to expand those the way binutils does?

There is no direct mapping back from debug_loc to debug_info. And I don't immediately see a good way to know which offset ranges belong to which CU. It seems you have to do a full scan of all DIEs in a CU to find the attributes that have a locptr. Which seems a bit expensive. On the other hand we seem to do some of that already anyway to detect unused offset ranges, so maybe we can piggyback on that.
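
The full scan described here could look roughly like the Python sketch below. It uses a toy in-memory model (plain dicts), not the elfutils data structures or API; the attribute set LOC_ATTRS, the function index_loclists, and the CU base value (inferred from the listings earlier in this bug) are all assumptions made for illustration.

```python
# Toy model, not elfutils: a CU is {"base": int, "dies": [die, ...]} and a DIE
# maps attribute name -> (form, value).

# Attributes that may carry a pointer into .debug_loc (illustrative subset).
LOC_ATTRS = {"location", "frame_base"}

def index_loclists(cus):
    """Scan every DIE of every CU and map .debug_loc offset -> CU base address."""
    index = {}
    for cu in cus:
        for die in cu["dies"]:
            for name, (form, value) in die.items():
                if name in LOC_ATTRS and form == "sec_offset":
                    # First CU that references the offset wins.
                    index.setdefault(value, cu["base"])
    return index

# One CU holding the "count" parameter DIE from this report.
# The base address is an inferred assumption, not taken from the real CU DIE.
cus = [{
    "base": 0xffffffff81194b80,
    "dies": [{
        "name": ("strp", "count"),
        "decl_line": ("data2", 468),
        "location": ("sec_offset", 0x577acc),
    }],
}]

index = index_loclists(cus)
print({hex(off): hex(base) for off, base in index.items()})
```

The cost concern is visible in the shape of the loop: every attribute of every DIE is visited once, which is why reusing an existing pass over the DIEs is attractive.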

Comment 9 Mark Wielaard 2013-01-18 21:30:07 UTC

(In reply to comment #8)
> (In reply to comment #7)
> > Hmm, OK.  IMO that's not very helpful to have raw numbers -- while we're off
> > topic, is there a way to coax elfutils to expand those the way binutils does?
> 
> There is no direct mapping back from debug_loc to debug_info. And I don't
> immediately see a good way to know which offset ranges belong to which CU.
> It seems you have do a full scan of all DIEs in a CU to find the attributes
> that have a locptr. Which seems a bit expensive. On the other hand we seem
> to do some of that already anyway to detect unused offset ranges, so maybe
> we can piggyback on that.

Let's continue the off-topic part here: https://lists.fedorahosted.org/pipermail/elfutils-devel/2013-January/002881.html

Comment 10 Fedora End Of Life 2013-12-21 10:29:30 UTC

This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Mark Wielaard 2013-12-21 16:54:38 UTC

The original script that showed the error now works with kernel 3.12.5-302.fc20.x86_64, gcc-4.8.2-7.fc20.x86_64, and systemtap-2.4-1.fc20.x86_64.

Also, the example from the upstream gcc bug now works with gcc-4.8.2-7.fc20.x86_64 and gdb-7.6.50.20130731-16.fc20.x86_64:

Reading symbols from /tmp/a.out...done.
(gdb) break foo
Breakpoint 1 at 0x400690: file foo2.c, line 10.
(gdb) run
Starting program: /tmp/a.out 

Breakpoint 1, foo (a=7, b=4) at foo2.c:10
10	{

Note that the upstream bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54793 isn't closed yet; it is still in state NEW. But that is something for upstream, I guess. For Fedora this seems resolved.
