Bug 1539664
Summary: | [ppc64le] Setting breakpoint on function name doesn't work | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Severin Gehwolf <sgehwolf> | ||||
Component: | annobin | Assignee: | Nick Clifton <nickc> | ||||
Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | rawhide | CC: | fweimer, jan.kratochvil, mjw, nickc, pmuldoon, sergiodj | ||||
Target Milestone: | --- | Keywords: | Reopened | ||||
Target Release: | --- | ||||||
Hardware: | ppc64le | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | annobin-3.4-1.fc28 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2018-02-09 21:25:47 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Severin Gehwolf
2018-01-29 12:07:49 UTC
# nm -Cl /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-1.b14.fc28.ppc64le/openjdk/build/jdk8.build/images/j2sdk-image/jre/lib/ppc64le/server/libjvm.so | grep JavaCallWrapper 000000000060cfe0 t JavaCallWrapper::oops_do(OopClosure*) /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-1.b14.fc28.ppc64le/openjdk/hotspot/src/share/vm/runtime/javaCalls.cpp:144 000000000060cc30 t JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*) /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-1.b14.fc28.ppc64le/openjdk/hotspot/src/share/vm/runtime/javaCalls.cpp:47 000000000060cc30 t JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*) 000000000060cf30 t JavaCallWrapper::~JavaCallWrapper() /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-1.b14.fc28.ppc64le/openjdk/hotspot/src/share/vm/runtime/javaCalls.cpp:111 000000000060cf30 t JavaCallWrapper::~JavaCallWrapper() Thus, one of: break JavaCallWrapper::JavaCallWrapper break javaCalls.cpp:47 should work for setting a break point, but doesn't. To be honest, I'm not sure whether this is a gdb bug as this used to work with 8.0.50.20171204-33.fc28[1], but downgrading that in a local mock doesn't fix the problem at hand. Any pointers as to what the issue might be are appreciated. Sorry for the noise. I'll close as not-a-bug. [1] https://koji.fedoraproject.org/koji/buildinfo?buildID=1008667 So could you attach here a binary which fails for GDB? Provided a link to an image privately. FYI with gcc-7.3.1-1.fc28.ppc64le I got some missing functions linkage error. After upgrade to Rawhide gcc-8.0.1-0.6.fc28.ppc64le I get now: cc1: warning: command line option '-std=gnu++98' is valid for C++/ObjC++ but not for C (In reply to Jan Kratochvil from comment #6) > FYI with gcc-7.3.1-1.fc28.ppc64le I got some missing functions linkage error. > After upgrade to Rawhide gcc-8.0.1-0.6.fc28.ppc64le I get now: > cc1: warning: command line option '-std=gnu++98' is valid for C++/ObjC++ > but not for C Not sure what you are trying to do, but if you attempt to build java-1.8.0-openjdk on ppc64le on rawhide, you'll run into bug 1539812. The build warning should be benign. It is a valid bug. <JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+2296>: bl <JavaCallWrapper::JavaCallWrapper+8> => ld r4,96(r31) <JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*)>: <JavaCallWrapper::JavaCallWrapper>: addis r2,r12,101 <JavaCallWrapper::JavaCallWrapper+4>: addi r2,r2,-20016 <JavaCallWrapper::JavaCallWrapper+8>: mflr r0 ppc64 sometimes calls skipping some first instructions. GDB should handle that but in this case it does not - it places the breakpoint on the very first instruction. Probably something in rs6000-tdep.c skip_prologue(). It does affect also upstream trunk: GNU gdb (GDB) 8.1.50.20180131-git https://sourceware.org/ml/gdb-patches/2018-01/msg00669.html This may be related to the bug. (In reply to Sergio Durigan Junior from comment #9) > https://sourceware.org/ml/gdb-patches/2018-01/msg00669.html It does not fix it. And I believe it is a different issue. I have asked IBM upstream: PowerPC: Call of a function+8 address https://sourceware.org/ml/gdb/2018-01/msg00028.html It is because annobin started to create some new unusual symbols which confuse GDB: PASS: echo 'void f(void){}' >filename.c;/usr/bin/gcc -o filename.o -c filename.c ;readelf -Ws filename.o|grep filename.c 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS filename.c FAIL: echo 'void f(void){}' >filename.c;/usr/bin/gcc -o filename.o -c filename.c -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1;readelf -Ws filename.o|grep filename.c 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS filename.c 5: 0000000000000000 2 OBJECT LOCAL DEFAULT 1 filename.c 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT 1 filename.c_end The problem is that first function in a CU has the same address as "filename.c" data symbol (which has its address and it has its non-zero size). This disables ppc64le global vs. local entrypoint detection for first function in a file: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=591a12a1d4c8843343eb999145d8bcc1efedf408 Couldn't be the symbol of other type than a regular runtime-type STT_OBJECT? I am not aware how is it useful for some static analysis tools using annobin info but GDB is not happy from it. (In reply to Jan Kratochvil from comment #12) Hi Jan, > 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS filename.c > 5: 0000000000000000 2 OBJECT LOCAL DEFAULT 1 filename.c > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT 1 filename.c_end > > The problem is that first function in a CU has the same address as > "filename.c" data symbol (which has its address and it has its non-zero > size). This disables ppc64le global vs. local entrypoint detection for > first function in a file: > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff; > h=591a12a1d4c8843343eb999145d8bcc1efedf408 Is the problem that ppc_elfv2_elf_make_msymbol_special () only expects to find *one* symbol at a given PC address, and does not check to see if there are any others ? If so, would it be possible for the function to be extended to keep searching until it finds one with a non-zero at_other field ? (I am not sure how to do this though. I am not a GDB expert). > Couldn't be the symbol of other type than a regular runtime-type STT_OBJECT? It could, but isn't STT_OBJECT the correct type for this symbol ? The symbol is being used to mark that start of the code for a given object file (or CU if you prefer) inside a larger binary. Hence the STT_OBJECT type. The symbol that should be used to determine the global-vs-local entry point status is the function symbol for the address though, not the object symbool. So shouldn't lookup_minimal_symbol_by_pc_section_1() prefer symbols of type STT_FUNC over symbols of type STT_OBJECT ? (Can it do this ?) Cheers Nick Hi Nick, (In reply to Nick Clifton from comment #13) > Is the problem that ppc_elfv2_elf_make_msymbol_special () only expects > to find *one* symbol at a given PC address, and does not check to see if > there are any others ? GDB creates the ELF symbols correctly (correctly means that STT_OBJECT for the filename is NOT marked by MSYMBOL_TARGET_FLAG_1. The problem is that ppc_elfv2_skip_entrypoint() finds the STT_OBJECT symbol instead of the expected STT_FUNC symbol. > If so, would it be possible for the function to be extended to keep searching > until it finds one with a non-zero at_other field ? Sure it is a GDB bug. First I do not know why lookup_minimal_symbol_by_pc() ever returns data symbols when it is named "*_by_pc". Besides that ppc_elfv2_skip_entrypoint() could also iterate through all the symbols at that PC address if any of them does not have MSYMBOL_TARGET_FLAG_1 (but there isn't available such underlying symbol interating interface to do that). But touching such fragile GDB code always opens a can of worms so I wanted to clarify first if it is what we want to do. As I do not think there exists any real data variable spanning over all the text=code of the whole file. Besides that even if it did the text can be discontiguous which cannot be expressed by ELF symbols. And then there is already DW_AT_{{low,high}_pc,ranges} so I do not understand the purpose of these new ELF symbols myself which is why I am asking. DWARF is present the *-debuginfo.rpm the same way as this .symtab. As I do not think this new annobin extension will affect only GDB, various binary tools always had problems with overlapping address ranges, including elfutils for example. I am not happy with another incompatibility from mainstream Linux binary formats. For example current DWZ being used only by Red Hat is making Red Hat OSes incompatible to other binary tools (incl. LLVM). > It could, but isn't STT_OBJECT the correct type for this symbol ? No, there does not exist any runtime data entity covering the whole PC address range of a file. Maybe STT_FILE could be used for it or at least STT_NOTYPE or even best something new from STT_LOOS..STT_HIOS. > The symbol is being used to mark that start of the code for a given object > file (or CU if you prefer) inside a larger binary. Hence the STT_OBJECT type. STT_OBJECT: "a data object, such as a variable, an array, and so on." does not correspond to me for "start of the code" > The symbol that should be used to determine the global-vs-local entry point > status is the function symbol for the address though, not the object > symbool. So shouldn't lookup_minimal_symbol_by_pc_section_1() prefer > symbols of type STT_FUNC over symbols of type STT_OBJECT ? (Can it do this > ?) This is what I wrote above. Just I did not want to open a can of worms in GDB before a justification the new annobin symbol is really the thing we want to do as personally I do not think so. Maybe it could be discussed more on some list? Hi Jan, (In reply to Jan Kratochvil from comment #14) > Just I did not want to open a can of worms in > GDB before a justification the new annobin symbol is really the thing we > want to do as personally I do not think so. Maybe it could be discussed > more on some list? We could - but there is no annobin developers mailing list, and there are only two annobin developers (myself and Florian), so here would be just as good as anywhere else. > But touching such fragile GDB code always opens a can of worms so I wanted > to clarify first if it is what we want to do. OK, that makes sense. > STT_OBJECT: "a data object, such as a variable, an array, and so on." > does not correspond to me for "start of the code" *sigh* True, I was going by the name of the type, not its definition. OK - so if I change the symbol type, to say STT_LOOS + 1, would that actually help GDB ? There would still be multiple symbols at the same address. Do you know if GDB's mimisymbol reading code will skip OS specific symbols ? Cheers Nick (In reply to Nick Clifton from comment #15) > OK - so if I change the symbol type, to say STT_LOOS + 1, would that > actually help GDB ? Make that STT_NOTYPE. I have just discovered that GAS does not support setting symbols to arbitrary numeric values. (It does allow some values, but only ones that it recognises). Hi Nick, (In reply to Nick Clifton from comment #15) > OK - so if I change the symbol type, to say STT_LOOS + 1, would that > actually help GDB ? GDB does not read ELF symbols directly, it uses bfd/ for that. From bfd/ elf_slurp_symbol_table() I do not see it would ignore such symbols, they just do not get any special BSF_* flags. (In reply to Nick Clifton from comment #16) > Make that STT_NOTYPE. That should be the same for bfd/ ELF symbol reader. GDB will still consider it as mst_file_text due to 'sym->section->flags & SEC_CODE' in elf_symtab_read(). I would like to reuse existing STT_FILE - is it really a no go? Otherwise I can just post for GDB upstream something dealing with the current state of annobin. Still I agree STT_NOTYPE is more correct than STT_OBJECT. Hi Jan, > That should be the same for bfd/ ELF symbol reader. GDB will still consider > it as mst_file_text due to 'sym->section->flags & SEC_CODE' in > elf_symtab_read(). So basically changing the type of the symbols generated by annobin is not really going to help GDB, but it would help reduce confusion over the meaning/use of the symbol. Right ? > I would like to reuse existing STT_FILE - is it really a no go? Sadly yes. STT_FILE symbols have special semantics, and they have to be absolute. I need a symbol that is attached to a specific section and which will not confuse tools like GDB because they do not behave like normal file symbols. > Otherwise I can just post for GDB upstream something dealing with the > current state of annobin. Still I agree STT_NOTYPE is more correct than > STT_OBJECT. OK, I will make that change and update the annobin rpm. Cheers Nick Nick, could use a truly local symbol (.L… or whatever syntax they use)? I don't think these symbols have to go into .symtab at all. Hi Nick, (In reply to Nick Clifton from comment #18) > So basically changing the type of the symbols generated by annobin is > not really going to help GDB, but it would help reduce confusion over > the meaning/use of the symbol. Right ? Right. (In reply to Florian Weimer from comment #19) > Nick, could use a truly local symbol (.L… or whatever syntax they use)? I > don't think these symbols have to go into .symtab at all. That would be nice, GDB already ignores such symbols: || ((sym->flags & BSF_LOCAL) && sym->name[0] == '$' && sym->name[1] == 'L')) I expected the ELF symbol name should really match the filename. But such a prefix would fix that even for existing GDBs. Hi Guys, (In reply to Florian Weimer from comment #19) > Nick, could use a truly local symbol (.L… or whatever syntax they use)? I > don't think these symbols have to go into .symtab at all. Hmm, investigating.... I need symbols that do go into the symbol table, so that they can be extracted by the tools that parse the annobin notes. But it should not matter if they are local. I will run some tests. Cheers Nick I haven't tried with elfutils yet. But various elfutils based tools use a variant of dwfl_module_addrsym/dwfl_module_addrinfo/dwfl_module_addrname to match an address to the "closest" symbol/name. The rules to pick the closest symbol as follows: - Ignore symbols that: - have no name - have no section associated (SHN_UNDEF) - have an address/value larger than the one we are looking for - that are STT_SECTION, STT_FILE or STT_TLS. - A sized symbol is preferred to a sizeless symbol, if the address falls in the symbol range. - A symbol with address, smaller, but closer to the given address is preferred over a symbol with a lower address, or a more global symbol is preferred over a more local one (STB_GLOBAL > STB_WEAK > STB_LOCAL). - For sizeless symbol prefer the closest one that is in the same section as the address if we can determine that. - For a sized symbol, if the beginning of its range is no closer, prefer the one with the smallest range (that still contains the address). - If everything above is equal, prefer the first symbol found in the table. (In reply to Mark Wielaard from comment #22) > - A sized symbol is preferred to a sizeless symbol, if the address falls in > the symbol range. GDB also has this preference. Num: Value Size Type Bind Vis Ndx Name 35: 0000000000400536 2 OBJECT LOCAL DEFAULT 12 filename.c 36: 0000000000400538 0 NOTYPE LOCAL DEFAULT 12 filename.c_end Why is there both filename.c_end and the size of filename.c? Maybe if filename.c was sizeless we can still get the size from filename.c_end and it would not confuse the consumers so much. [Just read comment #23 - it came in whilst I was typing this. The end symbol is there so that the size of the start symbol can be calculated. But it would be possible to move the size to the end symbol and always have a zero size for the start symbol. I will run some tests locally]. Right. .L prefixed symbols do not work because they are too local, and end being completely stripped from the symbol table. :-( But I do have another idea... Jan would GDB be happy if the address of the start symbol was increased by one ? So for example in your dump from comment #13 if the STT_OBJECT symbol filename.c was now: 5: 0000000000000001 2 NOTYPE LOCAL DEFAULT 1 filename.c (ie assigned to address 0x1 not address 0x0) would that work ? Or would the symbol increment have to be larger than an instruction size ? What I could do is tweak the annobin plugin to change the address of these symbols so that they reference just after the start of the file. It also needs a tweak to the consumer of the annobin information (ie readelf) so that it knows to look for symbols that are 1 byte beyond the target address, but this can be done too. Would this work for you ? (In reply to Nick Clifton from comment #24) > But it would be possible to move the size to the end symbol and always have > a zero size for the start symbol. I expected both symbols would be sizeless and the size would be calculated by annobin-consumer (obviously as a difference between the two symbols). > Jan would GDB be happy if the address of the > start symbol was increased by one ? One can have 'func1:retq; func2:retq' (with -Os), that is ambiguous again. And I do not think it is too clean design. I am not trying to just deal with this bugreport - I could workaround it some way in GDB. I am just trying to prevent problems with binary tools in general in the future. But you have significantly more experience with all of this so I am not sure I am doing the right thing. (In reply to Jan Kratochvil from comment #25) > I expected both symbols would be sizeless and the size would be calculated > by annobin-consumer (obviously as a difference between the two symbols). Originally I only planned on having one symbol, and so that size was needed to calculate the end of the region it covered. Obviously now that I have two symbols this is no longer necessary, and my tests show that zero-sized symbols work for the decoding of the annotations, so I will check this change in. I expected Nick will CLOSE the Bug but I find the Comment 0 fixed now: https://koji.fedoraproject.org/koji/taskinfo?taskID=24898920 ppc64le: https://kojipkgs.fedoraproject.org//work/tasks/8920/24898920/build.log ... (gdb) (gdb) No source file named javaCalls.cpp. Breakpoint 1 (javaCalls.cpp:1) pending. ... + grep JavaCallWrapper::JavaCallWrapper gdb.out Thread 2 "java" hit Breakpoint 1, JavaCallWrapper::JavaCallWrapper ( 47 JavaCallWrapper::JavaCallWrapper(methodHandle callee_method, Handle receiver, JavaValue* result, TRAPS) { #0 JavaCallWrapper::JavaCallWrapper (this=0x7ffff660d9b8, callee_method=..., + grep sun.misc.Unsafe |