Bug 630166
Summary: [6.1 FEAT] Large-TOC support
Product: Red Hat Enterprise Linux 6
Reporter: IBM Bug Proxy <bugproxy>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA
QA Contact: qe-baseos-tools-bugs
Severity: high
Priority: high
Version: 6.1
CC: cward, ddumas, jjarvis, mfranc, nobody+PNT0273897, pmuller, sbest, sglass
Target Milestone: beta
Keywords: FutureFeature, OtherQA
Target Release: 6.1
Hardware: ppc64
OS: All
Fixed In Version: gcc-4.4.5-6.el6
Doc Type: Enhancement
Doc Text:
These updated packages provide support for the "mcmodel=medium" and "-mcmodel=large" options on the 64-bit PowerPC architecture. These new options provide the ability to extend the TOC addressing space up to 2GB.
Clone Of:
: 663587
Last Closed: 2011-05-19 13:57:48 UTC
Bug Blocks: 538808, 580566, 663587
Description
IBM Bug Proxy
2010-09-03 21:12:03 UTC
IBM is signed up to test and provide feedback. Setting OtherQA
------- Comment From rsisk.com 2010-10-04 10:36 EDT-------
Code Upstream Status: Accepted

I think this is only acceptable if the default stays to be -mcmodel=small instead of -mcmodel=medium and the patch is further modified with a bunch of extra cmodel != CMODEL_SMALL checks to make sure there is really zero or minimal impact when not using explicit -mcmodel=medium or -mcmodel=large gcc options. I guess the crt files would need to be compiled with -mcmodel=medium, but other than that, libraries in the distro would stay compiled with -mcmodel=small. Is this acceptable/worth it for IBM? A compiler defaulting to -mcmodel=medium can be in gcc46 (or later) packages.
------- Comment From sjmunroe.com 2010-12-02 10:26 EDT-------
This large-TOC is very important for P7 and more so with the next generation of POWER processors, so not having -mcmodel=medium as the default would be disappointing. We believe that the new default is backward compatible with existing binaries but I will let Alan explain the specifics.

Large-TOC is not a performance issue (if anything, it just slows things down tiny bit if I understand things right), so it is only important for very large shared libraries or very large binaries. For those surely the users can just add additional gcc flags when their libraries or binaries fail to link (or replace -mminimal-toc or similar with -mcmodel=medium or -mcmodel=large).
On a testcase like:

    void f1 (void) __attribute__((visibility ("hidden")));
    void f2 (void);
    static __attribute__((noinline, noclone)) void f3 (void) { asm (""); }
    int i;
    extern int j;
    static int k;
    extern int l __attribute__((visibility ("hidden")));
    int foo (void)
    {
      f1 ();
      f2 ();
      f3 ();
      return i + j + k;
    }

the difference between -mcmodel=small and -mcmodel=medium is:

    -   ld 11,.LC0@toc(2)
    +   addis 11,2,.LC0@toc@ha
    +   addis 9,2,.LC1@toc@ha
    +   ld 11,.LC0@toc@l(11)
    +   ld 9,.LC1@toc@l(9)
        addi 1,1,112
    -   ld 9,.LC1@toc(2)

so -mcmodel=medium is larger and slower, but what is much more important is that it significantly affects scheduling etc.; basically we'd ship a different compiler in RHEL6.1 from RHEL6.0. For users who prefer stability that is not a good idea. We'll be shipping newer gcc versions for RHEL, so anyone interested in the latest and greatest features can just use those compilers; changes to the system gcc should be limited to low risk changes.
------- Comment From sjmunroe.com 2010-12-02 11:45 EDT-------
>Large-TOC is not a performance issue (if anything, it just slows things down
>tiny bit if I understand things right)
Not necessarily true. Your example does not show the advantage of TOC relative addressing for local static. Especially important for large C++ applications. Also you did not look at the additional optimization implemented in the linker (ld, Binutils 2.21).
Also for micro-architectures that have higher load-to-load latencies than FXU-to-load (like P6) the medium model can have a performance advantage.
And finally for the next generation after P7 the medium model will be a performance advantage.
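The addis/ld pairs in the -mcmodel=medium output quoted earlier reach the TOC through a signed 32-bit offset that is split into a high-adjusted half (@toc@ha) and a low half (@toc@l), where the small model uses a single 16-bit displacement from r2. The following is a minimal C sketch of that arithmetic, assuming the usual PowerPC @ha/@l definitions. The function names are made up for illustration and are not part of gcc or binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Low half, as sym@toc@l would yield: the low 16 bits of the offset,
   sign-extended the way the load's displacement field is. */
static int32_t toc_lo(int32_t off)
{
    int32_t lo = (int32_t)((uint32_t)off & 0xffffu);
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* High-adjusted half, as sym@toc@ha would yield: whatever is left after
   removing the sign-extended low half, in units of 65536. */
static int32_t toc_ha(int32_t off)
{
    return (int32_t)(((int64_t)off - toc_lo(off)) / 65536);
}

/* Model of "addis rT,2,ha ; ld rD,lo(rT)": the address the ld reads. */
static int64_t medium_model_address(int64_t r2, int32_t off)
{
    int64_t tmp = r2 + (int64_t)toc_ha(off) * 65536; /* addis: ha << 16 */
    return tmp + toc_lo(off);                        /* ld: add low half */
}

/* Small model: a single ld with a signed 16-bit displacement from r2. */
static int small_model_reaches(int32_t off)
{
    return off >= -32768 && off <= 32767;
}

/* Split form: the high half must itself fit a signed 16-bit immediate. */
static int medium_model_reaches(int32_t off)
{
    int32_t ha = toc_ha(off);
    return ha >= -32768 && ha <= 32767;
}
```

In this sketch an offset of 40000 is out of reach for the single-instruction form but is reconstructed exactly by the two-instruction form, at the cost of the extra addis that the diff shows.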
------- Comment From bergner.com 2010-12-06 18:23 EDT-------
Jakub wrote:
> I think this is only acceptable if the default stays to be -mcmodel=small
> instead of -mcmodel=medium
We (IBM) concede the -mcmodel=medium default and agree to keeping -mcmodel=small as the default. I'll leave it to Alan to answer your other questions/concerns.
------- Comment From amodra.com 2010-12-06 22:47 EDT-------
Jakub, can you tell me where you are concerned that the mcmodel support should have extra TARGET_CMODEL != CMODEL_SMALL tests? The LO_SUM in legitimate_constant_pool_address_p? That won't be confused with other LO_SUMs due to toc_relative_expr_p unspec test. Similarly for the LO_SUM in rs6000_delegitimize_address. If you accept legitimate_constant_pool_address_p is OK with respect to LO_SUMs, then that makes the rs6000_mode_dependent_address and print_operand_address changes OK too. The rs6000.md changes all test TARGET_CMODEL, except for cmptf_internal2 and I can't imagine that worries you a great deal. All in all, I was quite surprised when developing the patch just how well separated the mcmodel support was; I expected that I'd need to do major surgery to the existing TOC code.
Created attachment 465134
gcc patches for powerpc mcmodel
------- Comment on attachment From amodra.com 2010-12-06 22:51 EDT-------
Jakub, you already have this set of patches.
Yeah, in:

     bool
    -legitimate_constant_pool_address_p (rtx x)
    +legitimate_constant_pool_address_p (const_rtx x, bool strict)
     {
       return (TARGET_TOC
    -          && GET_CODE (x) == PLUS
    +          && (GET_CODE (x) == PLUS || GET_CODE (x) == LO_SUM)
               && GET_CODE (XEXP (x, 0)) == REG
    -          && (TARGET_MINIMAL_TOC || REGNO (XEXP (x, 0)) == TOC_REGISTER)
    +          && (REGNO (XEXP (x, 0)) == TOC_REGISTER
    +              || ((TARGET_MINIMAL_TOC
    +                   || TARGET_CMODEL != CMODEL_SMALL)
    +                  && INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict)))
               && toc_relative_expr_p (XEXP (x, 1)));
     }

I was worried both about the == LO_SUM addition, but also about the TARGET_MINIMAL_TOC change (it now requires INT_REG_OK_FOR_BASE_P even for TARGET_MINIMAL_TOC, although there was nothing like that before).
In rs6000_delegitimize_address, again the == LO_SUM addition and TARGET_MINIMAL_TOC addition.
In rs6000_mode_dependent_address the LO_SUM case addition.
The print_operand_address change.
The rs6000_output_addr_const_extra change isn't also obviously related to != CMODEL_SMALL.
rs6000_generate_compare which adds an extra scratch clobber unconditionally and *cmptf_internal2 change.
Created attachment 465249
binutils patches for mcmodel support
------- Comment on attachment From amodra.com 2010-12-06 22:54 EDT-------
This patch is against binutils-2.20.51.0.2-5.11.el6.src.rpm, in case you don't wish to upgrade rhel6 to binutils-2.21.
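The behavioural change in legitimate_constant_pool_address_p that the review comments discuss can be made concrete with a small truth-table model. The sketch below is not gcc code: it replaces the rtx tests from the quoted diff with plain booleans, and every type and function name in it is hypothetical. It only shows which inputs the old and new conditions accept.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for what the real predicate reads off the rtx. */
enum addr_code { ADDR_PLUS, ADDR_LO_SUM, ADDR_OTHER };
enum cmodel { MODEL_SMALL, MODEL_MEDIUM, MODEL_LARGE };

struct addr_facts {
    enum addr_code code;  /* GET_CODE (x) */
    bool base_is_reg;     /* GET_CODE (XEXP (x, 0)) == REG */
    bool base_is_toc_reg; /* REGNO (XEXP (x, 0)) == TOC_REGISTER */
    bool base_ok;         /* INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict) */
    bool toc_relative;    /* toc_relative_expr_p (XEXP (x, 1)) */
};

struct target_facts {
    bool toc;             /* TARGET_TOC */
    bool minimal_toc;     /* TARGET_MINIMAL_TOC */
    enum cmodel model;    /* TARGET_CMODEL */
};

/* Condition on the "-" side of the diff. */
static bool pool_addr_old(struct addr_facts a, struct target_facts t)
{
    return t.toc
        && a.code == ADDR_PLUS
        && a.base_is_reg
        && (t.minimal_toc || a.base_is_toc_reg)
        && a.toc_relative;
}

/* Condition on the "+" side of the diff. */
static bool pool_addr_new(struct addr_facts a, struct target_facts t)
{
    return t.toc
        && (a.code == ADDR_PLUS || a.code == ADDR_LO_SUM)
        && a.base_is_reg
        && (a.base_is_toc_reg
            || ((t.minimal_toc || t.model != MODEL_SMALL) && a.base_ok))
        && a.toc_relative;
}
```

Under this model the two conditions agree for a PLUS off the TOC register in the small model. They differ in the two places raised in the review: a TOC-relative LO_SUM is now accepted, and a -mminimal-toc address whose base register fails the base-register check is now rejected.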
Created attachment 465250
Patch to make -mcmodel=small the default
------- Comment on attachment From bergner.com 2010-12-07 09:57 EDT-------
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Set
CMODEL_SMALL as default.
------- Comment From amodra.com 2010-12-07 21:53 EDT-------
> I was worried both about the == LO_SUM addition, but also about the

As I mentioned before, the toc_relative_expr_p test will ensure that this LO_SUM does not match any other LO_SUM generated by the powerpc backend. If you like, you can add a TARGET_CMODEL test there but it isn't necessary.

> TARGET_MINIMAL_TOC change (it now requires INT_REG_OK_FOR_BASE_P even for
> TARGET_MINIMAL_TOC, although before there was nothing like that before.

I believe this is actually a TARGET_MINIMAL_TOC bugfix. See http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00747.html for an example of why INT_REG_OK_FOR_BASE_P is needed for cmodel != small. I think the same situation, ie. the pseudo doesn't get a hard reg (or gets r0) can occur for TARGET_MINIMAL_TOC, and we are relying on the register allocator just happening to make the right choice. The insn that loads the reg used here, load_toc_aix_{di,si}, really ought to have "=b" constraint.

> In rs6000_delegitimize_address, again the == LO_SUM addition and
> TARGET_MINIMAL_TOC addition.

Well, we won't be messing with TARGET_MACHO code later in the function (even if you care) because TARGET_MACHO doesn't have minimal-toc or cmodel. Again, we won't match other LO_SUMs, because they won't have UNSPEC_TOCREL. Similar to legitimate_constant_pool_address_p it wouldn't hurt to add a TARGET_CMODEL test for the LO_SUM, but I'm quite sure it isn't necessary. The TARGET_MINIMAL_TOC addition is again a bugfix!

> In rs6000_mode_dependent_address the LO_SUM case addition.

Again, a TARGET_CMODEL test is unnecessary but won't hurt.

> The print_operand_address change.

Yeah, well, rearranging the order of matching against other LO_SUMs and legitimate_constant_pool_address_p is needed since otherwise a cmodel != small toc load would match the LO_SUM case. The correctness here relies on legitimate_constant_pool_address_p being correct, ie. *not* matching the "other LO_SUMs".
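The remark above about a pseudo that "gets r0" rests on a PowerPC encoding rule: in a load such as ld RT,disp(RA), an RA field of 0 means the constant zero rather than the contents of r0, so r0 cannot act as a base register. The C sketch below models only that rule. The function names are hypothetical, and usable_as_base is a rough analogue of a base-register check, not the gcc macro.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of "ld RT,disp(RA)": an RA field of 0 selects the
   literal value 0, not the contents of general-purpose register r0. */
static uint64_t load_effective_address(const uint64_t gpr[32],
                                       unsigned ra, int16_t disp)
{
    uint64_t base = (ra == 0) ? 0 : gpr[ra];
    return base + (uint64_t)(int64_t)disp;
}

/* Rough, hypothetical analogue of a hard-register base check:
   r1..r31 can serve as a base, r0 cannot. */
static int usable_as_base(unsigned regno)
{
    return regno >= 1 && regno <= 31;
}
```

If the register allocator placed the TOC pointer copy in r0, the load in this model would silently address from zero, which is the failure that a "=b" constraint or a base-register check rules out.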
> The rs6000_output_addr_const_extra change isn't also obviously related to !=
> CMODEL_SMALL.

That's part of the hack to stop us emitting x@toc+const@l(r). I don't think this will trigger on TARGET_MINIMAL_TOC code, but if it does, the effect would be to emit label+8-.LCTOC(r) rather than label-.LCTOC+8(r). I do know the code will trigger on no-minimal-toc, giving x+8@toc(2) rather than x@toc+8(2). Someone hacked gas a long time ago to accept the latter expression rather than fixing gcc. Not that I'm proud of my hack to get the order of the offset correct. It really should be done by rewriting the rtl.

> rs6000_generate_compare which adds an extra scratch clobber unconditionally and
> *cmptf_internal2 change.

Matters only if you care about TARGET_XL_COMPAT with long double comparisons. I don't think the extra scratch should hurt too much code.
------- Comment From bergner.com 2010-12-13 23:22 EDT-------
Jakub, after Alan's comment, do you have any more concerns or are we good?

I've committed this to redhat/gcc-4_4-branch: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167861 (the merge didn't apply cleanly because tls markers aren't supported in 4.4/4.4-RH and "d" constraint doesn't exist there either). Will test with a bootstrap/regtest today (so far just cross compiler tested). The binutils support isn't there yet though. Some small tweaks to build crtfiles with -mcmodel=medium (as -mcmodel=small is the default) will still be needed.
Created attachment 476106
power6/7 mcmodel fix
------- Comment on attachment From amodra.com 2011-01-30 21:41 EDT-------
http://gcc.gnu.org/ml/gcc-patches/2011-01/msg02167.html

I've backported today two large-TOC bugfixes from the trunk: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170130

This enhancement request was evaluated by the full Red Hat Enterprise Linux team for inclusion in a Red Hat Enterprise Linux minor release.
As a result of this evaluation, Red Hat has tentatively approved inclusion of this feature in the next Red Hat Enterprise Linux Update minor release. While it is a goal to include this enhancement in the next minor release of Red Hat Enterprise Linux, the enhancement is not yet committed for inclusion in the next minor release pending the next phase of actual code integration and successful Red Hat and partner testing.

~~ Partners and Customers ~~

This bug was included in RHEL 6.1 Beta. Please confirm the status of this request as soon as possible. If you're having problems accessing 6.1 bits, are delayed in your test execution or find in testing that the request was not addressed adequately, please let us know. Thanks!
------- Comment From sglass.com 2011-04-06 09:15 EDT-------
This feature has been verified by IBM.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
support for -mcmodel=medium and -mcmodel=large on PowerPC 64-bit has been added. The default model is still the small model as before.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-support for -mcmodel=medium and -mcmodel=large on PowerPC 64-bit has been added. The default model is still the small model as before.
+These updated packages provide support for the "mcmodel=medium" and "-mcmodel=large" options on the 64-bit PowerPC architecture. These new options provide the ability to extend the TOC addressing space up to 2GB.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below.
You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2011-0663.html