995446 – gcc -02 optmization regression in v4.8 causing malformed arg passing (32 bit arch only)

Bug 995446 - gcc -02 optmization regression in v4.8 causing malformed arg passing (32 bit arch only)

Summary: gcc -02 optmization regression in v4.8 causing malformed arg passing (32 bit ...

Keywords:
Status:	CLOSED UPSTREAM
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	gcc
Sub Component:
Version:	19
Hardware:	i386
OS:	Linux
Priority:	high
Severity:	urgent
Target Milestone:	---
Assignee:	Jakub Jelinek
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	996643
TreeView+	depends on / blocked

Reported:	2013-08-09 11:42 UTC by Peter Rajnoha
Modified:	2013-09-30 10:13 UTC (History)
CC List:	9 users (show)
Fixed In Version:
Clone Of:
Clones:	996643 (view as bug list)
Environment:
Last Closed:	2013-09-30 10:13:54 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
preprocessed lib/config/config.c from lvm2 source tree (267.38 KB, text/plain) 2013-08-12 11:09 UTC, Peter Rajnoha	no flags	Details
rh995446.i (2.44 KB, text/plain) 2013-08-13 09:31 UTC, Jakub Jelinek	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
GNU Compiler Collection	57459	0	None	None	None	Never

Description Peter Rajnoha 2013-08-09 11:42:15 UTC

We've just hit this. It seems that there's some regression in -O2 gcc optimization which causes the value to be malformed on function call (the value is incresed by one). We've hit this problem with lvm2 with recent gcc in F19 (gcc (GCC) 4.8.1 20130603 (Red Hat 4.8.1-1)).

Snippet from the log message for the workaround we use now in lvm2 to avoid this issue:

 gcc -O2 v4.8 on 32 bit architecture is causing a bug in parameter
    passing. It does not happen with -01 nor -O0.
    
    The problematic part of the code was strlen use in config.c in
    the config_def_check fn and the call for _config_def_check_tree in it:
    
    <snip>
      rplen = strlen(rp);
      if (!_config_def_check_tree(handle, vp, vp + strlen(vp), rp, rp + rplen, CFG_PATH_MAX_LEN - rplen, cn, cmd->cft_def_hash)) ...
    </snip>
    

   If compiled with -O0 (correct):
    
    Breakpoint 1, config_def_check (cmd=0x819b050, handle=0x81a04f8) at config/config.c:775
    (gdb) p	vp
    $1 = 0x8189ee0 <_cfg_path> "config"
    (gdb) p	strlen(vp)
    $2 = 6
    (gdb)
    _config_def_check_tree (handle=0x81a04f8, vp=0x8189ee0 <_cfg_path>
    "config", pvp=0x8189ee6 <_cfg_path+6> "", rp=0xbfffe1e8 "config",
    prp=0xbfffe1ee "", buf_size=58, root=0x81a2568, ht=0x81a65
    48) at config/config.c:680
    (gdb) p	vp
    $4 = 0x8189ee0 <_cfg_path> "config"
    (gdb) p	pvp
    $5 = 0x8189ee6 <_cfg_path+6> ""
    
    If compiled with -O2 (incorrect):
    
    Breakpoint 1, config_def_check (cmd=cmd@entry=0x8183050, handle=0x81884f8) at config/config.c:775
    (gdb) p	vp
    $1 = 0x8172fc0 <_cfg_path> "config"
    (gdb) p strlen(vp)
    $2 = 6
    (gdb) p	vp + strlen(vp)
    $3 = 0x8172fc6 <_cfg_path+6> ""
    (gdb)
    _config_def_check_tree (handle=handle@entry=0x81884f8, pvp=0x8172fc7
    <_cfg_path+7> "host_list", rp=rp@entry=0xbffff190 "config",
    prp=prp@entry=0xbffff196 "", buf_size=buf_size@entry=58, ht=0x
    818e548, root=0x818a568, vp=0x8172fc0 <_cfg_path> "config") at
    config/config.c:674
    (gdb) p	pvp
    $4 = 0x8172fc7 <_cfg_path+7> "host_list"
    
    The difference is in passing the "pvp" arg for _config_def_check_tree.
    While in the correct case, the value of _cfg_path+6 is passed
    (the result of vp + strlen(vp) - see the snippet of the code above),
    in the incorrect case, this value is increased by 1 to _cfg_path+7,
    hence totally malforming the string that is being processed.


See also https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=2f61478436a7ebe058ffad3f92e1c4dada0805d0. The workaround we provide for now is that the _cfg_path variable is not define with "const" qualifier (so we assume that the optimizer is less aggressive then - it actually helps).

The code itself: https://git.fedorahosted.org/cgit/lvm2.git/tree/lib/config/config.c. The breakpoint was set to the line just before the _config_def_check_tree_call in config_def_check function.

Please, let me know if you need any more info...

Comment 1 Peter Rajnoha 2013-08-09 11:45:32 UTC

(In reply to Peter Rajnoha from comment #0)

> ?id=2f61478436a7ebe058ffad3f92e1c4dada0805d0. The workaround we provide for
> now is that the _cfg_path variable is not define with "const" qualifier (so
> we assume that the optimizer is less aggressive then - it actually helps).

(sorry, I meant "static" qualifier, not the "const").

Also, shuffling the arg list for the _config_def_check_tree fn helped (but the workaround with removing the "static" is shorter...

Comment 2 Peter Rajnoha 2013-08-09 12:04:47 UTC

This is not reproducible with the older gcc v4.7.2 (the version that is currently in F18).

Comment 3 Jakub Jelinek 2013-08-12 09:35:56 UTC

Please provide preprocessed source for the source file in question and gcc command line options used to compile it.

Comment 4 Peter Rajnoha 2013-08-12 11:09:51 UTC

Created attachment 785648 [details]
preprocessed lib/config/config.c from lvm2 source tree

Preprocessed config.i attached, the complete option list used:

-Wall -Wundef -Wshadow -Wcast-align -Wwrite-strings -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -Winline -Wmissing-noreturn -Wformat-security -Wredundant-decls -Wpointer-arith -g -O2 -fPIC -save-temps

Comment 5 Jakub Jelinek 2013-08-12 19:37:13 UTC

I think I'd prefer to see this under debugger, how can one reproduce this under gdb and does it require any special setup?

Comment 6 Mikuláš Patočka 2013-08-13 01:20:43 UTC

The expanded strlen of an aligned string should look like this:
s_l:
.LFB12:
        .cfi_startproc
        movl    $b, %eax
.L2:
        movl    (%eax), %ecx
        addl    $4, %eax
        leal    -16843009(%ecx), %edx
        notl    %ecx
        andl    %ecx, %edx
        andl    $-2139062144, %edx
        je      .L2
        movl    %edx, %ecx
        shrl    $16, %ecx
        testl   $32896, %edx
        cmove   %ecx, %edx
        leal    2(%eax), %ecx
        cmove   %ecx, %eax
        addb    %dl, %dl
        sbbl    $3, %eax
        subl    $b, %eax
        ret

In the assembler that results from compiling the submitted file with
"gcc -m32 -O2 -g -fPIC config.i -S" we can see:

.L706:
        movl    (%edx), %ecx
        addl    $4, %edx
        leal    -16843009(%ecx), %eax
        notl    %ecx
        movl    %eax, 44(%esp)
        andl    %ecx, 44(%esp)
        andl    $-2139062144, 44(%esp)
        je      .L706
        movl    44(%esp), %eax
        movl    52(%esp), %ecx
        movl    %ecx, 60(%esp)
        movl    %eax, %ecx
        shrl    $16, %ecx
        testl   $32896, %eax
        movl    %ecx, 52(%esp)
        movl    %eax, %ecx
        leal    2(%edx), %eax
        cmove   52(%esp), %ecx
        cmove   %eax, %edx
        addb    %cl, 44(%esp)
        .loc 1 775 0
        movl    160(%esp), %eax
        .loc 1 776 0
        sbbl    $3, %edx
        subl    48(%esp), %edx
        .loc 1 775 0
        addl    56(%esp), %edx
        movl    %edx, %ecx

--- that "addb    %cl, 44(%esp)" seems to be wrong, it should be "addb %cl, %cl"

Comment 7 Jakub Jelinek 2013-08-13 09:31:10 UTC

Created attachment 786091 [details]
rh995446.i

Self-contained reduced executable testcase.

Comment 8 Jakub Jelinek 2013-08-13 09:41:02 UTC

Seems to be PR57459.

Comment 9 Jakub Jelinek 2013-09-30 10:13:54 UTC

Fixed in 4.8.1-6 and later.

Note You need to log in before you can comment on or make changes to this bug.