Bug 193740

Summary: libapreq2 code miscompiled with -O2: infinite loop
Product: [Fedora] Fedora Reporter: Bojan Smojver <bojan>
Component: gccAssignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 5CC: aoliva
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-09-22 16:44:57 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Intermediate file
none
Fix for libapreq2 none

Description Bojan Smojver 2006-06-01 02:05:21 UTC
Description of problem:
libapreq2 is part of Fedora Extras and it gets miscompiled with -O2 turned on.
The problem is in library/parse_multipart.c, lines 164 to 168. They look like this:

---------------------------
164: do {
165:     apr_bucket *f = APR_BRIGADE_FIRST(in);
166:     APR_BUCKET_REMOVE(f);
167:     APR_BRIGADE_INSERT_TAIL(out, f);
168: } while (e != APR_BRIGADE_FIRST(in));
---------------------------

Intermediate code for this looks like this:

---------------------------
do {
    apr_bucket *f = (&(in)->list)->next;
    do { do { (((((f))))->link.prev)->link.next = ((((f))))->link.next;
(((((f))))->link.next)->link.prev = ((((f))))->link.prev; } while (0);
do { (f)->type->destroy((f)->data); (f)->free(f); } while (0); } while
(0);
} while ((&(in)->list)->next != e);
---------------------------

With -O2, the code turns into:

---------------------------
.LBB4:
        .loc 1 165 0
        movl    -56(%ebp), %edx
        movl    $0, -36(%ebp)
        movl    4(%edx), %ecx
.LVL27:
        cmpl    %ecx, -44(%ebp)
        .p2align 4,,7
.L30:
        .loc 1 166 0
        movl    4(%ecx), %edx
        movl    (%ecx), %eax
        movl    %eax, (%edx)
        movl    (%ecx), %eax
.LBB5:
        .loc 1 167 0
        movl    %edi, (%ecx)
.LBE5:
        .loc 1 166 0
        movl    %edx, 4(%eax)
.LBB6:
        .loc 1 167 0
        movl    4(%edi), %eax
        movl    %eax, 4(%ecx)
        movl    4(%edi), %eax
        movl    %ecx, 4(%edi)
        movl    %ecx, (%eax)
.LBE6:
.LBE4:
        .loc 1 168 0
        jne     .L30
        jmp     .L19
---------------------------

Note how in the above jne is to .L30, which is *after* cmpl. In other words,
comparison happens once before the loop only. Which then turns the loop into an
infinite one.

If the same code is compiled with -O2 -fno-strict-aliasing, the assembly is:

---------------------------
.L28:
.LBB4:
        .loc 1 165 0
        movl    -56(%ebp), %edx
        movl    4(%edx), %eax
.LVL25:
        .loc 1 166 0
        movl    4(%eax), %ecx
        movl    (%eax), %edx
        movl    %edx, (%ecx)
        movl    (%eax), %edx
.LBB5:
        .loc 1 167 0
        movl    %esi, (%eax)
.LBE5:
        .loc 1 166 0
        movl    %ecx, 4(%edx)
.LBB6:
        .loc 1 167 0
        movl    4(%esi), %edx
        movl    %edx, 4(%eax)
        movl    4(%esi), %edx
        movl    %eax, 4(%esi)
        movl    %eax, (%edx)
.LBE6:
.LBE4:
        .loc 1 168 0
        movl    -56(%ebp), %ecx
        movl    -44(%ebp), %eax
.LVL26:
        cmpl    4(%ecx), %eax
        jne     .L28
        jmp     .L43
---------------------------

The above works fine, as cmpl is done every time through the loop.


Version-Release number of selected component (if applicable):
gcc-4.1.1-1.fc5

How reproducible:
Always.

Steps to Reproduce:
1. Download http://people.apache.org/~pgollucci/apreq2/libapreq2-2.08-RC2.tar.gz
2. Run configure and make against APR/APU trunk (i.e. 1.3.0)
3. Run make test and watch it hang.
  
Actual results:
Infinite loop, 100% CPU utilisation.

Expected results:
Works OK with -O2 -fno-strict-aliasing. Should it be like that with just -O2?

Additional info:

Comment 1 Bojan Smojver 2006-06-01 03:04:55 UTC
Sorry, stuffed up the intermediate code. The correct one is here:

-------------------------------
            do {
                apr_bucket *f = (&(in)->list)->next;
                do { (((((f))))->link.prev)->link.next = ((((f))))->link.next;
(((((f))))->link.next)->link.prev = ((((f))))->link.prev; } while (0);
                do { apr_bucket *ap__b = (f); do { ((((ap__b))))->link.next =
((struct apr_bucket *)((char *)(((&(out)->list))) - ((long) (((char *)
(&(((struct apr_bucket*)((void *)0))->link))) - ((char *) ((void *)0))))));
((((ap__b))))->link.prev = (((struct apr_bucket *)((char *)(((&(out)->list))) -
((long) (((char *) (&(((struct apr_bucket*)((void *)0))->link))) - ((char *)
((void *)0)))))))->link.prev; ((((struct apr_bucket *)((char
*)(((&(out)->list))) - ((long) (((char *) (&(((struct apr_bucket*)((void
*)0))->link))) - ((char *) ((void *)0)))))))->link.prev)->link.next =
(((ap__b))); (((struct apr_bucket *)((char *)(((&(out)->list))) - ((long)
(((char *) (&(((struct apr_bucket*)((void *)0))->link))) - ((char *) ((void
*)0)))))))->link.prev = (((ap__b))); } while (0); ; } while (0);
            } while (e != (&(in)->list)->next);
-------------------------------

Thanks to Joe Schaefer for spotting this!

Comment 2 Jakub Jelinek 2006-06-01 07:33:45 UTC
Please preprocess parse_multipart.c (either by replacing -c with -E, or adding
-save-temps to the options) and attach parse_multipart.i here.
Also, please mention all the command line options used.


Comment 3 Bojan Smojver 2006-06-01 09:54:32 UTC
Created attachment 130327 [details]
Intermediate file

Comment 4 Bojan Smojver 2006-06-01 09:55:16 UTC
Compiled with:

gcc -DHAVE_CONFIG_H -I. -I. -I../include -I/usr/local/apache2/include/apr-1
-I/usr/include/mysql -DLINUX=2 -D_REENTRANT -D_GNU_SOURCE -D_LARGEFILE64_SOURCE
-O2 -g -Wall -fstack-protector -MT parser_multipart.lo -MD -MP -MF
.deps/parser_multipart.Tpo -c parser_multipart.c  -fPIC -DPIC -save-temps -o
.libs/parser_multipart.o

Comment 5 Jakub Jelinek 2006-06-01 16:56:52 UTC
I think it is quite probable this violates strict aliasing, though I'm not 100%
sure, Alex, what do you think about that? Here is a simplified code, with
the #if 1 part aliasing clearly decides in->list.next couldn't have been
modified, while in #else part it assumes it could.
struct apr_bucket_brigade {
  void *p;
  struct apr_bucket_list { struct apr_bucket *next; struct apr_bucket *prev; }
list;
  void *bucket_alloc;
};

struct apr_bucket {
  struct { struct apr_bucket *next; struct apr_bucket *prev; } link;
  const void *type;
  __SIZE_TYPE__ length;
};

void
foo (struct apr_bucket_brigade *out, struct apr_bucket_brigade *in)
{
  struct apr_bucket * e = (struct apr_bucket *) in->list.next;
  do
    {
      struct apr_bucket * f = (struct apr_bucket *) in->list.next;

      f->link.prev->link.next = f->link.next;
      f->link.next->link.prev = f->link.prev;

      f->link.next = (struct apr_bucket *) (char *) &out->list;
      f->link.prev = ((struct apr_bucket *) (char *) &out->list)->link.prev;
#if 1
      ((struct apr_bucket *) (char *) &out->list)->link.prev->link.next = f;
      ((struct apr_bucket *) (char *) &out->list)->link.prev = f;
#else
      out->list.prev->link.next = f;
      out->list.prev = f;
#endif
    }
  while (e != in->list.next);
}



Comment 6 Bojan Smojver 2006-06-01 21:58:16 UTC
Ditto x86_64 and libapreq2 2.07.

Comment 7 Bojan Smojver 2006-06-02 01:43:02 UTC
Reports with:

FreeBSD 6.1-RELEASE
gcc version 4.1.2 20060526 (prerelease)

suggest that this may be an FC specific issue (i.e. code compiles fine,
apparently). Or maybe even that this was fixed in 4.1 series already.

Comment 8 Jakub Jelinek 2006-06-02 09:00:29 UTC
As many other strict aliasing violations (if this is one), this is highly
scheduling etc. dependent, so whether you can or can't reproduce it on other
system is irrelevant.  I can reproduce the problem with current x86_64-linux GCC
trunk e.g.

Comment 9 Alexandre Oliva 2006-06-02 09:20:58 UTC
Mainline still behaves the same way, so I doubt it's some fix in the 4.1 series.

Clearly list and link don't have the same types, so it might be just fair if
accesses to them were regarded as non-aliasing, however even if I use struct
apr_bucket_list as the type for link, it still fails in the very same way, so
that's not the reason for the problem.

Removing the intermediate (char*) casts, that are there only to silence
type-punning warnings, I do get warnings for the casts, but if I make the change
of using struct apr_bucket_list as the type for link, then the warnings are
gone, but the code still fails.  This might be a GCC bug.

In either case, I do see that, because the pointer to link is cast not to the
type of the list member, but to the type of the enclosing struct, then when the
pointer is dereferenced, we do access the stored value of the object with a type
different from its effective type, and that invokes undefined behavior.  AFAICT,
it doesn't matter that we only access a sub-object of a type that is
structurally equivalent, or even the same type; as soon as we use an
incompatible enclosing type, we're breaking the rules, and GCC is entitled to
follow the rules to make the code break faster.

Comment 10 Bojan Smojver 2006-06-15 05:25:54 UTC
One thing seems strange in the intermediate code:

------------------------------
((long) (((char *) (&(((struct apr_bucket*)((void *)0))->link))) - ((char *)
((void *)0))))
------------------------------

The above code is most likely this from apr_general.h:

------------------------------
#define APR_OFFSET(p_type,field) \
        ((long) (((char *) (&(((p_type)NULL)->field))) - ((char *) NULL)))
------------------------------

I was under the impression that on Linux, there is offsetof that can be used.
So, I created a little test program:

------------------------------
#include <stdio.h>
#include <stddef.h>

int main(int argc,char **argv){
  struct test{
    int a;
    char *b;
  };

  printf("%lu\n",offsetof(struct test,b));

  return 0;
}
------------------------------

Compiled it:

------------------------------
gcc -Wall -save-temps -ansi -o test test.c
------------------------------

This give intermediate code:

------------------------------
int main(int argc,char **argv){
  struct test{
    int a;
    char *b;
  };

  printf("%lu\n",__builtin_offsetof (struct test, b));

  return 0;
}
------------------------------

Sooo, why isn't APR code using that instead? No idea...

BTW, the output of the program on an x86_64 box is 8.

Comment 11 Bojan Smojver 2006-06-15 06:34:12 UTC
Here is the intermediate code when using native offsetof():

----------------------------------------
            do {
                apr_bucket *f = (&(in)->list)->next;
                do { (((((f))))->link.prev)->link.next = ((((f))))->link.next;
(((((f))))->link.next)->link.prev = ((((f))))->link.prev; } while (0);
                do { apr_bucket *ap__b = (f); do { ((((ap__b))))->link.next =
((struct apr_bucket *)((char *)(((&(out)->list))) - __builtin_offsetof (struct
apr_bucket, link))); ((((ap__b))))->link.prev = (((struct apr_bucket *)((char
*)(((&(out)->list))) - __builtin_offsetof (struct apr_bucket,
link))))->link.prev; ((((struct apr_bucket *)((char *)(((&(out)->list))) -
__builtin_offsetof (struct apr_bucket, link))))->link.prev)->link.next =
(((ap__b))); (((struct apr_bucket *)((char *)(((&(out)->list))) -
__builtin_offsetof (struct apr_bucket, link))))->link.prev = (((ap__b))); }
while (0); ; } while (0);
            } while (e != (&(in)->list)->next);
----------------------------------------

However, relevant assembly is not much better (code still hangs in an infinite
loop):

----------------------------------------
.L19:
        movq    56(%rsp), %rbp
.LVL27:
        movq    24(%rsp), %rdi
        movq    %r12, %rdx
        movq    %r14, %rsi
        subq    %r15, %rbp
        cmpq    %r12, %rbp
        cmovbe  %rbp, %rdx
        addq    %r15, %rdi
        call    strncmp@PLT
.LVL28:
        testl   %eax, %eax
        je      .L47
        .loc 1 159 0
        testq   %r15, %r15
        je      .L27
.LBB3:
        .loc 1 165 0
        movq    40(%rsp), %rax
        xorl    %r15d, %r15d
        movq    8(%rax), %rcx
.LVL29:
        cmpq    %rcx, %r13
        .p2align 4,,7
.L29:
        .loc 1 166 0
        movq    8(%rcx), %rdx
        movq    (%rcx), %rax
        movq    %rax, (%rdx)
        movq    (%rcx), %rax
.LBB4:
        .loc 1 167 0
        movq    %rbx, (%rcx)
.LBE4:
        .loc 1 166 0
        movq    %rdx, 8(%rax)
.LBB5:
        .loc 1 167 0
        movq    8(%rbx), %rax
        movq    %rax, 8(%rcx)
        movq    8(%rbx), %rax
        movq    %rcx, 8(%rbx)
        movq    %rcx, (%rax)
.LBE5:
.LBE3:
        .loc 1 168 0
        jne     .L29
        jmp     .L19
----------------------------------------

So, it's not the use of non-native offsetoff() that is doing this...

I went a step further and redefined like this (i.e. no (char*) cast):

----------------------------------------
#define APR_RING_SENTINEL(hp, elem, link)                               \
    (struct elem *)((hp) - APR_OFFSETOF(struct elem, link))
----------------------------------------

Now during compilation I dutifully get:

----------------------------------------
parser_multipart.c:167: warning: dereferencing type-punned pointer will break
strict-aliasing rules
----------------------------------------

Looks like the (char*) cast gets rid of the compiler warnings but not really the
underlying problem.

Maybe these macros should be rewritten?

Comment 12 Jakub Jelinek 2007-09-22 16:44:57 UTC
If the macros in libapreq2 have not been rewritten yet, they should.

Comment 13 Bojan Smojver 2007-09-22 23:02:09 UTC
I think there has been some discussion about this in APR, but I don't remember
seeing a commit. Anyway, here is the thread:

http://mail-archives.apache.org/mod_mbox/apr-dev/200708.mbox/%3cad8a87120708151140j2a839772i96ee303a82fe3a19@mail.gmail.com%3e

For now, we're just building libapreq2 with -fno-strict-aliasing.

Comment 14 Bojan Smojver 2008-06-05 03:54:38 UTC
Created attachment 308411 [details]
Fix for libapreq2

This patch makes things work on Fedora 9 with GCC 4.3.0.