Bug 484362 - gcc (or glibc) memset: starts writing 4 bytes earlier then given pointer
gcc (or glibc) memset: starts writing 4 bytes earlier then given pointer
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: SDL (Show other bugs)
rawhide
All Linux
low Severity medium
: ---
: ---
Assigned To: Thomas Woerner
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-02-06 09:12 EST by Hans de Goede
Modified: 2009-03-02 12:00 EST (History)
5 users (show)

See Also:
Fixed In Version: 1.2.13-7.fc10
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-03-02 12:00:29 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Hans de Goede 2009-02-06 09:12:54 EST
Description of problem:
While debugging bug 484121, I've come to the conclusion that one certain memset call in libSDL sometimes does not start writing at the given pointer, but instead starts writing 4 bytes earlier.

I've managed to reproduce bug 484121 using an i386 rawhide install on a intel q8200 quadcore, notice the reporter of 484121 has an core duo, so this might be cpu specific.

My first debug run had glibc crashing in free(), so I ran Ri-li (the program exposing the bug) through ElectricFence and managed to get a more sensible debugging result using EF_PROTECT_BELOW=1

As I still couldn't pinpoint the problem I then started using valgrind, which said:
==716== Invalid write of size 4
==716==    at 0x40471C6: SDL_AllocFormat (string3.h:85)
==716==    by 0x40494BB: SDL_CreateRGBSurface (SDL_surface.c:102)
==716==    by 0x805388B: Tableau::Save() (tableau.cc:97)
==716==    by 0x8053FB3: ChargeFichier(char const*, unsigned char*&) (ostream:5
==716==    by 0x804EB0F: Menu::SDLMain_Score(bool) (menu.cc:1218)
==716==    by 0xB65734: (below main) (libc-start.c:220) 
==716==  Address 0x66e77bc is 4 bytes before a block of size 40 alloc'd
==716==    at 0x4006AEE: malloc (vg_replace_malloc.c:207)
==716==    by 0x40471AF: SDL_AllocFormat (SDL_pixels.c:44)
==716==    by 0x40494BB: SDL_CreateRGBSurface (SDL_surface.c:102)
==716==    by 0x805388B: Tableau::Save() (tableau.cc:97)

The code in SDL_AllocFormat doing the memset which results in the "Invalid write of size 4" is:

        /* Allocate an empty pixel format structure */
        format = SDL_malloc(sizeof(*format));
        if ( format == NULL ) {
                SDL_OutOfMemory();
                return(NULL);
        }
        SDL_memset(format, 0, sizeof(*format));

I've added a
printf("%p", format);

Between the malloc and the memset (both are macros using the regular C-functions), and the printed address is indeed 4 bytes larger then the address of the invalid write reported by valgrind.

Replacing the memset with a for-loop doing the memset manually, or the malloc with a calloc fixes this. And the report of writing 4 bytes before the beginning of the buffer is consistent with EF_PROTECT_BELOW causing the crash too (and also pointing to the same SDL function).

The strange thing is, that this function gets called many times before the crash happens. I'm not saying this is definitely a gcc / glibc bug, but its weird enough to report.
Comment 1 Jakub Jelinek 2009-02-12 10:03:09 EST
That's a SDL bug.  The i386 (and x86-64) ABI says that on function entry
the direction flag in %eflags must be clear, but it seems that
SDL_revcpy (haven't checked other places) in inline assembly uses the std
instruction without adding corresponding cld afterwards.  cld in SDL_memcpy
etc. isn't needed, but it should be always used after you do std.  I.e.
std; rep; stos*...; cld; etc.
Comment 2 Hans de Goede 2009-02-12 18:22:05 EST
*** Bug 484121 has been marked as a duplicate of this bug. ***
Comment 3 Hans de Goede 2009-02-13 05:30:53 EST
(In reply to comment #1)
> That's a SDL bug.  The i386 (and x86-64) ABI says that on function entry
> the direction flag in %eflags must be clear, but it seems that
> SDL_revcpy (haven't checked other places) in inline assembly uses the std
> instruction without adding corresponding cld afterwards.  cld in SDL_memcpy
> etc. isn't needed, but it should be always used after you do std.  I.e.
> std; rep; stos*...; cld; etc.

Jakub, you rock! Right on the money, disabling the inline SDL_revcpy function fixes this. I've filed a bug about this with upstream SDL:
http://bugzilla.libsdl.org/show_bug.cgi?id=699

And a fixed SDL build is on its way to rawhide and F10-updates
Comment 4 Kevin Kofler 2009-02-13 17:53:58 EST
Looks like this may also be the real cause of rocksndiamonds bug 478664, which was magically "fixed" by a rocksndiamonds update, but Valgrind still complains. (Note that this is on F9 i386.)
Comment 5 Fedora Update System 2009-03-02 12:00:15 EST
SDL-1.2.13-7.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.