Bug 2173623

Summary: _FORTIFY_SOURCE=3 inconsistent behavior
Product: [Fedora] Fedora Reporter: Tomas Korbar <tkorbar>
Component: gccAssignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: aoliva, dmalcolm, fweimer, jakub, jlaw, jwakely, mcermak, mpolacek, msebor, nickc, sipoyare
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-02-28 15:04:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2158232, 2171445    
Attachments:
Description Flags
reproducer_not_working.c
none
reproducer_working.c
none
stress.samp none

Description Tomas Korbar 2023-02-27 14:28:12 UTC
Created attachment 1946731 [details]
reproducer_not_working.c

Description of problem:
Hi,
recent raise of _FORTIFY_SOURCE flag to level 3 caused Autogens test suite
to fail and while debugging this issue i found very interesting behavior
of fread function if used with level 3 of fortification. As i understand
it, fortification level 3 of fread function causes that the function
checks how big is the buffer where it is supposed to write the read bytes
and if it is not big enough then abort programs execution.
You can see that in this code snippet:

################################# libio/bits/stdio.h
__fortify_function __wur size_t
fread_unlocked (void *__restrict __ptr, size_t __size, size_t __n,
		FILE *__restrict __stream)
{
  size_t sz = __glibc_objsize0 (__ptr);
  if (__glibc_safe_or_unknown_len (__n, __size, sz))
    {
# ifdef __USE_EXTERN_INLINES
      if (__builtin_constant_p (__size)
	  && __builtin_constant_p (__n)
	  && (__size | __n) < (((size_t) 1) << (8 * sizeof (size_t) / 2))
	  && __size * __n <= 8)
	{
	  size_t __cnt = __size * __n;
	  char *__cptr = (char *) __ptr;
	  if (__cnt == 0)
	    return 0;

	  for (; __cnt > 0; --__cnt)
	    {
	      int __c = getc_unlocked (__stream);
	      if (__c == EOF)
		break;
	      *__cptr++ = __c;
	    }
	  return (__cptr - (char *) __ptr) / __size;
	}
# endif
      return __fread_unlocked_alias (__ptr, __size, __n, __stream);
    }
  if (__glibc_unsafe_len (__n, __size, sz))
    return __fread_unlocked_chk_warn (__ptr, sz, __size, __n, __stream);
  return __fread_unlocked_chk (__ptr, sz, __size, __n, __stream);

}
#################################

Now pay attention to particular line where sz is set to object size.
size_t sz = __glibc_objsize0 (__ptr);

I think that GCC is able to confuse this function and make it return
incorrect sizes and that causes the following __fread_unlocked_chk
call think that there is buffer overflow. When specific conditions
are achieved. You could say that this is most likely caused by glibc
and not by gcc and the __glibc_objsize0 is not working correctly but
i think that this is either gcc problem or combination of specific
conditions in both the __glibc_objsize0 function and gccs way of
compilation of the problematic code. This is because i've been
able to trigger this issue by seemingly not related code.

This code in working version of the reproducer works:
################################# reproducer_working.c
    do {
        scaning_context->buf_size *= 2;
        scaning_context->data_offset += scaning_context->byte_read;
        scaning_context = realloc(scaning_context, scaning_context->buf_size + 1 + sizeof(*scaning_context));
        fprintf(stderr, "Realloced\n");
        if (scaning_context == NULL) {
            printf("Error while reallocating\n");
            return 1;
        }
        reading_pointer = ((char *)(scaning_context + 1)) + scaning_context->data_offset;
        scaning_context->data = (char *) (scaning_context+1);
    } while (scaning_context->byte_read = fread((void *)reading_pointer, 1, scaning_context->buf_size - scaning_context->data_offset, stdin));
#################################
__glibc_objsize0 correctly determines remaining size in the buffer and code works as expected.

On the other hand this code does not work:
################################# reproducer_not_working.c
    context_t *p;

    do {
        scaning_context->buf_size *= 2;
        scaning_context->data_offset += scaning_context->byte_read;
        p = realloc(scaning_context, scaning_context->buf_size + 1 + sizeof(*scaning_context));
        fprintf(stderr, "Realloced\n");
        if (p == NULL) {
            printf("Error while reallocating\n");
            return 1;
        }
        if (p != scaning_context) {
            fprintf(stderr, "Context has been moved!\n");
            scaning_context = p;
        }
        reading_pointer = ((char *)(scaning_context + 1)) + scaning_context->data_offset;
        scaning_context->data = (char *) (scaning_context+1);
    } while (scaning_context->byte_read = fread((void *)reading_pointer, 1, scaning_context->buf_size - scaning_context->data_offset, stdin));
#################################
I checked and buffer overflow condition is triggered even when scaning_context
has not been moved and the scaning_context memory area has just been extended.
It seems that the additional pointer "p" in combination with casting of
reading_pointer to "(void *)" manages to break the __glibc_objsize0
functions ability to find out correct size of remaining area.
If i remove the cast to "(void *)" then everything works as expected.
In the same time, when address sanitizer or valgrind is used then
no buffer overflow is found and this works again as expected.

Version-Release number of selected component (if applicable):
gcc-13.0.1-0.4.fc39
glibc-2.37-1.fc38
kernel-6.2.0-0.rc8.20230217gitec35307e18ba.60.fc39

How reproducible:
Always.


Steps to Reproduce:
1. gcc -O2 -g -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 ./reproducer_not_working.c -o ./reproducer_not_working
2. ./reproducer_not_working < ./stress.samp

Actual results:
Realloced
*** buffer overflow detected ***: terminated
Aborted (core dumped)

Expected results:
testing file is printed out and no buffer overflow is detected.

Additional info:
The second reproducer reproducer_working.c works as expected.
I added it to make the problem more visible.

Do not hesitate to ask further questions.
Thanks for any help.

Comment 1 Tomas Korbar 2023-02-27 14:28:44 UTC
Created attachment 1946732 [details]
reproducer_working.c

Comment 2 Tomas Korbar 2023-02-27 14:29:14 UTC
Created attachment 1946733 [details]
stress.samp

Comment 3 Siddhesh Poyarekar 2023-02-28 13:36:24 UTC
This looks very similar to a bug that was already fixed in upstream autogen:

https://sourceforge.net/p/autogen/bugs/212/

I've written about why this is a problematic programming pattern:

https://siddhesh.in/posts/that-is-not-a-number-that-is-a-freed-object.html

There's an open gcc feature request to support such uses to the extent that the compiler can see the realloc, but it cannot fully support this pattern:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105217

Can you please confirm that the autogen in question has the fix for the above bug included?

Comment 4 Tomas Korbar 2023-02-28 14:23:32 UTC
It has not. It did not occur to me to look for an upstream issue
when i realized that this is most likely a GCC problem and i wanted
to point this out in case that it could cause more issues.
Will apply autogen patch. Thanks for your help, feel free to close this.