RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2137448 - g++ compiler is incorrectly reporting the value of strlen variable
Summary: g++ compiler is incorrectly reporting the value of strlen variable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: gcc
Version: 8.6
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: rc
: ---
Assignee: Marek Polacek
QA Contact: Václav Kadlčík
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-10-25 00:36 UTC by mkielian@redhat.com
Modified: 2023-05-16 10:59 UTC (History)
9 users (show)

Fixed In Version: gcc-8.5.0-18.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-16 09:03:21 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
User provided test file for simple program to demonstrate issue (10.00 KB, application/x-tar)
2022-10-25 00:36 UTC, mkielian@redhat.com
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-137756 0 None None None 2022-10-27 20:31:43 UTC
Red Hat Knowledge Base (Solution) 6988952 0 None None None 2022-12-04 23:52:42 UTC
Red Hat Product Errata RHBA-2023:2958 0 None None None 2023-05-16 09:03:27 UTC

Description mkielian@redhat.com 2022-10-25 00:36:21 UTC
Created attachment 1920088 [details]
User provided test file for simple program to demonstrate issue

Description of problem: g++ compiler is incorrectly reporting a strlen of "0" rather than the expected result of "3"


Version-Release number of selected component (if applicable):

Reported in RHEL 8.5
Reproduced in RHEL 8.6 by Red Hat

User reported they did not see this issue in RHEL 7.x or 9.x


How reproducible: Every Time


Steps to Reproduce:
Prerequisites:
RHEL 8.6 Build installed with GUI with default settings.

1. Install Development Tools
# dnf group install "Development Tools"

2. Update the system and all packages

# dnf update

3. Reboot

4. Download the file from the case: [ strlen.tar ]

Run test by extracting and running:

mkdir strlen.dir && cd strlen.dir
tar -xf strlen.tar
bash make.sh
./main


5. Check output

$ ./main
statsbuf_init: enter
0x7ffde8a88320 0x7ffde8a88335 (dif=21) strlen=0


Actual results:

"strlen=0"

Expected results:

"strlen=3"

Additional info:

Comment 1 Marek Polacek 2022-10-27 20:24:44 UTC
With newer GCC I get a warning:

In file included from main.cpp:3:
In function ‘CapacityPollBuckets* get_capacity_buckets(CapacityPollUdpMsg*)’,
    inlined from ‘void dsb_test(void*)’ at main.cpp:11:56,
    inlined from ‘int main()’ at main.cpp:22:10:
cap.h:32:97: warning: ‘size_t strlen(const char*)’ reading 1 or more bytes from a region of size 0 [-Wstringop-overread]
   32 |         char *capacit_part = reinterpret_cast<char *>(msg) + sizeof(CapacityPollUdpMsg) + strlen(msg->measurementName) + 1;
      |                                                                                           ~~~~~~^~~~~~~~~~~~~~~~~~~~~~
cap.h: In function ‘int main()’:
cap.h:15:21: note: source object ‘CapacityPollUdpMsg::measurementName’ of size 0
   15 |         char        measurementName[0]; // Variable size measurement name, must be null terminated. 

which I think means there's undefined behavior in the program: the array argument to strlen isn't nul-terminated.

The strlen=0 result started with
https://gcc.gnu.org/pipermail/gcc-patches/2018-July/501912.html
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=715fcd73b66c639d9e0e3f3ef9c6ff9d621d7131
which seems like an undesirable change.  It was fixed (back to strlen=3) by
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-01/msg00069.html
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d4bf69750d31d08068f8242225b8fa06cdf11411
but there's been many changes in that area, so it doesn't look backportable to me.

I suppose strnlen could be used instead.

Comment 2 Marek Polacek 2022-10-31 19:38:56 UTC
Reduced:

$ LD_PRELOAD=~/x/trunk/gcc/libmpfr.so.4.1.6 /home/mpolacek/x/trunk/gcc/cc1plus.262437 -quiet m.C -Wall -W -Werror -O2 -Wpedantic; g++ -O2 m.s && ./a.out
$ LD_PRELOAD=~/x/trunk/gcc/libmpfr.so.4.1.6 /home/mpolacek/x/trunk/gcc/cc1plus.262438 -quiet m.C -Wall -W -Werror -O2 -Wpedantic; g++ -O2 m.s && ./a.out
Aborted (core dumped)


extern "C" {
void *memcpy(void *, const void *, unsigned long);
char *strcpy(char *, const char *);
unsigned long strlen(const char *);
}
struct S {
  int i;
  __extension__ char name[0];
} statsbuf[4];

static void
foo (void *msg)
{
  S *s = (S *) msg;
  unsigned long l = strlen (s->name);
  if (l != 3)
    __builtin_abort ();
}

char buf[24];
int
main ()
{
  strcpy(statsbuf[0].name, "foo");
  memcpy(buf, statsbuf, sizeof (void *));
  foo (buf);
}

Comment 3 Jakub Jelinek 2022-10-31 20:02:38 UTC
Thanks.
For the testcase, better add using size_t = decltype(sizeof 0); or typedef __SIZE_TYPE__ size_t; or so and replace unsigned long with size_t.
I think this is something we just don't want to support, neither in this form nor with name[];, because
strcpy(statsbuf[0].name, "foo"); actually overwriting statsbuf[1].i .
If you
#include <cstring>
#include <cstdlib>

struct S {
  int i;
  __extension__ char name[0];
} statsbuf[4];

static void
foo (void *msg)
{
  S *s = (S *) msg;
  size_t l = strlen (s->name);
  if (l != 3)
    __builtin_abort ();
}

char buf[24];
int
main ()
{
  strcpy(statsbuf[0].name, "foo");
  memcpy(buf, statsbuf, sizeof (void *));
  foo (buf);
}
with -O2 -D_FORTIFY_SOURCE=2 , it will crash with fortification failure both with older and newer g++.
name isn't really array at end of struct in the flexible array member sense, because it is nested in an array (or similarly if it would be nested in some structure and followed by some members after it).

Another thing is whether we handle properly
#include <cstring>
#include <cstdlib>

struct S {
  int i;
  char c;
  __extension__ char name[];
} statsbuf[4];

static void
foo (void *msg)
{
  S *s = (S *) msg;
  size_t l = strlen (s->name);
  if (l != 3)
    __builtin_abort ();
}

char buf[24];
int
main ()
{
  strcpy(statsbuf[0].name, "fo");
  memcpy(buf, statsbuf, sizeof (void *));
  foo (buf);
}
where the flexible array member is similarly at the end of a struct followed by some other object, but here struct S has 3 bytes of padding on at least all RHEL architectures and
so I think one should be able to use those 3 bytes of padding for the flexible array member.  At least for _FORTIFY_SOURCE=2 that seems to be the case with
name[], but not with name[0] (as the latter is an extension I think that is also acceptable), but the question is if other code, e.g. the above referenced change, doesn't break that.

Comment 4 Marek Polacek 2022-10-31 20:06:55 UTC
JFTR:

After r262438:
maybe_set_strlen_range gets lhs=l_5, src=&MEM[(struct S *)&buf].name
strip the &, so src=MEM[(struct S *)&buf].name
array_at_struct_end_p (src) says false, so we don't return
TREE_TYPE (src) is char[0:18446744073709551615]
TYPE_SIZE_UNIT (type) is 0, we use that as max in set_range_info (lhs, VR_RANGE, min, max);
so l_5's range is [0..0]

Pre-r262438, the code was looking at TYPE_DOMAIN/TYPE_MAX_VALUE of char[0:18446744073709551615]
TYPE_DOMAIN is signed long, its TYPE_MAX_VALUE is 18446744073709551615 and we used that for the
set_range_info call, so l_5's range was [0..18446744073709551615] I suppose.

Comment 5 Marek Polacek 2022-11-21 23:18:17 UTC
The code seems to invoke undefined behavior by using the flexible array member incorrectly, and the changes don't look backportable, so closing as WONTFIX.

Comment 6 Jamie Bainbridge 2022-11-26 05:17:10 UTC
Hi Marek,

Apologies for reopening this.

The cause provided in Comment 1 appears correct. That's a good find, thank you!

Our customer is asking precisely how they are invoking UB, but I don't see they are?

The customer's strings are all NULL-terminated. Compiling their source with -O0 to get debugging back:

--------------------------------------------------------------------------------
(gdb) b main.cpp:22
Breakpoint 1 at 0x4007c9: file main.cpp, line 22.
(gdb) run
statsbuf_init: enter

Breakpoint 1, main () at main.cpp:22
22              dsb_test(udp_msg_u16);

(gdb) info locals
udp_msg_u16 = '\000' <repeats 20 times>, "foo\000 ...uninitialised data... "
(gdb) p statsbuf
$1 = '\000' <repeats 20 times>, "foo", '\000' <repeats 1000 times>
--------------------------------------------------------------------------------

The minimal reproducer in Comment 2 doesn't seem correct to the customer's use case. That code:

- creates an array of structs with padding
- uses the padding to provide bytes for the Zero Length Array (legal?)
- writes past the padding bytes into the next array member (illegal)

However, the customer isn't doing any of those things.

They're packing their struct at 1-byte offsets so it has no padding.

The customer has a large static buffer (both implicitly and explicitly zeroed) and is casting a single instance of their struct to it, not an array of those structs. (perhaps they're in an environment without dynamic allocation?)

The customer memcpy's the buffer contents from static storage to stack storage, then passes a pointer to the stack buffer buffer through enough indirection that the compiler appears to lose the ability to see that the sizeof(struct->name[0]) is ever used, so optimises it to 0 at -O2.

(they're doing strlen to find the end of name[0] to place another *different* struct into their static storage at the next byte, but that isn't the same as an array of padded structs with ZLA in the padding bytes, and I don't think it matters anyway)

I think we can minimise the customer issue down to:

--------------------------------------------------------------------------------
#include <stdio.h>
#include <string.h>

struct S {
 char skip;
 char name[0];
};

static char static_buf[4];

static void
print_name_len(void *p)
{
  struct S *q = (struct S *) p;
  printf("strlen = %zu\n", strlen(q->name));
}

int
main(void)
{
  // treat static storage as struct
  struct S *c = (struct S *)static_buf;
  strcpy(c->name, "aa");

  // copy static storage to stack storage
  char stack_buf[4] = { 0 };
  memcpy(stack_buf, static_buf, 4);

  // static and stack both now contain ( 0, 'a', 'a', 0 }

  // indirectly pass the stack storage to the length function
  char *s = (char *)stack_buf;
  print_name_len(s);
  return 0;
}
--------------------------------------------------------------------------------

This reproduces what they report at their optimisation level, but I found not at O1 or less:

--------------------------------------------------------------------------------
$ g++ --version
g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-15)

$ g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2 min.cpp -o min && ./min
strlen = 0

$ g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O1 min.cpp -o min && ./min
strlen = 2
--------------------------------------------------------------------------------

The cause mentioned in Comment 1 appears correct. The fixed version returns the correct strlen (and latest gives us a warning to manage our string terminator properly):

--------------------------------------------------------------------------------
$ git describe --contains d4bf69750d31d08068f8242225b8fa06cdf11411
releases/gcc-9.1.0~2050

$ scl enable gcc-toolset-9 'g++ --version'
g++ (GCC) 9.2.1 20191120 (Red Hat 9.2.1-2)

$ scl enable gcc-toolset-9 'g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2 min.cpp -o min' && ./min
strlen = 2

$ scl enable gcc-toolset-12 'g++ --version'
g++ (GCC) 12.1.1 20220628 (Red Hat 12.1.1-3)

$ scl enable gcc-toolset-12 'g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2 min.cpp -o min' && ./min
In function ‘void print_name_len(void*)’,
    inlined from ‘int main()’ at min.cpp:28:17:
min.cpp:15:9: warning: ‘size_t strlen(const char*)’ reading 1 or more bytes from a region of size 0 [-Wstringop-overread]
   15 |   printf("strlen = %zu\n", strlen(q->name));
      |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
min.cpp: In function ‘int main()’:
min.cpp:6:7: note: source object ‘S::name’ of size 0
    6 |  char name[0];
      |       ^~~~
strlen = 2
--------------------------------------------------------------------------------

There doesn't appear to be any illegal memory access, -D_FORTIFY_SOURCE=2 does not error on either the strcpy() or the strlen().

Please let me know if I've missed anything? Is it undefined to use a ZLA cast to pre-allocated storage like this? If so, how would the customer resolve that? Should the ZLA be replaced with a pointer to a different location? (let's assume they can't malloc)

I couldn't see anything in the doc which says this is wrong - https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Zero-Length.html - it says don't use a struct with a ZLA inside another struct, and don't create a ZLA outside of a struct.

If what I've said is correct and there is no UB, please let us know (it's probably then appropriate to close this as NEXTRELEASE because it's a bug which is already repaired but can't be backported as per Comment 1).

Jamie

Comment 8 Marek Polacek 2022-12-02 22:38:46 UTC
(In reply to Jamie Bainbridge from comment #6)
> Hi Marek,
> 
> Apologies for reopening this.
> 
> The cause provided in Comment 1 appears correct. That's a good find, thank
> you!
> 
> Our customer is asking precisely how they are invoking UB, but I don't see
> they are?
> 
> The customer's strings are all NULL-terminated. Compiling their source with
> -O0 to get debugging back:
> 
> -----------------------------------------------------------------------------
> ---
> (gdb) b main.cpp:22
> Breakpoint 1 at 0x4007c9: file main.cpp, line 22.
> (gdb) run
> statsbuf_init: enter
> 
> Breakpoint 1, main () at main.cpp:22
> 22              dsb_test(udp_msg_u16);
> 
> (gdb) info locals
> udp_msg_u16 = '\000' <repeats 20 times>, "foo\000 ...uninitialised data... "
> (gdb) p statsbuf
> $1 = '\000' <repeats 20 times>, "foo", '\000' <repeats 1000 times>
> -----------------------------------------------------------------------------
> ---
> 
> The minimal reproducer in Comment 2 doesn't seem correct to the customer's
> use case. That code:
> 
> - creates an array of structs with padding
> - uses the padding to provide bytes for the Zero Length Array (legal?)
> - writes past the padding bytes into the next array member (illegal)
> 
> However, the customer isn't doing any of those things.

Right, sorry.  The program I used to reduce the original code introduced the array and completely threw me off.

Sorry again :(

> They're packing their struct at 1-byte offsets so it has no padding.
> 
> The customer has a large static buffer (both implicitly and explicitly
> zeroed) and is casting a single instance of their struct to it, not an array
> of those structs. (perhaps they're in an environment without dynamic
> allocation?)
> 
> The customer memcpy's the buffer contents from static storage to stack
> storage, then passes a pointer to the stack buffer buffer through enough
> indirection that the compiler appears to lose the ability to see that the
> sizeof(struct->name[0]) is ever used, so optimises it to 0 at -O2.
> 
> (they're doing strlen to find the end of name[0] to place another
> *different* struct into their static storage at the next byte, but that
> isn't the same as an array of padded structs with ZLA in the padding bytes,
> and I don't think it matters anyway)
> 
> I think we can minimise the customer issue down to:
> 
> -----------------------------------------------------------------------------
> ---
> #include <stdio.h>
> #include <string.h>
> 
> struct S {
>  char skip;
>  char name[0];
> };
> 
> static char static_buf[4];
> 
> static void
> print_name_len(void *p)
> {
>   struct S *q = (struct S *) p;
>   printf("strlen = %zu\n", strlen(q->name));
> }
> 
> int
> main(void)
> {
>   // treat static storage as struct
>   struct S *c = (struct S *)static_buf;
>   strcpy(c->name, "aa");
> 
>   // copy static storage to stack storage
>   char stack_buf[4] = { 0 };
>   memcpy(stack_buf, static_buf, 4);
> 
>   // static and stack both now contain ( 0, 'a', 'a', 0 }
> 
>   // indirectly pass the stack storage to the length function
>   char *s = (char *)stack_buf;
>   print_name_len(s);
>   return 0;
> }
> -----------------------------------------------------------------------------
> ---
> 
> This reproduces what they report at their optimisation level, but I found
> not at O1 or less:
> 
> -----------------------------------------------------------------------------
> ---
> $ g++ --version
> g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-15)
> 
> $ g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2 min.cpp -o min && ./min
> strlen = 0
> 
> $ g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O1 min.cpp -o min && ./min
> strlen = 2
> -----------------------------------------------------------------------------
> ---
> 
> The cause mentioned in Comment 1 appears correct. The fixed version returns
> the correct strlen (and latest gives us a warning to manage our string
> terminator properly):
> 
> -----------------------------------------------------------------------------
> ---
> $ git describe --contains d4bf69750d31d08068f8242225b8fa06cdf11411
> releases/gcc-9.1.0~2050
> 
> $ scl enable gcc-toolset-9 'g++ --version'
> g++ (GCC) 9.2.1 20191120 (Red Hat 9.2.1-2)
> 
> $ scl enable gcc-toolset-9 'g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2
> min.cpp -o min' && ./min
> strlen = 2
> 
> $ scl enable gcc-toolset-12 'g++ --version'
> g++ (GCC) 12.1.1 20220628 (Red Hat 12.1.1-3)
> 
> $ scl enable gcc-toolset-12 'g++ -Wall -Wextra -g -D_FORTIFY_SOURCE=2 -O2
> min.cpp -o min' && ./min
> In function ‘void print_name_len(void*)’,
>     inlined from ‘int main()’ at min.cpp:28:17:
> min.cpp:15:9: warning: ‘size_t strlen(const char*)’ reading 1 or more bytes
> from a region of size 0 [-Wstringop-overread]
>    15 |   printf("strlen = %zu\n", strlen(q->name));
>       |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> min.cpp: In function ‘int main()’:
> min.cpp:6:7: note: source object ‘S::name’ of size 0
>     6 |  char name[0];
>       |       ^~~~
> strlen = 2
> -----------------------------------------------------------------------------
> ---

I could reproduce all of these.

> There doesn't appear to be any illegal memory access, -D_FORTIFY_SOURCE=2
> does not error on either the strcpy() or the strlen().
> 
> Please let me know if I've missed anything? Is it undefined to use a ZLA
> cast to pre-allocated storage like this? If so, how would the customer
> resolve that? Should the ZLA be replaced with a pointer to a different
> location? (let's assume they can't malloc)
> 
> I couldn't see anything in the doc which says this is wrong -
> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Zero-Length.html - it says
> don't use a struct with a ZLA inside another struct, and don't create a ZLA
> outside of a struct.
> 
> If what I've said is correct and there is no UB, please let us know (it's
> probably then appropriate to close this as NEXTRELEASE because it's a bug
> which is already repaired but can't be backported as per Comment 1).
> 
> Jamie

Comment 9 Marek Polacek 2022-12-02 22:40:15 UTC
I wonder if we could simply do this:

--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1201,10 +1201,10 @@ maybe_set_strlen_range (tree lhs, tree src)
     return;
 
   tree type = TREE_TYPE (src);
-  if (tree size = TYPE_SIZE_UNIT (type))
-    if (size && TREE_CODE (size) == INTEGER_CST)
+  if (tree dom = TYPE_DOMAIN (type))
+    if (dom && TREE_CODE (dom) == INTEGER_CST)
       {
-   wide_int max = wi::to_wide (size);
+   wide_int max = wi::to_wide (TYPE_MAX_VALUE (dom));
    wide_int min = wi::zero (max.get_precision ());
    if (max != 0)
      --max;


which brings back the expected result and doesn't seem to regress any of the strlenopt* tests.

Comment 10 Jamie Bainbridge 2022-12-04 22:41:49 UTC
(In reply to Marek Polacek from comment #8)
> Right, sorry.  The program I used to reduce the original code introduced the
> array and completely threw me off.
> 
> Sorry again :(

No worries! It took me hours to get my head around what the customer's code was doing and why. Thank you for the confirmation :)

Comment 11 Jason Merrill 2022-12-05 02:49:55 UTC
(In reply to Marek Polacek from comment #9)
> +    if (dom && TREE_CODE (dom) == INTEGER_CST)

This will always be false: dom is an INTEGER_TYPE, not _CST.

So disabling that block doesn't regress anything?

Comment 12 Marek Polacek 2022-12-05 14:58:00 UTC
(In reply to Jason Merrill from comment #11)
> (In reply to Marek Polacek from comment #9)
> > +    if (dom && TREE_CODE (dom) == INTEGER_CST)
> 
> This will always be false: dom is an INTEGER_TYPE, not _CST.

Ugh, of course, the INTEGER_CST was meant for the TYPE_MAX_VALUE!

> So disabling that block doesn't regress anything?

If it doesn't, then the whole maybe_set_strlen_range function is pointless and can be removed.

I've started a full regtest, will report back here.

Comment 13 Marek Polacek 2022-12-05 17:41:15 UTC
With vanilla GCC 8, I see no regressions with this patch.

Do we want to go ahead with this?

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 55e82e7b638..76ce489210c 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1170,48 +1170,6 @@ adjust_last_stmt (strinfo *si, gimple *stmt, bool is_strcat)
   update_stmt (last.stmt);
 }
 
-/* For an LHS that is an SSA_NAME with integer type and for strlen()
-   argument SRC, set LHS range info to [0, N] if SRC refers to
-   a character array A[N] with unknown length bounded by N.  */
-
-static void
-maybe_set_strlen_range (tree lhs, tree src)
-{
-  if (TREE_CODE (lhs) != SSA_NAME
-      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
-    return;
-
-  if (TREE_CODE (src) == SSA_NAME)
-    {
-      gimple *def = SSA_NAME_DEF_STMT (src);
-      if (is_gimple_assign (def)
-     && gimple_assign_rhs_code (def) == ADDR_EXPR)
-   src = gimple_assign_rhs1 (def);
-    }
-
-  if (TREE_CODE (src) != ADDR_EXPR)
-    return;
-
-  /* The last array member of a struct can be bigger than its size
-     suggests if it's treated as a poor-man's flexible array member.  */
-  src = TREE_OPERAND (src, 0);
-  if (TREE_CODE (TREE_TYPE (src)) != ARRAY_TYPE
-      || TREE_CODE (src) == MEM_REF
-      || array_at_struct_end_p (src))
-    return;
-
-  tree type = TREE_TYPE (src);
-  if (tree size = TYPE_SIZE_UNIT (type))
-    if (size && TREE_CODE (size) == INTEGER_CST)
-      {
-   wide_int max = wi::to_wide (size);
-   wide_int min = wi::zero (max.get_precision ());
-   if (max != 0)
-     --max;
-   set_range_info (lhs, VR_RANGE, min, max);
-      }
-}
-
 /* Handle a strlen call.  If strlen of the argument is known, replace
    the strlen call with the known value, otherwise remember that strlen
    of the argument is stored in the lhs SSA_NAME.  */
@@ -1322,10 +1280,6 @@ handle_builtin_strlen (gimple_stmt_iterator *gsi)
       set_strinfo (idx, si);
       find_equal_ptrs (src, idx);
 
-      /* For SRC that is an array of N elements, set LHS's range
-    to [0, N].  */
-      maybe_set_strlen_range (lhs, src);
-
       if (strlen_to_stridx)
    {
      location_t loc = gimple_location (stmt);

Comment 14 Jakub Jelinek 2022-12-05 18:12:55 UTC
From what I can see, that function has been added for https://gcc.gnu.org/PR78450 and https://gcc.gnu.org/PR83373 by Martin.
So, how come none of the testcases added in those commits don't fail?
If we want to support flowing from one subobject into another one for strlen (I believe e.g. -D_FORTIFY_SOURCE=2 doesn't support that),
then a replacement would be to determine if src is a decl and use its DECL_SIZE_UNIT.

Comment 15 Marek Polacek 2022-12-06 00:35:50 UTC
(In reply to Jakub Jelinek from comment #14)
> From what I can see, that function has been added for
> https://gcc.gnu.org/PR78450 and https://gcc.gnu.org/PR83373 by Martin.
> So, how come none of the testcases added in those commits don't fail?

Right, it is strange.  So I dug deeper and I found out that ever since
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=c42d0aa0893cab444366c80fdd5b23bb45de6276
in particular the gimple_fold_builtin_strlen changes, the https://gcc.gnu.org/PR78450
tests pass even when the maybe_set_strlen_range call is removed.

> If we want to support flowing from one subobject into another one for strlen
> (I believe e.g. -D_FORTIFY_SOURCE=2 doesn't support that),
> then a replacement would be to determine if src is a decl and use its
> DECL_SIZE_UNIT.

I can try that tomorrow if that's preferable to removing the maybe_set_strlen_range call.

Comment 16 Marek Polacek 2022-12-06 17:23:47 UTC
FWIW, removing maybe_set_strlen_range in trunk results in

XPASS: gcc.dg/Wstringop-overread.c pr87492
FAIL: gcc.dg/attr-nonstring-4.c
FAIL: gcc.dg/strlenopt-45.c
FAIL: gcc.dg/strlenopt-80.c

so at least there the function does something.

Comment 17 Marek Polacek 2022-12-06 19:41:42 UTC
(In reply to Marek Polacek from comment #15)
> (In reply to Jakub Jelinek from comment #14)
> > From what I can see, that function has been added for
> > https://gcc.gnu.org/PR78450 and https://gcc.gnu.org/PR83373 by Martin.
> > So, how come none of the testcases added in those commits don't fail?
> 
> Right, it is strange.  So I dug deeper and I found out that ever since
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;
> h=c42d0aa0893cab444366c80fdd5b23bb45de6276
> in particular the gimple_fold_builtin_strlen changes, the
> https://gcc.gnu.org/PR78450
> tests pass even when the maybe_set_strlen_range call is removed.
> 
> > If we want to support flowing from one subobject into another one for strlen
> > (I believe e.g. -D_FORTIFY_SOURCE=2 doesn't support that),
> > then a replacement would be to determine if src is a decl and use its
> > DECL_SIZE_UNIT.
> 
> I can try that tomorrow if that's preferable to removing the
> maybe_set_strlen_range call.

So just like this?  Testing looks good so far...

--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1200,8 +1200,10 @@ maybe_set_strlen_range (tree lhs, tree src)
       || array_at_struct_end_p (src))
     return;
 
-  tree type = TREE_TYPE (src);
-  if (tree size = TYPE_SIZE_UNIT (type))
+  if (!DECL_P (src))
+    return;
+
+  if (tree size = DECL_SIZE_UNIT (src))
     if (size && TREE_CODE (size) == INTEGER_CST)
       {
    wide_int max = wi::to_wide (size);

Comment 18 Jakub Jelinek 2022-12-06 20:20:00 UTC
Well, in that case there should be src = get_base_address (src); or so first.
Otherwise it looks reasonable.

Comment 19 Marek Polacek 2022-12-07 01:27:03 UTC
Thanks, I'm going to start a scratch build and post the full patch on tools-patches.

Comment 28 errata-xmlrpc 2023-05-16 09:03:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (gcc bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2958


Note You need to log in before you can comment on or make changes to this bug.