Bug 22344

Summary: strcmp of different strings returns 0
Product: [Retired] Red Hat Linux
Reporter: Todd Stout <tstout>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG
QA Contact: David Lawrence <dkl>
Severity: medium
Priority: medium
Version: 7.0
Hardware: i686
OS: Linux
Last Closed: 2000-12-15 13:57:45 UTC

Description Todd Stout 2000-12-15 13:57:42 UTC
The following code causes strcmp to return 0 when passed two
strings that are NOT equal:

#include <stdio.h>
#include <string.h>

class StrContainer
{
  public:

  StrContainer(char* str)
  {
      strVal = str;
  }

  const char* const& getStr()
  {
     return strVal;
  }

  private:

  char* strVal;
};

int main()
{
   char* str1 = "7000";
   char* str2 = "8000";

   StrContainer sc1(str1);
   StrContainer sc2(str2);
 
   if (0 == strcmp(sc1.getStr(), sc2.getStr()))
   {
     printf("Expression 1: Strings '%s' and '%s' are equal\n",
            sc1.getStr(),
            sc2.getStr());
   }
   else
   {
     printf("Expression 1: Strings '%s' and '%s' are Not equal\n",
            sc1.getStr(),
            sc2.getStr());
   }

   const char* s1 = sc1.getStr();
   const char* s2 = sc2.getStr();

   if (0 == strcmp(s1, s2))
   {
     printf("Expression 2: Strings '%s' and '%s' are equal\n",
            s1,
            s2);
   }
   else
   {
     
     printf("Expression 2: Strings '%s' and '%s' are Not equal\n",
            s1,
            s2);
   }
  
 return 0;
}


The first strcmp reports that the strings '7000' and '8000' are equal, while the second
strcmp correctly reports that they are not.  Initially I suspected a code generation
problem in the call strcmp(sc1.getStr(), sc2.getStr()), since the second strcmp is
correct when passed local variables instead of two expressions.  Stepping through the
code at the assembly level shows that the first strcmp fails because the result of
sc1.getStr() is pulled from the stack for both arguments to strcmp.
However, if I create a strcmp wrapper such as

int strcmp2(const char* s1, const char* s2)
{
  printf("comparing '%s' to '%s'\n", s1, s2);
  return strcmp(s1, s2);
}

and call this instead of strcmp, the code functions properly.  This led me to look
at the preprocessor output via g++ -E to see the strcmp prototype.  I noticed
that it carries the attribute __pure__.  I could not find any documentation for this
attribute, and I assume it means the implementation is in assembly.

Another interesting aspect of this problem is that if StrContainer::getStr() is changed
to return const char* instead of const char* const&, the code functions properly.
I realize that this code is an example of bad programming and that the const& in the
return type of getStr() is not necessary, but it still should work.  This code
malfunctions only with gcc 2.96.  It operates properly with 2.95.2 and egcs-X.

This code also functions properly with Sun's Forte C++, VC++ 6.0, and Borland C++.

Comment 1 Jakub Jelinek 2000-12-15 15:21:16 UTC
No, with the const & there the program is buggy. You even get a nice
warning about it:
o.C: In method `const char *const &StrContainer::getStr ()':
o.C:15: warning: returning reference to temporary
Here is what's going on:
since strVal is a char * and you're returning a const reference to const char *,
the reference cannot refer to strVal itself (its type is char *), so the value
has to be copied into a temporary variable, and the reference is then taken to
that temporary.
What you read back through the reference then depends on the stack pointer with
which each getStr routine is called (not to mention that the stack slot may be
clobbered by interrupt handlers and the like at that point).
So you can, e.g.:
- change the type of strVal to const char * (and likewise the constructor parameter)
- remove that const &
- change the return type of getStr to char * const &
- compile with optimizations (then getStr gets inlined and can work, but only by pure luck)
g++ -O2 on this reports Not equal for both expressions.
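
A minimal sketch of two of the options listed above, written against a current C++ compiler rather than gcc 2.96. The class names ByValue and ByReference and the helpers returnsMember() and checkFixes() are hypothetical and exist only for illustration. The first class returns the pointer by value; the second keeps the reference return type but stores a const char *, so the reference can refer to the member itself and no temporary is needed.

```cpp
#include <cassert>
#include <cstring>

// Option "remove that const &": hand the pointer back by value.
// Nothing refers to a stack slot after getStr() returns.
class ByValue
{
  public:
    explicit ByValue(const char* str) : strVal(str) {}
    const char* getStr() const { return strVal; }

  private:
    const char* strVal;
};

// Option "change the type of strVal to const char *": the member now has
// exactly the type the reference names, so the reference binds to the
// member and stays valid for as long as the object lives.
class ByReference
{
  public:
    explicit ByReference(const char* str) : strVal(str) {}
    const char* const& getStr() const { return strVal; }

    // Illustrative check: the returned reference is the member, not a copy.
    bool returnsMember() const { return &getStr() == &strVal; }

  private:
    const char* strVal;
};

// Hypothetical self-check mirroring "Expression 1" from the report.
void checkFixes()
{
    ByValue v1("7000"), v2("8000");
    assert(std::strcmp(v1.getStr(), v2.getStr()) != 0);

    ByReference r1("7000"), r2("8000");
    assert(r1.returnsMember());
    assert(std::strcmp(r1.getStr(), r2.getStr()) != 0);
}
```

Calling checkFixes() from main() exercises both variants; either change removes the reference to a temporary that the warning points at.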

As for __pure__, it is a weaker cousin of __const__, meaning that the function
does not clobber any memory (well, it is obviously allowed to clobber the stack
below the stack pointer). This means that, e.g., the compiler does not have to
reload its own copies of global variables that could potentially be clobbered by
a call to a non-"pure" function.
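
A small sketch of how the two attributes can be written on user functions, assuming a GCC-compatible compiler that accepts the __attribute__ syntax. The functions countChar, square, and useTwice and the variable globalCounter are hypothetical names made up for this illustration. Whether the compiler actually merges the two calls or keeps the global in a register depends on the optimization level; the attributes only give it permission to do so.

```cpp
#include <cstddef>

// "pure": the result depends only on the arguments and on memory the
// function reads; it writes no memory visible to the caller.
std::size_t countChar(const char* s, char c) __attribute__((pure));

std::size_t countChar(const char* s, char c)
{
    std::size_t n = 0;
    for (; *s != '\0'; ++s)
    {
        if (*s == c)
        {
            ++n;
        }
    }
    return n;
}

// "const": stricter than pure; the result depends on the argument
// values alone, and the function does not read through pointers or globals.
int square(int x) __attribute__((const));

int square(int x)
{
    return x * x;
}

int globalCounter = 0;

std::size_t useTwice(const char* s)
{
    // Because countChar is declared pure, the compiler may assume that
    // globalCounter is unchanged by the calls and may fold the two
    // identical calls into one.
    std::size_t a = countChar(s, '0');
    std::size_t b = countChar(s, '0');
    return a + b + static_cast<std::size_t>(globalCounter);
}
```

strcmp fits the pure category because it reads the two strings but writes nothing, which is why its prototype carries that attribute in the preprocessor output.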