Bug 610785

Summary:	GCC pointer indexing bug [was: cvCanny crashes on 64 bit systems]
Product:	[Fedora] Fedora	Reporter:	markus.moll
Component:	gcc	Assignee:	Jakub Jelinek <jakub>
Status:	CLOSED RAWHIDE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	low
Version:	13	CC:	ian.cullinan, jakub, karlthered, kwizart, madrenegade, markus.moll, michael.mathieu, nomis80, rpandit
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	531695	Environment:
Last Closed:	2010-07-09 14:34:21 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description markus.moll 2010-07-02 12:59:32 UTC

+++ This bug was initially created as a clone of Bug #531695 +++

Description of problem:
cvCanny crashes


Steps to Reproduce:
1.call cvCanny
  
Actual results:
app crashes (cvSobel works so the app itself should be correct)

--- Additional comment from karlthered on 2009-10-29 05:15:46 EDT ---

Could you provide an use case ?
Besides, we're working on OpenCV 2, there's a scratch build there:
https://bugzilla.redhat.com/show_bug.cgi?id=530717
Could you confirm that your application still crashes ?

--- Additional comment from madrenegade on 2009-10-30 02:42:27 EDT ---

I recently compiled OpenCV 2 from sthe sources and it still crashes but only on 64 bit systems.

Example:
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/highgui.h>

int main()
{
    IplImage* pImage = cvLoadImage("test.jpg", CV_LOAD_IMAGE_GRAYSCALE);

    CvMemStorage* pStorage = cvCreateMemStorage(0);
    CvSeq* pResults = cvHoughCircles(pImage, pStorage, CV_HOUGH_GRADIENT, 1, pImage->width/2);

    for(int i = 0; i < pResults->total; i+=2)
    {
        float* p = (float*) cvGetSeqElem(pResults, i);
        CvPoint pt = cvPoint(cvRound(p[0]), cvRound(p[1]));

        cvCircle(pImage, pt, cvRound( p[2] ), CV_RGB(0xff,0xff,0xff));
    };

    cvNamedWindow("Hough", 0);
    cvShowImage("Hough", pImage);


    cvWaitKey();

    cvReleaseImage(&pImage);
    cvDestroyAllWindows();

    return 0;
}

As i said, exactly the same code runs on 32 bit Fedora without problems.

--- Additional comment from fedora-triage-list on 2009-11-16 09:33:39 EST ---


This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from ian.cullinan.au on 2010-04-13 23:34:01 EDT ---

Here is the relevant ticket in the OpenCV tracker (including patch): https://code.ros.org/trac/opencv/ticket/157

It appears that this is the result of a compiler bug.

--- Additional comment from markus.moll.be on 2010-07-02 08:26:34 EDT ---

The same problem also appeared here, and I have made a few more observations in the process:

1) This bug still appears in Fedora 13.

2) The problem only occurs with the Fedora GCC, a plain vanilla GCC 4.4.3 does not show the same behavior. If I remember correctly, a quick try on my Gentoo system also ran flawlessly.

3) The problem seems to be related to the -fstrict-overflow flag. Adding -fno-strict-overflow to the compiler options resolves it. This is almost certainly a compiler bug. Dumping all intermediate tree representations one can see that in the expression "_map[j - mapstep]" the compiler converts j-mapstep to 64 bit by casting it to "unsigned long int". My guess is that the fstrict-overflow optimizations only are applied at a later stage so that the following optimization is erroneously applied:

[quote gcc info page]
`-fstrict-overflow'

     [...]

     This option also allows the compiler to assume strict pointer
     semantics: given a pointer to an object, if adding an offset to
     that pointer does not produce a pointer to the same object, the
     addition is undefined.  This permits the compiler to conclude that
     `p + u > p' is always true for a pointer `p' and unsigned integer
     `u'.  This assumption is only valid because pointer wraparound is
     undefined, as the expression is false if `p + u' overflows using
     twos complement arithmetic.

     [...]
[end quote]

Of course, this optimization should _not_ be applied here because the expression "j-mapstep" was originally a signed integer, so that the reasoning that there is pointer wraparound is wrong.

4) I had a quick look at the GCC srpm but all the patches seem to be unrelated. Obviously I must be wrong ;-)

Comment 1 Jakub Jelinek 2010-07-06 12:01:19 UTC

Yes, this looks like a bug in ivopts handling, probably a latent bug uncovered by the backport of the http://gcc.gnu.org/PR34163 changes:
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149207
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155288

Small self-contained testcase which works with -O0/-O1 and when compiled with gcc also with -O2/-O3, but fails with g++ at -O2/-O3.  Temporary workaround -fno-ivopts.

struct S { int s; };
struct T { int w; int h; };
int vr;

inline struct T
bar (const struct S * x)
{
  struct T t;
  t.w = vr;
  t.h = x->s;
  return t;
}

__attribute__ ((noinline))
void foo (struct S * w, unsigned char *x, int y, int *z[2])
{
  struct T t;
  int i, j, k;
  t = bar (w);
  k = t.w + 2;
  for (i = 0; i <= t.h; i++)
    {
      int *u = z[i > 0] + 1;
      unsigned char *v;
      int q = 0;
      v = x + k * i + 1;
      for (j = 0; j < t.w; j++)
	{
	  int m = u[j];
	  if (m > y && !q && v[j - k] != 2)
	    v[j] = 0;
	}
    }
}

unsigned char b[64];

int
main (void)
{
  int v[32], *z[2];
  struct S s;
  __builtin_memset (v, 0, sizeof (v));
  vr = 16;
  s.s = 16;
  z[0] = v;
  z[1] = v;
  foo (&s, b + 32, -1, z);
  return 0;
}

Comment 2 Jakub Jelinek 2010-07-09 14:34:21 UTC

Should be fixed in gcc-4.4.4-11.fc{12,13}.

Comment 3 markus.moll 2010-07-09 15:08:59 UTC

(In reply to comment #2)
> Should be fixed in gcc-4.4.4-11.fc{12,13}.    

Can you also mark the original bug (#531695) as a duplicate and close that as well?

Comment 4 Jakub Jelinek 2010-07-09 15:14:50 UTC

No, because while the fix is in gcc-4.4.4-11, it will take some time before that gcc is pushed as errata (especially because gcc-4.4.4-10 has been released as errata a few days ago - there needs to be some time to accumulate further bugfixes), and afterwards opencv needs to be rebuilt with that gcc.