Bug 594463

Summary: Strange crashes on an i686 (valgrind output in bug report)
Product: [Fedora] Fedora Reporter: Trever Adams <trever>
Component: opencv Assignee: Rakesh Pandit <rpandit>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: low    
Version: 14 CC: karlthered, kklic, kwizart, nomis80, rpandit
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-10 21:05:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                    Flags
ABRT output showing bug        none
Valgrind capture of problems   none

Description Trever Adams 2010-05-20 18:13:08 UTC
Description of problem:
I have been trying to track down the problems I am having on my router with opencv (used for web filtering). I cannot give exact line numbers, but I finally have valgrind output. I have removed the mentions of my own code, as those paths are clean and show no errors. I am hoping someone has better luck than I am having at figuring all of this out. I see these four errors over and over. The program runs fine under valgrind but crashes on its own.

Version-Release number of selected component (if applicable):
opencv-2.0.0-7.fc13.i686

Valgrind Output:
==23093==  Address 0xea0dffc is 45,604 bytes inside a block of size 151,388 free'd
==23093==    at 0x40057F6: free (vg_replace_malloc.c:325)
==23093==    by 0xB132962: cv::fastFree(void*) (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xB13298B: cvFree_ (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xAFEA20F: cvReleaseMat (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xAD837A5: cvHaarDetectObjects (in /usr/lib/libcv.so.2.0.0)

==23093== 
==23093== Invalid read of size 4
==23093==    at 0xAD7DB11: icvEvalHidHaarClassifier(CvHidHaarClassifier*, double, unsigned int) (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD81A8E: cvRunHaarClassifierCascade (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD81FFA: ??? (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD847C3: cvHaarDetectObjects (in /usr/lib/libcv.so.2.0.0)

==23093==  Address 0xea13098 is 66,240 bytes inside a block of size 151,388 free'd
==23093==    at 0x40057F6: free (vg_replace_malloc.c:325)
==23093==    by 0xB132962: cv::fastFree(void*) (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xB13298B: cvFree_ (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xAFEA20F: cvReleaseMat (in /usr/lib/libcxcore.so.2.0.0)
==23093==    by 0xAD837A5: cvHaarDetectObjects (in /usr/lib/libcv.so.2.0.0)

==23093== Conditional jump or move depends on uninitialised value(s)
==23093==    at 0xAD7DB2F: icvEvalHidHaarClassifier(CvHidHaarClassifier*, double, unsigned int) (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD81A8E: cvRunHaarClassifierCascade (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD81FFA: ??? (in /usr/lib/libcv.so.2.0.0)
==23093==    by 0xAD847C3: cvHaarDetectObjects (in /usr/lib/libcv.so.2.0.0)

Comment 1 Trever Adams 2010-05-21 18:33:08 UTC
This may not be entirely accurate as I am getting a strange message about CRC for the opencv debug info.

icvEvalHidHaarClassifier (classifier=0xab1e076c, variance_norm_factor=
    59.600247332821034, p_offset=8822)
    at /usr/src/debug/OpenCV-2.0.0/src/cv/cvhaar.cpp:681
681	        double sum = calc_sum(node->feature.rect[0],p_offset) * node->feature.rect[0].weight;
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.23-1.fc13.i686 atk-1.30.0-1.fc13.i686 bzip2-libs-1.0.5-6.fc12.i686 c_icap_modules-classify-20100521-1.fc13.i686 cairo-1.8.10-1.fc13.i686 clamav-lib-0.95.3-1301.fc13.i686 db4-4.8.30-1.fc13.i686 expat-2.0.1-10.fc13.i686 fontconfig-2.8.0-1.fc13.i686 freetype-2.3.11-3.fc13.i686 glib2-2.24.1-1.fc13.i686 gstreamer-0.10.29-1.fc13.i686 gtk2-2.20.1-1.fc13.i686 jasper-libs-1.900.1-15.fc13.i686 libX11-1.3.1-3.fc13.i686 libXau-1.0.5-1.fc12.i686 libXcomposite-0.4.1-2.fc13.i686 libXcursor-1.1.10-4.fc13.i686 libXdamage-1.1.2-2.fc13.i686 libXext-1.1-2.fc13.i686 libXfixes-4.0.4-2.fc13.i686 libXi-1.3-2.fc13.i686 libXinerama-1.1-2.fc13.i686 libXrandr-1.3.0-5.fc13.i686 libXrender-0.9.5-1.fc13.i686 libdc1394-2.1.2-3.fc12.i686 libgcc-4.4.4-2.fc13.i686 libgomp-4.4.4-2.fc13.i686 libjpeg-6b-46.fc12.i686 libogg-1.2.0-1.fc13.i686 libpng-1.2.43-1.fc13.i686 libraw1394-2.0.5-1.fc13.i686 libselinux-2.0.90-5.fc13.i686 libstdc++-4.4.4-2.fc13.i686 libtheora-1.1.1-1.fc13.i686 libtiff-3.9.2-3.fc13.i686 libucil-0.9.8-1.fc13.i686 libunicap-0.9.8-1.fc13.i686 libusb1-1.0.6-2.fc13.i686 libvorbis-1.3.1-1.fc13.i686 libxcb-1.5-1.fc13.i686 libxml2-2.7.7-1.fc13.i686 pango-1.28.0-1.fc13.i686 pixman-0.18.0-1.fc13.i686 tre-0.8.0-1.fc13.i686
(gdb) back
#0  icvEvalHidHaarClassifier (classifier=0xab1e076c, variance_norm_factor=
    59.600247332821034, p_offset=8822)
    at /usr/src/debug/OpenCV-2.0.0/src/cv/cvhaar.cpp:681
#1  0x04787eaf in cvRunHaarClassifierCascade (_cascade=0x80edfc0, pt=..., 
    start_stage=0) at /usr/src/debug/OpenCV-2.0.0/src/cv/cvhaar.cpp:747
#2  0x0478841b in cvHaarDetectObjects.omp_fn.1(void) (.omp_data_i=0xac698d30)
    at /usr/src/debug/OpenCV-2.0.0/src/cv/cvhaar.cpp:1260
#3  0x0478abe4 in cvHaarDetectObjects (_img=0xab1adc10, 
    cascade=<value optimized out>, storage=0xab12b500, 
    scale_factor=<value optimized out>, min_neighbors=1, 
    flags=<value optimized out>, min_size=...)
    at /usr/src/debug/OpenCV-2.0.0/src/cv/cvhaar.cpp:1230
#4  0x00e1b7e2 in categorize_image () from /usr/lib/c_icap//srv_classify.so
#5  0x00e1720d in srvclassify_end_of_data_handler ()
   from /usr/lib/c_icap//srv_classify.so
#6  0x08050cb8 in do_request (req=0xab1004b0) at request.c:1157
#7  0x08051235 in process_request (req=0xab1004b0) at request.c:1196
#8  0x0805c062 in thread_main (srv=0x1caf1ca0) at mpmt_server.c:518
#9  0x00ad5919 in start_thread () from /lib/libpthread.so.0
#10 0x00a17e5e in clone () from /lib/libc.so.6

Comment 2 Trever Adams 2010-05-22 11:06:22 UTC
The application I am writing is multithreaded (c-icap module, so I cannot force it into single threaded mode, not even for testing). So, this may be related to bug #571380. Also, it appears the problem only shows up if an object is detected. If no object is detected it seems to run just fine. Each image is handled multiple times. I have commented out for trial runs all functions that use opencv, except this one:

// Function to detect objects
static void detect(ImageSession *mySession)
{
	double t = 0;
	ImageDetected *current = mySession->detected;

	while(current != NULL)
	{
		// Run detection only if the cascade for this category is loaded.
		if( current->category->cascade )
		{
			t = cvGetTickCount();

			// There can be more than one object in an image, so the detected objects
			// are returned as a growable sequence.
			current->detected = cvHaarDetectObjects( mySession->rightImage, current->category->cascade, mySession->dstorage,
								1.1, 1, 0, cvSize(0, 0) );

			t = cvGetTickCount() - t;
			t = t / ((double)cvGetTickFrequency() * 1000.);
			// Log inside the branch: without a detection run, current->detected may be NULL or stale.
			ci_debug_printf(8, "File: %s Object: %s (%d) Detection took: %gms.\n", (rindex(mySession->fname, '/'))+1, current->category->name, current->detected->total, t);
		}

		current=current->next;
	}
}

Comment 3 Trever Adams 2010-06-15 12:08:42 UTC
Created attachment 424129 [details]
ABRT output showing bug

ABRT output from my program. Sorry, I am not sure why the c-icap-modules do not have debugging information.

Hopefully this will help out. I am unable to duplicate this bug in a single threaded application.

Comment 4 Trever Adams 2010-06-16 15:17:50 UTC
Created attachment 424501 [details]
Valgrind capture of problems

This is a much better capture of the bug. Hopefully this will help track it down. It may have duplicate information as I left it running for quite some time (~300 images).

Comment 5 Trever Adams 2010-06-17 16:02:10 UTC
This may or may not be related to this bug. I have only started seeing it in the last few days:

OpenCV Error: Unspecified error (hid_cascade has been already created) in icvCreateHidHaarClassifierCascade, file /builddir/build/BUILD/OpenCV-2.0.0/src/cv/cvhaar.cpp, line 208
terminate called after throwing an instance of 'cv::Exception'


It should be noted that I do (CvHaarClassifierCascade*) cvLoad(current->cascade_location, NULL, NULL, NULL ); once during initialization of the program for each cascade (there are 4+).

This program is multithreaded. Each thread will potentially call HaarClassifier. (It may be classifying text.) -- Again, my application is open source (a module for c-icap).

I only see this new error once per execution of the application.

Comment 6 Trever Adams 2010-06-18 03:26:25 UTC
I made a mistake. I was only seeing it once because all the other sub-processes and threads disconnect from the console, so their stdout is not shown. That message is happening with great regularity.
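
The wording of the error ("hid_cascade has been already created") would be consistent with helper data that is built on first use and must not be built twice. Assuming that is what happens, two threads sharing one cascade could both see "not yet created" and both try to create it. The following is a minimal sketch of that check-then-create pattern made safe with a mutex. It is not OpenCV code: LazyCascade, ensure_hidden, first_use and count_builds are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared object with helper data that is built on first use. */
typedef struct {
    void *hidden;           /* NULL until the helper data has been built */
    int build_count;        /* how many times the build step actually ran */
    pthread_mutex_t lock;   /* makes "check, then create" one atomic step */
} LazyCascade;

/* Build the helper data at most once, however many threads call this. */
static void ensure_hidden(LazyCascade *c)
{
    assert(c != NULL);
    pthread_mutex_lock(&c->lock);
    if (c->hidden == NULL) {
        c->hidden = malloc(64);   /* stand-in for the real build step */
        if (c->hidden != NULL)
            c->build_count++;
    }
    pthread_mutex_unlock(&c->lock);
}

/* Thread body: the first use of the shared object triggers the lazy build. */
static void *first_use(void *arg)
{
    ensure_hidden(arg);
    return NULL;
}

/* Starts up to 16 threads that all touch one shared object and returns
 * how many times its helper data was built. */
int count_builds(int nthreads)
{
    LazyCascade shared;
    pthread_t tid[16];
    int started = 0;

    shared.hidden = NULL;
    shared.build_count = 0;
    pthread_mutex_init(&shared.lock, NULL);

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, first_use, &shared) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    free(shared.hidden);
    pthread_mutex_destroy(&shared.lock);
    return shared.build_count;
}
```

Calling count_builds(8) from a test program should report a single build. Without the lock, several threads could pass the NULL check before any of them stores the pointer. Note that this only covers the creation step; it does nothing for threads that later use the same helper data concurrently.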

Comment 7 Rakesh Pandit 2010-06-18 03:42:58 UTC
Thanks for looking into it. I will check this weekend what can be done.

Comment 8 Trever Adams 2010-06-18 13:41:59 UTC
Thank you. I hope this doesn't come off wrong, but could you take this to the opencv community? I would look at the code myself, but I am swamped and am unfamiliar with the codebase. I would also take it to opencv myself, but I don't have an established relationship...

Again, thank you to all of the Fedora opencv team and to the opencv team.

Comment 9 Trever Adams 2010-06-21 16:50:03 UTC
One of my five trainers is running, the other three aren't. It seems the ones that aren't have splits off the 0 node. The one that is didn't do a split until later in the cascade.

Working: 0-1-2-3
Not working:

0-1
|
2-3

Thank you again.

Comment 10 Trever Adams 2010-06-21 23:19:51 UTC
Sorry, comment #9 was intended for a different bug in opencv (#605499)

Comment 11 Trever Adams 2010-06-28 19:42:13 UTC
I have tracked this down. It appears there are private data structures in the cascade that get used during detection. The appropriate "fix" seems to be adding the following to the official documentation for cvHaarDetectObjects and its ilk:

"This function is thread safe, IF AND ONLY IF, you do not use the same loaded cascade object from multiple threads at the same time. If such functionality is needed, please load (cvLoad or other) multiple copies."

If you have an established relationship with the OpenCV guys, could you suggest this change to the official documentation?

Thank you.
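
A minimal sketch of the "one copy per thread" approach described in this comment, assuming the cascade holds private working state that a detection call overwrites. It deliberately avoids the OpenCV API: Cascade, load_cascade and detect_with are hypothetical stand-ins for the loaded cascade, a cvLoad call and cvHaarDetectObjects, and run_workers is an illustrative driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded cascade: read-only model data plus
 * private scratch state that every detection call overwrites. */
typedef struct {
    int weights[4];   /* model data, never modified after loading */
    long scratch;     /* hidden working state, rewritten on every call */
} Cascade;

/* Hypothetical loader: each call returns a fresh, independent copy,
 * playing the role of one load call per thread. */
static Cascade *load_cascade(void)
{
    static const int model[4] = {1, 2, 3, 4};
    Cascade *c = malloc(sizeof *c);
    if (c != NULL) {
        memcpy(c->weights, model, sizeof model);
        c->scratch = 0;
    }
    return c;
}

/* Hypothetical detector: it uses c->scratch as working storage, so it is
 * only safe while no other thread touches the same Cascade. */
static long detect_with(Cascade *c, int input)
{
    assert(c != NULL);
    c->scratch = 0;
    for (int i = 0; i < 4; i++)
        c->scratch += (long)c->weights[i] * input;
    return c->scratch;
}

typedef struct {
    int input;
    long result;
} Job;

/* Thread body: load a private cascade, use it repeatedly, release it. */
static void *worker(void *arg)
{
    Job *job = arg;
    Cascade *mine = load_cascade();
    if (mine == NULL) {
        job->result = -1;
        return NULL;
    }
    for (int i = 0; i < 1000; i++)
        job->result = detect_with(mine, job->input);
    free(mine);
    return NULL;
}

/* Runs n workers (1 to 8) and returns 0 if every result is correct. */
int run_workers(int n)
{
    pthread_t tid[8];
    Job jobs[8];
    int started = 0, ok = 1;

    if (n < 1 || n > 8)
        return -1;
    for (int i = 0; i < n; i++) {
        jobs[i].input = i + 1;
        jobs[i].result = 0;
        if (pthread_create(&tid[i], NULL, worker, &jobs[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        /* The weights sum to 10, so the expected result is 10 * input. */
        if (jobs[i].result != 10L * (i + 1))
            ok = 0;
    }
    return (ok && started == n) ? 0 : -1;
}
```

Because each worker owns its Cascade, no locking is needed around detect_with. The cost is one copy of the model per thread; if that is too much memory, a mutex around each detection call on a shared object is the usual alternative, at the price of serialising detection.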

Comment 12 Nicolas Chauvet (kwizart) 2011-01-07 09:07:52 UTC
Sorry for the late answer.

Can you reproduce this with OpenCV 2.1 from fc14 (or even OpenCV 2.2 for fc15)?

Comment 13 Nicolas Chauvet (kwizart) 2011-02-04 22:09:50 UTC
Can you reproduce this with OpenCV 2.2.0:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2762186 ?
That said, you really should report this problem directly upstream
(please post the link to the upstream bug report here).

Comment 14 Nicolas Chauvet (kwizart) 2011-06-10 21:05:13 UTC
Same as
https://bugzilla.redhat.com/show_bug.cgi?id=605499#c6