Bug 2190013 - OpenCV 4.7.0 + -Wp,-D_GLIBCXX_ASSERTIONS seems to break DNN functionality
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: opencv
Version: 38
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Nicolas Chauvet (kwizart)
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-04-26 19:33 UTC by Jens Georg
Modified: 2023-07-23 11:15 UTC (History)

Fixed In Version: opencv-4.7.0-9.fc39 opencv-4.7.0-9.fc38
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-22 02:26:31 UTC
Type: ---
Embargoed:



Description Jens Georg 2023-04-26 19:33:20 UTC
Trying to use face recognition with Shotwell on Fedora 38, I get

/usr/include/c++/12/bits/stl_vector.h:1123: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = float; _Alloc = std::allocator<float>; reference = float&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
/usr/include/c++/12/bits/stl_vector.h:1123: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = float; _Alloc = std::allocator<float>; reference = float&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.

This used to work on Fedora 37.
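
The assertion line quoted above packs several facts into one string: the libstdc++ header and line, the function that fired (here std::vector<float>::operator[]), and the failed condition (an index that is not below size()). As a rough aid for reading such lines, here is a minimal Python sketch that splits one into its parts. The helper name parse_glibcxx_assertion and the regular expression are illustrative assumptions based only on the message format seen in this report, not on any documented libstdc++ format guarantee.

```python
import re

# Assumed layout, taken from the message in this report:
#   <file>:<line>: <function signature>: Assertion '<condition>' failed.
ASSERT_RE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+): (?P<func>.+): "
    r"Assertion '(?P<cond>.+)' failed\.$"
)


def parse_glibcxx_assertion(message: str) -> dict:
    """Split a libstdc++ assertion line into file, line, function, condition.

    Hypothetical helper for illustration; raises ValueError on other input.
    """
    match = ASSERT_RE.match(message.strip())
    if match is None:
        raise ValueError("not a libstdc++ assertion line")
    info = match.groupdict()
    info["line"] = int(info["line"])
    # The element type appears in the "[with _Tp = ...; ...]" part of the signature.
    elem = re.search(r"_Tp = ([^;\]]+)", info["func"])
    info["element_type"] = elem.group(1) if elem else None
    return info


if __name__ == "__main__":
    sample = (
        "/usr/include/c++/12/bits/stl_vector.h:1123: "
        "std::vector<_Tp, _Alloc>::reference "
        "std::vector<_Tp, _Alloc>::operator[](size_type) "
        "[with _Tp = float; _Alloc = std::allocator<float>; "
        "reference = float&; size_type = long unsigned int]: "
        "Assertion '__n < this->size()' failed."
    )
    for key, value in parse_glibcxx_assertion(sample).items():
        print(f"{key}: {value}")
```

Read this way, the message says that code indexed a std::vector<float> at or past its size, which is an out-of-bounds access whether or not the assertion is compiled in.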

There is a similar ticket filed against OpenCV, coming from Arch: https://github.com/opencv/opencv/issues/23323

That ticket suggests undefining _GLIBCXX_ASSERTIONS - which I find highly suspicious tbh.

Reproducible: Always

Steps to Reproduce:
1. Compile Shotwell 0.32.0 from source with face detection enabled
2. Import an image with faces
3. Open that image
4. Click on faces
5. Click on "detect faces"
Actual Results:  
Shotwell's face detection helper aborts with the assertions mentioned above.

Expected Results:  
At the very least, no assertions should fire there. Ideally, face detection should work as before.

This is probably more of an upstream issue. As I mentioned, I find the proposed workaround highly suspicious.

Comment 1 Nicolas Chauvet (kwizart) 2023-05-04 14:03:11 UTC
I'm not aware of Fedora shipping OpenCV with face detection enabled (patent issue), so I don't see why it would have worked unless you used a self-compiled OpenCV...

Comment 2 Jens Georg 2023-05-04 16:28:18 UTC
The important part is DNN, not the face recognition. Shotwell just uses the DNN part of OpenCV, and that was definitely enabled in F37.

Comment 3 Sergio Basto 2023-05-19 00:01:51 UTC
Yeah, but we don't ship non-free algorithms like SIFT and SURF, or module/xfeatures2d.

Following the upstream issue, I think we can undefine _GLIBCXX_ASSERTIONS to fix Shotwell, as others did, until upstream fixes the bug...

Comment 4 Nicolas Chauvet (kwizart) 2023-06-12 12:08:37 UTC
Scratch builds (unofficial) of shotwell with face detect support, in order to reproduce:

f38: https://koji.fedoraproject.org/koji/taskinfo?taskID=102054373
f37: https://koji.fedoraproject.org/koji/taskinfo?taskID=102055070
PR for shotwell https://src.fedoraproject.org/rpms/shotwell/pull-request/2

Comment 5 Nicolas Chauvet (kwizart) 2023-06-12 15:07:00 UTC
At least, I confirm that running shotwell with facedetect enabled seems to work with fc37 opencv-4.6.0
(I still need to run shotwell-facedetect manually).

$ /usr/libexec/shotwell/shotwell-facedetect
Attempting to upgrade batch norm layers using deprecated params: /usr/share/shotwell/facedetect/deploy.prototxt
Successfully upgraded batch norm layers using deprecated params.
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/parallel/registry_parallel.impl.hpp (96) ParallelBackendRegistry core(parallel): Enabled backends(2, sorted by priority): TBB(1000); OPENMP(990)
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/include/opencv2/core/parallel/backend/parallel_for.tbb.hpp (54) ParallelForBackend Initializing TBB parallel backend: TBB_INTERFACE_VERSION=11103
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/parallel/parallel.cpp (77) createParallelForAPI core(parallel): using backend: TBB (priority=1000)
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (1186) haveOpenCL Initialize OpenCL runtime...
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (1192) haveOpenCL OpenCL: found 1 platforms
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (984) getInitializedExecutionContext OpenCL: initializing thread execution context
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (994) getInitializedExecutionContext OpenCL: creating new execution context...
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (1012) getInitializedExecutionContext OpenCL: device=Quadro K620
[ INFO:0] global /builddir/build/BUILD/opencv-4.6.0/modules/core/src/ocl.cpp (5370) __init_buffer_pools OpenCL: Initializing buffer pool for context@0 with max capacity: poolSize=0 poolSizeHostPtr=0

It seems to use the TBB backend and OpenCL (via nvidia) for me...

Can you provide the same output?

Comment 6 Sergio Basto 2023-06-12 15:16:41 UTC
BTW, I opened this PR, https://src.fedoraproject.org/rpms/opencv/pull-request/22, which temporarily disables -Wp,-D_GLIBCXX_ASSERTIONS.

Let me know if I can proceed.

Thank you
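
For readers unfamiliar with the workaround, the idea is simply that the package gets built without the -Wp,-D_GLIBCXX_ASSERTIONS token in its compiler flags. The Python sketch below only illustrates that string-level operation. The function name drop_flag and the sample flag string are assumptions made for illustration. How the PR itself changes the spec file is not shown in this report.

```python
import shlex


def drop_flag(flags: str, unwanted: str = "-Wp,-D_GLIBCXX_ASSERTIONS") -> str:
    """Return the flag string with every exact occurrence of `unwanted` removed.

    Hypothetical helper: it only models the effect of the workaround on a
    flag string, not the packaging change itself.
    """
    kept = [token for token in shlex.split(flags) if token != unwanted]
    return shlex.join(kept)


if __name__ == "__main__":
    # Illustrative flag string, not the real Fedora build flags.
    sample_flags = "-O2 -g -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong"
    print(drop_flag(sample_flags))
```

Note that dropping the flag hides the out-of-bounds access rather than fixing it, which is why the workaround was described as temporary.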

Comment 7 Jens Georg 2023-06-12 15:48:28 UTC
Sorry, what exactly do you need from me?

Comment 8 Jens Georg 2023-06-12 15:54:13 UTC
jgeorg@z400: ~/Source/shotwell [git:shotwell-0.32 $=] $ ./build/subprojects/shotwell-facedetect/shotwell-facedetect
Attempting to upgrade batch norm layers using deprecated params: /home/jgeorg/Source/shotwell/subprojects/shotwell-facedetect/deploy.prototxt
Successfully upgraded batch norm layers using deprecated params.
[ INFO:0] global registry_parallel.impl.hpp:96 ParallelBackendRegistry core(parallel): Enabled backends(2, sorted by priority): TBB(1000); OPENMP(990)
[ INFO:0] global parallel_for.tbb.hpp:54 ParallelForBackend Initializing TBB parallel backend: TBB_INTERFACE_VERSION=11103
[ INFO:0] global parallel.cpp:77 createParallelForAPI core(parallel): using backend: TBB (priority=1000)
[ INFO:0] global ocl.cpp:1186 haveOpenCL Initialize OpenCL runtime...
[ INFO:0] global ocl.cpp:1192 haveOpenCL OpenCL: found 1 platforms
[ INFO:0] global ocl.cpp:984 getInitializedExecutionContext OpenCL: initializing thread execution context
[ INFO:0] global ocl.cpp:994 getInitializedExecutionContext OpenCL: creating new execution context...
[ INFO:0] global ocl.cpp:1012 getInitializedExecutionContext OpenCL: device=NVIDIA GeForce GTX 1060 6GB
[ INFO:0] global ocl.cpp:5370 __init_buffer_pools OpenCL: Initializing buffer pool for context@0 with max capacity: poolSize=0 poolSizeHostPtr=0

** (shotwell-facedetect:87759): WARNING **: 17:53:23.593: Face recognition failed: OpenCV(4.7.0) /builddir/build/BUILD/opencv-4.7.0/modules/dnn/src/layers/fast_convolution/winograd_3x3s1_f63.cpp:147: error: (-215:Assertion failed) _FX_WINO_IBLOCK == 3 && _FX_WINO_KBLOCK == 4 in function '_fx_winograd_accum_f32'


** (shotwell-facedetect:87759): WARNING **: 17:53:23.595: Face recognition failed: OpenCV(4.7.0) /builddir/build/BUILD/opencv-4.7.0/modules/dnn/src/layers/fast_convolution/winograd_3x3s1_f63.cpp:147: error: (-215:Assertion failed) _FX_WINO_IBLOCK == 3 && _FX_WINO_KBLOCK == 4 in function '_fx_winograd_accum_f32'

malloc(): unaligned tcache chunk detected
Aborted (core dumped)
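
When comparing a failure like the one above against upstream issues or patches, the useful pieces are the OpenCV version, the source file and line, the error code, and the function name. The following Python sketch pulls those out of an OpenCV error line. The name parse_opencv_error and the regular expression are assumptions derived from the single message shown in this comment, so other OpenCV messages may not match.

```python
import re

# Assumed layout, taken from the warning in this comment:
#   OpenCV(<version>) <file>:<line>: error: (<code>:<kind>) <detail> in function '<name>'
OPENCV_ERR_RE = re.compile(
    r"OpenCV\((?P<version>[^)]+)\) (?P<file>\S+):(?P<line>\d+): "
    r"error: \((?P<code>-?\d+):(?P<kind>[^)]+)\) "
    r"(?P<detail>.*) in function '(?P<function>[^']+)'"
)


def parse_opencv_error(text: str) -> dict:
    """Extract version, location, code, and function from an OpenCV error line.

    Hypothetical helper for illustration; raises ValueError if nothing matches.
    """
    match = OPENCV_ERR_RE.search(text)
    if match is None:
        raise ValueError("no OpenCV error found in text")
    info = match.groupdict()
    info["line"] = int(info["line"])
    info["code"] = int(info["code"])
    return info


if __name__ == "__main__":
    sample = (
        "Face recognition failed: OpenCV(4.7.0) /builddir/build/BUILD/"
        "opencv-4.7.0/modules/dnn/src/layers/fast_convolution/"
        "winograd_3x3s1_f63.cpp:147: error: (-215:Assertion failed) "
        "_FX_WINO_IBLOCK == 3 && _FX_WINO_KBLOCK == 4 "
        "in function '_fx_winograd_accum_f32'"
    )
    for key, value in parse_opencv_error(sample).items():
        print(f"{key}: {value}")
```

The search is unanchored on purpose, so the GLib warning prefix in front of the OpenCV text does not need to be stripped first.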

Comment 9 Jens Georg 2023-06-12 15:57:42 UTC
Not sure why this is now crashing differently.

Comment 10 Jens Georg 2023-06-12 16:00:43 UTC
That seems to be https://github.com/opencv/opencv/pull/23112

Comment 11 Nicolas Chauvet (kwizart) 2023-06-12 16:38:48 UTC
Thanks for the hint. I've made a scratch build before submitting any real build (still building, it should take about 1 hour):
https://koji.fedoraproject.org/koji/taskinfo?taskID=102067629

Comment 12 Nicolas Chauvet (kwizart) 2023-06-12 18:09:49 UTC
Fixed patch application: https://koji.fedoraproject.org/koji/taskinfo?taskID=102070534

Comment 13 Fedora Update System 2023-06-12 20:20:36 UTC
FEDORA-2023-e01ec3ce94 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-e01ec3ce94

Comment 14 Fedora Update System 2023-06-12 20:25:07 UTC
FEDORA-2023-e01ec3ce94 has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 15 Sergio Basto 2023-06-12 20:35:11 UTC
We should push the same fix to F38.

Comment 16 Fedora Update System 2023-06-12 20:37:48 UTC
FEDORA-2023-79a0041426 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-79a0041426

Comment 17 Fedora Update System 2023-06-13 01:42:42 UTC
FEDORA-2023-79a0041426 has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-79a0041426`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-79a0041426

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2023-06-22 02:26:31 UTC
FEDORA-2023-79a0041426 has been pushed to the Fedora 38 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 19 Fedora Update System 2023-07-23 11:13:55 UTC
FEDORA-2023-c8fa60873d has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-c8fa60873d

Comment 20 Fedora Update System 2023-07-23 11:15:00 UTC
FEDORA-2023-c8fa60873d has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

