Bug 1414611

Summary: embree incorrectly enables avx2 instead of avx
Product: [Fedora] Fedora Reporter: Yaakov Selkowitz <yselkowi>
Component: embree Assignee: Luya Tshimbalanga <luya_tfz>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 25 CC: besser82, extras-qa, kwizart, luya, luya_tfz, mtasaka, skolosov, yselkowi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: embree-2.13.0-3.fc24 embree-2.13.0-3.fc25 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1412089 Environment:
Last Closed: 2017-02-07 22:18:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1412089    
Bug Blocks:    

Description Yaakov Selkowitz 2017-01-19 02:10:32 UTC
+++ This bug was initially created as a clone of Bug #1412089 +++

Description of problem: LuxRender is crashing on startup

Version: 1.6
Release: 4.f25

Steps to Reproduce:
Start LuxRender

Actual results:
LuxRender crashes

Expected results:
Normal 

Additional info:

[user@machine ~]$ luxrender 
*** Error in `luxrender': realloc(): invalid pointer: 0x00007f0852b140c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x791fb)[0x7f084c5651fb]
/lib64/libc.so.6(realloc+0x3b0)[0x7f084c572840]
/lib64/libQt5Core.so.5(_ZN9QListData12realloc_growEi+0x31)[0x7f083f380c51]
/lib64/libQt5Core.so.5(_ZN9QListData6appendEi+0x4f)[0x7f083f380cef]
/lib64/libQt5Core.so.5(+0x1b32c0)[0x7f083f4382c0]
/lib64/libQt5Core.so.5(_Z21qRegisterResourceDataiPKhS0_S0_+0x275)[0x7f083f4340d5]
/lib64/libQt5Core.so.5(+0x7ecd3)[0x7f083f303cd3]
/lib64/ld-linux-x86-64.so.2(+0x10a3a)[0x7f0854c63a3a]
/lib64/ld-linux-x86-64.so.2(+0x10b4b)[0x7f0854c63b4b]
/lib64/ld-linux-x86-64.so.2(+0xd0a)[0x7f0854c53d0a]

--- Additional comment from Mamoru TASAKA on 2017-01-18 02:46:09 EST ---

Maybe the easiest way is to ask OpenImageIO maintainer to create another library without using opencv, put such library into some non-standard directory, and make LuxRender use such OpenImageIO library??

--- Additional comment from Yaakov Selkowitz on 2017-01-18 04:01:42 EST ---

(In reply to Mamoru TASAKA from comment #4)
> Maybe the easiest way is to ask OpenImageIO maintainer to create another
> library without using opencv, put such library into some non-standard
> directory, and make LuxRender use such OpenImageIO library??

Actually, it appears that the libOpenImageIO.so.1.6 -> libopencv_highgui.so.3.1 dependency may not be strictly necessary; the OpenCV APIs used by OpenImageIO were in the highgui library prior to 3.0, but are now in the videoio library (which does not depend on Qt).  Dropping that DT_NEEDED entry would clean up this issue without losing functionality.

I started working on some patches for OpenImageIO to do that, but I need to go AFK for a few hours.

--- Additional comment from Yaakov Selkowitz on 2017-01-18 15:48 EST ---

[Rawhide] Scratch build in progress:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17324238

--- Additional comment from Yaakov Selkowitz on 2017-01-18 15:50 EST ---

[F25] Scratch build in progress:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17324214

--- Additional comment from Luya Tshimbalanga on 2017-01-18 17:45:18 EST ---

Using the scratch build package, here is the result
$ luxrender
Illegal instruction (core dumped)

It looks like LuxRender will need a rebuild against the updated OpenImageIO.

--- Additional comment from Yaakov Selkowitz on 2017-01-18 18:56:24 EST ---

(In reply to Luya Tshimbalanga from comment #8)
> Using the scratch build package, here is the result
> $ luxrender
> Illegal instruction (core dumped)
> 
> It looks like LuxRender will need a rebuild against the updated OpenImageIO.

Probably not; that could be anything.  Could you install LuxRender-debuginfo and debug it?

--- Additional comment from Luya Tshimbalanga on 2017-01-18 19:58:24 EST ---

Initial debug output from luxrender:

[New Thread 0x7fffc4756700 (LWP 2831)]

Thread 1 "luxrender" received signal SIGILL, Illegal instruction.
0x00007ffff494cda8 in _GLOBAL__sub_I_bvh_intersector_stream_filters.cpp.avx2.cpp(void) ()
    at /usr/src/debug/embree-2.13.0/kernels/bvh/bvh_intersector_stream_filters.cpp:431
431	};

Need to look further

Comment 1 Yaakov Selkowitz 2017-01-19 02:15:57 UTC
AVX2 is fairly new, does your CPU support it?

Comment 2 Luya Tshimbalanga 2017-01-19 03:33:28 UTC
(In reply to Yaakov Selkowitz from comment #1)
> AVX2 is fairly new, does your CPU support it?

No, it does not, according to /proc/cpuinfo below:

model name	: AMD A10-7400P Radeon R6, 10 Compute Cores 4C+6G
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc cpb hw_pstate vmmcall fsgsbase bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold overflow_recov
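For illustration, the by-eye check of the flags line above can be written as a whole-token match, which matters because "avx" is a prefix of "avx2". This is a minimal sketch in C; the function name cpuinfo_has_flag is hypothetical and is not part of embree or LuxRender.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if `flag` appears as a whole
 * whitespace-separated token in `flags_line` (the text that follows
 * "flags :" in /proc/cpuinfo). A substring search would be wrong here,
 * because "avx" would also match inside "avx2". */
bool cpuinfo_has_flag(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t flag_len = strlen(flag);
    if (flag_len == 0)
        return false;

    const char *p = flags_line;
    while (*p != '\0') {
        /* Skip the separators in front of the next token. */
        p += strspn(p, " \t\n");
        /* Measure the token that starts at p (may be empty at end of line). */
        size_t tok_len = strcspn(p, " \t\n");
        if (tok_len == flag_len && strncmp(p, flag, flag_len) == 0)
            return true;
        p += tok_len;
    }
    return false;
}
```

Applied to the flags line above, a lookup for "avx" succeeds while a lookup for "avx2" does not.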

Comment 3 Mamoru TASAKA 2017-01-19 04:01:52 UTC
Perhaps embree is wrongly concluding that Luya's CPU supports avx2? From a brief look at the embree code, it calls "__cpuid", and kernels/common/state.cpp seems to decide which instruction sets are supported.

Comment 4 Luya Tshimbalanga 2017-01-19 06:04:13 UTC
I made scratch builds of embree with the highest ISA set to avx:

Rawhide
https://koji.fedoraproject.org/koji/taskinfo?taskID=17327281

F25
https://koji.fedoraproject.org/koji/taskinfo?taskID=17327283

The bug is indeed in embree in this case, and I filed it upstream:
https://github.com/embree/embree/issues/112

Comment 5 Luya Tshimbalanga 2017-01-19 06:05:20 UTC
As tested, the scratch build allows LuxRender to run on my hardware.

Comment 6 Yaakov Selkowitz 2017-01-19 06:23:16 UTC
In that case, reassigning to embree.

Comment 7 Luya Tshimbalanga 2017-01-19 21:22:51 UTC
Upstream mentioned that they lack AMD hardware to test my case and would like to ask one of their contributors to test it. As Mamoru noted, embree does call "__cpuid", and the odd behaviour seems specific to AMD CPUs (tested both in GNOME Boxes running with Opteron emulation and on my AMD A10-7400P hardware).
In the meantime, should I push the workaround in the release?

Comment 8 Yaakov Selkowitz 2017-01-19 21:47:22 UTC
(In reply to Luya Tshimbalanga from comment #7)
> In the meantime, should I push the workaround in the release?

Without being very familiar with the code, the workaround doesn't seem as if it would be harmful, so I suppose so.  But please do add a comment to the spec for future reference, as hopefully this can sooner or later be reverted.

Comment 9 Luya Tshimbalanga 2017-01-19 23:40:59 UTC
(In reply to Yaakov Selkowitz from comment #8)
> (In reply to Luya Tshimbalanga from comment #7)
> > In the meantime, should I push the workaround in the release?
> 
> Without being very familiar with the code, the workaround doesn't seem as if
> it would be harmful, so I suppose so.  But please do add a comment to the spec
> for future reference, as hopefully this can sooner or later be reverted.

Done. 

F25
https://bodhi.fedoraproject.org/updates/FEDORA-2017-3f28cc93c1

F24
https://bodhi.fedoraproject.org/updates/FEDORA-2017-47e62bed28

Comment 10 Fedora Update System 2017-01-29 19:26:38 UTC
embree-2.13.0-3.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2017-909967fb02

Comment 11 Fedora Update System 2017-01-29 19:32:09 UTC
embree-2.13.0-3.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-d4ebfe8fa5

Comment 12 Fedora Update System 2017-01-29 19:32:42 UTC
embree-2.13.0-3.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2017-909967fb02

Comment 13 Luya Tshimbalanga 2017-01-29 19:39:44 UTC
Upstream provided a fix, which needs more testing. Using the above update in combination with the scratch build of OpenImageIO allows LuxRender to run on a non-Intel system.

$ luxrender
[Lux 2017-Jan-29 11:38:38 INFO : 0] Lux version 1.6.0  Build 0 of Nov 24 2016 at 02:51:55
[Lux 2017-Jan-29 11:38:38 INFO : 0] Threads: 4

Comment 14 Fedora Update System 2017-01-31 02:51:38 UTC
embree-2.13.0-3.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-909967fb02

Comment 15 Fedora Update System 2017-01-31 03:51:15 UTC
embree-2.13.0-3.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-d4ebfe8fa5

Comment 16 Fedora Update System 2017-02-07 22:18:54 UTC
embree-2.13.0-3.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2017-02-08 01:50:58 UTC
embree-2.13.0-3.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.