Bug 1605231 - blas_shutdown segment fault in version 0.3.1
Summary: blas_shutdown segment fault in version 0.3.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openblas
Version: 28
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Susi Lehtola
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-07-20 13:49 UTC by Yupeng Chang
Modified: 2018-08-14 09:33 UTC (History)

Fixed In Version: openblas-0.3.1-2.fc28 openblas-0.3.2-1.fc27
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-29 03:20:40 UTC
Type: Bug
Embargoed:


Attachments
Patch to fix this blas_shutdown segment fault (458 bytes, patch)
2018-07-20 13:49 UTC, Yupeng Chang

Description Yupeng Chang 2018-07-20 13:49:39 UTC
Created attachment 1464954
Patch to fix this blas_shutdown segment fault

Description of problem:

I built caffe from source and use the pycaffe interface for programming.
When I import caffe in python3 and then press Ctrl+D, a segmentation fault is reported.

Version-Release number of selected component (if applicable):
0.3.1

How reproducible:
Every time; the segmentation fault occurs on each run.

Steps to Reproduce:
1. build caffe from source code (https://github.com/BVLC/caffe) and install caffe
2. run python3
3. import caffe
4. press Ctrl+D

Actual results:
lldb python3
(lldb) target create "python3"
Current executable set to 'python3' (x86_64).
(lldb) run
Process 26585 launched: '/usr/bin/python3' (x86_64)
Python 3.6.6 (default, Jul 19 2018, 14:25:17) 
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
>>> 
Process 26585 stopped
* thread #1, name = 'python3', stop reason = signal SIGSEGV: invalid address (fault address: 0x7fffd9fff008)
    frame #0: 0x00007fffe6fb01b3 libopenblasp.so.0`blas_shutdown at memory.c:1281
   1278     for (pos = 0; pos < BUFFERS_PER_THREAD; pos ++){
   1279       struct alloc_t *alloc_info = local_memory_table[thread][pos];
   1280       if (alloc_info) {
-> 1281         alloc_info->release_func(alloc_info);
   1282         alloc_info = (void *)0;
   1283       }
   1284     }
(lldb) bt
* thread #1, name = 'python3', stop reason = signal SIGSEGV: invalid address (fault address: 0x7fffd9fff008)
  * frame #0: 0x00007fffe6fb01b3 libopenblasp.so.0`blas_shutdown at memory.c:1281
    frame #1: 0x00007fffe6d82015 libopenblasp.so.0`gotoblas_quit at memory.c:1470
    frame #2: 0x00007ffff7de58e6 ld-linux-x86-64.so.2`_dl_fini + 518
    frame #3: 0x00007ffff6b3a72c libc.so.6`__run_exit_handlers + 316
    frame #4: 0x00007ffff6b3a85c libc.so.6`__GI_exit + 28
    frame #5: 0x00007ffff6b24252 libc.so.6`__libc_start_main + 242
    frame #6: 0x0000555555554e1a python3`_start + 42
(lldb) c
Process 26585 resuming
Process 26585 exited with status = 11 (0x0000000b)

Expected results:
program exits normally

Additional info:
I think I have found the root cause.
Look at lines 1279 and 1282 in the listing above.
On line 1279 a local variable "alloc_info" is declared and initialized with the value of "local_memory_table[thread][pos]".
On line 1282 this local variable "alloc_info" is set to NULL;
however, the original location "local_memory_table[thread][pos]" is left unchanged. It is NOT set to NULL, so the table still holds the stale pointer.

If libopenblasp.so is loaded multiple times in memory, blas_shutdown() runs normally the first time, and the second time it runs into a segmentation fault.

The solution is to change line 1282 from "alloc_info = (void *)0;" to "local_memory_table[thread][pos] = (void *)0;".

I have created a patch based on openblas version 0.3.1, which is attached.
I have tested it: after applying the patch and rebuilding openblas, the crash no longer occurs.
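
For illustration, here is a minimal standalone sketch of the pattern described above. It is not the OpenBLAS source: the table sizes, the contents of struct alloc_t, and the helper names allocate_slot, release_record, shutdown_fixed and release_calls are assumptions made up for this example. Only the shape of the loop follows the lldb listing. It shows the corrected loop, which clears the table slot instead of the local copy, so that a second shutdown pass finds nothing left to release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_THREADS 2          /* hypothetical sizes, for illustration only */
#define BUFFERS_PER_THREAD 4

/* Simplified stand-in for the allocation record seen in the listing. */
struct alloc_t {
    void (*release_func)(struct alloc_t *);
};

static struct alloc_t *local_memory_table[NUM_THREADS][BUFFERS_PER_THREAD];
static int release_calls;      /* counts how often a release function ran */

/* Hypothetical release function: frees the record itself. */
void release_record(struct alloc_t *info) {
    release_calls++;
    free(info);
}

/* Hypothetical helper: fill one slot so shutdown has something to release. */
void allocate_slot(int thread, int pos) {
    struct alloc_t *info = malloc(sizeof *info);
    if (info == NULL) abort();
    info->release_func = release_record;
    local_memory_table[thread][pos] = info;
}

/* Shutdown loop with the proposed fix applied. */
void shutdown_fixed(void) {
    for (int thread = 0; thread < NUM_THREADS; thread++) {
        for (int pos = 0; pos < BUFFERS_PER_THREAD; pos++) {
            struct alloc_t *alloc_info = local_memory_table[thread][pos];
            if (alloc_info) {
                alloc_info->release_func(alloc_info);
                /* The reported code wrote `alloc_info = (void *)0;` here,
                   which only clears the local copy and leaves a dangling
                   pointer in the table. Clearing the slot itself makes a
                   repeated shutdown pass skip it. */
                local_memory_table[thread][pos] = NULL;
            }
        }
    }
}
```

In this sketch, filling two slots with allocate_slot() and calling shutdown_fixed() twice releases each record exactly once. With the original assignment to the local variable, the second pass would call release_func through an already freed pointer, which is undefined behaviour and matches the kind of crash shown in the backtrace.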

Comment 1 Susi Lehtola 2018-07-21 10:49:44 UTC
Ah, I see you already filed a ticket upstream - thanks. Next time a link will suffice. I'll patch the package as soon as the commit is merged upstream.

Comment 2 Yupeng Chang 2018-07-21 15:20:39 UTC
(In reply to Susi Lehtola from comment #1)
> Ah, I see you already filed a ticket upstream - thanks. Next time a link
> will suffice. I'll patch the package as soon as the commit is merged
> upstream.

I reported this issue to Red Hat Bugzilla first, then realized that I should also report it to the upstream developers.
Next time I'll report issues upstream and then add the link here.

It seems that upstream has added this issue to the 0.3.2 milestone.
You can see the status change at
 https://github.com/xianyi/OpenBLAS/issues/1692

Comment 3 Susi Lehtola 2018-07-21 19:49:13 UTC
Yes, meaning they agree the problem exists and will be addressed before 0.3.2 is released.

But, your patch has not yet been accepted.

Comment 4 Yupeng Chang 2018-07-22 09:06:12 UTC
(In reply to Susi Lehtola from comment #3)
> Yes, meaning they agree the problem exists and will be addressed before
> 0.3.2 is released.
> 
> But, your patch has not yet been accepted.

They have just merged this patch.
https://github.com/martin-frbg/OpenBLAS/commit/43ac839c168c652e52320267b0504e6933cb9f60

It seems that this patch fixes multiple bugs.

Comment 5 Yupeng Chang 2018-07-22 09:12:19 UTC
(In reply to Susi Lehtola from comment #3)
> Yes, meaning they agree the problem exists and will be addressed before
> 0.3.2 is released.
> 
> But, your patch has not yet been accepted.

The developer has submitted the pull request, but it has not been merged yet.
I was too impatient. :-D

Still, this patch does fix some bugs.
I hope the pull request is accepted soon.

Comment 6 Susi Lehtola 2018-07-22 10:29:01 UTC
(In reply to Yupeng Chang from comment #4)
> It seems that this patch fixes multiple bugs.

It also doesn't fix multiple bugs, just the bug you reported with multiple instances ;)

Comment 7 Yupeng Chang 2018-07-22 14:49:51 UTC
(In reply to Susi Lehtola from comment #6)
> (In reply to Yupeng Chang from comment #4)
> > It seems that this patch fixes multiple bugs.
> 
> It also doesn't fix multiple bugs, just the bug you reported with multiple
> instances ;)

The pull request has finally been merged into the OpenBLAS development branch.
Here is the link: https://github.com/xianyi/OpenBLAS/pull/1695

:D

Comment 8 Fedora Update System 2018-07-23 13:57:14 UTC
openblas-0.3.1-2.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-8c55eef389

Comment 9 Fedora Update System 2018-07-23 13:57:27 UTC
openblas-0.3.1-2.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-af4c4ac774

Comment 10 Fedora Update System 2018-07-23 19:41:00 UTC
openblas-0.3.1-2.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-8c55eef389

Comment 11 Fedora Update System 2018-07-23 22:22:32 UTC
openblas-0.3.1-2.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-af4c4ac774

Comment 12 Fedora Update System 2018-07-29 03:20:40 UTC
openblas-0.3.1-2.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2018-08-04 11:16:05 UTC
openblas-0.3.2-1.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-d71ca19f18

Comment 14 Fedora Update System 2018-08-04 20:32:09 UTC
openblas-0.3.2-1.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-d71ca19f18

Comment 15 Fedora Update System 2018-08-09 16:52:13 UTC
openblas-0.3.2-1.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

