1090402 – [abrt] gnuplot: el_wgets(): gnuplot-wx killed by SIGSEGV

Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	gnuplot
Version:	20
Hardware:	x86_64
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Frantisek Kluknavsky
QA Contact:	Fedora Extras Quality Assurance
URL:	https://retrace.fedoraproject.org/faf...
Whiteboard:	abrt_hash:2221515e0e5776172d0aabc966d...

Reported:	2014-04-23 09:05 UTC by Paul DeStefano
Modified:	2015-06-30 01:00 UTC
CC List:	7 users
Last Closed:	2015-06-30 01:00:07 UTC

Attachments
File: backtrace (36.73 KB, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: cgroup (173 bytes, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: core_backtrace (7.63 KB, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: dso_list (8.99 KB, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: exploitable (82 bytes, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: limits (1.29 KB, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: maps (50.17 KB, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: open_fds (664 bytes, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
File: proc_pid_status (964 bytes, text/plain) 2014-04-23 09:05 UTC, Paul DeStefano
valgrind output (745.40 KB, text/plain) 2014-04-30 19:07 UTC, Paul DeStefano
valgrind results (704.05 KB, text/plain) 2014-04-30 23:38 UTC, Paul DeStefano

Description Paul DeStefano 2014-04-23 09:05:12 UTC

Description of problem:
I typed CTRL-D to end my session, and it crashed.  This is happening constantly, and ABRT does not identify all of the crashes as the same type.

Version-Release number of selected component:
gnuplot-4.6.3-6.fc20

Additional info:
reporter:       libreport-2.2.1
backtrace_rating: 4
cmdline:        gnuplot
crash_function: el_wgets
executable:     /usr/bin/gnuplot-wx
kernel:         3.13.10-200.fc20.x86_64
runlevel:       N 5
type:           CCpp
uid:            13013

Truncated backtrace:
Thread no. 1 (8 frames)
 #0 el_wgets at read.c:717
 #1 el_gets at eln.c:80
 #2 readline at readline.c:420
 #3 readline_ipc at ../../src/readline.c:104
 #4 rlgets at ../../src/command.c:2669
 #5 gp_get_string at ../../src/command.c:2865
 #6 read_line at ../../src/command.c:2897
 #7 com_line at ../../src/command.c:314

Comment 1 Paul DeStefano 2014-04-23 09:05:16 UTC

Created attachment 888812 [details]
File: backtrace

Comment 2 Paul DeStefano 2014-04-23 09:05:18 UTC

Created attachment 888813 [details]
File: cgroup

Comment 3 Paul DeStefano 2014-04-23 09:05:22 UTC

Created attachment 888814 [details]
File: core_backtrace

Comment 4 Paul DeStefano 2014-04-23 09:05:24 UTC

Created attachment 888815 [details]
File: dso_list

Comment 5 Paul DeStefano 2014-04-23 09:05:26 UTC

Created attachment 888816 [details]
File: exploitable

Comment 6 Paul DeStefano 2014-04-23 09:05:28 UTC

Created attachment 888817 [details]
File: limits

Comment 7 Paul DeStefano 2014-04-23 09:05:30 UTC

Created attachment 888818 [details]
File: maps

Comment 8 Paul DeStefano 2014-04-23 09:05:31 UTC

Created attachment 888819 [details]
File: open_fds

Comment 9 Paul DeStefano 2014-04-23 09:05:33 UTC

Created attachment 888820 [details]
File: proc_pid_status

Comment 10 Paul DeStefano 2014-04-23 09:08:27 UTC

See bug 1081764.  Both of these are occurring very often and I cannot figure out why.  It all started with F20.

Comment 11 Orion Poplawski 2014-04-23 17:11:43 UTC

Could you give this build a try?

http://koji.fedoraproject.org/koji/taskinfo?taskID=6770153

No particular expectation of a fix, but worth a shot.

Comment 12 Paul DeStefano 2014-04-28 03:01:36 UTC

Sure.  I tried the packages you built and the symptoms are unchanged.

This has cost me another weekend of system rebuilds.  I thought it might be memory corruption because the backtraces are not always the same, even though I'm able to reproduce the crash using the same "technique" each time.  But now I have been able to reproduce the crash on a completely different hardware platform (my netbook), so there is a software bug.  I just can't tell you where it is.  I would love to get this fixed as it's costing me a lot of time.  Please let me know how I can help you find the problem.

Comment 13 Paul DeStefano 2014-04-28 03:40:29 UTC

FWIW, I am not able to reproduce the crash on XUbuntu.

Comment 14 Orion Poplawski 2014-04-29 18:15:56 UTC

Can you run it under valgrind and attach the output from a session that crashed?  I'm not seeing anything obvious in the backtrace.  Might be a libedit issue as well.

Comment 15 Paul DeStefano 2014-04-30 19:07:11 UTC

Created attachment 891277 [details]
valgrind output

Full valgrind output from crashed gnuplot session.  I'm very confused by the results, though; I don't know how to interpret it.

Comment 16 Orion Poplawski 2014-04-30 19:33:11 UTC

Sorry, that isn't going to be much use without the needed debuginfo packages installed:

debuginfo-install gnuplot

Comment 17 Paul DeStefano 2014-04-30 23:38:56 UTC

Created attachment 891341 [details]
valgrind results

Full valgrind output for session that crashed.

To cause this crash, all I did was:

> plot sin(x)

then I hit CTRL-C three or four times very quickly.

Comment 18 Orion Poplawski 2014-05-01 02:06:24 UTC

(In reply to Paul DeStefano from comment #17)
> Created attachment 891341 [details]
> valgrind results
> 
> Full valgrind output for session that crashed.

Hmm, not very instructive I'm afraid.

> then I hit CTRL-C three or four times very quickly.

I think reports from crashes that happen from normal behavior might be more interesting.

Comment 19 Paul DeStefano 2014-05-01 19:25:00 UTC

You're joking, right?  After I did what you asked, you are going to dismiss this reproducible crash?

I use CTRL-C a lot to cancel command edits and start over or go to a different command in my history.  This is perfectly reasonable user behavior.  I may be accelerating the process a bit to save time for this report, but that doesn't change the fact that gnuplot crashes constantly.  Besides, I've already explained that it happens with a variety of use cases.  This is just the most recent way it crashed.

gnuplot crashes so often that I've had to stop using it.  Can you please help?

Comment 20 Orion Poplawski 2014-05-01 19:44:56 UTC

First off - we are all volunteers here and I *am* trying to help, so let's keep it civil, okay.

I'm sorry, but the reports you've sent (through no fault of your own) are not useful for debugging the crashes.  The ctrl-c behavior seems strange and may be why this last report is not useful.  I'm hopeful that a report from a session that crashed in a more natural way will be more helpful.  But maybe it won't, and maybe this will take longer.

A gdb backtrace would be helpful too.  Run gnuplot under gdb.  After crash enter "thread apply all bt" at the (gdb) prompt.

You might get better help here: https://sourceforge.net/p/gnuplot/bugs/ since they are more familiar with the code.

Comment 21 Paul DeStefano 2014-05-02 09:37:54 UTC

(In reply to Orion Poplawski from comment #20)
> First off - we are all volunteers here and I *am* trying to help, so let's
> keep it civil, okay.

Okay, good.  I honestly thought you were blowing off this report.  If not, then I'll just say I'm a volunteer, too, and I'm sorry if you felt I was uncivil.

> I'm sorry, but the reports you've sent (through no fault of your own) are not
> useful for debugging the crashes.  The ctrl-c behavior seems strange and may
> be why this last report is not useful.  I'm hopeful that a report from a
> session that crashed in a more natural way will be more helpful.  But maybe
> it won't, and maybe this will take longer.

Are you saying that the CTRL-C report might not be useful because gnuplot was running under valgrind at the time?  Sort of like CTRL-C might have gone to valgrind, and not gnuplot?  Because I can understand that; I don't know if valgrind gets CTRL-C and, if it does, whether it passes it on or what.

But I don't know why you think this is such an unusual case.  If you are editing a command in your history and you decide not to run that command, how do you cancel the edit and start over?  This happens to me constantly.  And I never noticed a correlation with crashing before F20.

> A gdb backtrace would be helpful too.  Run gnuplot under gdb.  After crash
> enter "thread apply all bt" at the (gdb) prompt.

Okay!  Yes, I can do this.

> You might get better help here: https://sourceforge.net/p/gnuplot/bugs/
> since they are more familiar with the code.

Maybe, but this all started with F20, and, moreover, it doesn't happen on Ubuntu and that's the same version of gnuplot.

Comment 22 Orion Poplawski 2014-05-02 14:51:46 UTC

What version of libedit is on your Ubuntu install?

Comment 23 Paul DeStefano 2014-05-03 23:20:46 UTC

The libedit2 pkg says version 3.1-2013712-1

I just got the notice of a new major version, so I'm upgrading my XUbuntu system, now.

Comment 24 Orion Poplawski 2014-05-09 16:34:39 UTC

Can you try installing an updated libedit from here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=6832084

and see if that helps?

Comment 25 Paul DeStefano 2014-06-04 09:21:04 UTC

Sorry, I didn't forget; it just took me a while to get back to this.

I'm using a VM to do some F20 testing since I've had a couple of very sudden and strange problems.  I installed F20 fresh and then installed the koji build libedit-3.1-5.20140213cvs.fc21.x86_64.  The problem was very easily reproducible using the "plot sin(x) then fast CTRL-C" method I've been using.

Also got you a backtrace with all debuginfos:

Program received signal SIGSEGV, Segmentation fault.
0x00007fada440c0a2 in el_wgets (el=el@entry=0x11f9a60, nread=0x0, 
    nread@entry=0x7fffb47f17f4) at read.c:717
717			*nread = num != -1 ? num : 0;
(gdb) bt
#0  0x00007fada440c0a2 in el_wgets (el=el@entry=0x11f9a60, nread=0x0, 
    nread@entry=0x7fffb47f17f4) at read.c:717
#1  0x00007fada441c10d in el_gets (el=0x11f9a60, 
    nread=nread@entry=0x7fffb47f17f4) at eln.c:80
#2  0x00007fada4417470 in readline (p=0x50e30e "gnuplot> ") at readline.c:427
#3  0x000000000047e875 in readline_ipc (prompt=<optimized out>)
    at ../../src/readline.c:104
#4  0x000000000041eaa7 in rlgets (prompt=0x50e30e "gnuplot> ", n=1024, 
    s=0x11f3100 "") at ../../src/command.c:2669
#5  gp_get_string (prompt=0x50e30e "gnuplot> ", len=1024, 
    buffer=0x11f3100 "") at ../../src/command.c:2865
#6  read_line (prompt=prompt@entry=0x50e30e "gnuplot> ", 
    start=<optimized out>, start@entry=0) at ../../src/command.c:2897
#7  0x0000000000421e7c in com_line () at ../../src/command.c:314
#8  0x0000000000415a76 in main (argc=0, argv=0x7fffb47f1b78)
    at ../../src/plot.c:684

second try:
Program received signal SIGSEGV, Segmentation fault.
0x00007fe9f98330a2 in el_wgets (el=el@entry=0xeb6a60, nread=0x0, 
    nread@entry=0x7fffdf23afd4) at read.c:717
717			*nread = num != -1 ? num : 0;
(gdb) bt
#0  0x00007fe9f98330a2 in el_wgets (el=el@entry=0xeb6a60, nread=0x0, 
    nread@entry=0x7fffdf23afd4) at read.c:717
#1  0x00007fe9f984310d in el_gets (el=0xeb6a60, 
    nread=nread@entry=0x7fffdf23afd4) at eln.c:80
#2  0x00007fe9f983e470 in readline (p=0x50e30e "gnuplot> ") at readline.c:427
#3  0x000000000047e875 in readline_ipc (prompt=<optimized out>)
    at ../../src/readline.c:104
#4  0x000000000041eaa7 in rlgets (prompt=0x50e30e "gnuplot> ", n=1024, 
    s=0xeb0100 "") at ../../src/command.c:2669
#5  gp_get_string (prompt=0x50e30e "gnuplot> ", len=1024, buffer=0xeb0100 "")
    at ../../src/command.c:2865
#6  read_line (prompt=prompt@entry=0x50e30e "gnuplot> ", 
    start=<optimized out>, start@entry=0) at ../../src/command.c:2897
#7  0x0000000000421e7c in com_line () at ../../src/command.c:314
#8  0x0000000000415a76 in main (argc=0, argv=0x7fffdf23b358)
    at ../../src/plot.c:684

Does this help?

Comment 26 Orion Poplawski 2014-06-04 14:46:55 UTC

Re-assigning to libedit to get some more eyes on this, but I don't understand it.  First thought is "oh, nread is set to null", but read.c has this at the top:

        if (nread == NULL)
                nread = &nrb;
        *nread = 0;

So it should already handle this case.  So it still makes no sense to me.
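
To make the reasoning concrete, here is a minimal sketch of the guard pattern quoted above. The function get_line() is a hypothetical stand-in for el_wgets(), not libedit code; only the nread/nrb names are taken from the excerpt. Once the guard has run, the pointer can only be NULL again if its storage is overwritten from elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for el_wgets(); this is NOT libedit code. */
const char *get_line(const char *src, int *nread)
{
    int nrb;                  /* local fallback, like nrb in read.c */

    if (nread == NULL)        /* caller did not supply an out-parameter */
        nread = &nrb;
    *nread = 0;

    if (src == NULL)          /* nothing to read */
        return NULL;

    /* After the guard, nread can only be NULL if its storage was overwritten. */
    assert(nread != NULL);
    *nread = (int)strlen(src);   /* counterpart of the write at read.c:717 */
    return src;
}
```

A caller may pass either a real counter or NULL; both paths end with a valid pointer, which is why a NULL nread at line 717 points to memory corruption rather than a missing check.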

Next time in gdb, do a "print nread".

Comment 27 Jerry James 2014-06-04 15:24:28 UTC

Thread 1 is the crasher:
Thread 1 (Thread 0x7f68f2773a40 (LWP 20072)):
#0  0x0000003c27213f92 in el_wgets (el=el@entry=0x117df80, nread=0x0, nread@entry=0x7fff5bc158b4) at read.c:717
        retval = <optimized out>
        cmdnum = <optimized out>
        num = 185
        ch = 10 L'\n'
        cp = <optimized out>
        crlf = 0
        nrb = 32616

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000003c27213f92 in el_wgets (el=el@entry=0x117df80, nread=0x0, nread@entry=0x7fff5bc158b4) at read.c:717
717			*nread = num != -1 ? num : 0;

And valgrind reports:

==27740== Invalid write of size 8
==27740==    at 0x3C27E0EC18: recv (recv.c:35)
==27740==    by 0x4D6C067: ???
==27740==    by 0xFFEFFF57F: ???
==27740==    by 0xFFEFFF58F: ???
==27740==  Address 0xffefff448 is on thread 1's stack

which must be how nread is being set to NULL, because it isn't possible for it
to be NULL under normal circumstances.  I have tried for about 10 minutes to reproduce this, and have not been able to get it to crash, so clearly I'm not doing whatever causes that recv() invocation.

Can you reproduce under valgrind again but with debuginfo packages installed, so we can see where that recv() call is made?  I don't see any recv() calls in either libedit or gnuplot.

Comment 28 Paul DeStefano 2014-06-04 21:43:41 UTC

(In reply to Orion Poplawski from comment #26)
> Next time in gdb, do a "print nread".
Okay, I can do that.

(In reply to Jerry James from comment #27)
> Can you reproduce under valgrind again but with debuginfo packages
> installed, so we can see where that recv() call is made?  I don't see any
> recv() calls in either libedit or gnuplot.

Yes, I can do that.  But I installed a ton of debuginfos already.  I can't tell what infos I'm still missing.  I actually went through three stages of gdb telling me "use debuginfo-install ...".  It stopped saying that, so I figured I had finally got them all, but I guess not.  I thought debuginfo-install was supposed to get everything.  Is there a better way to install debuginfos for a particular application?

Comment 29 Jerry James 2014-06-04 21:59:57 UTC

Oh, I'm sorry.  My mistake.  I was looking at the first valgrind run you did, before you installed debuginfo packages.

The valgrind run from after installing debuginfo packages isn't much help, either.  It shows the probable cause of the stack corruption that leads to the crash:

==23580== Invalid write of size 8
==23580==    at 0x3C276EA9E4: ??? (syscall-template.S:81)
==23580==    by 0xFFEFFF79F: ???
==23580==  Address 0xffefff648 is on thread 1's stack

but without any useful information to show where that call came from.  Hmmm.  Is something using a separate signal stack, perhaps?

Comment 30 Jerry James 2014-06-04 22:20:24 UTC

I just took a quick look at bug 1081764.  It appears similar: a variable that has already been dereferenced is suddenly NULL, leading to the crash.  Something seems to be writing zeroes over memory it shouldn't be touching.
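
As an illustration of that failure mode only, the following sketch simulates a stray write that zeroes a pointer which was valid, and already dereferenced, a moment earlier. The struct layout and the names frame, buggy_clear and pointer_lost_after_stray_write are hypothetical; the real cause in gnuplot is not identified at this point in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a buffer followed by a pointer, as locals might sit. */
struct frame {
    char buf[16];
    int *nread;
};

/* Buggy clear: zeroes the whole frame instead of only buf (a stray write). */
static void buggy_clear(struct frame *f)
{
    memset(f, 0, sizeof *f);          /* should have been sizeof f->buf */
}

/* Returns 1 if a pointer that was valid before the stray write is NULL after. */
int pointer_lost_after_stray_write(void)
{
    int count = 0;
    struct frame f;

    memset(f.buf, 'x', sizeof f.buf);
    f.nread = &count;
    *f.nread = 1;                     /* first dereference works */
    assert(f.nread == &count);

    buggy_clear(&f);                  /* zeroes overwrite the pointer too */

    /* All-bits-zero reads back as NULL on mainstream platforms. */
    return f.nread == NULL;
}
```

A second dereference of f.nread after buggy_clear() would fault in the same way as the write at read.c:717 in the backtraces.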

I still cannot reproduce the crash, by the way.  I've tried quite a few times now, and I never get the crash you are seeing.  I'm doing "plot sin(x)" and hitting Ctrl-C like mad.  That's your recipe, right?

If you feel adventuresome, you might try passing the --vgdb and --vgdb-error options to valgrind to see if you can catch that invalid write in action and try to figure out what is causing it.  It was the 3rd error in the first valgrind output you attached, and the 4th error in the second.

Comment 31 Paul DeStefano 2014-06-04 22:30:53 UTC

(In reply to Jerry James from comment #29)
> Oh, I'm sorry.  My mistake.  I was looking at the first valgrind run you
> did, before you installed debuginfo packages.

Actually I think the mistake was mine: you said run *valgrind* again.  You're right, I didn't have many debuginfos installed for my valgrind run.  I was thinking of running gdb again.  So, no problem!  I'm running gdb just to get it to complain about missing debuginfo packages.  When it stops, I'll run valgrind and hopefully that will give us what we need.

(I don't understand why 'debuginfo-install gnuplot' says everything is installed and even gdb doesn't complain on start up.  But after the crash, gdb says it needs more infos.)

(In reply to Jerry James from comment #30)
> I still cannot reproduce the crash, by the way.  I've tried quite a few
> times now, and I never get the crash you are seeing.  I'm doing "plot
> sin(x)" and hitting Ctrl-C like mad.  That's your recipe, right?

Yes, that's about it.  It typically happens after a handful of CTRL-C's, but if not, I just keep it depressed and let the keyboard repeat rate kick in, and that does it.

> If you feel adventuresome, you might try passing the --vgdb and --vgdb-error
> options to valgrind to see if you can catch that invalid write in action and
> try to figure out what is causing it.  It was the 3rd error in the first
> valgrind output you attached, and the 4th error in the second.

Sure, no problem.

Comment 32 Paul DeStefano 2014-06-05 00:26:09 UTC

Well, I take it back.  I'm having a hard time putting all these pieces together.  When I run gnuplot under valgrind now, it just dies when I hit CTRL-C; it doesn't segfault.  It looks like this:

gnuplot> plot sin(x)
gnuplot> Killed
$

Under gdb, I can get you a new bt with more debuginfos:

gnuplot> 
Program received signal SIGSEGV, Segmentation fault.
0x00000031d56ec703 in select () at ../sysdeps/unix/syscall-template.S:81
81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) ^CQuit
(gdb) ^CQuit
(gdb) ^CQuit
(gdb) bt
#0  0x00000031d56ec703 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000000503bd4 in wxt_waitforinput ()
    at ../../src/wxterminal/wxt_gui.cpp:3458
#2  0x000000000047e85a in getc_wrapper (fp=0x0) at ../../src/readline.c:82
#3  0x00000031e722029f in _getc_function (el=<optimized out>, 
    c=0x7fffffffdcc8 "\266\200E") at readline.c:221
#4  0x00000031e721392e in el_wgetc (el=el@entry=0xa0aa60, 
    cp=cp@entry=0x7fffffffdcc8 L"\x4580b6") at read.c:439
#5  0x00000031e7213bcf in read_getcmd (ch=0x7fffffffdcc8 L"\x4580b6", 
    cmdnum=<synthetic pointer>, el=0xa0aa60) at read.c:247
#6  el_wgets (el=el@entry=0xa0aa60, nread=nread@entry=0x7fffffffdd34)
    at read.c:586
#7  0x00000031e722407d in el_gets (el=0xa0aa60, 
    nread=nread@entry=0x7fffffffdd34) at eln.c:80
#8  0x00000031e7220b20 in readline (p=0x50e30e "gnuplot> ") at readline.c:420
#9  0x000000000041eaa7 in read_line (prompt=<optimized out>, 
    start=<optimized out>) at ../../src/command.c:2669
#10 0x0000000000421e7c in com_line () at ../../src/command.c:314
#11 0x0000000000415a76 in main (argc=0, argv=0x7fffffffe0b8)
    at ../../src/plot.c:684


I was able to get valgrind w/ gdbserver running and talking to gdb.  But, I have to do a number of machinations to get gdb NOT to convert interrupts into stop events.  And, even then, I see a lot of SIGTRAPs before it finally dies and it's not the same as under normal conditions.

-- Here is the GDB end of it: --

(gdb) 
(gdb) c
Continuing.

Program received signal SIGINT, Interrupt.
[New Thread 1932]

Program received signal SIGINT, Interrupt.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000031d5e0ec18 in __libc_recv (fd=0, buf=0x4d10a04, n=4096, flags=-706679797) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:35
35	  LIBC_CANCEL_RESET (oldtype);
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
__pthread_disable_asynccancel () at ../nptl/sysdeps/unix/sysv/linux/x86_64/cancellation.S:104
104	1:	ret
(gdb) 
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
__libc_recv (fd=0, buf=0x4d10a04, n=4096, flags=-706679797) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:38
38	}
(gdb) 
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000031d5e0ec25 in __libc_recv (fd=0, buf=0x4d10a04, n=4096, flags=-706679797) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:38
38	}
(gdb) 
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000031d5e0ec26 in __libc_recv (fd=0, buf=0x4d10a04, n=4096, flags=-706679797) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:38
38	}
(gdb) 
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000ffefff988 in ?? ()
(gdb) 
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000ffefff988 in ?? ()
(gdb) bt
#0  0x0000000ffefff988 in ?? ()
#1  0x0000000ffefff8b0 in ?? ()
#2  0x0000000ffefff8a0 in ?? ()
#3  0x0000000000000000 in ?? ()
(gdb) c
Continuing.

Program terminated with signal SIGKILL, Killed.
The program no longer exists.
(gdb) 

-- Here is part of the valgrind file: --

==1931== Memcheck, a memory error detector
==1931== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1931== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==1931== Command: gnuplot
==1931== Parent PID: 1461
==1931== 
==1931== 
==1931== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==1931==   /path/to/gdb gnuplot
==1931== and then give GDB the following command
==1931==   target remote | /usr/lib64/valgrind/../../bin/vgdb --pid=1931
==1931== --pid is optional if only one valgrind process is running
==1931== 
==1931== 
==1931== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==1931==   /path/to/gdb gnuplot
==1931== and then give GDB the following command
==1931==   target remote | /usr/lib64/valgrind/../../bin/vgdb --pid=1931
==1931== --pid is optional if only one valgrind process is running
==1931== 
==1931== Invalid write of size 8
==1931==    at 0x31D5E0EC18: recv (recv.c:35)
==1931==    by 0x4CBC9C7: ???
==1931==    by 0xFFEFFF97F: ???
==1931==    by 0xFFEFFF98F: ???
==1931==  Address 0xffefff848 is on thread 1's stack
==1931== 
==1931== (action on error) vgdb me ... 
==1931== Continuing ...
==1931== Invalid read of size 8
==1931==    at 0x31D5E0E51F: __pthread_disable_asynccancel (cancellation.S:104)
==1931==    by 0x31D5E0EC1C: recv (recv.c:35)
==1931==    by 0xFFEFFF987: ???
==1931==    by 0xFFEFFF8AF: ???
==1931==    by 0xFFEFFF89F: ???
==1931==  Address 0xffefff848 is on thread 1's stack
... repeats many times ...
... heap summary ...
==1931== LEAK SUMMARY:
==1931==    definitely lost: 4,048 bytes in 9 blocks
==1931==    indirectly lost: 21,728 bytes in 890 blocks
==1931==      possibly lost: 69,992 bytes in 795 blocks
==1931==    still reachable: 2,819,834 bytes in 17,474 blocks
==1931==         suppressed: 0 bytes in 0 blocks
==1931== Reachable blocks (those to which a pointer was found) are not shown.
==1931== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1931== 
==1931== For counts of detected and suppressed errors, rerun with: -v
==1931== ERROR SUMMARY: 760 errors from 760 contexts (suppressed: 2 from 2)

See the test8 attachment for full valgrind output. 

I see that there are still debuginfos missing, but I don't know who recv.c belongs to.  Is this recv(2) as in sockets?

Comment 33 Boris Ranto 2014-11-20 15:08:53 UTC

OK, so me and a colleague of mine took a look at this and here is what we could come up with:

1.) The issue was introduced with the following patch:

http://pkgs.fedoraproject.org/cgit/gnuplot.git/tree/gnuplot-4.6.4-singlethread.patch?h=f20

If you compile gnuplot without it, the problem goes away.


2.) It does not seem to be libedit related at all, as I got a few call traces where it crashed without any libedit involvement whatsoever.



Afterwards, I was able to find the problem that the patch was supposed to fix:

http://gnuplot.10905.n7.nabble.com/wxGtk-crash-on-haswell-td17944.html

Considering the fact that the issue was haswell-specific and the report specifically mentions the problematic xbegin/xend instructions I'm inclined to believe that it was related to the following bug (although the bz is filed against f21, maybe the reporter updated the microcode manually?):

https://bugzilla.redhat.com/show_bug.cgi?id=1146967

I could not hit the issue specified in the nabble.com link when I compiled gnuplot without the patch (e.g. plot sin(x) works perfectly fine on my haswell machine without the patch) so hopefully, it is safe to revert the patch, now.

-> reassigning back to gnuplot.

Comment 34 Fedora End Of Life 2015-05-29 11:39:09 UTC

This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 20 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 35 Fedora End Of Life 2015-06-30 01:00:07 UTC

Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
