Red Hat Bugzilla – Bug 991534
stack smash in gnuplot-wx -persist
Last modified: 2013-11-11 09:29:10 EST
Created attachment 782070 [details]
Don't use atexit() handler to run application event loop
Description of problem:
When an error occurs in the gnuplot execution engine during redraw (for whatever reason), gnuplot aborts with a stack smash error (detected by gcc's -fstack-protector).
Version-Release number of selected component (if applicable):
Spotted in gnuplot-4.6.1-4.fc18.x86_64, verified still present in upstream 4.6.3 (but not reported upstream).
Reliably for 4.6.1, Fedora and upstream.
Almost certainly for 4.6.3, assuming you can find some way of generating an error within the script engine.
Steps to Reproduce:
1. echo "plot x" | gnuplot-wx -persist
2. in the resulting window, zoom in by right-clicking, dragging out a rectangle, then clicking again.
If you strace -f the gnuplot process you can see that it prints out a syntax error immediately before crashing. The exact error depends on the nature of the plot command used: with the above it complains that substr() must take integer arguments, but others are possible. That is actually a different bug which I'll raise separately. This bug is about the crash.
Digging in, it appears that to implement -persist gnuplot behaves mostly exactly the same as if -persist were not specified: in particular the EOF at the end of input causes it to exit the command interpreter and fall of the end of main(), to exit the process. However the graphic "terminal" handlers (wxt/qt) register atexit() handlers to regain control after main(), and attempt to continue running their respective event loops then.
This in itself is horrible: other things might register atexit() cleanup handlers, and you can't know for sure that you are the top-most handler and therefore that no critical subsystem you rely on hasn't been shut down already. Trying to run your whole program in an atexit() handler is insane!
Here things are worse however: the gnuplot command processor has no inline error handling, but calls setjmp() at the outermost level (within main() itself), and on any parse or execution error in the script engine will longjmp() back there to continue reading input at the next line.
Due to the other bug, we get such an error on the zoom operation, and longjmp() back to main (which due to already having seen EOF immediately exits again), but do so from an atexit() handler after main has already returned once. This is an absolute no-no, longjmp() must never try to re-enter a stack frame that has already exited - that is undefined behaviour. Attempting to use longjmp() to leave an atexit() handler is also, and separately, undefined behaviour. Often you used to not notice this, but -fstack-protector in this case specifically notices that we are leaving main() twice.
It looks like this bug is the cause of several existing WONTFIX/INSUFFICIENT_DATA abrt reports.
gnuplot appears to have two types of atexit() handler. Calls of raw atexit() look to be normal cleanup handlers and so are probably safe. atexit() wrapped by the GP_ATEXIT() macro are used solely by gnuplot's terminal and history mechanisms. By simply redefining this macro to implement a custom handler-stack mechanism, and calling the stack from within the end of main(), we avoid this undefined behaviour and resulting crash. See attached patch.
(Separate syntax-error bug raised as bug 991536)
Thanks for your excellent report. Neither upstream nor me could reproduce the crash in Gnuplot 4.6.4, which will appear in Fedora 20. Closing as NEXTRELEASE. If the problem persists, please reopen the issue.