Bug 1999819
| Summary: | segfault due to not properly handling realloc pointer return change [rhel-7.9.z] | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Paulo Andrade <pandrade> |
| Component: | fltk | Assignee: | Jan Grulich <jgrulich> |
| Status: | CLOSED ERRATA | QA Contact: | Tomas Pelka <tpelka> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.9 | CC: | kpfleming, sbarcomb, tpelka, tpopela |
| Target Milestone: | rc | Keywords: | Reopened, Triaged, ZStream |
| Target Release: | --- | Flags: | pm-rhel:
mirror+
|
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | fltk-1.3.4-3.el7_9 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-11-02 16:38:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Hi, I've been trying to reproduce this issue on RHEL 7, RHEL 8 or Fedora 35, but I'm unable to. Can you tell me how to reproduce this issue step by step? It is not trivial to reproduce. Realloc might return the same pointer, and if that happens, the problem does not happen. It might also corrupt memory that does not trigger a segfault. The problem happens when realloc cannot grow (or prefer when shrinking) the memory, and needs to relocate it. Needs to keep selecting very large amounts of text, and somehow cause some memory fragmentation. Should be easier to reproduce if incrementally making larger selections. Could also instrument with gdb, to do something like: (gdb) call malloc(65536) (gdb) call free($) on separate breakpoints around the realloc calls. Likely mixing and not using only 65536, or it might just keep using the same pointer. Based on coredump data, of the two cases, the user frequently selects very large amounts of text; likely to cut&paste from different editors or different windows, possibly on shells on different servers. I spent again some time trying to reproduce this, but I don't even get to the point where the faulty code is called. Were you able to reproduce it? Can you help me with a reproducer? I don't get the "SelectionNotify" event and not sure under what circumstances it is supposed to be send. Development Management has reviewed and declined this request. You may appeal this decision by reopening this request. Sorry, I didn't intend to close it yet. (In reply to Jan Grulich from comment #5) > I spent again some time trying to reproduce this, but I don't even get to > the point where the faulty code is called. > > Were you able to reproduce it? Can you help me with a reproducer? I don't > get the "SelectionNotify" event and not sure under what circumstances it is > supposed to be send. You should get SelectionNotify in a procedure like: 1. Use some text editor, select text. 2. Use another text editor, select new text. Likely easier to reproduce with Xt or Motif based applications. The problem happens when there is a small selection buffer, then, it is changed to a very large one, of several megabytes. The crash happens because the code does not properly handle memory relocation when realloc changes the base pointer. If realloc uses the same base pointer, no problem will happen. If you running under gdb, you can insert a few calls like: (gdb) call malloc(4096) (gdb) call malloc(65536) and other values. to force a leak and memory fragmentation, so that realloc changes the base pointer. Basically, the patch would be to change from: if (actual == fl_INCR) { bytesread = getIncrData(sn_buffer, xevent.xselection, *(long*)portion); XFree(portion); break; } ... static long getIncrData(uchar* &data, const XSelectionEvent& selevent, long lower_bound) ... if (total + num_bytes > (size_t)lower_bound) data = (uchar*)realloc(data, total + num_bytes); to: if (actual == fl_INCR) { bytesread = getIncrData(&sn_buffer, xevent.xselection, *(long*)portion); XFree(portion); break; } ... static long getIncrData(uchar **data, const XSelectionEvent& selevent, long lower_bound) ... if (total + num_bytes > (size_t)lower_bound) *data = (uchar*)realloc(*data, total + num_bytes); I'm still not able to get the code to call "getIncrData()", no idea why. Without being able to reproduce I'm not likely to move forward. What is your way to reproduce it? How do you start Tigervnc and what desktop and apps you run on the server to test this? I am not able to reproduce the issue. I saw a few reports about it. Command line appears to be somewhat like: .../usr/bin/Xvnc :1 -auth .../.Xauthority -depth 24 -desktop Xvnc -fp catalogue:/etc/X11/fontpath.d -geometry 1920x1200 -pn -rfbauth .../.vnc/passwd -rfbport 5901 -rfbwait 30000 -desktop user@hist::1 -MaxCutText 100000000 -AlwaysShared -AcceptSetDesktopSize=0 -UseIPv6=0 -dpi 100 -cc 4 Looking a bit more around, I found related link: https://github.com/TigerVNC/tigervnc/issues/961 [Support for INCR in Xvnc (large clipboard)] Maybe the above command line shows an approach for very large copy&paste. Checking coredumps, it appears user is doing copy/cut & paste of very large text buffers in the desktop; might be faster or easier to user than using scp or a network driver... I see the bug is still present in upstream fltk. I will open an upstream issue as well. Looks upstream pushed some fixes to this issue. We can backport them to RHEL 7. We would like to backport following changes: 1) https://github.com/Albrecht-S/fltk/commit/5e46e3380887dcd30dc7ea582496bbd9a6a4d70c 2) https://github.com/Albrecht-S/fltk/commit/7ed06a7b65c6f91da50b6664c742208ce520a983 (In reply to Jan Grulich from comment #12) > Looks upstream pushed some fixes to this issue. We can backport them to RHEL > 7. Jan: Before I will approve it for z-stream - would it be possible to create a scratch build with these changes included? Paulo: Could the customer test such a scratch build? Scratch build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=46481435 And because the scratch build will be gone soon I downloaded x86_64 packages and uploaded them here: https://jgrulich.fedorapeople.org/fltk/ The customer reported that the test package set worked without issues. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (fltk bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:7345 |
This is a segfault in vncviewer: (gdb) bt #0 0x00007ffff4d48387 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55 #1 0x00007ffff4d49a78 in __GI_abort () at abort.c:90 #2 0x00007ffff4d8aed7 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff4e9d350 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196 #3 0x00007ffff4d93299 in malloc_printerr (ar_ptr=0x7ffff50d9760 <main_arena>, ptr=<optimized out>, str=0x7ffff4e9d478 "double free or corruption (!prev)", action=3) at malloc.c:4967 #4 _int_free (av=0x7ffff50d9760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3843 #5 0x00007ffff7192679 in XFree (data=<optimized out>) at XlibInt.c:1590 #6 0x00007ffff7950701 in fl_handle (thisevent=...) at Fl_x.cxx:1379 #7 0x00007ffff7952532 in do_queued_events () at Fl_x.cxx:210 #8 0x00007ffff79529f3 in fl_wait (time_to_wait=<optimized out>) at Fl_x.cxx:276 #9 0x00007ffff78f386f in Fl::wait (time_to_wait=<optimized out>) at Fl.cxx:607 #10 0x000000000042bd26 in run_mainloop () at /usr/src/debug/tigervnc-1.8.0/vncviewer/vncviewer.cxx:139 #11 0x000000000041d445 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/tigervnc-1.8.0/vncviewer/vncviewer.cxx:613 (gdb) f 6 #6 0x00007ffff7950701 in fl_handle (thisevent=...) at Fl_x.cxx:1379 1379 if (sn_buffer) {XFree(sn_buffer); sn_buffer = 0;} checking the contents of sn_buffer, it is an almost 300k long selection. The problem is in the call: if (actual == fl_INCR) { bytesread = getIncrData(sn_buffer, xevent.xselection, *(long*)portion); XFree(portion); break; } because getIncrData has prototype: static long getIncrData(uchar* &data, const XSelectionEvent& selevent, long lower_bound) and the argument might change in the line: if (total + num_bytes > (size_t)lower_bound) data = (uchar*)realloc(data, total + num_bytes); so, there are 2 issues if realloc changes the pointer. First it causes a leak. Second, it causes memory corruption, as the calling code keeps using the base pointer.