Description of problem:
SEGV in a Mesa application (FreeCAD) on an Intel 82Q35 Express graphics card.

Version-Release number of selected component (if applicable):
mesa-libGL-devel-9.0.3-1.fc18.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1. Install freecad (currently you need to enable testing because python-collada is still in QA):
   yum --enablerepo updates-testing install freecad
2. Run freecad and click on one of the example projects in the main window.
3. Observe the crash.

Actual results:
SEGV

Expected results:
No SEGV

Additional info:
Debugger session:

Program received signal SIGSEGV, Segmentation fault.
intel_miptree_unmap (intel=0x299bb10, mt=0x0, level=0, slice=0) at intel_mipmap_tree.c:1707
     intel_miptree_unmap(struct intel_context *intel,
1703                     struct intel_mipmap_tree *mt,
1704                     unsigned int level,
1705                     unsigned int slice)
1706 {
1707    if (mt->num_samples <= 1)
1708       intel_miptree_unmap_singlesample(intel, mt, level, slice);
1709    else
1710       intel_miptree_unmap_multisample(intel, mt, level, slice);
1711 }
(gdb) p mt
$5 = (struct intel_mipmap_tree *) 0x0
(gdb) up
#1  0x00007fffede3e8bf in unmap_attachment (ctx=ctx@entry=0x2fbccc0, fb=fb@entry=0x41125a0, buffer=buffer@entry=BUFFER_DEPTH) at ../../../src/mesa/swrast/s_renderbuffer.c:611
611       ctx->Driver.UnmapRenderbuffer(ctx, rb);
(gdb) p ctx->Driver.UnmapRenderbuffer
$8 = (void (*)(struct gl_context *, struct gl_renderbuffer *)) 0x7fffee26a950 <intel_unmap_renderbuffer>
This seems to be a problem with the Intel hardware driver specifically. As a sanity check, I just confirmed that the crash does not occur on my other hardware (ATI Radeon).
This is another GDB session, so the actual addresses might be different. There is a strange disconnect between the callback value in the second-level call site, which calls ctx->Driver.UnmapRenderbuffer:

(gdb) p ctx->Driver.UnmapRenderbuffer
$3 = (void (*)(struct gl_context *, struct gl_renderbuffer *)) 0x7fffe53c71d0 <intel_unmap_renderbuffer>

and the procedure actually called, as seen in the top stack frame:

(gdb) down
#0  intel_miptree_unmap (intel=0x91d9f0, mt=0x0, level=0, slice=0) at intel_mipmap_tree.c:1751
1751 {
(gdb) p intel_miptree_unmap
$4 = {void (struct intel_context *, struct intel_mipmap_tree *, unsigned int, unsigned int)} 0x7fffe5399410 <intel_miptree_unmap>

In other words, we call one address (intel_unmap_renderbuffer) and end up at another (intel_miptree_unmap). Stack corruption? Some macro magic?
OK, the previous comment brings up a red herring: it's just tail call optimization. intel_unmap_renderbuffer() is a short procedure in intel_fbo.c:114 that calls intel_miptree_unmap() just before returning, and that call overwrites its stack frame.

162 intel_unmap_renderbuffer(struct gl_context *ctx,
163                          struct gl_renderbuffer *rb)
164 {
165    struct intel_context *intel = intel_context(ctx);
166    struct swrast_renderbuffer *srb = (struct swrast_renderbuffer *)rb;
167    struct intel_renderbuffer *irb = intel_renderbuffer(rb);
168
169    DBG("%s: rb %d (%s)\n", __FUNCTION__,
170        rb->Name, _mesa_get_format_name(rb->Format));
171
172    if (srb->Buffer) {
173       /* this is a malloc'd renderbuffer (accum buffer) */
174       /* nothing to do */
175       return;
176    }
177
178    intel_miptree_unmap(intel, irb->mt, irb->mt_level, irb->mt_layer);

The problem is that the inline proc intel_renderbuffer(rb) (defined in mesa/drivers/dri/intel/intel_fbo.h) returns a partially filled irb structure, most of whose members, including irb->mt, are null.

static INLINE struct intel_renderbuffer *
intel_renderbuffer(struct gl_renderbuffer *rb)
{
   struct intel_renderbuffer *irb = (struct intel_renderbuffer *) rb;
   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS) {
      /*_mesa_warning(NULL, "Returning non-intel Rb\n");*/
      return irb;
   }
   else
      return NULL;
}

The generic gl_renderbuffer *rb does not have the mt* fields, so they end up as NULLs. I conclude that NULL is a valid value, and users of intel_renderbuffer have to check for NULLs. Therefore, I suggest that the fix is to call intel_miptree_unmap() conditionally in intel_fbo.c:178:

   if (irb->mt)
      intel_miptree_unmap(intel, irb->mt, irb->mt_level, irb->mt_layer);

but I don't really understand this code, so I can't say for sure whether this is just masking the real problem, which may be incorrect creation of the intel_renderbuffer structure.
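To make the suggested guard concrete, here is a minimal standalone sketch of the pattern. It is not Mesa code: the struct layouts, the names to_intel_rb(), miptree_unmap(), unmap_renderbuffer(), and the unmap_calls counter are all hypothetical stand-ins, and the extra check for a NULL irb is my own assumption rather than something the driver is known to need. It only illustrates that a renderbuffer can pass the class-ID downcast and still carry a NULL mt, so the unmap path has to tolerate that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Mesa types; the real structs are much larger. */
enum { INTEL_RB_CLASS = 0x12345678, OTHER_RB_CLASS = 0 };

struct gl_renderbuffer { unsigned ClassID; };
struct mipmap_tree { unsigned num_samples; };

struct intel_renderbuffer {
    struct gl_renderbuffer Base;  /* first member, so the downcast is valid */
    struct mipmap_tree *mt;       /* may be NULL even for a valid Intel rb */
};

int unmap_calls;                  /* how many times miptree_unmap() ran */

/* Dereferences mt, as intel_miptree_unmap() does at the crash site. */
void miptree_unmap(struct mipmap_tree *mt)
{
    assert(mt != NULL);           /* the unguarded driver code segfaults here */
    if (mt->num_samples >= 1)
        unmap_calls++;
}

/* Class-ID checked downcast, in the spirit of intel_renderbuffer(). */
struct intel_renderbuffer *to_intel_rb(struct gl_renderbuffer *rb)
{
    if (rb && rb->ClassID == INTEL_RB_CLASS)
        return (struct intel_renderbuffer *) rb;
    return NULL;
}

/* Guarded unmap: skip the miptree call when there is no miptree. */
void unmap_renderbuffer(struct gl_renderbuffer *rb)
{
    struct intel_renderbuffer *irb = to_intel_rb(rb);

    if (!irb || !irb->mt)         /* the !irb check is an added assumption */
        return;
    miptree_unmap(irb->mt);
}
```

With the guard in place, calling unmap_renderbuffer() on an Intel renderbuffer whose mt is NULL simply returns. Whether silently returning is the right behavior for the real driver is exactly the open question raised above.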
I also haven't checked other situations where the intel_renderbuffer structure is used; there may be more places where such a test is required to prevent NULL dereferences.
Every call to intel_unmap_renderbuffer follows a call to intel_map_renderbuffer. The two calls are always paired together. Interestingly, intel_map_renderbuffer *does* check for `irb->mt == NULL`. Since the two functions are a pair, intel_unmap_renderbuffer should check too.

Przemek, does adding `if (!irb->mt) return;` to intel_unmap_renderbuffer fix this bug? If so, then I think it's a good fix.

----

About your comment: "I also haven't checked other situations where the intel_renderbuffer structure is used---there may be more places where such test is required to prevent NULL dereferences."

The map/unmap functions are called during special circumstances. Just because we need to check for `irb->mt == NULL` in the map/unmap functions does not necessarily mean we need to do the check in all intel_renderbuffer functions. We may indeed need additional `irb->mt` checks, but this bug isn't a strong indicator of that.
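The pairing argument can be sketched as follows. This is a toy model, not the driver: struct miptree, struct renderbuffer, rb_map(), rb_unmap(), and the map_count field are hypothetical names invented for illustration. The point is only that when the map side treats a missing miptree as "nothing to map", the unmap side should treat it as "nothing to unmap".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the Mesa structures. */
struct miptree {
    unsigned char *pixels;
    int map_count;                /* outstanding maps on this miptree */
};

struct renderbuffer {
    struct miptree *mt;           /* NULL when no miptree is attached */
};

/* Map side: reports "nothing mapped" when there is no miptree. */
unsigned char *rb_map(struct renderbuffer *rb)
{
    if (!rb->mt)
        return NULL;
    rb->mt->map_count++;
    return rb->mt->pixels;
}

/* Unmap side: mirrors the same check, so a map/unmap pair is always safe. */
void rb_unmap(struct renderbuffer *rb)
{
    if (!rb->mt)
        return;
    assert(rb->mt->map_count > 0);  /* catches an unmap without a map */
    rb->mt->map_count--;
}
```

A caller can then always write rb_map() followed by rb_unmap() without knowing whether a miptree exists, as long as it also handles a NULL mapping.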
Unfortunately, just adding a NULL test does not fix things. There's no crash, but the 3D buffer simply retains the previous pixels and no new content ever appears.
Created attachment 735004: NULL test patch for when intel_unmap_renderbuffer is calling intel_miptree_unmap
Created attachment 735005: patch for mesa.spec adding the associated patch
Just wanted to mention that this bug has pretty wide consequences on Intel graphics hardware: it crashes most 3D applications (FreeCAD, meshlab, etc.) on startup.
Kevin Fenzi noticed that this seems to be related to:
https://bugzilla.redhat.com/show_bug.cgi?id=946960
https://retrace.fedoraproject.org/faf/problems/786379/
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.