922224 – NULL dereference SEGV due to unchecked NULL entries in struct intel_renderbuf in Intel Mesa drivers

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	mesa
Sub Component:
Version:	18
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Adam Jackson
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-03-15 19:15 UTC by Przemek Klosowski
Modified:	2014-02-05 20:00 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-02-05 20:00:03 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
NULL test patch when intel_unmap_renderbuffer is calling intel_miptree_unmap (507 bytes, patch) 2013-04-12 21:31 UTC, Przemek Klosowski	no flags
patch for mesa.spec adding the associated patch (921 bytes, patch) 2013-04-12 21:34 UTC, Przemek Klosowski	no flags

Description Przemek Klosowski 2013-03-15 19:15:59 UTC

Description of problem: 
SEGV in FreeCAD, a Mesa application, on an Intel 82Q35 Express graphics card

Version-Release number of selected component (if applicable):
mesa-libGL-devel-9.0.3-1.fc18.x86_64


How reproducible: every time


Steps to Reproduce:
1. install freecad (currently you need to enable updates-testing because python-collada is still in QA):
 yum --enablerepo updates-testing install freecad
2. run freecad, click on one of the example projects in the main window
3. observe crash
  
Actual results: SEGV


Expected results: no SEGV


Additional info:

Comment 1 Przemek Klosowski 2013-03-15 19:20:28 UTC

Debugger session:

Program received signal SIGSEGV, Segmentation fault.
intel_miptree_unmap (intel=0x299bb10, mt=0x0, level=0, slice=0) at intel_mipmap_tree.c:1707

	intel_miptree_unmap(struct intel_context *intel,
1703			    struct intel_mipmap_tree *mt,
1704			    unsigned int level,
1705			    unsigned int slice)
1706	{
1707	   if (mt->num_samples <= 1)
1708	      intel_miptree_unmap_singlesample(intel, mt, level, slice);
1709	   else
1710	      intel_miptree_unmap_multisample(intel, mt, level, slice);
1711	}

(gdb) p mt
$5 = (struct intel_mipmap_tree *) 0x0


(gdb) up
#1  0x00007fffede3e8bf in unmap_attachment (ctx=ctx@entry=0x2fbccc0, fb=fb@entry=0x41125a0, buffer=buffer@entry=BUFFER_DEPTH)
    at ../../../src/mesa/swrast/s_renderbuffer.c:611
611	      ctx->Driver.UnmapRenderbuffer(ctx, rb);

(gdb) p ctx->Driver.UnmapRenderbuffer
$8 = (void (*)(struct gl_context *, struct gl_renderbuffer *)) 0x7fffee26a950 <intel_unmap_renderbuffer>

Comment 2 Przemek Klosowski 2013-03-18 13:17:16 UTC

This seems to be a problem specific to the Intel hardware driver. As a sanity check, I confirmed that it does not occur on my other hardware (ATI Radeon).

Comment 3 Przemek Klosowski 2013-04-02 15:53:59 UTC

This is another GDB session, so the actual addresses may differ from those above. There is a strange disconnect between the callback value at the second-level call site, which calls ctx->Driver.UnmapRenderbuffer:

p ctx->Driver.UnmapRenderbuffer 
$3 = (void (*)(struct gl_context *, struct gl_renderbuffer *)) 0x7fffe53c71d0 <intel_unmap_renderbuffer>

and the actual called procedure as seen in the top stack frame:

(gdb) down
#0  intel_miptree_unmap (intel=0x91d9f0, mt=0x0, level=0, slice=0) at intel_mipmap_tree.c:1751
1751	{
(gdb) p intel_miptree_unmap
$4 = {void (struct intel_context *, struct intel_mipmap_tree *, unsigned int, unsigned int)} 0x7fffe5399410 <intel_miptree_unmap>

In other words, we call one address (intel_unmap_renderbuffer) and end up at another (intel_miptree_unmap). Stack corruption? some macro magic?

Comment 4 Przemek Klosowski 2013-04-02 16:55:51 UTC

OK, the previous comment was a red herring: it's just tail call optimization. intel_unmap_renderbuffer() is a short procedure (intel_fbo.c:114) that calls intel_miptree_unmap() just before returning, so the callee overwrites the caller's stack frame and the caller no longer appears in the backtrace.
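The tail-call effect can be reproduced in isolation. The following is a minimal sketch, not Mesa code: `demo_tree`, `demo_tree_unmap` and `demo_unmap_renderbuffer` are hypothetical stand-ins for `struct intel_mipmap_tree`, `intel_miptree_unmap()` and `intel_unmap_renderbuffer()`.

```c
#include <assert.h>

/* Hypothetical stand-in for struct intel_mipmap_tree. */
struct demo_tree {
    int num_samples;
};

/* Stand-in for intel_miptree_unmap(): dereferences mt unconditionally,
 * so passing NULL here would fault inside this function. */
int demo_tree_unmap(struct demo_tree *mt)
{
    return (mt->num_samples <= 1) ? 1 : 2;
}

/* Stand-in for intel_unmap_renderbuffer(): the call below is the last
 * thing the function does, so an optimizing compiler may replace
 * "call + ret" with a plain jump and reuse this frame for the callee. */
int demo_unmap_renderbuffer(struct demo_tree *mt)
{
    return demo_tree_unmap(mt);
}

/* Self-check: the wrapper forwards the callee's result unchanged. */
void demo_tail_call_check(void)
{
    struct demo_tree multisampled = { 4 };
    assert(demo_unmap_renderbuffer(&multisampled) == 2);
}
```

When built with optimization (for example `gcc -O2`, which enables sibling-call optimization), a debugger stopped inside `demo_tree_unmap()` may show the wrapper's caller as frame #1, with no frame for the wrapper itself. Whether the jump is actually emitted depends on the compiler and flags.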


162	intel_unmap_renderbuffer(struct gl_context *ctx,
163				 struct gl_renderbuffer *rb)
164	{
165	   struct intel_context *intel = intel_context(ctx);
166	   struct swrast_renderbuffer *srb = (struct swrast_renderbuffer *)rb;
167	   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
168	
169	   DBG("%s: rb %d (%s)\n", __FUNCTION__,
170	       rb->Name, _mesa_get_format_name(rb->Format));
171	
172	   if (srb->Buffer) {
173	      /* this is a malloc'd renderbuffer (accum buffer) */
174	      /* nothing to do */
175	      return;
176	   }
177	
178	   intel_miptree_unmap(intel, irb->mt, irb->mt_level, irb->mt_layer);

and the problem is that the inline proc intel_renderbuffer(rb) (defined in mesa/drivers/dri/intel/intel_fbo.h) returns a partially filled irb structure, most of whose members, including irb->mt, are NULL.

static INLINE struct intel_renderbuffer *
intel_renderbuffer(struct gl_renderbuffer *rb)
{
   struct intel_renderbuffer *irb = (struct intel_renderbuffer *) rb;
   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS) {
      /*_mesa_warning(NULL, "Returning non-intel Rb\n");*/
      return irb;
   }
   else
      return NULL;
}

The generic gl_renderbuffer *rb does not have the mt* fields, so they end up as NULLs. I conclude that NULL is a valid value, and that users of intel_renderbuffer have to check for it. Therefore, I suggest that the fix is to call intel_miptree_unmap() (in intel_mipmap_tree.c) conditionally in intel_fbo.c:178:

   if (irb->mt) 
      intel_miptree_unmap(intel, irb->mt, irb->mt_level, irb->mt_layer);

but I don't really understand this code, so I can't say for sure whether this just masks the real problem, which may be incorrect creation of the intel_renderbuffer structure.

I also haven't checked other situations where the intel_renderbuffer structure is used---there may be more places where such test is required to prevent NULL dereferences.
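To make the two distinct NULL cases concrete, here is a simplified model, not the real Mesa definitions. All `demo_*` names and the `DEMO_RB_CLASS` value are hypothetical. The checked downcast can return NULL (wrong class), and a successful downcast can still yield a structure whose `mt` is NULL, which is the case that crashed here.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RB_CLASS 0x12345678u  /* hypothetical; stands in for INTEL_RB_CLASS */

struct demo_tree { int num_samples; int unmapped; };

/* Hypothetical stand-in for struct gl_renderbuffer. */
struct demo_gl_renderbuffer { unsigned class_id; };

/* Hypothetical stand-in for struct intel_renderbuffer; base must be first. */
struct demo_intel_renderbuffer {
    struct demo_gl_renderbuffer base;
    struct demo_tree *mt;              /* may legitimately be NULL */
};

/* Checked downcast: NULL for a NULL or foreign renderbuffer. */
struct demo_intel_renderbuffer *
demo_intel_renderbuffer(struct demo_gl_renderbuffer *rb)
{
    if (rb && rb->class_id == DEMO_RB_CLASS)
        return (struct demo_intel_renderbuffer *) rb;
    return NULL;
}

/* Callee that requires a non-NULL tree, like intel_miptree_unmap(). */
void demo_tree_unmap(struct demo_tree *mt)
{
    assert(mt != NULL);
    mt->unmapped = 1;
}

/* Guarded caller: returns 1 if a tree was unmapped, 0 if skipped. */
int demo_unmap_renderbuffer(struct demo_gl_renderbuffer *rb)
{
    struct demo_intel_renderbuffer *irb = demo_intel_renderbuffer(rb);

    if (irb == NULL)       /* case 1: not our renderbuffer class */
        return 0;
    if (irb->mt == NULL)   /* case 2: right class, but no miptree attached */
        return 0;
    demo_tree_unmap(irb->mt);
    return 1;
}
```

The guard only prevents the dereference. It does not answer the open question of whether a renderbuffer without a miptree should have reached this code path in the first place.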

Comment 5 chad.versace@linux.intel.com 2013-04-10 16:36:41 UTC

Every call to intel_unmap_renderbuffer follows a call to intel_map_renderbuffer. The two calls are always paired together.

Interestingly, intel_map_renderbuffer *does* check for `irb->mt == NULL`. Since the two functions are a pair, intel_unmap_renderbuffer should check too.

Przemek, does adding `if (!irb->mt) return;` to intel_unmap_renderbuffer fix this bug? If so, then I think it's a good fix.

----

About your comment: "I also haven't checked other situations where the intel_renderbuffer structure is used---there may be more places where such test is required to prevent NULL dereferences."

The map/unmap functions are called during special circumstances. Just because we need to check for `irb->mt == NULL` in the map/unmap functions does not necessarily mean we need to do the check in all intel_renderbuffer functions.

We may indeed need additional `irb->mt` checks, but this bug isn't a strong indicator of that.
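The pairing argument can be sketched as follows. This is an illustrative model only: `pair_rb`, `pair_map` and `pair_unmap` are hypothetical names, and the assumption that the map side reports a missing miptree by handing back a NULL mapping is mine, not taken from this report.

```c
#include <assert.h>
#include <stddef.h>

struct pair_tree { unsigned char pixels[16]; int mapped; };

/* Hypothetical stand-in for struct intel_renderbuffer. */
struct pair_rb { struct pair_tree *mt; };   /* mt may be NULL */

/* Map side: checks for a missing tree and hands back a NULL mapping
 * (assumed behavior, see the note above). */
void pair_map(struct pair_rb *rb, unsigned char **out_map)
{
    assert(rb != NULL && out_map != NULL);
    if (rb->mt == NULL) {
        *out_map = NULL;
        return;
    }
    rb->mt->mapped = 1;
    *out_map = rb->mt->pixels;
}

/* Unmap side: mirrors the same check, since every unmap follows a map
 * on the same renderbuffer and must tolerate whatever map tolerated. */
void pair_unmap(struct pair_rb *rb)
{
    assert(rb != NULL);
    if (rb->mt == NULL)
        return;
    rb->mt->mapped = 0;
}
```

Keeping the early return identical on both sides means a renderbuffer that mapped to nothing also unmaps to nothing, so the pair stays balanced for callers.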

Comment 6 Przemek Klosowski 2013-04-12 21:12:41 UTC

Unfortunately, just adding a NULL test does not fix things. There's no crash, but the 3D buffer simply retains the previous pixels and no new content ever appears.

Comment 7 Przemek Klosowski 2013-04-12 21:31:56 UTC

Created attachment 735004 [details]
NULL test patch when intel_unmap_renderbuffer is calling intel_miptree_unmap

Comment 8 Przemek Klosowski 2013-04-12 21:34:59 UTC

Created attachment 735005 [details]
patch for mesa.spec adding the associated patch

Comment 9 Przemek Klosowski 2013-04-12 21:41:35 UTC

Just wanted to mention that this bug has pretty wide consequences on Intel graphics hardware: it crashes most 3D applications (FreeCAD, meshlab, etc.) on startup.

Comment 10 Przemek Klosowski 2013-04-23 20:40:38 UTC

Kevin Fenzi noticed that this seems to be related to 
https://bugzilla.redhat.com/show_bug.cgi?id=946960
https://retrace.fedoraproject.org/faf/problems/786379/

Comment 11 Fedora End Of Life 2013-12-21 12:10:58 UTC

This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Fedora End Of Life 2014-02-05 20:00:03 UTC

Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
