Bug 464619 - xfs segfaults intermittently
Summary: xfs segfaults intermittently
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: xorg-x11
Version: 4.8
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Kristian Høgsberg
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-09-29 18:33 UTC by ritz
Modified: 2018-10-20 02:38 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-05-18 20:26:59 UTC
Target Upstream Version:
Embargoed:


Attachments
coredump of xfs (91.05 KB, application/x-bzip2)
2008-09-29 18:36 UTC, ritz
sosreport of the box in question (2.06 MB, application/x-bzip2)
2008-09-29 18:40 UTC, ritz
Patch to implement the fix described above (1.08 KB, patch)
2008-12-08 11:03 UTC, Olivier Fourdan


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0998 0 normal SHIPPED_LIVE xorg-x11 bug fix and enhancement update 2009-05-18 18:39:40 UTC

Description ritz 2008-09-29 18:33:18 UTC
Description of problem:

  xfs segfaults intermittently 

Version-Release number of selected component (if applicable):
xorg-x11-xfs-6.8.2-1.EL.33.0.4-x86_64

How reproducible:
intermittent

Steps to Reproduce:
n/a
  
Actual results:
segfaults intermittently 

Expected results:
should not segfault

Additional info:

The xfs config file has client-limit set to 1000. Setting it back to the default of 10 does not help: the failure occurs no more and no less frequently than when it was raised to 1000.


backtrace

#0  0x000000000040d01c in OpenFont (client=0x0, fid=0, format=0,
   format_mask=0, namelen=0, name=0x0) at fonts.c:542

warning: Source file is more recent than executable.

542    c->flags = (FontLoadInfo | FontLoadProps);
(gdb) bt
#0  0x000000000040d01c in OpenFont (client=0x0, fid=0, format=0,
   format_mask=0, namelen=0, name=0x0) at fonts.c:542
#1  0x0000000000000000 in ?? ()
(gdb) list
542    c->flags = (FontLoadInfo | FontLoadProps);
543    c->format = format;
544    c->format_mask = format_mask;
545    c->non_cachable_font = pfont;
546
547    (void) do_open_font(client, (pointer) c);
548    return FSSuccess;
549 }
550
551 static int
(gdb) p c
$1 = 0x0

This looks odd.

Comment 5 Issue Tracker 2008-10-30 14:45:25 UTC
Any update at all? Do we know why it's segfaulting yet? Is there any
progress to a patch?


This event sent from IssueTracker by calvin_g_smith 
 issue 188699

Comment 6 RHEL Program Management 2008-10-31 16:45:25 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 8 RHEL Program Management 2008-10-31 17:35:45 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 14 Issue Tracker 2008-12-08 10:59:22 UTC
Instrumenting the code, we see that the pointer to ListenTransConns is freed
and set to NULL by the cloned process in StopListening(), but there is no actual
test for its validity in 

    Dec  8 11:11:17 localhost xfs[4840]: ListenTransConns = 0x522580
    Dec  8 11:11:17 localhost xfs[4840]: In Dispatch() - ListenTransConns = 0x522580
    Dec  8 11:11:17 localhost xfs[4840]: ListenTransConns = 0x522580
    Dec  8 11:11:17 localhost xfs[4840]: attempting clone...

    Dec  8 11:11:17 localhost xfs[9608]: ListenTransConns = 0x522580
    Dec  8 11:11:17 localhost xfs[9608]: In StopListening() - ListenTransConns = 0x0
    Dec  8 11:11:17 localhost xfs[9608]: clone: child becoming drone
    Dec  8 11:11:17 localhost xfs[9608]: In Dispatch() - ListenTransConns = 0x0

So a fix would be to:

 - Make sure ListenTransCount is reset to 0 when ListenTransConns is
freed
 - Check that the table is not NULL in CloseSockets(), which avoids the
segfault in the child process, which is about to finish.
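
As a rough illustration of those two points (this is not the attached patch
and not the xfs source), here is a minimal, self-contained sketch. The names
listen_conns, listen_count, conn_t, start_listening(), stop_listening() and
close_sockets() are hypothetical stand-ins that only model ListenTransConns,
ListenTransCount, StopListening() and CloseSockets():

```c
/* Hypothetical, simplified model of the listener table; NOT the xfs code. */
#include <assert.h>
#include <stdlib.h>

typedef struct { int fd; } conn_t;     /* stand-in for a transport connection */

static conn_t **listen_conns = NULL;   /* models ListenTransConns */
static int      listen_count = 0;      /* models ListenTransCount */

/* Models StopListening(): free the table and reset BOTH the pointer and
 * the count, so that no later loop can walk a table that is gone. */
void stop_listening(void)
{
    if (listen_conns != NULL) {
        for (int i = 0; i < listen_count; i++)
            free(listen_conns[i]);
        free(listen_conns);
    }
    listen_conns = NULL;
    listen_count = 0;                  /* first half of the proposed fix */
    assert(listen_conns == NULL && listen_count == 0);
}

/* Build a table of n fake listeners. Returns 0 on success, -1 on failure. */
int start_listening(int n)
{
    if (n <= 0)
        return -1;
    stop_listening();                  /* drop any previous table first */

    conn_t **table = calloc((size_t)n, sizeof *table);
    if (table == NULL)
        return -1;
    for (int i = 0; i < n; i++) {
        table[i] = malloc(sizeof **table);
        if (table[i] == NULL) {        /* undo the partial allocation */
            while (i-- > 0)
                free(table[i]);
            free(table);
            return -1;
        }
        table[i]->fd = i;              /* fake descriptor, nothing is opened */
    }
    listen_conns = table;
    listen_count = n;
    return 0;
}

/* Models CloseSockets(): returns how many listeners were marked closed.
 * The NULL test is the second half of the proposed fix. */
int close_sockets(void)
{
    int closed = 0;

    if (listen_conns == NULL)          /* table already freed: nothing to do */
        return 0;
    for (int i = 0; i < listen_count; i++) {
        if (listen_conns[i] != NULL && listen_conns[i]->fd >= 0) {
            listen_conns[i]->fd = -1;  /* a real server would close() here */
            closed++;
        }
    }
    return closed;
}
```

In this model, calling close_sockets() after stop_listening() simply returns 0
instead of dereferencing a freed table. Either guard alone would be enough
here; keeping both mirrors the two-part fix suggested above.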
   
Olivier.



This event sent from IssueTracker by ofourdan 
 issue 188699

Comment 15 Olivier Fourdan 2008-12-08 11:03:12 UTC
Created attachment 326111 [details]
Patch to implement the fix described above

The code is similar upstream, and the same reproducer tried on EL5 and on the latest F10 shows that the problem also affects current versions of xfs.

Comment 19 RHEL Program Management 2009-01-20 17:51:29 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 24 Zack Cerza 2009-02-06 18:40:27 UTC
Kristian, Xnest -fp unix:/:7100 :0 isn't working here; I'm getting an "Unable to find transport for unix: ..." error message. I've seen this before but I've long since forgotten how to work around it.

Comment 26 Kristian Høgsberg 2009-02-06 20:46:04 UTC
(In reply to comment #24)
> Kristian, Xnest -fp unix:/:7100 :0 isn't working here; I'm getting an "Unable
> to find transport for unix: ..." error message. I've seen this before but I've
> long since forgotten how to work around it.

It's unix/:7100, not unix:/:7100.  It should just work.

Comment 27 Kristian Høgsberg 2009-02-06 20:47:45 UTC
Back to MODIFIED.

Comment 28 Zack Cerza 2009-02-06 21:09:31 UTC
Uh, weird, could have sworn I tried that too. Thanks!

Comment 45 errata-xmlrpc 2009-05-18 20:26:59 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0998.html

