Bug 468015 - Running nautilus on Altix IA64 causes system to hang [NEEDINFO]
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xorg-x11-server
Version: 5.3
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Adam Jackson
QA Contact: desktop-bugs@redhat.com
Keywords: OtherQA, Regression
Reported: 2008-10-22 08:28 EDT by Alan Matsuoka
Modified: 2010-10-23 01:21 EDT
Doc Type: Bug Fix
Last Closed: 2009-01-20 16:39:03 EST
cmeadors: needinfo? (jolim)


Attachments
console excerpt and Xorg.0.log from using Dave's ATI driver (10.29 KB, application/gzip)
2008-11-06 21:40 EST, Jonathan Lim


External Trackers
Tracker: Red Hat Product Errata
ID: RHEA-2009:0166
Priority: normal
Status: SHIPPED_LIVE
Summary: xorg-x11-drv-ati enhancement and bug fix update
Last Updated: 2009-01-20 11:05:28 EST

Description Alan Matsuoka 2008-10-22 08:28:47 EDT
Description of problem:

 On Altix IA64 with FireMV 2200 graphics running
 RHEL5.2-Snapshot1, starting nautilus causes the
 system to hang.  If nautilus is disabled in
 /usr/share/gnome/default.session, the gnome
 desktop runs without any problems.  When run with
 the --no-desktop option, there is no problem
 either.  However, even without the gnome desktop
 (just using 'xinit' to start the X server with
 an xterm and no window manager), running nautilus
 without any options prints the following and
 causes the system to hang:

   Initializing nautilus-open-terminal extension

   ** (nautilus:4557): WARNING **: Can not caclulate _NET_NUMBER_OF_DESKTOPS

   ** (nautilus:4557): WARNING **: Can not calculate _NET_NUMBER_OF_DESKTOPS

   ** (nautilus:4557): WARNING **: Can not get _NET_WORKAREA

   ** (nautilus:4557): WARNING **: Can not determine workarea, guessing at layout

Supporting Materials: This issue was initially held up by bug 448139 (fixed in RHEL 5.3). Now that we've provided them with test packages, they were able to do a bit of debugging (see https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=170845&gid=707&view_type=lifoall#eid_2361662 ). If a sysreport is required let me know and I'll get one, but they seem to have narrowed it down.

Requested Action from SEG: Fix/Escalate. I'd like to generate a release note for 5.3, so a bugzilla sooner rather than later would be appreciated!



How reproducible:

 Always.

Steps to Reproduce:

 Run 'xinit' to start the X server with just an
 xterm and no window manager, then run 'nautilus'.

Actual results:

 System hangs without any error messages,
 requiring reboot.

Expected results:

 nautilus and window system run normally.

Additional info:

  Event posted 09-23-2008 05:57pm EDT by jlim-sgi 
	
I've narrowed the problem down:

#0  fbBlt (srcLine=0x2000000005807800, srcStride=5120, srcX=0,
   dstLine=0x600000000033fcf0, dstStride=5120, dstX=0, width=5120,
   height=1024, alu=3, pm=4294967295, bpp=32, reverse=0, upsidedown=0)
   at fbblt.c:96
#1  0x2000000004ffadf0 in fbCopyNtoN (pSrcDrawable=0x60000000002f7e10,
   pDstDrawable=0x6000000000301d30, pGC=0x6000000000214510,
   pbox=0x60000fffffc3f0f0, nbox=0, dx=0, dy=1030, reverse=0, upsidedown=0,
   bitplane=0, closure=0x0) at fbcopy.c:85
#2  0x2000000004ffdfa0 in fbCopyRegion (pSrcDrawable=0x60000000002f7e10,
   pDstDrawable=0x6000000000301d30, pGC=0x6000000000214510,
   pDstRegion=0x60000fffffc3f0f0, dx=0, dy=1030, copyProc=0x2000000000a81940,
   bitPlane=0, closure=0x0) at fbcopy.c:394
#3  0x2000000004fff410 in fbDoCopy (pSrcDrawable=0x60000000002f7e10,
   pDstDrawable=0x6000000000301d30, pGC=0x6000000000214510, xIn=0, yIn=1030,
   widthSrc=1280, heightSrc=1024, xOut=0, yOut=0, copyProc=0x2000000000a81940,
   bitPlane=0, closure=0x0) at fbcopy.c:594
#4  0x2000000004fff960 in fbCopyArea (pSrcDrawable=0x60000000002f7e10,
   pDstDrawable=0x6000000000301d30, pGC=0x6000000000214510, xIn=0, yIn=0,
   widthSrc=1280, heightSrc=1024, xOut=0, yOut=0) at fbcopy.c:632
#5  0x20000000050d3c80 in XAACopyAreaPixmap (pSrc=0x60000000002f7e10,
   pDst=0x6000000000301d30, pGC=0x6000000000214510, srcx=0, srcy=0,
   width=1280, height=1024, dstx=0, dsty=0) at xaaGC.c:400
#6  0x40000000004dd4f0 in cwCopyArea (pSrc=0x60000000002f7e10,
   pDst=0x6000000000301d30, pGC=0x6000000000214510, srcx=0, srcy=0, w=1280,
   h=1024, dstx=0, dsty=0) at cw_ops.c:202
#7  0x40000000004c3d60 in damageCopyArea (pSrc=0x60000000002f7e10,
   pDst=0x6000000000301d30, pGC=0x6000000000214510, srcx=0, srcy=0,
   width=1280, height=1024, dstx=0, dsty=0) at damage.c:790
#8  0x200000000515ff80 in XAAMoveOutOffscreenPixmap (pPix=0x60000000002f7e10)
   at xaaOffscreen.c:149
#9  0x200000000515f8d0 in XAARemoveAreaCallback (area=0x60000000002fe170)
   at xaaOffscreen.c:114
#10 0x20000000050cf580 in XAAValidateGC (pGC=0x60000000002c9600,
   changes=8388607, pDraw=0x600000000031f8b0) at xaaGC.c:104
#11 0x40000000004d6660 in cwValidateGC (pGC=0x60000000002c9600,
   stateChanges=8388607, pDrawable=0x600000000031f8b0) at cw.c:166
#12 0x40000000004bdcc0 in damageValidateGC (pGC=0x60000000002c9600,
   changes=8388607, pDrawable=0x600000000031f8b0) at damage.c:410
#13 0x40000000000f4e30 in ValidateGC (pDraw=0x600000000031f8b0,
   pGC=0x60000000002c9600) at gc.c:79
#14 0x40000000000a3300 in ProcPolyFillRectangle (client=0x6000000000293ef0)
   at dispatch.c:1948
#15 0x4000000000091c70 in Dispatch () at dispatch.c:459
#16 0x400000000003b400 in main (argc=1, argv=0x60000fffffc3fc78,
   envp=0x60000fffffc3fc88) at main.c:447

    46 fbBlt (FbBits   *srcLine,
    ...
    85         CARD8 *src = (CARD8 *) srcLine;
    ...
    96                 memcpy(dst + i * dstStride, src + i * srcStride, width);

The hang occurs in memcpy(); src appears to be out of bounds.
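
To illustrate what "out of bounds" means for this loop, here is a minimal sketch. It is not taken from the X server sources: rows_in_bounds and checked_blt are hypothetical helpers, and the buffer sizes passed to them are assumptions, since fbBlt itself receives no size information. With the values in frame #0 (srcStride=5120, width=5120, height=1024), the loop reads 1024 rows of 5120 bytes, so the source must provide at least 5242880 readable bytes starting at srcLine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the X server): returns 1 if every row
 * touched by a strided copy lies inside a buffer of buf_size bytes. */
static int rows_in_bounds(size_t buf_size, size_t stride,
                          size_t width, size_t height)
{
    if (width == 0 || height == 0)
        return 1;               /* nothing is accessed */
    if (width > buf_size)
        return 0;               /* even the first row does not fit */
    /* The last row starts at (height - 1) * stride and spans width bytes.
     * Written as a division to avoid overflow in the multiplication. */
    if (stride != 0 && (height - 1) > (buf_size - width) / stride)
        return 0;
    return 1;
}

/* Same loop shape as the memcpy() path at fbblt.c:96, but it refuses to
 * copy when either side would be read or written past its end. */
static int checked_blt(uint8_t *dst, size_t dst_size, size_t dst_stride,
                       const uint8_t *src, size_t src_size, size_t src_stride,
                       size_t width, size_t height)
{
    if (!rows_in_bounds(src_size, src_stride, width, height) ||
        !rows_in_bounds(dst_size, dst_stride, width, height))
        return -1;

    for (size_t i = 0; i < height; i++)
        memcpy(dst + i * dst_stride, src + i * src_stride, width);
    return 0;
}
```

A check of this kind only helps if the caller knows the real extent of the mapping behind src, which is exactly what seems to be in doubt for the offscreen pixmap here.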

Xorg uses XAA for acceleration by default.  To prevent the hang, one of
the following entries may be specified in the "Device" section (for the
Radeon driver) in /etc/X11/xorg.conf:

 Option "NoAccel" "on"

 Option "AccelMethod" "EXA"

David: can you open a BZ against Xorg?  Thanks. 


SEG NOTES:

Waiting on sysreport.
Comment 1 RHEL Product and Program Management 2008-10-22 08:50:33 EDT
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.
Comment 2 Jonathan Lim 2008-11-06 21:40:39 EST
Created attachment 322809
console excerpt and Xorg.0.log from using Dave's ATI driver
Comment 3 Jonathan Lim 2008-11-06 21:45:54 EST
> Okay, let's concentrate on the nautilus issue. I don't have access to IT, so I'll
> have to get someone to attach that unless there is another bz open for it.
>
> What I'm getting right now is that out of the box with XAA, it works except
> that Nautilus crashes everything? This must be one of the accel operations.
>
> EXA isn't going to be supported at present; NoAccel is probably interesting to
> track down.
>
> Can someone try the packages from:
>
> http://people.redhat.com/airlied/radeon/
>
> with XAA to see if it goes away?

The above is from BZ 448139.  I installed the package and ran 'startx' with XAA enabled by default, but it didn't work: the machine hung while nautilus was starting.  I set NoAccel and tried again but got the same result.
Comment 4 Dave Airlie 2008-11-06 22:10:36 EST
Can we try Option "XAANoOffscreenPixmaps" "true"?
Comment 5 Jonathan Lim 2008-11-06 22:29:33 EST
(In reply to comment #4)
> Can we try Option "XAANoOffscreenPixmaps" "true"?

Okay, that works: nautilus isn't causing the machine to hang anymore.
Also, I can open a terminal window and move it around without causing
a hang as described in BZ 448139; it also works when running only xinit
and twm.
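
For reference, the workaround would sit in the "Device" section of /etc/X11/xorg.conf roughly as follows. This is a sketch only: the Identifier value is a placeholder, and the Driver line assumes the Radeon driver mentioned in the description.

```
Section "Device"
    Identifier "Videocard0"    # placeholder; keep the identifier already in use
    Driver     "radeon"        # assumed from the description
    Option     "XAANoOffscreenPixmaps" "true"
EndSection
```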
Comment 6 Dave Airlie 2008-11-11 01:08:02 EST
I'm doing a server build at the moment that might fix this properly:

http://porkchop.devel.redhat.com/brewroot/scratch/airlied/task_1565365/

Could this be tested ASAP on the Altix without the XAANoOffscreenPixmaps workaround in place?
Comment 7 Jonathan Lim 2008-11-11 17:01:05 EST
(In reply to comment #6)
> I'm doing a server build at the moment that might potentially fix this properly
> 
> http://porkchop.devel.redhat.com/brewroot/scratch/airlied/task_1565365/
> 
> If this could get tested ASAP on the Altix without the XAANoOffscreenPixmaps
> workaround in place.

I reverted to the RHEL5.3-Snapshot1 xorg-x11-drv-ati and installed the following
from the location above:

  xorg-x11-server-Xorg
  xorg-x11-server-Xnest
  xorg-x11-server-randr-source
  xorg-x11-server-sdk

I also commented out XAANoOffscreenPixmaps in xorg.conf.

The system hung just as the desktop was coming up when I ran 'startx'.
Comment 8 Adam Jackson 2008-11-12 14:49:56 EST
Ah well.

Devel ack; we should just turn off offscreen pixmaps on Altix.
Comment 9 Jonathan Lim 2008-11-12 15:57:43 EST
(In reply to comment #7)
> I reverted to the RHEL5.3-Snapshot1 xorg-x11-drv-ati and installed the
> following from the location above:
> 
>   xorg-x11-server-Xorg
>   xorg-x11-server-Xnest
>   xorg-x11-server-randr-source
>   xorg-x11-server-sdk
> 
> I also commented out XAANoOffscreenPixmaps in xorg.conf.
> 
> The system hung just as the desktop was coming up when I ran 'startx'.

In addition to the above, I installed xorg-x11-drv-ati-6.6.3-3.20.el5 and graphics came up fine.
Comment 11 Adam Jackson 2008-11-18 10:17:57 EST
-> MODIFIED
Comment 13 Cameron Meadors 2008-12-10 11:54:23 EST
Jonathan, can you retest with snap6 when it comes out and report your results?
Comment 16 errata-xmlrpc 2009-01-20 16:39:03 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0166.html
