Bug 472315

Summary: BBC iPlayer "Stephen Fry in America: Pacific" nastily hangs FC9 system.
Product: [Fedora] Fedora
Reporter: Kev 'Kyrian' Green <kyrian>
Component: xorg-x11-drv-ati
Assignee: Dave Airlie <airlied>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 9
CC: gecko-bugs-nobody, mcepl, walters, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
URL: http://www.bbc.co.uk/iplayer/episode/b00flx59
Doc Type: Bug Fix
Last Closed: 2009-07-14 16:46:45 UTC
Attachments (description, flags):
a script for collecting debug information [flags: none]
Xorg.conf file from crashy machine. [flags: none]
Xorg.0.log after locking up stephen fry ;-) [flags: none]
dmesg log file [flags: none]
messages log post crash. [flags: none]
Xorg.0.log after locking up stephen fry, then trying without an xorg.conf. [flags: none]

Description Kev 'Kyrian' Green 2008-11-20 00:24:19 UTC
Description of problem:

The URL specified above (the accompanying 'standard resolution' version does the same) crashes the entire system nastily when run in full-screen mode (possibly even when just run in normal in-window mode; I'm not sure about that). When the problem starts, the keyboard usually becomes non-responsive (caps-lock doesn't do anything), although you can sometimes get out of it with 'Magic SysRq' requests rather than a hard reboot. The video goes very jerky while the audio continues as normal; eventually the audio goes jerky too (it may even start to loop a bit), and then you're stuck and have to reboot.

Occasionally the video turns into a grey with wonky-green-squares type display, but usually it just hangs.

It doesn't happen as soon as you start playing the clip, but usually after some minutes (maybe up to 15).

I've never had this problem with all the previous episodes in the series (5 or so), so I'm assuming a recent update has done something?

Version-Release number of selected component (if applicable):

kernel-2.6.27.5-37.fc9.i686
mozilla-filesystem-1.9-2.fc9.i386
libflashsupport-000-0.5.svn20070904.i386
flash-plugin-10.0.12.36-release.i386
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.4) Gecko/2008111217 Fedora/3.0.4-1.fc9 Firefox/3.0.4

How reproducible:

Play the above URL with above software versions.

Wait about 15 minutes, and it dies in the above specified way.

Steps to Reproduce:
1. See above.
  
Actual results:

See above.

Expected results:

No crash, just reach the end of the video and be fine.

Additional info:

Firefox extensions installed:

DOM Inspector 2.0.1
Flashblock 1.5.7
Live PageRank 0.9.6
Tab To Window 1.2.8
Web Developer 1.1.6

I'm not able to say conclusively where the problem lies, and I am reluctant to keep completely crashing out my machine to prove that.

The problem disappeared (although I was at the end of the video by that point, so it may just not have been long enough to prove it after the changes...) when I disabled all of the above extensions, AND backed out to running kernel kernel-2.6.26.6-79.fc9.i686. So I guess it might be an either/or problem.

I tried it under Opera somewhere along the line, but IIRC that wouldn't even play at all.

There is a vague possibility this is screensaver related, but (on watching the same video) having had the screensaver begin, and hit space/return/caps lock/whatever to stop it several times, without a crash, I don't think it is.

Comment 1 Matěj Cepl 2008-11-20 01:38:57 UTC
Created attachment 324130 [details]
a script for collecting debug information

Thanks for the bug report.  We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.

When you reboot the machine (if possible to the runlevel 1, so that logs are not overwritten that much), there should be some files on the hard drive (that is, if they got written before computer crashed), which could help us tremendously.

Please attach your X server config file (/etc/X11/xorg.conf) and X server log file (/var/log/Xorg.*.log) to the bug report as individual uncompressed file attachments using the bugzilla file attachment link below.

Could you please also try to run without any /etc/X11/xorg.conf whatsoever and let X11 autodetect your display and video card? Attach to this bug /var/log/Xorg.0.log from this attempt as well, please.

Also, could we get output of the command

	rpm -qa *xulrun* *firefox* *mozilla* *flash* *plugin*

Please also install firefox-debuginfo (debuginfo-install is from
yum-utils package).

	debuginfo-install firefox

and then look for the file named core.(some number) (aka coredump) in the directory where firefox was run from?

If you run the attached script to process this coredump like 

gbt firefox core.<number>

you should get a file firefox-backtrace-<number>.txt Please attach this file as well.
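
For readers without the attachment to hand, here is a minimal sketch of what a helper of this kind might do. The attached script itself is not reproduced in this report, so everything below is an assumption: the function names are hypothetical, and the only real interface relied on is gdb's batch mode (`-batch` and `-ex`).

```python
import re
import subprocess
from pathlib import Path


def backtrace_command(program, core):
    """Build a gdb batch command that prints a full backtrace of every thread."""
    return ["gdb", "-batch", "-ex", "thread apply all bt full", program, core]


def output_name(program, core):
    """Derive <program>-backtrace-<number>.txt from a core file named core.<number>."""
    match = re.search(r"core\.(\d+)$", core)
    suffix = match.group(1) if match else "unknown"
    return f"{Path(program).name}-backtrace-{suffix}.txt"


def collect_backtrace(program, core):
    """Run gdb (must be installed) and save its output next to the core file."""
    result = subprocess.run(
        backtrace_command(program, core),
        capture_output=True,
        text=True,
        check=False,
    )
    target = Path(core).with_name(output_name(program, core))
    target.write_text(result.stdout + result.stderr)
    return str(target)
```

Note that `program` should be the real firefox binary rather than a wrapper shell script, otherwise gdb cannot match the core file to its symbols.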

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Comment 2 Kev 'Kyrian' Green 2008-11-25 13:50:14 UTC
Internet problems at home mean I can't really get you all this information right now, but I will do once it's back up and running.

I have taken copies of a variety of log files including Xorg.0.log from dropping into single user mode after a crash, which are good to go once I can get online.

The video card is a Radeon 9600 or thereabouts, which I'm pretty sure is quite well supported and recognised, so I don't think it can be a mis-recognition issue.

However it seems worth mentioning that this is now happening periodically even when firefox is not running, once or twice even before I had logged in to X Windows.

Oh, and I let memtest86+ do its thing for about half an hour on the system, and it reported no errors.

That's all for now I'm afraid. Apologies for the potentially unhelpful messages you will be getting.

Comment 3 Kev 'Kyrian' Green 2008-11-28 10:48:50 UTC
Happened again this morning with this BBC URL:

http://news.bbc.co.uk/newsbeat/hi/entertainment/newsid_7753000/7753638.stm

And new kernel package "kernel-2.6.27.5-41.fc9.i686", so I'm reverting grub back (again) to using "2.6.26.6-79.fc9.i686" for now, to preserve some of my own sanity.

Hmm, perhaps more than you bargained for from "rpm -qa *xulrun* *firefox* *mozilla* *flash* *plugin*" ;-)

java-1.6.0-openjdk-plugin-1.6.0.0-0.18.b09.fc9.i386
gstreamer-plugins-ugly-0.10.8-2.fc9.i386
anaconda-yum-plugins-1.0-1.fc9.noarch
xulrunner-1.9.0.4-1.fc9.i386
mozilla-filesystem-1.9-2.fc9.i386
gstreamer-plugins-pulse-0.9.5-0.5.svn20070924.fc9.i386
libflashsupport-000-0.5.svn20070904.i386
kipi-plugins-0.1.5-2.fc9.i386
setroubleshoot-plugins-2.0.11-1.fc9.noarch
vamp-plugin-sdk-1.1b-4.fc9.i386
gstreamer-plugins-flumpegdemux-0.10.15-2.fc9.i386
xfce4-mailwatch-plugin-1.1.0-1.fc9.i386
gstreamer-plugins-bad-0.10.7-4.fc9.i386
gutenprint-plugin-5.0.2-2.fc9.i386
alsa-plugins-jack-1.0.16-4.fc9.i386
gstreamer-plugins-farsight-0.12.7-2.fc9.i386
xfce-mcs-plugins-4.4.2-4.fc9.i386
flash-plugin-10.0.12.36-release.i386
mythplugins-0.21-13.fc9.i386
gstreamer-plugins-good-0.10.8-8.fc9.i386
thunar-archive-plugin-0.2.4-5.fc9.i386
gstreamer-plugins-base-0.10.19-4.fc9.i386
gstreamer-plugins-base-devel-0.10.19-4.fc9.i386
amsn-plugins-0.97.2-1.fc9.i386
totem-mozplugin-2.23.2-8.fc9.i386
alsa-plugins-pulseaudio-1.0.16-4.fc9.i386
nspluginwrapper-1.1.2-2.fc9.i386
firefox-3.0.4-1.fc9.i386

I have taken the liberty of excluding:

ladspa-*-plugins-*
audacious-plugins-*

As I am pretty sure they are unrelated PlanetCCRMA packages, and they halve the size of the list you'd be dealing with.
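
The same trimming could be done mechanically instead of by hand. This is only a sketch: the two glob patterns are the ones listed above, while the function name and the idea of feeding it the lines of `rpm -qa` output are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Glob patterns for the packages left out of the list above.
EXCLUDED = ["ladspa-*-plugins-*", "audacious-plugins-*"]


def filter_packages(packages, excluded=EXCLUDED):
    """Return the packages that match none of the excluded glob patterns."""
    return [
        name
        for name in packages
        if not any(fnmatchcase(name, pattern) for pattern in excluded)
    ]
```

`fnmatchcase` is used rather than `fnmatch` so that matching stays case-sensitive on every platform, which is how rpm package names behave.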

I'm already very late for work, thanks to various things, so I'll just attach the requested files and leave it at that until this evening, or tomorrow.

Comment 4 Kev 'Kyrian' Green 2008-11-28 10:51:18 UTC
Oh, it's worth noting that firefox doesn't crash, I have to use CTRL+ALT+PRINT SCR+{S,U,B} to get the system out of the spin it gets into, so there might not be a core dump at all.

I will install the debuginfo package anyway, but I don't hold out much hope of there being a core dump event.

Comment 5 Kev 'Kyrian' Green 2008-11-28 10:52:47 UTC
Created attachment 324970 [details]
Xorg.conf file from crashy machine.

As requested, this is the 'normal running' Xorg.conf, rather than a barebones conf to force graphics card detection.

Comment 6 Kev 'Kyrian' Green 2008-11-28 10:55:53 UTC
Created attachment 324973 [details]
Xorg.0.log after locking up stephen fry ;-)

Comment 7 Matěj Cepl 2008-11-28 15:21:02 UTC
Hmm, I don't see anything suspicious here -- could we get /var/log/dmesg and /var/log/messages as well?

Comment 8 Kev 'Kyrian' Green 2008-11-28 23:29:51 UTC
> Hmm, I don't see anything suspicious here -- could we get /var/log/dmesg and
> /var/log/messages as well?
>
Y'know, I thought you might say that; If it were something that obvious I would probably have solved it myself ;-)

I took copies of the following files (based on their respective timestamps being close to the current time) after a crash and going into single user mode:

dmesg.old-post-lockup-fry
maillog-post-lockup-fry
messages-post-lockup-fry
secure-post-lockup-fry
wtmp-post-lockup-fry
cron-post-lockup-fry
dmesg-post-lockup-fry
fry-lockup.txt
dot-xsession-errors-post-lockup-fry
Xorg.0.log.post-lockup-fry

Of which I will attach dmesg and messages momentarily.
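
For anyone wanting to repeat the "timestamp close to the crash" selection, a rough sketch follows. The function name and the ten-minute default window are assumptions; the copies listed above were picked by eye, not with this code.

```python
import os


def files_modified_near(directory, reference_time, window_seconds=600):
    """List regular files in `directory` whose mtime lies within the window."""
    matches = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-regular entries
        if abs(os.path.getmtime(path) - reference_time) <= window_seconds:
            matches.append(entry)
    return matches
```

Calling it as `files_modified_near("/var/log", time.time())` from single user mode straight after a lockup would give a candidate list to copy aside before the logs rotate.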

K.

Comment 9 Kev 'Kyrian' Green 2008-11-28 23:30:49 UTC
Created attachment 325051 [details]
dmesg log file

Comment 10 Kev 'Kyrian' Green 2008-11-28 23:32:26 UTC
Created attachment 325052 [details]
messages log post crash.

Comment 11 Kev 'Kyrian' Green 2008-11-29 00:00:19 UTC
Created attachment 325055 [details]
Xorg.0.log after locking up stephen fry, then trying without an xorg.conf.

Woah, the resolution is really sharp, so much so that it is making my eyes go a bit funny. Maybe I will tweak it but keep this type of configuration ;-)

Anyway, as requested, I used RCS to check in my config into the relevant sub-directory (without keeping a working copy, before you ask), so I don't lose it, but it's definitely not where Xorg has found it.

[root@enkil ~]# ls -als /etc/X11/
applnk/     mwm/        rstart/     X           Xmodmap     
fontpath.d/ prefdm      twm/        xdm/        Xresources  
fs/         RCS/        wmconfig/   xinit/      xsm/        
[root@enkil ~]#

I've had the same lock-up a couple of times today while working on this bug, btw.

Comment 12 Matěj Cepl 2008-11-29 08:21:23 UTC
(In reply to comment #11)
> Woah, the resolution is really sharp, so much so that it is making my eyes go a
> bit funny. Maybe I will tweak it but keep this type of configuration ;-)

Just not sure if you noticed, but the autogenerated xorg.conf is now part of your Xorg.0.log. You can use it as a basis for your further tweaking.

> Y'know, I thought you might say that; If it were something that obvious
> I would probably have solved it myself ;-)

I can't see through bugzilla, so I don't know how experienced a Linux geek you are.

Anyway, passing to developers.

Comment 13 Kev 'Kyrian' Green 2008-11-29 15:28:34 UTC
Hmmm... I had been having trouble determining whether this other issue was a symptom, or cause of the above. I have a feeling I should have mentioned it earlier, though.

At the time of reporting this bug it seemed like it was a symptom, but now I'm not so sure.

It's aggravating trying to do my VAT return with the damn thing crashing. That's for sure.

The problem resurfaced, but with a graphics slow-down (no cross-hatched blank screen though, typically that would happen if I left the thing in its b0rked state for a minute or so) and then hang resulting in my needing to do SYSRQ stuff to get out of it.

It was with this kernel:

Linux version 2.6.26.6-79.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Fri Oct 17 14:52:14 EDT 2008

However when the system rebooted it would not recognise any of the hard disks at first, and then only the CD-ROM.

I re-seated cables, and powered down for a while etc. just in case and eventually it recognised them again.

I don't know whether this is because restarting using the SYSRQ stuff leaves the drives in an odd state that upsets the BIOS etc., or even whether this is the cause of the problem rather than a symptom. I would assume a system could survive a fair while without disk access, though. Also, this BIOS disk lockup only happens in a minority of cases, so I doubt it.

Either way, the motherboard says between its PCI slots it is  "MS 6391 VER 1", and BIOS reports itself as "W639 1IF1 V1.4 051402", so I'm going to use that to hunt for any BIOS updates in case it could be that sort of thing.

For reference:

[kyrian@enkil ~]$ lsscsi 
[0:0:0:0]    disk    ATA      WDC WD1200BB-00G 08.0  /dev/sda
[0:0:1:0]    disk    ATA      SAMSUNG SP1614N  TM10  /dev/sdb
[1:0:0:0]    cd/dvd  LITE-ON  LTR-16101B       TS0N  /dev/sr0
[1:0:1:0]    cd/dvd  JLMS     XJ-HD165H        CH0Z  /dev/sr1
[kyrian@enkil ~]$

Comment 14 Kev 'Kyrian' Green 2008-11-29 17:33:35 UTC
Aha, we might be getting somewhere.

I switched my 2 CD-ROM drives over. They were configured as master on the first plug of the ATA lead (newer 80-wire type) and slave on the second plug. I thought it was not supposed to make a difference, but having master on the far plug and slave on the near plug seems to have made a positive difference to the post-reboot-drive-lockup situation.

That is secondary news I think?

However, I just tried playing some content using mplayer instead of firefox, and while firefox was running in the background, I got a crash as described above (no cross-hatched picture though, I stopped it too early), but without firefox running and just mplayer going I got a similar thing (well before you'd expect a screensaver to fire up, by the way).

What's interesting is that in the latter case (mplayer, but not firefox), I could CTRL+ALT+F1 to get to a text console, and CTRL+ALT+F7 to get back again, and regain my X display as normal after a few seconds.

At which point I get the following interesting stuff at the end of 'dmesg':

[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized radeon 1.29.0 20080528 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm] Setting GART location based on new memory map
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 2 usecs
SysRq : HELP : loglevel0-8 reBoot Crashdump tErm Full kIll saK aLlcpus showMem Nice powerOff showPc show-all-timers(Q) unRaw Sync showTasks Unmount shoW-blocked-tasks 
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm] Loading R200 Microcode
[ ... the last four lines repeat several times, presumably once for each flip between "display" 1 and 8 ]

The SysRq thing was something I did myself when it first crashed out, but I didn't make the kernel do anything with it, IIRC.

Nothing else different/significant shows up in any of the other log files that have been requested before.

When I test the same CTRL+ALT+FX switching in a normal situation I get those four lines repeated, however, so it may not be at all significant?
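
As an aside on the SysRq HELP line in that dmesg output: the capital letter in each word is the key that triggers the action, which is where the S, U, B sequence from comment 4 comes from (Sync, Unmount, reBoot). A small sketch that decodes the line this way; the function name is made up for illustration, and the `loglevel0-8` entry is deliberately skipped because it has no capital letter.

```python
def sysrq_keys(help_line):
    """Map each SysRq key to its action word, using the capital letter in the word."""
    _, _, actions = help_line.partition("HELP :")
    keys = {}
    for word in actions.split():
        capitals = [ch for ch in word if ch.isupper()]
        if capitals:  # 'loglevel0-8' has no capital and is skipped
            keys[capitals[0].lower()] = word
    return keys
```

Fed the HELP line quoted above, the keys `s`, `u` and `b` map to `Sync`, `Unmount` and `reBoot` respectively.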

PS.

[kyrian@enkil ~]$ cat /proc/version 
Linux version 2.6.26.6-79.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Fri Oct 17 14:52:14 EDT 2008
[kyrian@enkil ~]$

Comment 15 Kev 'Kyrian' Green 2008-11-29 18:29:54 UTC
Looking at the other bugs in Fedora/xorg-x11-drv-ati bugs...

I think this is probably a duplicate of bug #365691, similar to, but not duplicate of #219632.

It may also relate to #446398 & #448855.

I read vaguely somewhere that this was related to 'high contrast' situations, so I'm going to try running with no desktop background, as mine are usually pretty high contrast. Might help, might not, doesn't hurt either way.

Looking up stuff about forcing the AGP mode I found that there is no kernel option for it (only on/off/try_unsupported), but for Xorg there are these, so I'll try the latter's AGPMode suggestion (start with 2 and work down if needed):

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=467460
http://www.x.org/wiki/radeon

I'll try with the VESA driver in xorg.conf, if that doesn't help the situation.

This is a home machine, so if I can keep it stable enough to get my *ahem* infernal VAT return done, I'll be able to test various scenarios if needed.

K.

Comment 16 Kev 'Kyrian' Green 2008-11-29 19:04:29 UTC
Apologies for yet more emails on this, but the version difference here is certainly curious, even if it's just cosmetic:

[kyrian@enkil ~]$ rpm -q xorg-x11-drv-ati
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@enkil ~]$ grep -A5 /usr/lib/xorg/modules/driver ~kyrian/lockup-fry/Xorg.0.log.post-lockup-fry 
(II) Loading /usr/lib/xorg/modules/drivers//radeon_drv.so
(II) Module radeon: vendor="X.Org Foundation"
        compiled for 1.4.99.905, module version = 6.9.0
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 4.1
(II) LoadModule: "mouse"
[kyrian@enkil ~]$ rpm -qf /usr/lib/xorg/modules/drivers//radeon_drv.so
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@enkil ~]$ rpm -q --changelog xorg-x11-drv-ati-6.8.0-19.fc9.i386|head -3
* Wed Jul 30 2008 Dave Airlie <airlied> 6.8.0-19
- Update to latest upstream release + fixes
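
To make the comparison above repeatable, here is a minimal sketch that pulls the two version strings out of the X log text and the rpm package name. The function names are hypothetical, and the log pattern is based only on the excerpt shown here, so other log layouts may need adjusting.

```python
import re


def module_version(xorg_log, module):
    """Return the 'module version' the X log reports for `module`, or None."""
    # Non-greedy match from the module's header line to the next version field.
    # Caveat: if that module had no version line, a later module's could match.
    pattern = re.compile(
        r"\(II\) Module " + re.escape(module) + r":.*?module version = ([\d.]+)",
        re.DOTALL,
    )
    match = pattern.search(xorg_log)
    return match.group(1) if match else None


def package_version(nvr):
    """Extract the version from an rpm name of the form name-version-release.arch."""
    parts = nvr.rsplit("-", 2)  # [name, version, release.arch]
    return parts[1] if len(parts) == 3 else None
```

With the output above, `module_version(log, "radeon")` gives `6.9.0` while `package_version("xorg-x11-drv-ati-6.8.0-19.fc9.i386")` gives `6.8.0`, which is the discrepancy being pointed out. The changelog's "latest upstream release + fixes" might explain it, but that is a guess.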

Apropos, with the below AGPMode change, things once again seem more stable, although I've said that before, and been proved wrong moments later.

Section "Device"
	Identifier  "Videocard0"
	Driver      "radeon"
	# Bug fiddling, https://bugzilla.redhat.com/show_bug.cgi?id=472315
	#Driver      "vesa"
	Option	"AGPMode" "2"
	# END: Bug fiddling...
	VendorName  "Videocard vendor"
	BoardName   "ATI Technologies Inc M9+ 5C63 [Radeon Mobility 9200 (AGP)]"
EndSection

[kyrian@enkil ~]$ dmesg | grep -i agp
Linux agpgart interface v0.103
agpgart: Detected an Intel 845G Chipset.
agpgart: AGP aperture is 256M @ 0xb0000000
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 2x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 2x mode
[kyrian@enkil ~]$ 
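
The check just done by eye (does the mode the kernel negotiated match the AGPMode set in xorg.conf?) can be sketched as follows. The function names are illustrative, and the parsing is an assumption based only on the config and dmesg excerpts in this comment; in particular it matches the option name case-sensitively, which a real xorg.conf parser would not.

```python
import re


def configured_agp_mode(xorg_conf):
    """Return the AGPMode option from xorg.conf text as an int, or None if unset."""
    for line in xorg_conf.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out settings
        match = re.match(r'Option\s+"AGPMode"\s+"(\d+)"', stripped)
        if match:
            return int(match.group(1))
    return None


def negotiated_agp_modes(dmesg):
    """Return {pci_address: mode} from 'Putting AGP ... into Nx mode' lines.

    Later lines overwrite earlier ones, so the result reflects the last switch.
    """
    modes = {}
    for device, mode in re.findall(
        r"agpgart: Putting AGP V\d device at (\S+) into (\d+)x mode", dmesg
    ):
        modes[device] = int(mode)
    return modes
```

If every value returned by `negotiated_agp_modes` equals `configured_agp_mode`, the override has taken effect, as it appears to have done here (2x on both devices).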

K.

Comment 17 Kev 'Kyrian' Green 2008-11-29 20:59:16 UTC
It looks like it'll work, so I'd be happy to try upgrading just these packages in isolation, and holding off on the rest of FC10 to see if that fixes things without the 'AGPMode' tweak above, for instance...

[root@enkil ~]# rpm -Uivh --test xorg-x11-drv-ati-6.9.0-54.fc10.i386.rpm  libdrm-2.4.0-0.21.fc10.i386.rpm  libdrm-devel-2.4.0-0.21.fc10.i386.rpm 
warning: xorg-x11-drv-ati-6.9.0-54.fc10.i386.rpm: Header V3 DSA signature: NOKEY, key ID 4ebfc273
Preparing...                ########################################### [100%]
[root@enkil ~]# 

K.

Comment 18 Kev 'Kyrian' Green 2008-12-09 11:44:45 UTC
It's all gone quiet...

I thought it might help to have a "similar" example where it has never been observed to happen, even without AGP tweaks in xorg.conf. This is my 'office' machine:

[kyrian@mybox ~]$ dmesg | grep '\(drm\|agp\)'
Linux agpgart interface v0.103
agpgart: Detected AGP bridge 0
agpgart: AGP aperture is 256M @ 0xd0000000
[drm] Initialized drm 1.1.0 20060810
[drm] Initialized radeon 1.29.0 20080528 on minor 0
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] Num pipes: 1
[drm] writeback test succeeded in 1 usecs
[drm] Num pipes: 1

...

[kyrian@mybox ~]$ cat /etc/fedora-release 
Fedora release 9 (Sulphur)
[kyrian@mybox ~]$ rpm -q xorg-x11-drv-ati
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@mybox ~]$ /sbin/lspci | grep -i ATI
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AS [Radeon 9550]
01:00.1 Display controller: ATI Technologies Inc RV350 AS [Radeon 9550] (Secondary)
[kyrian@mybox ~]$ grep -i agp /etc/X11/xorg.conf
[kyrian@mybox ~]$ 

I guess I could manage to hang around in the office one evening with some beer to try and break it with iPlayer if that would help folks.

Comment 19 Bug Zapper 2009-06-10 03:18:42 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 20 Bug Zapper 2009-07-14 16:46:45 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.