Bug 472315
Summary: | BBC iPlayer "Stephen Fry in America: Pacific" nastily hangs FC9 system. | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Kev 'Kyrian' Green <kyrian> |
Component: | xorg-x11-drv-ati | Assignee: | Dave Airlie <airlied> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 9 | CC: | gecko-bugs-nobody, mcepl, walters, xgl-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
URL: | http://www.bbc.co.uk/iplayer/episode/b00flx59 | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-07-14 16:46:45 UTC | Type: | --- |
Attachments: |
Description
Kev 'Kyrian' Green
2008-11-20 00:24:19 UTC
Created attachment 324130 [details]
a script for collecting debug information
Thanks for the bug report. We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.
When you reboot the machine (if possible to the runlevel 1, so that logs are not overwritten that much), there should be some files on the hard drive (that is, if they got written before computer crashed), which could help us tremendously.
Please attach your X server config file (/etc/X11/xorg.conf) and X server log file (/var/log/Xorg.*.log) to the bug report as individual uncompressed file attachments using the bugzilla file attachment link below.
Could you please also try to run without any /etc/X11/xorg.conf whatsoever and let X11 autodetect your display and video card? Attach to this bug /var/log/Xorg.0.log from this attempt as well, please.
Also, could we get output of the command
rpm -qa *xulrun* *firefox* *mozilla* *flash* *plugin*
Please also install firefox-debuginfo (debuginfo-install is from
yum-utils package).
debuginfo-install firefox
and then look for the file named core.(some number) (aka coredump) in the directory where firefox was run from?
If you run the attached script to process this coredump like
gbt firefox core.<number>
you should get a file firefox-backtrace-<number>.txt Please attach this file as well.
We will review this issue again once you've had a chance to attach this information.
Thanks in advance.
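One practical note on the package query requested above: as written, the globs are unquoted, so the shell may expand them against files in the current directory before rpm ever sees them. A minimal sketch of two ways around that follows. The helper name `relevant_packages` is hypothetical and is not part of the attached script; it simply filters a package list on stdin by the same five substrings.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not from the
# attached debug script): read package names on stdin and keep only
# those containing one of the five substrings from the request above.
relevant_packages() {
    grep -E 'xulrun|firefox|mozilla|flash|plugin' | sort
}

# Usage on a system with rpm installed (left commented out here):
#   rpm -qa | relevant_packages
# or quote the globs so the shell passes them to rpm untouched:
#   rpm -qa '*xulrun*' '*firefox*' '*mozilla*' '*flash*' '*plugin*'
```

Either form should return the same kind of list; the pipe variant is only convenient if further filtering is wanted afterwards.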
Internet problems at home so I can't really get you all this information right now, but I will do once it's back up and running.
I have taken copies of a variety of log files including Xorg.0.log from dropping into single user mode after a crash, which are good to go once I can get online.
The video card is a Radeon 9600 or thereabouts, which I'm pretty sure is quite well supported and recognised, so I don't think it can be a mis-recognition issue.
However it seems worth mentioning that this is now happening periodically even when firefox is not running, once or twice even before I had logged in to X Windows.
Oh, and I let memtest86+ do its thing for about half an hour on the system, and it reported no errors.
That's all for now I'm afraid. Apologies for the potentially unhelpful messages you will be getting.

Happened again this morning with this BBC URL:
http://news.bbc.co.uk/newsbeat/hi/entertainment/newsid_7753000/7753638.stm
And new kernel package "kernel-2.6.27.5-41.fc9.i686", so I'm reverting grub back (again) to using "2.6.26.6-79.fc9.i686" for now, to preserve some of my own sanity.
Hmm, perhaps more than you bargained for from "rpm -qa *xulrun* *firefox* *mozilla* *flash* *plugin*" ;-)
java-1.6.0-openjdk-plugin-1.6.0.0-0.18.b09.fc9.i386
gstreamer-plugins-ugly-0.10.8-2.fc9.i386
anaconda-yum-plugins-1.0-1.fc9.noarch
xulrunner-1.9.0.4-1.fc9.i386
mozilla-filesystem-1.9-2.fc9.i386
gstreamer-plugins-pulse-0.9.5-0.5.svn20070924.fc9.i386
libflashsupport-000-0.5.svn20070904.i386
kipi-plugins-0.1.5-2.fc9.i386
setroubleshoot-plugins-2.0.11-1.fc9.noarch
vamp-plugin-sdk-1.1b-4.fc9.i386
gstreamer-plugins-flumpegdemux-0.10.15-2.fc9.i386
xfce4-mailwatch-plugin-1.1.0-1.fc9.i386
gstreamer-plugins-bad-0.10.7-4.fc9.i386
gutenprint-plugin-5.0.2-2.fc9.i386
alsa-plugins-jack-1.0.16-4.fc9.i386
gstreamer-plugins-farsight-0.12.7-2.fc9.i386
xfce-mcs-plugins-4.4.2-4.fc9.i386
flash-plugin-10.0.12.36-release.i386
mythplugins-0.21-13.fc9.i386
gstreamer-plugins-good-0.10.8-8.fc9.i386
thunar-archive-plugin-0.2.4-5.fc9.i386
gstreamer-plugins-base-0.10.19-4.fc9.i386
gstreamer-plugins-base-devel-0.10.19-4.fc9.i386
amsn-plugins-0.97.2-1.fc9.i386
totem-mozplugin-2.23.2-8.fc9.i386
alsa-plugins-pulseaudio-1.0.16-4.fc9.i386
nspluginwrapper-1.1.2-2.fc9.i386
firefox-3.0.4-1.fc9.i386
I have taken the liberty of excluding:
ladspa-*-plugins-*
audacious-plugins-*
As I am pretty sure they are unrelated PlanetCCRMA packages, and they halve the size of the list you'd be dealing with.
I'm already very late for work, thanks to various things, so I'll just attach the requested files and leave it at that until this evening, or tomorrow.
Oh, it's worth noting that firefox doesn't crash, I have to use CTRL+ALT+PRINT SCR+{S,U,B} to get the system out of the spin it gets into, so there might not be a core dump at all. I will install the debuginfo package anyway, but I don't hold out much hope of there being a core dump event.
Created attachment 324970 [details]
Xorg.conf file from crashy machine.
As requested, this is the 'normal running' Xorg.conf, rather than a barebones conf to force graphics card detection.
Created attachment 324973 [details]
Xorg.0.log after locking up stephen fry ;-)
Hmm, I don't see anything suspicious here -- could we get /var/log/dmesg and /var/log/messages as well?
> Hmm, I don't see anything suspicious here -- could we get /var/log/dmesg and
> /var/log/messages as well?
Y'know, I thought you might say that; if it were something that obvious I would probably have solved it myself ;-)
I took copies of the following files (based on their respective timestamps being close to the current time) after a crash and going into single user mode:
dmesg.old-post-lockup-fry
maillog-post-lockup-fry
messages-post-lockup-fry
secure-post-lockup-fry
wtmp-post-lockup-fry
cron-post-lockup-fry
dmesg-post-lockup-fry
fry-lockup.txt
dot-xsession-errors-post-lockup-fry
Xorg.0.log.post-lockup-fry
Of which I will attach dmesg and messages momentarily.
K.
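The copy-after-a-crash step described above can be scripted so that it is quick to repeat from single user mode. This is only a sketch under assumptions: the function name `snapshot_logs`, its argument order, and the `NAME-TAG` naming are invented here for illustration, loosely following the file names listed in the comment.

```shell
#!/bin/sh
# Hypothetical helper: snapshot_logs SRC_DIR DEST_DIR TAG FILE...
# Copies each named file that exists in SRC_DIR to DEST_DIR/FILE-TAG,
# preserving timestamps, and silently skips names that are missing.
snapshot_logs() {
    src=$1
    dest=$2
    tag=$3
    shift 3
    mkdir -p "$dest" || return 1
    for name in "$@"; do
        if [ -f "$src/$name" ]; then
            cp -p "$src/$name" "$dest/$name-$tag"
        fi
    done
}

# Example invocation (paths are assumptions, left commented out):
#   snapshot_logs /var/log "$HOME/lockup-fry" post-lockup-fry dmesg messages Xorg.0.log
```

Using `cp -p` keeps the original modification times, which matters here because the files were chosen by how close their timestamps were to the lockup.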
Created attachment 325051 [details]
dmesg log file
Created attachment 325052 [details]
messages log post crash.
Created attachment 325055 [details]
Xorg.0.log after locking up stephen fry, then trying without an xorg.conf.
Woah, the resolution is really sharp, so much so that it is making my eyes go a bit funny. Maybe I will tweak it but keep this type of configuration ;-)
Anyway, as requested, I used RCS to check in my config into the relevant sub-directory (without keeping a working copy, before you ask), so I don't lose it, but it's definitely not where Xorg has found it.
[root@enkil ~]# ls -als /etc/X11/
applnk/ mwm/ rstart/ X Xmodmap
fontpath.d/ prefdm twm/ xdm/ Xresources
fs/ RCS/ wmconfig/ xinit/ xsm/
[root@enkil ~]#
I've had the same lock-up a couple of times today while working on this bug, btw.
(In reply to comment #11)
> Woah, the resolution is really sharp, so much so that it is making my eyes go a
> bit funny. Maybe I will tweak it but keep this type of configuration ;-)
Just not sure if you noticed, but the autogenerated xorg.conf is now part of your Xorg.0.log. You can use it as a basis for your further tweaking.
> Y'know, I thought you might say that; If it were something that obvious
> I would probably have solved it myself ;-)
I don't see through bugzilla, so I don't know how experienced a Linux geek you are. Anyway, passing to developers.

Hmmm... I had been having trouble determining whether this other issue was a symptom or a cause of the above. I have a feeling I should have mentioned it earlier, though. At the time of reporting this bug it seemed like it was a symptom, but now I'm not so sure. It's aggravating trying to do my VAT return with the damn thing crashing. That's for sure.
The problem resurfaced, but with a graphics slow-down (no cross-hatched blank screen though, typically that would happen if I left the thing in its b0rked state for a minute or so) and then a hang, resulting in my needing to do SYSRQ stuff to get out of it. It was with this kernel:
Linux version 2.6.26.6-79.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Fri Oct 17 14:52:14 EDT 2008
However when the system rebooted it would not recognise any of the hard disks at first, and then only the CD-ROM. I re-seated cables, and powered down for a while etc. just in case and eventually it recognised them again.
I don't know if this is because restarting using the SYSRQ stuff leaves the drives in an odd state that upsets the BIOS etc., or even whether this is the cause of the problem rather than a symptom. I would assume a system could survive a fair while without disk access though. Also this BIOS disk lockup only happens in the minority of cases, so I doubt it?
Either way, the motherboard says between its PCI slots it is "MS 6391 VER 1", and BIOS reports itself as "W639 1IF1 V1.4 051402", so I'm going to use that to hunt for any BIOS updates in case it could be that sort of thing.
For reference:
[kyrian@enkil ~]$ lsscsi
[0:0:0:0] disk ATA WDC WD1200BB-00G 08.0 /dev/sda
[0:0:1:0] disk ATA SAMSUNG SP1614N TM10 /dev/sdb
[1:0:0:0] cd/dvd LITE-ON LTR-16101B TS0N /dev/sr0
[1:0:1:0] cd/dvd JLMS XJ-HD165H CH0Z /dev/sr1
[kyrian@enkil ~]$

Aha, we might be getting somewhere.
I switched my 2 CD-ROM drives over, they were configured as master on the first plug of the ATA lead (newer 80-wire type), and slave on the second plug. I thought it was not supposed to make a difference, but with master on the far plug, and slave on the near plug seems to have made a positive difference to the post-reboot-drive-lockup situation. That is secondary news I think?
However, I just tried playing some content using mplayer instead of firefox, and while firefox was running in the background, I got a crash as described above (no cross-hatched picture though, I stopped it too early), but without firefox running and just mplayer going I got a similar thing (well before you'd expect a screensaver to fire up, by the way).
What's interesting is that in the latter case (mplayer, but not firefox), I could CTRL+ALT+F1 to get to a text console, and CTRL+ALT+F7 to get back again, and regain my X display as normal after a few seconds. At which point I get the following interesting stuff at the end of 'dmesg':
[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized radeon 1.29.0 20080528 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm] Setting GART location based on new memory map
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 2 usecs
SysRq : HELP : loglevel0-8 reBoot Crashdump tErm Full kIll saK aLlcpus showMem Nice powerOff showPc show-all-timers(Q) unRaw Sync showTasks Unmount shoW-blocked-tasks
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm] Loading R200 Microcode
[ ... the last four lines repeat several times, presumably once for each flip between "display" 1 and 8 ]
The SysRq thing was something I did myself when it first crashed out, but I didn't make the kernel do anything with it, IIRC.
Nothing else different/significant shows up in any of the other log files that have been requested before.
When I test the same CTRL+ALT+FX switching in a normal situation I get those four lines repeated, however, so it may not be at all significant?
PS.
[kyrian@enkil ~]$ cat /proc/version
Linux version 2.6.26.6-79.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Fri Oct 17 14:52:14 EDT 2008
[kyrian@enkil ~]$

Looking at the other bugs in Fedora/xorg-x11-drv-ati bugs...
I think this is probably a duplicate of bug #365691, similar to, but not duplicate of #219632. It may also relate to #446398 & #448855.
I read vaguely somewhere that this was related to 'high contrast' situations, so I'm going to try running with no desktop background, as mine are usually pretty high contrast. Might help, might not, doesn't hurt either way.
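The agpgart lines in the dmesg excerpt above record the AGP rate the kernel negotiated (4x on this machine). A small sketch for pulling that value out of kernel log text is given below, so the rate can be compared across reboots or machines. The function name `agp_rates` is made up for this example, and the pattern assumes the message wording shown in this report, with one message per line as dmesg normally prints them.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print each
# distinct AGP rate found in "Putting AGP V<n> device at <addr> into
# <rate>x mode" messages, one rate per line (for example "4x").
agp_rates() {
    sed -n 's/.*agpgart: Putting AGP V[0-9]* device at [0-9a-f:.]* into \([0-9]*\)x mode.*/\1x/p' | sort -u
}

# Usage on a live system (left commented out here):
#   dmesg | agp_rates
```

If more than one rate is printed, the log covers more than one negotiation, for instance before and after a configuration change.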
Looking up stuff about forcing the AGP mode I found that there is no kernel option for it (only on/off/try_unsupported), but for Xorg there are these, so I'll try the latter's AGPMode suggestion (start with 2 and work down if needed):
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=467460
http://www.x.org/wiki/radeon
I'll try with the VESA driver in xorg.conf, if that doesn't help the situation.
This is a home machine, so if I can keep it stable enough to get my *ahem* infernal VAT return done, I'll be able to test various scenarios if needed.
K.

Apologies for yet more emails on this, but the version difference here is certainly curious, even if it's just cosmetic:
[kyrian@enkil ~]$ rpm -q xorg-x11-drv-ati
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@enkil ~]$ grep -A5 /usr/lib/xorg/modules/driver ~kyrian/lockup-fry/Xorg.0.log.post-lockup-fry
(II) Loading /usr/lib/xorg/modules/drivers//radeon_drv.so
(II) Module radeon: vendor="X.Org Foundation"
compiled for 1.4.99.905, module version = 6.9.0
Module class: X.Org Video Driver
ABI class: X.Org Video Driver, version 4.1
(II) LoadModule: "mouse"
[kyrian@enkil ~]$ rpm -qf /usr/lib/xorg/modules/drivers//radeon_drv.so
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@enkil ~]$ rpm -q --changelog xorg-x11-drv-ati-6.8.0-19.fc9.i386|head -3
* Wed Jul 30 2008 Dave Airlie <airlied> 6.8.0-19
- Update to latest upstream release + fixes
Apropos, with the below AGPMode change, things once again seem more stable, although I've said that before, and been proved wrong moments later.
Section "Device"
Identifier "Videocard0"
Driver "radeon"
# Bug fiddling, https://bugzilla.redhat.com/show_bug.cgi?id=472315
#Driver "vesa"
Option "AGPMode" "2"
# END: Bug fiddling...
VendorName "Videocard vendor"
BoardName "ATI Technologies Inc M9+ 5C63 [Radeon Mobility 9200 (AGP)]"
EndSection
[kyrian@enkil ~]$ dmesg | grep -i agp
Linux agpgart interface v0.103
agpgart: Detected an Intel 845G Chipset.
agpgart: AGP aperture is 256M @ 0xb0000000
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 2x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 2x mode
[kyrian@enkil ~]$
K.

It looks like it'll work, so I'd be happy to try upgrading just these packages in isolation, and holding off on the rest of FC10 to see if that fixes things without the 'AGPMode' tweak above, for instance...
[root@enkil ~]# rpm -Uivh --test xorg-x11-drv-ati-6.9.0-54.fc10.i386.rpm libdrm-2.4.0-0.21.fc10.i386.rpm libdrm-devel-2.4.0-0.21.fc10.i386.rpm
warning: xorg-x11-drv-ati-6.9.0-54.fc10.i386.rpm: Header V3 DSA signature: NOKEY, key ID 4ebfc273
Preparing... ########################################### [100%]
[root@enkil ~]#
K.

It's all gone quiet...
I thought it might help to have a "similar" example of where it has never been observed to happen, even without AGP tweaks in xorg.conf, which is my 'office' machine:
[kyrian@mybox ~]$ dmesg | grep '\(drm\|agp\)'
Linux agpgart interface v0.103
agpgart: Detected AGP bridge 0
agpgart: AGP aperture is 256M @ 0xd0000000
[drm] Initialized drm 1.1.0 20060810
[drm] Initialized radeon 1.29.0 20080528 on minor 0
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] Num pipes: 1
[drm] writeback test succeeded in 1 usecs
[drm] Num pipes: 1
...
[kyrian@mybox ~]$ cat /etc/fedora-release
Fedora release 9 (Sulphur)
[kyrian@mybox ~]$ rpm -q xorg-x11-drv-ati
xorg-x11-drv-ati-6.8.0-19.fc9.i386
[kyrian@mybox ~]$ /sbin/lspci | grep -i ATI
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AS [Radeon 9550]
01:00.1 Display controller: ATI Technologies Inc RV350 AS [Radeon 9550] (Secondary)
[kyrian@mybox ~]$ grep -i agp /etc/X11/xorg.conf
[kyrian@mybox ~]$
I guess I could manage to hang around in the office one evening with some beer to try and break it with iPlayer if that would help folks.

This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.