| Summary: | [RV380] hang when X shuts down (regression) | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Pierre Ossman <ossman> |
| Component: | xorg-x11-drv-ati | Assignee: | Jérôme Glisse <jglisse> |
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 14 | CC: | airlied, jglisse, mcepl, xgl-maint |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | [cat:modesetting] | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-08-16 13:32:25 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Pierre Ossman
2011-03-22 21:42:52 UTC
(In reply to comment #0)
> The machine is completely hung and cannot even be reached over the network.
> Nothing in Xorg.0.log or messages when the machine is back up.

Even when you boot to runlevel 3? (add number 3 to the kernel command line)

Otherwise, please add drm.debug=0x04 to the kernel command line, restart computer, and attach

* your X server config file (/etc/X11/xorg.conf, if available),
* X server log file (/var/log/Xorg.*.log) from running in the runlevel 3,
* output of the dmesg command (anytime before the termination of X), and
* system log (/var/log/messages)

to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Ehm... not sure I understand you here. If I boot runlevel 3 then there won't be any X, so no Xorg.*.log.

I am getting hangs with the "good" driver now as well. It seems it has the bug as well, just that the newer was more easily provoked.

On that theory, starting just X (no clients) even with the "bad" driver does not provoke a hang. So it seems it needs to do some actual work before things go south...

I'll attach some files from runlevel 5 with that debug flag. Hopefully something makes it into the log files...

Created attachment 489204 [details]
dmesg.log
Created attachment 489205 [details]
messages
Created attachment 489206 [details]
Xorg.0.log
(In reply to comment #2)
> Ehm... not sure I understand you here. If I boot runlevel 3 then there won't be
> any X, so no Xorg.*.log.

You can get X in the runlevel 3 by running startx command as normal user with the advantage that you get back to the command line and not getting locked out from whole system. I am sorry for not explaining this completely.

I've been debugging the system remotely over ssh, so messing up the local console hasn't been an issue.

I did manage to run one test this morning though, and that was recompiling the mesa driver package without gallium. And after that I could not reproduce the hang. So it seems the issue is in the R300 Gallium driver.

I'll try netconsole when I get back home and see if I can get something before it locks up.

Sorry, got distracted and then completely forgot about this issue. The machine has been running just fine using the classic driver for a month now. Not a single hang.

Went back to gallium today to do some more testing. First, I upgraded the system fully so I'm running on a new kernel and newer Xorg now. No updates of mesa or the DDX though.

I set up netconsole and disabled NetworkManager's suspend script (so that I could see things as the machine is going down). I've also kept drm.debug=0x04.

I then tried to do some suspends. Most of the time, the machine locks up with the display running and valid output still on there. Nothing whatsoever is sent over netconsole.

Twice, I got it to suspend but locked up on resume instead. I got this in netconsole going down:

[ 184.676547] PM: Syncing filesystems ... done.
[ 184.677857] PM: Preparing system for mem sleep
[ 184.691980] [drm:drm_crtc_helper_set_config],
[ 184.691985] [drm:drm_crtc_helper_set_config], crtc: f64e4000 7 fb: f6715978 connectors: f658f850 num_connectors: 1 (x, y) (0, 0)
[ 184.692023] [drm:drm_crtc_helper_set_config], setting connector 13 crtc to f64e4000
[ 184.692040] [drm:drm_crtc_helper_set_config],
[ 184.692043] [drm:drm_crtc_helper_set_config], crtc: f64e3000 8 fb: f6715978 connectors: f658f860 num_connectors: 0 (x, y) (0, 0)
[ 184.692051] [drm:drm_crtc_helper_set_config], crtc has no fb, full mode set
[ 184.692054] [drm:drm_crtc_helper_set_config], setting connector 13 crtc to f64e4000
[ 184.692067] [drm:drm_crtc_helper_set_config],
[ 184.692069] [drm:drm_crtc_helper_set_config], crtc: f64e4000 7 fb: f6715978 connectors: f658f850 num_connectors: 1 (x, y) (0, 0)
[ 184.692077] [drm:drm_crtc_helper_set_config], setting connector 13 crtc to f64e4000
[ 184.692503] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 184.703073] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 184.714033] PM: Entering mem sleep
[ 184.714073] Suspending console(s) (use no_console_suspend to debug)

Nothing coming up unfortunately.

Just doing something simple, like stopping the X server through "telinit 3", results in the same hang with everything left on screen. Nothing using netconsole at that point either.

So I tried cranking up the debug level to 0xffff (sidenote: which crashed the nouveau driver on the receiving machine. graphics drivers really love me :/).
These are the last few lines that seem to be the same for every hang:

[ 161.461834] [drm:drm_ioctl], pid=1526, cmd=0xc0206466, nr=0x66, dev 0xe200, auth=1
[ 161.461973] [drm:drm_ioctl], pid=1526, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
[ 161.462345] [drm:drm_ioctl], pid=1526, cmd=0xc01c64a3, nr=0xa3, dev 0xe200, auth=1
[ 161.462385] [drm:drm_ioctl], pid=1526, cmd=0xc01c64a3, nr=0xa3, dev 0xe200, auth=1
[ 161.462429] [drm:drm_ioctl], pid=1526, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
[ 161.462462] [drm:drm_ioctl], pid=1526, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
[ 161.462533] [drm:drm_ioctl], pid=1526, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
[ 161.462580] [drm:drm_ioctl], pid=1526, cmd=0xc0206466, nr=0x66, dev 0xe200, auth=1
[ 161.462663] [drm:drm_ioctl], pid=1526, cmd=0xc01c64a3, nr=0xa3, dev 0xe200, auth=1
[ 161.463707] [drm:drm_ioctl], pid=1526, cmd=0x641f, nr=0x1f, dev 0xe200, auth=1

Problem seems to be related to releasing control with GLX stuff active. If I kill xbmc, X will shut down momentarily and you see the console flicker by. I do not seem to be able to hang things this way.

GLX commands still being executed and referencing freed memory or something nasty like that perhaps?

Dänzer asked me to test mismatch combinations of the DRI driver between the X server and the client application. This is the result:

Xorg     xbmc     Result
------------------------
Mesa     Gallium  Hang
Gallium  Mesa     No hang

Did some printk debugging of the kernel, and it makes it all the way through drm_dropmaster_ioctl() before it locks up. Not sure where to continue hunting at that point.

This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX.
(Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
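
Note for anyone picking this up against a later release: the cmd= values in the drm_ioctl debug lines quoted earlier in this report can be split into their fields by hand. Below is a minimal Python sketch of that decoding, assuming the generic Linux ioctl number layout used on x86 (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31). The names decode_ioctl and IoctlFields are made up for this illustration and are not part of any kernel or libdrm API.

```python
from typing import NamedTuple

# Assumption: generic Linux ioctl encoding as used on x86
# (bits 0-7 nr, 8-15 type, 16-29 size, 30-31 direction).
_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


class IoctlFields(NamedTuple):
    """Hypothetical container for the decoded fields of one ioctl number."""
    direction: str   # data direction as seen from user space
    size: int        # size in bytes of the argument structure
    type_char: str   # subsystem "magic" character ('d' for DRM)
    nr: int          # request number within the subsystem


def decode_ioctl(cmd: int) -> IoctlFields:
    """Split a raw ioctl command value into its four bit fields."""
    return IoctlFields(
        direction=_DIRECTIONS[(cmd >> 30) & 0x3],
        size=(cmd >> 16) & 0x3FFF,
        type_char=chr((cmd >> 8) & 0xFF),
        nr=cmd & 0xFF,
    )


if __name__ == "__main__":
    # The four distinct cmd= values seen in the netconsole output.
    for cmd in (0xC0206466, 0x40086409, 0xC01C64A3, 0x641F):
        f = decode_ioctl(cmd)
        print(f"{cmd:#010x}: dir={f.direction} size={f.size} "
              f"type={f.type_char!r} nr={f.nr:#04x}")
```

The decoded nr field should agree with the nr= value the kernel already prints on each line, which is a quick sanity check on the layout assumption. The last call before the hang, cmd=0x641f, decodes to a request with no argument data and nr 0x1f; if memory serves that is the drop-master request, which would fit the drm_dropmaster_ioctl() observation, but the mapping of nr values to request names should be checked against drm.h for the kernel actually in use.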