Bug 634239
Summary: | black bars representing system messages always overlay current tty1 Xorg display
---|---
Product: | [Fedora] Fedora
Reporter: | Jon Masters <jcm>
Component: | kernel
Assignee: | Kernel Maintainer List <kernel-maint>
Status: | CLOSED CURRENTRELEASE
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | medium
Priority: | low
Version: | 19
CC: | anton, atswartz, dougsland, gansalmon, itamar, jcm, jforbes, jonathan, kernel-maint, madhu.chinakonda, xgl-maint
Target Milestone: | ---
Target Release: | ---
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Story Points: | ---
Last Closed: | 2013-04-05 16:38:34 UTC
Type: | ---
Description
Jon Masters
2010-09-15 15:42:40 UTC
Logging out and in again results in Xorg running on tty7. It then has less corruption. But that is another annoying bug: the second time you log in after booting (this is not FUS - there is no second user, this is a logout/in again) you now get Xorg running on tty7 without it being told explicitly to run on tty1. So you will need to reproduce this by booting rawhide from scratch and observing what happens.

The second issue of X randomly moving to tty7 on the second login (after logging out from the first login), and being started without being told to run on a specific tty, is here (and you'll likely re-assign that one to gdm): https://bugzilla.redhat.com/show_bug.cgi?id=634299

Nevertheless, the corruption of the display by random characters on tty1 is still true and the concern of this bug.

Jon.

I think this is kernel (logging). Reason being that I had checked that only Xorg had the tty1 device node open and moved the device, but neglected to consider that kernel logging might be screwed up. And with DRM debugging turned on, I now see the predictable once-per-5-seconds kernel thread kicking off to probe the external device VGA connector, causing screen corruption. So, either the kernel is outputting this crap to tty1 regardless, or rsyslogd is failing.

The kernel is: 2.6.36-0.21.rc4.git1.fc15.x86_64
The rsyslogd is: rsyslog-4.6.3-2.fc15.x86_64

Jon.

Here, you can see the gdm login prompt is nicely corrupted by the flowing "text": http://www.flickr.com/photos/jonmasters/4994556587/

Jon.

The two videos walk you through the problems. I explicitly set console=tty9 and I still see two lines of black bars appear on the Xorg display, repeatably, once every 5 seconds, which is consistent with the two lines output by drm.debug=0x04 debugging. So, I think the kernel is outputting crap on tty1 no matter what, and rsyslogd is running, and it has klog open. Hmmm. Anyone?
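The check described above (which processes hold tty1 open, and whether rsyslogd has the kernel log open) can be repeated by walking the per-process `fd` directories under `/proc`. The following is only a minimal Python sketch of that idea, not the method actually used in this report. The function name `find_holders` and the choice of `/proc/kmsg` as the "klog" path are assumptions made for illustration, and it needs root to see other users' processes.

```python
import os


def find_holders(targets, proc_root="/proc"):
    """Map each target path to the (pid, command) pairs that have it open.

    Relies on the Linux convention that /proc/<pid>/fd/<n> is a symlink
    to the file behind descriptor n.
    """
    wanted = set(targets)
    holders = {t: [] for t in targets}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return holders  # no procfs available (e.g. not Linux)
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            name = "?"
        seen = set()
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            # Report each process once per target, even if it holds
            # several descriptors on the same file.
            if link in wanted and link not in seen:
                seen.add(link)
                holders[link].append((int(entry), name))
    return holders


if __name__ == "__main__":
    # Paths are illustrative: the VT under suspicion and the kernel log.
    for target, procs in find_holders(["/dev/tty1", "/proc/kmsg"]).items():
        print(target, procs if procs else "(no holders found)")
```

Note that this only sees userspace processes. The kernel's own console output does not go through a file descriptor, so an empty result for `/dev/tty1` would not rule out the kernel writing there.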
I was getting this same behavior in addition to: Wrong permissions on /dev/dri https://bugzilla.redhat.com/show_bug.cgi?id=626559 with two different ati cards 3870 & 2600. I fixed it by compiling my own kernels.

Confirmed that the problem is kernel oops data being displayed, but replaced by black bars. I have been able to boot this laptop only once wherein the actual oops data displayed correctly - attaching a screenshot with my custom 2.6.36-rc1 kernel build running. There are a series of nasty oopses related to this for which I am attaching the dmesg logs. So this bug is still that we're overlaying the X session with the oops data. That's actually a good thing now we have KMS (especially if we could do it Solaris-style with color) but it's not working. Typically, the user sees only weird black bars overwriting the display that disappear when windows are moved over them. I suppose it's obvious now, but it wasn't obvious at first.

Jon.

Created attachment 448168: dmesg.txt

(In reply to comment #9)
> I was getting this same behavior in addition to: Wrong permissions on /dev/dri
> https://bugzilla.redhat.com/show_bug.cgi?id=626559
> with two different ati cards 3870 & 2600. I fixed it by compiling my own
> kernels.

With different kernel config options that you didn't specify.

Created attachment 448560: working config

I adapted a config that I use on another distro and there are many changes, so I am not sure that this will be of much use.
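When two kernel configs differ in many places, listing only the options whose values changed makes the comparison easier to narrow down. The following is a minimal Python sketch of such a comparison, assuming the usual `.config` syntax (`CONFIG_FOO=value` and `# CONFIG_FOO is not set`). The function names and the command-line usage are hypothetical, and an option missing from one file is treated the same as "not set", which is a simplification.

```python
import re
import sys

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse kernel .config text into {option: value}; None means "not set"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = None
    return options


def diff_configs(old, new):
    """Return sorted (option, old_value, new_value) tuples for options that differ.

    Assumption: an option absent from a file is equivalent to "not set".
    """
    changes = []
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a != b:
            changes.append((name, a, b))
    return changes


def main(argv):
    """Print the differences between two config files given as arguments."""
    with open(argv[1]) as f:
        old = parse_config(f.read())
    with open(argv[2]) as f:
        new = parse_config(f.read())
    for name, a, b in diff_configs(old, new):
        print(f"{name}: {a if a is not None else 'not set'} -> "
              f"{b if b is not None else 'not set'}")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

It could be run as, say, `python3 cfgdiff.py working.config rawhide.config` (both file names are placeholders). The kernel source tree also ships `scripts/diffconfig` for the same purpose.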
Yea. I'd love to know. There's also a long-standing RCU check failure in lockdep on boot, but that seems unrelated. Worst case, I'll bisect this. But I'm going to try Linus' latest RC first, in case it has gone away in the latest GPU updates. If it has, the bisect is hopefully smaller the other way to bisect from working to broken. We'll see. I'll keep you informed Chuck.

btw, this kernel is also horribly unstable in general. The box falls over after an hour or two and requires a hard reset. Various different oopses, panics, lockup warnings, you name it. I'll perhaps also try a config without lockdep and debugging options enabled. If it continues, I'll need to let Linus know.

Jon.

Ah, the config was attached but I hadn't refreshed this BZ. I'll look at it tomorrow.

https://bugzilla.redhat.com/show_bug.cgi?id=626026 is the longstanding one that always hits this box with these kernels too.

(In reply to comment #15)
> btw, this kernel is also horribly unstable in general. The box falls over after
> an hour or two and requires a hard reset. Various different oopses, panics,
> lockup warnings, you name it. I'll perhaps also try a config without lockdep
> and debugging options enabled. If it continues, I'll need to let Linus know.

It's been totally stable for me.

(In reply to comment #13)
> (In reply to comment #9)
> > I was getting this same behavior in addition to: Wrong permissions on /dev/dri
> > https://bugzilla.redhat.com/show_bug.cgi?id=626559
> > with two different ati cards 3870 & 2600. I fixed it by compiling my own
> > kernels.
>
> With different kernel config options that you didn't specify.

Sorry about the delay, although there was a config for 2.6.36-0.18 on the linked bug report the whole time.

When I run the rawhide kernel on F13 I don't see any of those video artifacts.

Still happening with latest Linus RC. I was wrong about the instability - that was when it was booting an older kernel. I think it's just logging a lot of oops/other crap.
I'll show you if you're in today as I'm headed in now. Are you around later this pm?

Jon.

I can see from running the latest RC that the oopses are gone. However, /var/log/messages is updated in time with the remaining corruption, so it's clearly a problem with the logging setup. I mean even for kernel it would happen if rsyslogd were for some reason still outputting on tty1 or its stdout. Attaching a screenshot immediately after the following landed in the log:

Sep 20 16:39:16 tonnant kernel: gnome-volume-co[1723]: segfault at 7fff4a574ff8 ip 0000003032a0faa4 sp 00007fff4a575000 error 6 in libgobject-2.0.so.0.2515.0[3032a00000+4e000]
Sep 20 16:39:17 tonnant abrt[2107]: saved core dump of pid 1723 (/usr/bin/gnome-volume-control-applet) to /var/spool/abrt/ccpp-1285015156-1723.new/coredump (30703616 bytes)
Sep 20 16:39:17 tonnant abrtd: Directory 'ccpp-1285015156-1723' creation detected
Sep 20 16:39:19 tonnant abrtd: New crash /var/spool/abrt/ccpp-1285015156-1723, processing
Sep 20 16:39:19 tonnant abrtd: Registered Action plugin 'RunApp'
Sep 20 16:39:19 tonnant abrtd: RunApp('/var/spool/abrt/ccpp-1285015156-1723','test x"`cat component`" = x"xorg-x11-server-Xorg" && cp /var/log/Xorg.0.log .')
Sep 20 16:39:57 tonnant kernel: kworker/u:0 used greatest stack depth: 2976 bytes left

Jon.

(In reply to comment #20)
> When I run the rawhide kernel on F13 I don't see any of those video artifacts.

When I run the rawhide kernel (2.6.36-0.24) on F14, I do see the artifacts. Yet all the standard f14 kernels do not produce the artifacts.

(In reply to comment #15)
> I'll perhaps also try a config without lockdep
> and debugging options enabled. If it continues, I'll need to let Linus know.
>
> Jon.

You are on to something here. This difference may be enough.

< # CONFIG_LOCKUP_DETECTOR is not set
< # CONFIG_HARDLOCKUP_DETECTOR is not set
> CONFIG_LOCKUP_DETECTOR=y
> CONFIG_HARDLOCKUP_DETECTOR=y

(In reply to comment #25)
> You are on to something here. This difference may be enough.

< # CONFIG_LOCKUP_DETECTOR is not set
< # CONFIG_HARDLOCKUP_DETECTOR is not set
> CONFIG_LOCKUP_DETECTOR=y
> CONFIG_HARDLOCKUP_DETECTOR=y

No luck. Needs more changes. I went back to the config that was working.

We'll do some poking then I think.

this particular problem has been fixed for me in kernel-2.6.36-0.39.rc8.git0.fc15.x86_64.

This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19