Bug 207746

Summary: rhgb is freezing computer
Product: [Fedora] Fedora
Reporter: cornel panceac <cpanceac>
Component: rhgb
Assignee: Ray Strode [halfline] <rstrode>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 6
CC: ajax, emeric.maschino, gnomeuser, jmtt, mharris, peter, than
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version: 0.16.4-1.fc6
Doc Type: Bug Fix
Last Closed: 2006-10-30 17:26:44 UTC

Description cornel panceac 2006-09-22 20:19:23 UTC
Description of problem:

while booting with rhgb, computer freezes
Version-Release number of selected component (if applicable):
rhgb-0.16.3-5.fc6

How reproducible:
pass rhgb as boot parameter to the kernel

Steps to Reproduce:
1. pass rhgb as boot parameter to the kernel
2. boot the machine
  
Actual results:
the computer freezes before displaying gdm


Expected results:
computer displays gdm and waits for user input

Additional info:

Comment 1 Mike A. Harris 2006-09-27 05:06:43 UTC
This is very unlikely to be a bug in rhgb itself, but more likely to be rhgb
usage being the catalyst to exposing bugs in the X server and/or video drivers
which would not ordinarily ever be seen in normal X server usage, but which
only surface when multiple X servers are started simultaneously on the same
video hardware - such as when rhgb is used.

This is not a new problem, however; it has been present in the OS ever since
rhgb was introduced.  If you experience this problem at all, it is recommended
to either:

1) Disable rhgb completely and/or uninstall it.

or

2) Crack out a debugger, attach it to the X server during rhgb-enabled
system startup, and start debugging why it is that the X server conks
out.

Upstream X.Org is pretty much totally uninterested in the issue, and I
can't say I blame them much either.  ;o)

Anyhow, until someone earning greenbacks is assigned to investigate this
issue directly and spend as much time debugging it as is necessary to fix
the problem - or until someone kindly volunteers to do so - it is extremely
unlikely to vanish mysteriously.

Until then - don't use 2 or more X servers simultaneously on the same box,
and do not use rhgb at all, period, to avoid the problem.
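
As a rough illustration of option 1, disabling rhgb amounts to removing the
"rhgb" argument from the kernel line in the boot loader configuration.  The
sketch below assumes a GRUB-style "kernel ..." line; the function name and the
sample line are hypothetical, and it only edits a string rather than touching
any real file.

```python
def strip_rhgb(kernel_line):
    """Return the kernel line with any standalone 'rhgb' argument removed.

    Leading indentation is preserved; runs of whitespace between the
    remaining arguments are collapsed to single spaces.
    """
    stripped = kernel_line.lstrip()
    indent = kernel_line[: len(kernel_line) - len(stripped)]
    tokens = [tok for tok in stripped.split() if tok != "rhgb"]
    return indent + " ".join(tokens)


# Hypothetical kernel line; the image name and root device are made up.
example = "\tkernel /vmlinuz ro root=/dev/sda1 rhgb quiet"
print(strip_rhgb(example))
```

The same effect can be had for a single boot by deleting "rhgb" from the
kernel arguments at the boot loader prompt.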



Comment 2 Peter Gordon 2006-09-27 05:47:19 UTC
Not to be rude, but this seems like a fairly large regression, as RHGB has
worked near-perfectly out of the proverbial box on my Radeon 9250 ever since I
purchased the card soon after the release of FC4. 

Potentially, people who may be thinking of trying Fedora on such hardware (where
earlier  releases worked, friends told them) would install FC6 and not have it
be able to even boot properly due to this issue. 

Is it too late in the release cycle to make RHGB not an installed-by-default
package due to this?

Comment 3 Émeric Maschino 2006-09-27 07:44:49 UTC
In my situation, the system is "locally frozen". But it can be accessed by ssh 
and /usr/sbin/gdm-restart as root solves the problem.

Comment 4 cornel panceac 2006-09-27 08:26:31 UTC
my video card is nvidia geforce fx 5200. the driver is the default (open
source), not the proprietary. i haven't checked the ssh connectivity yet.
mharris: there's a nice discussion on fedora-test-list between alan cox, ajax
and others on this subject. you may wanna join :)

Comment 5 Mike A. Harris 2006-09-27 11:10:35 UTC
(In reply to comment #2)
> Not to be rude, but this seems like a fairly large regression, as RHGB has
> worked near-perfectly out of the proverbial box on my Radeon 9250 ever since I
> purchased the card soon after the release of FC4. 

That's not rude. ;o)

Realistically, this isn't one single bug, but rather multiple bugs, as
it happens on a variety of hardware - mostly Intel, ATI and Nvidia, but
then those are the most popular hardware out there so it could be more
general than that.  Obviously it does not occur for _everyone_, however,
or nobody would be able to boot the OS graphically at all without
encountering the problem.

I tracked the issue for several years, but have never reproduced it
personally on my systems.  I think that's the key factor in whether any
of these problems ever get fixed - developers working on the code have
to be able to reproduce the problems directly and reliably on hardware
they have available.

The multitude of bug reports I've seen concerning these types of issues
suggests that they are video driver specific bugs, and seemingly
hardware specific as well.  It's unclear beyond that what other
commonalities might exist though.


> Potentially, people who may be thinking of trying Fedora on such hardware
> (where earlier  releases worked, friends told them) would install FC6 and
> not have it be able to even boot properly due to this issue. 

Actually, I have confirmed with many who've experienced the problem that it
is not Fedora specific.  rhgb is one way of causing the problem to occur,
but if you encounter the problem with rhgb, you can install Ubuntu, Mandrake
or whatever on the same box, with the same version of X, and reproduce the
issue simply by starting an X server on :0, and another one on :1

So, currently users have 3 choices:

1) Disable rhgb

2) Debug and fix the bug(s) in X and the drivers themselves or wait for someone
   else to do it.

3) Switch distributions unnecessarily and don't use more than a single X
   server at the same time (which is accomplished in Fedora by disabling rhgb)
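
The reproduction mentioned above (one X server on :0 and another on :1) could
be sketched roughly as follows.  This is only an assumption-laden outline: the
helper names are hypothetical, it assumes an "X" binary on the PATH and root
privileges, and it does not mimic the exact way rhgb hands over to the display
manager.

```python
import subprocess


def x_server_command(display):
    """Build the command line for an X server on the given display number."""
    return ["X", ":%d" % display]


def start_two_servers():
    """Start X on :0 and :1 at the same time and return the Popen handles.

    Intended to be run as root from a text console on a test machine only,
    since on affected hardware this may hang the display.
    """
    return [subprocess.Popen(x_server_command(d)) for d in (0, 1)]


# Only print the commands here; call start_two_servers() deliberately.
for d in (0, 1):
    print(" ".join(x_server_command(d)))
```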


> Is it too late in the release cycle to make RHGB not an installed-by-default
> package due to this?

In theory it's never too late.  In reality however, the exact percentage of
users affected by this problem is unknown, and the Fedora decision making
community is unlikely to disable rhgb by default for what is considered a
minority of users.

One of two things is likely to occur:

1) The problem is (or rather 'problems are') indeed considered to have a large
   enough impact to allocate manpower to diagnosing and fixing the drivers
   that exhibit this problem (and/or the X server itself), solving the
   problem for everyone, period.

or

2) The problem is considered to affect only a smaller group of users, and is
   not considered critical, in which case fixing the problem will not be
   given priority.


Keep in mind, regardless of the scope of the problem, it can't be fixed
at all, until enough information is known about the problem, and developers
working on X (at Red Hat, or X.Org etc.) have actual hardware on hand which
directly reproduces the problem.  Since it is a rather unusual special case
situation to run more than one X server on the same machine at the same time,
I highly doubt anyone at X.Org will bother unless they personally have a need
to have such a setup work properly.  Fedora is a special case in that rhgb
starts a second server before the first one ends, in order to avoid visible
screen flicker - thus having 2 servers running for a brief instant, both
avoiding that unwanted flicker, and also causing rare bugs to be triggered
in X which nobody cares about. ;o)

Again though, for now, I just totally recommend everyone disable rhgb.  It's
nice eye candy, but if it makes your system unstable, just disable it.


(In reply to comment #4)
> my video card is nvidia geforce fx 5200. the driver is the default (open
> source), not the proprietary. i haven't checked the ssh connectivity yet.
> mharris: there's a nice discussion on fedora-test-list between alan cox, ajax
> and others on this subject. you may wanna join :)

Thanks, I haven't been watching the lists...  I'll have a look for the thread.


Comment 6 cornel panceac 2006-09-27 12:33:48 UTC
1. i'll check if gdmflexiserver runs ok in my case and then i'll report :)
2. about rhgb transition to gdm, in theory at least, rhgb can fade out and gdm
can fade in, so that they never run at the same time.
3. regarding the link between the hardware and the bug, in my case, on the same
hardware rhgb runs ok on fc5 (with nv or nvidia) so i suspect it's a problem
introduced by the new combination of xorg and rhgb.

Comment 7 Adam Jackson 2006-09-27 14:29:12 UTC
Think I have this one nailed.  It's fundamentally a kernel bug, and there's no
fixing it for real without adding new VT ioctls.  However the server can work
around it in a fairly reliable - if ugly - way.

Should be fixed in xorg-x11-server-Xorg 1.1.1-43.fc6, closing.

Comment 8 Mike A. Harris 2006-09-27 20:11:25 UTC
(In reply to comment #7)
> Think I have this one nailed.  It's fundamentally a kernel bug, and there's no
> fixing it for real without adding new VT ioctls.  However the server can work
> around it in a fairly reliable - if ugly - way.
> 
> Should be fixed in xorg-x11-server-Xorg 1.1.1-43.fc6, closing.

It'll be interesting to find out how many other bugs filed over the years,
potential dupes of this one, will be fixed by this.  Possibly a good idea to
backport to FC5 et al. as well.

Either way though, nice to see this one (hopefully) nailed.

Comment 9 Than Ngo 2006-09-30 07:49:33 UTC
the problem still appears with KDM as X login.

How reproducible:
pass rhgb as boot parameter to the kernel

Steps to Reproduce:
1. add DISPLAYMANAGER=KDE in /etc/sysconfig/desktop
2. pass rhgb as boot parameter to the kernel
3. boot the machine

  
Actual results:
the computer freezes before displaying kdm


Expected results:
computer displays kdm and waits for user input


Additional info:
tested with xorg-x11-server-Xorg 1.1.1-43.fc6 and rhgb-0.16.3-6.fc6
it worked fine in FC6-Beta2
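
Step 1 of the reproduction above could be sketched as below.  The helper is
hypothetical and works on the file's text only, assuming simple KEY=value
lines as in /etc/sysconfig/desktop; writing the result back to the file is
left to the caller.

```python
def set_display_manager(config_text, value="KDE"):
    """Return config_text with DISPLAYMANAGER set to the given value.

    An existing DISPLAYMANAGER= line is replaced in place (later duplicates
    are dropped); if none exists, the setting is appended at the end.
    """
    new_line = "DISPLAYMANAGER=%s" % value
    out = []
    replaced = False
    for line in config_text.splitlines():
        if line.strip().startswith("DISPLAYMANAGER="):
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical starting content for /etc/sysconfig/desktop.
print(set_display_manager('DESKTOP="GNOME"\n'), end="")
```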


Comment 10 Adam Jackson 2006-10-03 15:21:31 UTC
Ngo, can you attach the X log from the failed startup?  Either scp it off the
machine while it's "frozen", or reboot to single user mode and copy it off then.
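
For anyone else asked for the same log, a minimal sketch of the scp route
follows.  It assumes the usual X.Org log location /var/log/Xorg.N.log, where N
is the display number; the helper name, host name and user are hypothetical.
With rhgb in use there may be one log per server (for example Xorg.0.log and
Xorg.1.log), so both may be worth copying.

```python
import subprocess


def scp_xlog_command(host, dest=".", display=0, user="root"):
    """Build an scp command that copies the X log for one display number."""
    remote = "%s@%s:/var/log/Xorg.%d.log" % (user, host, display)
    return ["scp", remote, dest]


def fetch_xlog(host, dest=".", display=0, user="root"):
    """Run scp from a second machine while the affected one is 'frozen'."""
    return subprocess.call(scp_xlog_command(host, dest, display, user))


# "frozenbox" is a placeholder host name; only the command is printed here.
print(" ".join(scp_xlog_command("frozenbox")))
```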

Comment 11 Peter Gordon 2006-10-29 18:41:53 UTC
I'm not sure exactly what did it or why, but I made a clean install of FC6 on my
desktop a couple of nights ago after rawhide started breaking quite verbosely,
and booting with RHGB appears to work quite splendidly out of the proverbial box
again. (YAY!)

Software:
rhgb-0.16.4-1.fc6
xorg-x11-server-Xorg-1.1.1-47.fc6
xorg-x11-drv-ati-6.6.2-4.fc6

Hardware:
Pentium 4 (ABIT VT7 motherboard)
Radeon 9250



Comment 12 Than Ngo 2006-10-30 11:26:52 UTC
ajax, it seems to work now in the FC6 release. I don't see this issue anymore.
It seems the bug was in Sysvinit, which has already been fixed.

Comment 13 Mate Wierdl 2007-01-30 23:52:01 UTC
Please see bug 169900.  Can you also switch between VCs with this fixed version?
 Are you running gdm?  If yes, can you switch to VC1?