Bug 155016 - firstboot fails "could not open display"
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: firstboot
Version: 4.0
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8
Assigned To: Chris Lumens
QA Contact: Alexander Todorov
Keywords: OtherQA, Reopened
Depends On:
Blocks: 367631 431715 458123
Reported: 2005-04-15 14:05 EDT by Daniel W. Ottey
Modified: 2010-10-21 22:54 EDT
CC List: 10 users

Doc Type: Bug Fix
Doc Text:
* On systems with two or more processors, a race condition existed between the X server starting and firstboot detecting it had started. If setxkbmap started and finished before metacity started, a "-terminate" command line argument sent to the X server by firstrun.py caused the X server to exit when the last running X client exited. By the time metacity was ready to start, no X server was available, so metacity would exit with an error message ("unable to open X display :1"). Firstboot no longer includes the "-terminate" command line argument, thus avoiding this race.
Last Closed: 2009-05-18 16:29:09 EDT
Attachments
Screen capture of traceback error (17.64 KB, image/png)
2005-04-15 14:05 EDT, Daniel W. Ottey
Description Daniel W. Ottey 2005-04-15 14:05:28 EDT
Description of problem:
We have successfully run firstboot on our IA32 systems.  In our first attempt at
running it on our IA64 system, we have run into a problem.  When firstboot runs
on start, we receive a traceback error.  I am attaching a screen capture of the
error, since no log file was produced.

We will be happy to produce any log files; if we only knew where the helpful
information is!  :-)


Version-Release number of selected component (if applicable):
firstboot-1.3.39-2
Comment 1 Daniel W. Ottey 2005-04-15 14:05:28 EDT
Created attachment 113239
Screen capture of traceback error
Comment 2 Suzanne Hillman 2005-05-12 15:29:34 EDT
Are you able to run X by hand? I.e., after the machine has finished booting, does
X start?
Comment 3 Mike A. Harris 2005-07-06 18:26:29 EDT
The "FBIOPAN" error in the screenshot is indicative of a kernel fbdev bug
I believe.  There used to be a similar bug in our ppc kernel if I remember
correctly in RHEL3.  I don't recall the details of the issue, but searching
bugzilla for "FBIOPAN" might yield some matches.

Hope this helps.
Comment 7 Chris Lumens 2005-10-12 10:18:30 EDT
Daniel - Do you remember if you were able to do a graphical install on this
machine?  Does X start up fine for you after firstboot has failed?
Comment 8 Chris Lumens 2006-06-05 13:41:26 EDT
Closing since no information has been provided for several months.  If you are
still seeing this problem on more recent versions of RHEL4, please feel free to
reopen this bug.
Comment 9 Charlotte Richardson 2007-07-25 16:54:35 EDT
We have this bug also. We have to work around it by editing firstboot.py.

What is happening is a race between the X clients that firstboot.py starts. That
script starts the Xserver with the -terminate option, so the Xserver will exit
when the last remaining X client ends. However, since the script will kill the
PID of the Xserver when it is really done with it, there was no need for this
option being passed to X to get X to close itself.

The firstboot.py script starts up both setxkbmap and metacity. If setxkbmap
starts and ends before metacity starts, the -terminate will cause the Xserver to
exit, and metacity will get the error shown in the earlier posting (and you will
not see the firstboot screens).

The fix is to take the -terminate option out of the line that starts up the
Xserver in firstboot.py. That's what we did.

Obviously, this doesn't come up if you are coming up to runlevel 3 instead of 5.
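
For illustration only, here is a minimal Python sketch of the approach described above: leave -terminate off the X server command line and stop the server explicitly by PID. The function names and the argument list are hypothetical and are not taken from firstboot.py; the demo uses a sleeping child process as a stand-in so it can run without an X server.

```python
import os
import signal
import subprocess
import sys


def x_server_args(display=":1", terminate=False):
    """Build an illustrative X server argument list (not firstboot's real one)."""
    args = ["X", display]
    if terminate:
        # -terminate: the server exits when its last client disconnects.
        args.append("-terminate")
    return args


def stop_server(pid):
    """Stop the server explicitly by PID once the caller is done with it."""
    os.kill(pid, signal.SIGTERM)


def demo():
    """Exercise stop_server() on a harmless sleeping child, not a real X server."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    stop_server(child.pid)
    # Return the child's exit status so the caller can confirm it was stopped.
    return child.wait(timeout=10)
```

Because the caller already holds the PID and kills it at the end, the server's lifetime does not need to depend on how many clients happen to be connected.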
Comment 10 Andrius Benokraitis 2007-07-31 14:08:09 EDT
Reopening bug since Stratus is seeing this on x86_64.
Comment 11 Andrius Benokraitis 2007-07-31 14:09:38 EDT
Stratus has tested this on RHEL 4.5.
Comment 13 Chris Lumens 2007-08-06 13:51:38 EDT
We use these same flags in devel, so this should be fine for 4.x as well.
Comment 14 RHEL Product and Program Management 2008-02-01 14:14:07 EST
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 16 Alexander Todorov 2008-09-03 03:11:57 EDT
Daniel, Charlotte,
can you guys report consistent steps to reproduce? 

I do understand this comment:
<quote>
If setxkbmap
starts and ends before metacity starts, the -terminate will cause the Xserver to
exit, and metacity will get the error shown in the earlier posting
</quote>

but I'm not clear how to reproduce it consistently. Can you please help?
Comment 17 Charlotte Richardson 2008-09-03 10:15:39 EDT
You need a multiprocessor system (I have eight CPUs) so that the various python pieces run simultaneously. Basically, setxkbmap needs to start and finish before metacity is started (on some other processor). You might be able to force the timing on a uniprocessor system by introducing a delay before starting metacity, but I don't know. When I first saw this problem, the system I was working on had only two processors, and the problem happened 100% of the time, so two processors are enough. You might ask Daniel what his system has.

What is happening is that the "-terminate" command line argument to the X server causes it to exit when the last running X client exits, rather than continuing to run if there are no clients connected to it. So if setxkbmap runs first to determine keyboard information and metacity hasn't started up yet, when setxkbmap exits, the X server exits too, and so isn't there for metacity to use, hence the error "unable to open X display :1". Since the X server gets shut down later by its PID anyhow, all you need to do is get rid of the "-terminate". That's how we fixed it here.
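
The ordering problem described above can be modelled with a small toy simulation in Python. This is only a sketch under stated assumptions: ToyXServer, ServerGone, and first_boot are hypothetical names, and the model captures nothing about a real X server except the "exit when the last client disconnects" behaviour of -terminate.

```python
class ServerGone(Exception):
    """Raised when a client tries to connect to a server that has exited."""


class ToyXServer:
    """Toy stand-in for an X server; only models the -terminate behaviour."""

    def __init__(self, terminate):
        self.terminate = terminate
        self.running = True
        self.clients = set()

    def connect(self, name):
        if not self.running:
            raise ServerGone("unable to open X display :1")
        self.clients.add(name)

    def disconnect(self, name):
        self.clients.discard(name)
        # With -terminate, the server exits once its last client is gone.
        if self.terminate and not self.clients:
            self.running = False


def first_boot(terminate, metacity_starts_late):
    """Return True if metacity manages to connect, False otherwise."""
    server = ToyXServer(terminate)
    try:
        if metacity_starts_late:
            # setxkbmap starts and finishes before metacity connects.
            server.connect("setxkbmap")
            server.disconnect("setxkbmap")
            server.connect("metacity")
        else:
            # metacity is already connected when setxkbmap exits.
            server.connect("setxkbmap")
            server.connect("metacity")
            server.disconnect("setxkbmap")
    except ServerGone:
        return False
    return True
```

In this model, metacity fails to connect only when -terminate is set and setxkbmap finishes first; without -terminate, either ordering works.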

The RHEL5 version of this mechanism doesn't have this particular bug.
Comment 18 RHEL Product and Program Management 2008-09-18 15:15:52 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 20 Chris Lumens 2008-11-19 13:13:39 EST
This should be fixed in firstboot-1.3.39-7.  Thanks for the patch.
Comment 22 Ruediger Landmann 2009-01-29 02:19:46 EST
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
* On systems with two or more processors, a race condition existed between the X server starting and firstboot detecting it had started. If setxkbmap started and finished before metacity started, a "-terminate" command line argument sent to the X server by firstrun.py caused the X server to exit when the last running X client exited. By the time metacity was ready to start, no X server was available, so metacity would exit with an error message ("unable to open X display :1"). Firstboot no longer includes the "-terminate" command line argument, thus avoiding this race.
Comment 23 Chris Ward 2009-02-20 08:29:31 EST
~~ Attention Partners!  ~~
RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should
be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. Further questions can be directed to your
Red Hat Partner Manager.

Thanks, more information about Beta testing to come.
 - Red Hat QE Partner Management
Comment 24 Charlotte Richardson 2009-02-20 12:03:54 EST
I was able to test this on a system that had exhibited the original problem, and firstboot is now behaving correctly. A check of /usr/share/firstboot/firstboot.py shows that the "-terminate" command line option to the Xserver is now gone, which explains why it now works on a multiprocessor system (this particular one has 8 CPUs). Thanks!
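
A check like the one described above could be scripted roughly as follows. This is a hedged sketch, not a tool shipped with firstboot: the helper names are hypothetical, and it does a plain substring search, so it would also report the flag if it only appeared in a comment.

```python
def find_flag(text, flag="-terminate"):
    """Return (line_number, line) pairs for lines of `text` that mention `flag`."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if flag in line:
            hits.append((number, line))
    return hits


def file_mentions_flag(path, flag="-terminate"):
    """Return True if the file at `path` mentions `flag` anywhere."""
    with open(path, "r", errors="replace") as handle:
        return bool(find_flag(handle.read(), flag))
```

On a patched system, file_mentions_flag("/usr/share/firstboot/firstboot.py") would be expected to return False.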
Comment 25 Alexander Todorov 2009-02-23 03:47:12 EST
changing status to VERIFIED based on comment #24
Comment 27 errata-xmlrpc 2009-05-18 16:29:09 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1005.html
