Description of problem: We have successfully run firstboot on our IA32 systems. On our first attempt at running it on our IA64 system, we ran into a problem: when firstboot runs at startup, we get a traceback error. I am attaching a screen capture of the error, since no log file was produced. We will be happy to produce any log files, if only we knew where the helpful information is! :-)
Version-Release number of selected component (if applicable): firstboot-1.3.39-2
Created attachment 113239 [details] Screen capture of traceback error
Are you able to run X by hand? ie, after the machine has finished booting, does X start?
The "FBIOPAN" error in the screenshot is indicative of a kernel fbdev bug I believe. There used to be a similar bug in our ppc kernel if I remember correctly in RHEL3. I don't recall the details of the issue, but searching bugzilla for "FBIOPAN" might yield some matches. Hope this helps.
Daniel - Do you remember if you were able to do a graphical install on this machine? Does X start up fine for you after firstboot has failed?
Closing since no information has been provided for several months. If you are still seeing this problem on more recent versions of RHEL4, please feel free to reopen this bug.
We have this bug also. We have to work around it by editing firstboot.py. What is happening is a race between the X clients that firstboot.py starts. That script starts the Xserver with the -terminate option, so the Xserver will exit when the last remaining X client ends. However, since the script kills the Xserver by PID when it is really done with it, there is no need to pass this option to X to get it to close itself. The firstboot.py script starts up both setxkbmap and metacity. If setxkbmap starts and ends before metacity starts, the -terminate option will cause the Xserver to exit, and metacity will get the error shown in the earlier posting (and you will not see the firstboot screens). The fix is to take the -terminate option out of the line that starts up the Xserver in firstboot.py. That's what we did. Obviously, this doesn't come up if you are booting to runlevel 3 instead of 5.
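For anyone following along, here is a rough sketch of the shape of that workaround. It is not the actual firstboot.py code: the function names, the "X" binary name, and the display ":1" are placeholders, it is written for a current Python 3 rather than the interpreter shipped with RHEL 4, and it assumes a POSIX system. The demo uses a sleeping Python child as a stand-in for the X server, so it can be run without X installed.

```python
import os
import signal
import subprocess
import sys


def x_server_argv(display=":1", terminate=False):
    """Build a hypothetical X server command line.

    "X" and the display number are placeholders; the real firstboot.py
    command line may carry other options.
    """
    argv = ["X", display]
    if terminate:
        # With this flag the server exits when its last client disconnects.
        argv.append("-terminate")
    return argv


def stop_by_pid(proc):
    """Stop a child process explicitly by PID and wait for it to exit."""
    os.kill(proc.pid, signal.SIGTERM)
    return proc.wait()


def demo():
    # Stand-in child (a sleeping Python interpreter), not a real X server.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    return stop_by_pid(child)
```

Since the parent already stops the server explicitly by PID, leaving "-terminate" off the command line loses nothing and removes the dependence on client start-up order.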
Reopening bug since Stratus is seeing this on x86_64.
Stratus has tested this on RHEL 4.5.
We use these same flags in devel, so this should be fine for 4.x as well.
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Daniel, Charlotte, can you guys report consistent steps to reproduce? I do understand this comment: <quote> If setxkbmap starts and ends before metacity starts, the -terminate will cause the Xserver to exit, and metacity will get the error shown in the earlier posting </quote> but I'm not clear how to reproduce it consistently. Can you please help?
You need a multiprocessor system (I have eight CPUs) so that the various python pieces run simultaneously. Basically you need setxkbmap to start and finish before metacity is started (on some other processor). You might be able to force the timing on a uniprocessor system by introducing a delay before starting metacity, I don't know. When I first saw this problem the system I was working on had only two processors, and the problem happened 100% of the time, so two processors are enough. You might ask Daniel what his system has. What is happening is that the "-terminate" command line argument to the X server causes it to exit when the last running X client exits, rather than continuing to run when no clients are connected to it. So if setxkbmap runs first to determine keyboard information and metacity hasn't started up yet, when setxkbmap exits, the X server exits too, and so isn't there for metacity to use, hence the error "unable to open X display :1". Since the X server gets shut down later by its PID anyhow, all you need to do is get rid of the "-terminate". That's how we fixed it here. The RHEL5 version of this mechanism doesn't have this particular bug.
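To make the ordering concrete, here is a toy model of the sequence described above. It is only an illustration: ToyXServer and run_firstboot are made-up names, nothing here talks to a real X server, and the "bad" ordering (setxkbmap finishing before metacity connects) is hard-coded rather than left to the scheduler.

```python
class ToyXServer:
    """Toy model of an X server; it does not speak the X protocol."""

    def __init__(self, terminate):
        self.terminate = terminate  # models the "-terminate" option
        self.running = True
        self.clients = set()

    def connect(self, name):
        if not self.running:
            raise ConnectionError("unable to open X display :1")
        self.clients.add(name)

    def disconnect(self, name):
        self.clients.discard(name)
        # With "-terminate", the server exits once no clients remain.
        if self.terminate and not self.clients:
            self.running = False

    def kill(self):
        # Models the explicit shutdown by PID at the end of firstboot.
        self.running = False


def run_firstboot(terminate):
    """Replay the losing ordering: setxkbmap ends before metacity starts."""
    server = ToyXServer(terminate)
    server.connect("setxkbmap")
    server.disconnect("setxkbmap")
    try:
        server.connect("metacity")
        result = "metacity started"
    except ConnectionError as exc:
        result = "metacity failed: %s" % exc
    server.kill()
    return result
```

In this model, run_firstboot(True) ends with metacity failing to open the display, while run_firstboot(False) lets metacity connect and the server is still shut down by the final kill.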
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
This should be fixed in firstboot-1.3.39-7. Thanks for the patch.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: * On systems with two or more processors, a race condition existed between the X server starting and firstboot detecting it had started. If setxkbmap started and finished before metacity started, a "-terminate" command line argument sent to the X server by firstboot.py caused the X server to exit when the last running X client exited. By the time metacity was ready to start, no X server was available, so metacity would exit with an error message ("unable to open X display :1"). Firstboot no longer includes the "-terminate" command line argument, thus avoiding this race.
~~ Attention Partners! ~~ RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. Thanks, more information about Beta testing to come. - Red Hat QE Partner Management
I was able to test this on a system that had exhibited the original problem, and firstboot is now behaving correctly. A check of /usr/share/firstboot/firstboot.py shows that the "-terminate" command line option to the Xserver is now gone, which explains why it now works on a multiprocessor system (this particular one has 8 CPUs). Thanks!
changing status to VERIFIED based on comment #24
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1005.html