54365 – X server is very slow to start

Summary: X server is very slow to start

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	XFree86
Sub Component:
Version:	7.1
Hardware:	alpha
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Mike A. Harris
QA Contact:	David Lawrence
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2001-10-04 21:30 UTC by Robert M. Riches Jr.
Modified:	2007-04-18 16:37 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2001-11-07 20:54:25 UTC
Embargoed:

Attachments
X configuration file (1.78 KB, text/plain) 2001-10-04 21:31 UTC, Robert M. Riches Jr.
log file from a run that took several seconds (less than 30) (20.49 KB, text/plain) 2001-10-04 21:32 UTC, Robert M. Riches Jr.

Description Robert M. Riches Jr. 2001-10-04 21:30:29 UTC

Description of Problem: X server is very slow to start.  With RH7.0,
the X server started fairly reliably within 5 seconds, using various
XFree86 versions, from 4.0.1 to 4.0.3.  With RH7.1, startup usually
takes more than 5 seconds, and a fair percentage of the time it
takes longer than 30 seconds.

Version-Release number of selected component (if applicable):
4.0.3-21 (default from CDs and/or up2date)

How Reproducible: It's always much slower than with RH7.0. At least
50% of the time it takes longer than 5 seconds.  About 5-10% of the
time it takes longer than 30 seconds.

Steps to Reproduce:
1. log in
2. run startx

Actual Results: X server takes many seconds to start, often
longer than 30 seconds.

Expected Results: X server should start within a few seconds.
(This is an Alpha 21264 with fast disks, etc.)

Additional Information:
	
Graphics card is a Vanta TNT2 16MB, running 1600x1200 at 16-bit
color.  (I'd like to do 24-bit color, but that doesn't work.)  I
have a Permedia 3 Oxygen VX1 32MB card on order.  I'll attach
the X configuration file and a log file from an X server that took
several seconds (but not longer than 30).

Comment 1 Robert M. Riches Jr. 2001-10-04 21:31:08 UTC

Created attachment 33418 [details]
X configuration file

Comment 2 Robert M. Riches Jr. 2001-10-04 21:32:04 UTC

Created attachment 33419 [details]
log file from a run that took several seconds (less than 30)

Comment 3 Mike A. Harris 2001-11-01 12:16:57 UTC

Unable to reproduce this on my Alpha here.  Can you do an strace or
otherwise determine where it is hanging?

Is it the actual X server itself that is hanging, or is it the
desktop stuff?  Perhaps a DNS lookup gone bad?

Comment 4 Robert M. Riches Jr. 2001-11-02 19:45:28 UTC

I have reason to believe it is the X server itself rather
than "desktop" stuff, mostly because I don't run any "desktop"
stuff.  The problem affects my users who use Gnome/KDE the same
as it affects those who do not use Gnome/KDE.  Also, at the
point of hang, the X server has not caused the switch from VC1
to VC7, because the VC1 text-mode login sequence is still visible.

One of my hunches is the hang is caused while BTTV-related kernel
modules load during X server initialization.

Where should the strace command be inserted?  I tried inserting
it where startx calls xinit, but that did not seem to produce any
visible output.  xinit seems to be a compiled binary rather than a
script, which makes it difficult to insert an strace into it where
it calls something else.
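
One way to trace through a compiled binary such as xinit is to let strace follow child processes rather than trying to insert it between xinit and what it runs. The sketch below only builds such a command line. The `-f`, `-tt`, `-T`, and `-o` options are standard strace flags; the function name `strace_argv` and the log path are hypothetical, chosen for illustration.

```python
import shlex


def strace_argv(command, log_path="/tmp/startx.strace"):
    """Build an strace command line that follows child processes.

    -f   follow forks, so programs started by a compiled binary
         such as xinit are traced as well
    -tt  prefix each line with a wall-clock timestamp (microseconds)
    -T   append the time spent inside each system call
    -o   write the trace to a file instead of stderr
    """
    return ["strace", "-f", "-tt", "-T", "-o", log_path] + list(command)


if __name__ == "__main__":
    # Print the command rather than running it; "xinit" is only an example.
    print(" ".join(shlex.quote(arg) for arg in strace_argv(["xinit"])))
```

Writing to a file with `-o` keeps the trace from being lost on the console, and a large value after a `read` call in the `-T` column would point at a blocking read.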

Comment 5 Robert M. Riches Jr. 2001-11-05 19:28:33 UTC

I instrumented the startx script with echo "..." statements
at every non-trivial step, and put an strace in front of the
xinit command.  (After upgrading to kernel 2.4.9-12, teeing
the output of startx to a file intermittently causes the
server to not start up, but that's a separate issue.)  It
appears the delay is in the mcookie command.  Further testing
shows that reading just 4 bytes from /dev/random can take as
long as (or maybe even longer than) 20 seconds.
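
The 4-byte read test described above can be reproduced with a small timing script. This is a minimal sketch, not the test that was actually run; the function name `time_read` is hypothetical. It uses unbuffered `os.read` so that no more than the requested bytes are drained from the pool.

```python
import os
import time


def time_read(path="/dev/random", nbytes=4):
    """Read nbytes from path and return (data, seconds_elapsed).

    os.open/os.read are used instead of open() so that Python's
    buffering does not pull extra bytes out of the entropy pool.
    """
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        data = b""
        while len(data) < nbytes:
            chunk = os.read(fd, nbytes - len(data))  # may block on /dev/random
            if not chunk:
                break
            data += chunk
    finally:
        os.close(fd)
    return data, time.monotonic() - start


if __name__ == "__main__":
    for device in ("/dev/random", "/dev/urandom"):
        data, elapsed = time_read(device)
        print(f"{device}: {len(data)} bytes in {elapsed:.3f} s")
```

Comparing the two devices side by side shows whether the delay comes from the blocking behavior of /dev/random rather than from mcookie itself.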

Was a change made to the kernel code behind /dev/random between
RH7.0 and RH7.1 that would increase the likelihood of long delays
in getting just a few random bits?  This suggests a workaround of
using "mcookie -f /dev/urandom", but that would decrease the
security of the xauth stuff.  Are there better ideas available?

Thanks.

Comment 6 Mike A. Harris 2001-11-06 03:35:31 UTC

Actually the info you just provided makes a lot of sense.  If the
entropy pool is empty, reading /dev/random can take a while.
If you move the mouse around, and hit random keys on the keyboard,
does it speed things up?

Arjan, do you have any comments perhaps to add here that might help
shed some light?

Comment 7 Robert M. Riches Jr. 2001-11-06 05:55:54 UTC

Typing gibberish on the keyboard or moving the mouse
(trackman marble, actually) around did the trick.  I
now have a "If you can see this, ..." message to
instruct users what to do when they try to start X.

Thank you for the solution/workaround.  I find it highly
amusing that you actually _CAN_ sometimes make a computer
go faster by moving the cursor around or typing gibberish
on the keyboard.

At this point, I'm okay with closing this report.

Comment 8 Arjan van de Ven 2001-11-06 10:01:59 UTC

As a workaround: if you can cause some extra disk io that will add to the secure
entropy pool.... that can even be scripted ;)

Comment 9 Robert M. Riches Jr. 2001-11-07 20:54:18 UTC

Thanks for the suggestion of using scripted disk I/O
to refill the entropy pool.  It appears to be working
quite nicely, taking only a second or four to do the
job, and that's with a pretty lame script.
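
The script itself is not attached to this report. A minimal sketch of the idea, under stated assumptions, might look like the following: it checks the kernel's entropy estimate in /proc/sys/kernel/random/entropy_avail and forces small synchronous disk writes until the estimate reaches a target. The function names, the 128-bit target, and the write sizes are all hypothetical, and whether disk activity feeds the pool depends on the kernel and the storage driver.

```python
import os
import tempfile

ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def entropy_available(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate, in bits."""
    with open(path) as f:
        return int(f.read().strip())


def churn_disk(directory, rounds=8, chunk_size=64 * 1024):
    """Write and fsync temporary files to force real disk I/O.

    Returns the number of bytes written. The directory should live
    on a physical disk; a RAM-backed tmpfs causes no disk activity.
    """
    written = 0
    for _ in range(rounds):
        fd, name = tempfile.mkstemp(dir=directory)
        try:
            written += os.write(fd, b"\0" * chunk_size)
            os.fsync(fd)  # push the data to the device, not just the cache
        finally:
            os.close(fd)
            os.unlink(name)
    return written


def refill(directory, target_bits=128, max_passes=20, path=ENTROPY_AVAIL):
    """Churn the disk until the estimate reaches target_bits or we give up."""
    for _ in range(max_passes):
        if entropy_available(path) >= target_bits:
            break
        churn_disk(directory)
    return entropy_available(path)


if __name__ == "__main__":
    if os.path.exists(ENTROPY_AVAIL):
        print("entropy estimate after refill:", refill("."))
```

Running something like this just before startx, and capping the number of passes, keeps the wait bounded even if the disk I/O contributes little entropy.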

I'm okay with closing this report out, unless you want
to leave it open to try to get more information about
why this became an issue between 7.0 and 7.1.

Comment 10 Mike A. Harris 2002-05-30 03:19:50 UTC

Closing bug as NOTABUG, as it is merely an entropy issue.  XFree86 requires
entropy to be present, and that means ensuring that activity occurs which
generates enough.  If pressed, I would put the job of working around
the problem on the kernel, which could generate entropy from more places.
However, I doubt that is likely to happen anytime soon or to become a
priority just for this issue.  So the best workaround for the time being is
to do as you've been doing, and ensure that things which keep the entropy
pool filled occur.
