Bug 178073 - Something happened to 3 FC4 units with the update downloads before January 17th
Status: CLOSED WORKSFORME
Product: Fedora
Classification: Fedora
Component: xorg-x11
Version: 4
Hardware: i386 Linux
Priority: medium, Severity: high
Assigned To: X/OpenGL Maintenance List
QA Contact: David Lawrence
Login Screen is unusable and appears ...
Reported: 2006-01-17 13:06 EST by Greg Ennis
Modified: 2007-11-30 17:11 EST

Last Closed: 2006-06-27 12:52:49 EDT


Attachments
Corrupted xfs config file (1.01 KB, application/octet-stream)
2006-01-20 17:46 EST, Greg Ennis
Description Greg Ennis 2006-01-17 13:06:58 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
I have 3 FC4 units that started having problems with the login gui after the updates prior to January 17th.

These were the last two updates on one machine
Jan 14 14:24:38 Updated: system-config-bind.noarch 4.0.0-38_FC4
Jan 14 14:24:41 Updated: flex.i386 2.5.4a-35.fc4

This was the last update on a different machine
Jan 13 12:12:34 Updated: flex.i386 2.5.4a-35.fc4

I have another machine with the same symptoms as a fresh install with a complete upgrade.

In each case when I look for xfs it is not running, and when I try to run startx I receive the following:

Could not init font path element unix/:7100, removing from list!

Fatal server error:
could not open default font 'fixed'

I have put a notice on the Fedora user's group, but did not receive any help.  I have 3 units down at this point and am suspicious of flex.

I would appreciate your assistance

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Reboot any one of these units to run level 5; the failure occurs every time.

Actual Results:  The Login Screen appears to have an inappropriate resolution, but the resolutions have not been changed. 

Expected Results:  The normal gui login screen should appear

Additional info:
Comment 1 Greg Ennis 2006-01-17 19:46:16 EST
I am suspicious that 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169238

is similar to the problem I am having.  I have 3 other FC4 PCs that have had
all the updates and are not having this problem.  The three PCs affected are all
"E-Machines".  Each PC functioned normally before the 17th of January, and no
setup changes were made.
Comment 2 Mike A. Harris 2006-01-17 20:07:24 EST
Neither flex nor system-config-bind bears any runtime relationship to
X working or not.  flex is a development package used during compilation
of various software (it's a lexical analyzer generator); it is not used at runtime
by X.  system-config-bind is the Fedora nameserver configuration utility,
also completely unrelated to X.

The problem described above sounds more like a configuration problem than
a bug, but it is hard to conclude anything without more information.

With Fedora Core, the xfs font server must be running before you can start
the X server.  If it is not running, you will be unable to start X.  You'll
need to diagnose why your xfs server is not running first, and then resolve
that problem.  The fedora-list@redhat.com is a good place to post an email
about the problem and someone will likely be able to help you determine
the cause.  It's likely just a configuration problem.

Once you've had a chance to discuss the issue on fedora-list, please update
the report with your findings.

Hope this helps.


Thanks in advance.
Comment 3 Greg Ennis 2006-01-17 20:56:38 EST
Thank you for your response, but I am still at a loss as to what to do next.  

I understand the issue related to bind and flex, and I would not have been
suspicious of them either, but the symptoms started after these were downloaded
with yum in a nightly update.

I have posted this issue on the user's list and no one there could help;
that is why I reported it as a bug.  See thread ( xfs - gui - recent updates)

Please understand that two of these three units have been working with FC4 for
over 2 months without difficulty and there were no recent setup changes.  Please
also understand that the 3rd unit was a fresh install, and the gui worked
perfectly until an update (yum -y update) was performed.  After the update and
after the reboot the gui login screen has failed as well as xfs.  

Each of these machines is an E-Machine.  The video hardware is as follows:

One video card is an Intel 82810 CGC with a Princeton 17 inch LCD
monitor

One video card is an ATI Technologies Inc 3D Rage Pro AGP 1X/2X with a
Princeton Ultra 74

One video card is an ATI Technologies Inc 3D Rage Pro AGP 1X/2X with a
Princeton Ultra 72

I have 4 other FC4 desktops that are not E-Machines that did not develop any
abnormalities after the recent updates, but these 3 have.  

I am still suspicious of some update mismatch. I would sure appreciate your help.

Thank you,

Greg Ennis

Comment 4 Greg Ennis 2006-01-19 17:30:32 EST
Help!!!!!!!

I am at a road block with this.  These three units are unusable.  I cannot
stress enough the importance of getting this problem solved.  I am going to be required
to replace the OS with MS if I am unable to make any headway.


Please help!!

Greg
Comment 5 Mike A. Harris 2006-01-20 06:54:14 EST
When you reboot the system and log into the console, is the xfs server
running?  It should show up in the output of "ps ax".

If it's not running, try running "ntsysv" and ensuring that the xfs
service is enabled at boot time.  Now, to manually start the service, type:

service xfs start

Please indicate success or failure.  If it is successful, X should now start
properly, however if it fails, then some aspect of the system configuration
is likely causing a problem.  In the case of failure, please examine your
/var/log/messages file for any errors which might have been reported by
xfs during its attempt to start up.
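
The checks described above can be sketched as two small shell helpers. This is only a minimal sketch: the function names xfs_is_running and xfs_log_lines are hypothetical, and the pattern assumes the usual syslog layout in which the program name, optionally followed by a PID in brackets, precedes a colon.

```shell
# Hypothetical helpers; the names are illustrative, not from any package.

# Succeeds if a process named exactly "xfs" is in the process table.
xfs_is_running() {
    ps -e -o comm= | grep -qx 'xfs'
}

# Prints log lines tagged "xfs:" or "xfs[PID]:" from the given syslog file
# (defaults to /var/log/messages). Kernel "XFS" filesystem messages are not
# matched because the match is case-sensitive.
xfs_log_lines() {
    grep -E ' xfs(\[[0-9]+\])?:' "${1:-/var/log/messages}"
}

# Example use on an affected machine (as root):
#   xfs_is_running || service xfs start
#   xfs_log_lines | tail -n 20
```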

In order for xfs to start up, the xfs config file must be correctly configured,
which is normally the case unless it has been manually edited.  Each of the
directories that are listed in the xfs config file must exist, and have
been properly processed by the appropriate font metadata file processing
utilities (ttmkfdir/mkfontscale/mkfontdir) for the types of fonts present
in the directory.  Normally this does not ever have to be done manually
by the administrator (or user), however if fonts are added manually, by
copying them into a directory or somesuch, you may need to run the utilities
manually.

Additionally, the fonts.dir file must exist in each font directory, and
be readable by the xfs server with the proper file ownership and permissions,
and each directory leading up to it must have the proper ownership and
permissions.  If any fonts get removed from a directory, the appropriate
utilities must be run to regenerate the font metadata files.  Normally,
all of the font metadata file processing is done completely automatically
by the system during xfs initscript startup, and during font rpm package
installation/uninstallation/upgrade.
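
The directory and fonts.dir requirements above can be checked with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, and the parser only handles the simple "catalogue = dir,dir,..." form with optional ":unscaled" suffixes and trailing-comma continuation lines. The config path used in the example comment is the one reported later in this bug.

```shell
# Hypothetical helpers; they read an xfs config file and never modify it.

# Print each directory named in the catalogue entry of an xfs config file.
list_catalogue_dirs() {
    sed -e 's/#.*$//' "$1" | awk '
        /^[ \t]*catalogue[ \t]*=/ {
            in_cat = 1
            sub(/^[ \t]*catalogue[ \t]*=/, "")
        }
        in_cat {
            more = ($0 ~ /,[ \t]*$/)   # a trailing comma continues the list
            n = split($0, parts, ",")
            for (i = 1; i <= n; i++) {
                p = parts[i]
                gsub(/^[ \t]+|[ \t]+$/, "", p)
                sub(/:unscaled$/, "", p)
                if (p != "") print p
            }
            if (!more) in_cat = 0
        }'
}

# Report, per catalogue directory, whether it exists and has a readable
# fonts.dir. Returns non-zero if any directory fails either check.
check_font_dirs() {
    rc=0
    while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            echo "MISSING DIR: $dir"; rc=1
        elif [ ! -r "$dir/fonts.dir" ]; then
            echo "NO fonts.dir: $dir"; rc=1
        else
            echo "OK: $dir"
        fi
    done <<EOF
$(list_catalogue_dirs "$1")
EOF
    return $rc
}

# Example use:
#   check_font_dirs /usr/X11R6/lib/X11/fs/config
```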

One more thing to check for, is to ensure that /usr is mounted read-write
at all times during any font installation/uninstallation/updating, as well
as during system startup, to ensure that the metadata files can be properly
processed.  Also, the partition on which /tmp resides must be read-write,
and must have free space available for the creation of the UNIX socket
through which the X server talks to the xfs server.

If any of these conditions are not met, the xfs server may fail to start,
and will log an error via syslog.
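
The read-write and free-space conditions can be probed directly. A minimal sketch follows, with hypothetical helper names; it tests writability by actually creating and removing a temporary file, which is an illustrative choice rather than anything xfs itself does.

```shell
# Hypothetical helpers for the read-write and free-space conditions.

# Succeeds if a file can actually be created in the directory.
dir_is_writable() {
    probe=$(mktemp "$1/.rwprobe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Prints the free space, in kilobytes, of the filesystem holding the path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example use:
#   for d in /usr /tmp; do
#       dir_is_writable "$d" || echo "$d is not writable"
#       echo "$d: $(free_kb "$d") KB free"
#   done
```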

Please report back any xfs related error messages you find in your
/var/log/messages or displayed on screen during startup, as this will
be useful in diagnosing any issue.

Thanks in advance.
Comment 6 Elliot Lee 2006-01-20 10:08:07 EST
Hey Mike,

xfs is segfaulting in at least one circumstance. See
https://www.redhat.com/archives/fedora-list/2006-January/msg02842.html

Best,
-- Elliot
Comment 7 Mike A. Harris 2006-01-20 10:48:19 EST
Elliot:

Thanks for the additional info.  I've read the email now, and would like to
clarify a potential misunderstanding described in the mail.

>I posted a bug report with Red Hat, but they feel like this is a config
>problem and directed me back to this list.  It may be a config problem,
>but it is not one that I or the other users have created in that this
>the data below comes from a new install.

Just to be clear, my above comments and suggestions did not make any
conclusions that this problem is due to any specific configuration
changes, rather I am attempting to diagnose the problem from the
symptoms.  The usual causes of xfs not starting up are bad
font metadata files, being unable to create the initial
unix socket in /tmp/.font-unix, or corrupted or invalid fonts.

The idea was to help you to be able to narrow the problem down, as you
are in a bit of a pinch and seem to need a solution sooner rather than
later.

With the additional info Elliot has linked to, it seems like there is
a real bug in the xfs server, as it shouldn't SEGV.  At a glance of the
strace in the email, it isn't clear why the SEGV is occurring, however
it occurs during socket creation, which makes me wonder if /tmp is
full, or if perhaps the wrong ownership or permissions are on something.
Without getting into too much detail, there have been some problems WRT
/tmp files/dirs in the past, which aren't 100% solved in the current
X code base, and it is possible you could be hitting an issue related
to this perhaps.

It would be really useful to have a symbolic backtrace; unfortunately,
the monolithic X does not have an easy way to obtain debuginfo packages for
this.  If you have the time and are willing to do a bit of
fiddling, however, it could help expedite a solution if you could
rebuild xfs with debuginfo and then replace the xfs binary with the new
one.  If you're interested, you'll need to do the following:

1) Download the latest FC xorg-x11 src.rpm, and do "rpmbuild -bi xorg-x11.spec"
   to build everything.

2) Go into the xc/programs/xfs subdir, and do a "makeg" to regenerate an
   unstripped xfs binary.

3) Move the system xfs binary out of the way temporarily by renaming it
   to xfs.orig

4) Move the newly built xfs binary into its place.

5) Enable coredumps with ulimit -c unlimited

6) Restart xfs, which should core dump

7) Run gdb <path-to-xfs-binary> --core <corefile>  and type "bt" to get a backtrace.
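
Steps 3 and 4 above can be sketched as one helper, with the remaining steps shown as comments. This is a sketch only: swap_in_debug_binary is a hypothetical name, and the install path /usr/X11R6/bin/xfs is an assumption that should be confirmed on the machine first.

```shell
# Hypothetical helper covering steps 3 and 4: keep the original binary as
# <name>.orig and install the unstripped build in its place.
swap_in_debug_binary() {
    system_bin=$1
    debug_bin=$2
    [ -f "$system_bin" ] && [ -f "$debug_bin" ] || return 1
    mv "$system_bin" "$system_bin.orig" &&
        cp -p "$debug_bin" "$system_bin"
}

# Steps 5 to 7, run by hand as root. The path /usr/X11R6/bin/xfs is an
# assumption; confirm it with "which xfs" before running anything.
#   swap_in_debug_binary /usr/X11R6/bin/xfs ./xfs
#   ulimit -c unlimited      # allow core files (a limit of 0 disables them)
#   service xfs restart      # expected to crash and leave a core file
#   gdb /usr/X11R6/bin/xfs --core <corefile>     # then type "bt"
```

To undo the swap afterwards, move the .orig file back over the debug binary.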

Hope this helps.

Comment 8 John Bass 2006-01-20 10:55:40 EST
I have two machines that took a power failure this morning after being updated
and can no longer get X running:

X Window System Version 6.8.2
Release Date: 9 February 2005
X Protocol Version 11, Revision 0, Release 6.8.2
Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]
Current Operating System: Linux fastbox.dmsd.com 2.6.14-1.1637_FC4smp #1 SMP Wed
Nov 9 18:34:11 EST 2005 i686
Build Date: 21 September 2005
Build Host: hs20-bc1-6.build.redhat.com

        Before reporting problems, check http://wiki.X.Org
        to make sure that you have the latest version.
Module Loader present
OS Kernel: Linux version 2.6.14-1.1637_FC4smp
(bhcompile@hs20-bc1-4.build.redhat.com) (gcc version 4.0.1 20050727 (Red Hat
4.0.1-5)) #1 SMP Wed Nov 9 18:34:11 EST 2005
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(++) Log file: "/dev/null", Time: Fri Jan 20 03:06:52 2006
(++) Using config file: "/tmp/tmpwbMrlMxorg.config"
(WW) ATI(0): Failed to set up write-combining range (0xf6000000,0x800000)
(WW) ATI(0): Failed to set up write-combining range (0xf6000000,0x800000)
 matching Device section for instance (BusID PCI:5:8:0) found
(WW) MGA: No matching Device section for instance (BusID PCI:5:12:0) found
(EE) No devices detected.

Fatal server error:
no screens found

------------------ the other Matrox G200mms similar errors----------------
(--) MGA(0): Video BIOS info block at offset 0x07AC0
(--) MGA(0): Found and verified enhanced Video BIOS info block
(II) MGA(0): MGABios.RamdacType = 0x0
(WW) MGA(0): Failed to set up write-combining range (0xfd000000,0x800000)
(II) MGA(0): Splitting WC range: base: 0xfd000000, size: 0x2000000
(WW) MGA(0): Failed to set up write-combining range (0xfe000000,0x1000000)
(WW) MGA(0): Failed to set up write-combining range (0xfd000000,0x2000000)
(WW) MGA(0): Failed to set up write-combining range (0xfd000000,0x800000)
(--) MGA(0): VideoRAM: 8192 kByte
(WW) MGA(0): Failed to set up write-combining range (0xfd000000,0x800000)


Comment 9 Mike A. Harris 2006-01-20 11:02:36 EST
John Bass:

The problem you are describing is unrelated to the problem described in
this initial bug report, which describes the xfs server not starting
properly.  The config file referenced in the log file info in comment #8
does not refer to the system /etc/X11/xorg.conf file.  The (WW) messages
that you see are not errors, they are harmless warnings that can safely be
ignored.  The xorg@lists.freedesktop.org mailing list is the best place
to post the problem you are experiencing, as it seems to simply be a
misconfiguration.
Comment 10 Greg Ennis 2006-01-20 16:27:36 EST
This problem is resolved.

I followed your step by step debug process and found  
/usr/X11R6/lib/X11/fs/config was corrupted in each of three machines.  In each
case a second file config.bak was present.  All I did to fix the problem was to
change the name of config.bak to config.  After rebooting the xfs daemon
functioned properly and the gui login, gnome, and kde worked perfectly.
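
The fix described above can be sketched as a small script. It is a sketch under stated assumptions: the helper names are hypothetical, "corrupted" is approximated as "contains bytes that are neither printable nor whitespace", and unlike the plain rename described above it keeps the damaged file as config.corrupt for later inspection.

```shell
# Hypothetical helpers; "corruption" here only means stray non-printable bytes.

# Counts bytes that are neither printable nor whitespace in the C locale.
count_nonprintable() {
    LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c | tr -d ' '
}

# If the config contains non-printable bytes and a .bak copy exists, keep the
# damaged file as <config>.corrupt and copy the backup into place.
restore_config_if_corrupt() {
    cfg=$1
    bak=$cfg.bak
    [ -f "$cfg" ] && [ -f "$bak" ] || return 1
    if [ "$(count_nonprintable "$cfg")" -gt 0 ]; then
        mv "$cfg" "$cfg.corrupt" && cp -p "$bak" "$cfg" &&
            echo "restored $cfg from $bak"
    else
        echo "$cfg looks clean; left untouched"
    fi
}

# Example use (as root), followed by a restart of the font server:
#   restore_config_if_corrupt /usr/X11R6/lib/X11/fs/config
#   service xfs restart
```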

I have 5 other FC4 units and checked them for this file: none of them has a
config.bak, but on every computer that had an unusable gui this file had
been corrupted.  No user admits to changing this file, and one
of these units was a fresh install with an immediate "yum -y update".

Although this problem is resolved very easily, once identified, it appears to me
there may be some scripting errors during a recent module update.

I would like to thank you for your patience in helping me fix this,
and my compliments on your accuracy: this really was a config problem. My
concluding suggestion would be that this bug can be closed once the
affected module update is corrected.

If I can do anything to help you identify this module I would be pleased to do
so.... let me know.

Thanks again!!!

Greg Ennis
P.S. I will also post a notice on the user's list, since I would not be
surprised if this bug occurs again.


Comment 11 Greg Ennis 2006-01-20 17:46:45 EST
Created attachment 123508
Corrupted xfs config file

Here is the corrupted xfs config file.  There are some characters stored that
do not display on a terminal screen.
Comment 12 Gerry Tool 2006-01-21 10:42:46 EST
Thank you Greg and Mike for your persistence.  I have had exactly the same
problem - I have an ATI radeon 7500 card in a Soyo MB home built system and
experienced exactly the same events as Greg.  I was able to ignore it because I
have other partitions with other distros and went on working with those.  This
morning I decided I really wanted to get my FC4 partition back in use as my main
system and started looking in the mailing lists and found Greg's posts and
followed to this bug report.

Just as Greg did, I replaced my xfs config file with a .bak version that was
luckily there, and it fixed the problem.

Thanks again.
Comment 13 Gerry Tool 2006-01-21 11:05:58 EST
I decided I should also add that my system went south just as Greg described his
did.  I noted in a log I keep that it happened on 1/14/06.  I had logged into
KDE to see whether recent updates had made much of a change (I usually use
gnome), and the screen froze after a while.  When I rebooted I had the "Could
not init font path element unix/:7100, removing from list" error.

I probably had done a yum update that day, as I do most days to check for
updates. I reboot several times per day using other distros or Windows XP for a
scanner that isn't yet supported in xsane, so the update that caused the problem
was probably one from that day or the previous day.

Maybe this will help understand which update was responsible.
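
One way to narrow down which update was applied on a given day is to filter the yum log. The sketch below assumes the log line layout quoted in the original description ("Jan 14 14:24:38 Updated: flex.i386 2.5.4a-35.fc4") and a default log location of /var/log/yum.log; the function name updates_on is hypothetical.

```shell
# Hypothetical helper: list packages that yum logged as updated on a given day.
# Usage: updates_on MONTH DAY [LOGFILE]    e.g. updates_on Jan 14
updates_on() {
    awk -v mon="$1" -v dom="$2" '
        $1 == mon && $2 + 0 == dom + 0 && $4 == "Updated:" { print $5, $6 }
    ' "${3:-/var/log/yum.log}"
}

# Example use, comparing the day of the failure with the day before:
#   updates_on Jan 13
#   updates_on Jan 14
```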
Comment 14 Greg Ennis 2006-01-25 13:03:15 EST
New information:  This may be a cups web edit problem

I made an observation yesterday that needs to be posted here.  I was working on
setting up remote printing using the cups web admin system.  When I rebooted,
this machine had the same xfs error, and my gdm gui was gone.

In checking /usr/X11R6/lib/X11/fs/config I found it was corrupted exactly like the
config file I posted with this bug report.  These two new problems occurred on
two different machines: one was a Compaq, and the other was a "Fry's" special.
There were no recent updates on either machine, but I was trying to update the
cups system on both with their web edit process, i.e. http://localhost:615

I am inclined to believe that there is no causative relationship with either the
kind of PC or the yum update process.

I am now suspicious of a bug related to the cups web edit process.  I will do
some additional testing to see if I can make this predictable.  
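
One way to make such testing repeatable is to record a checksum of the config file before each change and compare afterwards. This is a minimal sketch with hypothetical helper names; the snapshot path in the example comment is only an illustration.

```shell
# Hypothetical helpers: record a checksum of a file, then report whether the
# file differs from that record.
snapshot_file() {
    cksum < "$1" > "$2"
}

# Succeeds (exit status 0) if the file no longer matches its snapshot.
file_changed_since() {
    [ "$(cksum < "$1")" != "$(cat "$2")" ]
}

# Example use around a single change in the cups web interface:
#   snapshot_file /usr/X11R6/lib/X11/fs/config /root/xfs-config.cksum
#   ... make one change in the web interface ...
#   file_changed_since /usr/X11R6/lib/X11/fs/config /root/xfs-config.cksum \
#       && echo "xfs config changed"
```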
Comment 15 Mike A. Harris 2006-06-27 12:52:49 EDT
This bug is far too vague, and it is not possible to reproduce during
testing.

If the problem is still occurring, it is recommended to upgrade to Fedora
Core 5 or later.
