Bug 110124 (nvidiadri)

Summary:	3D OpenGL screensavers don't work after upgrade to Fedora
Product:	[Fedora] Fedora	Reporter:	Wendigo <wendigo3>
Component:	XFree86	Assignee:	Mike A. Harris <mharris>
Status:	CLOSED NOTABUG	QA Contact:
Severity:	medium	Docs Contact:
Priority:	medium
Version:	1	CC:	bart.martens, carwyn, gers4302, marcjw53, rebus
Target Milestone:	---
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2003-11-29 10:59:38 UTC	Type:	---

Description Wendigo 2003-11-14 22:54:43 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.4.1)
Gecko/20031030

Description of problem:
After upgrading from RH9 to Fedora Core 1, 3D or OpenGL screensavers
stop working.
The first one you choose in the preview window works; the next
ones don't show a preview. None of them seem to work when chosen.

Using Nvidia GeForce 4 MX440 with the nvidia driver successfully installed
and running other software OK (bzflag, Enemy Territory)


Version-Release number of selected component (if applicable):
xscreensaver-4.14-2

How reproducible:
Always

Steps to Reproduce:
1. Choose preferences -> screensaver
2. Try to preview a screensaver that uses OpenGL
3. The first one will work, the next ones will crash
    

Actual Results:  The screensavers don't work at all.

Expected Results:  The OpenGL screensavers should work in preview or
full screen mode.

Additional info:

Comment 1 Bill Nottingham 2003-11-14 23:40:29 UTC

This sounds like an issue with the nvidia driver or X; this works fine
for me with the radeon driver.

Comment 2 Wendigo 2003-11-15 00:10:30 UTC

Remove the following package before installing the drivers, or many GL
applications (including screensavers) will not work:

rpm -e --nodeps XFree86-Mesa-libGL

Comment 3 Wendigo 2003-11-15 00:11:43 UTC

Sorry, incomplete report. After removing this...

rpm -e --nodeps XFree86-Mesa-libGL

The screensavers work now without problem.

Comment 4 Mike A. Harris 2003-11-15 02:04:11 UTC

Since many people are encountering this problem and I'm having to
explain it over and over to lots of people, I've decided to make
this bug report the master bug duplicate for this specific
Nvidia driver installation induced problem.  I've given this bug
the alias "nvidiadri" also to make it easier to close dupes with.


Full details:

Nvidia's proprietary video drivers do not come in RPM package format.
They did at one point in time, but I'm not sure when they stopped
supplying rpm packages for their drivers.  Instead there is a binary
installer that you run, which unpacks the archive and drops the
driver files into the distribution outside of rpm context.

In addition to that, Nvidia's installer deletes various files that
are part of XFree86 which ship with Red Hat Linux including the
libglx.a X server extension and our Mesa libGL.  Those files are
replaced by Nvidia's proprietary libglx.a and libGL, which only
work with their proprietary driver.  That means that once you install
Nvidia's video driver, if you ever use any other video driver
at all, be it the distribution supplied "nv" driver, or any other
video driver, you have no more OpenGL support at all, as their
GLX and libGL only work with their hardware, period.  Since we do not
support Nvidia's proprietary drivers, any OpenGL related problems
that occur on any system that has had Nvidia's proprietary driver
installed on it are not supported by Red Hat until the user manually
reinstalls the Red Hat supplied XFree86-Mesa-libGL and XFree86
packages in order to restore the system supplied files Nvidia's
installer gratuitously deletes.

I mention the above in this bug report just for some background
into some of the problems caused by Nvidia's current driver
installation method, in hopes it is useful to understand the problem
reported in this bug report, and perhaps other problems Nvidia
users encounter.

The specific problem reported in this bug report will happen for
_all_ Nvidia proprietary driver users on all Nvidia hardware, and
it is caused by Nvidia's driver installation mechanism not using
RPM as it should.  If they used RPM, then this type of problem
would be avoidable by using RPM's conflicts mechanism and stating the
following in their spec file:

    Conflicts: XFree86-Mesa-libGL

Their installation program *should* be uninstalling the Red Hat
supplied XFree86-Mesa-libGL package in order to remove our libGL,
or it should be installing their libGL elsewhere in the system
and using an LD_PRELOAD script dropped in /etc/profile.d, or some
other solution than randomly deleting distribution supplied files
that are managed by rpm.  They also should NOT be deleting any
of our XFree86 supplied X server modules as that causes Nvidia
users problems if they switch hardware or decide to use the 'nv'
driver.
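
As a rough illustration of the LD_PRELOAD alternative mentioned above,
a profile script might look something like the following.  This is only
a sketch: the script name, the /opt/nvidia/lib path and the
append_preload helper are hypothetical and are not anything Nvidia
ships.

```shell
# Hypothetical /etc/profile.d/nvidia-libgl.sh (name and paths are assumptions).

# Print the LD_PRELOAD value that results from appending library $2 to the
# current value $1.  The value is printed unchanged if $2 is not readable.
append_preload() {
    current=$1
    lib=$2
    if [ -r "$lib" ]; then
        # Add a colon separator only when there is already a value.
        printf '%s\n' "${current:+$current:}$lib"
    else
        printf '%s\n' "$current"
    fi
}

# Hypothetical private location for the vendor libGL.
NVIDIA_LIBGL=/opt/nvidia/lib/libGL.so.1
LD_PRELOAD=$(append_preload "${LD_PRELOAD:-}" "$NVIDIA_LIBGL")

# Only export when there is actually something to preload.
if [ -n "$LD_PRELOAD" ]; then
    export LD_PRELOAD
fi
```

With a layout like this the distribution libGL stays where rpm put it,
and the vendor library is only picked up when it is actually present.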

However, currently they do just delete these files, and
that mechanism is not foolproof because it wrongly assumes:

1) That there is only _one_ system supplied libGL shared library

2) That the libGL supplied by the system is installed in a specific
   location that will never change

Those assumptions may have been true in previous operating system
releases, but only by random chance.  There is nothing anywhere
that specifies that multiple libGL's can't be supplied by the
operating system, nor where they must be installed.  The OpenGL
ABI for Linux on x86 states only that /usr/lib/libGL* must either
be the system libGL, or that it must be a symbolic link to the
system libGL.

The libGL which ships with Fedora Core 1 contains several optimizations
which were done by Jakub Jelinek to improve application startup
time by making libGL prelinkable and dramatically reducing the amount
of dynamic relocation processing that needs to be done at application
startup time.  This also improves runtime performance by having
fewer symbols go through the GOT.  Another performance
improvement done by Jakub is that thread local storage (TLS) support
has been added to libGL.  For maximum performance with TLS, i686
instructions are used which are not present on i586 or earlier
processors, and are not present on some i686 class hardware which
does not implement the optional instructions in the i686 architecture.

As such, in order to both get this performance enhancement, and
also still provide libGL compatibility with Cyrix i686 CPUs, AMD
K5/K6 etc. CPUs and other similar chips, we have supplied 2 libGL
libraries, one which should work on all systems out there, which
is optimized for some performance gains, and a second libGL which
is specifically optimized for i686 class CPUs.  The compat libGL
resides in the normal location, and the i686 libGL resides in
the "tls" subdirectory under that.

Since Nvidia's installer only looks for the one single libGL when
it goes on its deleting rampage, it misses the second libGL
that we supply.  When you start any OpenGL application on a true
i686 compatible chip which implements the optional i686 instructions,
the system detects whether you have an i686 compatible processor
and whether TLS support is available on your system, and
it will use the TLS libGL instead of the /usr/lib/libGL.so.*
library if your system supports TLS.  Since this library is not the
Nvidia libGL library, it will attempt to connect to the DRI
extension; however, Nvidia does not support the DRI extension or
use DRI in any way, so the application will fail with the error:

    Xlib:  extension "XFree86-DRI" missing
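
To check whether a second libGL has been left behind, something along
these lines can list every copy in a library directory together with
its "tls" subdirectory.  It is a minimal sketch; the helper name
list_libgl_copies is made up for illustration.

```shell
# List every libGL file in a library directory, including its "tls"
# subdirectory.  libGLU is deliberately not matched by the pattern.
list_libgl_copies() {
    libdir=$1
    for f in "$libdir"/libGL.so* "$libdir"/tls/libGL.so*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        if [ -e "$f" ] || [ -L "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage on a Fedora Core 1 system (not run here):
#   list_libgl_copies /usr/X11R6/lib
```

If entries from both the plain directory and the tls subdirectory show
up after the Nvidia installer has run, the tls copy is most likely
still the Mesa one.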

We can't fix Nvidia's installer, however they will likely fix it in
a future driver update.  Hopefully they will fix it properly and
make it uninstall XFree86-Mesa-libGL first, to avoid this and any
future problems in future libGL enhancements.  Until Nvidia fixes
their installer however, or changes their installation and
configuration process to not conflict with or delete the XFree86
supplied files we ship, users need to manually work around this
problem by doing either:

rm -f /usr/X11R6/lib/tls/libGL*

or

rpm -e --nodeps XFree86-Mesa-libGL


Either of the above solutions will work around this Nvidia driver
installation bug for the time being until Nvidia is able to provide
users with a fixed driver package.

Note that every time you upgrade XFree86 on your system, the
XFree86-Mesa-libGL package will be installed in order to meet
libGL dependency requirements.  That wouldn't occur if Nvidia's
drivers were in rpm format and provided libGL that way, as their
package would satisfy the libGL requirement then.

Hopefully in the future Nvidia will find a better method of
driver installation that is more harmonious with the operating
system installation, and users won't experience these problems
during upgrades.

Hopefully this will help users who are experiencing this problem
to both work around the issue, and also to understand it and other
similar related problems in the future.

Comment 5 Mike A. Harris 2003-11-15 02:11:16 UTC

*** Bug 107841 has been marked as a duplicate of this bug. ***

Comment 6 chemist109 2003-11-22 01:07:40 UTC

Thanks for the information, Mike.  One additional note, if you use the
second method for fixing the problem:

rpm -e --nodeps XFree86-Mesa-libGL

Apt shows at least 20 broken packages.  I suggest that users who also
use Apt to update software go with the first method:

rm -f /usr/X11R6/lib/tls/libGL*

I re-installed XFree86-Mesa-libGL then deleted
/usr/X11R6/lib/tls/libGL* and everything seems to work fine including
Synaptic (yay!).

Comment 7 chemist109 2003-11-24 03:51:26 UTC

Addendum:

Several SDL apps now segfault-- specifically: Tuxracer, Frozen Bubble,
and Foobillard.  I am not certain that it is the result of removing
the Mesa libGL files, but it seems likely. 

Frozen bubble outputs:
[SDL Init] [Graphics...] [Levels] Fatal signal: Segmentation Fault
(SDL Parachute Deployed)
Segmentation fault

Tuxracer:
Fatal signal: Segmentation Fault (SDL Parachute Deployed)
Segmentation fault

Foobillard:
Fatal signal: Segmentation Fault (SDL Parachute Deployed)

GLTron and Chromium work fine, so it appears to be just SDL and not
OpenGL.

Comment 8 Carwyn Edwards 2003-11-28 01:24:51 UTC

Nvidia do still provide SRPMS:

ftp://download.nvidia.com/XFree86/Linux-x86/1.0-4496/

.. they just don't advertise them, and they are still broken.

ATrpms: http://atrpms.physik.fu-berlin.de/dist/fc1/nvidia-graphics/

.. have ones that do work though.

Comment 9 Mike A. Harris 2003-11-29 10:52:16 UTC

Comment #6:

The problem with doing:

rm -f /usr/X11R6/lib/tls/libGL*

is that your rpm database says that it has Mesa-libGL installed,
but those libraries are not installed anymore.  While you may
satisfy rpm this way, you may experience software failures if
there are differences between Mesa-libGL and whatever libGL you
have installed.
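
One way to see the mismatch described above is rpm's verify mode
(rpm -V), which reports files the database expects but cannot find.
The small filter below is only a sketch: the name missing_files is made
up, and it assumes paths without spaces.

```shell
# Read `rpm -V` output on stdin and print the path of every file that rpm
# reports as missing.  Other verification lines are ignored.
missing_files() {
    awk '$1 == "missing" { print $NF }'
}

# Usage (not run here):
#   rpm -V XFree86-Mesa-libGL | missing_files
```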



Comment #7:

You're using unsupported Nvidia drivers now; any problems
encountered are problems you'll need to work out yourself,
or discuss on the mailing lists to seek technical support
help.  We don't provide tech support in bugzilla though.

Comment 10 Mike A. Harris 2003-11-29 10:59:38 UTC

Closing bug as NOTABUG, as this is a proprietary driver installation
and configuration issue, not a bug in our XFree86.

Comment 11 Bart Martens 2003-12-25 10:14:21 UTC

I referred to this nvidiadri bug on the nvidia-forum.
http://www.nvnews.net/vbulletin/showthread.php?s=&threadid=22712

Comment 12 Bart Martens 2003-12-27 11:10:59 UTC

Mike, any comments on the latest messages on the nvidia forum?
http://www.nvnews.net/vbulletin/showthread.php?s=&postid=251300#post251300

Comment 13 Mike A. Harris 2003-12-29 13:48:37 UTC

Sure, I'll comment on it.  Red Hat's location where the real OpenGL
library gets installed on the system, is:

1) 100% compliant with the OpenGL on Linux ABI specification, which
   states clearly that the libGL and libGLU libraries must be either

   a) Installed in /usr/lib and headers in /usr/include

   *OR*  (Yes, pay very close attention here Nvidia forum users, you
          may learn something)

   b) Located elsewhere in the system, with symbolic links in
      /usr/lib and /usr/include which point to the real OpenGL
      libraries.

2) The location where libGL gets installed, is the location where
   XFree86 itself installs the libraries to by default (and always
   has).  We go out of our way to ensure that the OpenGL ABI on Linux
   (for x86) is adhered to, by putting the requisite symbolic links
   in place.
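
Anyone who wants to see where the symbolic links actually lead can
resolve them with something like the following.  This is a sketch that
assumes GNU readlink is available; the helper name real_lib is
hypothetical.

```shell
# Print where a library path really points, following symbolic links.
# Prints nothing and returns non-zero if the path does not exist.
real_lib() {
    [ -e "$1" ] || return 1
    readlink -f "$1"
}

# Usage (not run here):
#   real_lib /usr/lib/libGL.so.1
```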

Modern systems allow overriding of system libraries with
per-CPU-type libraries which are customized for more advanced
processor revisions, which can be autodetected at runtime.  This
is done in the dynamic linker (ld.so), which will determine what
processor family is running on the current system, and what
capabilities it has.  If there are system libraries installed
which are optimized for a particular newer generation of processor
family such as i686, it will automatically look for i686 customized
override libraries.  The same is done with the TLS versions of
libraries.
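
The lookup just described can be modelled very roughly as "try the most
specific subdirectory first, then fall back".  The sketch below is a
simplified model only: the subdirectory list and the pick_library name
are illustrative assumptions, and the real ld.so considers more
capability combinations than this.

```shell
# Simplified model of a capability-aware library lookup: return the first
# existing candidate, most specific subdirectory first, plain directory last.
# The subdirectory list is an illustrative assumption, not ld.so's real table.
pick_library() {
    libdir=$1
    name=$2
    for sub in tls/i686/ tls/ i686/ ""; do
        candidate=$libdir/$sub$name
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage (not run here):
#   pick_library /usr/X11R6/lib libGL.so.1
```

In this model a libGL left in the tls subdirectory always wins over the
one in the plain directory, which matches the behaviour reported in
this bug.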

This is done for multiple (very good) reasons.  One, it allows
a single system installation to contain multiple different versions
of a given library, each optimized for a specific processor
generation which may have significant improvements over previous
generations.  No special end user configuration of the system or
library installation management is necessary or required, since
the multiple library variants co-exist nicely, which is a great
feature.  Secondly, it allows the processor type and features to be
autodetected at runtime, and the best library used for the given CPU
which will provide the best performance, etc.

Nvidia's installer makes assumptions about things that it should not,
and it also does not install in a manner that is considered "clean"
in terms of RPM based package management standards.

Nvidia users who get nailed by the flaw in Nvidia's installation
process may have an axe to grind out of their frustrations, and may
be looking for a scapegoat to blame their problems on.  I can't say
I blame them for being frustrated about their problems, nor for
looking for someone to cast blame upon, as they are just looking
for a solution that works, and doesn't require a lot of effort
from them.

However, casting blame does not give them what they want, which
is a working solution, and casting blame upon Red Hat in their
frustration also serves no purpose, as we comply with the OpenGL
ABI on Linux (which perhaps they should actually *read* some day
instead of just whining and making invalid assumptions themselves).

For those who are interested in a drop in solution, Nvidia has
graciously permitted people to take their driver tarballs and
repackage them for their favourite distribution.  This will allow
some enthusiastic user out there to at least theoretically create
RPM packages which will cleanly solve all of the problems that
people are having.  Perhaps if someone volunteers to create such
rpm packages, Nvidia may be kind enough to host them on their
download site even.

At any rate, this is not a Red Hat bug.  The _bottom_line_, is that
the Nvidia installer makes assumptions about things on a system
which it should not ever have made, but which through random chance
happened to at one time hold true, and so it happened to work, and
now those assumptions are no longer true.  There is nothing anywhere
in any standard document that declares that alternative OpenGL
libraries may not be installed elsewhere on a Linux system and managed
via symlinks or via the dynamic linker that Linux systems use.

Please seek a scapegoat elsewhere.

Comment 14 Mike A. Harris 2003-12-29 13:50:39 UTC

*** Bug 109745 has been marked as a duplicate of this bug. ***

Comment 15 Michal Ambroz 2004-01-02 17:44:42 UTC

Nvidia drivers packed in src.rpms and working for RH9 and Fedora Core
1 can be found also at:
http://rebus.webz.cz/#nvidia

For Fedora Core you need gcc32 in order to recompile this package well.

Yours sincerely

                 Michal Ambroz

Comment 16 Axel Thimm 2004-03-01 11:26:35 UTC

rpms for nvidia drivers (for 4363, 4496, 4620, 5328, 5336) fulfilling
Mike's specs (no overwriting or deinstallation of Mesa parts, clean
(de)installation procedures etc.) can be found at

http://atrpms.physik.fu-berlin.de/name/nvidia-graphics/

(built for FC1, RH9, RH8.0, RH7.3)

Furthermore they are concurrently installable, and can be switched
from outside X (rmmod the old module manually) with a common script
called nvidia-graphics-switch.