Red Hat Bugzilla – Bug 214050
/proc/bus/pci/devices missing entry (was Xorg PCI scan misses video card)
Last modified: 2007-11-30 17:11:47 EST
Description of problem: After upgrading from FC5 to FC6, my integrated ATI Rage
XL is no longer detected, with Xorg failing with "no device found". Some clues:
- worked in FC5, not in FC6
- scanpci shows my video device
- Xorg -scanpci does NOT show my video device
- video is the last one enumerated by scanpci. Perhaps a new off-by-one error?
Or perhaps previous device terminated the scan?
- strace indicated a brute force scan of all possible pci devices which stops
abruptly at /proc/bus/pci/03/03.0, while my device is the next (and last) at
- See attached zip file containing useful output (xorg.conf, Xorg.0.log, scanpci,
Xorg-scanpci, strace, etc)
- My hardware: Dell PowerEdge 700, with integrated ATI Rage XL
Version-Release number of selected component (if applicable):
How reproducible: Totally (on my machine)
Steps to Reproduce: Visit me :-)
Created attachment 140379 [details]
Zip file of key text files: (xorg.conf, Xorg.0.log, scanpci, Xorg-scanpci, strace, etc)
Also submitted to freedesktop.org as bug number 8894, with more recent clues.
I could NOT seem to add that reference to this ticket via the "External Bugzilla
References" sub-form herein, so I'm just doing it as a manual comment.
Okay, based on a debugging trail (logged in the freedesktop bugzilla #8894), it
has become apparent that this is really not an Xorg bug, but a bug in the
/proc/bus/pci logic (kernel?). So I'm going to try to reassign this. Here is
the AHA entry from my freedesktop bug report.
Hmmm. Looks like an OS issue. After stepping through the Xorg scan with gdb I
noticed that we are stopping after scanning 14 PCI devices, although my video
card is the 15th (and last).
It turns out that there is a mismatch between the contents of
/proc/bus/pci/devices (14 devices) and the nodes in /proc/bus/pci/xx/* (which
The device missing from /proc/bus/pci/devices is /proc/bus/pci/00/06.0.
Xorg is getting a count of PCI devices by counting the lines in
/proc/bus/pci/devices (function xf86OSLinuxGetPciDevs in lnx_pci.c). Since this
is missing one PCI device, the subsequent scan stops prematurely, which is only a
problem if your video device is the last one. Mine is.
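To illustrate the failure mode described above, here is a minimal sketch of a scan that trusts a device count taken from a listing. This is NOT the Xorg code from lnx_pci.c; the function name and the device names (dev00..dev13, video) are hypothetical and exist only for illustration.

```python
def count_limited_scan(present_nodes, listed_count):
    """Visit nodes in enumeration order, stopping once listed_count were found.

    Hypothetical model of a scan that takes its device count from a listing
    (such as the line count of /proc/bus/pci/devices) instead of from the
    nodes that actually exist.
    """
    found = []
    for node in present_nodes:
        if len(found) >= listed_count:
            break  # the scan believes it has already seen every device
        found.append(node)
    return found


# Hypothetical layout: 15 nodes exist and the video device is enumerated last.
nodes = [f"dev{i:02d}" for i in range(14)] + ["video"]

# The listing is one entry short, so the scan gives up after 14 devices.
short_scan = count_limited_scan(nodes, 14)
print("video found with count 14:", "video" in short_scan)

# With the correct count the last device is reached.
full_scan = count_limited_scan(nodes, 15)
print("video found with count 15:", "video" in full_scan)
```

Only the last-enumerated device is lost in this model, which matches why the mismatch goes unnoticed on most machines.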
So there is clearly an OS problem, which I will try to figure out how to submit.
Can anybody suggest where? For that matter, is anybody reading this stuff?
Maybe a triage team? Some feedback would make me feel less lonely :-)
I'm guessing at the assignment. Changing the category did NOT change the
assignment from the original auto-generated value of X/OpenGL maintenance, which
seems to be quite wrong given the change in component.
And I'm trying for a better Summary. Sorry for not doing this all in one change.
I don't know if you have mc installed, but you can run lspci |grep VGA to get
the video card's PCI ID. Afterwards you can open the devices file with F4 to
view the file details.
I have two video cards
lspci |grep VGA
01:00.0 VGA compatible controller: nVidia Corporation NV5M64 [RIVA TNT2 Model
64/Model 64 Pro] (rev 15)
02:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200 (rev 01)
See attachment for screenshot. I take it 0100 is the NV and 0200 is the Matrox.
Sorry for the interruption. Hopefully someone who knows what the heck they are
doing will reply to this bug. Mention on the list should have added visibility.
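The guess above about 0100 and 0200 can be checked with a small decoder. This is a sketch, assuming the first column of /proc/bus/pci/devices is four hex digits: two for the bus number and two for devfn (slot in the upper five bits, function in the lower three). The function names are made up for this example.

```python
def decode_devices_key(key):
    """Split the first column of a /proc/bus/pci/devices line.

    Assumes the field is the bus number followed by devfn, both in hex,
    where devfn packs the slot (upper 5 bits) and function (lower 3 bits).
    """
    value = int(key, 16)
    bus = value >> 8
    devfn = value & 0xFF
    return bus, devfn >> 3, devfn & 0x7


def format_address(bus, slot, func):
    """Render bus/slot/function the way lspci prints it, e.g. 01:00.0."""
    return f"{bus:02x}:{slot:02x}.{func:x}"


for key in ("0100", "0200"):
    print(key, "->", format_address(*decode_devices_key(key)))
```

Under that assumption, 0100 and 0200 correspond to the 01:00.0 and 02:00.0 lines in the lspci output above.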
Created attachment 141216 [details]
viewing the devices file in /proc/bus/pci
You might find the F4 edit feature for mc useful to read content. Maybe!
Your bug seems to be far outside my knowledge base.
Please attach the output of dmesg.
Created attachment 141225 [details]
dmesg output after reboot with failing Xorg server
Here is the dmesg output.
Sorry for the delay. A few days ago I built a modified Xorg that just bumped
the device count by one as a totally crude workaround for the pci scan problem.
Thought it prudent to roll back to the nominal Xorg prior to generating your
dmesg listing (just in case it affected anything you were looking for).
It's way past bedtime, so goodnight :-)
I think the previously attached files clearly indicate a bug in the pci related
processing for the procfs filesystem (as indicated by the fact that
/proc/bus/pci/devices contains a different number of PCI devices than the
/proc/bus/pci/xx/* device entries).
1) Is there any reason not to just report this upstream? That is, is there any
reason to suspect this is some bug added by Fedora customization of the
2) Am I correct in assuming that the distro maintainers are the proper
gatekeepers for submitting kernel bugs? If not, should I just go ahead and
submit this stuff myself?
Significant Update - I finally figured out how to download a vanilla 18.104.22.168
kernel from www.kernel.org and build it. The big surprise was that in this
vanilla kernel, there is NO mismatch between /proc/bus/pci/devices and
Wow! So it seems like the Fedora kernel mods may well be the culprit. However,
at this point I'm totally unsure of how to proceed. There are a tremendous
number of differences between a vanilla 22.214.171.124 kernel and the Fedora
Any suggestions on next steps?
New conclusion: It appears the bug is associated with the CONFIG_EXPERIMENTAL
flag in the stock 2.6.18 kernel.
Details: I rebuilt the vanilla 2.6.18 kernel two ways:
1) With FC6 .config file (which sets EXPERIMENTAL=y) - this manifests the bug
2) With FC6 .config file, (BUT setting EXPERIMENTAL undefined) - no bug!
So, it's an upstream bug. Could somebody please suggest what the correct next
Try building with EXPERIMENTAL set but without the 82875 EDAC driver (should be
CONFIG_EDAC_82875P). I suspect it's doing something unpleasant that ends up
hiding it from /proc/bus/pci/devices.
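Before rebuilding, it may help to confirm what a given .config actually sets. The following is a rough sketch, assuming the usual kernel .config conventions ("CONFIG_FOO=y" for enabled options, "# CONFIG_FOO is not set" for disabled ones). The sample text is hypothetical, and the EDAC option name is taken as written in the comment above, not verified against the kernel tree.

```python
import re


def option_state(config_text, option):
    """Return the value assigned to option (e.g. 'y' or 'm'), or None.

    None covers both '# CONFIG_FOO is not set' and a missing option.
    """
    pattern = re.compile(rf"^{re.escape(option)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Hypothetical .config fragment for the suggested test build.
sample_config = """\
CONFIG_EXPERIMENTAL=y
# CONFIG_EDAC_82875P is not set
"""

for name in ("CONFIG_EXPERIMENTAL", "CONFIG_EDAC_82875P"):
    print(name, "=", option_state(sample_config, name))
```

To check a real build tree, read its .config file into a string and pass that in place of sample_config.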
Or, build the kernel with this patch:
This patch _is_ correct, but it doesn't appear to be in either the FC6 or
rawhide kernels yet.
It's in rawhide now, I'll poke Chuck to get it into FC6 updates too.
Is somebody going to submit this patch upstream?
My problem is resolved by FC6 kernel 2.6.20-1.2944.fc6. In this release the
contents of /proc/bus/pci/devices and the nodes in /proc/bus/pci/xx/* agree in
the number of devices (both indicate 15).
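For anyone who wants to repeat this check, here is a minimal sketch that compares the two views. It assumes a single PCI domain, two-hex-digit bus directories with nodes named slot.function, and a devices file whose first column is bus and devfn in hex. The function names are my own and not part of any tool.

```python
import os
import re

# Node names look like "06.0": two hex digits for the slot, one for the function.
NODE_RE = re.compile(r"^([0-9a-f]{2})\.([0-7])$")
# Bus directories look like "00" or "03" (single PCI domain assumed).
BUS_RE = re.compile(r"^[0-9a-f]{2}$")


def listed_devices(root):
    """Return (bus, slot, func) tuples parsed from <root>/devices."""
    devices = set()
    with open(os.path.join(root, "devices")) as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue
            key = int(fields[0], 16)  # bus in the high byte, devfn in the low byte
            devfn = key & 0xFF
            devices.add((key >> 8, devfn >> 3, devfn & 0x7))
    return devices


def node_devices(root):
    """Return (bus, slot, func) tuples for the nodes under <root>/<bus>/."""
    devices = set()
    for bus_name in os.listdir(root):
        bus_path = os.path.join(root, bus_name)
        if not BUS_RE.match(bus_name) or not os.path.isdir(bus_path):
            continue
        for node in os.listdir(bus_path):
            match = NODE_RE.match(node)
            if match:
                devices.add((int(bus_name, 16), int(match.group(1), 16), int(match.group(2))))
    return devices


def missing_from_listing(root="/proc/bus/pci"):
    """Devices that have a node but no line in the devices file."""
    return sorted(node_devices(root) - listed_devices(root))


if __name__ == "__main__" and os.path.isfile("/proc/bus/pci/devices"):
    for bus, slot, func in missing_from_listing():
        print(f"missing from devices file: {bus:02x}/{slot:02x}.{func:x}")
```

An empty result means the two views agree. On the affected kernels I would expect this to report the single missing node.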
The previous release (2.6.20-1.2933.fc6) did NOT resolve the problem, so
thank you to whoever fixed the problem between 2933 and 2944.
I have no idea if ALL of the issues discussed on this list have been fixed. I
suspect not, since there seem to be several different chunks of code that need
to arrive at the same conclusion about what PCI devices exist, which is a recipe
for future problems. At present, on my particular hardware configuration, the
various code paths seem to be in agreement.
Thanks again to all concerned!