Description of problem:
Up to now all rawhide kernels, up to and including 2.6.23-6.fc8, booted without any problems on my SK8V x86_64 machine. This is no longer true with 2.6.23.1-23.fc8. It appears to be affected by the problem already reported in bug 249174 for fc6 and f7 kernels. The line "agpgart: Detected AGP bridge 0" shows up on the screen and that is it.

The crucial difference is that the hack described in https://bugzilla.redhat.com/show_bug.cgi?id=249174#c49, reducing the amount of available memory, does not seem to work anymore. I tried different values, and also other possible options together and separately, without any effect. The only way to boot this kernel is to add "agp=off", but then the machine is nearly unusable with a graphical desktop.

The other difference from bug 249174 is that no matter what I try I cannot provoke any of the panics and backtraces which were showing up there. The kernel just sits there with "Detected AGP bridge 0". What got broken between 2.6.23 and 2.6.23.1?

Version-Release number of selected component (if applicable):
kernel-2.6.23.1-23.fc8

How reproducible:
always
I checked the kernels available at koji. The first rawhide kernel after 2.6.23-6.fc8, i.e. kernel-2.6.23.1-11.fc8, does not boot anymore (unless the "agp=off" hammer is used) and kernel-2.6.23.1-26.fc8 does not change anything in this picture. So, judging from the changelog, the likely candidates appear to be either the update to 2.6.23.1 or "Disable debug".
(In reply to comment #1)
> I checked kernels available at koji. The first rawhide kernel after
> 2.6.23-6.fc8, i.e. kernel-2.6.23.1-11.fc8, does not boot anymore
> (unless "agp=off" hammer is used) and kernel-2.6.23.1-26.fc8 does not
> change anything in this picture. So from changelog likely candidates
> appear to be either 2.6.23.1 or "Disable debug".

Hmm, can you try kernel-debug-<currentversion>? That has the same config as the rawhide kernels before "disable debug" was implemented.
This is probably a dupe of bug 336281.
> This is probably a dupe of bug 336281.

Yes, indeed. It looks that way. Bug 249174 is also related, as noted before. The only nasty thing is that so far rawhide was booting and now it has stopped. Running with 'agp=off' and 'Option "BusType" "PCI"', as proposed in bug 336281, indeed seems to provide a workaround, but when you are trying to install in the first place you may run into some, ahem, difficulties.
Created attachment 234381
dmesg from a successful boot with a debug kernel

> can you try kernel-debug-<currentversion>?

This does indeed boot without any extra options. For what it is worth, here is dmesg from a boot with the 2.6.23.1-26.fc8debug kernel.
If you boot kernel-debug with the option slub_debug=- (a single minus sign), does the bug come back?
2.6.23.1-26.fc8debug with 'slub_debug=-' boots without problems. I also do not see any essential differences in dmesg. The biggest one is likely:

@@ -227,9 +225,7 @@
 NET: Registered protocol family 1
 NET: Registered protocol family 17
 powernow-k8: Power state transitions not supported
- Magic number: 11:413:294
- hash matches device hpet
- hash matches device tty27
+ Magic number: 11:43:604
 Freeing unused kernel memory: 708k freed
 Write protecting the kernel read-only data: 1088k
 ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21

which is way past the sore spot.
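For anyone repeating this comparison, here is a minimal sketch of one way to diff two saved dmesg captures. It is not the procedure used above, only an illustration. The function names and the file labels "dmesg.debug" and "dmesg.slub_debug_off" are hypothetical, and stripping a leading "[ seconds ]" timestamp is an assumption that only matters on kernels that print one.

```python
import difflib
import re

# Optional printk timestamp prefix such as "[   12.345678] " (assumed format).
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def normalize(log_text):
    """Split a dmesg capture into lines and drop any timestamp prefix."""
    return [_TIMESTAMP.sub("", line.rstrip()) for line in log_text.splitlines()]


def diff_dmesg(before, after, context=3):
    """Return the unified-diff lines between two dmesg captures."""
    return list(
        difflib.unified_diff(
            normalize(before),
            normalize(after),
            fromfile="dmesg.debug",          # hypothetical label
            tofile="dmesg.slub_debug_off",   # hypothetical label
            n=context,
            lineterm="",
        )
    )
```

Calling diff_dmesg() on the text of the two captures and printing the returned lines gives a hunk in the same style as the one quoted above. Removing timestamps first keeps lines that differ only in timing from showing up as changes.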
A stupid question. Does kernel debugging affect how __devinit is handled? I ask because when backtraces like those recorded in bug 249174 and bug 336281 do happen, they seem to consistently point fingers at 'int agp_add_bridge(struct agp_bridge_data *bridge)', called at the very bottom of 'static int __devinit agp_amd64_probe()', with the 'bridge' structure on the stack in this function, and every time memory is clearly corrupted. So maybe something gets dropped prematurely when debugging is off?
We do have a workaround for this:
1) Boot the installer with agp=off
2) Install and use the -debug kernel to get AGP working

So this does not have to be a blocker...
The original subject was explicit about the SK8V, as this was observed on that particular board. This really looks the same as bug 249174 and bug 336281, which have reports from a significantly wider range of boards, so that subject became misleading. The workaround proposed in bug 336281 is plausibly a better idea than the one from comment #9 (although it assumes that it is possible to configure a suitable video driver to use PCI instead of the default AGP).
This is surely a duplicate of bug #249174.
Marking as dup. Feel free to reopen if this understanding is incorrect.

*** This bug has been marked as a duplicate of 249174 ***