Red Hat Bugzilla – Bug 855275
Kernel Error during Bootup: [drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
Last modified: 2013-01-12 05:25:38 EST
Created attachment 610646 [details]
messages.txt from /var/log
I own a Lenovo E325 notebook (type NWX2UGE 12972UG) running F17 with the following kernel:
Linux localhost.localdomain 3.5.3-1.fc17.i686 #1 SMP Wed Aug 29 19:25:38 UTC 2012 i686 i686 i386 GNU/Linux
The built-in APU combines the CPU and a Radeon graphics core.
Whenever I boot Fedora I end up playing bingo. Quite often the system hangs during boot and never shows the blue Fedora logo.
I have to reboot multiple times (sometimes 5 times or more) to get the little logo to pop up and continue loading. So this is basically ALWAYS reproducible.
I already flashed a new BIOS on the notebook to see whether this solves the problem, but sadly it doesn't.
Attached is the message log, which provides more information about the architecture and system components. The log clearly shows a kernel issue with scheduling in the radeon driver, which by my understanding may cause this problem.
Reproduced on Sony VAIO model VPCYB1S1R, with AMD E-350+Radeon HD 6310 APU.
(In reply to comment #1)
> Reproduced on Sony VAIO model VPCYB1S1R, with AMD E-350+Radeon HD 6310 APU.
(Forgot) On 3.5.3-1.fc17.i686.PAE kernel
Also, randomly, on Samsung NP305U1A (AMD E-450).
Linux sam.daniele.vigano.me 3.6.2-4.fc17.x86_64 #1 SMP Wed Oct 17 02:43:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
[ 3.205027] [drm] Initialized radeon 2.24.0 20080528 for 0000:00:01.0 on mino
[ 13.364085] radeon 0000:00:01.0: GPU lockup CP stall for more than 10000msec
[ 13.364104] radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000001)
[ 37.260730] radeon 0000:00:01.0: couldn't schedule ib
[ 37.260742] [drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
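To check whether a given boot hit the same failure, the kernel log can be filtered for the messages quoted above. This is only a minimal sketch: the function name radeon_errors is made up for illustration, and the patterns are copied from the log lines in this report, so other kernels may word them differently.

```shell
#!/bin/sh
# Print the radeon lockup and IB-scheduling lines from a kernel log on stdin.
# The patterns come from the messages quoted in this bug report.
# "couldn.t" matches "couldn't" without needing a quote inside the pattern.
radeon_errors() {
    grep -E 'GPU lockup|couldn.t schedule ib|Failed to schedule IB'
}

# Example usage (reading the ring buffer or the log may require root):
#   dmesg | radeon_errors
#   radeon_errors < /var/log/messages
```

If the function prints nothing, none of these three messages were logged for that boot.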
workaround: put notebook in standby (in my case pressing the hw power button) and then resume from standby
Upstream bug report: https://bugzilla.kernel.org/show_bug.cgi?id=47481
> workaround: put notebook in standby (in my case pressing the hw
> power button) and then resume from standby
Doesn't work for me.
Created attachment 637051 [details]
Greetings from kernel-3.6.5-1.fc17.i686
This is what happened with today's update. Someone must have broken the gfx subsystem for this APU, which causes exactly this to happen.
Normally I need 10 reboots or so to get Fedora 17 loaded once because of the error. But now I need 10 reboots just to get exactly this, at the point where the blue Fedora spinner/logo usually pops up.
The square thing in this picture is the mouse pointer. Switching to a console works but gives a similarly broken picture.
So basically kernel-3.6.5-1.fc17.i686 is a no-go.
Downgrading to kernel-3.6.3-1.fc17.i686 works as it should, including the 10 or so reboots needed to get the system booted once.
And this on a notebook that Lenovo sells as fully Linux compatible.
Created attachment 637198 [details]
Archive with *Screenshots*
Reproduced on 3.5.4-3.6.3 kernels. But now it is something different...
It's the same "playing bingo" way of booting my Fedora 17, but:
* Turning off rhgb slightly increases the chances of booting.
* Sometimes it gets stuck on this message:
[ 5.948975] fb: conflicting fb hw usage radeondrmfb vs VESA VGA - removing generic driver
Blacklisting vesafb slightly reduces the chances of this appearing, but not to zero.
* Sometimes (more often when waking the notebook from hibernation) I see this, I forgot the name, "artifact". It can be seen in the photos (in the archive) I attached.
That's all I can say for now. Booting the system is still "playing bingo" style. Can't wait for a fix...
Created attachment 637484 [details]
Full of oops'es
I have come to the point where I am starting to regret having switched from stable Ubuntu to Fedora. Within the past 11 months (Fedora 16 to 17) I have run into so many problems caused by untested stuff being pushed to the masses by Fedora that I ask myself where the quality people are.
I spent the entire last evening and the entire day today trying to sort this out and get a "halfway working" system again. No luck!
* Kernel downgraded. No luck!
* xorg-ati downgraded. No luck!
* recreated initramfs. No luck!
* abrt downgraded. No luck!
* Only luck with "nomodeset", but this is worse than it was before with the 10 reboots.
Read the upstream bug report. Thanks for nothing, dudes!
Does anyone know a quick and dirty workaround to get the system usable again? Something specific to downgrade? Not "nomodeset" or things like this; I am using it right now. I hope the solution is not called Windows.
After some really frustrating hours I found something:
I stripped the system down to a minimum and tried to boot the kernel without background things.
I tried to recreate the initramfs with dracut and noticed that dracut was searching for ucode files related to radeon. I then went to the freedesktop page and read a few lines about the open source radeon driver, and learned that certain radeon cards may need the ucode files.
I realized that under some unknown circumstances the "linux-firmware" package was missing or had been removed from this installation. I will check my 2-day-old backup to see whether the package is present in the untouched, clean installation.
Try this. Maybe it helps:
sudo yum update
sudo yum install linux-firmware    # could be missing
sudo dracut --force --xz -v
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Somehow I got the system running again without these white stripes. I will try a couple of reboots from now on and see if things work out again. Nonetheless, the "playing bingo, reboot 10 times" issue still seems to exist. Maybe downgrading the ati driver helps too.
sudo yum downgrade xorg-x11-drv-ati
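Before rebuilding the initramfs it may be worth confirming that the radeon microcode is actually on disk. The following is a rough sketch under assumptions: has_radeon_firmware is a hypothetical helper name, /lib/firmware/radeon is assumed to be where linux-firmware places the files, and the files are assumed to end in .bin.

```shell
#!/bin/sh
# Succeed if the firmware directory holds at least one .bin microcode file.
# The default path is an assumption about where linux-firmware installs them.
has_radeon_firmware() {
    dir=${1:-/lib/firmware/radeon}
    # ls fails when the directory is missing or contains no .bin files
    ls "$dir"/*.bin >/dev/null 2>&1
}

# Example usage:
#   rpm -q linux-firmware        # is the package installed at all?
#   has_radeon_firmware || echo "no radeon ucode found, install linux-firmware"
```

If the check fails, installing linux-firmware and then re-running dracut as shown above would be the next thing to try.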
The latest from the upstream bug seems to be that the original issue is fixed in 3.7 but still needs backporting to 3.6.x.
(In reply to comment #12)
> The latest from the upstream bug seems to be that the original issue is
> fixed in 3.7 but still needs backporting to 3.6.x.
I agree! I keep getting kernel updates for F17, like the 3.6.8 that I am downloading right now, in the hope that this bug will disappear with increasing versions.
Solving this issue should be considered urgent, since it may cause file system and file corruption because you are forced to power the system off and on. This has already happened to me once. Luckily I had a backup of my F17, so I was able to recover from it.
So either way: could someone from the Fedora kernel maintainers please backport this fix if it's not done upstream, or simply provide 3.7.x kernel versions?
This issue is more than frustrating and could be solved easily by providing the fix. It also affects a lot of users with similar hardware!
3.7 kernels are in rawhide. You can use those on f17 and f18. Both f17 and f18 will be rebased to 3.7 in the not too distant future. If you would like to test those 3.7 kernels and let us know if it actually resolves your issue, that would be very helpful.
The commit was CC'd to stable and should be in the next 3.6.x release.
(In reply to comment #14)
> 3.7 kernels are in rawhide. You can use those on f17 and f18. Both f17 and
> f18 will be rebased to 3.7 in the not too distant future. If you would like
> to test those 3.7 kernels and let us know if it actually resolves your
> issue, that would be very helpful.
Yes, I tested "kernel-3.7.0-0.rc7.git1.1.fc19.i686" today on my F17 box. I did a dozen reboots (warm and cold). The system came up perfectly every time. This was the first time that turning the machine on (or rebooting it) didn't end in frustration. Sadly the kernel is a debug build (and therefore slow).
> The commit was CC'd to stable and should be in the next 3.6.x release.
Oh yes, I can't wait until the next 3.6.x shows up in updates-testing. This will finally mean a lot less pain here :)
Thanks. I will report back once the new kernel shows up.
(In reply to comment #15)
> > The commit was CC'd to stable and should be in the next 3.6.x release.
> Ohhh yes I can't wait until the next 3.6.x is showing up in updates-testing.
> This will finally gives a lot less pain here :)
The patch missed the official 3.6.9 stable release. I've added it into the Fedora kernel git and started an F17 scratch build. If those impacted by this bug could test the build below once it completes, that would be very appreciated.
(In reply to comment #16)
> If those impacted by this bug could test the build below once it completes,
> that would be very appreciated.
Thanks for the time and the work you've spent on this build. I downloaded it this morning (kernel binaries) and installed it on my system.
I notice the same "cleaner" turning off and on of the LCD (simply put) as with the 3.7.x kernel that I tested.
However, in about 10 reboots I still ended up in the black screen issue twice. I realized that I might have forgotten to re-run dracut to recreate the initramfs. I did that too, rebooted a couple more times, and everything seems to work fine. But I am still a bit worried about the 2 failures that I ran into.
All in all this is a clear improvement. Going from 20% - 80% (2 successful boots / 8 failed boots out of 10) to 90% - 10% (9 successful boots / 1 failed boot out of 10, because of the missing dracut rebuild) seems much better.
I will now use this kernel for the next few days and report back. Maybe someone else could do some tests as well and provide some feedback.
(In reply to comment #16)
> (In reply to comment #15)
> > > The commit was CC'd to stable and should be in the next 3.6.x release.
> > Ohhh yes I can't wait until the next 3.6.x is showing up in updates-testing.
> > This will finally gives a lot less pain here :)
> The patch missed the official 3.6.9 stable release. I've added it into the
> Fedora kernel git and started an F17 scratch build. If those impacted by
> this bug could test the build below once it completes, that would be very
I downloaded this kernel, installed it, and ran /usr/libexec/plymouth-update-initrd before the tests (for reasons of clarity).
10 reboots / no fails. I think the problem is fixed now.
Thanks to both of you.
I've submitted official builds for this to koji (the F17 build will be identical to the build you've already tested). Bodhi will leave the usual comments as things go into the repos, etc.
Well we have to thank you for your time.
Finally a working AND painless boot process.
Could someone please tell me the difference between running plymouth-update-initrd and dracut? Which one is preferable?
(In reply to comment #20)
> Well we have to thank you for your time.
> Finally a working AND painless boot process.
> Could someone please tell me the difference between issuing
> plymouth-update-initrd and dracut ? Which one to prefer ?
There is no difference, really:
[nexfwall@Sony-PCG31311V ~]$ cat /usr/libexec/plymouth/plymouth-update-initrd
/sbin/new-kernel-pkg --package kernel --mkinitrd --dracut --depmod --install $(uname -r)
And /sbin/new-kernel-pkg is just a bash script too.
Thank you Josh, for your time.
kernel-3.6.9-4.fc18 has been submitted as an update for Fedora 18.
kernel-3.6.9-2.fc17 has been submitted as an update for Fedora 17.
kernel-3.6.9-2.fc16 has been submitted as an update for Fedora 16.
Package kernel-3.6.9-2.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.6.9-2.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
then log in and leave karma (feedback).
kernel-3.6.9-2.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.6.9-4.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.6.10-2.fc16 has been submitted as an update for Fedora 16.
kernel-3.6.10-2.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
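After updating and rebooting, it can help to confirm that the running kernel is at least the build that carries the fix. The sketch below relies on GNU sort's version ordering (sort -V); kernel_at_least is a hypothetical helper name, and the comparison is only a rough ordering of release strings, not an RPM-accurate one.

```shell
#!/bin/sh
# Succeed if kernel release $1 sorts at or after release $2.
# sort -V orders version strings numerically, so 3.6.10 comes after 3.6.9.
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage on Fedora 17 (uname -r also carries an architecture suffix):
#   kernel_at_least "$(uname -r)" 3.6.9-2.fc17 && echo "fixed build or newer"
```

A newer release string does not by itself guarantee that the symptoms are gone, so the boot log is still worth checking.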
I have an ATI 4850 and had not seen the "failed to schedule IB" problem until I upgraded to kernel-3.6.10-2.fc17.x86_64. Now I see frequent lock-ups, screen corruption or failures to initialise Xorg. Booting with kernel-3.6.9-2.fc17.x86_64 works perfectly.
Well, for *us* the upstream regression fix solved the problem of at least cleanly booting Linux on fc17. But... and this is something new... I am now on fc18 running 3.6.11-3.fc18.i686. Whenever I switch to a console and then back to Xorg I get the same issues that Matt describes. This only started happening recently.
Here it plays STROBO... black and white flashing of the screen... It is impossible to trigger CTRL+ALT+DEL. The only way out is a hard shutdown with the power button.
Booting into GDM works perfectly. This must have been introduced recently. On fc18 I also saw that Xorg and some other components got updated.
This is really a sad situation for people with an APU and for people with a normal GPU.
Matt, could you provide some logs, like /var/log/messages and the Xorg log? I see some new radeon messages being printed.
Maybe the upstream regressions are only partially fixed.
I will provide some logs NEXT YEAR :)
Created attachment 670869 [details]
Logs from failed 3.6.10 session
Attaching /var/log/messages and Xorg.log from a failed attempt to boot my machine with 3.6.10 kernel. I was able to get to a console login with ctrl-alt-3 and shutdown the system. I think this attempt to boot just presented a snowy grey screen instead of gdm or anything helpful.
I see! You are getting the same problems that I initially had and that's what this regression fix from upstream Kernel is supposed to fix. So I wonder why you still receive this in your logs.
May I ask you to try this:
sudo yum install linux-firmware    # could be missing
sudo depmod -ae
sudo dracut --force --xz -v
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
On the next boot, describe what happens. I would also like to know what happens when you switch from Xorg to a console and then back to Xorg. Do you get this "STROBO"-like effect too?
With the same .10 kernel, try another boot using "nomodeset" in grub. I would like to know what the results are then.
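When comparing boots with and without "nomodeset", it is easy to lose track of which mode a session was started in. A small sketch for checking the kernel command line follows; has_nomodeset is a hypothetical helper name, and it only looks for the bare option as a separate word.

```shell
#!/bin/sh
# Succeed if the given kernel command line contains the nomodeset option.
# Padding with spaces makes sure only a whole word matches.
has_nomodeset() {
    case " $1 " in
        *" nomodeset "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage for the currently running system:
#   has_nomodeset "$(cat /proc/cmdline)" && echo "booted with nomodeset"
```

Noting the result next to each log makes it clearer which attachment belongs to which test.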
linux firmware was already installed and up-to-date.
The problem is rather intermittent, and apart from one slight glitch, it is refusing to happen at the moment. I've run the other commands you listed and will monitor this over the next few days to see if the problem reoccurs.
I continued to get problems after running the commands you requested, but never with consistent symptoms. As the machine affected was my primary PC, I bought a cheap NVidia card last weekend and am using that now instead. Sorry I can't help further.