Bug 470635

Summary: Running KVM and Nvidia driver hard locks Compaq F572US.
Product: [Fedora] Fedora
Reporter: Gideon Mayhak <gnafu_the_great>
Component: kvm
Assignee: Glauber Costa <gcosta>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: medium
Version: 10
CC: berrange, clalance, gcosta, markmc, poelstra, quintela, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-04-22 22:33:28 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
dmesg logfile immediately after a KVM lockup (grabbed after booting into another Fedora install). none

Description Gideon Mayhak 2008-11-07 23:11:35 EST
Description of problem:
Since Fedora 7 or 8, I have never been able to get KVM to work on my Compaq F572US laptop.  It has an Athlon 64 X2 and the BIOS has a setting for virtualization (which is enabled), but I've never gotten it to work with KVM.  Up until Fedora 10, I was using i386, so I never filed a bug; I thought the cause might be that I wasn't using x86_64 or something.  Now I am running Fedora 10 rawhide x86_64 and I experience the exact same behavior.

This isn't critical for me, as I just want to try out KVM on my laptop.  VirtualBox seems to work fine with AMD-V enabled.  I just figured I'd file a bug report and see if it's something trivial or if my system really isn't compatible.
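
For anyone checking the same thing on their own hardware, here is a minimal sketch of how the CPU's virtualization flags could be confirmed before suspecting KVM itself. It assumes a Linux host that lists CPU flags in /proc/cpuinfo (`svm` for AMD-V, `vmx` for Intel VT-x). The function name `virt_flags` and the sample text are made up for illustration, and a flag being present does not by itself prove the BIOS setting is enabled.

```python
def virt_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # svm = AMD-V, vmx = Intel VT-x
            found |= {"svm", "vmx"} & set(value.split())
    return found


# Illustrative sample only, not taken from the reporter's machine
sample = "model name\t: example CPU\nflags\t\t: fpu lm svm\n"
print(sorted(virt_flags(sample)))
```

On a real host the input would be the contents of /proc/cpuinfo, for example `virt_flags(open("/proc/cpuinfo").read())`.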

I'm not sure where to look for logs, as it doesn't seem to have time to write any before the system locks.  Please let me know what information you might need.  I'll be glad to provide anything.


Version-Release number of selected component (if applicable):
kvm.x86_64                          74-5.fc10


How reproducible:
Every time.


Steps to Reproduce:
1. Load virt-manager
2. Walk through the wizard to create a KVM guest with 512MB RAM and install from either boot.iso or a local network tree (http://gidux.dyndns.org:8080/fedora/)
3. Wait for it to start loading and then watch the system lock up completely

  
Actual results:
Complete system lock.


Expected results:
VM starts running like it does when I choose QEMU or KQEMU.


Additional info:
http://www.smolts.org/show?uuid=pub_80508f48-43f1-4578-898c-15ca791ac977
Comment 1 Glauber Costa 2008-11-08 05:57:32 EST
Does the system ever come back?

Can you either:
* Press Ctrl+Alt+F1, log in (in case the system is not really locked and only appears to be), and get the output of the dmesg command? Or:
* Leave a terminal window open in your window manager running dmesg in a loop, to see if any new messages arrive?

In any case, the dmesg output from the host before the crash is useful, so we can see if there's anything going seriously wrong.

BTW, this seems like a kvm.ko bug.
Comment 2 Gideon Mayhak 2008-11-08 19:00:07 EST
The system locks up completely: I can't even get the CapsLock light to come on.  I can't Ctrl-Alt-F1 or anything, and not even pressing the power button will put it into shutdown.  I have to press and hold the power button.  dmesg does not seem to have any useful information; the system appears to lock up before anything useful can be written.  I will try running a KVM guest again and look at the dmesg output.

What's the easiest way to run dmesg in a loop?  Otherwise, I can just post the tail of the dmesg logfile.
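
One possible way to run dmesg in a loop is sketched below. It polls `dmesg`, prints only the lines not seen on the previous poll, and appends them to a file with an explicit `fsync` so that as much as possible reaches the disk before a lockup. The names `new_lines` and `watch_dmesg`, the log path, and the one-second interval are all assumptions for illustration. With a hard lock there is no guarantee the final messages are ever written.

```python
import os
import subprocess
import time


def new_lines(previous, current):
    """Return the lines of `current` that follow the last line of `previous`.

    Simplification: lines are matched by text, so identical repeated
    messages can hide some output. Timestamped dmesg lines avoid that.
    """
    if not previous:
        return list(current)
    last = previous[-1]
    for i in range(len(current) - 1, -1, -1):
        if current[i] == last:
            return current[i + 1:]
    return list(current)  # buffer wrapped: treat everything as new


def watch_dmesg(log_path, interval=1.0):
    """Poll dmesg forever, logging unseen lines and forcing them to disk."""
    seen = []
    with open(log_path, "a") as log:
        while True:
            out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
            current = out.splitlines()
            for line in new_lines(seen, current):
                print(line)
                log.write(line + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the data past the page cache
            seen = current
            time.sleep(interval)
```

Calling `watch_dmesg("/root/dmesg-watch.log")` in one terminal before starting the guest in another would leave whatever was captured in that file after the forced reboot.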
Comment 3 Gideon Mayhak 2008-11-08 19:14:41 EST
Created attachment 322975 [details]
dmesg logfile immediately after a KVM lockup (grabbed after booting into another Fedora install).

Here is my dmesg logfile right after a lockup.  I ran qemu-kvm from a terminal and added "> /root/kvm.log" to the end, but it didn't create a /root/kvm.log file.  Let me know about opening dmesg in a loop in another terminal window and I'll gladly let you know if anything is displayed there before the lock.

As far as the behavior leading up to the lock, the QEMU window comes up and the system locks almost immediately after the boot.iso boot menu screen comes up.
Comment 4 Glauber Costa 2008-11-10 07:25:32 EST
Your kernel is tainted.
It might have absolutely nothing to do with the problem, but the very first thing we have to do is make sure of that. Please do a clean boot without the NVIDIA driver loaded. Again, I believe you have a real KVM bug here, but there's no point in going forward with this variable present.

Also, note that the latest KVM is kvm-78-4.fc11 (http://koji.fedoraproject.org/koji/buildinfo?buildID=68366) although what you have is probably a kernel space problem, unrelated to the package.
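
For readers unfamiliar with taint: the kernel records it as a bitmask, readable from /proc/sys/kernel/tainted, and loading a proprietary module such as the NVIDIA driver sets bit 0 ("P"). The sketch below decodes a few of those bits. The table is deliberately a small subset (bits 12 and 13 only exist on newer kernels than the one in this report), and `decode_taint` is a hypothetical helper name. The actual taint value from the attached dmesg is not reproduced here.

```python
# Subset of the kernel's taint bits; the full list is in the kernel docs
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return (letter, meaning) pairs for each bit set in a taint value."""
    return [
        TAINT_FLAGS.get(bit, ("?", f"bit {bit} (not in this table)"))
        for bit in range(value.bit_length())
        if (value >> bit) & 1
    ]


# Example with a made-up value: bits 0 and 12 set
for letter, meaning in decode_taint(4097):
    print(letter, meaning)
```

On a live system the input would come from `int(open("/proc/sys/kernel/tainted").read())`; a result of 0 means the kernel is not tainted.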
Comment 5 Gideon Mayhak 2008-11-10 20:37:56 EST
Wow, good call!  I uninstalled my NVIDIA-related RPMs and rebooted.  I was then able to start a KVM with Fedora 10 Preview over my network!  It's installing as I post this, so it's been steady so far.

Now, the downside.  At this time, and on this machine, having the proprietary NVIDIA drivers installed is more important to me than getting to play with KVM.  I know that it's not typical to modify things to work around binary blob-related bugs, and I wouldn't dream of asking you to do such a thing, but I would be happy to provide any information you want if you would like to further investigate this.  I would like to know why having the NVIDIA driver would prevent KVM from working.

Also, I'm going to reinstall the NVIDIA drivers and that newer version of KVM.  I'll let you know if that newer version works fine.  If it does work, could that version be pushed into Fedora 10?  I think adding KVM support for users of NVIDIA graphics cards would be worth a freeze break ;).  I'll let you know how it goes.

Thanks!
Comment 6 Gideon Mayhak 2008-11-10 22:11:56 EST
It works!  The 78 version works!  I'm currently running the F10 Preview installer with NVIDIA drivers enabled (with TwinView, no less).  Once I have finished running the installer, I'll try downgrading to the current F10 version to see if maybe just reinstalling the NVIDIA driver had anything to do with it (doubtful).

Otherwise, may I hereby request that the 78 version of KVM be pushed for release with F10?  I understand it could always be released as an update shortly after, but is there much chance of this breaking anything?  I didn't need to update anything else but KVM, so the update dependencies appear to be 0.

I'll let you know if downgrading works, but I doubt it.  Thanks again!
Comment 7 Gideon Mayhak 2008-11-10 22:49:35 EST
Yeah, downgrading brought me back to hard lockup.  Please consider bringing that newer version of KVM to F10.
Comment 8 Glauber Costa 2008-11-11 06:30:58 EST
I would love to know why KVM does not work with the NVIDIA driver too. Unfortunately, they won't let me. It lives in kernel space and can be doing _anything_. We'll never know.

As for the new version, I'm glad it works for you. But unfortunately, we can't backport it to F-10, since F-10 is frozen, and kvm-78 adds a considerable amount of new features. However, kvm does not have too many dependencies, and you should be able to just install the newer rpm without any problem.

If you are willing to spend some time bisecting it, we can find the exact set of patches that fixes your particular problem, and if it is not too invasive, backport just those.
Comment 9 Gideon Mayhak 2008-11-11 21:41:32 EST
Are you sure they wouldn't be willing to break the freeze for something like this?

Anyway, I'd be happy to help isolate the bug.  What information do you need and what's the easiest way to get it?  :)
Comment 10 Glauber Costa 2008-11-12 07:13:47 EST
Yes, I am sure.

"something like this", in this case, is a binary blob about which we have no information whatsoever. It could have worked for you by sheer luck, and then the next day the blob decides to screw up yet another kernel data structure, and it breaks again. Look at kerneloops.org. NVIDIA blob oopses top the list by far. There's no way we can assure it works, and there is no way we can keep trying to work around every possible bug whose cause is unknown to us.

But that said, I'd like to help users as much as I can. So, as I've stated before, if you are willing to:
* download kvm-userspace.git.
* mark kvm-74 as bad
* mark kvm-78 as good.
* bisect it, and find the exact patch that fixes the problem
AND the fix is not invasive (this is the most important part), I can apply it.
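
The bisection steps above amount to a binary search over the history between the two tags. The sketch below simulates that search on a made-up linear history; only the tag names kvm-74 and kvm-78 come from this report, while the commit names `c1` to `c5`, the function `first_fixed`, and the choice of `c3` as the fixing commit are hypothetical. In practice each test means building that revision, loading it alongside the NVIDIA driver, and seeing whether starting a guest locks the host. Note also that stock `git bisect` typically expects the "good" revision to be the older one, so hunting for a fix usually means swapping the labels or, on Git 2.7 and later, using custom terms such as `git bisect start --term-old=broken --term-new=fixed`.

```python
def first_fixed(commits, is_fixed):
    """Binary-search an oldest-to-newest commit list for the first fixed one.

    Assumes commits[0] is known broken, commits[-1] is known fixed, and
    that every commit after the fix stays fixed.
    """
    lo, hi = 0, len(commits) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical linear history between the two tags
history = ["kvm-74", "c1", "c2", "c3", "c4", "c5", "kvm-78"]
fixed_from = history.index("c3")  # pretend c3 is the fixing commit
result = first_fixed(history, lambda c: history.index(c) >= fixed_from)
print(result)
```

With seven revisions the search needs only three tests, which is why bisecting is practical even when each test requires a rebuild and a possible hard reboot.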

But again, I doubt it is worth the effort. kvm does not have that many dependencies; you should be able to use rawhide's kvm-78 and just be happy. It would probably cost you less.
Comment 11 Gideon Mayhak 2008-11-13 00:06:38 EST
Is 78 (or newer) going to eventually become an update for Fedora 10?  I'm just thinking for other NVIDIA users.  But anyway, I thank you very much for the suggestions and help :).
Comment 12 Bug Zapper 2008-11-26 00:01:52 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 13 Gideon Mayhak 2009-04-22 22:33:28 EDT
To save you from having to look at this as an open bug, I'm going to close it.  KVM has been working great in rawhide using the nouveau driver, and the nouveau driver has gotten so good that I don't plan on using the proprietary NVIDIA driver with the F11 release.  NVIDIA's crap is not your problem, and you couldn't fix it if you wanted to.  Consider this CLOSED WONTFIX.

Thanks for your time!