Bug 1318697

Summary: Cannot boot VMs in RHEV in a nested virtualization environment
Product: Fedora
Reporter: Jason Montleon <jmontleo>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 23
CC: gansalmon, itamar, jmatthew, jmontleo, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-10 12:34:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Jason Montleon 2016-03-17 14:14:17 UTC
Description of problem:
We have been using nested virtualization to test RHEV deployments. Typically we have a Fedora 23 host with nested virtualization enabled, which runs a hypervisor and engine. In turn we launch VMs in the RHEV environment (typically CFME at this time).

A new feature in 4.4 is:
"KVM: Nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster"

With 4.4 I see the vpid feature in /proc/cpuinfo on the installed RHEV hypervisors. When booting 4.3 I do not see this feature in the guests. On 4.4 kernels we are unable to boot VMs under RHEV: VMs get into a bad state when we try to launch them, and the hypervisor starts complaining about CPU soft lockups.

If I add a modprobe.d file with 'options kvm_intel vpid=0' and unload and reload the module before attempting to launch a VM, it runs fine.
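The workaround described above can be sketched as follows. This is a hedged example: the file name `kvm-intel-vpid.conf` is arbitrary, and per Comment 1 the steps apply on the RHEV hypervisor, not the Fedora bare-metal host. All VMs must be stopped before the module can be unloaded.

```shell
# Disable VPID in the kvm_intel module on the RHEV hypervisor.
# The modprobe.d file name is arbitrary; only the 'options' line matters.
cat > /etc/modprobe.d/kvm-intel-vpid.conf <<'EOF'
options kvm_intel vpid=0
EOF

# Reload the module so the option takes effect (requires no running VMs):
modprobe -r kvm_intel
modprobe kvm_intel

# Verify the parameter is applied (expected: N):
cat /sys/module/kvm_intel/parameters/vpid
```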

Version-Release number of selected component (if applicable):
kernel-4.4.4-301.fc23.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install RHEV on a Fedora 23 host with nested kvm enabled
2. Try to boot a VM on the new RHEV cluster

Actual results:
VM hangs trying to start. You'll probably see CPU soft lockups on the hypervisor after a while.

Expected results:
VM boots

Additional info:
Going back to 4.3 kernels seems to work, as does adding a modprobe.d file to disable vpid within the kvm_intel module.

Comment 1 Jason Montleon 2016-03-17 14:16:15 UTC
One bit of clarification: the modprobe.d file to disable vpid would be created on the RHEV hypervisor, not the Fedora bare-metal host.

Comment 2 Laura Abbott 2016-03-21 17:14:15 UTC
From the kernel bugzilla, it looks like this is fixed by http://article.gmane.org/gmane.linux.kernel/2179954. I think my preference is for this to just come in through the regular stable release instead of pulling in the patches separately. Please correct me if I found the wrong patch set.

Comment 3 Jason Montleon 2016-03-21 17:23:11 UTC
Yes, I built a custom kernel with Paolo's patches and it works to fix the problem. I'm not opposed to us waiting for it to hit a newer kernel.

Comment 4 Laura Abbott 2016-09-23 19:49:08 UTC
*********** MASS BUG UPDATE **************
 
We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.
 
Fedora 23 has now been rebased to 4.7.4-100.fc23. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.
 
If you experience different issues, please open a new bug report for those.

Comment 5 Jason Montleon 2016-10-10 12:33:14 UTC
This is no longer an issue and this bug can be closed. Thanks!