Hide Forgot
Description of problem: Creating a host fails with the following log error entry: <------------------------- 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._isVirtualizationEnabled:179 virtualization support GenuineIntel (cpu: True, bios: False) 2013-10-23 15:47:15 DEBUG otopi.context context._executeMethod:130 method exception Traceback (most recent call last): File "/tmp/ovirt-D7lCbutUQy/pythonlib/otopi/context.py", line 120, in _executeMethod method['method']() File "/tmp/ovirt-D7lCbutUQy/otopi-plugins/ovirt-host-deploy/vdsm/hardware.py", line 205, in _validate_virtualization _('Hardware does not support virtualization') RuntimeError: Hardware does not support virtualization 2013-10-23 15:47:15 ERROR otopi.context context._executeMethod:139 Failed to execute stage 'Setup validation': Hardware does not support virtualization ------------------> However, this host has virtulization enabled in the bios and the kvm modules load without any issue. Looks like the following function in hardware.py is broken: /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/vdsm/hardware.py:126 msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) This function should return true, but is returning false. Version-Release number of selected component (if applicable): How reproducible: Yes. Steps to Reproduce: 1. Try to create a new host with virtualization enabled in rhev-m. It will fail. 2. 3. Actual results: The new host add fails with the above log error. Expected results: The host should add correctly, since kvm modules load on the system and we are able to run vms in it. Additional info:
Can you please send complete host-deploy log? This should have been solved long ago (since 1.0.0)[1] [1] http://gerrit.ovirt.org/#/c/9758/
Created attachment 817369 [details] requested host deploy log. Attaching host-attach log as requested.
Thanks!!! First, are you using oVirt or RHEVM? I cannot understand... from log I see: --- 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:134 vmx bios: True 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._cpuid:81 cpuid: (67237, 269486080, 10281917, 3219913727) 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._cpu_has_vmx_support:88 vmx support: True 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._isVirtualizationEnabled:179 virtualization support GenuineIntel (cpu: True, bios: False) --- This means that both functions (self._vmx_enabled_by_bios, self._cpu_has_vmx_support) return True, but somehow the bios_ok is turned into False... I do not understand how it is possible. Can you please investigate some farther? as it seems that nobody else is effected by this.
Created attachment 817395 [details] vmx bios I am using RHEV-M 3.3. vmx bios is True, because I manually changed it in the code while debugging. You can look at th earlier log I have provided where vmx bios is False. I'm not sure what further investigation you want me to do? If there some command etc. you would like me to try to help narrow down the problem, I can try it out. Perhaps the python command is broken and needs to be investigated? Is the flag you are sending into the command correct?
Please state when you manually changing anything... If you are using rhevm-3.3 how come you use otopi-1.0 and ovirt-host-deploy-1.0? Both should have been pulled to 1.1. What channel do you use? what exact rpms: rpm -qa | grep rhevm rpm -q otopi rpm -q ovirt-host-deploy Thanks!
I have no clue about the version of otopi, ovirt-host-deploy etc. If such an update is required, it should have been specified as an rpm dependency that would be managed by yum/apt-get automatically and not require customer to open bugzilla entries to figure this out. We use internal Symantec channels that are synchronized to redhat channels. Which channel would you like us to use? [root@ranjit ~]# rpm -qa|grep rhevm rhevm-config-3.2.0-11.30.el6ev.noarch rhevm-doc-3.2.0-4.el6eng.noarch rhevm-genericapi-3.2.0-11.30.el6ev.noarch rhevm-userportal-3.2.0-11.30.el6ev.noarch rhevm-iso-uploader-3.2.2-1.el6ev.noarch rhevm-dbscripts-3.2.0-11.30.el6ev.noarch rhevm-image-uploader-3.2.2-1.el6ev.noarch rhevm-sdk-3.2.0.11-1.el6ev.noarch rhevm-tools-common-3.2.0-11.30.el6ev.noarch rhevm-3.2.0-11.30.el6ev.noarch rhevm-backend-3.2.0-11.30.el6ev.noarch rhevm-spice-client-x64-cab-3.2-10.el6ev.noarch rhevm-log-collector-3.2.2-3.el6ev.noarch rhevm-cli-3.2.0.9-1.el6ev.noarch rhevm-setup-3.2.0-11.30.el6ev.noarch rhevm-webadmin-portal-3.2.0-11.30.el6ev.noarch rhevm-spice-client-x86-cab-3.2-10.el6ev.noarch rhevm-restapi-3.2.0-11.30.el6ev.noarch rhevm-notification-service-3.2.0-11.30.el6ev.noarch [root@ranjit ~]# rpm -q otopi otopi-1.0.0-2.el6ev.noarch [root@ranjit ~]# rpm -q ovirt-host-deploy ovirt-host-deploy-1.0.0-2.el6ev.noarch [root@ranjit ~]#
I do not understand what happening... are you sure that now attachment#817395 [details] is a product of execution without any modification to software? 2013-10-23 14:46:49 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5 --- def _vmx_enabled_by_bios(self): MSR_IA32_FEATURE_CONTROL = 0x3a MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1 MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4 msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) ret = ( msr & ( MSR_IA32_FEATURE_CONTROL_LOCKED | MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED ) ) != MSR_IA32_FEATURE_CONTROL_LOCKED self.logger.debug('vmx bios: %s', ret) return ret --- This reduced to the following statement: 5 & (4 | 1) != 1 Which evaluates to True while in the log I can see this statement evaluates to False, which is kind of impossible.
No changes to that portion of the code. Are you sure the return value from that function is 5?
(In reply to Ranjit Noronha from comment #8) > No changes to that portion of the code. Are you sure the return value from > that function is 5? --- def _prdmsr(self, cpu, index): ret = -1 msr = '/dev/cpu/%d/msr' % cpu if not os.path.exists(msr): raise AttributeError('No %s' % msr) with open(msr, 'r') as f: try: f.seek(index) ret = struct.unpack('L', f.read(8))[0] except struct.error: pass self.logger.debug('prdmsr: %s', ret) return ret --- --- 2013-10-23 14:46:49 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5 --- if the above code and message are used, the return return exactly what the message prints, unless something is terribly wrong somewhere. can you please add more debug lines so we can analyze that? At engine side: /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/vdsm/hardware.py --- def _vmx_enabled_by_bios(self): MSR_IA32_FEATURE_CONTROL = 0x3a MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1 MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4 msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) + self.logger.debug('DEBUG001: %s:%s', type(msr), msr) + self.logger.debug('DEBUG002: %s', msr & 5) + self.logger.debug('DEBUG002: %s', msr & (MSR_IA32_FEATURE_CONTROL_LOCKED|MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED)) ret = ( msr & ( MSR_IA32_FEATURE_CONTROL_LOCKED | MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED ) ) != MSR_IA32_FEATURE_CONTROL_LOCKED self.logger.debug('vmx bios: %s', ret) return ret --- Then re-deploy host. Thanks
Created attachment 817841 [details] attaching with debugging info Here you go. The full log is attached. Looked like the function is returning 1.
Looks like we are not seeing this issue with 3.3 so you can close this bug.
(In reply to Ranjit Noronha from comment #10) > Created attachment 817841 [details] > attaching with debugging info > > Here you go. The full log is attached. Thank you! > Looked like the function is returning 1. It is seems to be impossible, as we print the return value of the function before returning, and it is 5. --- 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:127 DEBUG001: <type 'int'>:1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:128 DEBUG002: 1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:130 DEBUG002: 1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:139 vmx bios: False --- Can you please add more debug, notice the new one at _vmx_enabled_by_bios. Thank you for your help! --- def _prdmsr(self, cpu, index): ret = -1 + self.logger.debug('DEBUG011: entry') msr = '/dev/cpu/%d/msr' % cpu if not os.path.exists(msr): raise AttributeError('No %s' % msr) with open(msr, 'r') as f: try: f.seek(index) ret = struct.unpack('L', f.read(8))[0] except struct.error: pass self.logger.debug('prdmsr: %s', ret) + self.logger.debug('DEBUG012: %s:%s', type(ret), ret) return ret def _vmx_enabled_by_bios(self): MSR_IA32_FEATURE_CONTROL = 0x3a MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1 MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4 + self.logger.debug('DEBUG000: call') msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) + self.logger.debug('DEBUG001: %s:%s', type(msr), msr) + self.logger.debug('DEBUG002: %s', msr & 5) + self.logger.debug('DEBUG002: %s', msr & (MSR_IA32_FEATURE_CONTROL_LOCKED|MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED)) ret = ( msr & ( MSR_IA32_FEATURE_CONTROL_LOCKED | MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED ) ) != MSR_IA32_FEATURE_CONTROL_LOCKED self.logger.debug('vmx bios: %s', ret) return ret ---
(In reply to Ranjit Noronha from comment #11) > Looks like we are not seeing this issue with 3.3 so you can close this bug. there is no difference in this regard between 3.2 and 3.3, I suspect python version.
(In reply to Alon Bar-Lev from comment #13) > (In reply to Ranjit Noronha from comment #11) > > Looks like we are not seeing this issue with 3.3 so you can close this bug. > > there is no difference in this regard between 3.2 and 3.3, I suspect python > version. 3.2 & 3.3 both run on RHEL 6.4 - why do you suspect it's a python version issue ?
(In reply to Barak from comment #14) > (In reply to Alon Bar-Lev from comment #13) > > (In reply to Ranjit Noronha from comment #11) > > > Looks like we are not seeing this issue with 3.3 so you can close this bug. > > > > there is no difference in this regard between 3.2 and 3.3, I suspect python > > version. > > 3.2 & 3.3 both run on RHEL 6.4 - why do you suspect it's a python version > issue ? as I wrote in comment#12, based on the analysis so far, it seems that a function returning '5' and caller gets '1', this is not global issue as we would have probably gotten many reports regarding this one, so it is something that either local or some edge in python.
Alon, Is this behaviour reproducible in your environment ? Arthur
(In reply to Arthur Berezin from comment #16) > Alon, Is this behaviour reproducible in your environment ? No... had it been reproduced, it would have already been solved :) It is also first report of this kind we got... very strange one... had I not known better, I would have guessed that this is memory corruption within python.
Please reopen if reproduced. Thanks!