Bug 1024618
Summary: | Cannot create a host through RHEVM since it is not able to detect virtualization suport in the bios | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [oVirt] ovirt-host-deploy | Reporter: | Ranjit Noronha <ranjit_noronha> | ||||||||
Component: | Plugins.VDSM | Assignee: | Alon Bar-Lev <alonbl> | ||||||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Haim <hateya> | ||||||||
Severity: | medium | Docs Contact: | |||||||||
Priority: | unspecified | ||||||||||
Version: | master | CC: | aberezin, acathrow, alonbl, bazulay, bugs, dougsland, iheim, Rhev-m-bugs, yeylon | ||||||||
Target Milestone: | --- | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | x86_64 | ||||||||||
OS: | Unspecified | ||||||||||
Whiteboard: | infra | ||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||
Doc Text: | Story Points: | --- | |||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2013-11-28 09:34:22 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Attachments: |
|
Description
Ranjit Noronha
2013-10-30 05:50:22 UTC
Can you please send complete host-deploy log? This should have been solved long ago (since 1.0.0)[1] [1] http://gerrit.ovirt.org/#/c/9758/ Created attachment 817369 [details]
requested host deploy log.
Attaching host-attach log as requested.
Thanks!!! First, are you using oVirt or RHEVM? I cannot understand... from log I see: --- 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:134 vmx bios: True 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._cpuid:81 cpuid: (67237, 269486080, 10281917, 3219913727) 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._cpu_has_vmx_support:88 vmx support: True 2013-10-23 15:47:15 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._isVirtualizationEnabled:179 virtualization support GenuineIntel (cpu: True, bios: False) --- This means that both functions (self._vmx_enabled_by_bios, self._cpu_has_vmx_support) return True, but somehow the bios_ok is turned into False... I do not understand how it is possible. Can you please investigate some farther? as it seems that nobody else is effected by this. Created attachment 817395 [details]
vmx bios
I am using RHEV-M 3.3.
vmx bios is True, because I manually changed it in the code while debugging. You can look at th earlier log I have provided where vmx bios is False.
I'm not sure what further investigation you want me to do? If there some command etc. you would like me to try to help narrow down the problem, I can try it out.
Perhaps the python command is broken and needs to be investigated? Is the flag you are sending into the command correct?
Please state when you manually changing anything... If you are using rhevm-3.3 how come you use otopi-1.0 and ovirt-host-deploy-1.0? Both should have been pulled to 1.1. What channel do you use? what exact rpms: rpm -qa | grep rhevm rpm -q otopi rpm -q ovirt-host-deploy Thanks! I have no clue about the version of otopi, ovirt-host-deploy etc. If such an update is required, it should have been specified as an rpm dependency that would be managed by yum/apt-get automatically and not require customer to open bugzilla entries to figure this out. We use internal Symantec channels that are synchronized to redhat channels. Which channel would you like us to use? [root@ranjit ~]# rpm -qa|grep rhevm rhevm-config-3.2.0-11.30.el6ev.noarch rhevm-doc-3.2.0-4.el6eng.noarch rhevm-genericapi-3.2.0-11.30.el6ev.noarch rhevm-userportal-3.2.0-11.30.el6ev.noarch rhevm-iso-uploader-3.2.2-1.el6ev.noarch rhevm-dbscripts-3.2.0-11.30.el6ev.noarch rhevm-image-uploader-3.2.2-1.el6ev.noarch rhevm-sdk-3.2.0.11-1.el6ev.noarch rhevm-tools-common-3.2.0-11.30.el6ev.noarch rhevm-3.2.0-11.30.el6ev.noarch rhevm-backend-3.2.0-11.30.el6ev.noarch rhevm-spice-client-x64-cab-3.2-10.el6ev.noarch rhevm-log-collector-3.2.2-3.el6ev.noarch rhevm-cli-3.2.0.9-1.el6ev.noarch rhevm-setup-3.2.0-11.30.el6ev.noarch rhevm-webadmin-portal-3.2.0-11.30.el6ev.noarch rhevm-spice-client-x86-cab-3.2-10.el6ev.noarch rhevm-restapi-3.2.0-11.30.el6ev.noarch rhevm-notification-service-3.2.0-11.30.el6ev.noarch [root@ranjit ~]# rpm -q otopi otopi-1.0.0-2.el6ev.noarch [root@ranjit ~]# rpm -q ovirt-host-deploy ovirt-host-deploy-1.0.0-2.el6ev.noarch [root@ranjit ~]# I do not understand what happening... are you sure that now attachment#817395 [details] is a product of execution without any modification to software?
2013-10-23 14:46:49 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5
---
def _vmx_enabled_by_bios(self):
MSR_IA32_FEATURE_CONTROL = 0x3a
MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1
MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4
msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL)
ret = (
msr & (
MSR_IA32_FEATURE_CONTROL_LOCKED |
MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED
)
) != MSR_IA32_FEATURE_CONTROL_LOCKED
self.logger.debug('vmx bios: %s', ret)
return ret
---
This reduced to the following statement:
5 & (4 | 1) != 1
Which evaluates to True while in the log I can see this statement evaluates to False, which is kind of impossible.
No changes to that portion of the code. Are you sure the return value from that function is 5? (In reply to Ranjit Noronha from comment #8) > No changes to that portion of the code. Are you sure the return value from > that function is 5? --- def _prdmsr(self, cpu, index): ret = -1 msr = '/dev/cpu/%d/msr' % cpu if not os.path.exists(msr): raise AttributeError('No %s' % msr) with open(msr, 'r') as f: try: f.seek(index) ret = struct.unpack('L', f.read(8))[0] except struct.error: pass self.logger.debug('prdmsr: %s', ret) return ret --- --- 2013-10-23 14:46:49 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5 --- if the above code and message are used, the return return exactly what the message prints, unless something is terribly wrong somewhere. can you please add more debug lines so we can analyze that? At engine side: /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/vdsm/hardware.py --- def _vmx_enabled_by_bios(self): MSR_IA32_FEATURE_CONTROL = 0x3a MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1 MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4 msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) + self.logger.debug('DEBUG001: %s:%s', type(msr), msr) + self.logger.debug('DEBUG002: %s', msr & 5) + self.logger.debug('DEBUG002: %s', msr & (MSR_IA32_FEATURE_CONTROL_LOCKED|MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED)) ret = ( msr & ( MSR_IA32_FEATURE_CONTROL_LOCKED | MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED ) ) != MSR_IA32_FEATURE_CONTROL_LOCKED self.logger.debug('vmx bios: %s', ret) return ret --- Then re-deploy host. Thanks Created attachment 817841 [details]
attaching with debugging info
Here you go. The full log is attached. Looked like the function is returning 1.
Looks like we are not seeing this issue with 3.3 so you can close this bug. (In reply to Ranjit Noronha from comment #10) > Created attachment 817841 [details] > attaching with debugging info > > Here you go. The full log is attached. Thank you! > Looked like the function is returning 1. It is seems to be impossible, as we print the return value of the function before returning, and it is 5. --- 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._prdmsr:118 prdmsr: 5 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:127 DEBUG001: <type 'int'>:1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:128 DEBUG002: 1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:130 DEBUG002: 1 2013-10-31 22:05:43 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._vmx_enabled_by_bios:139 vmx bios: False --- Can you please add more debug, notice the new one at _vmx_enabled_by_bios. Thank you for your help! --- def _prdmsr(self, cpu, index): ret = -1 + self.logger.debug('DEBUG011: entry') msr = '/dev/cpu/%d/msr' % cpu if not os.path.exists(msr): raise AttributeError('No %s' % msr) with open(msr, 'r') as f: try: f.seek(index) ret = struct.unpack('L', f.read(8))[0] except struct.error: pass self.logger.debug('prdmsr: %s', ret) + self.logger.debug('DEBUG012: %s:%s', type(ret), ret) return ret def _vmx_enabled_by_bios(self): MSR_IA32_FEATURE_CONTROL = 0x3a MSR_IA32_FEATURE_CONTROL_LOCKED = 0x1 MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED = 0x4 + self.logger.debug('DEBUG000: call') msr = self._prdmsr(0, MSR_IA32_FEATURE_CONTROL) + self.logger.debug('DEBUG001: %s:%s', type(msr), msr) + self.logger.debug('DEBUG002: %s', msr & 5) + self.logger.debug('DEBUG002: %s', msr & (MSR_IA32_FEATURE_CONTROL_LOCKED|MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED)) ret = ( msr & ( MSR_IA32_FEATURE_CONTROL_LOCKED | MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED ) ) != MSR_IA32_FEATURE_CONTROL_LOCKED self.logger.debug('vmx bios: %s', ret) return ret --- (In reply to Ranjit Noronha from comment #11) > Looks like we are not seeing this issue with 3.3 so you can close this bug. there is no difference in this regard between 3.2 and 3.3, I suspect python version. (In reply to Alon Bar-Lev from comment #13) > (In reply to Ranjit Noronha from comment #11) > > Looks like we are not seeing this issue with 3.3 so you can close this bug. > > there is no difference in this regard between 3.2 and 3.3, I suspect python > version. 3.2 & 3.3 both run on RHEL 6.4 - why do you suspect it's a python version issue ? (In reply to Barak from comment #14) > (In reply to Alon Bar-Lev from comment #13) > > (In reply to Ranjit Noronha from comment #11) > > > Looks like we are not seeing this issue with 3.3 so you can close this bug. > > > > there is no difference in this regard between 3.2 and 3.3, I suspect python > > version. > > 3.2 & 3.3 both run on RHEL 6.4 - why do you suspect it's a python version > issue ? as I wrote in comment#12, based on the analysis so far, it seems that a function returning '5' and caller gets '1', this is not global issue as we would have probably gotten many reports regarding this one, so it is something that either local or some edge in python. Alon, Is this behaviour reproducible in your environment ? Arthur (In reply to Arthur Berezin from comment #16) > Alon, Is this behaviour reproducible in your environment ? No... had it been reproduced, it would have already been solved :) It is also first report of this kind we got... very strange one... had I not known better, I would have guessed that this is memory corruption within python. Please reopen if reproduced. Thanks! |