Bug 1149601
| Summary: | Unexpected exception in vdsm log, when setting up hosted-engine on EL7 | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Artyom <alukiano> | ||||||||||
| Component: | vdsm | Assignee: | Nobody <nobody> | ||||||||||
| Status: | CLOSED WORKSFORME | QA Contact: | Jiri Belka <jbelka> | ||||||||||
| Severity: | high | Docs Contact: | |||||||||||
| Priority: | unspecified | ||||||||||||
| Version: | 3.5.0 | CC: | alukiano, bazulay, danken, dfediuck, ecohen, fromani, gklein, lpeer, lsurette, michal.skrivanek, mkalinin, msivak, nbarcet, oourfali, pkrempa, raul.laansoo, rgolan, sbonazzo, ybronhei, yeylon | ||||||||||
| Target Milestone: | ovirt-3.5.6 | Keywords: | Triaged | ||||||||||
| Target Release: | 3.5.6 | ||||||||||||
| Hardware: | x86_64 | ||||||||||||
| OS: | Linux | ||||||||||||
| Whiteboard: | sla | ||||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||||
| Doc Text: | Story Points: | --- | |||||||||||
| Clone Of: | Environment: | ||||||||||||
| Last Closed: | 2015-10-19 10:48:12 UTC | Type: | Bug | ||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||
| Documentation: | --- | CRM: | |||||||||||
| Verified Versions: | Category: | --- | |||||||||||
| oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
| Embargoed: | |||||||||||||
| Attachments: |
|
||||||||||||
I'm not able to reproduce this, please reopen if you can provide more info on how to reproduce it with latest build. i still see it in vdsm.log.
qemu-kvm-rhev-1.5.3-60.el7_0.10.x86_64
libvirt-daemon-1.1.1-29.el7_0.3.x86_64
vdsm-4.16.7.4-1.el7ev.x86_6
- steps:
* install rhel7
* install ovirt-hosted-engine-setup
* hosted-engine --deploy
i didn't use dns but modified /etc/hosts.
- some fake ip for hosted engine during initial hosted-engine --deploy
- when guest os is installed, query its ip and modify host's /etc/hosts to match
Thread-19::ERROR::2014-11-21 13:07:45,339::sampling::475::vm.Vm::(collect) vmId=`bef5cfb5-b8ac-4ec8-bdda-bedb757ea695`::Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x304c840>
Traceback (most recent call last):
File "/usr/share/vdsm/virt/sampling.py", line 471, in collect
statsFunction()
File "/usr/share/vdsm/virt/sampling.py", line 346, in __call__
retValue = self._function(*args, **kwargs)
File "/usr/share/vdsm/virt/vm.py", line 313, in _sampleCpu
cpuStats = self._vm._dom.getCPUStats(True, 0)
File "/usr/share/vdsm/virt/vm.py", line 668, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2016, in getCPUStats
if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self)
libvirtError: Requested operation is not valid: cgroup CPUACCT controller is not mounted
Thread-19::DEBUG::2014-11-21 13:07:45,342::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 55 edom: 10 level: 2 message: Requested operation is not valid: cgroup CPU controller is not mounted
Thread-19::ERROR::2014-11-21 13:07:45,342::sampling::475::vm.Vm::(collect) vmId=`bef5cfb5-b8ac-4ec8-bdda-bedb757ea695`::Stats function failed: <AdvancedStatsFunction _sampleCpuTune at 0x3047650>
Traceback (most recent call last):
File "/usr/share/vdsm/virt/sampling.py", line 471, in collect
statsFunction()
File "/usr/share/vdsm/virt/sampling.py", line 346, in __call__
retValue = self._function(*args, **kwargs)
File "/usr/share/vdsm/virt/vm.py", line 359, in _sampleCpuTune
infos = self._vm._dom.schedulerParameters()
File "/usr/share/vdsm/virt/vm.py", line 668, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2134, in schedulerParameters
if ret is None: raise libvirtError ('virDomainGetSchedulerParameters() failed', dom=self)
libvirtError: Requested operation is not valid: cgroup CPU controller is not mounted
Created attachment 959670 [details]
host's logs
So, just to recap, hosted-engine --deploy finished with success, I can see in the logs:
"Hosted Engine successfully set up"
The engine VM is up and running and everything seems to work but VDSM is reporting
Thread-19::ERROR::2014-11-21 13:07:30,330::sampling::475::vm.Vm::(collect) vmId=`bef5cfb5-b8ac-4ec8-bdda-bedb757ea695`::Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x304c840>
Traceback (most recent call last):
File "/usr/share/vdsm/virt/sampling.py", line 471, in collect
statsFunction()
File "/usr/share/vdsm/virt/sampling.py", line 346, in __call__
retValue = self._function(*args, **kwargs)
File "/usr/share/vdsm/virt/vm.py", line 313, in _sampleCpu
cpuStats = self._vm._dom.getCPUStats(True, 0)
File "/usr/share/vdsm/virt/vm.py", line 668, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2016, in getCPUStats
if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self)
libvirtError: Requested operation is not valid: cgroup CPUACCT controller is not mounted
Thread-19::DEBUG::2014-11-21 13:07:30,335::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 55 edom: 10 level: 2 message: Requested operation is not valid: cgroup CPU controller is not mounted
Moving to VDSM infra for further investigation since it may be just an uncaught exception.
The issue relates to specific libvirt request for cpu states. I think virt more familiar with the process and can give better inputs about this area I saw the same message casually on one Centos 7 host of mine. I haven't pinpointed here the root cause but looks like we faced it in the not so distant past https://bugzilla.redhat.com/show_bug.cgi?id=982306 relevant libvirt documentation: https://libvirt.org/cgroups.html it seems it is a mismatch on the tree hierarchy systemd configures and libvirt expects. CPUStats and Hosted Engine are SLA, moving for further investigation. Yaniv, it may be also a configuration issue so it may end up back on infra... What configuration issue can it be? I can't find something related to this exception and never encountered it.. Let's get more input and see what can lead to this. I follow it Martin, any idea? (In reply to Yaniv Bronhaim from comment #8) > What configuration issue can it be? I can't find something related to this > exception and never encountered it.. Let's get more input and see what can > lead to this. I follow it > > Martin, any idea? The above seems quite clear- libvirtError: Requested operation is not valid: cgroup CPUACCT controller is not mounted We need to verify cgroups are mounted, and it seems they are not. This is what causing the exception. In the past host deploy verified it. Yaniv, can you please check what changed? Martin - Please check whether cgroups is really available or not (or perhaps ask Artyom to reproduce first). As far as I know cgroups are part of the OS, so perhaps it is a libvirt issue. Anyway, it is not a host-deploy issue. Libvirt is responsible for using the proper cgroups structure and libvirt daemon should depend on cgroups mounting in systemd. I will ask libvirt people for clarification, but I think there is nothing we can do here. It also might be related to cgroups disappearing in some cases due to a bug in systemd: https://bugzilla.redhat.com/show_bug.cgi?id=1179715 so I guess this is worth a retest with https://rhn.redhat.com/errata/RHBA-2015-0037.html Created attachment 998344 [details]
hosted-engine-setup and vdsm log
I did deploy on rhel7, with:
systemd-python-208-11.el7_0.6.x86_64
systemd-libs-208-11.el7_0.6.x86_64
systemd-208-11.el7_0.6.x86_64
systemd-sysv-208-11.el7_0.6.x86_64
and do not see above error in vdsm log, how I understand it's not happen all time, so this retest not mean much.
what's different from comment #2? I see a little difference in versions: vdsm-4.16.8.1-7.el7ev.x86_64 qemu-kvm-rhev-1.5.3-60.el7_0.12.x86_64 libvirt-daemon-1.1.1-29.el7_0.7.x86_64 It can be different cpu or something else, or bug occurred but not always happen. Maybe Jiri can provide more information(if he can reproduce it and provide logs with exception) I also see this error with http://resources.ovirt.org/pub/ovirt-3.5/iso/ovirt-node-iso-3.5-0.201502231653.el6.iso. Created attachment 1005577 [details]
vdsm.log
ovirt-node-iso-3.5-0.201502231653.el6.iso
moving to 3.5.4 due to capacity planning for 3.5.3. if you believe this should remain in 3.5.3, please sync with pm/dev/qe and a full triple ack for it. also - ensure priority is set accordingly. Artyom can you or Jiri try to reproduce? if its still there will handle it for 3.6, otherwise close it. I checked deployment on clean host and no exception connect to libvirt error. So hard for me to say if bug still actually or not, I that Raul Laansoo was the last who saw bug, so maybe he can help. |
Created attachment 944164 [details] vdsm log Description of problem: Unexpected exception in vdsm log, when setting up hosted-engine on EL7, setup of hosted engine success and work. Version-Release number of selected component (if applicable): vdsm-4.16.5-2.el7.x86_64 ovirt-hosted-engine-setup-1.2.0-1.el7.noarch How reproducible: Always Steps to Reproduce: 1. Run hosted-engine --deploy and finish setup 2. 3. Actual results: Unexpected exception in vdsm log: Thread-20::ERROR::2014-10-05 17:23:10,961::sampling::475::vm.Vm::(collect) vmId=`f4675766-a472-4c49-bde7-ad68db30e77c`::Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x2adf1b0> Traceback (most recent call last): File "/usr/share/vdsm/virt/sampling.py", line 471, in collect statsFunction() File "/usr/share/vdsm/virt/sampling.py", line 346, in __call__ retValue = self._function(*args, **kwargs) File "/usr/share/vdsm/virt/vm.py", line 313, in _sampleCpu cpuStats = self._vm._dom.getCPUStats(True, 0) File "/usr/share/vdsm/virt/vm.py", line 670, in f ret = attr(*args, **kwargs) File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper ret = f(*args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2016, in getCPUStats if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self) libvirtError: Requested operation is not valid: cgroup CPUACCT controller is not mounted Thread-20::DEBUG::2014-10-05 17:23:10,964::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 55 edom: 10 level: 2 message: Requested operation is not valid: cgroup CPU controller is not mounted Expected results: No any exceptions in vdsm log. Additional info: