Bug 851936
Summary: Occasionally gluster storage domain goes offline, but RHS nodes are all online.

Product: [Red Hat Storage] Red Hat Gluster Storage
Component: vdsm
Version: 2.0
Status: CLOSED INSUFFICIENT_DATA
Severity: unspecified
Priority: unspecified
Reporter: Gowrishankar Rajaiyan <grajaiya>
Assignee: Dan Kenigsberg <danken>
QA Contact: Sudhir D <sdharane>
CC: barumuga, grajaiya, hchiramm, perfbz, rhs-bugs, vbellur
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-09-16 06:00:18 UTC
Thread-194253::DEBUG::2011-10-11 14:01:15,351::task::978::TaskManager.Task::(_decref) Task=`4dc31781-bb5b-4a71-8d72-5e12ea60ef2e`::ref 0 aborting False
Thread-194254::DEBUG::2011-10-11 14:01:15,384::libvirtvm::240::vm.Vm::(_getDiskStats) vmId=`6de348d7-bf46-4881-ab23-e5d41d13f42e`::Disk hdc stats not available
Thread-194254::DEBUG::2011-10-11 14:01:15,385::libvirtvm::240::vm.Vm::(_getDiskStats) vmId=`d0f58ab3-7202-45a0-a9b9-6088e00a65f1`::Disk hdc stats not available
Thread-194254::DEBUG::2011-10-11 14:01:15,386::libvirtvm::240::vm.Vm::(_getDiskStats) vmId=`4f1d9317-46ab-4eb3-8f90-e0b258513d19`::Disk hdc stats not available
Thread-194254::DEBUG::2011-10-11 14:01:15,386::libvirtvm::240::vm.Vm::(_getDiskStats) vmId=`816ac538-6aa6-44af-8ad0-7c35d149d6c0`::Disk hdc stats not available
Thread-194254::DEBUG::2011-10-11 14:01:15,387::libvirtvm::240::vm.Vm::(_getDiskStats) vmId=`d9e0053c-0e7b-442f-abb5-733f31da97b4`::Disk hdc stats not available
Thread-25::ERROR::2011-10-11 14:01:15,465::utils::399::vm.Vm::(collect) vmId=`d9e0053c-0e7b-442f-abb5-733f31da97b4`::Stats function failed: <AdvancedStatsFunction _sampleNet at 0x29b0a78>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 395, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 272, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 179, in _sampleNet
    netSamples[nic.name] = self._vm._dom.interfaceStats(nic.name)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1762, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: internal error client socket is closed
Thread-22::ERROR::2011-10-11 14:01:19,192::utils::399::vm.Vm::(collect) vmId=`4f1d9317-46ab-4eb3-8f90-e0b258513d19`::Stats function failed: <AdvancedStatsFunction _sampleNet at 0x29b0a78>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 395, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 272, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 179, in _sampleNet
    netSamples[nic.name] = self._vm._dom.interfaceStats(nic.name)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1762, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: internal error client socket is closed
Thread-21::ERROR::2011-10-11 14:01:19,205::utils::399::vm.Vm::(collect) vmId=`816ac538-6aa6-44af-8ad0-7c35d149d6c0`::Stats function failed: <AdvancedStatsFunction _sampleNet at 0x29b0a78>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 395, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 272, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 179, in _sampleNet
    netSamples[nic.name] = self._vm._dom.interfaceStats(nic.name)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1762, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: internal error client socket is closed

I am not clear about this exception. Please help on this.
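The tracebacks above show a single sampling function (_sampleNet) raising libvirtError once the libvirt client socket is closed, while the disk sampler keeps producing DEBUG lines. That matches a collector that catches failures per sampling function and logs them instead of aborting the whole loop. A minimal sketch of that pattern, with a stub exception class standing in for libvirt.libvirtError so the example is self-contained (names here are illustrative, not vdsm's actual API):

```python
import logging


class LibvirtError(Exception):
    """Stand-in for libvirt.libvirtError (assumed, to keep the sketch self-contained)."""


def collect(stats_functions, log=logging.getLogger("vm.Vm")):
    """Run each sampling function; log failures rather than aborting the loop.

    Mirrors the behaviour visible in the log: one broken sampler (e.g. a
    closed libvirt socket) must not stop the other samplers from running.
    """
    results = {}
    for name, fn in stats_functions.items():
        try:
            results[name] = fn()
        except LibvirtError as exc:
            log.error("Stats function failed: %s: %s", name, exc)
    return results


def _sample_net_broken():
    # Simulates interfaceStats() failing on a closed client socket.
    raise LibvirtError("internal error client socket is closed")


def _sample_disk_ok():
    return {"hdc": None}  # analogous to "Disk hdc stats not available"


stats = collect({"_sampleNet": _sample_net_broken, "_sampleDisk": _sample_disk_ok})
```

Note that this pattern explains why the errors repeat every sampling interval: the broken function is retried each cycle, but nothing in the loop itself reconnects to libvirt.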
The attached log is unrelated to storage - this is not what made Engine believe that the storage is offline. Do you see a prepareForShutdown somewhere? Could you find other clues in the engine and vdsm logs? Has the VM crashed, or did it appear up after vdsm was restarted? Which libvirt version is in use? Do its logs have clues about vdsm's disconnecting from it? Has the libvirt process crashed?

Haven't seen this behaviour after upgrading to si17. Please reopen with the requested info if it ever reproduces.
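Answering the triage questions above mostly means scanning the vdsm log (normally /var/log/vdsm/vdsm.log) for a prepareForShutdown record, which would indicate a clean vdsm stop, and for ERROR records around the time the domain went offline. A hypothetical helper for that scan (the function and sample lines are illustrative, not part of vdsm):

```python
import re


def scan_vdsm_log(lines):
    """Return log lines suggesting why Engine lost the storage domain.

    Flags vdsm's prepareForShutdown (a clean vdsm stop) and any
    ERROR-level records in vdsm's Thread::LEVEL::timestamp format.
    """
    interesting = []
    for line in lines:
        if "prepareForShutdown" in line or re.search(r"::ERROR::", line):
            interesting.append(line.rstrip())
    return interesting


# Illustrative sample lines in vdsm's log format.
sample = [
    "Thread-25::ERROR::2011-10-11 14:01:15,465::utils::399::vm.Vm::(collect) ...",
    "Thread-194253::DEBUG::2011-10-11 14:01:15,351::task::978::TaskManager.Task::(_decref) ref 0 aborting False",
    "MainThread::INFO::2011-10-11 14:05:02,001::vdsm::76::vds::(prepareForShutdown) received signal, shutting down",
]
hits = scan_vdsm_log(sample)
```

Correlating the timestamps of such hits with Engine's event log is what distinguishes a vdsm restart from a libvirt disconnect.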
Created attachment 607114 [details]
vdsm.log

Description of problem:
During RHEV3.1-RHS2.0+ testing, I noticed that gluster storage domains go offline even though all the RHS nodes are online.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0rhs-26.el6rhs.x86_64
rhev-hypervisor-6.1-20120607.0.el6_1.noarch
vdsm-4.9.6-28.0.el6_3.x86_64

How reproducible:
Occasionally

Steps to Reproduce:
1. Set up a RHEV environment.
2. Create a Datacenter with storage as a gluster mount (POSIX compliant FS).
3. Create a virtual machine on this storage.

Actual results:
After some time, the storage domains go offline. Restarting vdsm brings them back online.

Expected results:
The storage domain should never go offline.

Additional info: