Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 837735

Summary: [VDSM] Node randomly goes offline.
Product: [Retired] oVirt
Reporter: Robert Middleswarth <robert>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED INSUFFICIENT_DATA
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.1 RC
CC: abaron, acathrow, bazulay, dyasny, iheim, mgoldboi, ykaul
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-10-27 23:20:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: Log Collector files from the crashing system.
Flags: none

Description Robert Middleswarth 2012-07-05 03:49:47 UTC
Created attachment 596310
Log Collector files from the crashing system.

Description of problem:
One of my 3 nodes keeps randomly going offline; the other nodes are running fine without any issues.

Version-Release number of selected component (if applicable):
Installed Packages
Name        : vdsm
Arch        : x86_64
Version     : 4.10.0
Release     : 0.58.gita6f4929.el6
Size        : 2.3 M
Repo        : installed
From repo   : vdsm-dre
Summary     : Virtual Desktop Server Manager
URL         : http://www.ovirt.org/wiki/Vdsm
License     : GPLv2+
Description : The VDSM service is required by a Virtualization Manager to manage
            : the Linux hosts. VDSM manages and monitors the host's storage,
            : memory and networks as well as virtual machine creation, other
            : host administration tasks, statistics gathering, and log
            : collection.

How reproducible:
Seems to happen every few hours, but only on the one host.

Steps to Reproduce:
1. No steps needed; it happens on its own.
  
Actual results:
The other hosts are stable but not this one.

Expected results:
All 3 being stable.

Additional info:
Attached is ovirt-log-collector files.

Comment 1 Itamar Heim 2012-07-05 06:56:28 UTC
Not sure which OS/distro this is from (I'm guessing CentOS 6.2), but:
all root files (dmidecode, etc.) are 0 byte size
vdsm log is missing

I think some sos plugins on this host are missing.

Keith, thoughts?
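
The check described above (zero-byte files, missing vdsm log) can be sketched roughly as follows. This is only an illustration, assuming the log collector archive has already been extracted to a local directory; the function name `audit_report` and the expected file name `vdsm.log` are assumptions, not part of ovirt-log-collector.

```python
import os
import sys


def audit_report(root, expected_names=("vdsm.log",)):
    """Scan an extracted report tree.

    Returns (zero_byte_files, missing_names):
      zero_byte_files: sorted paths (relative to root) of empty files
      missing_names:   expected file names not found anywhere in the tree
    """
    zero_byte = []
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            seen.add(name)
            # Skip symlinks so a dangling link does not raise in getsize().
            if not os.path.islink(path) and os.path.getsize(path) == 0:
                zero_byte.append(os.path.relpath(path, root))
    missing = [n for n in expected_names if n not in seen]
    return sorted(zero_byte), missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # The directory to scan is passed on the command line (hypothetical usage).
    empty, missing = audit_report(sys.argv[1])
    print("zero-byte files:", empty)
    print("missing files:", missing)
```

A long list of empty files together with a missing vdsm log would match the symptoms noted in this comment, though it does not by itself say which sos plugins are absent.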

Comment 2 Ayal Baron 2012-07-05 07:55:07 UTC
In addition to proper logs: what do you mean by "goes offline"?
The physical host shuts down?
vdsm stops?
It is non-operational in engine?

Comment 3 Robert Middleswarth 2012-07-05 22:54:42 UTC
Itamar,

Please define what you mean by SOS plugins?

Ayal,
Host goes non-operational.

Thanks
Robert

Comment 4 Robert Middleswarth 2012-07-05 22:57:12 UTC
I moved to a newer build and the one node hasn't gone offline since. If it pops back up, what logs are you looking for?

Thanks
Robert

Comment 5 Dan Kenigsberg 2012-10-27 23:20:11 UTC
I'd like to see the output of getVdsCaps on the non-op machine. Running `sosreport -o vdsm` should provide it (and more). Please reopen if needed.
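
A minimal sketch of scripting the `sosreport -o vdsm` collection mentioned above, for anyone who wants to run it unattended on the affected host. The helper names are hypothetical, and the `--batch` flag (to skip interactive prompts) is an addition that is not part of the original request; sosreport normally needs to run as root.

```python
import subprocess


def sosreport_command(plugins=("vdsm",), batch=True):
    """Build the sosreport argument list, limited to the given plugins."""
    # -o restricts the report to the listed plugins (comma-separated).
    cmd = ["sosreport", "-o", ",".join(plugins)]
    if batch:
        # Assumption: run non-interactively instead of prompting the user.
        cmd.append("--batch")
    return cmd


def collect(plugins=("vdsm",)):
    """Run sosreport on the local host; raises CalledProcessError on failure."""
    return subprocess.run(sosreport_command(plugins), check=True)
```

Calling `collect()` on the non-operational host should leave a report archive that can be attached to this bug.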