Bug 837735 - [VDSM] Node randomly goes offline.
Product: oVirt
Classification: Community
Component: vdsm
Version: 3.1 RC
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Dan Kenigsberg
Depends On:
Reported: 2012-07-04 23:49 EDT by Robert Middleswarth
Modified: 2012-10-27 19:20 EDT

Doc Type: Bug Fix
Last Closed: 2012-10-27 19:20:11 EDT
Type: Bug

Attachments
Log Collector files from the crashing system. (4.84 MB, application/octet-stream)
2012-07-04 23:49 EDT, Robert Middleswarth

Description Robert Middleswarth 2012-07-04 23:49:47 EDT
Created attachment 596310
Log Collector files from the crashing system.

Description of problem:
One of my three nodes keeps randomly going offline; the other nodes are running fine without any issues.

Version-Release number of selected component (if applicable):
Installed Packages
Name        : vdsm
Arch        : x86_64
Version     : 4.10.0
Release     : 0.58.gita6f4929.el6
Size        : 2.3 M
Repo        : installed
From repo   : vdsm-dre
Summary     : Virtual Desktop Server Manager
URL         : http://www.ovirt.org/wiki/Vdsm
License     : GPLv2+
Description : The VDSM service is required by a Virtualization Manager to manage
            : the Linux hosts. VDSM manages and monitors the host's storage,
            : memory and networks as well as virtual machine creation, other
            : host administration tasks, statistics gathering, and log
            : collection.

How reproducible:
Seems to be happening every few hours, but only on the one host.

Steps to Reproduce:
1. No steps needed; it happens on its own.
Actual results:
The other hosts are stable but not this one.

Expected results:
All 3 being stable.

Additional info:
The ovirt-log-collector files are attached.
Comment 1 Itamar Heim 2012-07-05 02:56:28 EDT
not sure which OS/distro this is from (I'm guessing centos 6.2), but:
all root files (dmidecode, etc.) are 0 byte size
vdsm log is missing

I think some sos plugins on this host are missing.

keith - thoughts?
Comment 2 Ayal Baron 2012-07-05 03:55:07 EDT
In addition to proper logs: what do you mean by "it goes offline"?
Does the physical host shut down?
Does vdsm stop?
Is it non-operational in engine?
Comment 3 Robert Middleswarth 2012-07-05 18:54:42 EDT

Please define what you mean by SOS plugins.

Host goes non-operational.

Comment 4 Robert Middleswarth 2012-07-05 18:57:12 EDT
I moved to a newer build and the one node hasn't gone offline since. If it pops back up, what logs are you looking for?

Comment 5 Dan Kenigsberg 2012-10-27 19:20:11 EDT
I'd like to see the output of getVdsCaps on the non-op machine. Running `sosreport -o vdsm` should provide it (and more). Please reopen if needed.
