Bug 1412092 - Hosts moving to connecting state if one of the servers in the DC is in non-responsive state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: future
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.1.0-rc
Target Release: 4.1.0
Assignee: Piotr Kliczewski
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-11 09:01 UTC by Michael Burman
Modified: 2017-02-01 14:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-01 14:59:03 UTC
oVirt Team: Infra
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments
engine logs (655.17 KB, application/x-gzip)
2017-01-11 09:01 UTC, Michael Burman
new engine log (510.31 KB, application/x-gzip)
2017-01-11 09:02 UTC, Michael Burman


Links
System        ID     Private  Priority   Status  Summary                   Last Updated
oVirt gerrit  69993  0        master     MERGED  handle ssl closed status  2017-01-12 08:25:40 UTC
oVirt gerrit  70385  0        ovirt-4.1  MERGED  handle ssl closed status  2017-01-15 11:39:49 UTC

Description Michael Burman 2017-01-11 09:01:55 UTC
Created attachment 1239364
engine logs

Description of problem:
Hosts move to the Connecting state when one of the servers in the DC is non-responsive.

Version-Release number of selected component (if applicable):
4.1.0-0.4.master.20170110134514.git1586fd4.el7.centos
vdsm-4.19.1-26.gitc25fa08.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Have a few hosts in a DC.
2. Make one host non-responsive (stop vdsmd), or try to add a host and have the addition fail.
3. All servers and the storage domain go down, the DC goes down, and all servers get stuck in the Connecting state forever.
Only an engine restart brings them back Up.

Comment 1 Michael Burman 2017-01-11 09:02:48 UTC
Created attachment 1239365
new engine log

Comment 2 Piotr Kliczewski 2017-01-11 10:42:19 UTC
The wrong version of the library is used, so I am changing the version.

Comment 3 Petr Matyáš 2017-01-23 11:47:19 UTC
When I stop vdsm on one of the hosts (with PM), it stays in Connecting for 60s and doesn't affect the other hosts. After that, it goes to Non Responsive and isn't fenced. Should I report this as a new bug or move this one to ASSIGNED?

Comment 4 Piotr Kliczewski 2017-01-23 11:51:40 UTC
Petr, fencing is not part of this patch. I suggest opening a new BZ for it.

Comment 5 Petr Matyáš 2017-01-23 11:56:33 UTC
In that case, verified on 4.1.0-8
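
For readers skimming the thread, the behavior verified in Comment 3 can be sketched as a toy model: a host that stops answering stays in Connecting for about 60 seconds and then becomes Non Responsive, while the other hosts are left alone. This is only an illustration and not ovirt-engine code. The names `Host`, `HostStatus`, `refresh` and `CONNECT_GRACE_SECONDS` are hypothetical, and the 60-second value is simply the figure observed in Comment 3.

```python
from dataclasses import dataclass
from enum import Enum


class HostStatus(Enum):
    # Hypothetical labels for this sketch, not the engine's own enum.
    UP = "Up"
    CONNECTING = "Connecting"
    NON_RESPONSIVE = "Non Responsive"


# Assumption: grace period taken from the 60s observed in Comment 3.
CONNECT_GRACE_SECONDS = 60


@dataclass
class Host:
    name: str
    reachable: bool = True
    status: HostStatus = HostStatus.UP
    unreachable_for: int = 0  # seconds since the host stopped answering


def refresh(hosts, elapsed_seconds):
    """Advance the toy model; every host is evaluated independently."""
    for host in hosts:
        if host.reachable:
            host.unreachable_for = 0
            host.status = HostStatus.UP
            continue
        host.unreachable_for += elapsed_seconds
        if host.unreachable_for >= CONNECT_GRACE_SECONDS:
            host.status = HostStatus.NON_RESPONSIVE
        else:
            host.status = HostStatus.CONNECTING


hosts = [Host("host1"), Host("host2"), Host("host3")]
hosts[0].reachable = False  # e.g. vdsmd was stopped on host1

refresh(hosts, 30)
after_30s = [h.status for h in hosts]  # host1 Connecting, others Up

refresh(hosts, 40)
after_70s = [h.status for h in hosts]  # host1 Non Responsive, others Up
```

The point of the sketch is the per-host loop: one unreachable host changes only its own status. The original report described the opposite, where a single failing host dragged every host in the DC into Connecting.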

