Bug 689814 - 2.2.8 - Host became non-responsive after attaching ISCSI data storage domain to the DataCenter
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: vdsm22
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Igor Lvovsky
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-22 14:16 UTC by Evgeniy German
Modified: 2016-04-18 06:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-14 12:46:49 UTC
Target Upstream Version:


Attachments
RHEVM and VDSM22 logs (1.44 MB, application/x-gzip)
2011-03-22 14:16 UTC, Evgeniy German
vdsm log (RHEL6 log) (1.31 MB, application/octet-stream)
2011-04-12 09:47 UTC, Evgeniy German

Description Evgeniy German 2011-03-22 14:16:49 UTC
Created attachment 486804 [details]
RHEVM and VDSM22 logs

Description of problem:
The non-SPM host becomes non-responsive after an iSCSI data storage domain is attached to the data center.

Version-Release number of selected component (if applicable):
RHEVM version: ic 104
vdsm22 on both hosts: vdsm22-4.5-63.24.el5_6


Steps to Reproduce:
1. Create a data center and a cluster (type iSCSI, version 2.2)
2. Add at least two hosts
3. Create an iSCSI data storage domain
4. Attach the created storage domain to the data center
5. Observe that one host is SPM and the other is non-responsive

Expected results:
All hosts are in status Up and one of them is SPM.

Additional info:
* The same behaviour also occurs when the storage domain is attached through the REST API.

Comment 1 Dan Kenigsberg 2011-03-30 22:43:36 UTC
For how long does the non-SPM host stay non-responsive? Forever?

Is this behavior new to rhev-m-2.3?

Either way,

Thread-12940::ERROR::2011-03-22 15:45:57,547::misc::66::irs::'masterValidate'
Thread-12940::ERROR::2011-03-22 15:45:57,548::misc::67::irs::Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 978, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1696, in public_repoStats
    valid = (master_stats['masterValidate']['mount'] and
KeyError: 'masterValidate'

smells like the outcome of a race between adding a SD and reporting its repoStats.
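
To make the suspected failure mode concrete, here is a minimal sketch. It is not the vdsm code: the helper name master_mount_ok and both sample dictionaries are hypothetical, and only the 'masterValidate' and 'mount' keys come from the traceback above (the rest of the original expression is cut off there, so it is not reproduced). It assumes the stats dictionary can be read before validation of the newly attached domain has filled it in.

```python
def master_mount_ok(master_stats):
    """Return True only if the master domain reports a mounted state.

    A missing 'masterValidate' entry (stats not collected yet) is
    treated as "not valid" instead of raising KeyError.
    """
    validate = master_stats.get('masterValidate')
    if validate is None:
        return False
    return bool(validate.get('mount'))


# Hypothetical stats read before the new domain has been validated.
incomplete = {}
# Hypothetical stats read after validation has run.
complete = {'masterValidate': {'mount': True}}

# Direct indexing, as in the traceback, fails on the incomplete dict.
try:
    incomplete['masterValidate']['mount']
    raised = False
except KeyError:
    raised = True
```

Whether a missing entry should be reported as "not valid" or retried is a design choice for the real fix; the sketch only shows that direct indexing turns a transient gap into an exception.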

Comment 2 Evgeniy German 2011-03-31 05:09:03 UTC
(In reply to comment #1)
> For how long does the non-SPM host stay non-responsive? Forever?

The non-SPM host stays in the non-responsive state forever.

Comment 3 Igor Lvovsky 2011-04-10 15:23:19 UTC
It looks like a setup issue.
I just added some extra logging and the problem disappeared.
Let's try to reproduce it on RHEL6.

Comment 4 Evgeniy German 2011-04-12 09:47:35 UTC
Created attachment 491439 [details]
vdsm log (RHEL6 log)

Comment 5 Evgeniy German 2011-04-12 09:48:19 UTC
Reproduced on RHEL6 (log attached).

