Bug 684595 - [vdsm] [storage] [scale] deactivate storage domain doesn't return with valid return response
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eduardo Warszawski
QA Contact: Haim
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-13 17:40 UTC by Haim
Modified: 2014-01-13 00:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-01 08:36:43 UTC
Target Upstream Version:


Attachments
vdsm logs. (10.68 MB, application/x-tar), 2011-03-13 17:40 UTC, Haim

Description Haim 2011-03-13 17:40:45 UTC
Created attachment 484029
vdsm logs.

Description of problem:

Running on a small-scale system with the new metadata, rhevm at some point decided to deactivate a storage domain because of its problematic state (high latency) and sent deactivateStorageDomain. vdsm started to process the command but never returned a valid response code (I grepped all the logs). The log looks like this:
------------------------------------------------------------------------------
Thread-86543::INFO::2011-03-12 15:36:31,328::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: deactivateStorageDomain, args: ( sdUUID=aa6a8d53-8c0a-4be1-865e-452948c2ef83 spUUID=a8e3a5e0-1437-4dfb-9ac5-c6835227a074 msdUUID=00000000-0000-0000-0000-000000000000 masterVersion=146)
Thread-86543::DEBUG::2011-03-12 15:36:31,684::task::491::TaskManager.Task::(_debug) Task f3e5ac66-b788-414f-a57f-ff9b35e7c97c: moving from state init -> state preparing
------------------------------------------------------------------------------

After that I see a few log lines for this thread. At some point it goes to sleep for 2 minutes (it did not manage to acquire the resource), and then I see the following:
------------------------------------------------------------------------------
Thread-86543::INFO::2011-03-12 15:38:18,908::sp::942::Storage.StoragePool::(deactivateSD) sdUUID=aa6a8d53-8c0a-4be1-865e-452948c2ef83 spUUID=a8e3a5e0-1437-4dfb-9ac5-c6835227a074 msdUUID=00000000-0000-0000-0000-000000000000
------------------------------------------------------------------------------

There is no return response whatsoever. Towards the end of the log I get the following errors, but none of them concern that specific SD:
------------------------------------------------------------------------------
- RuntimeError: _handleRequests._checkForMail - Could not read mailbox
- AttributeError: 'NoneType' object has no attribute 'partial'
------------------------------------------------------------------------------

[root@rhev-i32c-01 vdsm]# zgrep  deactivateStorage /var/log/vdsm/vdsm.log.*  |grep aa6a8d53-8c0a-4be1-865e-452948c2ef83 | grep Run

/var/log/vdsm/var/log/vdsm/vdsm.log.27.gz:Thread-86543::INFO::2011-03-12 15:36:31,328::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: deactivateStorageDomain, args: ( sdUUID=aa6a8d53-8c0a-4be1-865e-452948c2ef83 spUUID=a8e3a5e0-1437-4dfb-9ac5-c6835227a074 msdUUID=00000000-0000-0000-0000-000000000000 masterVersion=146)

/var/log/vdsm/vdsm.log.42.gz:Thread-63070::INFO::2011-03-12 00:27:12,374::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: deactivateStorageDomain, args: ( sdUUID=aa6a8d53-8c0a-4be1-865e-452948c2ef83 spUUID=a8e3a5e0-1437-4dfb-9ac5-c6835227a074 msdUUID=00000000-0000-0000-0000-000000000000 masterVersion=141)
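
The zgrep above only finds the "Run and protect ... args:" entries. A minimal sketch of automating the other half of the search, pairing each call with a response on the same thread, could look like the following. It relies on the `::`-separated line layout visible in the excerpts above. The `Return response:` marker is an assumption about how vdsm logs a completed call (it does not appear anywhere in this report), and the helper names `find_unanswered` and `read_logs` are hypothetical.

```python
import gzip
import re
import sys

# Call marker, as seen in the log excerpts quoted in this report.
CALL_RE = re.compile(r"Run and protect: (\w+), args:")
# ASSUMPTION: a completed call is logged with this marker on the same thread.
# Adjust the pattern if the vdsm version in use logs responses differently.
RESPONSE_RE = re.compile(r"Run and protect: (\w+), Return response:")


def find_unanswered(lines):
    """Return call lines that have no later response line on the same thread."""
    pending = {}  # (thread name, verb) -> most recent unanswered call line
    for raw in lines:
        line = raw.rstrip("\n")
        # Layout: thread::level::timestamp::module::lineno::logger::message
        parts = line.split("::", 6)
        if len(parts) != 7:
            continue  # traceback or continuation line, not a log record
        thread, message = parts[0], parts[6]
        call = CALL_RE.search(message)
        if call:
            pending[(thread, call.group(1))] = line
            continue
        response = RESPONSE_RE.search(message)
        if response:
            pending.pop((thread, response.group(1)), None)
    return list(pending.values())


def read_logs(paths):
    """Yield lines from plain or gzip-compressed log files, in the order given."""
    for path in paths:
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as handle:
            yield from handle


if __name__ == "__main__":
    for unanswered in find_unanswered(read_logs(sys.argv[1:])):
        print(unanswered)
```

The rotated files have to be passed oldest first (for example vdsm.log.42.gz before vdsm.log.27.gz), otherwise a response can be read before its call. A call that ended with an exception instead of a response would also be reported, so each hit still needs a manual look.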

result: 

The VG is active but has no link in '/rhev/data-center/mnt/blockSD/'. The backend rolled back the command and 'thinks' the VG (domain) is up, meaning the overall state is a total mess.
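
With 31 storage domains it may help to check for the missing link across all of them at once. The sketch below assumes that each block domain appears under that directory as an entry named after its sdUUID. That naming is an assumption, not something shown in this report, and the helper name `domains_missing_link` is hypothetical.

```python
import os

# Directory quoted in the report above.
BLOCK_SD_DIR = "/rhev/data-center/mnt/blockSD/"


def domains_missing_link(sd_uuids, base_dir=BLOCK_SD_DIR):
    """Return the UUIDs that have no entry (link or directory) under base_dir.

    ASSUMPTION: the entry for a domain is named exactly after its sdUUID.
    lexists() is used so that a dangling symlink still counts as present.
    """
    return [
        sd_uuid
        for sd_uuid in sd_uuids
        if not os.path.lexists(os.path.join(base_dir, sd_uuid))
    ]


if __name__ == "__main__":
    # The domain from this report; extend the list with the other sdUUIDs.
    print(domains_missing_link(["aa6a8d53-8c0a-4be1-865e-452948c2ef83"]))
```

Comparing the result with the list of active VGs on the host would show which domains are in the inconsistent state described here.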

setup:

1) fcp
2) 31 storage domains 
3) vm load - 194

Comment 1 RHEL Program Management 2011-04-04 02:12:42 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 2 Eduardo Warszawski 2011-04-07 09:53:08 UTC
The attached logs are not from this bug.
Haim, please add them or reproduce.

Comment 3 Haim 2011-05-01 08:36:43 UTC
(In reply to comment #2)
> The attached logs are not from this bug.
> Haim, please add them or reproduce.

Small chance of reproducing it; I will reopen in case I hit it again.

