Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1109156

Summary: [vdsm] Adding first NFS data domain failed couple of times before added successfully
Product: [Retired] oVirt
Component: vdsm
Reporter: Jiri Belka <jbelka>
Assignee: Federico Simoncelli <fsimonce>
QA Contact: Jiri Belka <jbelka>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Version: 3.5
Target Release: 3.5.0
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
oVirt Team: Storage
Fixed In Version: ovirt-engine-3.5.0_beta1
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-10-17 12:24:25 UTC
CC: amureini, bazulay, bugs, derez, fsimonce, gklein, iheim, jbelka, mgoldboi, rbalakri, yeylon
Attachments:
sosreport-LogCollector-20140613131328.tar.xz (no flags)

Description Jiri Belka 2014-06-13 11:18:26 UTC
Created attachment 908492 [details]
sosreport-LogCollector-20140613131328.tar.xz

Description of problem:
Adding an NFS data domain failed a couple of times before it was added successfully, each time with the following error in the Admin Portal:

~~~
Operation Canceled
Error while executing action: A Request to the Server failed with the following Status Code: 500
~~~

FYI, this was the first data domain, and the DC was in status 'Uninitialized' (obviously).

~~~
Thread-13::DEBUG::2014-06-13 13:10:08,328::task::595::TaskManager.Task::(_updateState) Task=`2a1320ef-de70-4b07-9b13-8290361dd806`::moving from state preparing -> state finished
Thread-13::DEBUG::2014-06-13 13:10:08,328::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-13::DEBUG::2014-06-13 13:10:08,328::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-13 13:10:08,328::task::993::TaskManager.Task::(_decref) Task=`2a1320ef-de70-4b07-9b13-8290361dd806`::ref 0 aborting False
Thread-13::DEBUG::2014-06-13 13:10:08,384::BindingXMLRPC::325::vds::(wrapper) client [10.34.60.239] flowID [2f4d8180]
Thread-13::DEBUG::2014-06-13 13:10:08,385::task::595::TaskManager.Task::(_updateState) Task=`bf8ce66d-40e5-40fa-9932-8837d479710b`::moving from state init -> state preparing
Thread-13::INFO::2014-06-13 13:10:08,385::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-0000000002b0', options=None)
Thread-13::ERROR::2014-06-13 13:10:08,385::task::866::TaskManager.Task::(_setError) Task=`bf8ce66d-40e5-40fa-9932-8837d479710b`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 607, in getSpmStatus
    pool = self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 325, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('00000002-0002-0002-0002-0000000002b0',)
~~~
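The traceback shows getSpmStatus failing because the host's HSM has no record of the requested pool UUID: getPool looks the UUID up in its map of connected pools and raises StoragePoolUnknown when it is absent, which the engine surfaces as the HTTP 500 above. A minimal illustrative sketch of that lookup pattern (not the actual vdsm code; the class and dictionary here are simplified assumptions):

```python
# Sketch of the failure mode from the traceback: querying SPM status for a
# storage pool UUID that this host has not (yet) connected to raises
# StoragePoolUnknown. Names are illustrative, not real vdsm internals.

class StoragePoolUnknown(Exception):
    """Raised when a pool UUID is not in the host's connected-pool map."""

class HSM:
    def __init__(self):
        # spUUID -> pool state; populated only after connectStoragePool.
        self.pools = {}

    def getPool(self, spUUID):
        if spUUID not in self.pools:
            raise StoragePoolUnknown(spUUID)
        return self.pools[spUUID]

    def getSpmStatus(self, spUUID):
        pool = self.getPool(spUUID)
        return {"spmStatus": pool["spm_status"]}

if __name__ == "__main__":
    hsm = HSM()
    sp_uuid = "00000002-0002-0002-0002-0000000002b0"
    try:
        # Engine asks for SPM status before the pool exists on this host.
        hsm.getSpmStatus(sp_uuid)
    except StoragePoolUnknown as e:
        print("Unknown pool id, pool not connected:", e.args)

    # Once the pool is connected, the same call succeeds.
    hsm.pools[sp_uuid] = {"spm_status": "SPM"}
    print(hsm.getSpmStatus(sp_uuid))
```

This matches the reported symptom: repeated attempts fail until the pool connection is actually established, after which the same request succeeds.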

I manually checked and cleaned the NFS share before every try...

Please investigate what's going on...

Version-Release number of selected component (if applicable):
vdsm-4.15.0-78.git349f848.el6.x86_64

How reproducible:
???

Steps to Reproduce:
1. Add the first NFS data domain to the DC.

Actual results:
Fails several times, then the domain is eventually added successfully.

Expected results:
work out of the box

Additional info:
My DC is at compatibility version 3.4, and the cluster is at version 3.4 as well.

Comment 1 Allon Mureinik 2014-06-14 10:50:35 UTC
Fede, could this be related to your refactoring in the pool metadata handling? Please take a look.

Comment 2 Daniel Erez 2014-06-18 14:54:43 UTC
According to the described scenario, it looks like a duplicate of bug 1109156 (should probably be included in 3.5 alpha-3 build).

Comment 3 Federico Simoncelli 2014-07-14 10:59:53 UTC
I haven't been able to reproduce this (never noticed in my environment) and I couldn't find any vdsm log attached.

Can you please try to reproduce and ping me to debug the issue together?
Thanks.

Comment 4 Daniel Erez 2014-07-14 11:28:26 UTC
(In reply to Daniel Erez from comment #2)
> According to the described scenario, it looks like a duplicate of bug
> 1109156 (should probably be included in 3.5 alpha-3 build).

Sorry, meant duplicate of bug 1107945.

Comment 5 Federico Simoncelli 2014-07-14 11:31:28 UTC
(In reply to Daniel Erez from comment #4)
> (In reply to Daniel Erez from comment #2)
> > According to the described scenario, it looks like a duplicate of bug
> > 1109156 (should probably be included in 3.5 alpha-3 build).
> 
> Sorry, meant duplicate of bug 1107945.

Thanks. Tracking bug 1107945 state in order to verify this scenario as well.

Comment 6 Jiri Belka 2014-07-24 11:49:14 UTC
I can't reproduce with ovirt-3.5.0-beta1.1.

Comment 7 Jiri Belka 2014-07-24 11:51:25 UTC
#6 is valid anyway ;)

Comment 8 Sandro Bonazzola 2014-10-17 12:24:25 UTC
oVirt 3.5 has been released and should include the fix for this issue.