Bug 1215967

Summary: [HE] Failed to deploy additional host using NFS
Product: Red Hat Enterprise Virtualization Manager
Reporter: Artyom <alukiano>
Component: vdsm
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED ERRATA
QA Contact: Artyom <alukiano>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.6.0
CC: acanan, ahino, alex, alukiano, aperotti, bazulay, cinglese, dfediuck, gklein, justin, lpeer, lsurette, melewis, nsednev, nsoffer, pstehlik, sbonazzo, stirabos, ycui, yeylon, ykaul
Target Milestone: ovirt-3.6.0-rc
Keywords: Regression, TestBlocker, Triaged
Target Release: 3.6.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
With this update, prepareImage is now called via VDSM to mount the required NFS storage when deploying additional hosts using NFS.
Story Points: ---
Clone Of:
Clones: 1271272
Environment:
Last Closed: 2016-03-09 19:37:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1101553, 1150087, 1159314, 1178535, 1216172, 1221148, 1234915, 1271272    
Attachments:
deployment log (flags: none)
vdsm logs from the second host (flags: none)
new setup and vdsm logs (flags: none)
ovirt-hosted-engine-setup (flags: none)
new hosted-engine-setup log file (flags: none)
vdsm.log (flags: none)

Description Artyom 2015-04-28 08:58:35 UTC
Created attachment 1019587 [details]
deployment log

Description of problem:
Failed to deploy an additional host using NFS, with the error:
[ ERROR ] Failed to execute stage 'Setup validation': [Errno 2] No such file or directory: '/rhev/data-center/mnt/10.35.160.108:_RHEV_artyom__hosted__engine__36/5874a7c1-3384-4b07-839c-b5cc4aa555a6/ha_agent/hosted-engine.metadata'

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150401110307.git9665976.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy HE environment on one host
2. Try to deploy another host

Actual results:
Deployment failed with the following traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line **FILTERED**45, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 260, in _validation
    ] + ".metadata",
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line **FILTERED**68, in get_all_host_stats_direct
    self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line **FILTERED**9, in get_all_stats_direct
    stats = sb.get_raw_stats_for_service_type("client", service_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line **FILTERED**25, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/**FILTERED**0.35.**FILTERED**60.**FILTERED**08:_RHEV_artyom__hosted__engine__36/5874a7c**FILTERED**-3384-4b07-839c-b5cc4aa555a6/ha_agent/hosted-engine.metadata'


Expected results:
Deployment succeeds without any errors

Additional info:
I was able to get past an earlier deployment failure, bug https://bugzilla.redhat.com/show_bug.cgi?id=1215663, by using a short local path.
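
For context on where this fails: the HA broker reads the metadata file with a direct open, so a dangling symlink fails immediately with "No such file or directory". A minimal sketch of that read path (illustrative only, not the actual storage_broker code; the path is a placeholder):

import os

# The broker opens the metadata file with O_DIRECT and reads the raw stats.
# If the path is a symlink whose target under /var/run/vdsm/storage was never
# created, os.open() raises OSError(ENOENT); that is the error in the
# traceback above.
path = "/rhev/data-center/mnt/<server>:<export>/<sd_uuid>/ha_agent/hosted-engine.metadata"  # placeholder
direct_flag = getattr(os, "O_DIRECT", 0)  # O_DIRECT is available on Linux
fd = os.open(path, direct_flag | os.O_RDONLY)  # raises [Errno 2] when the link dangles
try:
    data = os.read(fd, 4096)
finally:
    os.close(fd)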

Comment 1 Sandro Bonazzola 2015-05-04 10:43:56 UTC
Targeting 3.6 as it has been reported against version 3.6.
Artyom, did you see this on 3.5 too?

Comment 2 Artyom 2015-05-04 11:34:03 UTC
No; Nikolay also checked it on 3.5.1 and it worked fine.

Comment 3 Sandro Bonazzola 2015-05-13 10:25:54 UTC
We're investigating this. Please use a longer password next time; it will make the logs easier to read...

Comment 4 Artyom 2015-05-13 10:39:41 UTC
Yes, sorry about that, but hiding the password with * is not much of a security mechanism: the password can easily be recovered from the log context, and it makes the logs really hard to read.
But you could also just auto-replace **FILTERED** back to 1.

Comment 7 Simone Tiraboschi 2015-05-15 10:48:51 UTC
It seems to be something on the storage side.

On the first host I have:

[root@r71he36h1 ~]# ls -lh /rhev/data-center/mnt/192.168.1.115\:_Virtual_exthe7/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/ha_agent/hosted-engine.metadata 
lrwxrwxrwx. 1 vdsm kvm 132 15 mag 11.33 /rhev/data-center/mnt/192.168.1.115:_Virtual_exthe7/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/ha_agent/hosted-engine.metadata -> /var/run/vdsm/storage/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/390bf090-c246-4046-b32e-1cdb0651fcd8/fc5342f7-825e-4806-abea-cbb9ae0aef8a

While on the second I have:
[root@r71he36h2 ~]# ls -lh /rhev/data-center/mnt/192.168.1.115\:_Virtual_exthe7/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/ha_agent/hosted-engine.metadata 
lrwxrwxrwx. 1 vdsm kvm 132 May 15 11:33 /rhev/data-center/mnt/192.168.1.115:_Virtual_exthe7/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/ha_agent/hosted-engine.metadata -> /var/run/vdsm/storage/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/390bf090-c246-4046-b32e-1cdb0651fcd8/fc5342f7-825e-4806-abea-cbb9ae0aef8a
[root@r71he36h2 ~]# ls -l /var/run/vdsm/storage/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/390bf090-c246-4046-b32e-1cdb0651fcd8/fc5342f7-825e-4806-abea-cbb9ae0aef8a
ls: cannot access /var/run/vdsm/storage/2d8224d4-ea6a-4064-b7ec-ddbcae559ed8/390bf090-c246-4046-b32e-1cdb0651fcd8/fc5342f7-825e-4806-abea-cbb9ae0aef8a: No such file or directory

Indeed, the whole /var/run/vdsm/storage directory does not exist on the second host.

[root@r71he36h2 ~]# ls -l /var/run/vdsm/
total 0
-rw-r--r--. 1 vdsm kvm   0 May 15 12:34 client.log
drwxr-xr-x. 2 vdsm kvm  60 May 15 12:25 lvm
-rw-r--r--. 1 root root  0 May 15 11:59 nets_restored
drwxr-xr-x. 2 vdsm kvm  40 May 14 13:52 sourceRoutes
srwxr-xr-x. 1 vdsm kvm   0 May 15 12:25 svdsm.sock
drwxr-xr-x. 2 vdsm kvm  40 May 14 13:52 trackedInterfaces
drwxr-xr-x. 2 vdsm kvm  40 May 14 13:52 v2v

Meanwhile, in the vdsm logs on the second host I found:
Thread-59::ERROR::2015-05-15 12:28:31,261::task::863::Storage.TaskManager.Task::(_setError) Task=`dfff003d-7743-4c9f-a7f3-e113150c5e9d`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 870, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 619, in spmStop
    pool.stopSpm()
  File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
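
For context, this SecureError means an SPM-only verb (spmStop) was invoked on a host that is not SPM: vdsm guards such methods and raises unless the pool object is in a safe state. A simplified sketch of that guard (illustrative, not the actual securable.py code):

from functools import wraps

class SecureError(Exception):
    pass

def secured(f):
    # Illustrative guard: SPM-only methods raise unless the pool object was
    # switched to a safe (SPM) state first, hence "Secured object is not in
    # safe state" when spmStop runs on a host that never acquired SPM.
    @wraps(f)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "_secured", False):
            raise SecureError("Secured object is not in safe state")
        return f(self, *args, **kwargs)
    return wrapper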


Additional info:
[root@r71he36h2 ~]# vdsClient -s 0 getStorageDomainsList 
2d8224d4-ea6a-4064-b7ec-ddbcae559ed8

[root@r71he36h2 ~]# vdsClient -s 0 getStorageDomainInfo 2d8224d4-ea6a-4064-b7ec-ddbcae559ed8
	uuid = 2d8224d4-ea6a-4064-b7ec-ddbcae559ed8
	version = 3
	role = Master
	remotePath = 192.168.1.115:/Virtual/exthe7
	type = NFS
	class = Data
	pool = ['fec9bb0e-4ad6-4836-b1f0-046f669aaf76']
	name = hosted_storage

[root@r71he36h2 ~]# vdsClient -s 0 getStoragePoolInfo fec9bb0e-4ad6-4836-b1f0-046f669aaf76
Unknown pool id, pool not connected: ('fec9bb0e-4ad6-4836-b1f0-046f669aaf76',)
[root@r71he36h2 ~]#
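
To summarize the state: the ha_agent entries on the second host are dangling symlinks; the links exist, but their targets under /var/run/vdsm/storage were never created. A quick illustrative check (the path is a placeholder):

import os

# Hypothetical dangling-symlink check; the path is a placeholder.
link = "/rhev/data-center/mnt/<server>:<export>/<sd_uuid>/ha_agent/hosted-engine.metadata"
print(os.path.lexists(link))  # True: the symlink entry itself exists
print(os.readlink(link))      # it points into /var/run/vdsm/storage/...
print(os.path.exists(link))   # False: the target is missing, so any open fails with ENOENT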

Comment 8 Simone Tiraboschi 2015-05-15 10:59:51 UTC
Created attachment 1025743 [details]
vdsm logs from the second host

vdsm logs from the second host

Comment 9 Sandro Bonazzola 2015-05-18 09:25:58 UTC
Moving to storage for review.

Comment 10 Allon Mureinik 2015-05-18 09:38:49 UTC
Moving to 3.6.0 since this didn't reproduce on 3.5.z.

It seems the second host is not connected to the pool.
Ala, can you take a look please?

Comment 11 Simone Tiraboschi 2015-06-08 08:06:52 UTC
*** Bug 1226670 has been marked as a duplicate of this bug. ***

Comment 12 Allon Mureinik 2015-06-14 14:38:12 UTC
*** Bug 1158833 has been marked as a duplicate of this bug. ***

Comment 13 Artyom 2015-06-30 13:19:42 UTC
Created attachment 1044684 [details]
new setup and vdsm logs

Checked on 
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150623153111.git68138d4.el7.noarch

The same problem described above still exists.

On the second host:
lrwxrwxrwx. 1 vdsm kvm 132 Jun 30  2015 hosted-engine.lockspace -> /var/run/vdsm/storage/3a9e3f94-084f-4600-b5c3-6cbb424e3262/b5eee90d-8184-4514-84c8-a1a4d2013f33/4c438492-4600-4b13-ab13-c24d9112d7cd
lrwxrwxrwx. 1 vdsm kvm 132 Jun 30  2015 hosted-engine.metadata -> /var/run/vdsm/storage/3a9e3f94-084f-4600-b5c3-6cbb424e3262/04527b01-2461-49fa-9bea-bd6941658cb1/a5084edd-efc4-4449-910d-21f13c4ac1bf

but there is no storage directory under /var/run/vdsm/.

Comment 14 Simone Tiraboschi 2015-07-15 09:24:01 UTC
*** Bug 1241566 has been marked as a duplicate of this bug. ***

Comment 15 Artyom 2015-09-02 10:46:20 UTC
Verified on ovirt-hosted-engine-setup-1.3.0-0.4.beta.git42eb801.el7ev.noarch
Deployment of an additional host on NFS storage passed without any errors.
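
For reference on what changed (per the Doc Text above): the setup now asks VDSM to prepare the hosted-engine images before reading them, and it is prepareImage that creates the /var/run/vdsm/storage/<sdUUID>/<imgUUID>/<volUUID> links the ha_agent symlinks resolve to. A minimal sketch of such a call through vdsm's Python client, with placeholder UUIDs (not the exact setup code):

from vdsm import vdscli  # vdsm's XML-RPC client, present on a host with vdsm installed

# Placeholder UUIDs; in the real flow these come from the hosted-engine
# configuration and the shared storage domain metadata.
SD_UUID = "00000000-0000-0000-0000-000000000001"
SP_UUID = "00000000-0000-0000-0000-000000000000"
IMG_UUID = "00000000-0000-0000-0000-000000000002"
VOL_UUID = "00000000-0000-0000-0000-000000000003"

cli = vdscli.connect()
# prepareImage activates the volume and creates the run link under
# /var/run/vdsm/storage/<sdUUID>/<imgUUID>/ that the ha_agent symlink targets.
res = cli.prepareImage(SD_UUID, SP_UUID, IMG_UUID, VOL_UUID)
if res["status"]["code"] != 0:
    raise RuntimeError(res["status"]["message"])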

Comment 16 Sandro Bonazzola 2015-09-07 12:00:23 UTC
*** Bug 1159314 has been marked as a duplicate of this bug. ***

Comment 18 Justin Hammond 2015-12-28 06:05:45 UTC
I've got the same issue as reported. No /var/run/vdsm/storage directory on the 2nd host:

[root@h2216-n1 storage]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7.x86_64
ovirt-setup-lib-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.1.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.3.6-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.0.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-release35-006-1.noarch
ovirt-release36-002-2.noarch

Is anything additional required? (Most of what is reported here matches what I've got on my box.)

Comment 19 Artyom 2015-12-28 07:46:01 UTC
No, it should work. Can you please provide the ovirt-hosted-engine-setup log from the second host?

Comment 20 Justin Hammond 2015-12-28 08:05:29 UTC
Created attachment 1109975 [details]
ovirt-hosted-engine-setup

Comment 21 Justin Hammond 2015-12-28 08:07:22 UTC
There you go.

I also note that I don't have a connected StoragePool (using the same vdsClient commands as in comment #7).

Comment 22 Artyom 2015-12-28 09:14:10 UTC
From the logs it looks like the same type of traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 447, in _validation
    ] + ".metadata",
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 171, in get_all_host_stats_direct
    self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 122, in get_all_stats_direct
    stats = sb.get_raw_stats_for_service_type("client", service_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 140, in get_raw_stats_for_service_type
    .format(str(e)))
RequestError: failed to read metadata: [Errno 2] No such file or directory: '/rhev/data-center/mnt/nfs1.dmz.dynam.ac:_data_256G__M/e2d5d3a0-893a-431d-9723-c18bfe34660a/ha_agent/hosted-engine.metadata'

Can you please provide the full vdsm log?

Comment 23 Justin Hammond 2015-12-31 09:22:40 UTC
Created attachment 1110723 [details]
new hosted-engine-setup log file

This is an updated hosted-engine-setup log file to match the vdsm.log file.

Comment 24 Justin Hammond 2015-12-31 09:23:17 UTC
Created attachment 1110724 [details]
vdsm.log

vdsm.log file as requested

Comment 25 Justin Hammond 2015-12-31 09:24:05 UTC
Attached. I restarted vdsm while the host was in maintenance mode, so it's a fresh log file as well.

Comment 26 Simone Tiraboschi 2016-01-04 17:17:53 UTC
I wasn't able to reproduce with:

ovirt-hosted-engine-ha.noarch          1.3.3.6-1.el7.centos           @ovirt-3.6
ovirt-hosted-engine-setup.noarch       1.3.1.4-1.el7.centos           @ovirt-3.6

In my setup the hosted-engine storage domain was correctly imported from the engine.

Justin, can you please try upgrading ovirt-hosted-engine-setup to 1.3.1.4-1?

Comment 27 Charlie Inglese 2016-01-27 04:03:39 UTC
I'm seeing this same issue using the following baseline:
ovirt-hosted-engine-ha-1.3.3.6-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.1.4-1.el7.centos.noarch
vdsm-4.17.13-0.el7.centos.noarch

1) Excerpt from vdsm.log:
Thread-143::INFO::2016-01-26 22:49:49,077::logUtils::48::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='813b39ec-8680-47ef-a55c-fb12628001fd', spUUID='3d1d44ac-6151-415d-acc4-1ded432c28d1', imgUUID='6854eea9-3027-4b8e-b8a5-25c07d91d7b4', leafUUID='deb60cb0-9d6e-463e-ace3-b8898b701a61')
Thread-143::ERROR::2016-01-26 22:49:49,077::task::866::Storage.TaskManager.Task::(_setError) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3203, in prepareImage
    self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 314, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('3d1d44ac-6151-415d-acc4-1ded432c28d1',)

2) ll /var/run/vdsm/
-rw-r--r--. 1 vdsm kvm   0 Jan 26 22:49 client.log
drwxr-xr-x. 2 vdsm kvm  60 Jan 26 22:49 lvm
srwxr-xr-x. 1 vdsm kvm   0 Jan 26 22:49 mom-vdsm.sock
-rw-r--r--. 1 root root  0 Jan 26 22:49 nets_restored
drwxr-xr-x. 2 vdsm kvm  40 Dec  9 08:24 sourceRoutes
srwxr-xr-x. 1 vdsm kvm   0 Jan 26 22:49 svdsm.sock
drwxr-xr-x. 2 vdsm kvm  40 Dec  9 08:24 trackedInterfaces
drwxr-xr-x. 2 vdsm kvm  40 Dec  9 08:24 v2v

3) vdsClient -s 0 getStorageDomainsList
813b39ec-8680-47ef-a55c-fb12628001fd

4) vdsClient -s 0 getStorageDomainInfo 813b39ec-8680-47ef-a55c-fb12628001fd
       uuid = 813b39ec-8680-47ef-a55c-fb12628001fd
       version = 3
       role = Regular
       remotePath = pserver5:/engine
       type = GLUSTERFS
       class = Data
       pool = []
       name = storage

5) vdsClient -s 0 getStoragePoolInfo 813b39ec-8680-47ef-a55c-fb12628001fd
Unknown pool id, pool not connected: ('813b39ec-8680-47ef-a55c-fb12628001fd',)
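
For context, prepareImage resolves the storage pool before doing anything else, which is why a host that is not connected to the pool fails exactly here. A simplified sketch of that check (illustrative, not the actual hsm.py code):

# Simplified sketch of the hsm.getPool() check that prepareImage hits first.
class StoragePoolUnknown(Exception):
    code = 309  # the error code seen in the vdsm log excerpts above

connected_pools = {}  # spUUID -> pool object; empty if this host joined no pool

def get_pool(sp_uuid):
    # A host that never connected to the pool has no entry here, so
    # prepareImage fails with "Unknown pool id, pool not connected"
    # before any /var/run/vdsm/storage links are created.
    try:
        return connected_pools[sp_uuid]
    except KeyError:
        raise StoragePoolUnknown(sp_uuid)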

Comment 28 Charlie Inglese 2016-01-27 04:05:55 UTC
(In reply to Charlie Inglese from comment #27)
> [full quote of comment #27 snipped]

Correction to Item 1) vdsm.log

Thread-143::INFO::2016-01-26 22:49:49,077::logUtils::48::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='813b39ec-8680-47ef-a55c-fb12628001fd', spUUID='3d1d44ac-6151-415d-acc4-1ded432c28d1', imgUUID='6854eea9-3027-4b8e-b8a5-25c07d91d7b4', leafUUID='deb60cb0-9d6e-463e-ace3-b8898b701a61')
Thread-143::ERROR::2016-01-26 22:49:49,077::task::866::Storage.TaskManager.Task::(_setError) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3203, in prepareImage
    self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 314, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('3d1d44ac-6151-415d-acc4-1ded432c28d1',)
Thread-143::DEBUG::2016-01-26 22:49:49,078::task::885::Storage.TaskManager.Task::(_run) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::Task._run: ac0fb5a8-5e4b-47f5-88eb-57a416cc047b ('813b39ec-8680-47ef-a55c-fb12628001fd', '3d1d44ac-6151-415d-acc4-1ded432c28d1', '6854eea9-3027-4b8e-b8a5-25c07d91d7b4', 'deb60cb0-9d6e-463e-ace3-b8898b701a61') {} failed - stopping task
Thread-143::DEBUG::2016-01-26 22:49:49,078::task::1246::Storage.TaskManager.Task::(stop) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::stopping in state preparing (force False)
Thread-143::DEBUG::2016-01-26 22:49:49,078::task::993::Storage.TaskManager.Task::(_decref) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::ref 1 aborting True
Thread-143::INFO::2016-01-26 22:49:49,078::task::1171::Storage.TaskManager.Task::(prepare) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::aborting: Task is aborted: 'Unknown pool id, pool not connected' - code 309
Thread-143::DEBUG::2016-01-26 22:49:49,079::task::1176::Storage.TaskManager.Task::(prepare) Task=`ac0fb5a8-5e4b-47f5-88eb-57a416cc047b`::Prepare: aborted: Unknown pool id, pool not connected

Comment 29 Charlie Inglese 2016-01-27 04:34:46 UTC
(In reply to Charlie Inglese from comment #28)
> [full quote of comments #27 and #28 snipped]

Nevermind, this issue was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1277010.

Comment 31 errata-xmlrpc 2016-03-09 19:37:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html