Bug 1274622
Summary: | getImagesList fails if called on a file based storageDomain which is not connected to any storage pool |
---|---|
Product: | [oVirt] vdsm |
Reporter: | Simone Tiraboschi <stirabos> |
Component: | Bindings-API |
Assignee: | Fred Rolland <frolland> |
Status: | CLOSED CURRENTRELEASE |
QA Contact: | Elad <ebenahar> |
Severity: | high |
Priority: | high |
Version: | 4.17.10 |
CC: | acanan, amureini, bugs, cmestreg, dfediuck, ebenahar, frolland, gklein, nsoffer, sasundar, sbonazzo, stirabos, tnisan, ylavi |
Target Milestone: | ovirt-4.0.0-alpha |
Flags: | rule-engine: ovirt-4.0.0+, rule-engine: planning_ack+, rule-engine: devel_ack+, acanan: testing_ack+ |
Target Release: | 4.17.999 |
Hardware: | Unspecified |
OS: | Unspecified |
Fixed In Version: | ovirt 4.0.0 alpha1 |
Doc Type: | Bug Fix |
Last Closed: | 2016-08-01 12:28:34 UTC |
Type: | Bug |
oVirt Team: | Storage |
Bug Depends On: | 1325657 |
Bug Blocks: | 1277939 |
Description
Simone Tiraboschi
2015-10-23 07:13:47 UTC
Nir, can you review the attached patch please?

In oVirt, testing is done on a single release by default. Therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have testing resources to handle the 4.0 clone.

In 3.6 you must be connected to the pool. We are not going to support anything else. This will be possible in 4.0.

This bug is marked for z-stream, yet the milestone is for a major version, therefore the milestone has been reset. Please set the correct milestone or drop the z-stream flag.

Simone hi,
Can you test on master if this bug is still relevant?
Adam changed the logic to not use the pool in this patch: https://gerrit.ovirt.org/#/c/49684/3
Thanks

It looks OK with vdsm.noarch 4.17.999-536.git433b527.el7.centos from master.

Can you please backport patch 49684 to 3.6 too?

Simone, this is not a question to Freddy but rather to VDSM maintainers and product managers, in general this late into 3.6 I think it's too risky, we can propose it to 3.6.z if you'd like

Thanks Tal, I'm asking this for bug 1303316.
Just to summarize the issue: hosted-engine-setup deploys the engine VM. Only once the data center goes up (which requires the user to manually add the first regular storage domain), and then after an hour, will the engine create the OVF_STORE volumes, so hosted-engine-setup cannot know the OVF_STORE uuid.
After a reboot, ovirt-ha-agent has to scan the OVF_STORE to get the latest configuration for the engine VM.
The issue here is that we are not calling prepareImage, since ovirt-ha-agent doesn't know the OVF_STORE uuid, and we cannot call getImagesList, since we are not connected to any storage pool at that point.
In general it seems, strangely, to work even without the prepareImage on the OVF_STORE, but on bug 1303316 we have a report about the OVF_STORE not being accessible on FC since its LV is still down.
Is there another way to prepare all the images on a storage domain without knowing their UUID?

Not sure, Nir, any idea?

(In reply to Tal Nisan from comment #8)
> Simone, this is not a question to Freddy but rather to VDSM maintainers and
> product managers, in general this late into 3.6 I think it's too risky, we
> can propose it to 3.6.z if you'd like

This patch is part of the storage domain manifest in 4.0.
It won't be backported.

(In reply to Allon Mureinik from comment #11)
> (In reply to Tal Nisan from comment #8)
> > Simone, this is not a question to Freddy but rather to VDSM maintainers and
> > product managers, in general this late into 3.6 I think it's too risky, we
> > can propose it to 3.6.z if you'd like
> This patch is part of the storage domain manifest in 4.0.
> It won't be backported.

It should be possible to backport this patch, the dependency on the pool is not needed in this code path.

getImagesList seems to fail also on iSCSI on vdsm.noarch 4.17.21-0.el7ev @rhev-3.6.3-3

Thread-5472::DEBUG::2016-02-23 19:07:26,796::task::595::Storage.TaskManager.Task::(_updateState) Task=`d364b006-34f5-429d-95c7-a4725f449763`::moving from state init -> state preparing
Thread-5472::INFO::2016-02-23 19:07:26,797::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID={'status': {'message': 'OK', 'code': 0}, 'imageslist': ['2e9a5dc5-5c83-43bd-8f32-79825194c368', 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258', '9dae0125-7467-4a53-b975-f022e32d15ac', '7d8c0950-399c-4b0f-aa90-317381da1c6e']}, options=None)
Thread-5472::WARNING::2016-02-23 19:07:26,797::resourceManager::834::Storage.ResourceManager.Owner::(acquire) Unexpected exception caught while owner 'd364b006-34f5-429d-95c7-a4725f449763' tried to acquire 'Storage.{'status': {'message': 'OK', 'code': 0}, 'imageslist': ['2e9a5dc5-5c83-43bd-8f32-79825194c368', 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258', '9dae0125-7467-4a53-b975-f022e32d15ac', '7d8c0950-399c-4b0f-aa90-317381da1c6e']}'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/resourceManager.py", line 814, in acquire
    locktype, timeout)
  File "/usr/share/vdsm/storage/resourceManager.py", line 514, in acquireResource
    request = self.registerResource(namespace, name, lockType, callback)
  File "/usr/share/vdsm/storage/resourceManager.py", line 538, in registerResource
    if not self._resourceNameValidator.match(name):
TypeError: expected string or buffer
Thread-5472::ERROR::2016-02-23 19:07:26,797::task::866::Storage.TaskManager.Task::(_setError) Task=`d364b006-34f5-429d-95c7-a4725f449763`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3307, in getImagesList
    vars.task.getSharedLock(STORAGE, sdUUID)
  File "/usr/share/vdsm/storage/task.py", line 1375, in getSharedLock
    timeout)
  File "/usr/share/vdsm/storage/resourceManager.py", line 835, in acquire
    raise se.ResourceException(fullName)
ResourceException: General Exception, UUID: "Storage.{'status': {'message': 'OK', 'code': 0}, 'imageslist': ['2e9a5dc5-5c83-43bd-8f32-79825194c368', 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258', '9dae0125-7467-4a53-b975-f022e32d15ac', '7d8c0950-399c-4b0f-aa90-317381da1c6e']}"
Thread-5472::DEBUG::2016-02-23 19:07:26,797::task::885::Storage.TaskManager.Task::(_run) Task=`d364b006-34f5-429d-95c7-a4725f449763`::Task._run: d364b006-34f5-429d-95c7-a4725f449763 ({'status': {'message': 'OK', 'code': 0}, 'imageslist': ['2e9a5dc5-5c83-43bd-8f32-79825194c368', 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258', '9dae0125-7467-4a53-b975-f022e32d15ac', '7d8c0950-399c-4b0f-aa90-317381da1c6e']},) {} failed - stopping task
Thread-5472::DEBUG::2016-02-23 19:07:26,797::task::1246::Storage.TaskManager.Task::(stop) Task=`d364b006-34f5-429d-95c7-a4725f449763`::stopping in state preparing (force False)
Thread-5472::DEBUG::2016-02-23 19:07:26,798::task::993::Storage.TaskManager.Task::(_decref) Task=`d364b006-34f5-429d-95c7-a4725f449763`::ref 1 aborting True
Thread-5472::INFO::2016-02-23 19:07:26,798::task::1171::Storage.TaskManager.Task::(prepare) Task=`d364b006-34f5-429d-95c7-a4725f449763`::aborting: Task is aborted: u'General Exception, UUID: "Storage.{\'status\': {\'message\': \'OK\', \'code\': 0}, \'imageslist\': [\'2e9a5dc5-5c83-43bd-8f32-79825194c368\', \'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258\', \'9dae0125-7467-4a53-b975-f022e32d15ac\', \'7d8c0950-399c-4b0f-aa90-317381da1c6e\']}"' - code 100
Thread-5472::DEBUG::2016-02-23 19:07:26,798::task::1176::Storage.TaskManager.Task::(prepare) Task=`d364b006-34f5-429d-95c7-a4725f449763`::Prepare: aborted: General Exception, UUID: "Storage.{'status': {'message': 'OK', 'code': 0}, 'imageslist': ['2e9a5dc5-5c83-43bd-8f32-79825194c368', 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258', '9dae0125-7467-4a53-b975-f022e32d15ac', '7d8c0950-399c-4b0f-aa90-317381da1c6e']}"
Thread-5472::DEBUG::2016-02-23 19:07:26,798::task::993::Storage.TaskManager.Task::(_decref) Task=`d364b006-34f5-429d-95c7-a4725f449763`::ref 0 aborting True
Thread-5472::DEBUG::2016-02-23 19:07:26,798::task::928::Storage.TaskManager.Task::(_doAbort) Task=`d364b006-34f5-429d-95c7-a4725f449763`::Task._doAbort: force False

(In reply to Simone Tiraboschi from comment #13)
> getImagesList seems to fail also on iSCSI on vdsm.noarch
> 4.17.21-0.el7ev @rhev-3.6.3-3
>
> Thread-5472::DEBUG::2016-02-23
> 19:07:26,796::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`d364b006-34f5-429d-95c7-a4725f449763`::moving from state init -> state
> preparing
> Thread-5472::INFO::2016-02-23
> 19:07:26,797::logUtils::48::dispatcher::(wrapper) Run and protect:
> getImagesList(sdUUID={'status': {'message': 'OK', 'code': 0}, 'imageslist':
> ['2e9a5dc5-5c83-43bd-8f32-79825194c368',
> 'aa908bcb-16f0-48e8-a9cc-bcc35bf4a258',
> '9dae0125-7467-4a53-b975-f022e32d15ac',
> '7d8c0950-399c-4b0f-aa90-317381da1c6e']},
> options=None)

Sorry, found: it's just the result of a bad request.
Stupid bug!

*** Bug 1331503 has been marked as a duplicate of this bug. ***

Hi Fred,
Can I ask you about the way to test this? I'm using a deactivated storage domain (in oVirt) and testing with vdsClient getImagesList, but I'm getting:
Storage domain does not exist: ('928a25e0-2489-4ea2-bc55-520b112c0920',)
Should I test by checking the logs when importing a domain (does that trigger getImagesList)?

Hi,
Can you provide the vdsm log?
Thanks,
Fred

Created attachment 1178904 [details]
vdsm.log storagedomaindoesnotexist
After deactivating the domain I called vdsClient getImagesList and got the StorageDomainDoesNotExist exception:
Thread-90274::ERROR::2016-07-12 15:59:50,668::task::868::Storage.TaskManager.Task::(_setError) Task=`b70c9b2b-d341-42e5-a757-6932d248972b`::Unexpected error
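
For readers following the traceback in comment #13: the TypeError there comes from `self._resourceNameValidator.match(name)` being handed a dict (the whole response of an earlier call) as sdUUID instead of a UUID string, which matches the "bad request" diagnosis given later in the thread. The snippet below is only a minimal sketch of that failure mode. `NAME_VALIDATOR` and `is_valid_resource_name` are hypothetical stand-ins, since the actual pattern used by vdsm's resourceManager is not shown in this report, and the exact exception message differs between Python versions.

```python
import re

# Hypothetical stand-in for the resource-name validator; the real pattern
# used by vdsm's resourceManager is not shown in this bug report.
NAME_VALIDATOR = re.compile(r"^[\w\-.]+$")


def is_valid_resource_name(name):
    """Return True/False for strings; raises TypeError for non-strings."""
    return NAME_VALIDATOR.match(name) is not None


# A plain UUID string, which is what getImagesList expects as sdUUID.
good_sd_uuid = "028649d6-d048-453a-b7aa-e96881cf0443"

# The bad request from comment #13: a response dict passed as sdUUID.
bad_sd_uuid = {
    "status": {"message": "OK", "code": 0},
    "imageslist": ["2e9a5dc5-5c83-43bd-8f32-79825194c368"],
}

print(is_valid_resource_name(good_sd_uuid))

try:
    is_valid_resource_name(bad_sd_uuid)
except TypeError as exc:
    # A compiled pattern can only match str or bytes, never a dict.
    print("TypeError:", exc)
```

A caller can avoid this class of error by passing only the storage domain UUID string to getImagesList, never the result dict of a previous call.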
I think you should ask Simone how to test this, as it is a hosted-engine flow.
The exception is not about the pool, which is what this bug is about.

Simone, can you provide the steps on HE to verify this?

(In reply to Carlos Mestre González from comment #21)
> Simone, can you provide the steps on HE to verify this?

1. Deploy hosted-engine (please try it once on NFS and once on iSCSI); add your first regular storage domain and ensure that the engine imports the hosted-engine storage domain
2. Set global maintenance mode with hosted-engine --set-maintenance --mode=global
3. Cleanly shut down the engine with hosted-engine --vm-shutdown
4. Reboot the host
5. Get the hosted-engine storage domain uuid from /etc/ovirt-hosted-engine/hosted-engine.conf (sdUUID)
6. Run 'vdsClient -s 0 getImagesList <sdUUID>'

getImagesList returns the images list (4 in total) under the hosted_storage storage domain after performing the scenario described in comment #22.
Steps as described in comment #22, deployed twice, iSCSI and NFS (over a clean server - freshly installed OS before deployment).
[root@blond-vdsf ~]# vdsClient -s 0 getImagesList 028649d6-d048-453a-b7aa-e96881cf0443
62ee53ee-5220-4b62-8ad8-f6c536e18070
a400f71f-fd52-49ca-9efd-3e5345303675
fd64b0a4-a1d2-439d-8649-cf8d6e3765c6
9e02d366-c926-485d-9870-dacfafabf5bc

Verified using:
ovirt-vmconsole-host-1.0.3-1.el7ev.noarch
ovirt-hosted-engine-ha-2.0.0-1.el7ev.noarch
ovirt-host-deploy-1.5.0-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.0.2-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-vmconsole-1.0.3-1.el7ev.noarch
libgovirt-0.3.3-1.el7_2.4.x86_64
ovirt-imageio-common-0.3.0-0.el7ev.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el7ev.noarch
ovirt-imageio-daemon-0.3.0-0.el7ev.noarch
vdsm-api-4.18.5.1-1.el7ev.noarch
vdsm-infra-4.18.5.1-1.el7ev.noarch
vdsm-xmlrpc-4.18.5.1-1.el7ev.noarch
vdsm-jsonrpc-4.18.5.1-1.el7ev.noarch
vdsm-4.18.5.1-1.el7ev.x86_64
vdsm-python-4.18.5.1-1.el7ev.noarch
vdsm-yajsonrpc-4.18.5.1-1.el7ev.noarch
vdsm-cli-4.18.5.1-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.18.5.1-1.el7ev.noarch
fence-agents-rhevm-4.0.11-27.el7_2.8.x86_64
rhevm-appliance-20160623.0-1.el7ev.noarch
libvirt-daemon-1.2.17-13.el7_2.5.x86_64
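
The last two steps of the verification procedure in comment #22 (read sdUUID from the hosted-engine configuration, then call getImagesList) could be scripted roughly as follows. This is a sketch under assumptions: it assumes the configuration file holds simple `key=value` lines, which this report does not confirm, and the helper names `parse_sd_uuid` and `build_get_images_list_cmd` are hypothetical. It only builds the command line and does not execute vdsClient.

```python
import shlex

# Path taken from comment #22; a real host would read this file.
CONF_PATH = "/etc/ovirt-hosted-engine/hosted-engine.conf"


def parse_sd_uuid(conf_text):
    """Return the sdUUID value from key=value text, or None if absent.

    Assumes one key=value pair per line; '#' lines are skipped.
    """
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "sdUUID":
            return value.strip()
    return None


def build_get_images_list_cmd(sd_uuid):
    """Build the vdsClient invocation from comment #22 as an argv list."""
    return ["vdsClient", "-s", "0", "getImagesList", sd_uuid]


if __name__ == "__main__":
    # Sample content for illustration only, using the UUID from the
    # verification comment; it is not a copy of a real configuration file.
    sample = "# sample\nsdUUID=028649d6-d048-453a-b7aa-e96881cf0443\n"
    sd_uuid = parse_sd_uuid(sample)
    cmd = build_get_images_list_cmd(sd_uuid)
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the resulting list to `subprocess.run` on the host would perform the actual call. Keeping sdUUID as a plain string also avoids the bad request seen in comment #13.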