Opening to track the storage review of the fix and a proper solution.

+++ This bug was initially created as a clone of Bug #1319721 +++

Description of problem:
A call to getImagesList on a host without a connected storage pool, but with a storage domain, returns {'status': {'message': 'OK', 'code': 0}} with an empty image list, even though the domain contains images.

Version-Release number of selected component (if applicable):
vdsm-4.17.23.1-0.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run the following script:

    sdUUID = 'SD_UUID'
    cli = vdscli.connect(timeout=60)
    result = cli.getImagesList(sdUUID)
    print(result)
    result = cli.getConnectedStoragePoolsList()
    print(result)

Actual results:
The script prints:

    {'status': {'message': 'OK', 'code': 0}, 'imageslist': []}
    {'status': {'message': 'OK', 'code': 0}, 'poollist': []}

Expected results:
The getImagesList call should fail as it did previously:

    result = cli.getImagesList(sdUUID)
    print(result)
    {'status': {'message': 'list index out of range', 'code': 100}}

Additional info:
The images are available on the NFS share:

    [root@alma07 images]# pwd
    /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Virt_alukiano__HE__upgrade/3ac831d6-6124-4b42-a060-f89c64be09a1/images
    [root@alma07 images]# ls -l
    total 16
    drwxr-xr-x. 2 vdsm kvm 4096 Mar 17 15:36 4d1915e1-a9f7-4bca-b666-0997adec5ef4
    drwxr-xr-x. 2 vdsm kvm 4096 Mar 18 00:52 995171f0-1abb-488b-9b18-3e17aad0c3de
    drwxr-xr-x. 2 vdsm kvm 4096 Mar 20 15:40 9ecd4e5f-bb24-4fd6-8c20-c442425b59b6
    drwxr-xr-x. 2 vdsm kvm 4096 Mar 18 00:52 b6b637a4-37be-48e9-aacb-e3d4a6be29cc

but VDSM is not reporting them.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-21 14:12:27 IST ---

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

--- Additional comment from Simone Tiraboschi on 2016-03-21 15:03 IST ---

--- Additional comment from Allon Mureinik on 2016-03-21 15:05:34 IST ---

I don't understand. Isn't this just a duplicate of bug 1274622?
--- Additional comment from Simone Tiraboschi on 2016-03-21 15:11:12 IST ---

We could add another simple workaround on the hosted-engine side (as in https://gerrit.ovirt.org/#/c/54982/), but, if possible, I'd prefer to get this properly fixed on the VDSM side, to avoid surprises from any other behavior changes.

--- Additional comment from Simone Tiraboschi on 2016-03-21 15:17:04 IST ---

(In reply to Allon Mureinik from comment #3)
> I don't understand. Isn't this just a duplicate of bug 1274622?

No, unfortunately it's not. In bug 1274622 that call was failing with:

    {'status': {'message': 'list index out of range', 'code': 100}}

and we implemented a workaround for it, globbing directly under /rhev/data-center/mnt/{mnt_point}/{sduuid}/images to find the images on the NFS share. Now it's even worse, since VDSM returns:

    {'status': {'message': 'OK', 'code': 0}, 'imageslist': []}

(which is wrong, since the images are there!), so our workaround no longer triggers. We could add yet another workaround on top of that, but if possible I'd prefer to get it properly fixed.

--- Additional comment from Sandro Bonazzola on 2016-03-21 15:23:08 IST ---

A workaround has been posted here: https://gerrit.ovirt.org/#/c/54982/1

--- Additional comment from Sandro Bonazzola on 2016-03-21 15:31:24 IST ---

Moving this bug to ovirt-hosted-engine-ha since it has been decided to use the workaround instead of a proper fix in vdsm. Allon, please consider scheduling a proper fix for 3.6.5.

--- Additional comment from Allon Mureinik on 2016-03-21 17:25:59 IST ---

(In reply to Sandro Bonazzola from comment #7)
> Moving this bug to ovirt-hosted-engine-ha since it has been decided to use
> the workaround instead of a proper fix in vdsm. Allon, please consider
> scheduling a proper fix for 3.6.5.

Please clone it to track that initiative, although, at the moment, I can't commit to such a fix.
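The glob-based fallback described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual hosted-engine patch (see the Gerrit links for the real change); the function name `list_images_from_mount` and the UUID filtering are assumptions.

```python
import glob
import os
import uuid

# Standard vdsm mount root, as seen in the paths in this report.
RHEV_MNT = "/rhev/data-center/mnt"


def _looks_like_uuid(name):
    """Return True if the directory name parses as a UUID."""
    try:
        uuid.UUID(name)
        return True
    except ValueError:
        return False


def list_images_from_mount(sd_uuid, mnt_root=RHEV_MNT):
    """List image UUID directories under <mnt_root>/<mount_point>/<sd_uuid>/images.

    Hypothetical fallback for when getImagesList returns an empty or failed
    answer even though the storage domain is mounted and populated.
    """
    pattern = os.path.join(mnt_root, "*", sd_uuid, "images", "*")
    return sorted(
        os.path.basename(path)
        for path in glob.glob(pattern)
        if os.path.isdir(path) and _looks_like_uuid(os.path.basename(path))
    )
```

The UUID check matters because the images directory may also contain non-image entries (e.g. `lost+found` on some filesystems), which a bare glob would otherwise report as images.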
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
oVirt 4.0 beta has been released, moving to RC milestone.
Raz, I see that the patches are merged, can you try and reproduce and let us know if it's still an open issue?
(In reply to Tal Nisan from comment #5)
> Raz, I see that the patches are merged, can you try and reproduce and let us
> know if it's still an open issue?

cli.getConnectedStoragePoolsList() returns:

    {'status': {'message': 'OK', 'code': 0}, 'poollist': []}

cli.getStorageDomainsList() returns:

    {'status': {'message': 'OK', 'code': 0}, 'domlist': ['3f433c06-5057-4a13-afc7-8d70953e34b5']}

and cli.getImagesList('3f433c06-5057-4a13-afc7-8d70953e34b5') returns:

    {'status': {'message': 'OK', 'code': 0}, 'imageslist': ['e8b3eb91-3892-4e74-8962-b73405295fbe', '25c6e941-12ed-4ea7-a334-b71fed8ab0e7', 'a535c196-eeda-4f3f-8593-1380bd8a7cb1', '374c37d9-2387-48c6-86ae-4ef3d64c6ee8']}

I'm not sure whether these are the expected results, because I couldn't really understand why the expected result is 'list index out of range'.

vdsm version: vdsm-4.18.999-1216.git34aa313.el7.centos.x86_64
Tal, please reply to comment #6.
The attached patches are workarounds for the storage function issue. We want a solution so that HE no longer needs to work around it.
I don't understand the flow here. "Call to getImagesList on host without connected storage pool, but with SD" - what is a host without a storage pool but with a storage domain? Is it a host in maintenance with a storage domain whose status is up? Is it an activated host with a detached storage domain? Can you please explain the flow in more detail - who calls getImagesList, and when? Thanks!
I believe we are talking about the scenario where we did the HE deployment but still did not add the master storage domain to the engine. I am not sure whether the bug is still relevant for 4.1, so maybe Simone can clarify the situation.
(In reply to Artyom from comment #10)
> I believe we talk about the scenario when we did HE deployment but still did
> not add the master storage domain to the engine.
> I do not sure if the bug still relevant for the 4.1, so maybe Simone can
> clarify the situation.

Yes, correct; it's still worth checking.
Moving out all non-blockers/exceptions.
We do not support anything when not connected to a storage pool, except starting/stopping monitoring on external domains. This sounds like an RFE for a future version and does not fit as a bug fix for 4.1. For 4.1 we should now accept only critical bug fixes in existing features, not add features we do not have. Anything else risks the stability of 4.1.
I installed HE without adding a storage domain. I ran "vdsm-client StorageDomain getImages storagedomainID=<uuid>" with the id of the NFS domain created during the installation, and I got 4 images:

    [root@localhost data-center]# vdsm-client StorageDomain getImages storagedomainID=74683202-0d17-4992-86a0-4f865d59f4cd
    [
        "9dd2b426-87f0-44b7-b5f8-48f3ed3c266f",
        "fa67d28c-44d8-4e3d-b435-26935ead0d61",
        "80bfe663-7b94-49ee-8a63-42a25fb78272",
        "ddebc6b7-57c8-4978-aa94-4a48b6c1faad"
    ]

These are the config and regular images of hosted engine. Closing as works for me, since this appears to be the correct behavior.