Bug 1455871 - [downstream clone - 4.2.0] [CodeChange] move vdsm calls (mostly removeImage) from transactional endAction()
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: unspecified
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: ---
Assigned To: Fred Rolland
QA Contact: Kevin Alon Goldblatt
Keywords: CodeChange, ZStream
Depends On: 1390936
Blocks:
Reported: 2017-05-26 06:49 EDT by rhev-integ
Modified: 2018-08-04 10:06 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1390936
Environment:
Last Closed: 2018-05-15 13:42:49 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2018:1488 | None | None | None | 2018-05-15 13:44 EDT

Description rhev-integ 2017-05-26 06:49:19 EDT
+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1390936 +++
======================================================================

Currently some flows have vdsm calls (mostly removeImage) performed in the endAction() method. If endAction() is executed within a transaction, we might get a transaction timeout in some scenarios.
This RFE is about moving those calls out of endAction(). In order to do that:
1. Our "COCO-Storage" infrastructure (serial callback) needs to be modified to support moving vdsm calls out of endWithFailure().
2. The relevant flows need to start using the COCO infrastructure instead of relying on the tasks infrastructure.
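
To illustrate the failure mode described above, here is a minimal Python sketch. It is a simulation only and not ovirt-engine code: `transaction`, `slow_remove_image`, and both `end_action_*` functions are hypothetical names, and the transaction timeout is modeled as a simple check at commit time.

```python
import time
from contextlib import contextmanager


class TransactionTimeout(Exception):
    """Raised when the simulated transaction outlives its time budget."""


@contextmanager
def transaction(record, timeout_s):
    """Simulated transaction (assumption, not a real transaction manager).

    Changes are staged in a dict and applied to `record` only if the body
    finished within `timeout_s`; otherwise they are discarded.
    """
    pending = {}
    start = time.monotonic()
    yield pending
    if time.monotonic() - start > timeout_s:
        raise TransactionTimeout("transaction exceeded %.2fs" % timeout_s)
    record.update(pending)


def slow_remove_image(delay_s):
    """Hypothetical stand-in for a slow vdsm call such as removeImage."""
    time.sleep(delay_s)


def end_action_inside_tx(disk, delay_s, timeout_s):
    """Problematic shape: the slow call runs inside the transaction."""
    try:
        with transaction(disk, timeout_s) as tx:
            slow_remove_image(delay_s)
            tx["status"] = "OK"
    except TransactionTimeout:
        pass  # staged status change is lost; the disk keeps its old status
    return disk["status"]


def end_action_outside_tx(disk, delay_s, timeout_s):
    """Proposed shape: the slow call runs first, outside any transaction."""
    slow_remove_image(delay_s)
    with transaction(disk, timeout_s) as tx:
        tx["status"] = "OK"
    return disk["status"]
```

With a call that takes longer than the transaction budget, the first function leaves the simulated disk in its previous (locked) status, while the second one still commits the status update because only the quick bookkeeping runs inside the transaction.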

(Originally by laravot)
Comment 1 rhev-integ 2017-05-26 06:49:32 EDT
Can you describe the functional impact?

(Originally by Yaniv Dary)
Comment 3 rhev-integ 2017-05-26 06:49:44 EDT
Sure,
The functional impact is that we may get transaction timeouts when executing vdsm calls within a transactional endAction().
In BZ 1372743 (see https://bugzilla.redhat.com/show_bug.cgi?id=1372743#c22) this left us with a locked disk.

Let me know if further info is needed.

Thanks,
Liron

(Originally by laravot)
Comment 4 rhev-integ 2017-05-26 06:49:51 EDT
Tal, I'm treating this as code change. Please decide on a target for it.

(Originally by Yaniv Dary)
Comment 6 Allon Mureinik 2017-09-28 07:35:32 EDT
With the recent work around both LSM and cold move, the issue in the ticket should be resolved, setting to MODIFIED.

We'll keep the upstream tracking bug for other code improvements.
Comment 13 Kevin Alon Goldblatt 2017-12-04 05:38:02 EST
Verified with the following code:
----------------------------------
ovirt-engine-4.2.0-0.5.master.el7.noarch
vdsm-4.20.8-53.gitc3edfc0.el7.centos.x86_64

Verified with the following scenario:
----------------------------------
1. Created a VM with disks and OS installed on NFS
2. Moved the host to maintenance
3. Edited the file /usr/lib/python2.7/site-packages/vdsm/API.py on the host, adding a sleep as follows:
-------------------------------------------------------------------
from time import sleep

class Image(APIBase):
    ctorArgs = ['imageID', 'storagepoolID', 'storagedomainID']

    BLANK_UUID = sc.BLANK_UUID

    class DiskTypes:
        UNKNOWN = image.UNKNOWN_DISK_TYPE
        SYSTEM = image.SYSTEM_DISK_TYPE
        DATA = image.DATA_DISK_TYPE
        SHARED = image.SHARED_DISK_TYPE
        SWAP = image.SWAP_DISK_TYPE
        TEMP = image.TEMP_DISK_TYPE

    def __init__(self, UUID, spUUID, sdUUID):
        APIBase.__init__(self)
        self._UUID = UUID
        self._spUUID = spUUID
        self._sdUUID = sdUUID

    def delete(self, postZero, force, discard=False):
        sleep(600)  # delay added for this verification: wait 10 minutes before deleting
        return self._irs.deleteImage(self._sdUUID, self._spUUID, self._UUID,
                                     postZero, force, discard)
-----------------------------------------------------------------------
4. Restarted the vdsm on the host
5. Performed a cold move of the disk of the VM created in step 1. Result: the delete image operation times out and fails. The disk is left in OK state.
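
The delay injection used in step 3 can also be expressed as a small wrapper instead of an in-place edit. The following is only a sketch of that idea under stated assumptions: `with_delay` and `FakeImage` are hypothetical names, `FakeImage` is a stand-in and not the vdsm `Image` class, and the short delay is chosen so the example runs quickly.

```python
import functools
import time


def with_delay(func, delay_s, sleep=time.sleep):
    """Return a wrapper that waits `delay_s` seconds before calling `func`.

    `sleep` is injectable so the delay can be replaced in tests.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        sleep(delay_s)
        return func(*args, **kwargs)
    return wrapper


class FakeImage:
    """Hypothetical stand-in for an API object with a delete() method."""

    def delete(self, postZero, force, discard=False):
        # A real implementation would ask storage to delete the image.
        return {"deleted": True, "postZero": postZero,
                "force": force, "discard": discard}


# Patch the method on the class so every instance gets the delayed version.
FakeImage.delete = with_delay(FakeImage.delete, delay_s=0.1)
```

Calling `FakeImage().delete(False, False)` then returns the same result as before, only after the injected delay; raising `delay_s` to 600 would mirror the sleep used in the verification.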


Moving to VERIFIED
Comment 14 RHV Bugzilla Automation and Verification Bot 2017-12-06 11:16:07 EST
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[No relevant external trackers attached]

For more info please contact: rhv-devops@redhat.com
Comment 15 RHV Bugzilla Automation and Verification Bot 2017-12-12 16:14:43 EST
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[No relevant external trackers attached]

For more info please contact: rhv-devops@redhat.com
Comment 16 RHV Bugzilla Automation and Verification Bot 2017-12-18 12:05:07 EST
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[No relevant external trackers attached]

For more info please contact: rhv-devops@redhat.com
Comment 19 errata-xmlrpc 2018-05-15 13:42:49 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488
