Bug 1669367

Summary: When changing disk settings, 'propagate_errors' value gets reset
Product: Red Hat Enterprise Virtualization Manager Reporter: Marcus West <mwest>
Component: ovirt-engine Assignee: Daniel Erez <derez>
Status: CLOSED ERRATA QA Contact: Elad <ebenahar>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.2.8 CC: aefrat, derez, ebenahar, frolland, lleistne, mtessun, Rhev-m-bugs, sborella, tnisan
Target Milestone: ovirt-4.3.1   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
This fix ensures that the current propagate_errors setting does not get reset when changing the disk properties.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-05-08 12:39:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Marcus West 2019-01-25 03:46:20 UTC
## Description of problem:

Recently, customers have started requesting that disks (particularly direct-attach) propagate errors rather than pause the VM.  We have provided steps to do this via a script, and there is an RFE to allow it via the GUI [1]

Today I noticed that if you change the disk properties (e.g., 'Enable SCSI Pass-Through'), it resets propagate_errors back to its default value ('Off').  The customer would probably not notice this initially, only when the disk next unexpectedly hit an error or went offline...

## Version-Release number of selected component (if applicable):

ovirt-engine-4.2.8.2-0.1.el7ev.noarch  (and most likely older versions)

## How reproducible:

Always

## Steps to Reproduce:

1. Set a disk to propagate errors, ie:

> update base_disks set propagate_errors = 'On' where disk_id = '747ce3bb-45d0-40cb-bf93-45c0b869bbcd';
UPDATE 1

> select disk_id, sgio, propagate_errors pe, disk_alias, disk_description from base_disks where disk_alias = 'rhel7-test2_Disk4';
               disk_id                | sgio | pe |    disk_alias     | disk_description 
--------------------------------------+------+----+-------------------+------------------
 747ce3bb-45d0-40cb-bf93-45c0b869bbcd |      | On | rhel7-test2_Disk4 | cf5e
(1 row)


2. Change the 'Enable SCSI Pass-Through' setting (VM -> disk -> edit, via the GUI)

3. Check database settings again:

> select disk_id, sgio, propagate_errors pe, disk_alias, disk_description from base_disks where disk_alias = 'rhel7-test2_Disk4';
               disk_id                | sgio | pe  |    disk_alias     | disk_description 
--------------------------------------+------+-----+-------------------+------------------
 747ce3bb-45d0-40cb-bf93-45c0b869bbcd |    1 | Off | rhel7-test2_Disk4 | cf5e
(1 row)


## Actual results:

propagate_errors changes back to default value ('Off')

## Expected results:

propagate_errors should remain at whatever value it was set to (in this case, 'On')
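
The before/after comparison in the steps above can be expressed as a small check. This is only a sketch: `unexpected_changes` is a hypothetical helper, not part of any oVirt tooling, and the dictionaries simply restate the two query results shown in this report (column `pe` is the alias used for `propagate_errors`).

```python
def unexpected_changes(before, after, intended):
    """Return {column: (old, new)} for columns that changed but were not meant to."""
    return {
        col: (before[col], after.get(col))
        for col in before
        if col not in intended and before[col] != after.get(col)
    }


# Rows as returned by the two queries in the steps to reproduce.
before = {"sgio": None, "pe": "On", "disk_alias": "rhel7-test2_Disk4"}
after = {"sgio": 1, "pe": "Off", "disk_alias": "rhel7-test2_Disk4"}

# Only sgio was edited through the GUI, so it is the only intended change.
print(unexpected_changes(before, after, intended={"sgio"}))
# prints {'pe': ('On', 'Off')}
```

With the expected behaviour, the same call would return an empty dictionary, since only `sgio` would differ between the two rows.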

## Additional info:

Public kbase article on how to change propagate_errors:

  https://access.redhat.com/solutions/526303

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1314160

Comment 2 Tal Nisan 2019-01-29 09:39:37 UTC
Seems like we're explicitly setting propagate errors to off when flushing, Daniel do you recall what's the reason for doing so?

Comment 3 Daniel Erez 2019-01-29 15:05:13 UTC
(In reply to Tal Nisan from comment #2)
> Seems like we're explicitly setting propagate errors to off when flushing,
> Daniel do you recall what's the reason for doing so?

I guess it was just for initializing the property, but it's no longer needed since that's done in the BaseDisk constructor.
Sent a patch to remove it: https://gerrit.ovirt.org/#/c/97402/
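
The pattern discussed in this comment can be sketched as follows. This is a minimal illustration in Python, not the engine's code (the engine is Java, and the actual change is in the linked patch). The `BaseDisk` class here is a simplified stand-in, `flush_buggy` and `flush_fixed` are hypothetical names, and the `sgio` mapping (1 when pass-through is enabled, empty otherwise) is assumed from the query output in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseDisk:
    # Simplified stand-in for the disk entity. The default is set once here,
    # which is why re-initialising it elsewhere is unnecessary.
    alias: str
    sgio: Optional[int] = None
    propagate_errors: str = "Off"


def flush_buggy(disk: BaseDisk, scsi_passthrough: bool) -> BaseDisk:
    """Copy the edited dialog field onto the disk, then re-initialise propagate_errors."""
    disk.sgio = 1 if scsi_passthrough else None
    disk.propagate_errors = "Off"  # redundant: overwrites the stored value
    return disk


def flush_fixed(disk: BaseDisk, scsi_passthrough: bool) -> BaseDisk:
    """Copy only the field the dialog actually edits; leave the rest untouched."""
    disk.sgio = 1 if scsi_passthrough else None
    return disk


disk = BaseDisk(alias="rhel7-test2_Disk4", propagate_errors="On")
print(flush_buggy(disk, scsi_passthrough=True).propagate_errors)

disk = BaseDisk(alias="rhel7-test2_Disk4", propagate_errors="On")
print(flush_fixed(disk, scsi_passthrough=True).propagate_errors)
```

The first call shows the reported behaviour (the value falls back to the default), while the second keeps whatever was stored before the edit.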

Comment 6 RHV bug bot 2019-02-21 17:26:27 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.3-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 7 Elad 2019-03-06 10:16:20 UTC
1) Added a direct LUN and attached to a VM with SCSI passthrough enabled
2) Updated the disk propagate errors to On:

engine=# update base_disks set propagate_errors = 'On' where disk_id = 'ded5395b-41f2-4f92-b2d5-ba20447dbcfc';
UPDATE 1
engine=# select disk_id, sgio, propagate_errors pe, disk_alias, disk_description from base_disks where disk_alias = 'test_Disk1';
               disk_id                | sgio | pe | disk_alias | disk_description 
--------------------------------------+------+----+------------+------------------
 ded5395b-41f2-4f92-b2d5-ba20447dbcfc |    1 | On | test_Disk1 | 1f6b

3) Updated the disk, changing SCSI passthrough to disabled:

2019-03-06 12:11:15,680+02 INFO  [org.ovirt.engine.core.bll.storage.disk.UpdateVmDiskCommand] (default task-17) [4c116fa6-5ea7-4882-b958-87cfc8d0cc80] Running command: UpdateVmDiskCommand internal: false. Entities affected :  ID: ded5395b-41f2-4f92-b2d5-ba20447dbcfc Type: DiskAction group EDIT_DISK_PROPERTIES with role type USER,  ID: ded5395b-41f2-4f92-b2d5-ba20447dbcfc Type: DiskAction group CONFIGURE_SCSI_GENERIC_IO with role type ADMIN


Propagate errors remained On:
engine=# select disk_id, sgio, propagate_errors pe, disk_alias, disk_description from base_disks where disk_alias = 'test_Disk1';
               disk_id                | sgio | pe | disk_alias | disk_description 
--------------------------------------+------+----+------------+------------------
 ded5395b-41f2-4f92-b2d5-ba20447dbcfc |      | On | test_Disk1 | 1f6b
(1 row)


==============
Used:
ovirt-engine-4.3.1.1-0.1.el7.noarch
vdsm-4.30.9-1.el7ev.x86_64

Comment 9 errata-xmlrpc 2019-05-08 12:39:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085