Bug 1024428 - [RFE] Allow to specify disk attributes such as error_policy and discard per disk in Web UI
Summary: [RFE] Allow to specify disk attributes such as error_policy and discard per d...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: 3.2.0
Hardware: All
OS: All
high
high
Target Milestone: ---
: 3.6.0
Assignee: Allon Mureinik
QA Contact: Aharon Canan
URL:
Whiteboard: storage
: 970520 (view as bug list)
Depends On:
Blocks: 902971
TreeView+ depends on / blocked
 
Reported: 2013-10-29 16:07 UTC by Marina Kalinin
Modified: 2021-09-09 11:32 UTC (History)
26 users (show)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-11 17:35:31 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:
sherold: Triaged+
lsvaty: testing_plan_complete-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-43203 0 None None None 2021-08-30 12:07:13 UTC
Red Hat Knowledge Base (Solution) 526303 0 None None None Never
oVirt gerrit 38329 0 'None' ABANDONED engine, webadmin: Allow to specify disk error_policy in Web UI 2021-01-14 08:03:44 UTC

Description Marina Kalinin 2013-10-29 16:07:35 UTC
Currently, there are different error_policy values available for KVM guests:
'report' - propagate all errors to the guest
'enospc' - stop guest on ENOSPC, propagate all other errors to the guest
'stop'   - stop guest
'ignore' - ignore it and let OS handle it.

Today, RHEV supports only one option - stop.

There are scenarios, when a VM has more then one disk attached to it and customers are willing to handle IO errors on the OS level, when the error is not on the system disk, instead of having the whole VM got paused.

RHEV should add the ability of specifying error_policy for each disk.
This option should be available to the end user - system Administrator.
And it should support all the supported values by qemu-kvm, as stated above.

Comment 2 Dan Kenigsberg 2013-10-29 17:04:16 UTC
Engine/Vdsm API supports the control of stop/enospace since ever via the propagateErrors attribute. Would extending this to REST and GUI be enough? If not, please explain why it is useful to report "enospace" to the guest.

Another option to solve/hack this bug is to extend http://www.ovirt.org/Features/Device_Custom_Properties to disk devices, and have a hook tweak the error_policy to whatever the user wants to.

Comment 3 Marina Kalinin 2013-10-29 17:08:05 UTC
Dan, enospace will still pause the VM.
I am not interested in this resolution.
In order not to stop the guest, we should specify report or ignore, and from what I see on vdsm code, those options are not supported right now.

Comment 4 Marina Kalinin 2013-10-29 17:12:55 UTC
Sorry, not only vdsm code, but everywhere in RHEV.
base_disks table has a boolean field - propagate_errors, that vdsm uses later to decide if to use stop or enospc, but it is still not enough.

If the device property defined here: http://www.ovirt.org/Features/Device_Custom_Properties would let the user specify error_policy per disk - would be fine with this RFE, as long as the rest of the code would be able to handle it.

Comment 7 Ayal Baron 2013-12-15 15:05:45 UTC
*** Bug 970520 has been marked as a duplicate of this bug. ***

Comment 8 Michal Skrivanek 2014-02-24 08:37:30 UTC
*** Bug 1064630 has been marked as a duplicate of this bug. ***

Comment 13 Marina Kalinin 2014-08-26 22:16:54 UTC
Right now we are using the following workaround:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE base_disks SET propagate_errors = 'On'"

This, together with the fix of vdsm in https://bugzilla.redhat.com/show_bug.cgi?id=1064630, will switch the guests from pausing on all I/O errors, to pausing only on out-of-space (which is what RHEV actually needs to expand the disk volumes).

But this is not a convenient way to implement this.
It should be run for each new guest added to the system.
And it is a hack, not a proper solution.

Comment 25 Francesco Romani 2015-02-23 12:03:55 UTC
Draft VDSM patch to allow Engine fine tune disk's error policy:

http://gerrit.ovirt.org/#/c/38009/

Comment 29 Nir Soffer 2015-03-06 23:21:40 UTC
The safest option would be to expose the propagateErrors option in the ui
per disk.

Supporting the "report" option should be ok, but needs lot of testing.

I don't see how the "ignore" option can be useful - I don't think we should
support this in the engine ui.

Kevin, what happens in the guest when using error_policy="ignore" and
qemu is having an io error reading or writing from a device?

Comment 31 Kevin Wolf 2015-03-09 10:03:09 UTC
(In reply to Nir Soffer from comment #29)
> Supporting the "report" option should be ok, but needs lot of testing.

I don't think that would actually be useful in a lot of cases, because the only
thing this achieves is breaking sparse image allocation, but it can probably be
made work.
 
> I don't see how the "ignore" option can be useful - I don't think we should
> support this in the engine ui.

I agree. "ignore" is not only hardly ever useful, but it is dangerous and not
a mode that should be used in production.

> Kevin, what happens in the guest when using error_policy="ignore" and
> qemu is having an io error reading or writing from a device?

It pretends that the operation has succeeded while it really hasn't. In case of
a write operation, nothing is written, and in case of a read, garbage is returned.

Comment 32 Nir Soffer 2015-03-09 13:06:27 UTC
(In reply to Kevin Wolf from comment #31)
> (In reply to Nir Soffer from comment #29)
> > Supporting the "report" option should be ok, but needs lot of testing.
> 
> I don't think that would actually be useful in a lot of cases, because the
> only
> thing this achieves is breaking sparse image allocation, but it can probably
> be
> made work.

Can you elaborate on this?

Do you think this can be useful in some cases, and we can support this option?

Comment 33 David Gibson 2015-03-10 04:41:38 UTC
In the customer case where this originated, this feature was wanted as a workaround for the fact that in the qemu errors mode which RHEV defaults to, all IO errors always caused a pause, even if they were triggered by a specific guest action, and in no way the responsibility of the host.  In that case, something on the guest was causing disk accesses beyond the bounds of the logical device - obviously pausing the guest isn't going to help with that.

At the time, qemu didn't have a mode which would report those sorts of errors to the guest, while still pausing for "backend" IO errors - other bugs have been filed to address that.  However qemu's "enospc" mode does do the right thing for that case, without breaking lazy image growing.

Even once the qemu side bug of not always reporting guest-caused IO errors is fixed, there are cases where "enospc" mode would make more sense than "pause always".  It's plausible to want the guest to see and deal with IO errors, rather than assuming they can be handled on the host side.  Doing so without hacking the RHEV database requires the feature requested by this bug.

Hope that background helps to clarify.

Comment 34 Kevin Wolf 2015-03-10 09:45:38 UTC
(In reply to David Gibson from comment #33)
> In the customer case where this originated, this feature was wanted as a
> workaround for the fact that in the qemu errors mode which RHEV defaults to,
> all IO errors always caused a pause, even if they were triggered by a
> specific guest action, and in no way the responsibility of the host.  In
> that case, something on the guest was causing disk accesses beyond the
> bounds of the logical device - obviously pausing the guest isn't going to
> help with that.

Aha! Another good example why bug reports should be describing the problem
instead of requesting what someone thinks is a solution.

Stopping on such errors is a qemu bug and needs to be fixed there. Adding more
options to the mangement UI isn't going to solve this in a helpful way, because
while it may improve the behaviour for guest-caused errors, it breaks the
behaviour for errors on the host side.

> At the time, qemu didn't have a mode which would report those sorts of
> errors to the guest, while still pausing for "backend" IO errors - other
> bugs have been filed to address that.  However qemu's "enospc" mode does do
> the right thing for that case, without breaking lazy image growing.

"enospc" has existed pretty much always. It's not the solution, though, because
a short network outage for a network-backed image would break the guest now.

The correct solution for the problem you described above is backporting
upstream commits 3c2daac0 (virtio-blk) and 58ac3211 (IDE). This happened
already in 6.6 in the context of bug 1064643.

So to fix that specific use case no change to management is needed at all.

> Even once the qemu side bug of not always reporting guest-caused IO errors
> is fixed, there are cases where "enospc" mode would make more sense than
> "pause always".  It's plausible to want the guest to see and deal with IO
> errors, rather than assuming they can be handled on the host side.  Doing so
> without hacking the RHEV database requires the feature requested by this bug.

Yes, I agree that there could be cases where werror=enospc is a useful setting.
These are corner cases, though, and the attached customer cases don't seem to
be such cases, but rather about the qemu bug discussed above. Therefore, we may
consider adding it, but it might have lower priority than this BZ may make it
appear.

I am very doubtful about [rw]error=report ever being useful and would advise
against providing it in the UI unless there is a very specific need for it.
Getting it right might be non-trivial and isn't worth the effort without a
customer use case.

As for [rw]error=ignore, I'm fully and decidedly against providing it. It has
no use outside of testing and possibly debugging and is dangerous.

Comment 35 Marina Kalinin 2015-03-11 17:55:48 UTC
(In reply to Kevin Wolf from comment #34)
> "enospc" has existed pretty much always. It's not the solution, though,
> because
> a short network outage for a network-backed image would break the guest now.
> 
> The correct solution for the problem you described above is backporting
> upstream commits 3c2daac0 (virtio-blk) and 58ac3211 (IDE). This happened
> already in 6.6 in the context of bug 1064643.
> 
> So to fix that specific use case no change to management is needed at all.
> 
Are you suggesting then to have management defaulting to enospc rather then stop?

Comment 36 Kevin Wolf 2015-03-12 11:11:57 UTC
No, the management defaults should stay as they are. That problem was fixed
in qemu, and it needed to be fixed only in qemu.

Comment 37 Marina Kalinin 2015-03-12 15:19:30 UTC
(In reply to Kevin Wolf from comment #36)
> No, the management defaults should stay as they are. That problem was fixed
> in qemu, and it needed to be fixed only in qemu.

So, I probably misunderstand something.
If we continue stopping the guest on each IO error (current default), how would we handle the case where we propagate the error to OS?

Comment 38 David Gibson 2015-03-13 06:00:46 UTC
Marina, as I understand the current state, the idea of the qemu side fixes is that even in "errors=stop" mode, qemu will no longer stop on all disk errors.  Specifically, it will always report to the guest errors which cannot be anything but the guest's fault - the main case being accesses outside the bounds of the device.

Kevin, so to be clear.  In comment 33 I'm explaining the context in which this bug was filed (which, yes, I should have better explained initially).  But the intention isn't for this bug to specifically address that problem (other bugs were filed for that).  Instead the idea of this bug is to allow easy configuration of a guest which is in errors=enospc mode for cases where you really do want errors reported to the guest (e.g. this might be useful for a guest with some secondary storage on a remote/flaky network connection for which the guest application has its own recovery mechanisms).  At present RHEV sort of supports such a use case with the propagate errors flag, but there's no way to actually activate this flag other than direct database manipulation.

Dropping the priority is appropriate.

I concur that "errors=ignore" is a bad idea.

Comment 39 Marina Kalinin 2015-05-01 17:59:59 UTC
Let's summarize the above and understand what action items should be taken on RHEV side to cover all possible functionalities.

Right now RHEV supports only "errors=stop".
Qemu exposed the following options:
'report' - propagate all errors to the guest
'enospc' - stop guest on ENOSPC, propagate all other errors to the guest
'stop'   - stop guest
'ignore' - ignore it and let OS handle it.

However, today's qemu behavior has been changed so that:
'stop' -> will not stop the guest and always propagate the error to the guest.
'ignore' -> should be discarded as an option / should not be available.
All the rest are unchanged.
Is this correct? David, Kevin?

Next - what options should be exposed to the end user (RHEV/RHOS)?
And why? What is the expected behavior if qemu on each of the values today?

Comment 41 David Gibson 2015-05-04 01:13:12 UTC
> Is this correct? David, Kevin?

That's correct with one clarification.

"stop" has been changed so that errors which can only be guest triggered (such as access beyond the bounds of the device) will always propagate to the guest.  All other errors still stop the VM, as before.

> Next - what options should be exposed to the end user (RHEV/RHOS)?

Really that's a question for the RHEV and RHOS guys.

As you say RHEV currently only supports errors=stop.  It does also have kinda-sorta half-baked support for errors=enospc: setting the propagate_errors field in the db to true will switch qemu to errors=enospc mode, but there's no way to set that field short of direct database manipulation.

When this cluster of bugs was started, I was suggesting changing the half-baked support for errors=enospc to full support.  I think there are cases where it makes sense for the guest to handle genuine disk errors rather than handling them in the host.

But since then I think I've seen some RHEV people indicating they're not convinced that's a good idea.

Comment 42 Kevin Wolf 2015-05-04 09:32:05 UTC
(In reply to Marina from comment #39)
> 'report' - propagate all errors to the guest
> 'enospc' - stop guest on ENOSPC, propagate all other errors to the guest
> 'stop'   - stop guest
> 'ignore' - ignore it and let OS handle it.

No, it's just "ignore it", period. qemu pretends that the request has succeeded
even if it hasn't, so the guest OS can't handle anything because it doesn't
even know that there was an error.

> However, today's qemu behavior has been changed so that:
> 'stop' -> will not stop the guest and always propagate the error to the
> guest.

'stop' will stop the guest if there was a host-side error (e.g. network went
away and image is stored on NFS). It won't stop the guest for guest-side errors
like submitting invalid requests.

> 'ignore' -> should be discarded as an option / should not be available.

qemu won't remove the option, but it has very little use outside of debugging.
RHEV shouldn't expose it in the GUI. I have no strong opinion on whether it
should be accessible using some method like manipulating the database.

Comment 43 Marina Kalinin 2015-05-04 13:51:37 UTC
Thank you for clarification.
So here is the summary of today's options:

'report' - propagate all errors to the guest
'enospc' - stop guest on ENOSPC, propagate all other errors to the guest
'stop'   - stop guest on host errors, propagate to guests - guest OS related errors
'ignore' - ignore the error (used for internal debugging only).


With this new 'stop' implementation/fix, it would make the life of end users easier. And with it, it is also more reasonable to have this as a default option in RHEV (as it is now already).

Question: can you see any needs for exposing 'report' and 'enospc' to the end user? What can be the use case for it?

And second question: assuming we decide there is no big need exposing it to the UI and enough to keep it in db, for special debugging cases; would we need to change rhev to support the 'report' option too, since right now we support only 'enospc' and 'stop'.

Comment 44 David Gibson 2015-05-05 01:31:17 UTC
> With this new 'stop' implementation/fix, it would make the life of end users easier. And with it, it is also more reasonable to have this as a default option in RHEV (as it is now already).


I don't quite follow what you're saying here.  The new "stop" implementation is simply more correct.  The old version would sometimes stop - essentially asking the host to intervene - for errors which the host cannot possibly fix.

> Question: can you see any needs for exposing 'report' and 'enospc' to the end user? What can be the use case for it?

I don't think "report" can reasonably be exposed, because without at least stopping for out of space errors, RHEV can't implement on-demand growing of guest volumes.

"enospc" is almost the same as "report", except for the handling of out of space errors.  I think there are use cases for this mode: e.g. (a) you have a physical disk with known errors, and you have a tool for analyzing / recovering those errors which you want to run inside a guest.  (b) you are running some sort of fault-tolerant server application inside your guest - if the application has better information to recover from errors than the host, you want the errors reported to it, rather than just to the host.

> And second question: assuming we decide there is no big need exposing it to the UI and enough to keep it in db, for special debugging cases; would we need to change rhev to support the 'report' option too, since right now we support only 'enospc' and 'stop'.

As above, "report" would break on-demand volumes, so I don't think we can use it.  "enospc" essentially exists for exactly this sort of case - most errors are reported to the guest, but the few required for on-demand volume resizing are reported to the host.

Comment 45 Marina Kalinin 2015-05-05 19:37:54 UTC
So, based on David's explanation, I think we should be ok as we are now - without exposing anything to UI and without adding additional fields to the database.

If you all agree, I am ok closing this RFE as closed wontfix and cover the info in a kcs article.

Comment 46 Allon Mureinik 2015-05-10 07:29:14 UTC
(In reply to Marina from comment #45)
> So, based on David's explanation, I think we should be ok as we are now -
> without exposing anything to UI and without adding additional fields to the
> database.
> 
> If you all agree, I am ok closing this RFE as closed wontfix and cover the
> info in a kcs article.
Ack. Please close it.


Note You need to log in before you can comment on or make changes to this bug.