Bug 1232261 - RFE: Blivet provides only partial device details on any major disk failure
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: python-blivet
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: David Lehman
QA Contact: RHS-C QE
URL:
Whiteboard:
Depends On:
Blocks: 1232275
Reported: 2015-06-16 11:38 UTC by Timothy Asir
Modified: 2016-02-25 08:35 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-25 08:35:35 UTC
Embargoed:



Description Timothy Asir 2015-06-16 11:38:56 UTC
Description of problem:
Blivet provides only partial device details when it encounters an UnusableConfigurationError, and there is no option to ignore faulty devices. Because of this, we cannot get any details about the remaining devices, and we are unable to perform any LVM operations on them through python-blivet.


How reproducible:
Always

Steps to Reproduce:
1. Wipe the root device, which has some partitions (for example, with the dd command)
2. Fetch the list of devices
3. Blivet throws an UnusableConfigurationError

Actual results:


Expected results:
There should be an option such as "ignore faulty devices", with its default value set to True, so that blivet.devices ignores the faulty devices and provides details for the remaining ones. It would also be good if blivet.device.status reported what kind of failure the disk has.

Additional info:

Comment 2 mulhern 2015-06-19 21:20:56 UTC
This feels a lot like an RFE, so I'm changing the title to reflect this.

There are a few kinds of errors that fall under the UnusableConfigurationError heading. The problem is that recovery might require different strategies for different errors and that any recovery strategies will be complicated.

For example, one error is that of two VGs with the same name. Probably the only recovery strategy is to ignore them both and proceed. But, in that case, it seems like all their LVs should be ignored as well. And so forth.

Encoding information about what error caused the device to be hidden is also a bit tricky. It would be reasonable if all devices that were hidden for a single reason shared a cause object of some sort.

The same complicated caveats go for the other errors.
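
To make the shared-cause idea concrete, here is a minimal sketch for the duplicate VG name case only. It does not use blivet: HiddenCause, hide_duplicate_vgs, and the (name, uuid, lvs) tuple layout are all hypothetical names invented for illustration. Every device hidden because of one conflict points at the same cause object.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HiddenCause:
    """Hypothetical record of why a group of devices was hidden."""
    kind: str
    detail: str


def hide_duplicate_vgs(vgs):
    """Split VGs into visible device names and hidden ones.

    vgs is a list of (vg_name, uuid, [lv_names]) tuples (an assumed layout).
    Returns (visible, hidden), where hidden maps a device label to the
    HiddenCause shared by everything hidden for the same name conflict.
    """
    counts = Counter(name for name, _uuid, _lvs in vgs)
    visible = []
    hidden = {}
    causes = {}  # one cause object per conflicting VG name
    for name, uuid, lvs in vgs:
        if counts[name] > 1:
            cause = causes.setdefault(
                name, HiddenCause("duplicate-vg-name", name))
            # Hide the VG and, with it, every LV it contains.
            hidden[f"{name} ({uuid})"] = cause
            for lv in lvs:
                hidden[f"{name}-{lv} ({uuid})"] = cause
        else:
            visible.append(name)
            visible.extend(f"{name}-{lv}" for lv in lvs)
    return visible, hidden
```

A caller could then group the hidden devices by cause when reporting to the user, instead of repeating the same explanation for each LV.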

Comment 3 mulhern 2015-06-19 21:29:39 UTC
An option is to extend the UnusableConfigurationError to identify the offending device, include it in ignored devices, and attempt a reset again, until quiescence or 0 devices found. But that is potentially very expensive.
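
A rough sketch of that retry loop, under stated assumptions: the exception class below is a local stand-in, not blivet's, and its device attribute is exactly the hypothetical extension described above. The scan callable stands for "reset and populate with these devices ignored" and is likewise an assumption rather than a blivet API.

```python
class UnusableConfigurationError(Exception):
    """Local stand-in for blivet's error.

    The `device` attribute is hypothetical: it is the extension proposed
    in this comment, not something the real exception is known to carry.
    """

    def __init__(self, message, device=None):
        super().__init__(message)
        self.device = device


def scan_until_quiescent(scan, max_attempts=16):
    """Rescan, ignoring one more offending device per failure.

    `scan(ignored)` is an assumed callable that returns the device list or
    raises UnusableConfigurationError. Returns (devices, ignored).
    """
    ignored = []
    for _ in range(max_attempts):
        try:
            return scan(ignored), ignored
        except UnusableConfigurationError as err:
            # Without a new offender to ignore, retrying cannot make progress.
            if err.device is None or err.device in ignored:
                raise
            ignored.append(err.device)
    raise RuntimeError("gave up after %d scan attempts" % max_attempts)
```

Each failure costs one full rescan, so the total work grows with the number of faulty devices, which is where the expense comes from. The max_attempts cap is an added safety net, not part of the proposal.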

Comment 4 mulhern 2015-06-22 13:36:52 UTC
Scope is large, reassigning...

Comment 6 Sahina Bose 2016-02-24 06:29:46 UTC
Is there a plan to take this up? We need a way of knowing if the list is partial due to errors determining the device list.

Comment 7 David Lehman 2016-02-24 14:23:37 UTC
It seems like you could set a flag if/when you catch UnusableConfigurationError since that is a reliable indicator of errors determining the device list. Is that sufficient, or do you need more information than whether or not an error occurred when finding devices?
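
Something along these lines, as a minimal sketch rather than actual blivet code: the exception class is a local stand-in so the example runs on its own (with blivet installed, the except clause would name blivet's own UnusableConfigurationError instead), and collect_devices and populate are hypothetical names for the caller's wrapper and for whatever call fills in the device list.

```python
class UnusableConfigurationError(Exception):
    """Local stand-in so the sketch is self-contained."""


def collect_devices(populate):
    """Return (devices, partial, reason).

    `populate(found)` is an assumed callable that appends devices to `found`
    as it discovers them and may raise UnusableConfigurationError partway.
    """
    found = []
    try:
        populate(found)
    except UnusableConfigurationError as err:
        # Keep whatever was found, but flag the list as incomplete.
        return found, True, str(err)
    return found, False, None
```

The caller can then show the devices it has and warn the user whenever partial is True.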

Comment 8 Sahina Bose 2016-02-25 08:33:32 UTC
I think we can at least show the user that there was an error returning the full device list, and yes, catching UnusableConfigurationError and setting a flag can work for us.

I'm closing this RFE based on this.

