Bug 1041153

Summary: [RFE][nova]: Chaos drivers to test corner cases
Product: Red Hat OpenStack Reporter: RHOS Integration <rhos-integ>
Component: RFEs Assignee: RHOS Maint <rhos-maint>
Status: CLOSED UPSTREAM QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecified CC: markmc, yeylon
Target Milestone: --- Keywords: FutureFeature
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
URL: https://blueprints.launchpad.net/nova/+spec/chaos-drivers
Whiteboard: upstream_milestone_none upstream_status_unknown upstream_definition_drafting
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-19 17:26:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description RHOS Integration 2013-12-12 13:48:18 UTC
Cloned from Launchpad blueprint https://blueprints.launchpad.net/nova/+spec/chaos-drivers.

Description:

It would be very useful to have a set of 'chaos' drivers, one for each pluggable 'driver-like' backend that nova supports (for example the servicegroup driver or the vm driver). Each chaos driver would wrap a working driver and randomly raise, according to a specified seed and/or failure rate, the different types of exceptions that the driver interface allows. When it does not raise, it would pass the request on to the underlying driver, so it would behave like a driver that sporadically fails at a much higher rate (depending on the configured randomness) than the wrapped driver does. This would make it possible to test the corner cases of the code that uses the driver. Because the seed is fixed, it would also give a repeatable and unique view of the state the system is left in after such exceptions occur. The results from these chaos drivers could show how the overall system recovers from failures and could lead to new code that ensures the system is left in a consistent state (if it is not already).
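
The wrapping idea described above could look roughly like the following. This is only an illustrative sketch, not code from nova or from the blueprint: the names ChaosDriver, FakeServiceGroupDriver, DriverTimeout, DriverUnavailable and run are all hypothetical, and a real implementation would take its list of exceptions from each driver interface.

```python
import random


class DriverTimeout(Exception):
    """Hypothetical exception a driver interface might allow."""


class DriverUnavailable(Exception):
    """Hypothetical exception a driver interface might allow."""


class ChaosDriver:
    """Wrap a working driver and randomly raise instead of delegating."""

    def __init__(self, wrapped, exceptions, rate=0.1, seed=None):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self._wrapped = wrapped
        self._exceptions = list(exceptions)
        self._rate = rate
        # A private RNG, so the same seed replays the same failure sequence.
        self._rng = random.Random(seed)

    def __getattr__(self, name):
        # Only reached for attributes that ChaosDriver itself does not define.
        target = getattr(self._wrapped, name)
        if not callable(target):
            return target

        def chaotic(*args, **kwargs):
            # Fail with probability `rate`, otherwise call the real driver.
            if self._exceptions and self._rng.random() < self._rate:
                exc_type = self._rng.choice(self._exceptions)
                raise exc_type("chaos: injected failure in %s" % name)
            return target(*args, **kwargs)

        return chaotic


class FakeServiceGroupDriver:
    """Hypothetical stand-in for a real, working driver."""

    def is_up(self, member):
        return True


def run(seed, calls=10):
    """Call the wrapped driver repeatedly and record each outcome."""
    driver = ChaosDriver(
        FakeServiceGroupDriver(),
        [DriverTimeout, DriverUnavailable],
        rate=0.5,
        seed=seed,
    )
    outcomes = []
    for _ in range(calls):
        try:
            outcomes.append(driver.is_up("compute-1"))
        except (DriverTimeout, DriverUnavailable) as exc:
            outcomes.append(type(exc).__name__)
    return outcomes
```

Calling run() twice with the same seed yields the same sequence of successes and injected failures, which is the repeatability the description asks for. Proxying through __getattr__ keeps the sketch independent of any particular driver interface, at the cost of not knowing which exceptions each method is actually allowed to raise.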

Specification URL (additional information):

None