Bug 1331817

Summary: [RFE] Change default value for partial_activation to True in LVM resource agent
Product: Red Hat Enterprise Linux 6 Reporter: michal novacek <mnovacek>
Component: resource-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED WONTFIX QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.8 CC: agk, cluster-maint, fdinitto, jruemker
Target Milestone: rc Keywords: FutureFeature
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-24 15:10:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description michal novacek 2016-04-29 16:08:03 UTC
Description of problem:
From resource agent description:

partial_activation: If set, the volume group will be activated even only
partial of the physical volumes available. It helps to set to true, when you
are using mirroring logical volumes.

Default value for partial_activation is False.

Now that we support other RAID types as well, it would in my opinion make more
sense to default to True because:

* with volumes that are not RAID volumes, this option is irrelevant

* with raid volume being:

  * healthy, resource agent would activate vg with either value

  * degraded, but working raid: this is the only case where we would always
      want to have partial_activation=true (or the raid would make no sense)

  * faulty raid (too many failed disks to work): vg would not be activated with
      either value because it can't be activated
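
The three cases above can be written out as a small truth table. This is only a sketch of the argument in this report, not the resource agent's actual logic; the names `RaidState` and `vg_would_activate` are hypothetical and exist only for illustration.

```python
from enum import Enum


class RaidState(Enum):
    HEALTHY = "healthy"    # all physical volumes present
    DEGRADED = "degraded"  # some PVs missing, RAID still usable
    FAULTY = "faulty"      # too many PVs missing for the RAID to work


def vg_would_activate(state: RaidState, partial_activation: bool) -> bool:
    """Model of the cases listed above (hypothetical, not the agent's code)."""
    if state is RaidState.HEALTHY:
        # Activates with either value of partial_activation.
        return True
    if state is RaidState.DEGRADED:
        # The only case where the parameter changes the outcome.
        return partial_activation
    # FAULTY: cannot be activated regardless of the parameter.
    return False


if __name__ == "__main__":
    for state in RaidState:
        for flag in (False, True):
            result = vg_would_activate(state, flag)
            print(f"{state.value:9} partial_activation={flag!s:5} -> {result}")
```

In this model the parameter only matters for the degraded case, which is the basis for the request.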

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-34.el6.x86_64

Comment 2 John Ruemker 2016-08-02 17:09:05 UTC
I would be concerned with changing the default value to true in a minor release, or at all for that matter.  The specific case where this can be problematic is:

>> degraded, but working raid: this is the only case where we would always
>>      want to have partial_activation=true (or the raid would make no sense)

See this bug for a scenario where it is possible to lose data as a result of partially activating volumes:

https://bugzilla.redhat.com/show_bug.cgi?id=1251462

The summary is that if storage is split between the nodes of the cluster so that each side maintains access to its local site's devices, and the resource then relocates and activates partially, you can end up using the outdated copy of the data.  This will happen silently if you have partial_activation=true, and the new site will now be writing its data to a separate mirror copy.  At this point the two sides have diverged in a way that cannot easily be merged back together.
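
To make the sequence concrete, here is a toy model of that scenario. It is a simplified sketch and does not reflect how LVM mirroring is implemented; `MirrorLeg`, `write`, and `simulate_split` are hypothetical names, and each mirror leg is reduced to a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class MirrorLeg:
    site: str
    blocks: dict = field(default_factory=dict)


def write(legs, reachable_sites, key, value):
    """Write to every mirror leg whose site is currently reachable."""
    for leg in legs:
        if leg.site in reachable_sites:
            leg.blocks[key] = value


def simulate_split():
    leg_a, leg_b = MirrorLeg("A"), MirrorLeg("B")
    legs = [leg_a, leg_b]

    # Healthy mirror: the write lands on both legs.
    write(legs, {"A", "B"}, "record", "v1")

    # Storage splits; the resource keeps running at site A and can
    # only reach its local leg, so leg B is now out of date.
    write(legs, {"A"}, "record", "v2")

    # The resource relocates to site B and activates partially:
    # it silently reads the stale copy and starts writing to it.
    stale_read = leg_b.blocks["record"]
    write(legs, {"B"}, "record", "v3")

    return leg_a.blocks, leg_b.blocks, stale_read


if __name__ == "__main__":
    a, b, stale = simulate_split()
    print("site B read after relocation:", stale)
    print("leg A:", a)
    print("leg B:", b)
```

After the run, each leg holds a write the other never saw, which is the divergence described above.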

Perhaps we could give the proper caveats and warnings, but if any customer was using the default value of false to protect from this scenario and suddenly we allow their data to be corrupted, they won't be happy.

Comment 3 John Ruemker 2016-08-02 17:09:51 UTC
Also, as this is an RFE, it should probably be closed anyway, given we're into Prod Phase 2.  I won't close it myself, since it's not customer-initiated.

Comment 4 Oyvind Albrigtsen 2016-08-24 15:10:24 UTC
Closing due to concerns with changing default value for a minor release, and the issues it might cause (see comment #2).