Bug 1513960 - [RFE] Increase resilience of disk layout in block based storage domains against accidental wiping
Summary: [RFE] Increase resilience of disk layout in block based storage domains against accidental wiping
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.7
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Yaniv Lavi
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks: CEECIR_RHV43_proposed
 
Reported: 2017-11-16 11:44 UTC by Julio Entrena Perez
Modified: 2021-12-10 15:31 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-11 09:03:07 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1305327 0 medium CLOSED [RFE] - LVM commands called with autoback 2022-04-21 06:44:21 UTC
Red Hat Issue Tracker RHV-44199 0 None None None 2021-12-10 15:31:23 UTC

Internal Links: 1305327

Description Julio Entrena Perez 2017-11-16 11:44:45 UTC
> 1. Proposed title of this feature request
     
Increase resilience of disk layout in block based storage domains against accidental wiping
 
> 3. What is the nature and description of the request?
 
Customer would like to see a disk layout on the LUNs of their RHV storage domains that is more resilient to accidental wiping of the initial area of the LUNs.
 
> 4. Why does the customer need this? (List the business requirements here)
 
Customer has suffered multiple instances of LUNs belonging to RHV storage domains having their starting areas wiped accidentally.
This is commonly seen when one of the hosts is being reinstalled with the FC cables attached and the kernel modules for the HBAs loaded, thus giving the RHEL installer access to all LUNs.
While measures are in place to prevent this from happening, on rare occasions even these measures are not enough and the LVM metadata area of the LUNs in a storage domain gets damaged.
Recovery of the SD LVM metadata is a manual, risky process, and we don't consider it appropriate to automate it due to the risk of customers restoring incorrect metadata and damaging their data (the automated recovery RFE, bug 968370, was rejected).
 
Customer would like RHV to take a more proactive approach in preventing this situation in the first place.
Since RHV can not control the behaviour of a standalone, uncoordinated host with access to the LUNs, customer would like to see RHV (optionally) using a disk layout that makes damage to the LVM metadata less likely to happen.
 
The following is just a suggested approach to illustrate the goal and not necessarily an exact description of how the goal can be accomplished:
 
RHV could create two partitions on each LUN: an initial, dummy, small partition that takes a few MBs and a second partition taking the rest of the LUN that contains the LVM PV (including its metadata).
Should any uncontrolled process write a new LVM label, partition table, software raid signature, etc. at the beginning of the SD LUN, the LVM metadata would remain unaffected.
Recovering from such an event would involve restoring a simple partition table; since the first partition has a known, fixed size, it is straightforward to restore it either manually or even in an automated way.
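For illustration only, such a layout could be created with standard tools; the device path, the sizes and the multipath partition suffix below are hypothetical:

  parted -s /dev/mapper/<lun> mklabel gpt
  parted -s /dev/mapper/<lun> mkpart dummy 1MiB 11MiB    # small sacrificial partition at the start
  parted -s /dev/mapper/<lun> mkpart data 11MiB 100%     # partition that will hold the LVM PV
  pvcreate /dev/mapper/<lun>-part2                       # PV header now sits ~11 MiB into the LUN

A stray 'pvcreate', partition table or RAID signature written at offset 0 of the LUN would then hit the GPT and the dummy partition, not the PV.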
 
After consulting with the Virt SBR in EMEA, accidental wiping of the starting area of RHV SD LUNs is a regular event reported by customers.
Therefore anything Red Hat can do to make the storage layout on the RHV SD LUNs more resilient to these events would result in more resilient RHV deployments, less load on Red Hat support, and less customer dissatisfaction.
 
> 5. How would the customer like to achieve this? (List the functional requirements here)
 
RHV (optionally) creates a disk layout on LUNs for storage domains that prevents accidental writes to the initial area of the LUNs from affecting the LVM metadata.
 
> 6. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
 
Customer can wipe the beginning of LUNs that belong to SDs and easily recover from that event.
E.g. customer can run 'pvcreate' on an RHV SD LUN and easily restore the original partition table, regaining access to the affected SD with no or minimal intervention from Red Hat support and minimal impact to the workload(s) in the affected SD.
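As a sketch of such a test, assuming the partitioned layout proposed above and a GPT-capable sfdisk (the device path is hypothetical):

  sfdisk -d /dev/mapper/<lun> > lun-pt.backup            # dump the partition table while the layout is healthy
  dd if=/dev/zero of=/dev/mapper/<lun> bs=1M count=1     # simulate an accidental wipe of the start of the LUN
  sfdisk /dev/mapper/<lun> < lun-pt.backup               # restore the known, fixed layout; the PV in partition 2 is untouched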
 
> 7. Is there already an existing RFE upstream or in Red Hat Bugzilla?
 
None found.
 
> 8. Does the customer have any specific timeline dependencies and which release would they like to target (i.e. RHEL5, RHEL6)?
 
Customer would like to see this enhancement considered for a RHV 4.3 release.
 
> 9. Is the sales team involved in this request and do they have any additional input?
 
Not yet.
 
> 10. List any affected packages or components.
 
Unknown at this stage.
 
> 11. Would the customer be able to assist in testing this functionality if implemented?
 
Yes, certainly.

Comment 7 Marina Kalinin 2018-03-02 20:56:50 UTC
Maybe RHV can also duplicate its metadata (or relocate it completely) to the end of the allocated PV? So even if the beginning of the PV is wiped, we can still read from the end?

Comment 8 Julio Entrena Perez 2018-03-05 15:46:30 UTC
Marina, that could be worse, since any accidental wiping of the beginning of LUNs would then affect VM data, which is even more challenging to restore than LVM metadata.

Is there any technical reason against the original suggestion of (optionally) leaving the first 10 MB or so of each LUN empty (e.g. with a dummy partition)?
Any customer ever bitten by this would be happy to lose 10 MB in exchange for avoiding the downtime associated with these events.

Comment 9 Nir Soffer 2018-04-08 15:51:45 UTC
(In reply to Marina from comment #7)
> Maybe RHV can also duplicate its metadata (or relocate it completely) to the
> end of the allocated PV? So even if the beginning of the PV is wiped, we can
> still read from the end?

We don't keep any RHV metadata on the LUNs themselves; RHV metadata is kept in the special LVs
(metadata, master, ...).

- The first MiB of the LUN is the PV header
- The next 128 MiB hold the VG metadata
- The last 128 MiB of the LUN contain an exact copy of the VG metadata

I don't think there is a backup of the pv header (first 1MiB).
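One way to see these areas on an existing PV, using standard lvm2 report fields (the device path is hypothetical):

  pvs -o +pe_start,mda_count,mda_size /dev/mapper/<lun>
  # pe_start shows where data extents begin; an mda_count of 2 indicates the second
  # metadata area near the end of the device, matching the layout described above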

We cannot change these details; this is the LVM on-disk format (maybe you should open an LVM RFE
to make this format more resilient).

We cannot change the layout of existing storage domains, but we could introduce a new storage
domain format that uses partition tables, moving the PV header away from the start
of the LUN. However, the same code that wipes the start of the LUNs today could just as
easily wipe the start of the LVM partition, so I'm not sure this would solve anything.

I think the root cause here is not using a proper LVM filter that hides *all* the
devices that are not required by the host. LVM commands on an RHV host must have
access only to the LUNs used by the host itself (e.g. a host booting from SAN).
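A minimal sketch of such a filter in the devices { } section of /etc/lvm/lvm.conf (the accepted device is only an example; the exact pattern depends on what the host actually needs):

  # accept only the host's own root/boot device, reject everything else
  filter = [ "a|^/dev/sda2$|", "r|.*|" ]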

Comment 10 Julio Entrena Perez 2018-04-09 08:08:05 UTC
(In reply to Nir Soffer from comment #9)
> We cannot change the layout of existing storage domains, but we could introduce a new
> storage domain format that uses partition tables, moving the PV header away from the
> start of the LUN. However, the same code that wipes the start of the LUNs today could
> just as easily wipe the start of the LVM partition, so I'm not sure this would solve
> anything.

The most common case is anaconda wiping the beginning of the LUNs and possibly writing some LVM or filesystem header as well.
The proposed layout would protect the RHV SD from this most frequent accident.

Comment 11 Nir Soffer 2018-04-09 08:19:30 UTC
(In reply to Julio Entrena Perez from comment #10)
> (In reply to Nir Soffer from comment #9)
> > We cannot change the layout of existing storage domains, but we could introduce a new
> > storage domain format that uses partition tables, moving the PV header away from the
> > start of the LUN. However, the same code that wipes the start of the LUNs today could
> > just as easily wipe the start of the LVM partition, so I'm not sure this would solve
> > anything.
> 
> The most common case is anaconda wiping the beginning of the LUNs and possibly
> writing some LVM or filesystem header as well.
> The proposed layout would protect the RHV SD from this most frequent accident.

Why would anaconda wipe LUNs? Is this something it does by itself, or just a user error
asking it to use a LUN which the user should not touch?

Do we have an anaconda bug for this?

Comment 12 Julio Entrena Perez 2018-04-09 08:28:31 UTC
Often kickstart files will include:

clearpart --all --initlabel (--drives=sda)

This relies on the kernel modules for the FC HBAs not being loaded, since otherwise there is no guarantee that sda will be the local disk.
Unfortunately, on occasion the HBA kernel modules will be loaded (a misconfiguration in Satellite, a typo in the ks file, a new HBA model using a kernel module that hasn't been blacklisted, etc.).
It's a common mistake with terrible consequences for RHV.

There isn't much mileage in an anaconda bug since anaconda is doing as instructed.
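As a hedged sketch of reducing that exposure in the kickstart itself (the device name is only an example, and whether this is feasible depends on the customer's provisioning setup):

  ignoredisk --only-use=sda
  clearpart --all --initlabel --drives=sda
  # a persistent identifier is preferable to 'sda' where the installer supports it,
  # since 'sda' is exactly what becomes unreliable once the HBA modules load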

Comment 13 Yaniv Lavi 2018-04-11 09:03:07 UTC
Marina wrote in BZ#1305327
> I will close this bug wontfix for now and I suggest moving this discussion
> toward this RFE:
> bz#1541165 - Provide a way to extract lvm metadata backups from a PV

Comment 14 Franta Kust 2019-05-16 13:06:48 UTC
BZ<2>Jira Resync

