Bug 1513960
Summary: | [RFE] Increase resilience of disk layout in block based storage domains against accidental wiping | ||
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Julio Entrena Perez <jentrena> |
Component: | ovirt-engine | Assignee: | Yaniv Lavi <ylavi> |
Status: | CLOSED WONTFIX | QA Contact: | Elad <ebenahar> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.1.7 | CC: | gveitmic, jentrena, lsurette, mkalinin, nsoffer, pdwyer, rbalakri, Rhev-m-bugs, srevivo, ylavi |
Target Milestone: | --- | Keywords: | FutureFeature |
Target Release: | --- | Flags: | lsvaty: testing_plan_complete- |
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Enhancement |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-04-11 09:03:07 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1523346 |
Description
Julio Entrena Perez
2017-11-16 11:44:45 UTC
Maybe RHV can also duplicate its metadata (or relocate it completely) to the end of the allocated PV? So even if the beginning of the PV is wiped, we can still read from the end?

Marina, that could be worse, since any accidental wiping of the beginning of LUNs would then affect VM data, which is even more challenging to restore than LVM metadata.

Is there any technical reason against the original suggestion of (optionally) leaving the first 10 MB or so of each LUN empty (e.g. with a dummy partition)? Any customer ever bitten by this would be happy to lose 10 MB in exchange for avoiding the downtime associated with these events.

(In reply to Marina from comment #7)
> Maybe RHV can also duplicate its metadata (or relocate it completely) to the
> end of the allocated PV? So even if the beginning of the PV is wiped, we can
> still read from the end?

We don't keep any metadata on the LUNs. RHV metadata is kept in the special LVs (metadata, master, ...).

- The first MiB of the LUN is the PV header
- The next 128 MiB is the VG metadata
- The last 128 MiB of the LUN contains an exact copy of the VG metadata

I don't think there is a backup of the PV header (first 1 MiB).

We cannot change these details; this is the LVM format (maybe you should open an LVM RFE to make this format more resilient).

We cannot change the layout of old storage domains, but we can introduce a new storage domain format using partition tables, moving the PV header away from the start of the LUN. But the same code that wipes the start of the LUNs today can wipe the start of the LVM partition, so I'm not sure this will solve anything.

I think the root cause here is not using a proper LVM filter, hiding *all* the devices that are not required by the host. LVM commands on a RHV host must have access only to the LUNs used by the host (e.g. host booting from SAN).
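To make the layout discussion above concrete, here is a minimal sketch that models the regions as described in this thread (1 MiB PV header, 128 MiB VG metadata, and a 128 MiB copy at the end) and reports which of them an accidental wipe of the first N bytes would touch. The function names and the `reserved` parameter are hypothetical; `reserved` stands in for the suggested dummy partition at the start of the LUN. The sizes are taken from the comment above, not read from a real device, and the shifted layout is an assumption about how such a format might look, not an existing storage domain format.

```python
MiB = 1024 * 1024

# Sizes as stated in the comment above (assumed, not probed from a device).
PV_HEADER_SIZE = 1 * MiB
VG_METADATA_SIZE = 128 * MiB


def lun_regions(lun_size, reserved=0):
    """Return a list of (name, start, end) byte ranges, end exclusive.

    `reserved` is a hypothetical empty area at the start of the LUN
    (the "dummy partition" idea); 0 models the layout described above.
    """
    if lun_size < reserved + PV_HEADER_SIZE + 2 * VG_METADATA_SIZE:
        raise ValueError("LUN too small for the described layout")
    header_end = reserved + PV_HEADER_SIZE
    metadata_end = header_end + VG_METADATA_SIZE
    copy_start = lun_size - VG_METADATA_SIZE
    regions = []
    if reserved:
        regions.append(("reserved", 0, reserved))
    regions += [
        ("pv header", reserved, header_end),
        ("vg metadata", header_end, metadata_end),
        ("data", metadata_end, copy_start),
        ("vg metadata copy", copy_start, lun_size),
    ]
    return regions


def regions_hit_by_wipe(lun_size, wipe_bytes, reserved=0):
    """Names of the regions overlapped by a wipe of bytes [0, wipe_bytes)."""
    if wipe_bytes <= 0:
        return []
    return [
        name
        for name, start, _end in lun_regions(lun_size, reserved)
        if start < wipe_bytes
    ]


if __name__ == "__main__":
    lun = 100 * 1024 * MiB  # a 100 GiB LUN, chosen only for illustration
    print(regions_hit_by_wipe(lun, 10 * MiB))
    print(regions_hit_by_wipe(lun, 10 * MiB, reserved=10 * MiB))
```

The model only covers a wipe that starts at offset 0 and stops within the reserved area. As noted above, code that wipes the start of the partition itself would still reach the PV header.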
(In reply to Nir Soffer from comment #9)
> We cannot change old storage domains layout, but we can introduce a new
> storage
> domain format using partition tables, moving the pv header away from the
> start
> of the LUN. But the same code that wipes today the start of the LUNs, can wipe
> the start of the lvm partition, so I'm not sure this will solve anything.

The most common case is anaconda wiping the beginning of the LUNs and possibly writing some LVM or filesystem header as well.
This would protect the RHV SD from the most frequent accident.

(In reply to Julio Entrena Perez from comment #10)
> (In reply to Nir Soffer from comment #9)
> > We cannot change old storage domains layout, but we can introduce a new
> > storage
> > domain format using partition tables, moving the pv header away from the
> > start
> > of the LUN. But the same code that wipes today the start of the LUNs, can wipe
> > the start of the lvm partition, so I'm not sure this will solve anything.
>
> The most common case is anaconda wiping the beginning of the LUNs and
> possibly writing some LVM or filesystem header as well.
> This would protect the RHV SD from the most frequent accident.

Why would anaconda wipe LUNs? Is this something it does itself, or just a user error asking to use a LUN which the user should not touch?

Do we have an anaconda bug for this?

Often kickstart files will include:

  clearpart --all --initlabel (--drives=sda)

This relies on the kernel modules for the FC HBAs not being loaded, since otherwise there is no guarantee that sda will be the local disk.
Unfortunately, HBA kernel modules occasionally do get loaded (misconfiguration in the Satellite, typo in the ks file, new model of HBA using a kernel module that hasn't been blacklisted, etc.).
It's a common mistake with terrible consequences for RHV.

There isn't much mileage in an anaconda bug since anaconda is doing as instructed.
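The failure mode described above (sda turning out to be a SAN LUN because an HBA module was loaded) could be guarded against with a pre-check, sketched below. The sketch assumes the text comes from `lsblk -dn -o NAME,TRAN`, which prints one whole disk per line with its transport type. The function names are hypothetical, and the set of transports treated as SAN-backed is an assumption to adjust per environment; none of this is part of anaconda or RHV.

```python
# Transports treated as SAN-backed; an assumption, adjust as needed.
SAN_TRANSPORTS = {"fc", "iscsi"}


def parse_lsblk(output):
    """Parse `lsblk -dn -o NAME,TRAN` text into a {name: transport} dict.

    Devices with no reported transport map to an empty string.
    """
    devices = {}
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        devices[fields[0]] = fields[1] if len(fields) > 1 else ""
    return devices


def san_backed(devices, drives):
    """Return the drives from `drives` whose transport looks like SAN."""
    return [d for d in drives if devices.get(d) in SAN_TRANSPORTS]


if __name__ == "__main__":
    # Sample text standing in for real lsblk output on a host where the
    # HBA module was loaded before the local disk was enumerated.
    sample = "sda fc\nsdb sata\n"
    unsafe = san_backed(parse_lsblk(sample), ["sda"])
    if unsafe:
        print("refusing to clear SAN-backed drives:", ", ".join(unsafe))
```

A check like this could run before the partitioning step and abort the installation when the drive named in `--drives` is not local, rather than relying on module blacklists alone.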
Marina wrote in BZ#1305327
> I will close this bug wontfix for now and I suggest moving this discussion
> toward this RFE:
> bz#1541165 - Provide a way to extract lvm metadata backups from a PV