Bug 596061 - [RFE] - File domains metadata should be saved in 2 copies to avoid corruption in case of IO failure to one copy
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: unspecified
Hardware: All
OS: Linux
Target Milestone: ---
: 3.3.4
Assignee: Dan Kenigsberg
QA Contact:
Whiteboard: storage
Depends On:
Reported: 2010-05-26 09:10 UTC by Moran Goldboim
Modified: 2015-01-24 10:35 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Last Closed: 2013-01-30 22:51:01 UTC
oVirt Team: ---

Attachments
vdsm log (3.85 MB, application/octet-stream)
2010-05-26 09:10 UTC, Moran Goldboim

System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 574733 | 0 | high | CLOSED | RHEV nfs soft mounts needs to be changed to hard mounts | 2021-02-22 00:41:40 UTC

Description Moran Goldboim 2010-05-26 09:10:44 UTC
Created attachment 416719 [details]
vdsm log

Description of problem:
An I/O error while activating storage corrupts the pool metadata and leaves the data center nonfunctional.
Thread-1084::ERROR::2010-05-26 06:38:17,351::dispatcher::103::irs::[Errno 5] Input/output error
Thread-1084::ERROR::2010-05-26 06:38:17,352::dispatcher::104::irs::Traceback (most recent call last):
Thread-1247::ERROR::2010-05-26 06:46:34,620::misc::58::irs::Meta Data parameter invalid: ('Version or spm id invalid',)
Thread-1247::ERROR::2010-05-26 06:46:34,620::misc::59::irs::Traceback (most recent call last):
Thread-1247::ERROR::2010-05-26 06:46:34,624::dispatcher::98::irs::{'status': {'message': "Meta Data parameter invalid: ('Version or spm id invalid',)", 'code': 755}, 'args': [('Version or spm id invalid',)]}

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Actual results:
The data center goes down; no host can take SPM.

Expected results:
The metadata should be restored to its last version.

Additional info:

Comment 2 Cyril Plisko 2010-05-26 10:06:01 UTC
The problem happened because the NFS server became unavailable exactly when we were about to update the metadata. In general, we have nothing to prevent unrecoverable metadata loss if the underlying storage (NFS or SAN) goes down during a critical metadata update. We are essentially relying on the storage being available and, well, reliable.

If we decide not to trust the storage anymore, we could develop a two-phase metadata commit process that keeps two (synchronized) copies of the metadata, thus eliminating corruption of the only metadata copy we have.

Such an addition would probably fall beyond the scope of 2.2, so I am re-targeting this issue to 6.0-2.3.

Feel free to change it if I am wrong.

Comment 3 Cyril Plisko 2010-05-26 10:09:14 UTC
See also bug 574733

Comment 5 RHEL Program Management 2010-06-07 16:21:28 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for

Comment 7 Itamar Heim 2013-01-30 22:51:01 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.
