Bug 623846 - Filesystem corrupted when running parted on dm device
Summary: Filesystem corrupted when running parted on dm device
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: parted
Version: 6.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Hans de Goede
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks: 613754
 
Reported: 2010-08-12 23:49 UTC by Mike Burns
Modified: 2016-04-26 15:47 UTC (History)
CC: 25 users

Fixed In Version: parted-2.1-10.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 613754
Environment:
Last Closed: 2010-11-10 21:19:45 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Proposed workaround for parted (466 bytes, patch)
2010-08-16 18:08 UTC, Milan Broz

Comment 2 RHEL Program Management 2010-08-13 00:17:53 UTC
This issue has been proposed at a time when we are only considering blocker
issues for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 4 Ben Marzinski 2010-08-13 17:49:05 UTC
The reason that it was assigned to device-mapper is that the exact same thing happens if you use a simple linear device.  Whatever is causing the corruption is not in any multipath-specific code.

Comment 5 Ben Marzinski 2010-08-13 17:57:16 UTC
Unfortunately, it's also not consistently reproducible, but I have hit this multiple times with both a multipath device and a simple linear device.

Comment 6 Ben Marzinski 2010-08-13 18:35:14 UTC
Here's how I reproduced it.

# dmsetup create test --table "0 563214336 linear 8:16 0"
# parted /dev/mapper/test -s "mklabel gpt"
# parted /dev/mapper/test -s "mkpart primary ext2 0M 256M"
# parted /dev/mapper/test -s "mkpart primary ext2 256M 512M"
# mke2fs -t ext2 /dev/mapper/testp1 -L TEST1
# mke2fs -t ext2 /dev/mapper/testp2 -L TEST2
# findfs LABEL=TEST1

#THIS WORKS#

# parted /dev/mapper/test -s "mkpart primary ext2 512M -1"
# findfs LABEL=TEST1

#HERE IT FAILS. NOT ONLY THAT, THE FILESYSTEM IS NOW UNMOUNTABLE#

The only problem is that after reproducing it about three or four times, using both a multipath device, and a linear device, it stopped reproducing.  I haven't been able to hit it since, and I have no clue what the difference is.

Comment 7 Milan Broz 2010-08-13 19:19:02 UTC
Is it still reproducible, when you add "udevadm settle" between commands?

Comment 8 Milan Broz 2010-08-13 19:21:21 UTC
And are there any messages from device-mapper in syslog? (like device is busy etc)

Comment 9 Mike Burns 2010-08-13 19:34:46 UTC
(In reply to comment #7)
> Is it still reproducible, when you add "udevadm settle" between commands?    

In RHEV-H, we do run udevadm settle after each of the parted commands.  We still see this problem.

Comment 10 Mike Burns 2010-08-13 19:40:30 UTC
This issue is a blocker for RHEL 6 since it is causing a blocker in RHEV-H 6.  I've tried to work around the issue to no avail.

Comment 11 Mike Burns 2010-08-13 19:52:26 UTC
(In reply to comment #6)
> Here's how I reproduced it.
> 
> # dmsetup create test --table "0 563214336 linear 8:16 0"
> # parted /dev/mapper/test -s "mklabel gpt"

In RHEV-H, we're using msdos, not gpt.  Not sure if it matters, but that is a difference from what we're running.

Comment 12 Milan Broz 2010-08-13 19:56:40 UTC
Seems I am able to reproduce it (sometimes). This is probably a bug in parted: I see that even after creating the second partition, "parted print" cannot see the filesystem on the first partition (blkid -p still sees it, though).

Anyway, there is clear data corruption.

Comment 13 Milan Broz 2010-08-13 20:00:34 UTC
(In reply to comment #6)
> # parted /dev/mapper/test -s "mkpart primary ext2 0M 256M"
produces
Warning: The resulting partition is not properly aligned for best performance

Changing this to an aligned offset
parted /dev/mapper/test -s "mkpart primary ext2 4M 256M"

fixes the problem. Strange...

Comment 14 Milan Broz 2010-08-13 20:39:22 UTC
I am quite confused - parted does not see the fs; after a flush it sees it again.

+ dmsetup create test --table '0 312500000 linear /dev/sdh 0'
+ parted /dev/mapper/test -s 'mklabel gpt'
+ parted /dev/mapper/test -s 'mkpart primary ext2 0M 256M'
Warning: The resulting partition is not properly aligned for best performance.
+ mke2fs -t ext2 /dev/mapper/testp1 -L TEST1
mke2fs 1.41.12 (17-May-2010)
Filesystem label=TEST1
...

+ parted /dev/mapper/test print
...
Number  Start   End    Size   File system  Name     Flags
 1      17.4kB  256MB  256MB               primary

* NO FS *

# parted /dev/mapper/test print
...
Number  Start   End    Size   File system  Name     Flags
 1      17.4kB  256MB  256MB               primary

# echo 3 > /proc/sys/vm/drop_caches

* FLUSH CACHE *

# parted /dev/mapper/test print
...
Number  Start   End    Size   File system  Name     Flags
 1      17.4kB  256MB  256MB  ext2         primary

and ext2 here again...

Comment 18 Milan Broz 2010-08-13 21:47:09 UTC
to be exact, proper 1MiB alignment is this, 1024k seems not to work properly...
mkpart primary ext2 2048s 256M

Comment 19 Hans de Goede 2010-08-13 21:59:30 UTC
(In reply to comment #18)
> to be exact, proper 1MiB alignment is this, 1024k seems not to work properly...
> mkpart primary ext2 2048s 256M    

Hmm 1024k should work just fine really. What do the device mapper
/sys/block/dm-#/queue/*_io_size attributes look like ?

Comment 20 Hans de Goede 2010-08-13 22:03:34 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > to be exact, proper 1MiB alignment is this, 1024k seems not to work properly...
> > mkpart primary ext2 2048s 256M    
> 
> Hmm 1024k should work just fine really. What do the device mapper
> /sys/block/dm-#/queue/*_io_size attributes look like ?    

Duh, never mind: parted's cmdline uses 1000 as the k size instead of 1024, how nice.
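The unit confusion above is easy to check with a little arithmetic. A sketch (assuming 512-byte sectors; `parted_to_sector` is a hypothetical helper mirroring parted's decimal k/M units, not a parted API):

```python
SECTOR = 512
MIB = 1024 * 1024

def parted_to_sector(value, unit):
    # parted's CLI treats k/M as decimal SI units (10^3 / 10^6),
    # while "s" means 512-byte sectors.
    scale = {"s": SECTOR, "k": 10**3, "M": 10**6}[unit]
    return value * scale // SECTOR

# "1024k" lands on sector 2000, which is NOT a multiple of 2048 (1 MiB):
assert parted_to_sector(1024, "k") == 2000
assert (2000 * SECTOR) % MIB != 0

# "2048s" is exactly 1 MiB, hence Milan's working offset in comment 18:
assert parted_to_sector(2048, "s") == 2048
assert (2048 * SECTOR) % MIB == 0
```

This is why "2048s" aligned properly while "1024k" did not: 1024 decimal kilobytes is 1,024,000 bytes, 24,576 bytes short of 1 MiB.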

Comment 21 Milan Broz 2010-08-16 18:07:02 UTC
All this comes down to this commit:
http://git.debian.org/?p=parted/parted.git;a=commitdiff;h=2a6936fab4d4499a4b812dd330d3db50549029e0

"I've checked with two independend storage subsystem kernel
developers, and /dev/sda and /dev/sda#, guarantee cache coherency
now-a-days.  So there is no need to do this for 2.6, which also
eliminates the need to call _flush_cache() on device open at all."

Didn't you forget about DM? :-)

Well, I think DM _should_ guarantee cache coherency on the whole device stack
(we should probably have a separate bug here), but apparently there is a
problem:
- we have one mapping over the whole device (multipath or linear) and partition mappings over that
- while mkfs operates on the partition device, parted accesses the underlying device directly

Because in the example the device is misaligned and the GPT/partition table write requires an aligned write, the real aligned write extends into the partition mapping.

And it seems to use old cached data from before the fs was created, overwriting the mkfs-ed partition data.
(The real corruption happens in ped_device_write() when writing the partition table with the new partition.)
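The overlap can be illustrated with some arithmetic. Assuming writes are issued in 4 KiB aligned chunks (the 4096-byte granularity is an assumption for illustration, not taken from parted), the chunk covering the last GPT table sector also covers the start of the misaligned first partition:

```python
SECTOR = 512
ALIGN = 4096                 # assumed aligned-write granularity

gpt_last = 33                # GPT header + table occupy LBAs 0..33
part_start = 34              # first partition starts at LBA 34 (17.4 kB)

# The aligned chunk that contains the last GPT sector:
chunk_start = (gpt_last * SECTOR) // ALIGN * ALIGN
chunk_end = chunk_start + ALIGN

assert chunk_start <= gpt_last * SECTOR      # chunk covers GPT data...
assert chunk_end > part_start * SECTOR       # ...and spills into partition 1
```

So rewriting the partition table through the whole-device node, with stale cached contents for that chunk, can clobber filesystem data that mkfs wrote through the partition node.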

Comment 22 Milan Broz 2010-08-16 18:08:31 UTC
Created attachment 439001 [details]
Proposed workaround for parted

This workaround reintroduces the cache flush for a changed DM device on open, which seems to be enough to fix this problem.
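The failure mode and the flush-on-open workaround can be sketched as a toy model. This is hypothetical illustration code, not parted's or the kernel's actual caching logic: two independently cached views of one backing store go stale relative to each other until one is flushed.

```python
class CachedView:
    """Toy per-device buffer cache over a shared backing store."""
    def __init__(self, disk):
        self.disk = disk       # shared dict: block number -> bytes
        self.cache = {}        # this view's private cache

    def read(self, block):
        if block not in self.cache:
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def write(self, block, data):
        self.cache[block] = data
        self.disk[block] = data

    def flush(self):
        # the workaround: drop cached blocks on open
        self.cache.clear()

disk = {0: b"old"}
whole = CachedView(disk)       # parted's handle on /dev/mapper/test
part = CachedView(disk)        # mkfs's handle on /dev/mapper/testp1

whole.read(0)                  # parted caches block 0
part.write(0, b"ext2")         # mkfs writes through the partition device
stale = whole.read(0)          # parted still sees b"old" - incoherent
whole.flush()                  # flush on open, as in the attached patch
fresh = whole.read(0)          # now b"ext2"
```

In the real bug the damage is worse than a stale read: parted writes the aligned chunk back using the stale contents, overwriting the freshly mkfs-ed data.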

Comment 26 Hans de Goede 2010-08-16 20:47:56 UTC
mbroz's patch, which fixes this, is in parted-2.1-10.el6; moving to MODIFIED.

Comment 27 Hans de Goede 2010-08-16 23:06:11 UTC
I've verified that the new parted does not break installs with either mdraid or dmraid BIOS RAID sets, using both auto and custom partitioning.

Comment 29 Joey Boggs 2010-08-17 13:06:26 UTC
Retried this on RHEV-H again with the updated parted and still end up with:

...
Number  Start   End    Size  Type     File system  Name     Flags
 1      512B  256MB   256MB  primary 
 2      257M  512MB   255MB  primary     ext2

Comment 30 Milan Broz 2010-08-17 13:37:52 UTC
(if you have new reproducer please paste it here - it works for me - both for gpt and msdos partition)

Comment 32 Milan Broz 2010-08-17 16:20:05 UTC
Do you have on that system installed udisks package?

If so, can you remove it and try to reproduce it?

Comment 33 Mike Burns 2010-08-17 16:45:07 UTC
RHEV-H does not have udisks installed.
I tried a new spin of RHEV-H this morning and I'm not seeing the corruption anymore.  I'm hitting another issue but it doesn't appear to be related at this point. 

My install machine is an IBM Server using multipathed luns on an HP MSA for the storage device.

Comment 34 Mike Burns 2010-08-17 20:39:25 UTC
*** Bug 613754 has been marked as a duplicate of this bug. ***

Comment 36 Robert M Williams 2010-08-30 19:10:28 UTC
Moving to VERIFIED based on comment 33.

Comment 37 releng-rhel@redhat.com 2010-11-10 21:19:45 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

