Bug 1607527
| Summary: | data corruption with 'split' workload to XFS on DM cache with its 3 underlying devices being on same NVMe device | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Mike Snitzer <msnitzer> |
| Component: | kernel | Assignee: | Mike Snitzer <msnitzer> |
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | agk, airlied, bgurney, bhull, bmarzins, bskeggs, cmarthal, ddouwsma, dmilburn, emilne, ewk, hartsjc, hdegoede, ichavero, idryomov, itamar, jarodwilson, jbrassow, jglisse, jmoyer, john.j5live, jonathan, josef, jpittman, keith.busch, kent.overstreet, kernel-maint, lilin, linville, loberman, mchehab, minlei, mjg59, mpatocka, msnitzer, rhandlin, steved, storage-qe, thornber, vumrao, zlang, zyan |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1605222 | Environment: | |
| Last Closed: | 2018-07-25 15:12:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Mike Snitzer
2018-07-23 16:16:20 UTC
Comment 1
Ewan D. Milne

Doesn't seem to be specific to NVMe. I reproduced this on a system with an Intel NVMe card, but also with a 25GB scsi_debug device configured as follows:

```
modprobe scsi_debug dev_size_mb=25600
```

```
Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-52428799, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-52428799, default 52428799): +5G
Partition 1 of type Linux and of size 5 GiB is set

Command (m for help): n
Partition type:
   p   primary (1 primary, 0 extended, 3 free)
   e   extended
Select (default p): p
Partition number (2-4, default 2):
First sector (10487808-52428799, default 10487808):
Using default value 10487808
Last sector, +sectors or +size{K,M,G} (10487808-52428799, default 52428799):
Using default value 52428799
Partition 2 of type Linux and of size 20 GiB is set

Command (m for help): p

Disk /dev/sdb: 26.8 GB, 26843545600 bytes, 52428800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 32768 bytes
Disk label type: dos
Disk identifier: 0x1707677c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048    10487807     5242880   83  Linux
/dev/sdb2        10487808    52428799    20970496   83  Linux
```

A multipath device is then built by hand directly on top of the second partition:

```sh
#!/bin/sh
modprobe dm-service-time
modprobe dm-cache-smq

DEVICE=/dev/sdb2
SIZE=`blockdev --getsz $DEVICE`
echo "0 $SIZE multipath 2 queue_mode rq 0 1 1 service-time 0 1 2 $DEVICE 1000 1" | dmsetup create sdb_mpath
```

dmtest configuration:

```
[root@rhel-storage-61 ~]# more .dmtest/config
profile :sdb_shared do
  metadata_dev '/dev/sdb1'
  #data_dev '/dev/sdb2'
  data_dev '/dev/mapper/sdb_mpath'
end

default_profile :sdb_shared
```

Test change to fit the smaller data device:

```diff
diff --git a/lib/dmtest/tests/cache/io_use_tests.rb b/lib/dmtest/tests/cache/io_use_tests.rb
index 1ada8ad..e68a6a8 100644
--- a/lib/dmtest/tests/cache/io_use_tests.rb
+++ b/lib/dmtest/tests/cache/io_use_tests.rb
@@ -91,7 +91,7 @@ class IOUseTests < ThinpTestCase
       :cache_size => gig(4),
       #:cache_size => gig(46), # would like to get up to 512GB to match customer but...
-      :data_size => gig(48),
+      :data_size => gig(19),
       ##:data_size => gig(210),
       #:io_mode => :writeback,
       :io_mode => :writethrough,
```

Comment 2
Mike Snitzer

(In reply to Ewan D. Milne from comment #1)
> Doesn't seem to be specific to NVMe. I reproduced this on a system with an
> Intel NVMe card, but also with a 25GB scsi_debug device configured as
> follows:

Yeap, thanks for doing that test Ewan.

It is clear that generic_make_request() isn't used for request-based DM cloned request submission, and that request-based DM simply cannot be layered on conventional DOS partitions as-is.

Closing this bug as WONTFIX. I may circle back to hardening the kernel to either support this or at least prevent the configuration (e.g. drivers/md/dm-mpath.c:multipath_ctr() could check whether the device is a partition).
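The layering comment 2 proposes to reject can also be seen from userspace with standard tools. A minimal sketch, using the device and map names from the reproducer above (sdb_mpath built on /dev/sdb2); the commented results describe what this setup would be expected to show, not captured output.

```sh
# Inspect what the hand-built multipath map sits on
# (names follow the reproducer above; comments describe expected results).
dmsetup table sdb_mpath      # shows the multipath target line for the map
dmsetup deps sdb_mpath       # lists the underlying device by (major, minor)
lsblk -no TYPE /dev/sdb2     # "part": the map sits directly on a DOS partition
lsblk -no TYPE /dev/sdb      # "disk": what multipathd would normally claim
```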
Comment 3
loberman

Hello Mike

So I need to make sure everybody is aware that we have many customers using fdisk on a multipath device and creating a single (DOS) type partition.

They then take this and either create a PV on it for LVM, or add it to ASM as a raw device, or even run mkfs on the mpath*p1.

It's worked forever, but it's always been a single partition; still, it is a partition.

What part am I not understanding about generic request-based DM and partitions, given that we had not seen major corruption until all of this showed up of late?

Thanks
Laurence

Comment 4

(In reply to loberman from comment #3)
> Hello Mike
>
> So I need to make sure everybody is aware that we have many customers using
> fdisk on a multipath device and creating a single (DOS) type partition.
>
> They then take this and either create a PV on it for LVM, or add it to ASM
> as a raw device, or even run mkfs on the mpath*p1.
>
> It's worked forever, but it's always been a single partition; still, it is
> a partition.
>
> What part am I not understanding about generic request-based DM and
> partitions, given that we had not seen major corruption until all of this
> showed up of late?

If you look at Mike's test, the multipath device itself is on top of the partition. Multipath is not running on the whole device. The multipath tools do not allow this, and never have. The only way you can do this is to manually create the device table, like Mike is doing.

Customers who multipath the whole device, and then create a partition on it (which will get mapped to a kpartx device on top of multipath), will not encounter this bug. [A sketch of this supported layering appears at the end of this report.]

> Thanks
> Laurence

Comment 5
loberman

OK, that makes perfect sense. That was my foolish misunderstanding.

Thanks
Laurence

Final note

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
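For reference, a minimal sketch of the supported layering described in comment 4: multipath claims the whole device and the partition is mapped on top of it by kpartx. Device and map names (/dev/sdX, mpatha) are placeholders, and the name kpartx gives the partition mapping (mpatha1 vs. mpathap1) depends on local naming configuration.

```sh
# Supported: multipath maps the whole device; the partition lives on top of it.
multipath -ll mpatha                 # mpatha maps the whole /dev/sdX
fdisk /dev/mapper/mpatha             # create a single DOS partition on the map
kpartx -a /dev/mapper/mpatha         # maps the partition, e.g. /dev/mapper/mpatha1
mkfs.xfs /dev/mapper/mpatha1         # or pvcreate for LVM, or hand it to ASM

# The reproducer in comment 1 inverts this: the partition is created on the
# underlying device first, and a multipath table is built by hand directly on
# /dev/sdb2; multipathd itself never creates such a map.
```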