Bug 1212655 - mkfs.xfs does not read/detect lvm raid layout when stripe unit is 4k
Summary: mkfs.xfs does not read/detect lvm raid layout when stripe unit is 4k
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: xfsprogs
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Filesystem QE
Reported: 2015-04-16 22:52 UTC by lejeczek
Modified: 2015-04-22 16:27 UTC
CC List: 1 user

Doc Type: Bug Fix
Last Closed: 2015-04-22 16:27:03 UTC



Description lejeczek 2015-04-16 22:52:05 UTC
Description of problem:

If I remember correctly, mkfs.xfs would look into the device and take care of strips and stripes. From my old notes, from a long time ago:

  --- Segments ---
  Logical extent 0 to 228929:
    Type    striped
    Stripes   2
    Stripe size   64.00 KiB
    Stripe 0:
      Physical volume /dev/sds
      Physical extents  0 to 114464
    Stripe 1:
      Physical volume /dev/sdt
      Physical extents  0 to 114464

# and mkfs.xfs does find the underlying geometry correctly!!

mkfs.xfs /dev/mapper/h200Internal-0
meta-data=/dev/mapper/h200Internal-0 isize=256    agcount=32, agsize=7325744 blks
         =                       sectsz=4096  attr=2, projid32bit=0
data     =                       bsize=4096   blocks=234423808, imaxpct=25
         =                       sunit=16     swidth=32 blks # <------------ HERE
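
For reference, those numbers follow directly from the segment layout above:

# 64 KiB stripe size / 4 KiB filesystem block = 16 blocks -> sunit=16
# 2 stripes * 16 blocks                       = 32 blocks -> swidth=32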

Today, I created an LV like this:

lvcreate --type raid5 -i 9 -I 4 -n raid5 -l 100%pv 5tb.Toshiba-Lot $(echo /dev/sd{g..p})

so it uses 10 drives/PVs; then the usual mkfs.xfs, but then:

meta-data=/dev/5tb.Toshiba-Lot/raid5 isize=256    agcount=41, agsize=268435455 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=10988467200, imaxpct=5
         =                       sunit=0      swidth=0 blks # <--HERE ?????
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

is this a bug??


Version-Release number of selected component (if applicable):

xfsprogs-3.2.1-6.el7.x86_64
3.10.0-229.1.2.el7.x86_64


Comment 2 Eric Sandeen 2015-04-16 22:59:20 UTC
Can you provide the output of:

# blockdev --getss --getpbsz --getiomin --getioopt --getbsz /dev/5tb.Toshiba-Lot/raid5

please?

thanks,
-Eric

Comment 3 lejeczek 2015-04-16 23:08:07 UTC
blockdev --getss --getpbsz --getiomin --getioopt --getbsz /dev/5tb.Toshiba-Lot/raid5
512
4096
4096
36864
4096

Comment 4 Eric Sandeen 2015-04-16 23:13:40 UTC
So:

Sector size: 512
Physical block size: 4096
Minimum IO size: 4096
Optimal IO size: 36k
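
As a sanity check, that optimal IO size matches your lvcreate geometry:

# raid5 with -i 9: 9 data stripes; -I 4: 4 KiB stripe unit
# 9 * 4096 bytes = 36864 bytes = optimal IO size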

mkfs uses minimum and optimal sizes for stripe unit and stripe width:

        val = blkid_topology_get_minimum_io_size(tp);
        *sunit = val;
        val = blkid_topology_get_optimal_io_size(tp);
        *swidth = val;

but a stripe unit which matches the physical block size is not a stripe unit at all:

        /*
         * If the reported values are the same as the physical sector size
         * do not bother to report anything.  It will only cause warnings
         * if people specify larger stripe units or widths manually.
         */
        if (*sunit == *psectorsize || *swidth == *psectorsize) {
                *sunit = 0;
                *swidth = 0;
        }
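
Plugging your device's numbers into that check (a sketch of the values involved):

        sunit       = 4096   (minimum IO size)
        swidth      = 36864  (optimal IO size)
        psectorsize = 4096   -> sunit == psectorsize, so both get zeroed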

That's why it's coming up zero.

And sunit == psectorsize because that's what you specified on the lvcreate commandline, with -I 4:

       -I, --stripesize StripeSize
              Gives the number of kilobytes for the granularity of the stripes.

so the problem here is your lvcreate commandline, I think.  Did you really want a 4k stripe unit?

-Eric

Comment 5 Eric Sandeen 2015-04-17 02:11:46 UTC
So, I'm inclined to close this NOTABUG.  mkfs.xfs intentionally ignores a stripe unit of the size you specified... thoughts?

Comment 6 lejeczek 2015-04-17 06:03:30 UTC
Yes, it seems OK when, e.g., -I 8:

mkfs.xfs /dev/5tb.Toshiba-Lot/raid5 
meta-data=/dev/5tb.Toshiba-Lot/raid5 isize=256    agcount=41, agsize=268435454 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=10988467200, imaxpct=5
         =                       sunit=2      swidth=18 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
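
which works out as expected:

# 8 KiB stripe unit / 4 KiB block = 2 blocks  -> sunit=2
# 9 data stripes * 2 blocks       = 18 blocks -> swidth=18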

But what's wrong with a stripe size of 4 KiB? I thought arrays spanning a larger number of drives are better off with smaller stripe sizes, so I went with 4 KiB.

But also, the same (or maybe a separate) thing happens to my ext4 on the same VG; this time it's a simple stripe:

lvcreate -i 4 -I 4 -n raid0

and dumpe2fs -h shows:

Inode blocks per group:   128
RAID stripe width:        4
Flex block group size:    16
Filesystem created:       Thu Apr 16 23:56:14 2015

So it seems that 4k is the rule across the OS; what's the reasoning behind it?

Comment 7 Eric Sandeen 2015-04-17 19:15:13 UTC
The reason mkfs.xfs rejects/ignores stripe unit == physical sector size as a valid stripe geometry is that some non-striped storage reports an "optimal IO size" equal to the physical sector size:

commit 3dc7147f03cdd4cfe689d78d4ca4b2650c49a263
Author: Eric Sandeen <sandeen>
Date:   Wed Dec 12 17:26:24 2012 -0600

    mkfs.xfs: don't detect geometry values <= psectorsize
    
    blkid_get_topology() ignores devices which report 512
    as their minimum & optimal IO size, but we should ignore
    anything up to the physical sector size; otherwise hard-4k
    sector devices will report a "stripe size" of 4k, and warn
    if anything larger is specified:
    
    # modprobe scsi_debug physblk_exp=3 num_parts=2 dev_size_mb=128
    # mdadm --create /dev/md1 --level=0 --raid-devices=2  -c 4 /dev/sdb1 /dev/sdb2
    # mkfs.xfs -f -d su=16k,sw=2 /dev/md1
    mkfs.xfs: Specified data stripe unit 32 is not the same as the volume stripe unit 
    mkfs.xfs: Specified data stripe width 64 is not the same as the volume stripe widt
    ...

Generally you'll want stripe units larger than 4k, which is a single filesystem block. Unless there's a compelling reason to need a 4k stripe unit, I think your best path forward is to just create your raid with a larger stripe unit.
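
For example (illustrative values only; pick a stripe size that suits your workload), something like:

# recreate the LV with a 64 KiB stripe unit instead of 4 KiB
lvcreate --type raid5 -i 9 -I 64 -n raid5 -l 100%pv 5tb.Toshiba-Lot $(echo /dev/sd{g..p})
mkfs.xfs /dev/5tb.Toshiba-Lot/raid5
# expected: sunit=16 blks (64 KiB / 4 KiB), swidth=144 blks (9 stripes * 16)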

Comment 8 Eric Sandeen 2015-04-22 16:27:03 UTC
Closing NOTABUG; this is working as designed.

