Bug 961501
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | mkfs.xfs: go into multidisk mode when geometry is on cmdline | | |
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Eric Sandeen <esandeen> |
| Component: | xfsprogs | Assignee: | Eric Sandeen <esandeen> |
| Status: | CLOSED ERRATA | QA Contact: | Boris Ranto <branto> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.4 | CC: | bengland, bfoster, dchinner, eguan, nkhare, perfbz, rwheeler, vbellur |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | xfsprogs-3.1.1-11.el6 | Doc Type: | Bug Fix |
| Doc Text: | When stripe geometry was specified manually to the mkfs.xfs utility, mkfs.xfs did not properly select "multidisk mode" as it does when stripe geometry is automatically detected. As a result, fewer allocation groups than optimal were created. With this update, multidisk mode is selected properly, and a larger number of allocation groups are created. (See the example following this table.) | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 968418 (view as bug list) | Environment: | |
| Last Closed: | 2013-11-21 21:19:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 968418, 971698 | | |
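
As an illustration of the fix described in the Doc Text above, here is a minimal verification sketch. It is not taken from the bug itself: the scratch image path, its size, and the loop-device setup are assumptions chosen for the example, while the stripe geometry (su=256k, sw=10) matches the mkfs commands quoted later in this report.

```
# Illustrative sketch only; image path, size, and loop-device setup are assumptions.
truncate -s 100G /tmp/xfs-scratch.img          # sparse scratch file
dev=$(losetup -f --show /tmp/xfs-scratch.img)  # attach it to a free loop device

# Manually specified stripe geometry, as in the report.  With the unfixed
# xfsprogs this reports the single-disk default of agcount=4; with
# xfsprogs-3.1.1-11.el6 it should select multidisk mode and report more AGs.
mkfs.xfs -f -d su=256k,sw=10 "$dev" | grep agcount

# The AG count can also be read back from the on-disk superblock:
xfs_db -r -c 'sb 0' -c 'p agcount' "$dev"

losetup -d "$dev"
rm -f /tmp/xfs-scratch.img
```

For comparison, the unfixed xfsprogs-3.1.1-4.el6 reports agcount=4 for the same geometry, as shown in the perf88 output near the end of this report.
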
Description (Eric Sandeen, 2013-05-09 18:23:48 UTC):
xfstests xfs/292 tests this:

```
# FS QA Test No. 292
#
# Ensure mkfs with stripe geometry goes into multidisk mode
# which results in more AGs
```

---

The perf team is tracking this for support of RHS 2.1; thanks, Eric, for requesting this. Is this fix making it into the 6.4 z-stream? We need RHS to start using this.

---

I did not see the above xfsprogs version in the RHEL6.5 nightly build dated June 4th, nor did I find it in http://download.lab.bos.redhat.com/composes/nightly/latest-RHEL6.5/6.5/Server/x86_64/os/Packages/. Am I looking in the right place?

---

I reproduced a problem with XFS on RHEL 6.4 with glusterfs-3.4.0.8 when there are only 5 allocation groups, using the smallfile benchmark (Peter Portante reported it with the Catalyst workload before). I think this fix would have prevented the problem by forcing more allocation groups; I will try the xfsprogs version that has the fix.

The "sync" command hangs for at least 15 minutes after I append to a bunch of small files, and the perf utility shows that xfsalloc threads are spending their time waiting on a spin lock, which I think is associated with an allocation group. This happened on multiple servers, so it is not a hardware problem.

```
19.77%  [kernel]  [k] _spin_lock
15.91%  [xfs]     [k] xfs_alloc_busy_trim
12.96%  [xfs]     [k] xfs_btree_get_rec
 8.34%  [xfs]     [k] xfs_alloc_get_rec
 5.30%  [xfs]     [k] xfs_alloc_ag_vextent_near
 5.16%  [xfs]     [k] xfs_btree_get_block
 4.54%  [xfs]     [k] xfs_btree_increment
 3.88%  [xfs]     [k] xfs_alloc_compute_aligned
 3.43%  [xfs]     [k] xfs_btree_readahead
 3.29%  [xfs]     [k] xfs_btree_rec_offset
 2.26%  [xfs]     [k] _xfs_buf_find
 2.19%  [xfs]     [k] xfs_btree_decrement
 2.01%  [xfs]     [k] xfs_btree_rec_addr
 0.99%  [xfs]     [k] xfs_trans_buf_item_match
```

I'm getting stack traces like this in /var/log/messages:

```
Jun  3 19:00:47 gprfs048 kernel: INFO: task sync:25314 blocked for more than 120 seconds.
```

```
[root@gprfs048 ~]# xfs_info /mnt/brick0
meta-data=/dev/mapper/vg_brick0-lv isize=512    agcount=5, agsize=268435392 blks
         =                         sectsz=512   attr=2, projid32bit=0
data     =                         bsize=4096   blocks=1167851520, imaxpct=5
         =                         sunit=64     swidth=640 blks
naming   =version 2                bsize=8192   ascii-ci=0
log      =internal                 bsize=4096   blocks=521728, version=2
         =                         sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                     extsz=4096   blocks=0, rtextents=0

[root@gprfs048 ~]# mount | grep xfs
/dev/mapper/vg_brick0-lv on /mnt/brick0 type xfs (rw,noatime,inode64)
```

Workload: after this command runs successfully, run the "sync" command:

```
13-06-04-09-07-04 : command 2352: cd /root/smallfile-v1.9.13 ; ./smallfile_cli.py --top /mnt/glusterfs/smf.d-pass2 --host-set gprfc088,gprfc089,gprfc090,gprfc091,gprfc092,gprfc094,gprfc095,gprfc096 --operation append --threads 4 --file-size 4 --record-size 0 --files-per-dir 100 --dirs-per-dir 10 --files 32768 --response-times Y --stonewall N --pause 500 > /shared/benchmarks/gluster_test/logs/13-06-02-20-02-16/smallfile.13-06-04-09-07-04
```

---

The zstream has not yet been granted. Patience... the Wheels of Process must turn.

For testing you can grab it from here:
http://download.devel.redhat.com/brewroot/packages/xfsprogs/3.1.1/11.el6/

Thanks,
-Eric

---

When I use mkfs.xfs ... -d agcount=32,sw=10 ... I get a warning from mkfs. 32 is the value that your mkfs patch generates if sw=10 is specified (bz 961501). I get rid of the warning if I use 31 instead of 32.

Also, the warning is slightly off -- it says that the AG size is a multiple of the stripe width, but it is not, right? Maybe it wants an agcount value that has no common factors with sw so that there is even wear on the drives? Does it matter?

Example:

```
# mkfs -t xfs -f -i size=512 -n size=8192 -d agcount=32,su=256k,sw=10 -L RHSbrick0 /dev/vg_brick0/lv
Warning: AG size is a multiple of stripe width.  This can cause performance
problems by aligning all AGs on the same disk.  To avoid this, run mkfs with
an AG size that is one stripe unit smaller, for example 36495296.
meta-data=/dev/vg_brick0/lv      isize=512    agcount=32, agsize=36495360 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=1167851520, imaxpct=5
         =                       sunit=64     swidth=640 blks
naming   =version 2              bsize=8192   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
```

---

It is a multiple, yes. In blocks: 36495360 / (256*1024/4096 * 10) = 57024.0.

In this version and versions prior, mkfs will, by default, slightly lower the agsize if needed to avoid the multiple. If you specify agcount, it changes when the calculations are done, and mkfs sometimes issues this warning. If you think that's a problem which must be fixed in RHEL, it needs a new bug.

The whole point of this change was to not *need* to specify an agcount on the cmdline in order to go into "multidisk mode." So behavior when specifying agcount is not relevant or related to this bug, nor is it changed behavior with this patch.

---

Is this change in RHEL6.5 as well? When I yum installed xfsprogs from http://download.lab.bos.redhat.com/nightly/latest-RHEL6.5/6.5/Server/x86_64/os I get this:

```
[root@perf88 ~]# rpm -q xfsprogs
xfsprogs-3.1.1-4.el6.x86_64
```

and mkfs does this:

```
[root@perf88 network-scripts]# lvcreate --name brick1 --size 2750G vg_bricks /dev/sdb
  Logical volume "brick1" created
[root@perf88 network-scripts]# mkfs -t xfs -L perf88-brk1 -i size=512 -n size=8192 -d su=256k,sw=10 /dev/vg_bricks/brick1
meta-data=/dev/vg_bricks/brick1  isize=512    agcount=4, agsize=180223936 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=720895744, imaxpct=5
         =                       sunit=64     swidth=640 blks
naming   =version 2              bsize=8192   ascii-ci=0
log      =internal log           bsize=4096   blocks=352000, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
```

---

Sorry, user brain damage on my part. I updated the pointer in the yum repo file for RHEL 6.5 but not for ScalableFileSystem, so I didn't get the right xfsprogs. Never mind.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1657.html
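
As a footnote to the "it is a multiple" arithmetic above, the check can be reproduced with plain shell arithmetic using only the values from the example output in this report (agsize=36495360 blocks, su=256k, block size 4096, sw=10); this is an illustrative sketch, not part of the original bug.

```
# Stripe width in filesystem blocks: (su in bytes / block size) * sw = 64 * 10 = 640,
# matching the swidth=640 blks shown in the mkfs output above.
echo $(( (256 * 1024 / 4096) * 10 ))                  # 640
# agsize modulo stripe width is 0, i.e. the AG size really is an exact multiple
# of the stripe width, which is what triggers the warning when agcount=32 is forced.
echo $(( 36495360 % ((256 * 1024 / 4096) * 10) ))     # 0
echo $(( 36495360 / ((256 * 1024 / 4096) * 10) ))     # 57024
```
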