Bug 730433
Field | Value
---|---
Summary | Error message when allocation group size too big is misleading
Product | Red Hat Enterprise Linux 6
Component | xfsprogs
Version | 6.1
Hardware | x86_64
OS | Linux
Status | CLOSED ERRATA
Severity | low
Priority | unspecified
Target Milestone | rc
Reporter | linuxteer
Assignee | Eric Sandeen <esandeen>
QA Contact | Boris Ranto <branto>
CC | dchinner, eguan, tlavigne
Fixed In Version | xfsprogs-3.1.1-8.el6
Doc Type | Bug Fix
Last Closed | 2013-02-21 11:00:44 UTC

Description
linuxteer
2011-08-12 22:22:02 UTC
Another good reason to stick with the defaults, and not fiddle with things like agcount? :) But agreed, it could be a better error message.

(In reply to comment #0)
> Description of problem:
> In my test system I created a FS where the default allocation group size
> happened to be about 1 TiB:
>
> [root@localhost ~]# mkfs.xfs -L nss6_1 -f -d su=512k,sw=20 -l sunit=512,size=64m /dev/sdc
> meta-data=/dev/sdc               isize=256    agcount=37, agsize=268435328 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=9764864000, imaxpct=5
>          =                       sunit=128    swidth=2560 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal log           bsize=4096   blocks=16384, version=2
>          =                       sectsz=512   sunit=64 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> However, if I specify fewer allocation groups than the default, instead of
> getting a message complaining that the AG size is too big I get this error message:
>
> [root@localhost ~]# mkfs.xfs -L nss6_1 -f -d agcount=31,su=512k,sw=20 -l sunit=512,size=64m ${dev1}
> Allocation group size (314995613) is not a multiple of the stripe unit (128)

That's not misleading - it's correctly detecting an error with the configuration you specified - but it's not the error you -expected-. All it means is that the alignment checks are done before the size checks.

Indeed, we have to do the checking that way, because when we cater for AG alignment during automatic sizing it changes the size of the AGs. Hence the size checks must be done after the alignment checks. You've just triggered an alignment check failure before the size checks are done....

> Actual results:
> "Allocation group size (314995613) is not a multiple of the stripe unit (128)"
>
> Expected results:
> "Allocation group size (314995613 blks) is over maximum allocation groups size
> of 1 TiB (268435328 blks)"

Just because you are trying to trigger a specific error, it doesn't mean that the specific error you want to see is the only possible error that can occur from the given configuration. That a different error occurs doesn't necessarily mean there is a bug in the program.

> Or a message related to the AG size being over maximum allowed.
> Also a nice improvement is to add the units for values displayed in the message
> (blocks in this case).

Yes, I agree the error message could be more verbose and mention units, but that is a secondary issue and unrelated to your (incorrect) expectation of what error should be detected given the input configuration.

Created attachment 518343 [details]
Output for mkfs.xfs when agcount set to primes < 37
Created attachment 518344 [details]
Output for mkfs.xfs when agcount set to primes >= 37
Yes Dave, I did not check if alignment was a problem, since I was supplying mkfs.xfs with agcount, not agsize. The agsize was calculated by mkfs.xfs.

Incidentally, I used prime numbers for agcount to avoid getting the warning:
"Warning: AG size is a multiple of stripe width. This can cause performance problems by aligning all AGs on the same disk..."

To me, after checking that the agsize with agcount=37 was close to 1 TB, the "obvious" issue was that a smaller count forced the agsize over the 1 TB limit.

The prime agcount values below 37 all gave the same error, except 2 and 5, which generate multiples of 128 and display what I thought was the correct message (which BTW includes units):
"agsize (1952972800b) too big, maximum is 268435455 blocks"

Providing agcount with primes 37 and over (up to 257) all worked fine.

Furthermore, the agsize in the alignment error message (314995613) seems to be the ceiling of the number of blocks in the data section (9764864000) divided by the agcount provided (for primes below 37, except 2 & 5). However, when specifying agcount with primes from 37 to 257, agsize was always adjusted to the next multiple of 128 (only 149 generates a multiple of 128 by itself), but for primes under 37 it was not. See the attached files for details.

When the defaults are allowed, the number of data blocks divided by 37 (the default agcount) = 263915243.24, and the default agsize selected was 268435328, the highest multiple of 128 below the 1 TB limit. So it seems to be optimizing for maximum agsize and then deriving agcount.

Without checking the source code, to a user of mkfs.xfs trying to manipulate agcount (in my case, for parallel scalability optimization purposes) it looks like the agsize is correctly selected/adjusted to be aligned to sunit. But when using smaller values for agcount than the default, it looks like a wrong error message was displayed, especially if you see the "agsize too big" error message for agcount set to 2 or 5.

Created attachment 518347 [details]
Output for mkfs.xfs when agcount set to primes < 37
Selected the wrong file in the original upload. Apologies.
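
The arithmetic in the comment above can be checked with a short C sketch. This is not code from xfsprogs: the function names are hypothetical, and the constants are simply copied from the mkfs.xfs output and error messages quoted in this report (4 KiB blocks assumed).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the output quoted in this report. */
#define DATA_BLOCKS   9764864000ULL /* "blocks=9764864000" */
#define STRIPE_UNIT   128ULL        /* "sunit=128", in filesystem blocks */
#define AG_MAX_BLOCKS 268435455ULL  /* "maximum is 268435455 blocks" */

/* Ceiling of a / b for unsigned values; b must be non-zero. */
uint64_t ceil_div_u64(uint64_t a, uint64_t b)
{
    assert(b != 0);
    return a / b + (a % b != 0);
}

/* Hypothetical helper: unaligned agsize implied by a given agcount. */
uint64_t agsize_for(uint64_t agcount)
{
    return ceil_div_u64(DATA_BLOCKS, agcount);
}

/* True if agsize is already a multiple of the stripe unit. */
bool is_stripe_aligned(uint64_t agsize)
{
    return agsize % STRIPE_UNIT == 0;
}

/* True if agsize is above the maximum quoted in the error message. */
bool exceeds_max(uint64_t agsize)
{
    return agsize > AG_MAX_BLOCKS;
}
```

With these helpers, `agsize_for(31)` gives 314995613, which is both unaligned and over the maximum, while `agsize_for(5)` gives 1952972800, which is aligned but still over the maximum. That matches the two different messages reported for those agcount values.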
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Dave, I do find this interesting, in the mkfs code:

    if ((tmp_agsize >= XFS_AG_MIN_BLOCKS(blocklog)) &&
        (tmp_agsize <= XFS_AG_MAX_BLOCKS(blocklog))) {
            ...
    } else {
            if (nodsflag) {
                    dsunit = dswidth = 0;
            } else {
                    fprintf(stderr,
                            _("Allocation group size (%lld) is not a multiple of the stripe unit (%d)\n"),
                            (long long)agsize, dsunit);
                    exit(1);
            }
    }

At that point we have tried to round the agsize up and down to align it, and have found it to be too large in both cases. At the exit(1) point, it seems like it'd make some sense to point that out in the error message.

agsize wasn't specified, it was calculated given the specified agcount. There were efforts to fix agsize up w.r.t. stripe geometry, but no efforts to make it fit within the maximum size; hence I tend to agree that if one specifies agcount so small that agsize is out of bounds, that does seem like a reasonable first error message to provide.

Having said all that, this seems like the sort of thing which could be tweaked upstream, but doesn't necessarily rise to the level of requiring a RHEL package update...

-Eric

commit ddf12ea5dc56a728f24d24c5d7403c3412b40b86
Author: Eric Sandeen <sandeen>
Date: Wed Mar 28 22:23:11 2012 -0500

    mkfs.xfs: print std info if agcount makes agsize out of bounds

    When specifying a too-small agcount with stripe geometry,
    mkfs.xfs can fail with a somewhat unexpected message:

    $ mkfs.xfs -f -d file,name=fsfile,size=9764864000b,agcount=31,su=512k,sw=20
    Allocation group size (314995613) is not a multiple of the stripe unit (128)

    This strikes me as especially odd because normally, mkfs.xfs
    tries to fix up the agsize to be a stripe multiple.

    The only way we get to the above error message is if ag _size_
    is out of bounds; exiting with an error about alignment rather
    than about size seems odd.

    Maybe below is too clever, but if by the time we've decided that
    agsize is out of bounds after rounding it both up and down, as
    necessary, to get to a stripe-width multiple, calling
    validate_ag_geometry() will give us the same standard message
    as if we had specified no stripe geometry:

    $ mkfs/mkfs.xfs -f -d file,name=fsfile,size=9764864000b,agcount=31,su=512k,sw=20
    agsize (314995613b) too big, maximum is 268435455 blocks
    Usage: mkfs.xfs ...

    $ mkfs/mkfs.xfs -f -d file,name=fsfile,size=9764864000b,agcount=31
    agsize (314995613b) too big, maximum is 268435455 blocks
    Usage: mkfs.xfs ...

    Also, tidy up error message to explicitly state "blocks" not "b"

    Signed-off-by: Eric Sandeen <sandeen>
    Reviewed-by: Dave Chinner <dchinner>

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate, in the next release of Red Hat Enterprise Linux.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0481.html
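
The behaviour the commit message describes (round the computed agsize up, then down, to a stripe multiple, and report a size problem rather than an alignment problem if neither fits) can be sketched in isolation. This is an illustrative sketch only, not the actual xfsprogs change: `align_agsize`, `enum ag_check`, and the parameter names are hypothetical, and the bounds are passed in by the caller rather than taken from XFS_AG_MIN_BLOCKS/XFS_AG_MAX_BLOCKS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum ag_check { AG_OK, AG_TOO_SMALL, AG_TOO_BIG, AG_UNALIGNABLE };

/*
 * Try to fit agsize to a multiple of dsunit within [min_blocks, max_blocks].
 * On success, *aligned holds the adjusted size. On failure, *aligned holds
 * the original agsize and the result says which bound was violated, so the
 * caller can print a size error instead of an alignment error.
 */
enum ag_check align_agsize(uint64_t agsize, uint64_t dsunit,
                           uint64_t min_blocks, uint64_t max_blocks,
                           uint64_t *aligned)
{
    assert(dsunit != 0 && aligned != NULL);

    uint64_t up = ((agsize + dsunit - 1) / dsunit) * dsunit;
    uint64_t down = (agsize / dsunit) * dsunit;

    /* Prefer rounding up; fall back to rounding down. */
    if (up >= min_blocks && up <= max_blocks) {
        *aligned = up;
        return AG_OK;
    }
    if (down >= min_blocks && down <= max_blocks) {
        *aligned = down;
        return AG_OK;
    }

    /* Neither rounding fits: classify by the unaligned size. */
    *aligned = agsize;
    if (agsize > max_blocks)
        return AG_TOO_BIG;
    if (agsize < min_blocks)
        return AG_TOO_SMALL;
    return AG_UNALIGNABLE;
}
```

Called with the agcount=31 value from this report (agsize 314995613, stripe unit 128, maximum 268435455), the sketch returns `AG_TOO_BIG`, which corresponds to the "agsize ... too big" message the fix produces. The minimum bound is an input here because the report does not state its value.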