This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to receive updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 2211095 - mkfs.xfs crashes if agcount is big, e.g. 400
Summary: mkfs.xfs crashes if agcount is big, e.g. 400
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: xfsprogs
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Zorro Lang
URL:
Whiteboard:
Depends On:
Blocks: 2210773
 
Reported: 2023-05-30 13:34 UTC by Roman Safronov
Modified: 2023-09-23 11:46 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-23 11:46:23 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments: none


Links
System                  ID               Private  Priority  Status    Summary  Last Updated
Red Hat Issue Tracker   RHEL-7960        0        None      Migrated  None     2023-09-23 11:46:18 UTC
Red Hat Issue Tracker   RHELPLAN-158547  0        None      None      None     2023-05-30 14:04:34 UTC

Description Roman Safronov 2023-05-30 13:34:37 UTC
Description of problem:
The issue was found while attempting to restore a RHEL 9 VM (an OpenStack node) from a backup using the ReaR tool.

We use xfs_growfs to increase the filesystem size to 10 GB; the result is agcount=400, and mkfs.xfs crashes when later recreating a filesystem with that agcount.
When xfs_growfs is used to increase the filesystem size to 8 GB instead, the result is agcount=320 and mkfs.xfs does not crash.


Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux release 9.2 (Plow)
xfsprogs-5.14.2-1.el9.x86_64

How reproducible:
100%, if agcount is more than around 320 (320 is ok, but 400 causes the crash)


Steps to Reproduce:
See https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c31 for details on how to reproduce the issue.

Actual results:
mkfs.xfs crashes if agcount is big (e.g. 400)

Expected results:
mkfs.xfs does not crash

Additional info:

Comment 1 Eric Sandeen 2023-05-30 14:27:06 UTC
This bug doesn't make sense to me.

"We use xfs_growfs to increase fs size to 10GB and result is that agcount=400 and mkfs.xfs crashes"

You've specified two different utilities. How does "mkfs.xfs" crash when you're running "xfs_growfs"?

Step by step reproducer, please.

(Also, as an aside: 400+ AGs is pessimal. Who can we talk to to work out a deployment strategy that does not result in this sort of terrible filesystem geometry, which will result in poor performance in many cases?)

Comment 2 Eric Sandeen 2023-05-30 14:28:40 UTC
"See details in https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c31 how to reproduce the issue" - there is no crash shown in that comment.

Comment 3 Pavel Cahyna 2023-05-30 14:34:49 UTC
Hi Eric,

the crash is described in https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c23.

https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c31 merely describes how we created a filesystem with settings that we attempt to reproduce using the command in comment 23.

Comment 4 Eric Sandeen 2023-05-30 14:56:05 UTC
ok, so a full reproducer is:

# truncate --size=10049552384 fsfile
# mkfs.xfs -f -m uuid=23ce7347-fce3-48b4-9854-60a6db155b16 -i size=512 -d agcount=400 -s size=512 -i attr=2 -i projid32bit=1 -m crc=1 -m finobt=1 -b size=4096 -i maxpct=25 -d sunit=128 -d swidth=128 -l version=2 -l sunit=128 -l lazy-count=1 -n size=4096 -n version=2 -r extsize=4096 fsfile
mkfs.xfs: xfs_mkfs.c:3016: align_ag_geometry: Assertion `!cli_opt_set(&dopts, D_AGCOUNT)' failed.
Aborted (core dumped)

Comment 5 Pavel Cahyna 2023-05-30 15:01:53 UTC
Eric, yes.

Since our goal here is to recreate a filesystem that is as close to the original as feasible, I wonder whether it would be better to use -d agsize= instead of -d agcount= to match the settings of the original. Is there a reason to prefer one to the other (fixed agsize vs. fixed agcount when creating a filesystem)?

Comment 6 Eric Sandeen 2023-05-30 15:25:53 UTC
The underlying problem here seems to be that xfs_growfs has created a geometry that mkfs.xfs won't accept.

When the filesystem was grown, it created the last (400th) AG with only 2048 blocks, or 8 MB.
mkfs.xfs will not allow an AG to be less than 16 MB, so it fails.

If I start with a smaller filesystem with 6144-block AG size, and then grow it to the size in your case, we can see this:

# truncate --size=10049552384 fsfile
# mkfs.xfs -f -m uuid=23ce7347-fce3-48b4-9854-60a6db155b16 -i size=512 -d agcount=40,size=1006632960 -s size=512 -i attr=2 -i projid32bit=1 -m crc=1 -m finobt=1 -b size=4096 -i maxpct=25 -d sunit=128 -d swidth=128 -l version=2 -l sunit=128 -l lazy-count=1 -n size=4096 -n version=2 -r extsize=4096 fsfile
meta-data=fsfile                 isize=512    agcount=40, agsize=6144 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=245760, imaxpct=25
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1872, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# mount -o loop fsfile mnt

# xfs_growfs mnt
meta-data=/dev/loop0             isize=512    agcount=40, agsize=6144 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=245760, imaxpct=25
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1872, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 245760 to 2453504

# xfs_info mnt
meta-data=/dev/loop0             isize=512    agcount=400, agsize=6144 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=2453504, imaxpct=25
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1872, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# bc
2453504%6144
2048

2048 4k blocks is only 8 MB, but mkfs.xfs enforces a 16 MB minimum AG size.
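
For reference, the same leftover-AG arithmetic as a small shell sketch (the values are taken from the xfs_info output above; 16 MB is the mkfs.xfs minimum discussed in this comment):

# Sketch only: the last AG gets whatever blocks are left over after the
# full-size AGs, i.e. (total data blocks) mod (agsize).
blocks=2453504     # from xfs_info: data blocks after growing
agsize=6144        # from xfs_info: blocks per AG
bsize=4096         # from xfs_info: block size in bytes
last_ag_blocks=$(( blocks % agsize ))                      # 2048
echo "$(( last_ag_blocks * bsize / 1024 / 1024 )) MB"      # 8 MB, below the 16 MB minimum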

xfs_growfs should probably not be creating this geometry; that's probably a kernel bug. If the goal of ReaR is to make exactly the same geometry as the original, it seems that this may not be possible, at least not with all versions of mkfs.xfs...

Yes, you could use e.g. "-d agsize=6144b" instead (you need to specify the "-b size=" parameter before you can use the "b" unit in later specifications).

However, that will result in a slightly smaller filesystem, because it will lop off the last AG since it's too small.

All this said, it is a pity to be so carefully reproducing such a poor filesystem geometry :(

Comment 7 Pavel Cahyna 2023-05-30 16:26:21 UTC
(In reply to Eric Sandeen from comment #6)
> xfs_growfs should probably not be creating this geometry; that's probably a
> kernel bug. If the goal of REAR is to make exactly the same geometry as the
> original, it seems that this may not be possible, at least not with all
> versions of mkfs.xfs...

We can relax this to "reasonably similar geometry, exact if possible"; the important thing is that the creation be reliable.

> Yes, you could use i.e. "-d agsize=6144b" instead (you need to specify the
> "-b size=" parameter before you can use the "b" unit in later specifications.
> 
> However, that will result in a slightly smaller filesystem, because it will
> lop off the last AG since it's too small.

Is using agsize instead of agcount something that could always be done? I am afraid of creating regressions in some other cases.

> All this said, it is a pity to be so carefully reproducing such a poor
> filesystem geometry :(

ReaR is a backup and recovery tool, not a storage optimizer, so this is expected. Garbage in, garbage out. (That said, one may specify other XFS options and they will override those taken from the original system, but then they override everything.)

I am curious, which part of the filesystem geometry here is poor?

Comment 8 Pavel Cahyna 2023-05-30 16:31:25 UTC
AIUI, this kind of geometry can also be produced by the xfs tools using default (or nearly default) settings.

Comment 9 Eric Sandeen 2023-05-30 16:52:37 UTC
The poor geometry is a relatively small filesystem with an inordinately large number of AGs. In the case at https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c24, this is a 9 GB filesystem with 400 allocation groups of 24 MB each.

A normal mkfs would have created 4 AGs, not 400. 400 AGs is most likely a result of the suboptimal deployment of these images, where an extremely small source image gets created, deployed, and then grown onto whatever storage is available on the target.

The only time mkfs would create 400 AGs by default is for a 400 TB filesystem (the maximum AG size is 1 TB).
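
For comparison, what current mkfs.xfs would pick by default for an image of this size can be previewed without writing anything ("defaults.img" is just an illustrative file name; -N prints the geometry without creating the filesystem):

# truncate --size=10049552384 defaults.img
# mkfs.xfs -N -f defaults.img

Per the above, the expected result is agcount=4 with multi-gigabyte AGs rather than 400 tiny ones.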

Comment 10 Pavel Cahyna 2023-05-31 08:13:19 UTC
Eric, yes, here is the information about the original image: https://bugzilla.redhat.com/show_bug.cgi?id=2155253#c27
It has agcount=2, agsize=6144 blks, bsize=4096, blocks=12288.

Maybe the OpenStack colleagues would be interested to know what the "right" source image size would be to create for subsequent growing? Is it around 1 TB (in order to create one 1 TB AG)? To me this deployment method looks reasonable, and I have used it in the past as well. I am not sure they have the option of choosing a size that is less "extremely small", though.

Comment 11 Pavel Cahyna 2023-05-31 16:09:19 UTC
Eric, by the way, couldn't xfs_growfs resize the AG if it is way too small, instead of adding more AGs of the same size? And why is agcount=2 and not 1 for the original filesystem, when the AG size is already way too small?

Comment 15 Eric Sandeen 2023-06-08 22:56:57 UTC
1) AG sizes are fixed after mkfs time. There has been some speculation about merging small AGs to resize them during growfs, but this sort of change is not trivial.
2) We have had multiple discussions with the OpenStack folks about how this is suboptimal, but we've not yet reached any consensus on deployment changes.
3) I have raised the issue of differing minimum AG sizes between userspace/mkfs and kernel/growfs. This will get sorted out, but it won't change filesystems already in the field.

For the purposes of this particular bug, it's probably best to focus on whether it is really necessary or wise to perfectly reproduce the exact geometry of the original filesystem. As we've seen, it appears that for at least some filesystems and/or versions of xfsprogs, this may not be possible.

Indeed, restore may be the perfect time to remedy a very suboptimal filesystem geometry, and/or inherit improved defaults from newer versions of mkfs.xfs.

Comment 16 Eric Sandeen 2023-06-22 17:47:20 UTC
I'm not sure how to proceed on this one. It's not the ag count, but rather the (too-small last) ag size that is making mkfs.xfs throw an assert.

We could probably be a bit more graceful than an ASSERT but that still doesn't really solve the problem.

dchinner is looking at the ag sizing constraints upstream. I suppose if we end up relaxing mkfs.xfs constraints, newer mkfs will agree to create the same filesystem that growfs has produced. If it goes the other way, and upstream decides growfs needs to stop creating this small AG, then there's not going to be any way for mkfs to recreate such a filesystem...

Comment 17 Pavel Cahyna 2023-06-23 09:31:00 UTC
> it's probably best to focus on whether it is really necessary or wise to perfectly reproduce the exact geometry of the original filesystem

A backup & recovery & cloning tool like ReaR can't really decide when it is necessary or wise to reproduce the exact geometry of the original filesystem. If you are suggesting that it is never really necessary, I would like to know what the alternative to "perfectly reproduce the exact geometry" is. As I said above, we can relax the requirement to "reasonably similar geometry, exact if possible", but how do we achieve that? Is there an mkfs.xfs option to reproduce the geometry inexactly?

In this particular case, setting -d agsize= instead of -d agcount= seems to do the trick, but, as I asked above, is using agsize instead of agcount something that could always be done? I am afraid of creating regressions in some other cases. Can you guarantee that setting agsize will work reliably in all the cases where setting agcount works now?

Comment 18 Eric Sandeen 2023-06-23 13:46:15 UTC
Yeah, I understand that it is tricky. This is a corner-case inconsistency in xfs, which makes your job almost impossible if your job is to exactly recreate the existing on-disk geometry of the original system. Today the only way to get there is mkfs followed by growfs, and ReaR has no way to know that.

Using -d agsize instead of -d agcount is relying on undocumented behavior, so I would not be eager to suggest that either. And if it lops off the last AG, you'll get a slightly smaller filesystem, which could be problematic in the (also pessimal) case of a nearly 100% full filesystem. So no, I can't guarantee that that will work reliably in all cases. It's essentially using undocumented internal mkfs.xfs heuristics (which, frankly, can change over time.)

The alternative to perfectly reproducing the original geometry is to just take the mkfs.xfs defaults, which is something we almost always suggest to users anyway. But of course that doesn't cover all the bases either.  Sometimes (rarely, but sometimes) users do have good reason to override defaults, and they may want that back.

I suppose one option - and maybe this doesn't fit within how ReaR is supposed to work, as it is less "relaxing" - is to run mkfs.xfs -N to see what mkfs.xfs would do by default. Compare that to the original geometry. If it differs, ask the user if they'd like to use current upstream defaults (recommended), or attempt to recreate the existing geometry. That might be a lot more interactive than you desire, though.
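
A rough sketch of that compare-and-ask flow (hypothetical variable and function names, not actual ReaR code; a real check would compare individual parameters such as agcount and agsize rather than the raw -N output, which also contains the device name):

#!/bin/sh
# Sketch only: preview what mkfs.xfs would do by default, compare with the
# geometry recorded from the original system, and let the user choose.
# DEV and ORIG_GEOM are hypothetical placeholders.
preview=$(mkfs.xfs -N -f "$DEV")
if [ "$preview" != "$ORIG_GEOM" ]; then
    printf 'Default mkfs.xfs geometry differs from the original.\n'
    printf 'Use current defaults (recommended) instead of recreating it? [Y/n] '
    read -r answer
    case "$answer" in
        [Nn]*) recreate_original_geometry "$DEV" ;;   # hypothetical helper
        *)     mkfs.xfs -f "$DEV" ;;
    esac
else
    mkfs.xfs -f "$DEV"
fi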

I'll keep an eye on the upstream developments; if we end up accepting the tiny AG as a tolerable minimum then we can fix this behavior for you. If it's decided that this is a bug in growfs and that the tiny ag is not acceptable going forward, then mkfs isn't going to allow them in the future either, and we're left with ReaR not being allowed to recreate a filesystem geometry that's considered buggy.

If you need a /tolerable/ (but not foolproof) simple suggestion for now, using agsize= is probably reasonable, with the caveat that your resulting filesystem might be slightly smaller than the original.

Comment 19 Pavel Cahyna 2023-06-23 14:23:03 UTC
(In reply to Eric Sandeen from comment #18)

> If you need a /tolerable/ (but not foolproof) simple suggestion for now,
> using agsize= is probably reasonable, with the caveat that your resulting
> filesystem might be slightly smaller than the original.

Do you mean using it as a fallback when mkfs fails with -d agcount? Because above you write

> Using -d agsize instead of -d agcount is relying on undocumented behavior,
> so I would not be eager to suggest that either. 

By the way, why is it undocumented behavior? I see -d agsize described in mkfs.xfs(8).

Comment 20 Eric Sandeen 2023-06-23 18:35:24 UTC
Sure, "-d agsize" is documented, but the details of the heuristics around minimum sizing for the last AG, and whether it will get lopped off, are not described.

agsize and agcount are closely related, of course.  agcount == (disk size) / (agsize) more or less. But there are fiddly bits related to how that last allocation group is handled; essentially the left-over modulo.

I suppose using -d agsize as a fallback when your current -d agcount approach fails might make the most sense for now, yes. It would keep you closest to the original geometry.
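
To make that fallback concrete, a minimal sketch (DEV, AGCOUNT, AGSIZE_BLOCKS and OTHER_OPTS are placeholders for values recorded from the original filesystem; this is not actual ReaR code):

#!/bin/sh
# Sketch only: try to recreate the original geometry with -d agcount first,
# and fall back to -d agsize if mkfs.xfs rejects (or asserts on) that geometry.
if ! mkfs.xfs -f -d agcount="$AGCOUNT" $OTHER_OPTS "$DEV"; then
    echo "mkfs.xfs with agcount=$AGCOUNT failed, retrying with agsize" >&2
    # -b size= must come before -d agsize=...b so the "b" (blocks) unit is
    # defined (see comment 6); 4096 mirrors the example above. The result may
    # be slightly smaller than the original if the undersized last AG is dropped.
    mkfs.xfs -f -b size=4096 -d agsize="${AGSIZE_BLOCKS}b" $OTHER_OPTS "$DEV"
fi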

Comment 21 RHEL Program Management 2023-09-23 11:43:49 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 22 RHEL Program Management 2023-09-23 11:46:23 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates, and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

