Description of problem:

OpenStack packages such as overcloud-full-13.0-20181029.1.x86_64.tar contain small prebuilt qcow2 images that are copied onto storage with qemu-img convert during deployment. They are subsequently grown to use the space available on the installation target. The problem is that the filesystem is created with geometry specific to the initial image, which is suboptimal once it is moved to the real target.

Problems include:

- The allocation group size is chosen for a small device, so growing the filesystem creates an unnecessarily large number of allocation groups. This can lead to performance problems and to file and free space fragmentation.
- The filesystem log is sized for the initial filesystem. For larger filesystems it is too small and becomes a performance bottleneck.

Example

# tar xf ~/overcloud-full-13.0-20181029.1.x86_64.tar
# ll
total 1255388
-rw-r--r--. 1 root root   62455185 Oct 31  2018 overcloud-full.initrd
-rw-r--r--. 1 root root 1216217088 Oct 31  2018 overcloud-full.qcow2
-rw-r--r--. 1 root root      55244 Oct 31  2018 overcloud-full-rpm.manifest
-rw-r--r--. 1 root root     144118 Oct 31  2018 overcloud-full-signature.manifest
-rwxr-xr-x. 1 root root    6635920 Oct 31  2018 overcloud-full.vmlinuz

Create a test image and write the qcow2 image to it, mimicking https://github.com/openstack/ironic-python-agent/blob/stable/queens/ironic_python_agent/shell/write_image.sh

# truncate -s$((468843089*4096)) test.image
# ls -slh test.image
0 -rw-r--r--. 1 root root 1.8T Sep 13 10:55 test.image
# losetup --find --show test.image
/dev/loop0
# qemu-img convert -t directsync -O host_device overcloud-full.qcow2 /dev/loop0
# mount /dev/loop0 /mnt
# xfs_info /mnt
meta-data=/dev/loop0             isize=512    agcount=4, agsize=327500 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=1310000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Grow the filesystem over the space available in the device, simulating later steps.

# xfs_growfs /mnt
meta-data=/dev/loop0             isize=512    agcount=4, agsize=327500 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=1310000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 1310000 to 468843089
# xfs_info /mnt
meta-data=/dev/loop0             isize=512    agcount=1432, agsize=327500 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=468843089, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      1.8T  3.2G  1.8T   1% /mnt

We now have a 1.8TB filesystem with 1432 allocation groups and a 2560 block log. Compare this with a filesystem created by mkfs.xfs for the target device, which has 4 large allocation groups and a 228927 block log.

# truncate -s$((468843089*4096)) test.image
# mkfs.xfs test.image
meta-data=test.image             isize=512    agcount=4, agsize=117210773 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=468843089, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=228927, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Version-Release number of selected component (if applicable):
overcloud-full-13.0-20181029.1.x86_64.tar
overcloud-full-14.0-20190816.1.x86_64.tar
and likely others.

How reproducible:
Always

Steps to Reproduce:
1. Deploy an image on hardware with a large disk, or use a test image as above.
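
The allocation group count reported after the grow can be reproduced from the figures in the xfs_info output. The following is a minimal sketch, assuming xfs_growfs keeps the original agsize and covers the new space with groups of that size, the last one possibly partial. The function name ag_count is made up for this illustration, and all numbers are copied from the report.

```shell
#!/bin/bash
# Figures copied from the xfs_info / mkfs.xfs output in this report.
grown_blocks=468843089   # data blocks after xfs_growfs
image_agsize=327500      # agsize inherited from the qcow2 image
image_log_blocks=2560    # log blocks inherited from the qcow2 image
mkfs_log_blocks=228927   # log blocks chosen by mkfs.xfs for the full device

# Hypothetical helper: ceiling division, i.e. how many allocation groups of
# a fixed size are needed to cover a device (the last group may be partial).
ag_count() {
    local blocks=$1 agsize=$2
    echo $(( (blocks + agsize - 1) / agsize ))
}

echo "agcount after grow: $(ag_count "$grown_blocks" "$image_agsize")"
# Integer division, so the ratio is rounded down.
echo "log size ratio (mkfs / image): $(( mkfs_log_blocks / image_log_blocks ))x"
```

With the numbers above this gives 1432 allocation groups, matching the post-grow xfs_info output, while the log stays at the size picked for the 1310000 block image.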
Could you try increasing the disk size in the nova flavors? This is what we use for the root partition size in ironic. If it helps, we can document it.
Wouldn't that just change the size of the partition that the root image is grown into? In my example I'm simulating a large partition. It's not that the filesystem cannot grow into it, but that the filesystem metadata is restricted by the decisions mkfs.xfs made when creating the original root filesystem in the qcow2 image during the build process. Are root partitions > 1TB outside of the design criteria for these images?
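
To check whether an already deployed node carries the small-image geometry, the relevant fields can be read back from xfs_info. The sketch below is only an illustration: xfs_agcount and xfs_log_blocks are hypothetical helpers, not part of xfsprogs, and they assume the xfs_info text layout shown in this report. The sample text is copied from the post-grow output so the block runs without a mounted XFS filesystem.

```shell
#!/bin/bash
# Hypothetical helpers (not part of xfsprogs): extract geometry fields from
# xfs_info-style text read on stdin.
xfs_agcount() {
    sed -n 's/.*agcount=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

xfs_log_blocks() {
    # Only look at the line that starts the "log" section.
    sed -n '/^log /s/.*blocks=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample lines copied from the post-grow xfs_info output in this report.
sample='meta-data=/dev/loop0 isize=512 agcount=1432, agsize=327500 blks
data = bsize=4096 blocks=468843089, imaxpct=25
log =internal log bsize=4096 blocks=2560, version=2'

echo "agcount:    $(printf '%s\n' "$sample" | xfs_agcount)"
echo "log blocks: $(printf '%s\n' "$sample" | xfs_log_blocks)"
```

On a live system the same helpers could be fed directly, for example xfs_info /mnt | xfs_agcount.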
Oh, I see. Yes, it may be a downside of the image-based approach, and your best bet right now may be creating your own images. We'll discuss what else can be done. If we could configure partitioning, then you could avoid having such a huge root partition (and allocate /opt, /home, /src or whatever you need instead).
The recommendation for this use case is to use whole-disk images. There is no plan to change the partition image formatting.