Bug 2044108 - partitions do not end on 4KiB aligned boundary on 512 logical sector drives
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gdisk
Version: 36
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Terje Røsten
QA Contact: Fedora Extras Quality Assurance
Reported: 2022-01-23 22:38 UTC by Chris Murphy
Modified: 2022-05-07 04:25 UTC (History)

Fixed In Version: gdisk-1.0.9-1.fc35 gdisk-1.0.9-1.fc34 gdisk-1.0.9-1.fc36
Last Closed: 2022-04-21 21:22:00 UTC
Type: Bug

Description Chris Murphy 2022-01-23 22:38:44 UTC
Description of problem:

When partitioning a drive with a single partition, gdisk defaults to ending that partition on the "last usable sector", which often results in the partition not ending on a 4KiB boundary.

As a consequence, 'cryptsetup luksFormat --sector-size 4096' will fail.


Version-Release number of selected component (if applicable):
gdisk-1.0.8-2.fc35.x86_64

How reproducible:
Always if the last sector happens to not be 4096 byte aligned

Steps to Reproduce:
1. partition the drive with 1 partition, accepting all default values (for start and end LBA)

Actual results:

partition 1 end sector is 1465149134


Expected results:

partition 1 end sector should be 1465149127
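
For reference, the expected value comes from rounding the end sector down so that the sector following it lands on a 4KiB boundary. A minimal sketch of that arithmetic, assuming 512-byte logical sectors and an already aligned start; `aligned_end` is a hypothetical helper for illustration, not part of gdisk:

```shell
# Hypothetical helper: round an inclusive end LBA down so that (end + 1)
# is a multiple of `align` sectors. With 512-byte sectors, align=8 is 4096 bytes.
aligned_end() {
    local end=$1 align=$2
    echo $(( (end + 1) / align * align - 1 ))
}

# Default end sector offered by gdisk in this report.
aligned_end 1465149134 8    # prints 1465149127
```

An end sector that is already aligned is returned unchanged.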


Additional info:

[root@fedora ~]# blockdev --getsize64 /dev/vdb
750156374016
[root@fedora ~]# blockdev --getbsz /dev/vdb
4096
[root@fedora ~]# blockdev --getpbsz /dev/vdb
4096
[root@fedora ~]# blockdev --getss /dev/vdb
512

[root@fedora ~]# gdisk /dev/vdb
GPT fdisk (gdisk) version 1.0.8

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries in memory.

Command (? for help): n
Partition number (1-128, default 1): 
First sector (34-1465149134, default = 2048) or {+-}size{KMGTP}: 
Last sector (2048-1465149134, default = 1465149134) or {+-}size{KMGTP}: 
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): 
Changed type of partition to 'Linux filesystem'

Command (? for help): p
Disk /dev/vdb: 1465149168 sectors, 698.6 GiB
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): 0277F358-5ED9-4379-8858-DE97F1D80A0C
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1465149134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      1465149134   698.6 GiB   8300  Linux filesystem

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

[root@fedora ~]# cryptsetup luksFormat --sector-size 4096 /dev/vdb1
WARNING: Device /dev/vdb1 already contains a 'crypto_LUKS' superblock signature.

WARNING!
========
This will overwrite data on /dev/vdb1 irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
Enter passphrase for /dev/vdb1: 
Verify passphrase: 
Device size is not aligned to requested sector size.
[root@fedora ~]# 


The problem is currently less bad than it could be, because luksFormat currently defaults to 512-byte sectors in this case, but it looks like the intent was for 512e drives to get a 4096-byte LUKS sector size by default.
https://fedoraproject.org/wiki/Changes/LUKSEncryptionSectorSize

See bug 2044107

Comment 1 Chris Murphy 2022-01-23 22:41:54 UTC
This is more of a feature request than a bug. I can see the logic of ending at the last usable sector, but since almost all file systems now use 4096-byte block sizes anyway, any trailing 512-byte sectors that don't add up to 4096 bytes won't be used anyway.

Comment 2 Rod Smith 2022-01-24 01:22:27 UTC
I could add an option to do this, but it seems to me that this is more of a bug in cryptsetup than in GPT fdisk. I could find nothing in the UEFI/GPT specification that addresses where partitions should end, although there is such information relating to partition start points (on p. 118 of the UEFI 2.8 specification).

FWIW, I just did some tests, and Linux fdisk does the same thing as gdisk -- it ends a partition occupying the whole disk at the last usable sector, as defined in the GPT data structures. GNU parted ends it early, to align the end point on a 1024KiB boundary.

I'll give some thought to how such an endpoint-alignment feature might be implemented in GPT fdisk. The design philosophy of GPT fdisk is to give the user total control over the disk data structures, within the confines of what's valid in GPT (or sometimes even when it's not, but that is well defined outside of it, as in hybrid MBRs). Restricting end points in all cases would be a break from this design philosophy, and could create problems -- for instance, it would then become impossible to re-create a partition table that had set end points later than an aligned-endpoint requirement would dictate. Thus, to implement this feature, I would need to give users some way to specify that they want an aligned end-point rather than to use all available space, and perhaps to specify what alignment value is required. (4096 bytes is, as you say, common, but it's not guaranteed to be the best option in all situations, especially in the future.)

Comment 3 Chris Murphy 2022-01-24 05:02:40 UTC
It's a good point that cryptsetup/dm-crypt probably needs to ignore or map out the last 512-byte sectors when it's a fraction of 4096 bytes, and report the resulting (reduced) logical block device accordingly. And for that matter, what's the behavior if the underlying partition ends on a 4096 byte boundary at create time, but subsequently is resized and then isn't? Does cryptsetup fail to open such a LUKS device? Seems not very fail safe. I'll start a thread on the cryptsetup list.

Comment 4 Chris Murphy 2022-01-24 14:54:25 UTC
OK so what cryptsetup does by default is if physical sector size is 4096 bytes and the block device starts and ends on 4096 byte boundaries, you get a 4096 byte LUKS sector size. If there's misalignment in any way, including in my case of 7 dangling 512 byte sectors at the end, it falls back to using 512 byte sectors.

The reason it fails in the above case is because I passed --sector-size 4096 explicitly.
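
The default selection described above can be sketched roughly as follows. This only illustrates the behavior as described in this comment and is not cryptsetup's actual code; `pick_luks_sector_size` is a hypothetical name, and its inputs are the physical sector size in bytes plus the partition's start offset and size in bytes:

```shell
# Hypothetical sketch of the described default: use 4096-byte LUKS sectors only
# if the physical sector size is 4096 and the device starts and ends on a
# 4096-byte boundary; otherwise fall back to 512.
pick_luks_sector_size() {
    local pbsz=$1 start_bytes=$2 size_bytes=$3
    if [ "$pbsz" -eq 4096 ] \
        && [ $(( start_bytes % 4096 )) -eq 0 ] \
        && [ $(( size_bytes % 4096 )) -eq 0 ]; then
        echo 4096
    else
        echo 512
    fi
}

# Partition from the description: sectors 2048 through 1465149134 (512-byte sectors).
pick_luks_sector_size 4096 $(( 2048 * 512 )) $(( (1465149134 - 2048 + 1) * 512 ))    # prints 512
```

With the end moved to sector 1465149127, the same call would print 4096.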

So yeah there isn't a bug here per se. But I still think it's suboptimal. The UEFI spec doesn't really address, one way or the other, the fact that it's the backup GPT that causes the misalignment, and only in the last 4096-byte "sector" at the end of the last partition.

Comment 5 Chris Murphy 2022-01-24 23:07:10 UTC
Started a thread on the cryptsetup/dm-crypt list
https://marc.info/?l=dm-crypt&m=164306225923513&w=2

Comment 6 Karel Zak 2022-01-26 13:27:20 UTC
The problem is the last partition on the device. The standard partitions are aligned (by default) to 1MiB, but for the last partition fdisk/parted uses all usable space. That's a problem mostly on GPT, where the backup GPT header follows the last partition, and the size of this "hidden" area does not have to be aligned to physical sectors if the logical sector is 512 bytes.

We can improve it easily. The GPT header specifies the area where it is possible to create partitions (i.e. First and Last usable LBA). We need to align the size of this area to the physical sector size (or optimal I/O, or 1MiB); after that, the last partition will be aligned even if it is created by an arbitrary 3rd-party tool.

I'll prepare a patch for this issue for libfdisk.

Comment 7 Karel Zak 2022-01-27 12:54:47 UTC
(In reply to Karel Zak from comment #6)
> We can improve it easily, GPT header specifies an area where is possible to
> create partitions (aka. First and Last usable LBA), size of this area we
> need to align to physical sector size

This is the ideal solution. Unfortunately, it is unusable for libfdisk, because libfdisk needs to stay
backward compatible with already generated sfdisk dumps/scripts :-(

I have implemented it in another way. If the size of the last partition is not specified then
libfdisk aligns it to 1MiB (or optimal I/O, etc.) boundary.

Simple reproducer:

old version:

 # modprobe scsi_debug dev_size_mb=500 sector_size=512 physblk_exp=3

 # echo ",," | sfdisk --quiet --label gpt /dev/sdc
 # echo "$(lsblk --bytes -no SIZE /dev/sdc1) / (4*1024)" | bc -l
 127739.87500000000000000000
        ^^^^

new version:

 # echo ",," | sfdisk --quiet --label gpt /dev/sdc
 # echo "$(lsblk --bytes -no SIZE /dev/sdc1) / (4*1024)" | bc -l
 127488.00000000000000000000

This change will be available in util-linux v2.38.
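
The two figures above can be checked by hand. A rough sketch of the arithmetic, assuming the 500 MiB scsi_debug device has 1024000 sectors of 512 bytes, the partition starts at sector 2048, the last usable sector is 34 sectors before the end of the device, and the new default simply rounds the end down to the previous 1 MiB boundary (an assumption that happens to reproduce the reported value); variable names are illustrative only:

```shell
total=$(( 500 * 1024 * 1024 / 512 ))    # device size in 512-byte sectors
start=2048                              # default first sector (1 MiB)
last_usable=$(( total - 34 ))           # backup GPT occupies the tail
mib=$(( 1024 * 1024 / 512 ))            # sectors per MiB

# Old behavior: the partition runs to the last usable sector.
old_size=$(( last_usable - start + 1 ))

# Assumed new behavior: end rounded down so (end + 1) is a multiple of 1 MiB.
new_end=$(( (last_usable + 1) / mib * mib - 1 ))
new_size=$(( new_end - start + 1 ))

echo "old: $old_size sectors, $(( old_size % 8 )) sectors past a 4KiB boundary"
echo "new: $new_size sectors, $(( new_size % 8 )) sectors past a 4KiB boundary"
```

Under these assumptions the old size leaves 7 trailing 512-byte sectors (the .875 above), while the new size is exactly 127488 blocks of 4096 bytes.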

Comment 8 Rod Smith 2022-01-27 14:47:23 UTC
Note that Linux fdisk (fdisk, sfdisk, cfdisk; in util-linux package) != GPT fdisk (gdisk, sgdisk, cgdisk; in the gdisk package). This bug report is filed against GPT fdisk, but the patch that Karel Zak describes is to Linux fdisk/util-linux.

Comment 9 Chris Murphy 2022-01-27 17:22:18 UTC
Ending the last partition on a $min-io-size boundary seems reasonable. My understanding is that cryptsetup only supports 512-byte and 4096-byte LUKS2 sector sizes right now, and uses both the physical sector size and the partition's start and end alignment to that physical sector size to determine which LUKS sector size to use. It does not use min-io.

I'm not sure there ever really was a per se advantage to 1 MiB alignment.

Comment 10 Rod Smith 2022-01-28 14:45:11 UTC
FYI, 1 MiB alignment is used as a default because many hardware RAID arrays work best with higher alignment values than the 4 KiB alignment that's optimal for Advanced Format disks. Details differ depending on the RAID hardware, and I haven't checked the details recently, but in the past and IIRC, stripe sizes of 32 KiB to 256 KiB were common, with sizes of up to 512 KiB being used in some cases. 1 MiB is not much greater than 512 KiB, so it gives a little wiggle room should stripe sizes increase in the future without wasting all that much space on modern devices. The consequences of not aligning to RAID stripe size are not as great as the consequences of not aligning to 4 KiB boundaries for Advanced Format disks, but they are measurable.
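
As a quick sanity check of the reasoning above: 1 MiB is an exact multiple of each power-of-two stripe size mentioned, so a 1 MiB-aligned boundary is also aligned for those stripes. A small sketch; `mib_covers` is a hypothetical helper and the list of sizes is an assumption drawn from the ranges mentioned above:

```shell
# Hypothetical helper: succeeds if 1 MiB (1024 KiB) is an exact multiple
# of the given stripe size in KiB.
mib_covers() {
    local stripe_kib=$1
    [ $(( 1024 % stripe_kib )) -eq 0 ]
}

for stripe_kib in 4 32 64 128 256 512; do
    if mib_covers "$stripe_kib"; then
        echo "${stripe_kib} KiB: a 1 MiB boundary is also a stripe boundary"
    else
        echo "${stripe_kib} KiB: NOT covered by 1 MiB alignment"
    fi
done
```

A non-power-of-two size such as 768 KiB would not pass this check, which is consistent with the caveat that some devices have less conventional optimal alignments.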

SSDs also have their own optimal alignment issues, many of which are poorly documented. Most are closer to the 4 KiB alignment that's optimal for Advanced Format disks than the tens or hundreds of KiB that's optimal for RAID arrays. I've heard of some weird ones that are optimized on power-of-three boundaries rather than power-of-two boundaries, but I don't know how common they are. Overall, 1 MiB alignment helps with many SSDs without having to dig into the weeds of drive-by-drive specifics.

There are APIs to determine the optimal alignment for a given device, but when I was investigating this issue for GPT fdisk, those APIs returned unreliable results, so I didn't want to rely on them. They may be more reliable now; I haven't checked in quite a while.

Also, and more directly relevant to this bug report, I've implemented an option to align partition end points; however, I haven't yet uploaded this to the GPT fdisk public git repository because I still need to do more testing on it before unleashing a potentially buggy new feature on the world. It's handled differently, in UI terms, in each of the three GPT fdisk programs. In gdisk, the default end point is adjusted, but can be easily overridden just by entering another value (including the maximum one that's still displayed on the screen). In cgdisk, the partition size is adjusted and can also be overridden, but the user would need to do some arithmetic to do so. In sgdisk, the default remains to not align the end point, but it can be aligned by using a new command-line option ("-I", since all the more logical ones were already in use).

Comment 11 Ben Cotton 2022-02-08 20:27:06 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle.
Changing version to 36.

Comment 12 Fedora Update System 2022-04-16 08:32:34 UTC
FEDORA-2022-14b4ccfa1f has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2022-14b4ccfa1f

Comment 13 Fedora Update System 2022-04-16 08:32:36 UTC
FEDORA-2022-f02bc8d566 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-f02bc8d566

Comment 14 Fedora Update System 2022-04-16 08:32:38 UTC
FEDORA-2022-02a3900f62 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-02a3900f62

Comment 15 Fedora Update System 2022-04-16 17:55:20 UTC
FEDORA-2022-02a3900f62 has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-02a3900f62`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-02a3900f62

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2022-04-17 23:07:51 UTC
FEDORA-2022-14b4ccfa1f has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-14b4ccfa1f`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-14b4ccfa1f

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Fedora Update System 2022-04-17 23:29:18 UTC
FEDORA-2022-f02bc8d566 has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-f02bc8d566`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-f02bc8d566

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2022-04-21 21:22:00 UTC
FEDORA-2022-f02bc8d566 has been pushed to the Fedora 35 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 19 Fedora Update System 2022-05-02 07:30:33 UTC
FEDORA-2022-14b4ccfa1f has been pushed to the Fedora 34 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 20 Fedora Update System 2022-05-07 04:25:29 UTC
FEDORA-2022-02a3900f62 has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

