Bug 1513820

Summary: lsblk -t reports alignment -1 for drive in a USB enclosure, alignment 512 when drive removed
Product: [Fedora] Fedora
Reporter: Chris Murphy <bugzilla>
Component: cryptsetup
Assignee: Milan Broz <gmazyland>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 27
CC: agk, bmarzins, extras-orphan, gmazyland, heinzm, jonathan, kzak, lvm-team, okozina, sbueno, tom, vtrefny
Last Closed: 2018-11-30 23:20:30 UTC
Type: Bug
Attachments:
lsblk -t (flags: none)
lsblk -O (flags: none)
lsusb -v -d (flags: none)

Description Chris Murphy 2017-11-16 03:05:31 UTC
Description of problem:

I have a SATA-to-USB enclosure that's somehow confusing libblockdev. The result is that 'lsblk -t' misreports the ALIGNMENT value for any device mapper volumes (volumes created with cryptsetup or the lvm tools). Regular GPT partitions always have an ALIGNMENT value of 0.

But device mapper volumes have an ALIGNMENT value of -1 if they are created while the drive is in the enclosure, and 0 if they are created while the drive is directly connected to SATA. So it is definitely the enclosure, not the drive, confusing things.

If I create a bunch of device mapper volumes while the drive is in the enclosure (giving them ALIGNMENT values of -1), and then remove the drive and connect it directly to SATA, those same device mapper volumes now have a reported ALIGNMENT value of 512.

Note that I do not have any problem reading or writing to any partition or device mapper volume, regardless of whether those partitions or volumes were created while the drive was (or wasn't) in the enclosure. I can always read data. This is strictly about the reported ALIGNMENT value.

The fallout I noticed is this: I created both a dm-crypt volume and some lvm logical volumes while the drive was in the enclosure, then removed the drive from the enclosure and connected it directly to SATA. When I then tried to format one of the logical volumes with mkfs.xfs, XFS refused to do so:

# mkfs.xfs /dev/mapper/vg-timemachine
warning: device is not properly aligned /dev/mapper/vg-timemachine
Use -f to force usage of a misaligned device




Version-Release number of selected component (if applicable):
lvm2-2.02.175-1.fc27.x86_64
xfsprogs-4.12.0-4.fc27.x86_64
libblkid-2.30.2-1.fc27.x86_64
util-linux-2.30.2-1.fc27.x86_64
kernel 4.13.12-300.fc27.x86_64


How reproducible:
Always when the drive is in this particular enclosure.


Steps to Reproduce:
1. Create a logical volume with lvm tools; or a dm-crypt volume with cryptsetup


Actual results:

lsblk -t reports ALIGNMENT -1 for each dm device, while the drive is in the enclosure.

If the drive is then removed from the enclosure and connected to SATA, I can still mount those volumes normally, without error, but lsblk -t now reports the ALIGNMENT value as 512.

Furthermore, mkfs.xfs reports "warning: device is not properly aligned /dev/mapper/vg-timemachine" and will not format without the -f option.


Expected results:

I expect the ALIGNMENT to always be 0. But I have no idea what this enclosure is reporting that confuses libblockdev or device mapper: whether it's a bug in the enclosure implementation, whether there's a workaround, or whether there's a bug in libblockdev.


Additional info:

I haven't tried this with some other branded enclosures I have, to see if this might be a regression. Those other brand enclosures do not have this problem. But this enclosure is reporting other values from lsblk -t that are suspicious so...

Comment 1 Chris Murphy 2017-11-16 04:48:48 UTC
Created attachment 1353235 [details]
lsblk -t

decoder ring:
At the time this lsblk -t is executed, sda is direct SATA connected, and sdb is in the suspect USB enclosure.

sda2 / brick1 is a dmcrypt (cryptsetup) created volume, created when this drive was in the suspect enclosure. While it was in the enclosure its alignment was -1, but out of the enclosure it's now 512.

On sdb, all of the volumes with alignment -1 were created while the drive was in the enclosure.

Comment 2 Chris Murphy 2017-11-16 06:01:35 UTC
Created attachment 1353249 [details]
lsblk -O

Same state as previously posted lsblk -t.

Comment 3 Chris Murphy 2017-11-16 06:02:35 UTC
Created attachment 1353250 [details]
lsusb -v -d

Info on the suspicious USB enclosure.

Comment 4 Karel Zak 2017-11-16 15:49:05 UTC
lsblk only reads and prints data from /sys. Please try to verify what the kernel thinks about the device (to be sure this is not an lsblk issue):

 $ cat /sys/block/dm-*/alignment_offset

(or use something better than dm-* for the devices).

BTW, if mkfs.xfs returns the same warning, then it's very probably a misaligned device (something wrong with your DM mapping or so...)
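
For anyone who wants the same check in script form, here is a minimal Python sketch that reads the values straight from sysfs, as suggested above. The function name `read_alignment` and its parameters are made up for illustration; the sysfs attribute paths (`alignment_offset` and the `queue/` entries) are the standard block-layer ones. Treat it as a rough helper, not as what lsblk does internally.

```python
from pathlib import Path


def read_alignment(sys_block="/sys/block", pattern="dm-*"):
    """Return {device: {attribute: int or None}} for matching block devices.

    read_alignment, sys_block and pattern are illustrative names only.
    """
    # Attribute paths relative to /sys/block/<device>/
    attributes = {
        "alignment_offset": "alignment_offset",
        "physical_block_size": "queue/physical_block_size",
        "logical_block_size": "queue/logical_block_size",
        "optimal_io_size": "queue/optimal_io_size",
    }
    report = {}
    for dev in sorted(Path(sys_block).glob(pattern)):
        values = {}
        for name, rel in attributes.items():
            try:
                values[name] = int((dev / rel).read_text().strip())
            except (OSError, ValueError):
                values[name] = None  # attribute missing or unreadable
        report[dev.name] = values
    return report
```

Calling `read_alignment()` with no arguments inspects every dm-* device; passing `pattern="sd*"` would look at the underlying disks instead.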

Comment 5 Chris Murphy 2017-11-16 17:08:32 UTC
# cat /sys/block/dm-4/alignment_offset
-1


mkfs.xfs does not complain about block devices with alignment -1, but it does complain about block devices with alignment 512. If I create the dm device with the physical drive in the enclosure, dm devices have alignment -1. If I then remove the physical drive from the enclosure and connect it directly by SATA, previously created dm devices now have alignment 512.

Comment 6 Chris Murphy 2017-11-16 18:00:17 UTC
Also note, from the lsblk attachments, device /dev/sdb2 / brickold. This dmcrypt device was created while the physical drive was directly SATA connected, and at that time the alignment was reported as 0. It was also created with older tools and kernels, circa Fedora 25. But now that it's inside the enclosure, that same dm device is reported to have an alignment of -1.

Comment 7 Chris Murphy 2017-11-17 19:48:11 UTC
Run cryptsetup on partition sdb1 that begins at LBA 2048

# cryptsetup --verbose luksFormat /dev/sdb1
# cryptsetup open /dev/sdb1 fourth

The open command results in these device-mapper messages.

[71180.012338] device-mapper: table: 253:0: adding target device sdb1 caused an alignment inconsistency: physical_block_size=4096, logical_block_size=512, alignment_offset=0, start=33553920
[71180.012497] device-mapper: table: 253:0: adding target device sdb1 caused an alignment inconsistency: physical_block_size=4096, logical_block_size=512, alignment_offset=0, start=33553920


# cryptsetup luksDump /dev/sdb1
...
Payload offset:	65535

OK, so that's not 4KiB aligned. Cryptsetup is doing the wrong thing here by default, as near as I can tell. It looks like the wrong assumptions are being made due to the odd OPT-IO size of 33553920, which is not really a good enough reason to misalign the payload by default.

Fixing this with option '--align-payload 4096' at luksFormat time.
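
The arithmetic behind "not 4KiB aligned" can be checked with a few lines of Python. This is only a sketch: the helper names are invented here, and it assumes the luksDump payload offset is counted in 512-byte sectors.

```python
SECTOR_SIZE = 512  # assumption: payload offset is in 512-byte sectors


def payload_offset_bytes(sectors):
    """Convert a payload offset in sectors to bytes."""
    return sectors * SECTOR_SIZE


def is_aligned(sectors, boundary=4096):
    """True if the payload starts on a multiple of `boundary` bytes."""
    return payload_offset_bytes(sectors) % boundary == 0


print(payload_offset_bytes(65535))  # 33553920, same number as the OPT-IO size
print(is_aligned(65535))            # False
print(is_aligned(4096))             # True
```

So the default payload offset of 65535 sectors is exactly the OPT-IO value expressed in sectors, and it falls short of a 4096-byte boundary, while an offset of 4096 sectors lands on one.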

Comment 8 Milan Broz 2017-11-17 21:04:14 UTC
This seems like garbage in, garbage out... Why is the alignment -1? This looks like a kernel bug, but cryptsetup should handle it anyway...

Comment 9 Chris Murphy 2017-11-17 23:17:18 UTC
Found this upstream thread, and it seems it's confusion resulting from an odd OPT-IO. 

http://www.saout.de/pipermail/dm-crypt/2016-January/004934.html

I have those same values as in the thread, and everything after the dmcrypt layer likewise has -1; even with a luks payload offset of 4096, which is definitely aligned, lsblk -t still reports alignment -1.

I've got enough information to file an upstream cryptsetup bug. But not enough information to file a kernel bug.

Comment 10 Milan Broz 2017-11-18 08:16:15 UTC
lsblk prints information about block devices (parsing it from /sys/block); the LUKS alignment is something different. It is the offset of the data (data payload), and it is visible only in the dm-crypt table (and luksDump, just not it is in 512 sectors).

Anyway, I will fix the -1 handling upstream (cryptsetup). But the whole logic was added to cryptsetup exactly because of these 4k drives, and it worked; there are several tests covering "proper" 4k devices. Unfortunately, it now seems that the kernel reports values that are just bizarre and cannot be trusted...

Still, -1 is wrong in the kernel IMO. I need to reproduce it somehow to check what is causing it.

Comment 11 Milan Broz 2017-11-18 08:17:48 UTC
*note it IS in 512 sector units.
Sorry for typo, ENOCOFFEE :)

Comment 12 Chris Murphy 2017-11-19 22:20:01 UTC
The alignment -1 problem, as well as the cryptsetup luksFormat default payload offset of 65535, applies only to one USB enclosure.

None of the other USB enclosures I have, which are of a different make/model and chipset, behave this way. And when the drive is removed from the "problem" enclosure and directly SATA connected, the problem also doesn't happen.

So it seems there is some information this particular enclosure is handing over that's instigating the problem; and also maybe there's a kernel bug making the wrong assumptions about that information.

Comment 13 Chris Murphy 2017-11-19 22:26:20 UTC
Also I get this kernel message twice, whenever I 'cryptsetup luksOpen' this dmcrypt device on this USB drive.

[256079.214629] device-mapper: table: 253:0: adding target device sdb1 caused an alignment inconsistency: physical_block_size=4096, logical_block_size=512, alignment_offset=0, start=2097152


This message happens despite partition 1 (the only partition) starting at LBA 2048, and LUKS payload offset 4096.
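
To make the numbers in that message easier to compare with the luksDump output, here is a small Python sketch that pulls the key=value fields out of the log line. `parse_fields` is a made-up helper, and treating `start` as a byte offset is an assumption, though it is consistent with both messages in this report (33553920 = 65535 x 512 and 2097152 = 4096 x 512).

```python
import re

# Log line copied from the comment above.
MESSAGE = (
    "device-mapper: table: 253:0: adding target device sdb1 caused an "
    "alignment inconsistency: physical_block_size=4096, "
    "logical_block_size=512, alignment_offset=0, start=2097152"
)


def parse_fields(message):
    """Extract the integer key=value pairs from the message (illustrative)."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(-?\d+)", message)}


fields = parse_fields(MESSAGE)
start_sectors = fields["start"] // fields["logical_block_size"]
print(start_sectors)                                     # 4096
print(fields["start"] % fields["physical_block_size"])   # 0
```

The start value matches the 4096-sector payload offset and is a multiple of the 4096-byte physical block size, which suggests the warning is not triggered by the payload offset itself.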

Comment 14 Milan Broz 2017-11-21 13:17:26 UTC
OK, so -1 is valid; it means that the alignment is undefined. So cryptsetup must support this value.

Please, if you can, could you paste me the "cryptsetup luksFormat --debug ...." output when this happens? We use ioctl, not sysfs, so I want to be sure that it is the same problem.

Comment 15 Milan Broz 2017-11-21 14:44:21 UTC
OK, so I found the problem.

The -1 is actually handled correctly. The problem is that the device reports a sector size of 4096 but an optimal-io size of 33553920 (not a multiple of the sector size).

Cryptsetup now ignores such a bogus setting, fixed upstream in
https://gitlab.com/cryptsetup/cryptsetup/commit/b80278c04f48c86f1c07dba79b4695f539a524e5
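
The idea of ignoring an optimal-io size that is not a multiple of the sector size can be sketched in Python as below. This is not the code from the upstream commit: the function name, the 1 MiB fallback, and the exact conditions are assumptions chosen only to illustrate the kind of sanity check described.

```python
# Assumed fallback alignment (1 MiB), for illustration only.
DEFAULT_ALIGNMENT = 1024 * 1024


def usable_alignment(optimal_io_size, block_size, default=DEFAULT_ALIGNMENT):
    """Return optimal_io_size if it looks sane, otherwise the fallback.

    Hypothetical helper: a value that is unset (<= 0) or not a multiple
    of the block size is treated as bogus and ignored.
    """
    if optimal_io_size <= 0:
        return default
    if optimal_io_size % block_size != 0:
        return default
    return optimal_io_size


print(usable_alignment(33553920, 4096) == DEFAULT_ALIGNMENT)  # True
print(usable_alignment(4 * 1024 * 1024, 4096))                # 4194304
```

With the values from this enclosure (33553920 and 4096) the reported optimal-io size is rejected, so the payload would no longer be placed at sector 65535.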

Comment 16 Ben Cotton 2018-11-27 13:54:35 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 27 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 17 Ben Cotton 2018-11-30 23:20:30 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 18 Tom 2019-01-25 06:44:48 UTC
This may be a problem with USB Attached SCSI (UAS).

See this answer:

https://unix.stackexchange.com/questions/496447/optimal-io-size-is-large-causing-lvm-lv-alignment-inconsistency