Bug 2188785

Summary: F38 installer fails when creating LUKS1 volume on a device with 4K sectors.
Product: [Fedora] Fedora
Component: python-blivet
Version: 38
Hardware: x86_64
OS: Linux
Status: CLOSED RAWHIDE
Severity: medium
Priority: unspecified
Reporter: Daniel Rychcik <daniel>
Assignee: Vojtech Trefny <vtrefny>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Docs Contact:
CC: blivet-maint-list, daniel, dlehman, japokorn, mkolman, rvykydal, vponcova, vtrefny
Target Milestone: ---
Target Release: ---
Whiteboard:
Fixed In Version: python-blivet-3.8.0-1.fc39
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-06-29 13:45:23 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Script to reproduce the issue (flags: none)
  storage.log from failed system (flags: none)
  anaconda.log from failed system (flags: none)
  syslog from failed system (flags: none)

Description Daniel Rychcik 2023-04-22 09:27:29 UTC
When trying to create a LUKS1 volume on a disk with 4K sectors, the Fedora 38 installer fails with: gi.overrides.BlockDev.CryptoError: Invalid extra arguments specified. Only `data_alignment`and `data_device` are valid for LUKS 1.


Reproducible: Always

Steps to Reproduce:
Boot a Kickstart installation that creates a partition with "--luks-version=luks1" on a device with 4K sectors.

See the attached repro.sh script. It creates a minimal netboot Kickstart image and starts a QEMU VM in which the issue is reproduced. You can confirm that the problem is specific to F38 and 4K sectors by tweaking the $VER and $SECTOR_SIZE variables at the top. Note that the %post section of the script is completely optional: it only illustrates the end goal (a single, encrypted / partition spanning the entire drive), but the failure happens at an earlier stage.

Actual Results:  
Installer fails with an exception (see "Additional information", "Steps to reproduce" and attached logs)

Expected Results:  
Installer succeeds and creates a LUKS1 volume


Context: I'm using full-disk encryption without separate /boot and, because of GRUB limitations, this needs LUKS1 for the / partition. See e.g. https://cryptsetup-team.pages.debian.net/cryptsetup/encrypted-boot.html and https://wiki.archlinux.org/title/dm-crypt/Encrypting_an_entire_system.

I use the following entry in Kickstart:

  part / --fstype=ext4 --onpart=/dev/sda1 --encrypted --luks-version=luks1 --passphrase=some_temp_password

This worked for years, on various machines and across multiple Fedora versions, up to F37. When reinstalling to F38 recently, it failed on one machine, with:

  File "/usr/lib/python3.11/site-packages/blivet/formats/luks.py", line 325, in _create
    blockdev.crypto.luks_format(self.device,
  File "/usr/lib64/python3.11/site-packages/gi/overrides/BlockDev.py", line 1060, in wrapped
    raise transform[1](msg)
  gi.overrides.BlockDev.CryptoError: Invalid extra arguments specified. Only `data_alignment`and `data_device` are valid for LUKS 1.

I narrowed it down to this particular system having a root drive with 4K sectors (all others have 512B). This is probably related to https://fedoraproject.org/wiki/Changes/LUKSEncryptionSectorSize although for some reason it did not surface before Fedora 38.
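For anyone who wants to check whether a given machine is affected, here is a minimal sketch that reads the block sizes the kernel exposes under sysfs. The helper name block_size and its sysfs_root parameter are made up for illustration (the parameter exists only so the function can be pointed at a test directory), and whether the installer looks at the logical or the physical size is not something this report establishes.

```python
from pathlib import Path


def block_size(disk: str, kind: str = "logical", sysfs_root: str = "/sys") -> int:
    """Return the block size in bytes that the kernel reports for a disk.

    disk is a kernel device name such as "sda" (no /dev/ prefix).
    kind is "logical" or "physical".
    """
    if kind not in ("logical", "physical"):
        raise ValueError("kind must be 'logical' or 'physical'")
    attr = Path(sysfs_root) / "block" / disk / "queue" / f"{kind}_block_size"
    return int(attr.read_text().strip())


if __name__ == "__main__":
    # "sda" is only an example name; adjust it to the disk being installed to.
    try:
        for kind in ("logical", "physical"):
            print(kind, block_size("sda", kind))
    except FileNotFoundError:
        print("no such disk on this system")
```

A value of 4096 for the root drive would match the failing machine described above; 512 would match the machines that installed fine.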

Looking further at https://github.com/storaged-project/blivet/blob/3.7-release/blivet/formats/luks.py#L308-L332 what I *think* is happening is that, for a 4K drive, the additional 'extra' gets added in #L321 without checking for LUKS version. It is then passed to blockdev.crypto.luks_format(), which rejects it as it only accepts two whitelisted values for LUKS1.

I worked around it by patching luks.py in the installer image, forcing 'extra=None' in that call. I guess the proper solution would be to add a check so that the sector-size-related 'extra' is only added for LUKS2 volumes.
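To illustrate the kind of check I have in mind, here is a rough, self-contained sketch. It is not the actual blivet code or the upstream fix: ExtraOptions and build_luks_extra are hypothetical stand-ins, and I am assuming that a sector size of 0 means "nothing requested".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtraOptions:
    """Hypothetical stand-in for the 'extra' object passed to the format call."""
    sector_size: int


def build_luks_extra(luks_version: str, sector_size: int) -> Optional[ExtraOptions]:
    """Return extra format options, or None when none should be passed.

    The sector size is attached only for LUKS2, so a LUKS1 format request
    never carries an option that LUKS1 would reject.
    """
    # Assumption: 0 means no explicit sector size was requested.
    if luks_version == "luks2" and sector_size:
        return ExtraOptions(sector_size=sector_size)
    return None


if __name__ == "__main__":
    print(build_luks_extra("luks1", 4096))
    print(build_luks_extra("luks2", 4096))
```

With a guard like this, a LUKS1 request on a 4K drive ends up with no extra options at all, which is the same effect as my 'extra=None' hack but limited to the case that actually fails.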

See the attached logs for more details. Note that these are logs from the originally failing machine, which include many unrelated configs and constraints. Logs from repro.sh should be clearer.

Comment 1 Daniel Rychcik 2023-04-22 09:28:57 UTC
Created attachment 1959056 [details]
Script to reproduce the issue

Comment 2 Daniel Rychcik 2023-04-22 09:29:50 UTC
Created attachment 1959057 [details]
storage.log from failed system

Comment 3 Daniel Rychcik 2023-04-22 09:30:37 UTC
Created attachment 1959060 [details]
anaconda.log from failed system

Comment 4 Daniel Rychcik 2023-04-22 09:31:12 UTC
Created attachment 1959061 [details]
syslog from failed system

Comment 5 Vojtech Trefny 2023-04-27 12:52:41 UTC
upstream PR: https://github.com/storaged-project/blivet/pull/1124

updates image for Fedora 38: https://vtrefny.fedorapeople.org/img/rhbz2188785.img