Bug 1776408

Summary: Ambiguous error returned preventing user from understanding the root cause of the error
Product: Red Hat Enterprise Linux 8 Reporter: Renaud Métrich <rmetrich>
Component: systemd Assignee: David Tardon <dtardon>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.1 CC: agk, dtardon, fkrska, jbrassow, mbroz, okozina, prajnoha, sbroz, systemd-maint-list
Target Milestone: rc Keywords: EasyFix, Patch, Reproducer
Target Release: 8.2   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: systemd-239-23.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:45:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Renaud Métrich 2019-11-25 16:03:30 UTC
Description of problem:

When attaching a LUKS device using a key file (e.g. /dev/urandom for swap) while the device is already in use, the _activate_loopaes() function returns an error but prints no message, which prevents the caller (e.g. systemd-cryptsetup) from reporting a proper message.


cryptsetup-2.2.0/lib/setup.c:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
4045 int crypt_activate_by_keyfile_device_offset(struct crypt_device *cd,
 :
4068         r = crypt_keyfile_device_read(cd, keyfile,
4069                                 &passphrase_read, &passphrase_size_read,
4070                                 keyfile_offset, keyfile_size, 0);
4071         if (r < 0)
4072                 goto out;
 :
4074         if (isLOOPAES(cd->type))
4075                 r = _activate_loopaes(cd, name, passphrase_read, passphrase_size_read, flags);
4076         else
4077                 r = _activate_by_passphrase(cd, name, keyslot, passphrase_read, passphrase_size_read, flags);
 :
4079 out:
4080         crypt_safe_free(passphrase_read);
4081         return r;
4082 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Lines 4075 and 4077 above: usually EBUSY will be returned here without explanation.
Line 4068: usually EINVAL will be returned here without explanation.

This leads systemd-cryptsetup to print a nonsensical message:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Failed to activate with key file '/dev/urandom': Device or resource busy
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The EBUSY is on the device, not the key file.
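
To illustrate the distinction, here is a minimal sketch of how a caller could attribute the error to the right object. The helper format_activation_error() is hypothetical (it exists neither in systemd nor in libcryptsetup), and it assumes the return value is a negative errno-style code, as in the excerpt above:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of systemd or libcryptsetup.
 * Blames the data device for -EBUSY and the key file for any other
 * failure. `r` is assumed to be a negative errno-style return value. */
int format_activation_error(int r, const char *device,
                            const char *key_file,
                            char *buf, size_t len) {
        if (r == -EBUSY)
                return snprintf(buf, len, "%s: device is busy", device);

        return snprintf(buf, len,
                        "Failed to activate with key file '%s': %s",
                        key_file, strerror(-r));
}
```

With r == -EBUSY and device "/dev/mapper/rhel-swap", this yields the message suggested under "Expected results" below; other errors keep the current wording.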


Version-Release number of selected component (if applicable):

cryptsetup-2.2.0-2.el8.x86_64


How reproducible:

Always


Steps to Reproduce:
1. Boot a system with unencrypted swap device (/dev/mapper/rhel-swap)
2. Execute systemd-cryptsetup

  # /usr/lib/systemd/systemd-cryptsetup attach swap-enc /dev/mapper/rhel-swap /dev/urandom swap


Actual results:

"
Failed to activate with key file '/dev/urandom': Device or resource busy
"


Expected results:

Some proper error message, e.g. "/dev/mapper/rhel-swap: device is busy"


Additional info:

The issue happens when the admin tries to "convert" a regular swap device into an encrypted swap device after installation.
If the admin didn't deactivate the swap device before setting it up for encryption (KCS https://access.redhat.com/solutions/1121603), the systemd-cryptsetup@<dev>.service unit will fail with "Failed to activate with key file '/dev/urandom': Device or resource busy".
Because the message is so generic, it was hard to find out what was going on.


Additionally, if an invalid key file is submitted (e.g. "/non-existing"), another generic message is seen:

"
Failed to activate with key file '/non-existing': Invalid argument
"

Comment 1 Ondrej Kozina 2019-11-25 16:56:26 UTC
I think the error message should be fixed on the systemd-cryptsetup side. It should interpret the -EBUSY return code from libcryptsetup as referring to the data device, not to the keyfile.

For example, with the cryptsetup CLI I get the following:

[root@machine /]# mount /dev/sdc /mnt/blabla
[root@machine /]# cryptsetup open --type plain /dev/sdc --key-file /dev/urandom sdc_crypt
Cannot use device /dev/sdc which is in use (already mapped or mounted).

Comment 2 Renaud Métrich 2019-11-25 17:05:20 UTC
Indeed, sorry for the noise, everything is fine on cryptsetup's side.
But systemd's cryptsetup_log_glue() is broken: it logs everything at DEBUG level:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
void cryptsetup_log_glue(int level, const char *msg, void *usrptr) {
        log_debug("%s", msg);
}
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Comment 3 Renaud Métrich 2019-11-25 17:24:38 UTC
Probably the cryptsetup_log_glue() function should be modified to store the message somewhere for processing by the caller (only the caller can decide whether there is actually a failure).
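
A rough sketch of that idea, under stated assumptions: struct last_log and cryptsetup_log_store() are hypothetical names, and the context would have to be handed to libcryptsetup as the usrptr argument when the callback is registered (e.g. via crypt_set_log_callback()). The callback has the same signature as cryptsetup_log_glue() and only records the most recent message:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical context, passed to the callback through usrptr. */
struct last_log {
        int level;
        char msg[256];
};

/* Same signature as cryptsetup_log_glue(). Instead of logging right
 * away, it stores the latest message so the caller can decide later
 * whether (and at which level) to print it. */
void cryptsetup_log_store(int level, const char *msg, void *usrptr) {
        struct last_log *l = usrptr;

        if (!l || !msg)
                return;

        l->level = level;
        snprintf(l->msg, sizeof l->msg, "%s", msg);

        /* Cut the stored copy at the first newline, if any. */
        l->msg[strcspn(l->msg, "\n")] = '\0';
}
```

After a failed activation, the caller could then print l.msg at error level, or drop it if the failure turns out to be harmless.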

Additionally, when setting up the swap device with /dev/urandom, it makes no sense to retry the command interactively, as currently happens:

(this is usually run by the generated systemd-cryptsetup unit):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# /usr/lib/systemd/systemd-cryptsetup attach swap-enc /dev/mapper/rhel-swap /dev/urandom swap
Failed to activate with key file '/dev/urandom': Device or resource busy
Please enter passphrase for disk rhel-swap (swap-enc) on swap! 
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

--> That "Please enter passphrase ..." prompt shouldn't appear when the problem is that the device is busy (rather than a problem with the key file).

Comment 4 Renaud Métrich 2019-11-26 08:24:57 UTC
We at least need to backport this:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
commit aa2cc005d77890b07e8c579f25e1333ff8ba8dac
Author: Jan Janssen <medhefgo>
Date:   Mon Jun 25 20:33:31 2018 +0200

    crypt-util: Translate libcryptsetup log level instead of using log_debug()
    
    This makes sure that errors reported by libcryptsetup are shown to the
    user instead of getting swallowed up by log_debug().
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

This commit logs at the appropriate level.
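
For readers without the systemd tree at hand, the general technique looks roughly like the sketch below. It is not copied from the commit: the SKETCH_CRYPT_LOG_* constants are stand-ins for the CRYPT_LOG_* values in <libcryptsetup.h> (their numeric values are assumed here), and the particular mapping to syslog priorities is an assumption as well:

```c
#include <syslog.h>

/* Stand-ins for the CRYPT_LOG_* constants from <libcryptsetup.h>,
 * redefined with assumed values so the sketch builds without the
 * library. */
enum {
        SKETCH_CRYPT_LOG_DEBUG   = -1,
        SKETCH_CRYPT_LOG_NORMAL  = 0,
        SKETCH_CRYPT_LOG_ERROR   = 1,
        SKETCH_CRYPT_LOG_VERBOSE = 2,
};

/* Translate a libcryptsetup-style log level into a syslog priority,
 * so that errors are no longer demoted to debug output. */
int translate_crypt_log_level(int level) {
        switch (level) {
        case SKETCH_CRYPT_LOG_NORMAL:
                return LOG_NOTICE;
        case SKETCH_CRYPT_LOG_ERROR:
                return LOG_ERR;
        case SKETCH_CRYPT_LOG_VERBOSE:
                return LOG_INFO;
        case SKETCH_CRYPT_LOG_DEBUG:
                return LOG_DEBUG;
        default:
                /* Unknown level: report loudly rather than hide it. */
                return LOG_ERR;
        }
}
```

The log callback would then pass the translated priority to the logging function instead of calling log_debug() unconditionally.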

Comment 5 Renaud Métrich 2019-11-28 09:10:43 UTC
The fix should consist of 2 parts:
1. backporting aa2cc005d77890b07e8c579f25e1333ff8ba8dac
2. not asking for a password if the error is "device busy" (there is no point in asking for a password in that case)

So it's not quite an easy fix, but almost :-)
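
Part 2 could be sketched as a small predicate consulted before prompting. The name should_fall_back_to_passphrase() is hypothetical, and treating every error other than -EBUSY as retryable is an assumption made for illustration, not a description of the actual patch:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical predicate: decide whether it makes sense to prompt
 * for a passphrase after key-file activation returned `r`
 * (0 or positive on success, negative errno-style code on failure). */
bool should_fall_back_to_passphrase(int r) {
        if (r >= 0)
                return false;   /* activation succeeded */

        if (r == -EBUSY)
                return false;   /* device in use: a passphrase cannot help */

        return true;            /* assumed: other failures may be key-related */
}
```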

Comment 6 David Tardon 2019-12-12 16:39:58 UTC
PR: https://github.com/systemd-rhel/rhel-8/pull/52

Comment 12 errata-xmlrpc 2020-04-28 16:45:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1794