Bug 2220657 - iSCSI login fails since blivet 3.8.0
Summary: iSCSI login fails since blivet 3.8.0
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python-blivet
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Tomáš Bžatek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: F39FinalBlocker
Reported: 2023-07-05 23:43 UTC by Adam Williamson
Modified: 2023-08-07 06:43 UTC (History)

Fixed In Version: python-blivet-3.8.1-1.fc39
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-07 06:43:15 UTC
Type: ---
Embargoed:


Links:
GitHub storaged-project/blivet pull 1148 (Status: Merged): iscsi ibft and auth algs fixes (Last Updated: 2023-08-01 22:10:54 UTC)

Description Adam Williamson 2023-07-05 23:43:33 UTC
Since blivet 3.8.0 landed in Fedora Rawhide, the iSCSI install test has been failing. This is the log extract:

DEBUG:blivet:discovered iSCSI node: iqn.2016-06.local.domain:support.target1
INFO:anaconda.core.threads:Thread Done: AnaTaskThread-ISCSIDiscoverTask-1 (139932275603136)
DEBUG:dasbus.connection:Publishing an object at /org/fedoraproject/Anaconda/Modules/Storage/Task/3.
INFO:anaconda.core.threads:Running Thread: AnaTaskThread-ISCSILoginTask-1 (139932275603136)
INFO:anaconda.modules.common.task.task:Log into an iSCSI node
WARNING:blivet:iSCSI: could not log into iqn.2016-06.local.domain:support.target1: Failed to call Login method on /org/freedesktop/UDisks2/Manager with ('iqn.2016-06.local.domain:support.target1', 1, '172.16.2.120', 3260, 'default', {'username': <'test'>, 'password': <'weakpassword'>, 'node.startup': <'automatic'>, 'node.session.auth.chap_algs': <'SHA3-256,SHA256,SHA1,MD5'>}) arguments: GDBus.Error:org.freedesktop.UDisks2.Error.Failed: Login failed: initiator reported error (19 - encountered non-retryable iSCSI login failure)

I suspect the cause of this is most likely the change to the algorithms attempted:

https://github.com/storaged-project/blivet/commit/6d219726978b15e904ca7012d68fa8b7a5bd3f26

I found this message from someone who'd tried the same change and found it prevented login working:

https://stackoverflow.com/questions/74808942/linux-iscsi-how-to-change-chap-alg-from-md5

Unfortunately, it's rather hard to test my theory, because it's hard to wedge an updates image into this test (since networking isn't working when the system boots). But the change I'd want to test is to change the list to just be "MD5".
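To make the proposed experiment concrete, here is a minimal Python sketch of how the login options seen in the log above could be assembled with a configurable algorithm list. The option keys are copied from the log; the helper name `build_login_options` and its parameters are hypothetical, and this is not blivet's actual code.

```python
def build_login_options(username, password, chap_algs=("MD5",)):
    """Return a plain dict mirroring the option keys in the failing Login call.

    Hypothetical helper for illustration only; not taken from blivet.
    """
    options = {"node.startup": "automatic"}
    if username:
        options["username"] = username
    if password:
        options["password"] = password
    if chap_algs:
        # The log shows the algorithms passed as one comma-separated string.
        options["node.session.auth.chap_algs"] = ",".join(chap_algs)
    return options


if __name__ == "__main__":
    # The list sent by blivet 3.8.0, per the log extract.
    print(build_login_options("test", "weakpassword",
                              ("SHA3-256", "SHA256", "SHA1", "MD5")))
    # The reduced list proposed for the experiment.
    print(build_login_options("test", "weakpassword"))
```

Keeping the list as a parameter would make it easy to compare the two variants without touching anything else in the call.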

Reproducible: Always

Steps to Reproduce:
1. Set up an iSCSI target using iscsi-initiator-utils - the config we use on the server is:

<target iqn.2016-06.local.domain:support.target1>
    backing-store /dev/vdb
    incominguser test weakpassword
</target>

2. Try and connect to the iSCSI target during install from Fedora-Rawhide-20230630.n.0 or later

Actual Results:
Connection fails

Expected Results:  
Connection should work

Comment 1 Adam Williamson 2023-07-05 23:45:12 UTC
Nominating as a Final blocker per "The installer must be able to detect (if possible) and install to supported network-attached storage devices" - https://fedoraproject.org/wiki/Fedora_39_Final_Release_Criteria#Network_attached_storage .

Comment 2 Adam Williamson 2023-07-06 00:34:31 UTC
Hum, correction: I'd forgotten how we actually set up the server end of the test. We use 'scsi-target-utils', not iscsi-initiator-utils. 'scsi-target-utils' is really https://github.com/fujita/tgt , which...appears to support MD5 and SHA1, nothing more (using its own 14-year old implementations of both, which, yikes). It is unclear to me how it decides which to use, or what it expects to do when it's given this list of algorithms to try...we could file a ticket upstream, perhaps?

Comment 3 Tomáš Bžatek 2023-07-25 09:42:25 UTC
So the blivet commit was added just as a courtesy to fix possible MD5 CHAP auth issues in FIPS mode. I'll have a look at why the other algorithms are not used; however, the preferred algs string was taken from the iscsid.conf sample config.

Comment 4 Adam Williamson 2023-07-25 10:03:06 UTC
It's entirely possible blivet is DTRT and this is actually a quirk of / bug in scsi-target-utils and doesn't really need to block release (since 'real world' iSCSI use presumably doesn't really use it), but I just haven't had the time to get back and give it another look yet, unfortunately. When I get time I will try and poke about the scsi-target-utils code a bit, and also see if it'd be possible to redesign the test to use iscsi-initiator-utils or something else...

Comment 5 Tomáš Bžatek 2023-07-28 13:23:08 UTC
I've built a simple testing environment and so far it looks like tgtd is refusing any auth alg other than SHA1 and MD5. Sadly, instead of falling back to a weaker alg, a failure is reported right away. My testing was done entirely with iscsiadm, outside of blivet and udisks, and I still get the same results.

Steps to reproduce:
 1. Set up a tgtd target as described in comment 0
 2. On initiator side:
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 -o update -n node.session.auth.username -v test
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 -o update -n node.session.auth.password -v weakpassword
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 -o update -n node.session.auth.authmethod -v CHAP
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 -o update -n node.session.auth.chap_algs -v SHA3-256,SHA256,SHA1,MD5
 3. Try logging in:
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 --login 
    Logging in to [iface: default, target: iqn.2016-06.local.domain:support.target1, portal: 192.168.122.1,3260]
    iscsiadm: Could not login to [iface: default, target: iqn.2016-06.local.domain:support.target1, portal: 192.168.122.1,3260].
    iscsiadm: initiator reported error (19 - encountered non-retryable iSCSI login failure)
    iscsiadm: Could not log into all portals
 4. Weaken the auth algs:
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 -o update -n node.session.auth.chap_algs -v SHA1,MD5
    # iscsiadm --mode node --portal 192.168.122.1 --targetname iqn.2016-06.local.domain:support.target1 --login 
    Logging in to [iface: default, target: iqn.2016-06.local.domain:support.target1, portal: 192.168.122.1,3260]
    Login to [iface: default, target: iqn.2016-06.local.domain:support.target1, portal: 192.168.122.1,3260] successful.
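For anyone who wants to script the steps above, here is a minimal Python sketch that builds the same iscsiadm command lines for a given algorithm list. It uses only the iscsiadm options shown in the steps; the function names are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def node_cmd(portal, target, *extra):
    """Build the common `iscsiadm --mode node ...` prefix plus extra arguments."""
    return ["iscsiadm", "--mode", "node", "--portal", portal,
            "--targetname", target, *extra]


def update_cmd(portal, target, name, value):
    """Build one `-o update -n NAME -v VALUE` command for the node record."""
    return node_cmd(portal, target, "-o", "update", "-n", name, "-v", value)


def login_attempt_cmds(portal, target, username, password, chap_algs):
    """Return the four update commands followed by the login command."""
    settings = [
        ("node.session.auth.username", username),
        ("node.session.auth.password", password),
        ("node.session.auth.authmethod", "CHAP"),
        ("node.session.auth.chap_algs", ",".join(chap_algs)),
    ]
    cmds = [update_cmd(portal, target, n, v) for n, v in settings]
    cmds.append(node_cmd(portal, target, "--login"))
    return cmds


if __name__ == "__main__":
    PORTAL = "192.168.122.1"
    TARGET = "iqn.2016-06.local.domain:support.target1"
    for algs in (["SHA3-256", "SHA256", "SHA1", "MD5"], ["SHA1", "MD5"]):
        print("# chap_algs =", ",".join(algs))
        for cmd in login_attempt_cmds(PORTAL, TARGET, "test", "weakpassword", algs):
            print(shlex.join(cmd))
```

To actually run a command, each list could be passed to `subprocess.run(cmd, check=True)` as root on the initiator.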

The target is Gentoo sys-block/tgt-1.0.86

Testing against the kernel LIO target, everything works with the full 'SHA3-256,SHA256,SHA1,MD5' alg set.
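The difference between the two outcomes can be illustrated with a toy model in Python. This is only a sketch of the observed behaviour, not code from tgtd, LIO or libiscsi: the function name `negotiate`, the `rejects_unknown` switch and the assumed algorithm sets are all hypothetical, and it says nothing about which component is actually responsible for the missing fallback.

```python
def negotiate(offered, supported, rejects_unknown):
    """Return the algorithm a toy target would select, or None on login failure.

    offered: algorithm names in the initiator's order of preference.
    supported: set of names the toy target implements.
    rejects_unknown: if True, any unrecognised name in the offer fails the
        login outright (the outcome seen with tgtd); if False, the first
        offered name that the target supports is chosen.
    """
    if rejects_unknown and any(alg not in supported for alg in offered):
        return None
    for alg in offered:
        if alg in supported:
            return alg
    return None


if __name__ == "__main__":
    full = ["SHA3-256", "SHA256", "SHA1", "MD5"]
    tgtd_like = {"SHA1", "MD5"}  # per the tgtd findings in this bug
    lio_like = set(full)         # assumption: the LIO target accepts all four

    print(negotiate(full, tgtd_like, rejects_unknown=True))
    print(negotiate(["SHA1", "MD5"], tgtd_like, rejects_unknown=True))
    print(negotiate(full, lio_like, rejects_unknown=False))
```

In this model the full list fails against the tgtd-like target, while the reduced list succeeds with SHA1.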

Now, since MD5 is disabled in FIPS mode, I think we can live with "SHA1,MD5" as the default in Blivet. It'll still be an improvement over the previous state, where no algs were available in FIPS mode.
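The FIPS argument can be sketched in a few lines of Python. This is a hypothetical illustration, assuming that FIPS mode rules out MD5 and leaves SHA1 usable for CHAP, as the comment above implies; `usable_algs` is not a blivet or open-iscsi function.

```python
def usable_algs(configured, fips_enabled):
    """Return the configured algorithms that remain usable.

    Assumption for illustration: FIPS mode removes MD5 and nothing else.
    """
    if not fips_enabled:
        return list(configured)
    return [alg for alg in configured if alg != "MD5"]


if __name__ == "__main__":
    # Previous behaviour (MD5 only): nothing is left in FIPS mode.
    print(usable_algs(["MD5"], fips_enabled=True))
    # Proposed default: SHA1 is still available in FIPS mode.
    print(usable_algs(["SHA1", "MD5"], fips_enabled=True))
```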

Comment 6 Tomáš Bžatek 2023-07-28 13:39:02 UTC
FWIW, my testing environment is based on QEMU + iPXE with a nice EFI-based iSCSI initiator configuration. Testing against both the tgtd and LIO exporting a single LUN with Fedora 38 netinst ISO, I can make it boot with IBFT on both targets but CHAP auth only works against LIO. Haven't found a way to define CHAP auth algs in the iPXE setup, I suppose it's facing the same problem as above.

Comment 7 Tomáš Bžatek 2023-07-28 15:33:43 UTC
Proposed blivet changes: https://github.com/storaged-project/blivet/pull/1148

Comment 8 Adam Williamson 2023-07-28 16:34:09 UTC
it seems a shame to drop the SHA256 support just because tgtd doesn't allow a proper fallback. Is tgtd usage for this in the real world common? I honestly have no idea; as I said I just picked it for the openQA testing as it was there and I could get it to do the job. I'm not stuck on using it for openQA, if there's an alternative.

If we're sufficiently worried about real-world use of tgtd with this bug, then okay, makes sense (and makes my life easier). If not, we can look at fixing tgtd/libiscsi to allow a proper fallback, or switching to something else for the openQA tests?

Comment 9 Adam Williamson 2023-07-28 16:34:47 UTC
btw, thanks a lot for testing this out and confirming the details! I really appreciate when somebody else does that :D

Comment 10 Tomáš Bžatek 2023-07-31 09:58:35 UTC
(In reply to Adam Williamson from comment #8)
> it seems a shame to drop the SHA256 support just because tgtd doesn't allow
> a proper fallback. Is tgtd usage for this in the real world common? I
> honestly have no idea; as I said I just picked it for the openQA testing as
> it was there and I could get it to do the job. I'm not stuck on using it for
> openQA, if there's an alternative.
> 
> If we're sufficiently worried about real-world use of tgtd with this bug,
> then okay, makes sense (and makes my life easier). If not, we can look at
> fixing tgtd/libiscsi to allow a proper fallback, or switching to something
> else for the openQA tests?

Well, it's udisks2 in the background doing the heavy lifting through a downstream libiscsi that no one else ships. It's known to be buggy (I'm debugging yet another segfault right now), leaky and unsupported. We do plan to improve libopeniscsiusr, which is part of the open-iscsi project, and port udisks onto it. So there's a high chance that the auth fallback might work afterwards. Until then, I suggest keeping the auth alg list minimal for compatibility.

I still think MD5 alone was used until the point we started making changes. In any case, if you do plan some FIPS testing, you'd find out soon enough.

Comment 11 Adam Williamson 2023-07-31 19:09:40 UTC
that makes sense. yes, it was MD5 only before, so SHA1 then MD5 is still an improvement, just SHA256 would be nicer :D but given the info on libiscsi, I agree it makes sense to just go with SHA1->MD5 for now and re-evaluate after the port is done.

Comment 12 Adam Williamson 2023-08-01 22:10:54 UTC
This is merged now, so marking POST.

Comment 13 Tomáš Bžatek 2023-08-03 14:20:26 UTC
Included in python-blivet-3.8.1-1.fc39:

* Thu Aug 03 2023 Vojtech Trefny <vtrefny> - 3.8.1-1
- Ignore new false positives with the latest pylint (vtrefny)
- iscsi: Rename storaged to udisks (tbzatek)
- iscsi: Rework UDisks iscsi module activation (tbzatek)
- iscsi: Make sure to modprobe iscsi_ibft (tbzatek)
- iscsi: Downgrade default CHAP auth algs to SHA1,MD5 (tbzatek)
- iscsi: Save firmware initiator name to /etc/iscsi/initiatorname.iscsi (vtrefny)
- spec: Bump release to 99 to be always ahead of Fedora in nightly (vtrefny)
- tests: Improve iscsi_test.ISCSITestCase (vtrefny)
- Make sure that LUKS.has_key always returns a boolean value (vtrefny)
- Squashed 'translation-canary/' changes from d6a40985..5bb81253 (vtrefny)
- Add btrfs subvolume specification to devicetree.resolve_device (vtrefny)
- Revert "Makefile cleanup" (vtrefny)

A couple more iscsi-related changes are included in that build, feel free to report any discrepancies.

