Bug 2348878 - Interaction between nbd, mdadm, and SELinux causes I/O errors
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: nbd
Version: 41
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Pablo Greco
QA Contact: Fedora Extras Quality Assurance
Reported: 2025-02-27 20:44 UTC by Felix Howe
Modified: 2025-12-02 02:28 UTC (History)


Attachments
full dmesg output during an occurrence of the bug (4.10 KB, text/plain)
2025-02-27 20:48 UTC, Felix Howe

Description Felix Howe 2025-02-27 20:44:00 UTC
SELinux interferes with nbd-client devices in a way that causes a protocol error and makes every read from the nbd device fail, but only when the nbd device contains an mdadm software RAID member. No SELinux denials are logged, and no SELinux messages of any kind appear in dmesg or journalctl during or around the failure, which makes the problem hard to diagnose. Running "setenforce 0" prevents the issue from occurring.
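
One possible reason for a denial that leaves no trace in the logs is a dontaudit rule in the policy, which silences matching AVC denials. That is an assumption, not something confirmed in this report. A minimal sketch of how it could be checked: "semodule -DB" rebuilds the policy with dontaudit rules disabled, "ausearch" lists recent AVC records, and "semodule -B" restores the default. The run helper and the DRY_RUN variable are illustrative only; by default the script just prints the commands, since they need root and temporarily change the loaded policy.

```shell
#!/bin/sh
# Sketch: surface SELinux denials that dontaudit rules may be hiding.
# DRY_RUN and run() are illustrative helpers, not part of any tool.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"      # only show what would be executed
    else
        "$@"
    fi
}

check_hidden_denials() {
    run semodule -DB                          # rebuild policy with dontaudit rules disabled
    run nbd-client localhost                  # repeat the failing step from the report
    run ausearch -m AVC,USER_AVC -ts recent   # AVC records from the last 10 minutes
    run nbd-client -d /dev/nbd0               # disconnect again
    run semodule -B                           # rebuild policy with dontaudit rules restored
}

check_hidden_denials
```

Run it with DRY_RUN=0 as root to execute the commands instead of printing them.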

Reproducible: Always

Steps to Reproduce:
Run the following sequence of commands on any Fedora host (tested on a fresh install of 41):

dnf install nbd
truncate -s 256M /tmp/testfile
nbd-server localhost:10809 /tmp/testfile
nbd-client localhost
mdadm --create -l 1 -n 2 /dev/md/testarray /dev/nbd0 missing
mdadm --stop /dev/md/testarray
nbd-client -d /dev/nbd0
nbd-client localhost

Actual Results:  
After the mdadm --create command, the array appears to function normally and is readable. However, once the nbd device has been disconnected and reconnected, every attempt to use it fails, even if mdadm is not involved. The presence of an mdadm signature on the device appears to be enough to trigger the problem, whether or not the array is used.

After the final command above, the following appears in dmesg:

block nbd0: Send control failed (result -13)
block nbd0: Request send failed, requeueing
block nbd0: Receive control failed (result -32)

(for the full set of dmesg errors, see attachments)
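
The negative numbers in these messages are kernel error codes. On Linux, 13 is EACCES ("Permission denied") and 32 is EPIPE ("Broken pipe"), so the first line is consistent with a permission check rejecting the send, and the third with the connection then being torn down. That reading is an interpretation, not a confirmed diagnosis. A small helper to decode the codes seen here (the name errno_name is illustrative):

```shell
errno_name() {
    # Accept either "-13" or "13"; kernel logs print the negated errno.
    code="${1#-}"
    case "$code" in
        13) echo "EACCES (Permission denied)" ;;
        32) echo "EPIPE (Broken pipe)" ;;
        *)  echo "unknown ($code): see errno(3) or /usr/include/asm-generic/errno-base.h" ;;
    esac
}

errno_name -13
errno_name -32
```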


Expected Results:  
Either:

- The mdadm array reassembles automatically after udev is triggered by the block device appearing, and the mdadm array is then readable as usual.

or

- The block device becomes available, and subsequent reads (such as with blkid, or any other utility that issues basic block device read operations) succeed, and the mdadm device can be assembled and started manually.

or

- SELinux logs which operation it is denying before it causes the nbd device to fail. If blocking the combination of mdadm and nbd is the correct behaviour, it should also be applied consistently, so that the array cannot be started when mdadm first creates it either.

Note that the problem is on the nbd-client side, not the server. The behaviour described above occurs even if nbd-server is running on an entirely different machine under a different Linux distribution. When I first encountered this bug, I was serving from an Arch Linux live ISO, and I assumed the problem was a version/protocol mismatch until I reduced the test case to the minimal local one above.

Also, if a Fedora machine is the nbd-server, a different machine running a different distribution (or a different Fedora machine on which SELinux is disabled) is able to use the served device with mdadm without problems.

I have not noticed any other problems with nbd-client when using other block device layers on nbd devices -- e.g. LVM appears to function normally. I have not tried LVM RAID, but creating an mdadm array on top of an LVM logical volume also triggers this bug.

I have not tested this with nbdkit or qemu-nbd.

Comment 1 Felix Howe 2025-02-27 20:48:33 UTC
Created attachment 2078112 [details]
full dmesg output during an occurrence of the bug

Comment 2 Fedora Admin user for bugzilla script actions 2025-03-02 01:40:35 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 3 Fedora Admin user for bugzilla script actions 2025-03-05 13:56:00 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 4 Ming Lei 2025-09-10 09:10:12 UTC
> [   48.518011] block nbd0: Send control failed (result -13)        <-----
> [   48.518030] block nbd0: Request send failed, requeueing         <-----

There was one requeue bug fixed by:

8337b029f788 ("nbd: fix partial sending")

which was merged in v6.14.

But you did not provide your kernel info...
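
The missing information could be gathered on the affected client with something like the following. This is only a sketch: the package list is a guess at what is relevant to this bug, and the function name collect_versions is illustrative.

```shell
collect_versions() {
    echo "kernel: $(uname -r)"
    # Package names are an assumption about what matters here;
    # rpm -q reports any that are not installed.
    if command -v rpm >/dev/null 2>&1; then
        rpm -q nbd mdadm selinux-policy 2>&1
    fi
    # getenforce prints Enforcing, Permissive, or Disabled.
    if command -v getenforce >/dev/null 2>&1; then
        echo "selinux: $(getenforce)"
    fi
    return 0
}

collect_versions
```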

Comment 5 Adam Williamson 2025-12-02 02:28:39 UTC
This message is a reminder that Fedora Linux 41 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 41 on 2025-12-15.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '41'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 41 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

