Bug 1380272 - [RFE] blacklist USB devices in multipath.conf due to firmware updates
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.0.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.5.0
Target Release: ---
Assignee: Nobody
QA Contact: Avihai
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-29 07:31 UTC by Martin Tessun
Modified: 2020-10-07 10:04 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Storage
Target Upstream Version:
ebenahar: testing_plan_complete-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1016535 | 1 | None | None | None | 2021-01-20 06:05:38 UTC
Red Hat Knowledge Base (Solution) | 2671581 | 0 | None | None | None | 2016-09-29 07:51:02 UTC
oVirt gerrit | 90518 | 0 | 'None' | ABANDONED | multipath: blacklist local devices | 2021-01-21 11:33:51 UTC
oVirt gerrit | 94693 | 0 | 'None' | MERGED | multipath: Blacklist obsolete and local devices | 2021-01-21 11:33:51 UTC
oVirt gerrit | 95981 | 0 | 'None' | ABANDONED | multipath: Blacklist obsolete and local devices | 2021-01-21 11:33:11 UTC

Internal Links: 1016535

Comment 2 Martin Tessun 2016-10-02 17:03:57 UTC
1. What is the nature and description of the request?
Add a blacklist to multipath.conf in vdsm so that USB / virtual USB devices are blacklisted.

2. Why does the customer need this? (List the business requirements here)
Currently a firmware update is not possible unless multipath is stopped or the USB device is blacklisted manually, because the firmware update process tries to access the device directly, which is no longer possible once the device is used by multipath.

3. How would the customer like to achieve this? (List the functional requirements here)
Add a blacklist to multipath.conf that blacklists USB devices.

4. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
Plug in a USB device and check whether the device is configured with multipath.

5. Is there already an existing RFE upstream or in Red Hat Bugzilla?
Not that I am aware of.

6. Does the customer have any specific timeline dependencies and which release would they like to target (i.e. RHEL5, RHEL6)?
RHV 4.1; ideally also a backport to RHEV 3.6.

7. Is the sales team involved in this request and do they have any additional input?
No

8. List any affected packages or components.
multipath / vdsm

9. Would the customer be able to assist in testing this functionality if implemented?
Yes

Some additional info: on DELL systems with iDRAC the following multipath.conf worked, but it is specific to iDRAC, so a more generic approach is probably needed:

~~~
# VDSM REVISION 1.3
# VDSM PRIVATE

defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]*"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    devnode "^asm/*"
    devnode "ofsctl"

    device {
        vendor "DELL"
        product "IDSDM"
    }
    device {
        vendor "iDRAC"
        product "MAS022"
    }
    device {
        vendor "iDRAC"
        product "SECUPD"
    }
}

# Remove devices entries when overrides section is available.
devices {
    device {
        # These settings override built-in devices settings. They do not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs                yes
        no_path_retry           fail
    }
}

# Enable when this section is available on all supported platforms.
# Options defined here override device specific options embedded into
# multipathd.
#
# overrides {
#      no_path_retry           fail
# }

~~~
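
To see which devices the `device` entries above would catch, the matching can be approximated in Python. This is only a sketch: it assumes multipath treats `vendor` and `product` as unanchored regular expressions that must both match, and it uses Python's `re` module in place of the POSIX regex engine. The helper name `is_blacklisted` and the sample vendor/product strings are made up for illustration.

```python
import re

# Vendor/product pairs taken from the blacklist section above.
BLACKLISTED_DEVICES = [
    ("DELL", "IDSDM"),
    ("iDRAC", "MAS022"),
    ("iDRAC", "SECUPD"),
]


def is_blacklisted(vendor, product, entries=BLACKLISTED_DEVICES):
    """Return True if any entry matches both the vendor and the product.

    Assumption: patterns are unanchored regular expressions, so re.search
    is used rather than re.fullmatch.
    """
    return any(
        re.search(v, vendor) and re.search(p, product)
        for v, p in entries
    )


if __name__ == "__main__":
    # Sample strings only; real values are reported by the device itself.
    for vendor, product in [("iDRAC", "SECUPD"), ("NETAPP", "LUN")]:
        print(vendor, product, is_blacklisted(vendor, product))
```

Because every entry names one vendor/product pair, any device that reports a different pair is not matched, which is why this list is specific to iDRAC.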

Comment 4 Nir Soffer 2017-12-12 17:37:09 UTC
The suggested blacklist looks reasonable; I think we should consider adding it to the
4.2 multipath.conf.

Comment 5 Nir Soffer 2017-12-12 17:44:28 UTC
Ben, do you think the suggested blacklist in comment 2 will block all local devices
on a host?

I think any local SCSI devices (SATA, SAS, etc.) will not be blacklisted.

Comment 6 Ben Marzinski 2017-12-12 21:18:28 UTC
(In reply to Nir Soffer from comment #5)
> Ben, do you think the suggested blacklist in comment 2 will block all local
> devices
> on a host?
> 
> I think any local SCSI devices (SATA, SAS, etc.) will not be blacklisted.

Yeah, this will still allow local scsi devices. A better answer would be to use something like

blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
}

Which uses the "property" blacklist parameter, from Bug 1456955.  That bug was added in response to the RHEV Bug 1016535.  Are you just looking for a method that doesn't require waiting for RHEL-7.5?

Comment 7 Nir Soffer 2017-12-12 23:07:20 UTC
(In reply to Ben Marzinski from comment #6)
> Are you just looking for a
> method that doesn't require waiting for RHEL-7.5?

Thanks Ben,

We have bug 1016535 about blacklisting local scsi devices, and we will wait for
7.5 to resolve it. 

I want to resolve this bug, which is about multipath grabbing USB devices, see
comment 2.

Do you think the blacklist suggested in comment 2 can be harmful in any way to
shared storage using iSCSI or FC?

Comment 8 Ben Marzinski 2018-01-02 19:39:42 UTC
(In reply to Nir Soffer from comment #7)
> (In reply to Ben Marzinski from comment #6)
> > Are you just looking for a
> > method that doesn't require waiting for RHEL-7.5?
> 
> Thanks Ben,
> 
> We have bug 1016535 about blacklisting local scsi devices, and we will wait
> for
> 7.5 to resolve it. 
> 
> I want to resolve this bug, which is about multipath grabbing USB devices,
> see
> comment 2.
> 
> Do you think the blacklist suggested in comment 2 can be harmful in any way
> to
> shared storage using iSCSI or FC?

No. It looks fine to me. I've never used DELL IDSDM, but from what I've read, it doesn't sound like it needs (or wants) multipath to handle failovers.

Comment 10 Yaniv Lavi 2018-02-26 09:43:01 UTC
Any updates on this, can we post a patch to test with RHEL 7.5?

Comment 11 Klaas Demter 2018-03-14 16:00:35 UTC
For iDRAC firmware updates with dell dsu I needed to add another blacklist entry:
blacklist {
[...]
        device {
                vendor "iDRAC"
                product "scrtch"
        }
}

Greetings
Klaas

Comment 12 Klaas Demter 2018-03-28 10:29:07 UTC
https://github.com/dell/DSU/issues/2

Comment 13 Nir Soffer 2018-04-08 15:29:51 UTC
(In reply to Yaniv Lavi from comment #10)
> Any updates on this, can we post a patch to test with RHEL 7.5?

Yes.

Comment 14 Yaniv Kaul 2018-04-24 11:03:05 UTC
This is not a blocker nor an exception, yet it is targeted for 4.2.3?

Comment 15 Nir Soffer 2018-05-03 10:48:21 UTC
(In reply to Ben Marzinski from comment #6)
> (In reply to Nir Soffer from comment #5)
> > Ben, do you think the suggested blacklist in comment 2 will block all local
> > devices
> > on a host?
> > 
> > I think any local SCSI devices (SATA, SAS, etc.) will not be blacklisted.
> 
> Yeah, this will still allow local scsi devices. A better answer would be to
> use something like
> 
> blacklist_exceptions {
>         property "(SCSI_IDENT_|ID_WWN)"
> }
> 
> Which uses the "property" blacklist parameter, from Bug 1456955.  That bug
> was added in response to the RHEV Bug 1016535.

Ben, the special behavior of blacklist_exceptions described in bug 1456955:

    The blacklist_exceptions option is special. If an exceptions option is set,
    but no blacklist option, it is assumed that all devices are blacklisted,
    except the ones that have udev environment variables that match the exception
    option.

is not documented, at least in Fedora 27:

       blacklist        This section defines which devices should be excluded
                        from the multipath topology discovery.

       blacklist_exceptions
                        This section defines which devices should be included
                        in the multipath topology discovery, despite being listed
                        in the blacklist section.

    ...
    
    blacklist section

       The  blacklist  section  is  used  to exclude specific device from
       inclusion in the multipath topology. It is most commonly used to exclude
       local disks or LUNs for the array controller.

       The following keywords are recognized:

       devnode          Regular expression of the device nodes to be excluded.

                        The default is: ^(ram|raw|loop|fd|md|dm-|sr|scd|st|dcssblk)[0-9]
                        and ^(td|hd|vd)[a-z]

       wwid             The World Wide Identification of a device.

       property         Regular expression of the udev property to be excluded.

       device           Subsection for the device description. This subsection
                        recognizes the vendor and product keywords. For a full
                        description  of  these  keywords  please  see  the
                        devices section description.

    blacklist_exceptions section

       The  blacklist_exceptions  section  is  used  to revert the actions of the
       blacklist section. For example to include specific device in the multipath
       topology. This allows one to selectively include devices which would
       normally be excluded via the blacklist section.

       The following keywords are recognized:

       devnode          Regular expression of the device nodes to be whitelisted.

       wwid             The World Wide Identification of a device.

       property         Regular expression of the udev property to be whitelisted.

       device           Subsection for the device description. This subsection
                        recognizes the vendor and product keywords. For a full
                        description  of  these  keywords  please  see  the devices
                        section description.

       The  property  blacklist  and  whitelist handling is different from the
       usual handling in the sense that the whitelist has to be set, otherwise
       the device will be blacklisted. In these cases the message blacklisted,
       udev property missing will be displayed.


So we cannot use this option before it is documented properly, at least upstream.

Also, from the manual and the description in bug 1456955, it is not clear what
blacklist_exceptions with `property "(SCSI_IDENT_|ID_WWN)"` will do when a
blacklist option *is* specified.

For example - does the following configuration make sense?

    blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"

        device {
            vendor "DELL"
            product "IDSDM"
        }
    }

    blacklist_exceptions {
         property "(SCSI_IDENT_|ID_WWN)"
    }

Or do we have to use:

    # WARNING: Do not specify blacklist option. Specifying blacklist option will
    # break blacklisting of local devices using blacklist_exceptions.

    blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
    }

Comment 16 Ben Marzinski 2018-05-03 17:54:36 UTC
I agree, this description is not very helpful:
----
The  property  blacklist  and  whitelist handling is different from the usual handling in the sense that the whitelist has to be set, otherwise the device will be blacklisted. In these cases the message blacklisted, udev property missing will be displayed.
----

Perhaps a better description would be the one from RHEL7:
----
The property blacklist and whitelist handling is different from the usual
handling in the sense that if the whitelist is set, it has to match, otherwise
the device will be blacklisted. In these cases the message blacklisted, udev
property missing will be displayed. For example setting the property
blacklist_exception to (SCSI_IDENT_|ID_WWN) will blacklist all devices that
have no udev property whose name regex matches either SCSI_IDENT_ or ID_WWN.
This works to exclude most non-multipathable devices.
----

This got lost when I ported the patch over to Fedora. Sorry. I should really
remove all mention of the blacklist section, because in reality the code
doesn't treat the property blacklist option any differently from any other blacklist option. So either of your above configurations will correctly use the property blacklist_exceptions.

I should point out Red Hat's implementation for this is currently slightly different from upstream's.  Right now, upstream defines a default property blacklist_exception of

property "(SCSI_IDENT_|ID_WWN)"

This is not changeable by users. Users can only add other properties to be checked for.  This means that in the upstream code, devices that don't have either of those udev properties will not be multipathed by default.  In RHEL 7, the property option simply exists, with no default.  Starting in f27, the default multipath.conf file generated by mpathconf will include

blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
}

If new users generate their multipath.conf file with mpathconf, and want to multipath devices that don't have these udev properties, they can remove that
line.  Users who simply upgrade and keep their existing multipath.conf file will continue to multipath these devices.

I will eventually be dropping the patch that removes this built-in default from the redhat code, so we work the same as upstream.  After that, if people want to multipath these devices, they will need to manually edit their multipath.conf file to add another property name, which does exist in the udev database for these devices, to the blacklist_exceptions section of multipath.conf. But that should only be necessary for people who do things like use multipath to get queue_if_no_path behavior on their flaky USB device (which is a thing that some people do).

So, I will send a patch upstream, clarifying the man page, and I will update the man page in the redhat patch in RHEL-8, rawhide and f28 to let users know that the property blacklist_exception difference only takes effect if a value is set, and to give users instructions on setting up a reasonable default, like it does in the RHEL7 man page.

Does this work for you?
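
The rule quoted from the RHEL7 man page can be written down as a small Python sketch: once a property exception is set, a device is blacklisted unless at least one of its udev property names matches the expression. This assumes unanchored matching on property names only; `blacklisted_by_property` and the sample property sets are hypothetical, not taken from multipath or from a real host.

```python
import re

# The exception expression discussed above.
PROPERTY_EXCEPTION = r"(SCSI_IDENT_|ID_WWN)"


def blacklisted_by_property(udev_props, exception=PROPERTY_EXCEPTION):
    """Return True if no udev property name matches the exception regex."""
    return not any(re.search(exception, name) for name in udev_props)


if __name__ == "__main__":
    # Hypothetical udev property sets, for illustration only.
    san_lun = {"ID_WWN": "0x600a0b80001234", "ID_BUS": "scsi"}
    usb_stick = {"ID_BUS": "usb", "ID_USB_DRIVER": "usb-storage"}
    print(blacklisted_by_property(san_lun))    # False
    print(blacklisted_by_property(usb_stick))  # True
```

Only the property names are examined here; the values play no part in the decision.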

Comment 17 Nir Soffer 2018-05-04 17:24:55 UTC
Yes, having blacklist_exceptions documented upstream should be good enough.

But property "(SCSI_IDENT_|ID_WWN)" is not strict enough for blacklisting local
devices; it is only good enough for blacklisting anything that should not be
multipathed.

See https://bugzilla.redhat.com/show_bug.cgi?id=1016535#c38

Comment 18 Fred Rolland 2018-05-14 11:34:17 UTC
The proposed solution for this bug is too specific to some hardware.

A better solution should be using udev properties as in:
https://bugzilla.redhat.com/show_bug.cgi?id=1016535

I am moving this one to depend on the above, so it will be verified.

Meanwhile, users can add a blacklist filter for specific hardware with a drop-in multipath configuration file.
For example:

$ cat /etc/multipath/conf.d/my-filter.conf
blacklist {
    device {
        vendor "DELL"
        product "IDSDM"
    }
}
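
When several vendor/product pairs are involved, the same kind of drop-in fragment could be generated instead of typed by hand. The following is a minimal Python sketch; `render_blacklist` is a hypothetical helper, and writing the result under /etc/multipath/conf.d/ is left to the administrator.

```python
def render_blacklist(devices):
    """Render a multipath blacklist section for (vendor, product) pairs."""
    lines = ["blacklist {"]
    for vendor, product in devices:
        lines += [
            "    device {",
            f'        vendor "{vendor}"',
            f'        product "{product}"',
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the single-device example shown above.
    print(render_blacklist([("DELL", "IDSDM")]), end="")
```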

Comment 19 Yaniv Kaul 2018-06-07 10:12:42 UTC
Is this on track for 4.2.4?

Comment 20 Fred Rolland 2018-06-07 14:11:33 UTC
A better solution is to be able to filter according to transport type as suggested here : https://bugzilla.redhat.com/show_bug.cgi?id=1016535#c42

I suggest to postpone to 4.3

Comment 21 Yaniv Lavi 2018-06-11 05:45:55 UTC
(In reply to Fred Rolland from comment #20)
> A better solution is to be able to filter according to transport type as
> suggested here : https://bugzilla.redhat.com/show_bug.cgi?id=1016535#c42
> 
> I suggest to postpone to 4.3

We need to solve the RHHI use case for 4.2.z at least.
Let's consider options to fix multipath for that use case.

Comment 22 Sandro Bonazzola 2018-07-20 09:17:29 UTC
This bug is targeted to 4.3.0 and is blocking bug #1177771, which is in modified state for 4.2.5.
Either stop blocking bug #1177771 with this bug, or ensure both bugs target the same version.

Comment 23 Nir Soffer 2018-11-19 13:49:21 UTC
We don't have a generic solution for blacklisting local devices, so we need to
continue with this bug.

Comment 24 Sandro Bonazzola 2019-01-28 09:43:34 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 27 Nir Soffer 2019-07-03 23:55:02 UTC
Ben, do we have any way to blacklist USB devices, at least some of them, but
without blocking anything which is not USB?

Looking in protocol, I don't see anything about USB. Do we have some udev property
that is used only by USB devices?

Comment 28 Ben Marzinski 2019-07-05 17:10:04 UTC
The udev builtin program usb_id should be run for removable USB storage. Unfortunately it doesn't always output a udev environment variable that you can use to blacklist USB devices. But for many USB devices it should add at least one of

ID_USB_INTERFACES
ID_USB_INTERFACE_NUM
ID_USB_DRIVER

to the list of udev properties, that you could use to blacklist by. If that doesn't work, then you could add your own udev property.  If you created a udev rule named something like

/etc/udev/rules.d/61-usb-devs.rules

with the following rule

KERNEL=="sd*[!0-9]", ENV{ID_SERIAL}=="?*", SUBSYSTEMS=="usb", ENV{USB_DEVICE}="1"

you could then blacklist in multipath based on the property USB_DEVICE. If adding extra udev rules isn't something that will work for you, I could probably modify multipath's property blacklisting, so that instead of simply looking at the property name, if you had an equals sign in the string, it would match whatever's after the equals sign against the udev property's value. Then you
could blacklist usb devices using something like

property "ID_BUS=usb"

which I believe usb_id should always add to the list of environment variables.
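
The name-only matching used today and the `name=value` form proposed above can be contrasted in a short Python sketch. Nothing here is multipath code: `property_matches` is a hypothetical helper, the `=` handling is only one reading of the proposal, and the sample property sets are invented.

```python
import re


def property_matches(pattern, udev_props):
    """Match a property pattern against a dict of udev properties.

    Without "=", the pattern is matched against property names only.
    With "=", the part before it is matched against the name and the
    part after it against the value (the proposed extension).
    """
    name_re, sep, value_re = pattern.partition("=")
    for name, value in udev_props.items():
        if not re.search(name_re, name):
            continue
        if not sep or re.search(value_re, value):
            return True
    return False


if __name__ == "__main__":
    # Invented property sets, for illustration only.
    usb_disk = {"ID_BUS": "usb", "ID_USB_DRIVER": "usb-storage"}
    sas_disk = {"ID_BUS": "scsi", "ID_WWN": "0x5000c50012345678"}
    for props in (usb_disk, sas_disk):
        print(property_matches("ID_BUS=usb", props))
```

With name-only matching, `ID_BUS` alone cannot separate the two devices because both define it; the value comparison is what would make `ID_BUS=usb` usable as a USB filter.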

Comment 29 Nir Soffer 2019-07-08 14:36:27 UTC
(In reply to Ben Marzinski from comment #28)
Thanks Ben. I think using something like:

blacklist {
    property: "(ID_USB_INTERFACES|ID_USB_INTERFACE_NUM|ID_USB_DRIVER)"
}

Should be good enough for now.

Are you sure non-USB devices will never have these properties?

Enhancing property to include the value of the key sounds like a great
idea for a future version.

Comment 30 Ben Marzinski 2019-07-08 16:00:32 UTC
Without digging through the code of all the programs that are called by the IMPORT rules in all the udev rules, I can't be totally sure, but as far as I can see from looking in reasonable places, only the usb_id callout and the 60-serial.rules file set them, and only for USB devices. Also, I find it pretty hard to believe that some callout that wasn't for USB devices would set a property named ID_USB_*

Comment 31 Klaas Demter 2020-01-15 16:11:07 UTC
I tried using
blacklist {
    property: "(ID_USB_INTERFACES|ID_USB_INTERFACE_NUM|ID_USB_DRIVER)"
}

but that does not work.

Also Dell seems to have changed the vendor identification for the SECUPD device. With the latest firmware it's now "Linux".

So my blacklist now looks like this:
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]*"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
        devnode "^asm/*"
        devnode "ofsctl"
        device {
                vendor "DELL"
                product "IDSDM"
        }
        device {
                vendor "iDRAC"
                product "MAS022"
        }
        device {
                vendor "Linux"
                product "SECUPD"
        }
        device {
                vendor "iDRAC"
                product "SECUPD"
        }
        device {
                vendor "iDRAC"
                product "scrtch"
        }
}

Greetings
Klaas

Comment 32 Sandro Bonazzola 2020-05-18 14:46:29 UTC
Moved to 4.4.1, since this bug is not marked as a blocker for 4.4.0 and we are preparing to GA.

