Bug 1758223 - In some cases a /dev/disk/by-path/fc---lun-0 is created for some devices.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director-images
Version: 13.0 (Queens)
Hardware: x86_64
OS: All
Priority: urgent
Severity: urgent
Target Milestone: z11
Target Release: 13.0 (Queens)
Assignee: Bob Fournier
QA Contact:
URL:
Whiteboard:
Depends On: 1750417
Blocks:
Reported: 2019-10-03 15:29 UTC by David Hill
Modified: 2023-09-07 20:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-29 20:22:42 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-18224 0 None None None 2022-08-17 05:10:40 UTC
Red Hat Knowledge Base (Solution) 4462251 0 None None None 2019-10-03 15:30:20 UTC

Description David Hill 2019-10-03 15:29:25 UTC
Description of problem:
In some cases a /dev/disk/by-path/fc---lun-0 symlink is created for some devices. I suspect the following udev rules file, where the three ENV{FC_*} checks are followed by `;` rather than `,`:

~~~
#
# FC WWPN-based by-path links
#

ACTION!="add|change", GOTO="fc_wwpn_end"
KERNEL!="sd*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", IMPORT{program}="fc_wwpn_id %p"
ENV{DEVTYPE}=="partition", IMPORT{parent}="FC_*"
ENV{FC_TARGET_WWPN}!="$*"; GOTO="fc_wwpn_end"
ENV{FC_INITIATOR_WWPN}!="$*"; GOTO="fc_wwpn_end"
ENV{FC_TARGET_LUN}!="$*"; GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}"
ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}-part%n"

LABEL="fc_wwpn_end"
~~~

It should instead read as follows, with a comma before each GOTO in the three ENV{FC_*} checks:

~~~
#
# FC WWPN-based by-path links
#

ACTION!="add|change", GOTO="fc_wwpn_end"
KERNEL!="sd*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", IMPORT{program}="fc_wwpn_id %p"
ENV{DEVTYPE}=="partition", IMPORT{parent}="FC_*"
ENV{FC_TARGET_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_INITIATOR_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_TARGET_LUN}!="$*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}"
ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}-part%n"

LABEL="fc_wwpn_end"
~~~
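
The difference between the two blocks is easy to miss by eye. A minimal sketch for spotting it mechanically, assuming the typo always takes the form of a closing double quote followed by `;` (the function name `find_semicolon_separators` is hypothetical and not part of udev or sg3_utils):

```shell
# Hypothetical helper: print rule lines (with line numbers) where a quoted
# value is followed by ';' instead of the ',' that separates udev rule keys.
# The exit status is grep's: 0 if any line matched, 1 if none did.
find_semicolon_separators() {
    grep -nE '"[[:space:]]*;' "$1"
}
```

Run against a file containing the first block, this should print the three ENV{FC_*} lines; against the corrected block it prints nothing.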


Version-Release number of selected component (if applicable):
Latest

How reproducible:
Reproducible in this customer's environment.

Steps to Reproduce:
1. Install RHEL 7.7

Actual results:
/dev/disk/by-path/fc---lun-0 is created

Expected results:
That symlink should not be created.
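
To check whether a host shows the symptom, something like the following could list the malformed links. This is only a sketch: the function name `list_empty_fc_links` is hypothetical, and it assumes the bad links always start with `fc---lun-`, as the one reported here does.

```shell
# Hypothetical helper: list by-path links whose initiator and target WWPN
# fields are both empty, i.e. names of the form fc---lun-<N>.
# Takes the directory to scan as an optional argument.
list_empty_fc_links() {
    dir=${1:-/dev/disk/by-path}
    find "$dir" -maxdepth 1 -name 'fc---lun-*' -print
}
```

Called with no argument it scans /dev/disk/by-path; empty output means no such link is present.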

Additional info:

Comment 2 David Hill 2019-10-03 20:12:55 UTC
This is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1750417

Comment 5 John Pittman 2019-10-04 20:54:43 UTC
The rule file that is causing the issue was introduced by sg3_utils-1.37-18.el7_7.1. I am unsure whether this is feasible given OpenStack's requirements, but if the customer could temporarily downgrade to sg3_utils-1.37-18.el7, the bad by-path links should not be created at device attach.

# yum downgrade sg3_utils sg3_utils-libs
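
Before downgrading, it may help to confirm which build is installed. A small sketch, assuming only the exact build named above needs to be flagged (the function name `is_flagged_sg3_build` is hypothetical; other builds are neither confirmed nor ruled out by this check):

```shell
# Hypothetical helper: succeed only for the build this comment identifies
# as introducing the rule file, given a "VERSION-RELEASE" string.
is_flagged_sg3_build() {
    [ "$1" = "1.37-18.el7_7.1" ]
}

# Example call on an RPM-based host (not run here):
#   is_flagged_sg3_build "$(rpm -q --qf '%{VERSION}-%{RELEASE}' sg3_utils)" \
#       && echo "sg3_utils build flagged in this bug is installed"
```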

Comment 7 Tomáš Bžatek 2019-10-10 12:18:18 UTC
Looks like the new udev rules are uncovering hidden bugs in the OpenStack code. To understand the real issue here, could somebody please answer the following questions?

 * is that a stray symlink or is it a misplaced one, missing somewhere else?
 * judging by the output provided in comment 4, why does the "by_path" attribute matter when "name" is present? (assuming the link points to the mentioned block device file)
 * is /dev/disk/by-* structure actually used for block device enumeration?
 * does the OpenStack code interact with udev through libudev or its bindings?

Could you also please provide output of `udevadm info --export-db` at the time of the issue?
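
The full database dump is long. As a convenience, a filter along these lines could pull out the parts relevant to this bug. It is a sketch that relies on the `P:`, `S:` and `E:` record prefixes used in the export output; the function name `filter_fc_udev_records` is hypothetical.

```shell
# Hypothetical helper: read `udevadm info --export-db` output on stdin and
# keep only device paths (P:), fc-* by-path symlinks (S:) and FC_*
# properties (E:).
filter_fc_udev_records() {
    grep -E '^(P: |S: disk/by-path/fc-|E: FC_)'
}

# Example call (not run here):
#   udevadm info --export-db | filter_fc_udev_records
```

The unfiltered output is still the more useful attachment for the bug; the filter is only meant for a quick look.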

Comment 9 Dmitry Tantsur 2019-10-17 09:13:10 UTC
I can only answer the questions from the information-collecting side; I'm not sure how it's used by Ceph:

> is /dev/disk/by-* structure actually used for block device enumeration?

No, we use lsblk for that. /dev/disk/by-* is only used to provide the by_path attribute. I'm not sure how it's consumed.

> does the OpenStack code interact with udev through libudev or its bindings?

We definitely use pyudev to collect additional attributes of block devices.

Relevant code (quite a lot of it): https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L238-L387

Comment 11 John Pittman 2019-10-17 11:41:24 UTC
Hi Ashish. I just mean that if you downgrade sg3_utils and the libs, the bad fc symlink will not be created in /dev/disk/by-path. I don't know anything about the OpenStack environment. Alternatively, sg3_utils can simply be uninstalled and the libs left on the system. I don't believe sg3_utils is installed by default anyway.

Comment 12 David Hill 2019-10-17 22:54:54 UTC
This is a possible workaround for the current IPA initramfs issue:
~~~
 mkdir tmp
 cd tmp
 mv ../ironic-python-agent.initramfs ironic-python-agent.initramfs.gz
 gunzip ironic-python-agent.initramfs.gz
 cat ironic-python-agent.initramfs | cpio -ivd
 rm -rf ironic-python-agent.initramfs
 # fix typos in the initramfs 59-fc-wwpn-id.rules 
 find . -print -depth | cpio -ov > ironic-python-agent.initramfs
 gzip ironic-python-agent.initramfs
 mv ironic-python-agent.initramfs.gz ironic-python-agent.initramfs
~~~

Comment 13 David Hill 2019-10-17 23:19:25 UTC
Something like this:
~~~
 mkdir tmp
 cd tmp
 cp ../ironic-python-agent.initramfs ironic-python-agent.initramfs.gz
 gunzip ironic-python-agent.initramfs.gz
 cat ironic-python-agent.initramfs | cpio -ivd
 rm -rf ironic-python-agent.initramfs
 cat << EOF> usr/lib/udev/rules.d/59-fc-wwpn-id.rules
#
# FC WWPN-based by-path links
#

ACTION!="add|change", GOTO="fc_wwpn_end"
KERNEL!="sd*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", IMPORT{program}="fc_wwpn_id %p"
ENV{DEVTYPE}=="partition", IMPORT{parent}="FC_*"
ENV{FC_TARGET_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_INITIATOR_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_TARGET_LUN}!="$*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}"
ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}-part%n"

LABEL="fc_wwpn_end"
EOF
 find . -print -depth | cpio -ov > ironic-python-agent.initramfs
 gzip ironic-python-agent.initramfs
 mv ironic-python-agent.initramfs.gz ../ironic-python-agent.initramfs
~~~

Comment 14 David Hill 2019-10-17 23:20:45 UTC
This one would be better as the variables are not expanded:
~~~
 mkdir tmp
 cd tmp
 cp ../ironic-python-agent.initramfs ironic-python-agent.initramfs.gz
 gunzip ironic-python-agent.initramfs.gz
 cat ironic-python-agent.initramfs | cpio -ivd
 rm -rf ironic-python-agent.initramfs
 cat << 'EOF'> usr/lib/udev/rules.d/59-fc-wwpn-id.rules
#
# FC WWPN-based by-path links
#

ACTION!="add|change", GOTO="fc_wwpn_end"
KERNEL!="sd*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", IMPORT{program}="fc_wwpn_id %p"
ENV{DEVTYPE}=="partition", IMPORT{parent}="FC_*"
ENV{FC_TARGET_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_INITIATOR_WWPN}!="$*", GOTO="fc_wwpn_end"
ENV{FC_TARGET_LUN}!="$*", GOTO="fc_wwpn_end"

ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}"
ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-path/fc-$env{FC_INITIATOR_WWPN}-$env{FC_TARGET_WWPN}-lun-$env{FC_TARGET_LUN}-part%n"

LABEL="fc_wwpn_end"
EOF
 find . -print -depth | cpio -ov > ironic-python-agent.initramfs
 gzip ironic-python-agent.initramfs
 mv ironic-python-agent.initramfs.gz ../ironic-python-agent.initramfs
~~~
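
The only change from Comment 13 is the quoted 'EOF' delimiter. A small demonstration of why that matters, using one fragment of the rule text (setting `env` to an empty string here is an assumption standing in for a shell where it is unset):

```shell
# Stand-in for a shell where $env is unset: it expands to nothing.
env=''

# Unquoted delimiter: the shell expands $env before cat sees the text,
# so the udev substitution is damaged.
unquoted=$(cat << EOF
lun-$env{FC_TARGET_LUN}
EOF
)

# Quoted delimiter: the body is passed through literally.
quoted=$(cat << 'EOF'
lun-$env{FC_TARGET_LUN}
EOF
)

printf '%s\n' "$unquoted" "$quoted"
# prints:
#   lun-{FC_TARGET_LUN}
#   lun-$env{FC_TARGET_LUN}
```

The same applies to `"$*"` in the match lines, which an unquoted here-document would replace with the shell's positional parameters.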

Comment 18 Brian 2019-10-20 20:52:55 UTC
Email from Bob Fournier:

There may be a problem with the procedure in Comment 14. It may be safer to try downgrading sg3_utils and sg3_utils-libs to 1.37-18.el7 as indicated in https://bugzilla.redhat.com/show_bug.cgi?id=1758223#c5. The downloads can be found here:
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=853138

Note that when you update the image you'll need to make sure that:
- the files have root permissions
- you upload the images after making the change

There's a KCS article, https://access.redhat.com/solutions/3548611, that describes how to do this, and also the downstream documentation:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks

Comment 20 Dmitry Tantsur 2019-10-21 11:45:52 UTC
Did you try following https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks precisely (modulo different package)? I'd prefer we don't implement custom procedures if there is an official one.

Comment 21 Dmitry Tantsur 2019-10-21 11:47:04 UTC
Also it's not clear from the previous comments: please make sure you're doing everything as a root user. Files inside initramfs are owned by root.

Comment 29 Bob Fournier 2019-11-22 17:07:21 UTC
Shailesh - it's more of a question on the RHEL backport to 7.7 for https://bugzilla.redhat.com/show_bug.cgi?id=1750417.  Once that backport merges we will pick it up in the next OSP release.

Comment 31 Bob Fournier 2019-11-23 22:09:05 UTC
Including the storage DFG for Comment 30 as they have the expertise for Ceph configuration.  A separate bug may need to be opened, as this appears unrelated to the original issue.

Comment 34 Bob Fournier 2020-01-09 22:02:10 UTC
RHEL 7.7 backport for this fix is here - https://bugzilla.redhat.com/show_bug.cgi?id=1788876.  It will be picked up in the next OSP-13z release - 13z11.

Comment 35 Bob Fournier 2020-07-29 20:22:12 UTC
Fix is available in RHEL 7.7 and later, closing this.

