Bug 1977434

Summary: Not having a blank space at the end of zfcp.conf results in systemd failing zfcp.conf with exit code 1
Product: Red Hat Enterprise Linux 8
Reporter: tcleveng
Component: s390utils
Assignee: Dan Horák <dhorak>
Status: CLOSED ERRATA
QA Contact: Vilém Maršík <vmarsik>
Severity: low
Priority: unspecified
Version: 8.4
CC: dhorak, mgandhi, rvr, tstaudt
Target Milestone: beta
Keywords: Triaged
Target Release: ---
Hardware: s390x
OS: Linux
Fixed In Version: s390utils-2.16.0-1.el8
Last Closed: 2021-11-09 20:04:21 UTC
Type: Bug
Bug Blocks: 1796871

Description tcleveng 2021-06-29 18:01:13 UTC
Description of problem:
This problem is cosmetic: the zfcp disks are brought online as expected. However, the presence of the word "failed" in the log causes problems for the customer's reporting mechanism.

The issue is simply that zfcpconf.sh (s390utils-core) does not return an explicit exit code, so the exit code you get is that of whichever command the script happened to run last. This has always been the case.

Version-Release number of selected component (if applicable):
8.4

How reproducible:
Every time

Steps to Reproduce:
1. The problem is fully reproducible. To trigger the "failure", you need a zfcp.conf whose last line is a correctly defined disk that is online or can be brought online. If the last line is blank or names a disk that is not accessible, the script "succeeds":
[root@s8315027 etc]# cat zfcp.conf
0.0.191b	0x50050763070845e3 0x4082402900000000
[root@s8315027 etc]# zfcpconf.sh
[root@s8315027 etc]# echo $?
1

[root@s8315027 etc]# echo >> zfcp.conf
[root@s8315027 etc]# zfcpconf.sh
[root@s8315027 etc]# echo $?
0
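
The difference between the two runs above is consistent with ordinary shell behaviour: a script without an explicit `exit` returns the status of the last command it executed. The following is a minimal sketch of that effect, not the actual zfcpconf.sh; the function name `process_config` and the literal `online` marker are made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for a config loop; this is NOT the real zfcpconf.sh.
# Each input line is either "online" (nothing to do) or a name to activate.
process_config() {
    local line
    while read -r line; do
        if [ -n "$line" ]; then
            # When the test is false, "A && B" leaves status 1 behind.
            [ "$line" != "online" ] && echo "activating $line"
        fi
        # A blank line skips the "if" body, and such an "if" has status 0.
    done
    # No explicit return: the function reports the loop's last status.
}

printf 'online\n' | process_config
echo "last line is an entry: $?"

printf 'online\n\n' | process_config
echo "last line is blank:    $?"
```

In this sketch the first call reports status 1 and the second reports 0, even though neither call did anything wrong. Whether the real script fails for the same reason would need to be checked against its source.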


Actual results:
This issue occurs at boot, when the zfcp/SCSI disks are brought online. systemd reports a failure:
systemd-udevd[810]: Process '/sbin/zfcpconf.sh' failed with exit code 1.

Expected results:
zfcpconf.sh should exit with code 0.

What we do not want is a non-zero exit code merely because some of the zfcp disks are not found; it is normal in z/VM clusters for entries in the configuration file to be unavailable.
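
One way to get that behaviour, sketched here under assumptions rather than taken from the actual fix: attempt every entry, ignore individual failures, and end with an explicit success status. `try_entry` and `configure_all` are hypothetical names, and `try_entry` only simulates a device that may be missing.

```shell
#!/bin/bash
# Hypothetical sketch; names and logic are illustrative, not from zfcpconf.sh.

# Simulated activation: entries starting with "missing" fail, others succeed.
try_entry() {
    case "$1" in
        missing*) return 1 ;;
        *)        echo "configured $1" ;;
    esac
}

configure_all() {
    local entry
    while read -r entry; do
        [ -n "$entry" ] || continue
        # An unavailable entry is expected in some setups, so ignore it.
        try_entry "$entry" || true
    done
    # Explicit status: the result no longer depends on the last entry.
    return 0
}

printf 'disk1\nmissing2\n' | configure_all
echo "exit status: $?"
```

The trade-off of this pattern is that genuine errors are hidden as well, so a real script would probably still want to log each entry that could not be configured.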

Additional info:

Comment 2 Dan Horák 2021-06-30 08:57:18 UTC
Thanks for digging into it. The same problem is already tracked in bug 1870635.

Comment 3 Dan Horák 2021-06-30 08:57:39 UTC
*** Bug 1870635 has been marked as a duplicate of this bug. ***

Comment 4 IBM Bug Proxy 2021-06-30 09:05:58 UTC
------- Comment From tstaudt.com 2020-09-10 03:16 EDT-------
For the record: zfcpconf.sh is an addition to s390utils from Red Hat.

Comment 6 Vilém Maršík 2021-07-02 13:03:39 UTC
We have no zFCP machines in Beaker, but according to the assignee, the "correctly defined disk" should not be necessary for reproducing. Acking.

Comment 7 Vilém Maršík 2021-07-26 19:22:04 UTC
It seems we need an existing ZFCP device to reproduce/verify this. Do you know if there is a way to get one on our Beaker machines?

Comment 8 Dan Horák 2021-07-27 14:19:23 UTC
(In reply to Vilém Maršík from comment #7)
> It seems we need an existing ZFCP device to reproduce/verify this. Do you
> know if there is a way to get one on our Beaker machines?

Unfortunately, there are no machines with zFCP in Beaker :-( The change itself (https://fedorapeople.org/cgit/sharkcz/public_git/utils.git/commit/?id=fd41fd536b1182281af587cfccc0d61802af920b) is trivial, so I believe SanityOnly would be good enough.

Comment 9 Vilém Maršík 2021-07-27 15:58:23 UTC
Okay, SanityOnly tested. The changes should be part of commit 500e34042d976f46effe9c5a206bc6f3e3594ee0 on branch rhel-8.5.0.

Comment 12 Vilém Maršík 2021-08-09 23:19:35 UTC
The recent RHEL-8.5.0-20210804.d.3 compose now contains s390utils-2.16.0-2.el8.s390x, which should be a fixed version. I hope this is sufficient, as I don't know what more to do from the QA side.

Comment 14 errata-xmlrpc 2021-11-09 20:04:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (s390utils bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4506

Comment 15 IBM Bug Proxy 2023-01-30 12:40:48 UTC
------- Comment From MAIER.com 2023-01-30 07:33 EDT-------
(In reply to comment #4)
> Under some conditions the zfcpconf.sh script fails, I believe it's when
> multipath is used. See below for journal content when booted with
> rd.udev.debug
>
> [root@rock-kvmlp-fedora ~]# journalctl | grep zfcpconf
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[446]: RUN
> '/sbin/zfcpconf.sh' /usr/lib/udev/rules.d/56-zfcp.rules:1
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]: RUN
> '/sbin/zfcpconf.sh' /usr/lib/udev/rules.d/56-zfcp.rules:1
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com [475]: starting
> '/sbin/zfcpconf.sh'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com [476]: starting
> '/sbin/zfcpconf.sh'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]:
> '/sbin/zfcpconf.sh'(err) '/sbin/zfcpconf.sh: line 50: echo: write error:
> Resource temporarily unavailable'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]:
> '/sbin/zfcpconf.sh'(err) '/sbin/zfcpconf.sh: line 52:
> /sys/bus/ccw/drivers/zfcp/0.0.3506/0x500507680b21ea4b/unit_add: No such file
> or directory'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]:
> '/sbin/zfcpconf.sh'(err) '/sbin/zfcpconf.sh: line 50: echo: write error:
> Resource temporarily unavailable'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]:
> '/sbin/zfcpconf.sh'(err) '/sbin/zfcpconf.sh: line 52:
> /sys/bus/ccw/drivers/zfcp/0.0.3506/0x500507680b21ea4a/unit_add: No such file
> or directory'
> srp 20 07:48:37 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[459]:
> Process '/sbin/zfcpconf.sh' failed with exit code 1.
> srp 20 07:48:39 rock-kvmlp-fedora.z14.bos.redhat.com systemd-udevd[446]:
> Process '/sbin/zfcpconf.sh' succeeded.
> [root@rock-kvmlp-fedora ~]# cat /proc/cmdline
> root=/dev/mapper/rhel_rock--kvmlp--fedora-root crashkernel=auto
> rd.lvm.lv=rhel_rock-kvmlp-fedora/root
> rd.zfcp=0.0.3506,0x500507680b21ea4b,0x0002000000000000
> rd.zfcp=0.0.3506,0x500507680b21ea4a,0x0002000000000000
> rd.lvm.lv=rhel_rock-kvmlp-fedora/swap cio_ignore=all,!condev
> rd.znet=qeth,0.0.1100,0.0.1101,0.0.1102,layer2=1,portno=0 rd.udev.debug
>
> There are 2 instances started, one succeeds, the other fails.

FWIW, it's due to the (fuzzy) udev trigger.
See also LTC bug 183179 / Red Hat bug 1790496:
>> The subsystems coldplug causes the following 2 events triggering
>> /lib/udev/rules.d/56-zfcp.rules:
>> KERNEL[1125.183178] add      /bus/ccw/drivers/zfcp (drivers)
>> KERNEL[1125.184255] add      /module/zfcp (module)
>>
>> (This has zfcpconf.sh needlessly running twice and somewhat in parallel but
>>  it does not hurt other than testing locking of cio&zfcp sysfs attributes.)
>>
>> $ rpm -qf /lib/udev/rules.d/56-zfcp.rules
>> s390utils-base-1.23.0-45.el7.s390x
>>
>> $ cat /lib/udev/rules.d/56-zfcp.rules
>> KERNEL=="zfcp", RUN+="/sbin/zfcpconf.sh"

> The result
> (discovered and enabled disks) isn't affected by this behaviour.

yep
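
The double run follows from how `KERNEL==` matches: it compares against the kernel name of the event's device, and for both quoted events that name is `zfcp`. A simplified sketch, assuming the name is just the last component of the devpath (`rule_matches` is a hypothetical helper, not part of udev, and the `dasd-eckd` path is only a counterexample):

```shell
#!/bin/bash
# Simplified model of a KERNEL=="zfcp" match; real udev matching does more.
rule_matches() {
    # Assumption: the kernel name is the last component of the devpath.
    [ "$(basename "$1")" = "zfcp" ]
}

for devpath in /bus/ccw/drivers/zfcp /module/zfcp /bus/ccw/drivers/dasd-eckd; do
    if rule_matches "$devpath"; then
        echo "match:    $devpath"
    else
        echo "no match: $devpath"
    fi
done
```

Both zfcp events match in this model, which is why the RUN program is started once per event.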