Bug 533494 - [LTC 6.0 FEAT] 201085: unmask and wait for devices
Summary: [LTC 6.0 FEAT] 201085: unmask and wait for devices
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: dracut
Version: 6.0
Hardware: s390x
OS: All
Priority: high
Severity: high
Target Milestone: beta
Target Release: 6.0
Assignee: Harald Hoyer
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On: 463544
Blocks: 554559 561339 582286
Reported: 2009-11-06 21:26 UTC by Denise Dumas
Modified: 2010-07-02 19:00 UTC

Fixed In Version: dracut-004-18.el6
Doc Type: Enhancement
Doc Text:
Clone Of: 463544
: 561339 (view as bug list)
Environment:
Last Closed: 2010-07-02 19:00:46 UTC
Target Upstream Version:
Embargoed:


Attachments
script for unblocking DASD and ZFCP devices (1.13 KB, text/plain)
2009-12-01 10:37 UTC, Dan Horák
script for unblocking DASD and ZFCP devices v2 (1.82 KB, text/plain)
2009-12-08 09:26 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v3 (3.08 KB, text/plain)
2009-12-11 15:27 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v4 (3.18 KB, text/plain)
2009-12-14 12:09 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v5 (4.36 KB, text/plain)
2009-12-21 16:45 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v6 (4.68 KB, text/plain)
2010-01-07 09:55 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v7 (4.83 KB, text/plain)
2010-01-08 07:56 UTC, Dan Horák
script for unblocking DASD and ZFCP and ZNET devices v8 (5.13 KB, application/octet-stream)
2010-01-08 11:17 UTC, Dan Horák
init.log (105.60 KB, text/plain)
2010-03-24 13:56 UTC, Jan Stodola
rdudevinfo (226.69 KB, text/plain)
2010-03-25 21:40 UTC, Jan Stodola
rdudevdebug rdinitdebug (3.41 MB, application/octet-stream)
2010-03-26 08:34 UTC, Jan Stodola
rdudevdebug rdinitdebug (1.57 MB, application/octet-stream)
2010-04-27 17:03 UTC, Jan Stodola
use ccw-init and ccw rules from s390utils in dracut (3.24 KB, patch)
2010-04-28 08:45 UTC, Dan Horák

Comment 1 Denise Dumas 2009-11-06 21:26:58 UTC
This BZ is to address the following portion of the original BZ (dracut not mkinitrd) 


> - Patch mkinitrd (for root devices only) / initscripts (for all other devices)
> to:
>    Unmask and wait for appearance of devices needed which are enumerated in
> already existing config files such as modprobe.conf (options dasd_mod
> dasd=...), zfcp.conf (1st column) and ifcfg-* (SUBCHANNELS=...) where device
> numbers are listed (doing so automatically handles all cases of device
> configuration during installation, post-installation or manual editing of
> config files)

This needs to be filed as an mkinitrd bug if it hasn't already.

Comment 2 Steffen Maier 2009-11-07 15:23:45 UTC
My suggestion would be to introduce three new scripts znet_cio_free, dasd_cio_free, and zfcp_cio_free, that can be called by both dracut (for root-fs device dependencies) and sysV-init/upstart (for non-rootfs devices).
All three scripts parse their corresponding config files from /etc and dynamically free each device bus ID by means of /proc/cio_ignore. After the scripts have done their job, the already existing configuration mechanisms just work as they have been doing ever since.

For dracut, the scripts should be called probably before returning from the respective module parse function of the dracut command line, since the removal of devices from the cio blacklist has to happen before udev or anybody else tries to do something with those device bus IDs.
For sysV-init/upstart, the scripts should be called at an appropriate place and before udev is started or udev is triggered respectively.

znet_cio_free should parse /etc/sysconfig/network-scripts/ifcfg-* similar to /lib/udev/ccw_init.

dasd_cio_free should parse $(modprobe --showconfig | grep "options[[:space:]]\+dasd_mod") [see anaconda's loader/linuxrc.s390:parse_dasd] and /etc/dasd.conf [see dracut's modules.d/95dasd/dasdconf.sh].

zfcp_cio_free should parse /etc/zfcp.conf [see s390utils-base's /sbin/zfcpconf.sh].

Since both sysV-init and dracut use the same config files, they can transparently call the three new scripts. Also the user is free to manually modify the config files without having to rerun mkinitrd or dracut-gencmdline. Additionally, system-config tools may simply write config files without modifications except for the case where they dynamically configure s390 devices including setting online. For the latter case, they need support to dynamically free devices from the cio blacklist before trying to configure those devices.

Comment 6 Dan Horák 2009-12-01 10:37:13 UTC
Created attachment 375017 [details]
script for unblocking DASD and ZFCP devices

with added dasd_cio_free and zfcp_cio_free symlinks it should work for DASD and ZFCP devices

Comment 7 Steffen Maier 2009-12-04 23:10:00 UTC
Comment on attachment 375017 [details]
script for unblocking DASD and ZFCP devices

>#!/bin/sh
>
># unblock devices listed in various config files
>#
># it uses dasd and zfcp config file
># config file syntax:
># deviceno   options
># or
># deviceno   WWPN   FCPLUN
>#
>
>DASDCONFIG=/etc/dasd.conf
>ZFCPCONFIG=/etc/zfcp.conf
>BLACKLIST=/proc/cio_ignore
>PATH=/bin:/usr/bin:/sbin:/usr/sbin
>
>CMD=`basename $0`
>if [ $CMD = "dasd_cio_free" ]; then
>    CONFIG=$DASDCONFIG
>    MODE=dasd
>elif [ $CMD = "zfcp_cio_free" ]; then
>    CONFIG=$ZFCPCONFIG
>    MODE=zfcp
>else
>    echo "Unknown alias '$CMD'"
>    exit 1
>fi

Switch case construct instead of if-elif chain?

>
>if [ ! -f $BLACKLIST ]; then

Could we give the user a hint why we exit here?

>    exit 2
>fi
>
>if [ $MODE = "dasd" -o $MODE = "zfcp" ]; then
>    # process the config file
>    if [ -f "$CONFIG" ]; then
>        while read line; do
>	    case $line in
>		\#*) ;;
>		*)
>		    [ -z "$line" ] && continue
>		    set $line
>		    DEVICE=$1
>		    echo "freeing device $DEVICE"
>		    echo "free $DEVICE" > $BLACKLIST 2> /dev/null

Writing to $BLACKLIST can return with an errno. I don't know how we should show such error cases. They might be syntax errors from users in the config files and reporting would help the user.

>		    ;;
>	    esac
>	done < $CONFIG
>    fi
>fi
>
>if [ $MODE = "dasd" ]; then
>    # process the device list defined as option for the dasd module
>    DEVICES=`modprobe --showconfig | grep "options[[:space:]]\+dasd_mod" | cut -d "=" -f 2`

echo "freeing devices $DEVICES"

>    echo "free $DEVICES" > $BLACKLIST 2> /dev/null

[1] Chapter 3, Setting up the DASD device driver, Module parameters, page 34

Valid example entry in modprobe.conf:
options   dasd_mod  eer_pages=5 dasd=nopav,100(ro:diag),105-10f,0.0.beed-0.0.beef(erplog),nofcx,0.0.1234(erplog:failfast),0.1.5678

The above script code using grep and cut is not specific enough in parsing:
5 dasd

The idea to make use of /proc/cio_ignore's capability to eat sequences of device ranges and just take them from modprobe.conf seems nice. In order to get something that is digestible by
[1] Chapter 44, cio_ignore, page 513,
we could use the following, if we really wanted to do everything with sed:
DEVICES=$(modprobe --showconfig | grep "options[[:space:]]\+dasd_mod" | sed -e 's/.*[[:space:]]dasd=\([^[:space:]]*\).*/\1/' -e 's/([^)]*)//g' -e 's/nopav\|nofcx\|autodetect\|probeonly//g' -e 's/,,/,/g' -e 's/^,//' -e 's/,$//')

>fi

[1] Device Drivers, Features, and Commands (kernel 2.6.31) - SC33-8411-03, September 2009
http://www.ibm.com/developerworks/linux/linux390/documentation_dev.html
http://download.boulder.ibm.com/ibmdl/pub/software/dw/linux390/docu/lk31dd03.pdf
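
To see what the sed expression suggested above produces, here is a minimal sketch that wraps it in a function and feeds it the example modprobe.conf entry from this comment. The function name parse_dasd_option is only illustrative, and the expression relies on GNU grep and GNU sed for the "\+" and "\|" operators.

```shell
#!/bin/sh
# Hypothetical helper: reads "options dasd_mod ..." lines on stdin and prints
# the bare device list, with keywords and per-device features stripped.
parse_dasd_option() {
    grep "options[[:space:]]\+dasd_mod" | sed \
        -e 's/.*[[:space:]]dasd=\([^[:space:]]*\).*/\1/' \
        -e 's/([^)]*)//g' \
        -e 's/nopav\|nofcx\|autodetect\|probeonly//g' \
        -e 's/,,/,/g' -e 's/^,//' -e 's/,$//'
}

# Example entry taken from the comment above.
echo 'options   dasd_mod  eer_pages=5 dasd=nopav,100(ro:diag),105-10f,0.0.beed-0.0.beef(erplog),nofcx,0.0.1234(erplog:failfast),0.1.5678' |
    parse_dasd_option
```

For that entry the pipeline prints 100,105-10f,0.0.beed-0.0.beef,0.0.1234,0.1.5678, which no longer contains the eer_pages value that the grep and cut version picked up.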

Comment 8 Dan Horák 2009-12-08 09:26:15 UTC
Created attachment 376864 [details]
script for unblocking DASD and ZFCP devices v2

updated script uploaded, thanks for the comments, Steffen

changes:
- use case instead of if-elif chain
- print some message what's going on when run with --verbose
- check error code after writing to /proc/cio_ignore
- use the sed command from Steffen to parse the dasd module options
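
The first two changes could look roughly like the following. This is a sketch under assumptions, not the content of the attachment: mode_for and log_verbose are made-up helper names, and VERBOSE is assumed to be set by the --verbose switch.

```shell
#!/bin/sh
# Map the name the script was invoked under to a mode, using a case
# statement instead of an if-elif chain. mode_for is an illustrative name.
mode_for() {
    case $(basename "$1") in
        dasd_cio_free) echo dasd ;;
        zfcp_cio_free) echo zfcp ;;
        *) echo "Unknown alias '$1'" >&2; return 1 ;;
    esac
}

# Print progress messages only when VERBOSE is non-empty.
log_verbose() {
    [ -n "${VERBOSE:-}" ] && echo "$@"
    return 0
}

MODE=$(mode_for /sbin/dasd_cio_free) && log_verbose "mode is $MODE"
```

Keeping the alias check in one function makes it easy to add a third alias later without touching the rest of the script.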

Comment 9 Steffen Maier 2009-12-08 12:31:31 UTC
Attachment 376864 [details] looks good. Thanks a lot for the update, Dan.
Assuming this has been or will be function tested successfully, let's integrate it into sysV-init/upstart and dracut.
Please note, that this is so far only sufficient to get access to non-network disk drives. We still need support for the znet_cio_free part for network to function, most notably for ssh login (and potentially for network (root-)fs).

Comment 10 Dan Horák 2009-12-08 13:20:54 UTC
I am aware that this script solves only 2/3 of the problem. The znet_cio_free script is still in development, but will be ready in a few days. I am inclined to make it a part of the "device_cio_free" script as well, though the data it has to process are quite different.

Comment 11 Dan Horák 2009-12-11 15:27:12 UTC
Created attachment 377744 [details]
script for unblocking DASD and ZFCP and ZNET devices v3

new version that adds
- support for freeing networking channels
- verbose output enabled with command line switch

Comment 12 Steffen Maier 2009-12-11 20:13:27 UTC
Comment on attachment 377744 [details]
script for unblocking DASD and ZFCP and ZNET devices v3

Thanks! Basically it looks good, I only have a few minor comments below.

>if [ $MODE = "znet" ]; then
>    # process the config file
>    if [ -f "$CONFIG" ]; then
>        while read line; do
>	    case $line in
>		\#*) ;;
>		*)
>		    [ -z "$line" ] && continue
>		    # grep 2 or 3 channels from beginning of each line
>		    DEVICES=$(echo $line | egrep -i -o "^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)")

I'm not sure about matching case-insensitively. In fact, the drivers only accept lower case letters for xdigits. Unfortunately, it's a trap users fall in, when copying&pasting devnos from z/VM, which writes out upper case letters in xdigits. But then again, we could transform whatever the user specifies to lower case. Unfortunately, this is not the only place where such a more flexible acceptance scheme would have to be implemented to work all the way through. I guess what I'm trying to say is, that we should probably accept either lower case only or mixed case and transform it to lower case everywhere to be consistent.

>		    if [ $DEVICES ]; then
>			[ $VERBOSE ] && echo "Freeing device(s) $DEVICES"
>			echo "free $DEVICES" > $BLACKLIST 2> /dev/null || echo "Error: can't free device(s) $DEVICES"
>		    fi
>		    ;;
>	    esac
>	done < $CONFIG
>    fi
>    # process channels from network interface configurations
>    for line in $(egrep -i -h "^[[:space:]]*SUBCHANNELS=([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)([[:space:]]+#|[[:space:]]*$)" /etc/sysconfig/network-scripts/ifcfg-* 2> /dev/null)

Here, I would have the same note as above regarding case-insensitive matching.

It would be nice, if the regex also matched if the user quoted (single or double) the right hand side of the assignment to the SUBCHANNELS variable. Anaconda does not write out quotes here, but the user might add them when manually editing an ifcfg file.

>    do
>	eval $line

Is this always safe, no matter what the content of $line is, or do we need to escape the dollar sign for variable expansion?:
eval \$line
(I had to do the latter in function ask() of anaconda's linuxrc.s390 but I'm not sure this even works here as well.)

>        [ $VERBOSE ] && echo "Freeing device(s) $SUBCHANNELS"
>	echo "free $SUBCHANNELS" > $BLACKLIST 2> /dev/null  || echo "Error: can't free device(s) $SUBCHANNELS"
>    done
>fi
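
The "transform to lower case, then match" idea can be sketched as follows. The helper name extract_channels and the sample input line are made up for illustration; the regular expression is the one from the quoted script, used without -i because the input is lower-cased first.

```shell
#!/bin/sh
# Illustrative helper: lower-case a config line, then extract the leading
# group of 2 or 3 comma-separated device bus IDs (x.y.zzzz).
extract_channels() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' |
        grep -E -o '^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)'
}

# Upper-case xdigits, as they might be pasted from z/VM output.
extract_channels '0.0.F5F0,0.0.F5F1,0.0.F5F2 layer2=1'
```

The call prints 0.0.f5f0,0.0.f5f1,0.0.f5f2, so whatever is written to /proc/cio_ignore afterwards is already in the form the drivers accept.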

Comment 13 Dan Horák 2009-12-14 12:08:35 UTC
(In reply to comment #12)
> (From update of attachment 377744 [details])
> Thanks! Basically it looks good, I only have a few minor comments below.
> 
> >if [ $MODE = "znet" ]; then
> >    # process the config file
> >    if [ -f "$CONFIG" ]; then
> >        while read line; do
> >	    case $line in
> >		\#*) ;;
> >		*)
> >		    [ -z "$line" ] && continue
> >		    # grep 2 or 3 channels from beginning of each line
> >		    DEVICES=$(echo $line | egrep -i -o "^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)")
> 
> I'm not sure about matching case-insensitively. In fact, the drivers only
> accept lower case letters for xdigits. Unfortunately, it's a trap users fall
> in, when copying&pasting devnos from z/VM, which writes out upper case letters
> in xdigits. But then again, we could transform whatever the user specifies to
> lower case. Unfortunately, this is not the only place where such a more
> flexible acceptance scheme would have to be implemented to work all the way
> through. I guess what I'm trying to say is, that we should probably accept
> either lower case only or mixed case and transform it to lower case everywhere
> to be consistent.

new version has the input lowercased before going to search the regex, "-i" option is removed

> >		    if [ $DEVICES ]; then
> >			[ $VERBOSE ] && echo "Freeing device(s) $DEVICES"
> >			echo "free $DEVICES" > $BLACKLIST 2> /dev/null || echo "Error: can't free device(s) $DEVICES"
> >		    fi
> >		    ;;
> >	    esac
> >	done < $CONFIG
> >    fi
> >    # process channels from network interface configurations
> >    for line in $(egrep -i -h "^[[:space:]]*SUBCHANNELS=([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)([[:space:]]+#|[[:space:]]*$)" /etc/sysconfig/network-scripts/ifcfg-* 2> /dev/null)
> 
> Here, I would have the same note as above regarding case-insensitive matching.

I keep the "-i" option, because the lowercasing is done after the SUBCHANNELS variable is read and before going to "cio_ignore"
 
> It would be nice, if the regex also matched if the user quoted (single or
> double) the right hand side of the assignment to the SUBCHANNELS variable.
> Anaconda does not write out quotes here, but the user might add them when
> manually editing an ifcfg file.

added, the regex will catch also strings with mismatched or missing quotes, but an error will be issued later
 
> >    do
> >	eval $line
> 
> Is this always safe, no matter what the content of $line is, or do we need to
> escape the dollar sign for variable expansion?:
> eval \$line
> (I had to do the latter in function ask() of anaconda's linuxrc.s390 but I'm
> not sure this even works here as well.)

the "line" variable contains only the string "SUBCHANNELS=...." grepped from the config files, so I think it's correct to use "eval $line"
 
> >        [ $VERBOSE ] && echo "Freeing device(s) $SUBCHANNELS"
> >	echo "free $SUBCHANNELS" > $BLACKLIST 2> /dev/null  || echo "Error: can't free device(s) $SUBCHANNELS"
> >    done
> >fi
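
The eval question can also be sidestepped entirely. The following is a minimal sketch, not taken from the attachment, that extracts the value with parameter expansion, strips one pair of matching quotes and lower-cases the result. The function name subchannels_from_line is hypothetical, and the input is assumed to be a single word of the form SUBCHANNELS=... as produced by the grep above.

```shell
#!/bin/sh
# Illustrative helper: take "SUBCHANNELS=<value>" and print <value> without
# surrounding quotes and in lower case. Nothing in the input is executed.
subchannels_from_line() {
    value=${1#*=}              # drop everything up to the first '='
    case $value in
        \"*\") value=${value#\"}; value=${value%\"} ;;
        \'*\') value=${value#\'}; value=${value%\'} ;;
    esac
    printf '%s\n' "$value" | tr '[:upper:]' '[:lower:]'
}

subchannels_from_line 'SUBCHANNELS="0.0.F5F0,0.0.F5F1,0.0.F5F2"'
```

This prints 0.0.f5f0,0.0.f5f1,0.0.f5f2. A value with mismatched quotes keeps its stray quote, so a later validity check can still reject it.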

Comment 14 Dan Horák 2009-12-14 12:09:32 UTC
Created attachment 378191 [details]
script for unblocking DASD and ZFCP and ZNET devices v4

Comment 15 Steffen Maier 2009-12-15 22:20:21 UTC
(In reply to comment #14)
> Created an attachment (id=378191) [details]
> script for unblocking DASD and ZFCP and ZNET devices v4  

Thank you for the update. Regarding the unmasking the script is very good.
However, I was reminded of the part "waiting" for the appearance of devices that have just been unmasked.

Since udev and its rules rely on devices being sensed completely, *_cio_free must wait for the appearance of devices after unmasking them. Since we do not really know if device bus IDs will actually be sensed after unmasking them (they may just not be configured in the system at all), this probably has to be done with some timeout mechanism and polling for the device appearance on the ccw bus.
See also https://bugzilla.redhat.com/show_bug.cgi?id=463544#c12.

Comment 16 Dan Horák 2009-12-16 17:55:15 UTC
Steffen, thanks for reminding me of the "wait" part, but I am not sure where it should belong, nor what the whole device initialization process actually looks like. Can you please give me some pointers or describe it here?

- let's start with all devices ignored
- issue "free <device>" to /proc/cio_ignore
- a bit corresponding to a device is switched in the kernel structure and a rescan is started by css_schedule_reprobe()

And my questions are
- when does a udev event appear that leads to the (dasd,ccw,zfcp)conf.sh scripts
- how can I check that a device is fully initialized

Comment 17 Peter Oberparleiter 2009-12-17 10:04:38 UTC
Dan,

here's the flow of kernel functions following a blacklist free operation.
(* denotes deferred work).

# echo free > /proc/cio_ignore
cio_ignore_write
blacklist_parse_proc_parameters
css_schedule_reprobe
css_schedule_eval_all_unreg

        (work item is scheduled)

*css_slow_path_func
slow_eval_unknown_fn
css_evaluate_new_subchannel
css_probe_device
css_register_subchannel
css_sch_device_register
device_register

        After this, there is an object in sysfs for the
        subchannel (the future parent of the device we're
        looking for), but no kobject event will be
        generated as it is suppressed until later.

css_probe
io_subchannel_probe
css_schedule_eval

        (work item is scheduled)

*css_slow_path_func
slow_eval_known_fn
css_evaluate_known_subchannel
io_subchannel_sch_event
sch_create_and_recog_new_device
io_subchannel_recog
ccw_device_recognition
ccw_device_sense_id_start

        (I/O takes place)

*ccw_device_sense_id_done
ccw_device_recog_done
io_subchannel_recog_done
ccw_device_sched_todo

        (work item is scheduled)

*ccw_device_todo
io_subchannel_register

        Here, the KOBJ_ADD event for the subchannel is
        generated.

ccw_device_register
device_add

        After this, there is an object in sysfs for the
        device and a KOBJ_ADD event for the device is
        generated.

ccw_device_probe

        At this point, device type specific setup
        continues, e.g. DASD will add its own sysfs
        attributes and try to set the device online when
        a dasd= module parameter has been specified.
        Once the probe function finishes (assuming no
        module parameters were specified), the device is
        fully initialized.

I'm assuming that the *conf.sh scripts are run when the second UEVENT is processed. As you can see, there is a theoretical race condition between the second UEVENT and the completion of ccw_device_probe, but this condition also exists for the *conf.sh scripts so I guess that can be disregarded.

The biggest problem here is that you cannot wait for "cio_ignore processing has finished, but there are no new devices". That's where timeout processing comes into play. In a worst case scenario, both I/O and deferred kernel work can delay processing for an almost arbitrary amount of time, so there's no perfect timeout period. It might be a good idea to make this value user configurable somehow. Good starting values could be 10 to 15 seconds.

Here's my suggestion for the resulting wait logic:
        echo free <device id> > /proc/cio_ignore
        while /sys/bus/ccw/devices/<device id>/online does not exist and
              retry count hasn't been exhausted; do
                decrease retry count
                # maybe inform user about the reason for the delay when
                # we're waiting for more than 3 seconds
                wait for ADD UEVENT for any device or 1 second
        done

If you're issuing multiple blacklist free requests, it might be a good idea to do the waiting once for all device IDs at the end, or otherwise booting could be delayed for 1 second per freed device.
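
The suggested wait logic for a single device could be sketched in plain sh like this. SYSFS_CCW, RETRIES and POLL_INTERVAL are illustrative names, and a plain sleep stands in for "wait for ADD UEVENT for any device or 1 second", so this is only an approximation of the pseudocode above.

```shell
#!/bin/sh
# Sketch of the polling loop. All variable names are assumptions.
SYSFS_CCW=${SYSFS_CCW:-/sys/bus/ccw/devices}
RETRIES=${RETRIES:-15}
POLL_INTERVAL=${POLL_INTERVAL:-1}

# Return 0 once <id>/online exists, 1 if the retry count runs out first.
wait_for_device() {
    id=$1
    tries=$RETRIES
    while [ ! -e "$SYSFS_CCW/$id/online" ]; do
        [ "$tries" -gt 0 ] || return 1
        tries=$((tries - 1))
        sleep "$POLL_INTERVAL"   # stand-in for waiting on an ADD uevent
    done
    return 0
}

# Typical call, after "echo free 0.0.0100 > /proc/cio_ignore":
#     wait_for_device 0.0.0100 || echo "0.0.0100 did not appear"
```

With the defaults this waits at most about 15 seconds per device, which matches the starting values mentioned above.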

Comment 18 Peter Oberparleiter 2009-12-17 10:15:02 UTC
As a side note: we're currently working on a new kernel feature that will offer a replacement for this user space wait logic in the form of a single command: echo > /proc/cio_settle. When this command returns, processing for all previous blacklist operations is guaranteed to have finished.

Comment 19 Dan Horák 2009-12-17 10:47:10 UTC
(In reply to comment #18)
> As a side note: we're currently working on a new kernel feature that will offer
> a replacement for this user space wait logic in the form of a single command:
> echo > /proc/cio_settle. When this command returns, processing for all previous
> blacklist operations is guaranteed to have finished.  

Peter, first thanks a lot for the detailed description, I think it will be useful not only for me.

The new feature won't be available in the RHEL-6 kernels or is a backport planned?

Comment 20 Peter Oberparleiter 2009-12-17 11:43:55 UTC
(In reply to comment #19)
> The new feature won't be available in the RHEL-6 kernels or is a backport
> planned?  

This feature is not planned for RHEL-6 - the kernel feature submission deadline for RHEL-6 GA has long passed and even if there was an exception, we're not yet in a state where we can submit the code for inclusion. There is a chance that we'll submit it for one of the RHEL6 updates though.

Comment 21 Dan Horák 2009-12-17 13:17:29 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > The new feature won't be available in the RHEL-6 kernels or is a backport
> > planned?  
> 
> This feature is not planned for RHEL-6 - the kernel feature submission deadline
> for RHEL-6 GA has long passed and even if there was an exception, we're not yet
> in a state where we can submit the code for inclusion. There is a chance that
> we'll submit it for one of the RHEL6 updates though.  

Ok, it makes sense.

And before I start to reinvent the wheel, and because you are the author of some tools in s390-tools: do you think it's possible to use some tool directly or as a library of functions? Mainly I am interested in converting IDs to the canonical x.y.zzzz form and iterating through device ranges.

My plan is to collect the device ids from the whole run of the script and do the availability checks at the end of the script.

Comment 22 Peter Oberparleiter 2009-12-21 15:58:48 UTC
(In reply to comment #21)
> And before I start to reinvent the wheel and because you are the author of some
> tools in s390-tools, do you think it's possible to use some tool directly or as
> a library of functions? Mainly I am interested in converting of ids to the
> canonical x.y.zzzz form and iterating thru device ranges.
> 
> My plan is to collect the device ids from the whole run of the script and do
> the availability checks at the end of the script.  

Unfortunately there are currently no tools for converting and iterating device IDs. Perhaps you could have a look at function "check_id()" in the lscss tool for a pointer on how to parse the device ID.

What I do recommend though, is using the cio_ignore tool from the s390-tools package: it will accept all sorts of identifier formats and convert them internally to the format as required by the /proc/cio_ignore interface, e.g.

  cio_ignore --remove 12,78,0.0.0100,8000,7000-7010

will work.

Also, once the kernel item I mentioned is available, we will patch the cio_ignore tool to use that "settle" interface whenever a change to the blacklist has been made.

Comment 23 Dan Horák 2009-12-21 16:45:21 UTC
Created attachment 379649 [details]
script for unblocking DASD and ZFCP and ZNET devices v5

Hopefully a complete implementation with the waiting part added. Waiting/check times are 0, 1, 3, 6, 10 and 15 seconds and can be configured through the "for" loop in the wait_for_device function.
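
One way such check times can arise is a loop that sleeps 0, 1, 2, 3, 4 and 5 seconds between checks. That loop shape is an assumption, since the attachment itself is not reproduced in this thread; the sketch below only demonstrates the arithmetic.

```shell
#!/bin/sh
# Illustrative only: accumulate the per-step delays and print the moments
# (in seconds after the first attempt) at which a check would happen.
check_times() {
    elapsed=0
    out=
    for delay in 0 1 2 3 4 5; do
        elapsed=$((elapsed + delay))
        out="$out${out:+ }$elapsed"
    done
    echo "$out"
}

check_times
```

The output is "0 1 3 6 10 15", so changing the list in the for statement changes both the number of checks and the total time waited.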

Comment 25 Harald Hoyer 2010-01-06 09:43:14 UTC
hmm, this script seems to need bash:

$ dash -n ~/Desktop/device_cio_free.sh 
/home/harald/Desktop/device_cio_free.sh: 176: Syntax error: redirection unexpected

and maybe we can replace "egrep" with "grep -E" to save binaries to pull in.

Comment 26 Dan Horák 2010-01-06 10:43:56 UTC
(In reply to comment #25)
> hmm, this script seems to need bash:
> 
> $ dash -n ~/Desktop/device_cio_free.sh 
> /home/harald/Desktop/device_cio_free.sh: 176: Syntax error: redirection
> unexpected

it looks like the whole "waiting" section is dash-incompatible ...

> and maybe we can replace "egrep" with "grep -E" to save binaries to pull in.  

that's doable
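
The thread does not show line 176 of the script, so the following is only a guess at the kind of change involved. dash reports "redirection unexpected" for bash-only redirections such as here-strings (<<<). A POSIX-compatible sketch that walks a comma-separated list without one, using a made-up function name:

```shell
#!/bin/sh
# bash-only form that dash rejects:
#     while read -r dev; do ...; done <<< "$list"
# POSIX alternative: turn commas into spaces and let the for loop split.
list_devices() {
    for dev in $(printf '%s' "$1" | tr ',' ' '); do
        echo "$dev"
    done
}

list_devices '0.0.0100,0.0.0200-0.0.0203'
```

The same file passes "dash -n" and "bash -n", which is the check used above.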

Comment 27 Dan Horák 2010-01-07 09:55:26 UTC
Created attachment 382186 [details]
script for unblocking DASD and ZFCP and ZNET devices v6

this is a de-bashified version with a fix for device ranges that could be treated wrongly in the previous version

Comment 28 Peter Oberparleiter 2010-01-07 15:22:34 UTC
(In reply to comment #27)
> Created an attachment (id=382186) [details]
> script for unblocking DASD and ZFCP and ZNET devices v6
> 
> this is de-bashified version with a fix for device ranges that could be treated
> wrongly in the previous version  

Looks good!

There appears to be a small error though:

    if [ echo "free $DEV" > $BLACKLIST 2> /dev/null ]; then

I'm not a dash expert, but shouldn't it look more like this?:

    if ! echo "free $DEV" > $BLACKLIST 2> /dev/null; then
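
The difference matters because "[ echo ... ]" hands the words to the test command instead of running echo, so nothing is ever written. A minimal sketch of the corrected form follows; free_device is the function name used later in this thread, but the body here is an assumption, and BLACKLIST can be pointed at an ordinary file to try it out.

```shell
#!/bin/sh
# BLACKLIST defaults to the real interface; override it to experiment.
BLACKLIST=${BLACKLIST:-/proc/cio_ignore}

# Check the exit status of the write itself. The braces keep the shell's
# own redirection error message out of the output as well.
free_device() {
    if ! { echo "free $1" > "$BLACKLIST"; } 2> /dev/null; then
        echo "Error: can't free device(s) $1" >&2
        return 1
    fi
}

# Typical call:
#     free_device 0.0.0100
```

Returning a non-zero status lets the caller skip waiting for a device that was never freed.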

Comment 29 Dan Horák 2010-01-07 16:57:15 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > Created an attachment (id=382186) [details] [details]
> > script for unblocking DASD and ZFCP and ZNET devices v6
> > 
> > this is de-bashified version with a fix for device ranges that could be treated
> > wrongly in the previous version  
> 
> Looks good!
> 
> There appears to be a small error though:
> 
>     if [ echo "free $DEV" > $BLACKLIST 2> /dev/null ]; then
> 
> I'm not a dash expert, but shouldn't it look more like this?:
> 
>     if ! echo "free $DEV" > $BLACKLIST 2> /dev/null; then  

Thanks for catching this.

And I have a question again - when does the /sys/bus/ccw/devices/<device id> directory appear for an existing device? Is it (almost) immediately? I ask because there is an issue when the script unblocks non-existing devices and then wants to wait for them, which can take a serious amount of time (number of such devices * 15 sec timeout). So is there something that can be checked before the script starts waiting for the /sys/bus/ccw/devices/<device id>/online file?

Comment 31 Dan Horák 2010-01-08 07:56:26 UTC
Created attachment 382410 [details]
script for unblocking DASD and ZFCP and ZNET devices v7

this version is present in s390utils-1.8.2-7

changes:
- switch to grep -E
- implemented waiting for individual devices in the ranges
- little fixes

Comment 32 Peter Oberparleiter 2010-01-08 08:52:08 UTC
(In reply to comment #29)
> And I have a question again - when does the /sys/bus/ccw/devices/<device id>
> directory appear for an existing device? Is it (almost) immediately? Because
> there an issue when the scripts unblocks non-existing devices and then it wants
> to wait on them and that can take a serious amount of time (nr of such devices
> * 15 sec timeout). So is there something that can be checked before the script
> starts waiting for the /sys/bus/ccw/devices/<device id>/online file?  

The /sys directory for a device may take a potentially arbitrary amount of time to appear. In most cases and on modern hardware, it should be there immediately (<<1s). We've seen cases though, where congestion on a path to the device led to delays of more than 15s, though such long times are very rare.

The 100% solution would be to add a kernel interface through which userspace can wait until all pending kernel work related to CCW device recognition is done. That's what we're working on at the moment, but we're not there yet, so for this release we'll have to work around that problem.

I agree that the total waiting time for multiple non-existent devices might be unacceptably high. Is there a way to consolidate the timeout logic for all devices? If a device isn't present 15 seconds after the script started, there's no use in waiting for it any longer.

Also would it be possible to add a kernel(?) parameter which is parsed by the script through which users can influence the timeout value? In case a customer runs into the mentioned congestion problem, such a parameter could be used to work around the problem of their particular setup.

Comment 33 Dan Horák 2010-01-08 09:16:26 UTC
Peter, thanks for the answer and inspiration. I will change the waiting to accumulate the waiting time from individual devices; once a predefined time (say 60 seconds after sending all "free device" requests) is reached, no additional time will be spent waiting.
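
A shared wait budget of this kind could be sketched as follows. This is not the code from the attachment: SYSFS_CCW, TIMEOUT, POLL_INTERVAL and wait_for_devices are illustrative names, and the budget is counted in polls rather than measured with a clock.

```shell
#!/bin/sh
# Sketch of a global timeout shared by all devices. Names are assumptions.
SYSFS_CCW=${SYSFS_CCW:-/sys/bus/ccw/devices}
TIMEOUT=${TIMEOUT:-60}             # total polls allowed across all devices
POLL_INTERVAL=${POLL_INTERVAL:-1}  # seconds per poll

# Wait for every ID given as an argument; print the IDs that never appeared.
# Returns 0 only if all devices showed up within the shared budget.
wait_for_devices() {
    remaining=$TIMEOUT
    missing=
    for id in "$@"; do
        while [ ! -e "$SYSFS_CCW/$id/online" ] && [ "$remaining" -gt 0 ]; do
            sleep "$POLL_INTERVAL"
            remaining=$((remaining - 1))
        done
        [ -e "$SYSFS_CCW/$id/online" ] || missing="$missing${missing:+ }$id"
    done
    [ -z "$missing" ] && return 0
    echo "$missing"
    return 1
}
```

Once the budget is used up, every remaining device is checked exactly once and not waited for, so many non-existent devices no longer multiply the delay.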

Comment 34 Dan Horák 2010-01-08 11:17:13 UTC
Created attachment 382441 [details]
script for unblocking DASD and ZFCP and ZNET devices v8

this version is present in s390utils-1.8.2-8

changes:
- reworked the waiting logic to use a global timeout
- little fixes

Comment 35 Steffen Maier 2010-01-11 17:57:06 UTC
> if [ $MODE = "dasd" ]; then
>     # process the device list defined as option for the dasd module
>     DEVICES=$(modprobe --showconfig | grep "options[[:space:]]\+dasd_mod" | \
> 	sed -e 's/.*[[:space:]]dasd=\([^[:space:]]*\).*/\1/' -e 's/([^)]*)//g' \
> 	-e 's/nopav\|nofcx\|autodetect\|probeonly//g' -e 's/,,/,/g' -e 's/^,//' -e 's/,$//')
> 
>     free_device $DEVICES

This is the only place where the user may specify a list of device bus IDs or ranges thereof that correspond to different devices (the tuple with znet represents one ccwgroup and is therefore not considered multiple devices). The cio_ignore backend called by free_device parses the comma-separated list of ranges and tries to free each list item during parsing. If the user specifies a list with many valid ranges and at least one invalid range, the kernel returns EINVAL for the entire list but might already have freed some valid ranges from it. Hence device_cio_free won't wait for the valid ranges, because nothing is appended to ALL_DEVICES.

Therefore, I suggest to split the list here and call free_device for each range separately:

    for DEVRANGE in $(echo $DEVICES | tr ',' ' ')
    do
        free_device $DEVRANGE
    done

I did a quick test with the following, where the 2nd to last range is intentionally invalid:
options dasd_mod dasd=nopav,A00,0.0.0A02,400-402,0.1.0500-0.0.0503,0.0.fff0-0.0.fFfF

Other than that, the script seems very solid now. Thank you, Dan!

Comment 36 Dan Horák 2010-01-13 13:57:43 UTC
Steffen, thanks for your note, updated script is available in s390utils-1.8.2-9.el6

Comment 39 Steffen Maier 2010-02-01 16:59:26 UTC
Harald, I was just curious about the integration of device_cio_free into all s390-specific dracut modules and found dracut commit e1603bf7e404c13b4dd7312901e6153f45b8352a. It introduces the following calls:
modules.d/95dasd_mod/parse-dasd-mod.sh: dasd_cio_free
modules.d/95zfcp/parse-zfcp.sh: zfcp_cio_free
modules.d/95znet/parse-ccw.sh: znet_cio_free

They all seem good, thanks. However, I was wondering, if we needed an additional dasd_cio_free at the end of modules.d/95dasd/parse-dasd.sh. I'm not sure the one in parse-dasd-mod.sh is effective, if the user only specified rd_DASD but no rd_DASD_MOD.

Comment 40 Harald Hoyer 2010-02-03 09:17:42 UTC
(In reply to comment #39)
> Harald, I was just curious about the integration of device_cio_free into all
> s390-specific dracut modules and found dracut commit
> e1603bf7e404c13b4dd7312901e6153f45b8352a. It introduces the following calls:
> modules.d/95dasd_mod/parse-dasd-mod.sh: dasd_cio_free
> modules.d/95zfcp/parse-zfcp.sh: zfcp_cio_free
> modules.d/95znet/parse-ccw.sh: znet_cio_free
> 
> They all seem good, thanks. However, I was wondering, if we needed an
> additional dasd_cio_free at the end of modules.d/95dasd/parse-dasd.sh. I'm not
> sure the one in parse-dasd-mod.sh is effective, if the user only specified
> rd_DASD but no rd_DASD_MOD.    

The one in parse-dasd-mod.sh is always effective, although we might want to make sure parse-dasd.sh is executed before parse-dasd-mod.sh.

Comment 41 Jan Stodola 2010-02-03 10:20:42 UTC
After installation with "cio_ignore=all,!0.0.0009" parameter, system doesn't
have network interface working. I did following installation
(RHEL6.0-20100201.4, anaconda-13.21.8-1, s390utils-1.8.2-9, dracut-004-1):

1. make sure you are using network interface in layer2 mode (bug 561017)
2. edit config file, modify for example "LAYER2=1" to "oAYER2=1" - you need to
go interactive and restart zSeries initrd (see bug 558881 comment 6 for more
details)
3. start installation with "cio_ignore=all,!0.0.0009" in parameter file
4. install with default values
5. restart when the installation is done
6. log in and run "ip a" - no eth0 available:

[root@rtt6 etc]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00  
    inet 127.0.0.1/8 scope host lo  
    inet6 ::1/128 scope host   
       valid_lft forever preferred_lft forever  

I will append logs from the system.
Moving back to ASSIGNED.

Comment 42 Harald Hoyer 2010-02-03 10:25:57 UTC
(In reply to comment #41)
> After installation with "cio_ignore=all,!0.0.0009" parameter, system doesn't
> have network interface working. I did following installation
> (RHEL6.0-20100201.4, anaconda-13.21.8-1, s390utils-1.8.2-9, dracut-004-1):
> 
> 1. make sure you are using network interface in layer2 mode (bug 561017)
> 2. edit config file, modify for example "LAYER2=1" to "oAYER2=1" - you need to
> go interactive and restart zSeries initrd (see bug 558881 comment 6 for more
> details)
> 3. start installation with "cio_ignore=all,!0.0.0009" in parameter file
> 4. install with default values
> 5. restart when the installation is done
> 6. log in and run "ip a" - no eth0 available:
> 
> [root@rtt6 etc]# ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00  
>     inet 127.0.0.1/8 scope host lo  
>     inet6 ::1/128 scope host   
>        valid_lft forever preferred_lft forever  
> 
> I will append logs from the system.
> Moving back to ASSIGNED.    

This is _not_ a dracut bug, if eth0 is not available at all...

Comment 43 Harald Hoyer 2010-02-03 10:27:23 UTC
dracut only configures network interfaces if you access the root device over the network.

Comment 44 Jan Stodola 2010-02-03 10:56:01 UTC
I'm sorry, I will file a new bug.
Moving back to ON_QA

Comment 45 Steffen Maier 2010-02-03 12:20:47 UTC
(In reply to comment #41)
> 1. make sure you are using network interface in layer2 mode (bug 561017)
> 2. edit config file, modify for example "LAYER2=1" to "oAYER2=1" - you need to
> go interactive and restart zSeries initrd (see bug 558881 comment 6 for more
> details)

I'm not sure what this step does. If this means edit the CMS conf file (or parm
file if not using a conf file), wouldn't this step then in fact disable layer2,
i.e. enable layer3 mode and cause bug 561017?

> I will append logs from the system.

The content of /proc/cio_ignore after booting the installed system would be
interesting and I suppose that it'll show the equivalent of
"cio_ignore=all,!0.0.0900".

This is because we have integrated device_cio_free into dracut to handle rootfs
devices. But I'm not sure the other necessary integration of *_cio_free into
sysV-init/initscripts/upstart has been completed (see comment 1 and comment 2).
If not, then all the udev triggered rules to configure s390 devices after the
rootfs has been mounted won't work since the devices are ignored and cannot
send udev events to trigger the rules for configuration (in this case
/lib/udev/ccw_init for network).

If so, maybe a clone for initscripts or the like would help to clearly track this dependency.

Comment 46 Jan Stodola 2010-02-03 13:18:53 UTC
(In reply to comment #45)
> (In reply to comment #41)
> > 1. make sure you are using network interface in layer2 mode (bug 561017)
> > 2. edit config file, modify for example "LAYER2=1" to "oAYER2=1" - you need to
> > go interactive and restart zSeries initrd (see bug 558881 comment 6 for more
> > details)
> 
> I'm not sure what this step does. If this means edit the CMS conf file (or parm
> file if not using a conf file), wouldn't this step then in fact disable layer2,
> i.e. enable layer3 mode and cause bug 561017?

zSeries initrd will ask only for LAYER parameter, because there is no LAYER= option in config file, and the rest of parameters are taken from the config file - you don't need to enter all the parameters by hand and you are able to restart configuration questions at the end of zSeries initrd. Bug 561017 was tested with the proper config file.

> The content of /proc/cio_ignore after booting the installed system would be
> interesting and I suppose that it'll show the equivalent of
> "cio_ignore=all,!0.0.0900".

[root@rtt6 proc]# cat cio_ignore
0.0.0000-0.0.0008  
0.0.000a-0.0.3025  
0.0.3027-0.0.3125  
0.0.3127-0.0.ffff  
0.1.0000-0.1.ffff  
0.2.0000-0.2.ffff  
0.3.0000-0.3.ffff  

I will do a clone of this bug for initscripts, since I didn't find bug for that.
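
For reference, a listing like the one above can be checked mechanically. This is only a sketch: is_ignored is a hypothetical helper (not part of s390utils) and it assumes fully qualified bus IDs and ranges that stay within one subchannel set, as in the output shown.

```shell
#!/bin/sh
# is_ignored BUSID: read cio_ignore-style ranges on stdin and succeed if the
# fully qualified BUSID (css.ssid.devno) falls inside one of them.
is_ignored() {
    id=$1
    while read -r range; do
        from=${range%-*}
        to=${range#*-}
        # Only compare device numbers within the same css.ssid prefix.
        [ "${from%.*}" = "${id%.*}" ] || continue
        if [ $((0x${id##*.})) -ge $((0x${from##*.})) ] &&
           [ $((0x${id##*.})) -le $((0x${to##*.})) ]; then
            return 0
        fi
    done
    return 1
}

# Listing copied from the comment above.
LIST="0.0.0000-0.0.0008
0.0.000a-0.0.3025
0.0.3027-0.0.3125
0.0.3127-0.0.ffff
0.1.0000-0.1.ffff
0.2.0000-0.2.ffff
0.3.0000-0.3.ffff"

for id in 0.0.0009 0.0.0a00 0.0.3026; do
    if echo "$LIST" | is_ignored "$id"; then
        echo "$id ignored"
    else
        echo "$id free"
    fi
done
```

On a real system the list would come from /proc/cio_ignore instead of the LIST variable.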

Comment 47 Jan Stodola 2010-02-03 13:29:13 UTC
clone of this bug for initscript part: bug 561339

Comment 48 Jan Stodola 2010-03-24 13:56:10 UTC
Created attachment 402311 [details]
init.log

While testing the network part of this bug, I found that dracut doesn't bring the network interface up. I'm not sure whether it is a dracut issue or an issue with my setup.

steps I did:
1. install rhel without cio_ignore parameter, install on dasd disks
2. install dracut-network
3. run dracut to rebuild initramfs with network support:
dracut --force /boot/initramfs-2.6.32-19.el6.s390x.img `uname -r`
4. after reboot, modify /etc/zipl.conf and add lines:

[nfs2]
        image=/boot/vmlinuz-2.6.32-19.el6.s390x
        ramdisk=/boot/initramfs-2.6.32-19.el6.s390x.img
        parameters="root=10.16.105.196:/nfs/nfs_root cio_ignore=all,!0.0.0009 rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ip=10.16.105.197:10.16.105.196:10.16.111.254:255.255.248.0:rtt6.s390.bos.redhat.com:eth0:none rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rdshell rdinitdebug"

5. run zipl
6. reboot and select "nfs2" to boot
7. dracut fails to boot, eth0 is not up:

dracut:/# ip a
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN  
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

cat /proc/cio_ignore
0.0.0000-0.0.0008 
0.0.000a-0.0.3025 
0.0.3027-0.0.3125 
0.0.3127-0.0.3225 
0.0.3227-0.0.3325 
0.0.3327-0.0.3425 
0.0.3427-0.0.3525 
0.0.3527-0.0.3625 
0.0.3627-0.0.3725 
0.0.3727-0.0.ffff 
0.1.0000-0.1.ffff 
0.2.0000-0.2.ffff 
0.3.0000-0.3.ffff 

init.log is in attachment.

The same configuration without cio_ignore=all,!0.0.0009 brings the network interface up.

The DASD and zFCP parts of this bug are working fine for me.

Comment 49 Harald Hoyer 2010-03-24 14:23:44 UTC
cio_ignore=all,!0.0.0009

shouldn't this be: cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02

for rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02 to work??

Comment 50 Dan Horák 2010-03-24 14:54:58 UTC
(In reply to comment #49)
> cio_ignore=all,!0.0.0009
> 
> shouldn't this be: cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02

I think it should be only cio_ignore=all,!0.0.0009 and the rest is freed automatically with *_cio_free
 
> for rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02 to work??    

If I remember the dracut code correctly this string is copied into /etc/ccw.conf and so there is probably a problem with parsing it in the znet_cio_free script.

Comment 51 Jan Stodola 2010-03-24 14:57:24 UTC
With "cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02", the network
interface eth0 still doesn't come up, although /proc/cio_ignore looks fine to me:

cat /proc/cio_ignore
0.0.0000-0.0.0008 
0.0.000a-0.0.09ff 
0.0.0a03-0.0.3025 
0.0.3027-0.0.3125 
0.0.3127-0.0.3225 
0.0.3227-0.0.3325 
0.0.3327-0.0.3425 
0.0.3427-0.0.3525 
0.0.3527-0.0.3625 
0.0.3627-0.0.3725 
0.0.3727-0.0.ffff 
0.1.0000-0.1.ffff 
0.2.0000-0.2.ffff 
0.3.0000-0.3.ffff 

dracut:/# ip a
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN  
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 

Then I tried to run "udevadm trigger" and "ip a":

dracut:/# udevadm trigger
dracut:/# qeth: register layer 2 discipline 
qdio: 0.0.0a02 OSA on SC 2 using AI:1 QEBSM:0 PCI:1 TDD:1 SIGA:RW AO  
qeth 0.0.0a00: MAC address 02:00:00:00:00:07 successfully registered on device eth0
qeth 0.0.0a00: Device is a Guest LAN QDIO card (level: V543) 
with link type GuestLAN QDIO (portname: FOOBAR) 

ip a
1: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN  
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 
2: eth0: <BROADCAST,MULTICAST> mtu 1492 qdisc noop state DOWN qlen 1000 
    link/ether 02:00:00:00:00:07 brd ff:ff:ff:ff:ff:ff 


Anyway, the line
rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
should contain all the information dracut needs to free the network device,
shouldn't it? For DASD disks there is no need to add any additional devices to
the cio_ignore= list; this boot entry works for DASD:

[linux2]
        image=/boot/vmlinuz-2.6.32-19.el6.s390x
        ramdisk=/boot/initramfs-2.6.32-19.el6.s390x.img
        parameters="root=/dev/disk/by-path/ccw-0.0.3126-part1 rd_DASD=0.0.3126,use_diag=0,readonly=0,erplog=0,failfast=0 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us cio_ignore=all,!0.0.0009"

Comment 52 Steffen Maier 2010-03-24 19:19:21 UTC
(In reply to comment #48)

+ [ -e /cmdline/30parse-ccw.sh ]
+ . /cmdline/30parse-ccw.sh
+ getargs rd_CCW=
+ local o line found
+ [ -z root=10.16.105.196:/nfs/nfs_root cio_ignore=all,!0.0.0009 rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ip=10.16.105.197:10.16.105.196:10.16.111.254:255.255.248.0:rtt6.s390.bos.redhat.com:eth0:none rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rdshell rdinitdebug BOOT_IMAGE=4  ]

+ [ rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR = rd_CCW= ]
+ [ rd_CCW = rd_CCW ]
+ echo -n qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
+ found=1

+ [ -n 1 ]
+ return 0

$(getargs 'rd_CCW=') returns successfully with hopefully at least one ccw_arg.

+ echo qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR 

modules.d/95znet/parse-ccw.sh correctly appends one line to /etc/ccw.conf
and calls znet_cio_free.

+ znet_cio_free 
ALL_DEVICES='' 

PROBLEM NO. 1:
Yet, znet_cio_free does not find anything in its internal ALL_DEVICES variable
which leads to unwanted output (c.f. bug 570763) but that's not the problem here.

Jan, could you please provide output of the following commands at the rdshell?:

ls -lF /etc/ccw.conf
cat /etc/ccw.conf
ls -laF /sys/bus/ccwgroup/devices/
sh -x znet_cio_free

Dan, I suspect this:

# grep 2 or 3 channels from beginning of each line
DEVICES=$(echo $line | grep -E -i -o "^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)")

I think we should not match from the beginning of the line since it begins
with the driver name, "qeth," in the case here.
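
A small sketch of the suspected effect, using the /etc/ccw.conf line from this report. The unanchored variant is only one possible correction and not necessarily what s390utils ends up using; the variable names are mine.

```shell
#!/bin/sh
line='qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR'
# 2 or 3 comma-separated device bus IDs, as in the current script.
IDS='([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)'

# Anchored as in the current script: the leading "qeth," prevents any match.
ANCHORED=$(echo "$line" | grep -E -i -o "^$IDS" || true)
# Unanchored: the first run of 2 or 3 bus IDs is picked up wherever it starts.
UNANCHORED=$(echo "$line" | grep -E -i -o "$IDS" || true)

echo "anchored:   '$ANCHORED'"
echo "unanchored: '$UNANCHORED'"
```

An alternative would be to strip the driver name first and keep the anchor.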

(In reply to comment #51)
> dracut:/# udevadm trigger
> dracut:/# qeth: register layer 2 discipline 
> qdio: 0.0.0a02 OSA on SC 2 using AI:1 QEBSM:0 PCI:1 TDD:1 SIGA:RW AO  
> qeth 0.0.0a00: MAC address 02:00:00:00:00:07 successfully registered on device eth0
> qeth 0.0.0a00: Device is a Guest LAN QDIO card (level: V543) 
> with link type GuestLAN QDIO (portname: FOOBAR) 

PROBLEM NO. 2:
Even when working around problem 1, udev does not bring up the ccwgroup
correctly and it almost seems as if it needs to be triggered twice.
Wasn't there a similar issue requiring multiple udevadm trigger runs for network
in anaconda?

> Anyway, the line
> rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
> should contain all the information dracut needs to free the network device,
> shouldn't it?

Yes, rd_DASD, rd_DASD_MOD, rd_ZFCP, and rd_CCW (which should btw have been named rd_ZNET along with /etc/znet.conf in dracut) contain everything necessary for *_cio_free, namely the device bus IDs.

Comment 53 Jan Stodola 2010-03-24 21:51:31 UTC
(In reply to comment #52)
> Jan, could you please provide output of the following commands at the rdshell?:
> 
> ls -lF /etc/ccw.conf
> cat /etc/ccw.conf
> ls -laF /sys/bus/ccwgroup/devices/
> sh -x znet_cio_free

dracut:/# ls -lF /etc/ccw.conf
-rw-r--r-- 1 root root 57 Mar 24 21:32 /etc/ccw.conf 

dracut:/# cat /etc/ccw.conf
qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR 

dracut:/# ls -laF /sys/bus/ccwgroup/devices/
ls: cannot access /sys/bus/ccwgroup/devices/: No such file or directory

ls -laF /sys/bus/
total 0 
drwxr-xr-x  7 root root 0 Mar 24 21:32 ./ 
drwxr-xr-x 13 root root 0 Mar 24 21:32 ../ 
drwxr-xr-x  4 root root 0 Mar 24 21:32 ccw/ 
drwxr-xr-x  4 root root 0 Mar 24 21:32 css/ 
drwxr-xr-x  4 root root 0 Mar 24 21:32 iucv/ 
drwxr-xr-x  4 root root 0 Mar 24 21:32 platform/
drwxr-xr-x  4 root root 0 Mar 24 21:32 virtio/ 

dracut:/# sh -x znet_cio_free
sh: Can't open znet_cio_free 

dracut:/sbin# sh -x /sbin/znet_cio_free
+ DASDCONFIG=/etc/dasd.conf 
+ ZFCPCONFIG=/etc/zfcp.conf 
+ ZNETCONFIG=/etc/ccw.conf 
+ BLACKLIST=/proc/cio_ignore 
+ VERBOSE= 
+ PATH=/bin:/usr/bin:/sbin:/usr/sbin 
+ ALL_DEVICES= 
+ WAITING_TIMEOUT=60 
+ WAITING_TOTAL=0 
+ basename /sbin/znet_cio_free 
+ CMD=znet_cio_free 
+ CONFIG=/etc/ccw.conf 
+ MODE=znet 
+ [ 0 -gt 0 ]
+ [ ! -f /proc/cio_ignore ]
+ [ znet = dasd -o znet = zfcp ]
+ [ znet = dasd ]
+ [ znet = znet ]
+ [ -f /etc/ccw.conf ]
+ read line
+ [ -z qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ]
+ echo qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
+ grep -E -i -o ^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)
+ DEVICES=
+ free_device
+ local DEV
+ [ -z  ]
+ return
+ read line
+ grep -E -i -h ^[[:space:]]*SUBCHANNELS=['"]?([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)['"]?([[:space:]]+#|[[:space:]]*$) /etc/sysconfig/network-scripts/ifcfg-*
+ OLD_IFS=   
 
+ IFS=, 
+ set 
ALL_DEVICES='' 
BLACKLIST='/proc/cio_ignore' 
BOOT_IMAGE='4' 
CMD='znet_cio_free' 
CONFIG='/etc/ccw.conf' 
DASDCONFIG='/etc/dasd.conf' 
DEVICES='' 
HOME='/' 
IFS=',' 
KEYTABLE='us' 
LANG='en_US.UTF-8' 
MODE='znet' 
OLDPWD='/' 
OLD_IFS='   
' 
OPTIND='1' 
PATH='/bin:/usr/bin:/sbin:/usr/sbin' 
PPID='663' 
PS1='dracut:${PWD}# ' 
PS2='> ' 
PS4='+ ' 
PWD='/sbin' 
SYSFONT='latarcyrheb-sun16' 
TERM='linux' 
VERBOSE='' 
WAITING_TIMEOUT='60' 
WAITING_TOTAL='0' 
ZFCPCONFIG='/etc/zfcp.conf' 
ZNETCONFIG='/etc/ccw.conf' 
line='' 
rd_CCW='qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR' 

Module qeth is not loaded at this moment, is it expected behavior?

cat /proc/modules
dasd_eckd_mod 87759 24 - Live 0x000003e000c04000 
dasd_mod 101128 9 dasd_eckd_mod, Live 0x000003e000b8f000 
dm_mod 104990 0 - Live 0x000003e000b3c000 
nfs 468197 0 - Live 0x000003e000a78000 
lockd 103088 1 nfs, Live 0x000003e000999000 
fscache 65381 1 nfs, Live 0x000003e00094f000 
nfs_acl 3845 1 nfs, Live 0x000003e000926000 
auth_rpcgss 57161 1 nfs, Live 0x000003e000908000 
sunrpc 309132 5 nfs,lockd,nfs_acl,auth_rpcgss, Live 0x000003e000888000

Comment 54 Steffen Maier 2010-03-25 12:08:08 UTC
(In reply to comment #53)
> dracut:/# ls -laF /sys/bus/ccwgroup/devices/
> ls: cannot access /sys/bus/ccwgroup/devices/: No such file or directory

OK, this directory does not exist, because no network device driver using ccwgroup has been loaded which is in turn due to problem no. 2 of comment 52.

> dracut:/sbin# sh -x /sbin/znet_cio_free

> + [ -f /etc/ccw.conf ]
> + read line
> + [ -z qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ]
> + echo qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
> + grep -E -i -o ^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)
> + DEVICES= 

Dan, I think this confirms my suspicion that we should not match for beginning of line in above grep expression.

> ALL_DEVICES='' 

> Module qeth is not loaded at this moment, is it expected behavior?

Assuming the VM guest has qeth devices configured and they are no longer ignored by cio_ignore, then udev should have loaded qeth automatically by means of modalias. Additionally, dracut's modules.d/95znet/55-ccw.rules should have triggered modules.d/95znet/ccw_init to establish a ccwgroup and set it online, in order to provide eth0.

My assumption can be confirmed by looking if the following sysfs symlinks exist and point to ../../../../bus/ccw/drivers/qeth:
/sys/bus/ccw/devices/0.0.0a00/driver
/sys/bus/ccw/devices/0.0.0a01/driver
/sys/bus/ccw/devices/0.0.0a02/driver
And to confirm that modalias is there:
cat /sys/bus/ccw/devices/0.0.0a00/modalias
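
The checks above could be bundled into one helper for the rdshell. This is a sketch only: check_ccw_driver is a hypothetical function of mine, and the sysfs root is passed as a parameter so it can be tried outside of s390 as well.

```shell
#!/bin/sh
# check_ccw_driver SYSFS_ROOT DRIVER BUSID...
# Report for each bus ID whether the device exists and is bound to DRIVER.
check_ccw_driver() {
    root=$1
    want=$2
    shift 2
    rc=0
    for id in "$@"; do
        dev="$root/bus/ccw/devices/$id"
        if [ ! -e "$dev" ]; then
            # Device not sensed at all, e.g. still listed in cio_ignore.
            echo "$id: not present"
            rc=1
        elif [ -L "$dev/driver" ] &&
             [ "$(basename "$(readlink "$dev/driver")")" = "$want" ]; then
            echo "$id: bound to $want"
        else
            echo "$id: present but not bound to $want"
            rc=1
        fi
    done
    return $rc
}

check_ccw_driver /sys qeth 0.0.0a00 0.0.0a01 0.0.0a02 ||
    echo "at least one subchannel is missing or not bound to qeth"
```

A "not present" result points at cio_ignore, whereas "present but not bound" points at module loading.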

Jan, maybe Harald can figure out what's wrong with regard to problem no. 2 by looking at a udev debug trace (full z/VM console log with rdudevdebug and maybe rdudevinfo boot option).
The anaconda udevadm trigger issue with network I mentioned above was here (maybe this also helps Harald; I haven't understood the cause nor the conclusion):
https://www.redhat.com/archives/anaconda-devel-list/2009-July/msg00387.html
referring to bug 514501 comment 4.

Comment 55 Dan Horák 2010-03-25 13:08:26 UTC
(In reply to comment #54)
> > + [ -f /etc/ccw.conf ]
> > + read line
> > + [ -z qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ]
> > + echo qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR
> > + grep -E -i -o ^([0-9]\.[0-9]\.[a-f0-9]+,){1,2}([0-9]\.[0-9]\.[a-f0-9]+)
> > + DEVICES= 
> 
> Dan, I think this confirms my suspicion that we should not match for beginning
> of line in above grep expression.

Right and it's also documented in dracut's man page. It's already fixed in my development version.

Comment 56 Jan Stodola 2010-03-25 15:11:51 UTC
> > Module qeth is not loaded at this moment, is it expected behavior?
> 
> Assuming the VM guest has qeth devices configured and they are no longer
> ignored by cio_ignore, then udev should have loaded qeth automatically by means
> of modalias. Additionally, dracut's modules.d/95znet/55-ccw.rules should have
> triggered modules.d/95znet/ccw_init to establish a ccwgroup and set it online,
> in order to provide eth0.
> 
> My assumption can be confirmed by looking if the following sysfs symlinks exist
> and point to ../../../../bus/ccw/drivers/qeth:
> /sys/bus/ccw/devices/0.0.0a00/driver
> /sys/bus/ccw/devices/0.0.0a01/driver
> /sys/bus/ccw/devices/0.0.0a02/driver
> And to confirm that modalias is there:
> cat /sys/bus/ccw/devices/0.0.0a00/modalias

These directories do not exist:
dracut:/# ls /sys/bus/ccw/devices
0.0.0009  0.0.3126  0.0.3326  0.0.3526 0.0.3726
0.0.3026  0.0.3226  0.0.3426  0.0.3626

Comment 57 Steffen Maier 2010-03-25 15:30:28 UTC
(In reply to comment #56)
> These directories do not exist:
> dracut:/# ls /sys/bus/ccw/devices
> 0.0.0009  0.0.3126  0.0.3326  0.0.3526 0.0.3726
> 0.0.3026  0.0.3226  0.0.3426  0.0.3626    

Even if you circumvent problem no. 1 by using cio_ignore=all,!0.0.0009,!0.0.0a00-0.0.0a02 at IPL time as you did in comment 51?

Comment 58 Jan Stodola 2010-03-25 15:48:09 UTC
with cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02

dracut:/# ls -l /sys/bus/ccw/devices/0.0.0a00/driver
lrwxrwxrwx 1 root root 0 Mar 25 15:53 /sys/bus/ccw/devices/0.0.0a00/driver -> ../../../../bus/ccw/drivers/qeth
dracut:/# ls -l /sys/bus/ccw/devices/0.0.0a01/driver
lrwxrwxrwx 1 root root 0 Mar 25 15:54 /sys/bus/ccw/devices/0.0.0a01/driver -> ../../../../bus/ccw/drivers/qeth
dracut:/# ls -l /sys/bus/ccw/devices/0.0.0a02/driver
lrwxrwxrwx 1 root root 0 Mar 25 15:54 /sys/bus/ccw/devices/0.0.0a02/driver -> ../../../../bus/ccw/drivers/qeth
dracut:/# cat /sys/bus/ccw/devices/0.0.0a00/modalias
ccw:t1731m01dt1732dm01

Comment 59 Steffen Maier 2010-03-25 18:41:44 UTC
Phew, I was already worried something really weird was going on.
rdudevdebug and rdudevinfo should help us shed light onto problem no. 2.

Comment 60 Jan Stodola 2010-03-25 21:40:11 UTC
Created attachment 402692 [details]
rdudevinfo

Messages from boot with rdudevinfo parameter.

Comment 61 Steffen Maier 2010-03-25 23:41:29 UTC
Could you please replace attachment 402692 [details] with a console log having cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02
AND rdudevdebug plus rdinitdebug at the beginning of the kernel command line
(and possibly compressed with gzip since it might get large)?

We have to circumvent problem no. 1, otherwise dracut will never see any qeth devices.
rdudevinfo does not give enough info to track all the output.



Independent of that, I see some unrelated oddities in the console log
(which might be worth new bug(s) against dracut):

Kernel command line: root=10.16.105.196:/nfs/nfs_root cio_ignore=all,!0.0.0009 rd_CCW=qeth,0.0.0a00,0.0.0a01,0.0.0a02,layer2=1,portname=FOOBAR ip=10.16.105.197:10.16.105.196:10.16.111.254:255.255.248.0:rtt6.s390.bos.redhat.com:eth0:none rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rdshell rdinitdebug rdudevinfo BOOT_IMAGE=4

Even though the kernel command line does not specify any rd_DASD*, six DASDs seem to be freed automatically from cio_ignore, the dasd device driver gets loaded automatically, and the DASDs set online successfully. This is definitely not intended. The only thing I can imagine is that /etc/dasd.conf (or even /etc/modprobe.conf or /etc/modprobe.d/** with some "options dasd_mod dasd=...") was copied from your installed system into the initramfs even though you correctly did not specify hostonly mode on generating initramfs and modules.d/95dasd/install does only copy the file in hostonly mode.

Also, I'm missing lots of dracut shell trace e.g. "getarg rd_CCW" and all the other dracut args of the above cited kernel command line. The only getarg visible in rdinitdebug output is "getarg rdshell" at the end. But then again, dracut seems to do something with rd_NO_LUKS, rd_NO_LVM, and rd_NO_MD.
Ahhh, maybe rdinitdebug is position dependent and should be moved to the beginning of the kernel command line together with rdudevdebug???

Comment 62 Jan Stodola 2010-03-26 08:34:50 UTC
Created attachment 402768 [details]
rdudevdebug rdinitdebug

console log with kernel parameters rdudevdebug rdinitdebug cio_ignore=all,!0.0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02

Comment 63 Steffen Maier 2010-03-26 11:16:30 UTC
The issue reported by Jan in comment 48 actually consists of two sub-problems:

Problem 1 is a bug in s390utils' dasd_cio_free and therefore not related to dracut. However, we have been discussing *_cio_free in this bug here, so problem 1 actually belongs here.

Problem 2 is the same as in bug 561926 comment 26, i.e. related to s390utils' ccw_init (assuming bug 539491 has been fixed) and therefore not related to dracut. Since I don't have a recent compose, I cannot look into it to see which of those dependent bugs have already been fixed and incorporated. I hesitate to make this bug here dependent on 561926 and 539491 since that would weave unrelated stuff together.

### analysis of debug log:

udevd[185]: 'ACTION=add' added
udevd[185]: 'DEVPATH=/devices/css0/0.0.0000/0.0.0a00' added
udevd[185]: 'MODALIAS=ccw:t1731m01dt1732dm01' added
...
udevd[185]: 'ACTION=add' added
udevd[185]: 'DEVPATH=/devices/css0/0.0.0001/0.0.0a01' added
udevd[185]: 'MODALIAS=ccw:t1731m01dt1732dm01' added
...
udevd[185]: 'ACTION=add' added
udevd[185]: 'DEVPATH=/devices/css0/0.0.0002/0.0.0a02' added
udevd[185]: 'MODALIAS=ccw:t1731m01dt1732dm01' added

### Udevadm trigger correctly generates add events for the qeth subchannels.

udevd-work[223]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 78: cannot create /sys/bus/ccwgroup/drivers/qeth/group: Directory nonexistent'

udevd-work[224]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 78: cannot create /sys/bus/ccwgroup/drivers/qeth/group: Directory nonexistent'

udevd-work[226]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 78: cannot create /sys/bus/ccwgroup/drivers/qeth/group: Directory nonexistent'

udevd-work[223]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 87: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/layer2: Directory nonexistent'
udevd-work[223]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 93: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/portname: Directory nonexistent'
udevd-work[223]: 'ccw_init' returned with exitcode 1
...
udevd-work[223]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' started
...
udevd-work[223]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' returned with exitcode 0

qeth: loading core functions

udevd-work[224]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 87: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/layer2: Directory nonexistent'
udevd-work[224]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 93: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/portname: Directory nonexistent'
udevd-work[224]: 'ccw_init' returned with exitcode 1
...
udevd-work[224]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' started
...
udevd-work[224]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' returned with exitcode 0

udevd-work[226]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 87: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/layer2: Directory nonexistent'
udevd-work[226]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 93: cannot create /sys/bus/ccwgroup/drivers/qeth/0.0.0a00/portname: Directory nonexistent'
udevd-work[226]: 'ccw_init' returned with exitcode 1
...
udevd-work[226]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' started
...
udevd-work[226]: '/sbin/modprobe -b ccw:t1731m01dt1732dm01' returned with exitcode 0

### For all three subchannels 0a00, 0a01, and 0a02 separate udevd workers run in parallel. Unfortunately ccw_init gets triggered before the qeth driver has been loaded successfully. That's simply the wrong order and therefore it doesn't work.
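
One conceivable way to tolerate this ordering, sketched here under my own assumptions and not taken from ccw_init or any actual fix: load the driver first and wait, with a bounded timeout, until its ccwgroup "group" attribute exists before writing to it. wait_for_path is a hypothetical helper.

```shell
#!/bin/sh
# wait_for_path PATH TIMEOUT_SECONDS
# Poll once per second until PATH exists; fail after TIMEOUT_SECONDS.
wait_for_path() {
    waited=0
    while [ ! -e "$1" ]; do
        [ "$waited" -ge "$2" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Intended use on s390 before grouping the subchannels (not executed here):
#   modprobe -b qeth
#   wait_for_path /sys/bus/ccwgroup/drivers/qeth/group 10 || exit 1

# Harmless demonstration with a path that always exists.
wait_for_path / 1 && echo "path present"
```

Whether polling is acceptable inside a udev helper is a separate question; ordering the module load ahead of ccw_init would avoid the wait altogether.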

Comment 64 Dan Horák 2010-03-29 16:01:09 UTC
The /etc/ccw.conf parsing is fixed in s390utils-1.8.2-15.el6

Comment 65 Steffen Maier 2010-03-29 16:07:48 UTC
Hi Dan, the following code fragment of device_cio_free seems to exit early if freeing one device type fails, e.g. due to some non-existent device. Suppose one DASD does not exist or takes too long to free and we run into the timeout; then no ZFCP or network devices would be freed any more.

I think it should still call free for all device types. It could remember whether any of the three calls returned a non-zero exit status and finally return an appropriate non-zero status itself.

if [ $MODE = "all" ]; then
    # shortcut for calling all 3 scripts
    $DIR/dasd_cio_free $ARGS || exit $?
    $DIR/zfcp_cio_free $ARGS || exit $?
    $DIR/znet_cio_free $ARGS || exit $?
    exit 0
fi
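
What I have in mind could look roughly like the following sketch. The three shell functions are hypothetical stubs so that it runs anywhere; with DIR set to the real installation directory the installed scripts would be called instead.

```shell
#!/bin/sh
# Run all three helpers even if one fails; report the last non-zero status.
free_all() {
    rc=0
    for script in dasd_cio_free zfcp_cio_free znet_cio_free; do
        # With an empty DIR the bare name is used, which resolves to the
        # stub functions below.
        "${DIR:+$DIR/}$script" "$@" || rc=$?
    done
    return $rc
}

# Hypothetical stubs: the DASD step fails, the other two succeed.
DIR=""
dasd_cio_free() { return 2; }
zfcp_cio_free() { echo "zfcp freed"; }
znet_cio_free() { echo "znet freed"; }

OUT=$(free_all) && STATUS=0 || STATUS=$?
echo "$OUT"
echo "exit status: $STATUS"
```

The zfcp and znet steps still run although the dasd step failed, and the failure is still visible in the final exit status.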

Comment 66 Dan Horák 2010-03-29 16:57:17 UTC
(In reply to comment #65)
> Hi Dan, the following code fragment of device_cio_free seems to exit early if
> freeing one device type fails, e.g. due to some non-existent device. Suppose
> one DASD does not exist or takes too long to free and we run into the timeout;
> then no ZFCP or network devices would be freed any more.

Hi Steffen, device_cio_free never exits with a non-zero when there is a problem with a device, it can reach a timeout for some devices, but still returns a zero. It must check also the devices (with a success) that are listed after the timed out ones.

Did you get the new version from the public repo for the s390(x) Fedora utils (http://fedorapeople.org/gitweb?p=sharkcz/public_git/utils.git;a=summary)?

Comment 67 Denise Dumas 2010-03-29 19:00:49 UTC
Dan, should this bz be assigned to s390-utils, or stay with dracut?

Comment 68 Steffen Maier 2010-03-30 06:09:30 UTC
(In reply to comment #67)
> Dan, should this bz be assigned to s390-utils, or stay with dracut?    

IMHO, this bug here treats both dracut (originally) and s390utils (started later on). Similar to bug 561339, the dracut part should strictly speaking be called "[LTC 6.0 FEAT] 201085: unmask and wait for devices, dracut part" and we still need it to track the dependency on that package. For the s390utils part this bug here already has the corresponding description.

Maybe we could clone this bug for s390utils and then rename the title in this original dracut bug?

Comment 69 Steffen Maier 2010-03-30 06:15:32 UTC
(In reply to comment #66)
> Hi Steffen, device_cio_free never exits with a non-zero when there is a problem
> with a device, it can reach a timeout for some devices, but still returns a
> zero. It must check also the devices (with a success) that are listed after the
> timed out ones.

Ah, OK, I missed that. Thanks for pointing out. Everything's fine then.

> Did you get the new version from the public repo for the s390(x) Fedora utils
> (http://fedorapeople.org/gitweb?p=sharkcz/public_git/utils.git;a=summary)?    

Yes, already cloned. Thanks a lot for having shared this on fedora-s390x!

Comment 70 Steffen Maier 2010-04-01 12:33:40 UTC
device_cio_free has implicit dependencies on various external tools such as:
echo, sleep, modprobe, grep, printf, seq.

To make things worse, device_cio_free is used in 3 different environments:
1) in anaconda [bug 558881, bug 533492, bug 576015]: dependencies are in anaconda's rpm, upd-instroot, and mk-images (for linuxrc.s390).
2) in dracut [bug 533494]: dependencies are coded in dracut module's install script and probably dracut's rpm in the first place as well.
3) in initscripts/upstart [bug 561339]: dependencies would need to be fetched on install of s390utils-base by rpm requires.

Since dependencies seem only partly explicitly specified, there are cases where some of the external tools are not available and device_cio_free therefore fails,
e.g. bug 558881 comment 30.

Can we make all those dependencies explicit for each environment and/or reduce the number of external tools by using dash builtins where possible?
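
As an illustration of both ideas, here is a sketch with two hypothetical helpers (neither exists in device_cio_free today): one reports which external tools are missing in the current environment, the other replaces seq with shell builtins only.

```shell
#!/bin/sh
# missing_tools TOOL...: print the tools that cannot be found in this
# environment, separated by spaces.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    echo "${missing# }"
}

# count_to N: builtin-only replacement for "seq 1 N".
count_to() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

echo "missing: $(missing_tools echo sleep modprobe grep printf seq)"
count_to 3
```

A check like missing_tools at the top of the script would at least turn a silent failure into a clear message in each of the three environments.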

Comment 71 Phil Knirsch 2010-04-13 12:48:56 UTC
Wouldn't the least error-prone way be to simply have s390utils-base require all the necessary components? As each of the above-mentioned components will need to require s390utils-base for s390x anyway, that would ensure the existence of the necessary packages, right?

The problem with image generation is distinct from those requirements though. Those are and will always be script related issues, be it anaconda image generation or dracut. So basically only 3) can be really fixed via dependencies, 1) and 2) will always need sync ups with possible changes in device_cio_free requirements i guess.

Or am i missing something here?

Just my $0.02

Thanks & regards, Phil

Comment 72 Steffen Maier 2010-04-13 17:45:32 UTC
(In reply to comment #71)
> The problem with image generation is distinct from those requirements though.
> Those are and will always be script related issues, be it anaconda image
> generation or dracut. So basically only 3) can be really fixed via
> dependencies, 1) and 2) will always need sync ups with possible changes in
> device_cio_free requirements I guess.

Exactly, 3) is necessary but not sufficient for 1) and 2).
I just did the overview in comment 70 so we won't forget any place where dependencies occur.

Comment 73 Harald Hoyer 2010-04-20 14:09:15 UTC
(In reply to comment #70)
> Since dependencies seem only partly explicitly specified, there are cases where
> some of the external tools are not available and device_cio_free therefore
> fails,
> e.g. bug 558881 comment 30.

seq will be installed in initramfs since dracut-004-12.el6

Comment 76 Jan Stodola 2010-04-27 17:03:22 UTC
Created attachment 409528 [details]
rdudevdebug rdinitdebug

I'm still unable to boot with rootfs on network device (iSCSI disk) when booting with cio_ignore=all,!0.0.0009 parameter. (It works without the cio_ignore parameter).

Attaching boot log with "cio_ignore=all,!0.0009,!0.0.0a00,!0.0.0a01,!0.0.0a02 rdshell rdudevdebug rdinitdebug" parameter.

Tested on build RHEL6.0-20100422.12 with dracut-004-17.el6, s390utils-1.8.2-18.el6

Moving back to ASSIGNED.

Comment 77 Steffen Maier 2010-04-28 00:28:56 UTC
Did anybody address problem 2 of comment 63?
It doesn't look like it, because I still see those lines:

udevd-work[176]: '/lib/udev/ccw_init' (stderr) '/lib/udev/ccw_init: 73: cannot create /sys/bus/ccwgroup/drivers/qeth/group: Directory nonexistent'
...
udevd-work[176]: 'ccw_init' returned with exitcode 1

Until that is fixed, it does not make sense to retest.
Jan, you could work around the too-late module loading by booting with rdbreak=cmdline or rdbreak=pre-udev, issuing "modprobe qeth" in the debug shell, and then exiting the shell to continue dracut initramfs execution.

===

Is Dan's fix of comment 55 in the compose?
If so, then you don't need the workaround with manually adapting the default cio_ignore anymore, Jan. cio_ignore=all,!0.0009 should just work.

===

FYI: I doubt "ifname=eth0:02:00:00:00:00:05" will work, since OSA layer2 will come up with a random MAC address each time so you cannot match for MAC address. That's why linuxrc pins the first random MAC address in the ifcfg file as MACADDR so it'll continue using the same MAC on each reboot. However, that doesn't help with dracut here.

Comment 78 Dan Horák 2010-04-28 08:45:31 UTC
Created attachment 409723 [details]
use ccw-init and ccw rules from s390utils in dracut

I've committed and built s390utils-1.8.2-19.el6 with the merged ccw_init script that should work in both initramfs and normal startup.

re comment 63:
there is a delay loop before trying to group the channel in the ccw_init script; it should help if it's only a timing issue
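
A minimal sketch of what such a delay loop could look like, assuming a POSIX shell: wait_for_path and group_channels are hypothetical names, and the retry count and 1-second pause are assumptions, not the values used in the real ccw_init script.

```shell
#!/bin/sh
# Hypothetical sketch; the real ccw_init loop may differ in limits and paths.

# Wait until path $1 exists, checking up to $2 times with a 1-second
# pause between checks. Returns 0 once the path exists, 1 on timeout.
wait_for_path() {
    tries=0
    while [ ! -e "$1" ]; do
        tries=$((tries + 1))
        [ "$tries" -ge "$2" ] && return 1
        sleep 1
    done
    return 0
}

# Write the subchannel list $2 to the sysfs group attribute $1, but only
# after the attribute has appeared (i.e. the driver module is loaded).
# $3 is the optional number of checks (assumed default: 5).
group_channels() {
    wait_for_path "$1" "${3:-5}" || return 1
    echo "$2" > "$1"
}
```

With the path from the error message in comment 77, a call would look like: group_channels /sys/bus/ccwgroup/drivers/qeth/group 0.0.0a00,0.0.0a01,0.0.0a02. Note that this only helps if the qeth module gets loaded at some point within the waiting period.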

re comment 55:
the fix has been present in s390utils for some time

Comment 82 Jan Stodola 2010-06-17 11:52:09 UTC
Tested many times during RHEL6 Snapshot testing and no issues found with recent trees:

 * root fs on zFCP LUN tested on build RHEL6.0-20100531.1 (dracut-004-20.el6, s390utils-1.8.2-24.el6)
 * root fs on DASD tested on build RHEL6.0-20100615.0 (dracut-004-20.1.el6, s390utils-1.8.2-24.el6)
 * root fs on iSCSI (=network part of this bug) tested on build RHEL6.0-20100615.0 (dracut-004-20.1.el6, s390utils-1.8.2-24.el6)

Moving bug to VERIFIED.

Comment 83 Steffen Maier 2010-06-24 15:07:40 UTC
Dan, while looking at bug 607481, I figured that /etc/sysconfig/network-scripts/network-functions ignores ifcfg files with certain name patterns defined in /etc/init.d/functions with $__sed_discard_ignored_files.
If we wanted to copy the semantics of SUBCHANNELS handling, /sbin/device_cio_free could use the same file ignore pattern, in order to not free devnos from cio_ignore which /lib/udev/ccw_init would never activate.
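
A rough sketch of the idea, assuming a POSIX shell: is_ignored_ifcfg and list_subchannels are hypothetical names, and the suffix list below is an assumption that would have to be kept in sync with $__sed_discard_ignored_files.

```shell
#!/bin/sh
# Hypothetical sketch; the suffix list is an assumption and should be
# checked against $__sed_discard_ignored_files in /etc/init.d/functions.

# Return 0 if the file name looks like a backup or rpm leftover.
is_ignored_ifcfg() {
    case "$1" in
        *~|*.bak|*.orig|*.rpmnew|*.rpmorig|*.rpmsave) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the SUBCHANNELS values of all non-ignored ifcfg files in dir $1.
# Values are printed as written in the file (any quotes are kept).
list_subchannels() {
    for f in "$1"/ifcfg-*; do
        [ -f "$f" ] || continue
        is_ignored_ifcfg "$f" && continue
        # Read the assignment with shell builtins only (no grep or sed).
        while IFS= read -r line || [ -n "$line" ]; do
            case "$line" in
                SUBCHANNELS=*) echo "${line#SUBCHANNELS=}" ;;
            esac
        done < "$f"
    done
}
```

Calling list_subchannels /etc/sysconfig/network-scripts would then yield only the devnos of configurations that the network scripts actually consider, so nothing else gets freed from cio_ignore.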

Comment 84 Dan Horák 2010-06-28 14:06:22 UTC
(In reply to comment #83)
> Dan, while looking at bug 607481, I figured that
> /etc/sysconfig/network-scripts/network-functions ignores ifcfg files with
> certain name patterns defined in /etc/init.d/functions with
> $__sed_discard_ignored_files.
> If we wanted to copy the semantics of SUBCHANNELS handling,
> /sbin/device_cio_free could use the same file ignore pattern, in order to not
> free devnos from cio_ignore which /lib/udev/ccw_init would never activate.    

good point, Steffen

fixed in commit 5dec421fb and the updated device_cio_free will be included in s390utils-1.8.2-26.el6

Comment 85 releng-rhel@redhat.com 2010-07-02 19:00:46 UTC
Red Hat Enterprise Linux Beta 2 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

