Bug 463544

Summary: [LTC 6.0 FEAT] 201085:cio_ignore entry in generic.prm for LPARs
Product: Red Hat Enterprise Linux 6 Reporter: IBM Bug Proxy <bugproxy>
Component: anaconda Assignee: David Cantrell <dcantrell>
Status: CLOSED CURRENTRELEASE QA Contact: Release Test Team <release-test-team-automation>
Severity: high Docs Contact:
Priority: high    
Version: 6.0 CC: bhinson, borgan, ejratl, gmuelas, hannsj_uhl, jjarvis, jstodola, maier, pknirsch, snagar
Target Milestone: beta Keywords: FutureFeature, Reopened
Target Release: 6.0   
Hardware: s390x   
OS: All   
Whiteboard:
Fixed In Version: anaconda-13.21.0-1 Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 533490 533492 533494 533495 Environment:
Last Closed: 2010-07-02 20:39:47 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 356741, 475675, 533490, 533492, 533494, 533495, 554559, 561339, 626825    

Description IBM Bug Proxy 2008-09-23 21:01:09 UTC
=Comment: #0=================================================
Emily J. Ratliff <emilyr.com> - 2008-09-16 18:02 EDT
1. Feature Overview:
Feature Id:	[201085]
a. Name of Feature:	cio_ignore entry in generic.prm for LPARs
b. Feature Description
Provide a generic.prm template with a cio_ignore entry for installation from LPAR.  In addition,
have the install process make the cio_ignore setting persistent.  For details see:
BZ/RIT34008/RIT118985.

Additional Comments:	This feature is for validation; it is expected to be included in RHEL5.3.
LTC 37754 - RHBZ253075: 201085: cio_ignore entry in generic.prm for LPARs

2. Feature Details:
Sponsor:	zSeries
Architectures:
s390x

Arch Specificity: Purely Arch Specific Code
Affects Installer: Yes
Delivery Mechanism: Request Red Hat development assistance
Category:	zSeries
Request Type:	Installer - Enhancement from Distributor
d. Upstream Acceptance:	No Code Required
Sponsor Priority	1
f. Severity: High
IBM Confidential:	no
Code Contribution:	no
g. Component Version Target:	

3. Business Case
Allow customers to install Linux on an LPAR that is configured with potential access to a huge
number of devices.

4. Primary contact at Red Hat: 
John Jarvis
jjarvis

5. Primary contacts at Partner:
Project Management Contact:
Hans-Georg Markgraf, mgrf.com, Boeblingen 49-7031-16-3978

Technical contact(s):
Gonzalo Muelas Serrano, gmuelas.com

IBM Manager:
Thomas Schwarz, t.schwarz.com

Comment 1 Chris Lumens 2008-09-23 22:58:32 UTC
This feature has already been implemented for 5.3 and was forward ported to Fedora ensuring it will be in RHEL6.

*** This bug has been marked as a duplicate of bug 253075 ***

Comment 4 David Cantrell 2009-10-06 20:56:32 UTC
The generic.prm file in anaconda-12.32-1 will have cio_ignore=all,!0.0.0009 added.
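
For readers unfamiliar with the parameter, the following is a minimal Python sketch of how a value such as cio_ignore=all,!0.0.0009 can be read: ignore every device, then exempt the console at 0.0.0009. The function names (parse_bus_id, is_ignored) are hypothetical, and the "entries are applied left to right, last match wins" rule is a simplifying assumption made for illustration. Keywords such as condev or ipldev and short device numbers without the "0.0." prefix are not handled.

```python
def parse_bus_id(text):
    """Turn 'c.s.dddd' (hex fields) into a comparable (cssid, ssid, devno) tuple."""
    cssid, ssid, devno = text.split(".")
    return (int(cssid, 16), int(ssid, 16), int(devno, 16))


def is_ignored(spec, bus_id):
    """Return True if bus_id would be masked by the given cio_ignore value.

    Assumption for this sketch: entries are applied left to right and the
    last matching entry decides the result.
    """
    target = parse_bus_id(bus_id)
    ignored = False
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        negate = entry.startswith("!")
        if negate:
            entry = entry[1:]
        if entry == "all":
            matches = True
        else:
            # Either a single bus ID or an inclusive range "low-high".
            low, _, high = entry.partition("-")
            lo = parse_bus_id(low)
            hi = parse_bus_id(high) if high else lo
            matches = lo <= target <= hi
        if matches:
            ignored = not negate
    return ignored


if __name__ == "__main__":
    for dev in ("0.0.0009", "0.0.1900"):
        print(dev, is_ignored("all,!0.0.0009", dev))
```

With the default from generic.prm, only the console device stays visible; every other device has to be unmasked explicitly before it can be used.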

Comment 5 John Jarvis 2009-10-06 21:18:55 UTC
IBM is signed up to test and provide feedback.

Comment 6 David Cantrell 2009-10-07 23:13:07 UTC
*** Bug 499173 has been marked as a duplicate of this bug. ***

Comment 7 Steffen Maier 2009-10-14 12:41:53 UTC
Addition to comment 1: As Brad already mentioned in the dup bug https://bugzilla.redhat.com/show_bug.cgi?id=499173#c1, this was not implemented for RHEL 5.3. In fact, it was backed out late [https://bugzilla.redhat.com/show_bug.cgi?id=253075#c68]. It was also not forward ported from RHEL 5.3 but more or less freshly implemented for Fedora.

The commit 
http://git.fedorahosted.org/git/anaconda.git?p=anaconda.git;a=commitdiff;h=e75a5ed90eadd294f4db9e8c80e73ee2f021c6fc;hp=0ba6cb516da384e30efdd0dbd70a6c12336fd4f5
mentioned by David in comment 4 only completes what is necessary for a working installation with that cio_ignore default parameter in generic.prm.

What is still missing:
- Making the information about dynamically freed devices during installation persistent in the installed system, so the users will also benefit from faster boot with every system boot after installation.
- Depending on how this is solved, all system config / management tools influencing s390 devices on the ccw bus would have to be adapted.

We have ideas on how to solve this with minimal effort and transparently for system config tools. We plan to provide a document describing the approach soon.

Comment 8 Siddharth Nagar 2009-10-23 15:36:14 UTC
If you haven't done so already, please open new bugzillas to address missing functionality that is referenced in comment #7.

Comment 9 Gonzalo Muelas Serrano 2009-10-29 13:42:57 UTC
Here is what I presented last Friday at the eSDT meeting:
A Linux on System z system will in most cases see a lot more devices than it
should be using (up to a current theoretical maximum of 262144 devices; in
practice, 1000-4000 devices are common). Using the cio_ignore kernel parameter
and /proc interface, this feature should mask the ccw devices which are not
used and unmask them if they are needed. Note: After unmasking, the code should
wait and verify (via sysfs) that the device is there, with a timeout (in case
the device does not exist).
More details about cio_ignore under:
− Latest version of Device Drivers, Features, and Commands at:
http://www.ibm.com/developerworks/linux/linux390/development_documentation.html
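
As a rough illustration of the /proc interface mentioned above, here is a minimal Python sketch that unmasks devices by writing "free <bus id>" to /proc/cio_ignore ("add <bus id>" masks a device again). The helper names cio_command and unmask are hypothetical, and writing one command per device is a cautious choice made for this sketch, not a statement about what the tools in this bug do. The path is a parameter so the logic can be tried on a regular file.

```python
CIO_IGNORE = "/proc/cio_ignore"  # kernel interface on s390; overridable for testing


def cio_command(action, bus_id):
    """Build one command line for the cio_ignore /proc interface."""
    if action not in ("free", "add"):
        raise ValueError("unsupported action: %r" % (action,))
    return "%s %s\n" % (action, bus_id)


def unmask(bus_ids, path=CIO_IGNORE):
    """Remove each device from the ignore list.

    The file is reopened for every device so that each command reaches the
    kernel as a separate write. The write returns before the device has
    necessarily been sensed, so callers still need to wait for it to appear.
    """
    for bus_id in bus_ids:
        with open(path, "w") as handle:
            handle.write(cio_command("free", bus_id))
```

On a real system this needs root privileges, for example unmask(["0.0.1900", "0.0.1901", "0.0.1902"]).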

This should be done during installation and by HW setup tools (like
system-config-network), and it should be persistent across boots.

What is needed for this feature:
- Patch anaconda to:
   By default mask all the devices except the console (0009) in generic.prm
(default parmfile for installation) and propagate this kernel parameter to the
zipl.conf (for boot)
   Unmask and wait for appearance of devices needed (interactive and
kickstart) [linuxrc.s390 already has support; backport for zfcp in anaconda]
- Patch mkinitrd (for root devices only) / initscripts (for all other devices)
to:
   Unmask and wait for appearance of devices needed which are enumerated in
already existing config files such as modprobe.conf (options dasd_mod
dasd=...), zfcp.conf (1st column) and ifcfg-* (SUBCHANNELS=...) where device
numbers are listed (doing so automatically handles all cases of device
configuration during installation, post-installation or manual editing of
config files)
- Patch device configuration tools (at this moment only system-config-network)
to:
   Unmask and wait for appearance of devices as needed during user-interactive
configuration
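
To make the "enumerated in already existing config files" idea concrete, here is a minimal Python sketch that collects device bus IDs from the three sources named above. All function names are hypothetical, the functions take file contents as strings rather than reading fixed paths, and the parsing is deliberately simplified: for a dasd= range only the two endpoints are returned, not the devices in between.

```python
import re

# Device bus ID such as 0.0.a000 (hex fields).
BUS_ID = re.compile(r"\b[0-9a-fA-F]+\.[0-9a-fA-F]\.[0-9a-fA-F]{4}\b")


def from_modprobe_conf(text):
    """Bus IDs from 'options dasd_mod dasd=...' lines."""
    ids = []
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] != ["options", "dasd_mod"]:
            continue
        for field in fields[2:]:
            if field.startswith("dasd="):
                # Picks up plain IDs, IDs with a feature suffix such as
                # (ro), and both endpoints of a range.
                ids.extend(BUS_ID.findall(field))
    return ids


def from_zfcp_conf(text):
    """Bus IDs from the first column of zfcp.conf."""
    ids = []
    for line in text.splitlines():
        fields = line.split()
        if fields and not fields[0].startswith("#"):
            ids.append(fields[0])
    return ids


def from_ifcfg(text):
    """Bus IDs from the SUBCHANNELS= line of an ifcfg-* file."""
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "SUBCHANNELS":
            return [part for part in value.strip("\"'").split(",") if part]
    return []
```

Because the lists come from the config files themselves, devices added by the installer, by a config tool, or by manual editing would all be picked up the same way.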

Denise will schedule a Tech. call when the developers who might be able to work
on this feature have some time; right now they are busy with RHEL 6, so it
will probably be around Jan. 2010.

Please sync. with Denise/John if you think it is possible to have a call before Jan. 2010 to get this feature request accepted in RHEL 6.0.

Thank you!

Comment 10 David Cantrell 2009-10-29 20:02:35 UTC
These issues need to be filed as separate bugs in Bugzilla because not all of these are anaconda issues (and some are multiple anaconda issues).

(In reply to comment #9)
> Here is what I presented last Fr. on eSDT meeting:
> A Linux on System z system will in most cases see a lot more devices than it
> should be using (up to a current theoretical maximum of 262144 devices, in
> practice, 1000-4000 devices are common) Using the cio_ignore kernel parameter
> and /proc interface, this feature should mask the ccw devices which are not
> used and unmask them if they are needed. Note: When unmasking should be
> waited/verified that dev. is there (via sysfs) with a timeout (in case device
> does not exist).
> More details about cio_ignore under:
> − Latest version of Device Drivers, Features, and Commands at:
> http://www.ibm.com/developerworks/linux/linux390/development_documentation.html
> 
> This should be done during the installation and HW setup tools (like
> system-config-network) and persistent for each boot.
> 
> What is needed for this feature:
> - Patch anaconda to:
>    By default mask all the devices except the console (0009) in generic.prm
> (default parmfile for installation)and propagate this kernel parameter to the
> zipl.conf (for boot)

This bug is about the first part of this requirement.  Carrying the parameter over to zipl.conf is a separate bug and should be filed as such.

>    Unmask and wait for appearance of devices needed (interactive and
> kickstart)[linuxrc.s390 already has support; backport for zfcp in anaconda]

As stated, linuxrc.s390 does this.  If zfcp support is missing from this facility, file another bug and detail just that issue.

> - Patch mkinitrd (for root devices only) / initscripts (for all other devices)
> to:
>    Unmask and wait for appearance of devices needed which are enumerated in
> already existing config files such as modprobe.conf (options dasd_mod
> dasd=...), zfcp.conf (1st column) and ifcfg-* (SUBCHANNELS=...) where device
> numbers are listed (doing so automatically handles all cases of device
> configuration during installation, post-installation or manual editing of
> config files)

This needs to be filed as an mkinitrd bug if it hasn't already.

> - Patch device configuration tools (at this moment only system-config-network)
> to:
>    Unmask and wait for appearance of devices as needed during user-interactive
> configuration

This needs to be filed against the appropriate configuration tool, not anaconda.

Comment 11 Steffen Maier 2009-11-07 15:24:50 UTC
(In reply to comment #10)
> These issues need to be filed as separate bugs in Bugzilla because not all of
> these are anaconda issues (and some are multiple anaconda issues).
> 
> (In reply to comment #9)

> > - Patch mkinitrd (for root devices only) / initscripts (for all other devices)
> > to:
> >    Unmask and wait for appearance of devices needed which are enumerated in
> > already existing config files such as modprobe.conf (options dasd_mod
> > dasd=...), zfcp.conf (1st column) and ifcfg-* (SUBCHANNELS=...) where device
> > numbers are listed (doing so automatically handles all cases of device
> > configuration during installation, post-installation or manual editing of
> > config files)
> 
> This needs to be filed as an mkinitrd bug if it hasn't already.

Since dracut really only handles devices for the root-fs but there can be other devices, cio_ignore for the latter needs to be handled by something like initscripts.
Please see bug 533494 for the general idea and code to be shared by dracut and initscripts.

Comment 15 Steffen Maier 2009-12-15 22:22:22 UTC
(In reply to comment #10)
> (In reply to comment #9)
> >    Unmask and wait for appearance of devices needed (interactive and
> > kickstart)[linuxrc.s390 already has support; backport for zfcp in anaconda]
> 
> As stated, linuxrc.s390 does this.  If zfcp support is missing from this
> facility, file another bug and detail just that issue.

I'm sorry, I only realized this now, but linuxrc.s390 and anaconda's zfcp support only have support to unmask but NOT to wait for the appearance of devices. Back when the unmasking was implemented in those places, we were not aware that writing to /proc/cio_ignore was asynchronous and would not block. See also https://bugzilla.redhat.com/show_bug.cgi?id=533492#c6

This means that the waiting has to be done in user space, in this case by linuxrc.s390 and anaconda. During function testing of linuxrc.s390 and anaconda we never hit the case where the unmasking took longer than the 1 second sleep and the call of "udevadm settle" (which is NOT sufficient here since udev events are only triggered after the asynchronous unmasking has finished). Our co-worker Mark Ver hit the case during testing and I'm thankful he reminded me of the asynchronicity issue. With linuxrc.s390 messages like the following appear on the console for network devices (and similar ones for DASDs):

Device 0.0.1900 not present, trying to clear from blacklist and resense...
Device 0.0.1900 does not exist
Device 0.0.1901 not present, trying to clear from blacklist and resense...
Device 0.0.1901 does not exist
Device 0.0.1902 not present, trying to clear from blacklist and resense...
Device 0.0.1902 does not exist
0) redo this parameter, 1) continue, 2) restart dialog, 3) halt, 4) shell

With the rescue shell at this point he could confirm that the devices were unmasked successfully. The device check in linuxrc.s390 was simply too fast because it did not wait after the unmasking.

Since we do not really know if device bus IDs will actually be sensed after unmasking them (they may just not be configured in the system at all), waiting probably has to be done with some timeout mechanism and polling for the device appearance on the ccw bus.
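
A minimal Python sketch of such a timeout-and-poll loop follows. It is only an illustration of the idea, not the code used in linuxrc.s390 or anaconda; the function name wait_for_device and the default timeout and interval are assumptions. The sysfs directory is a parameter so the loop can be exercised without s390 hardware.

```python
import os
import time


def wait_for_device(bus_id, timeout=10.0, interval=0.1,
                    root="/sys/bus/ccw/devices"):
    """Poll until the device shows up on the ccw bus or the timeout expires.

    Returns True if the sysfs entry appeared, False otherwise. A False
    result does not distinguish between a slow device and one that is not
    configured in the system at all.
    """
    path = os.path.join(root, bus_id)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling this right after the write to /proc/cio_ignore, and before any device check or "udevadm settle", would avoid the premature "does not exist" message shown above while still giving up on devices that never appear.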

> > - Patch mkinitrd (for root devices only) / initscripts (for all other devices)
> > to:
> >    Unmask and wait for appearance of devices

While the user may select continue with linuxrc.s390, there is no user-interaction on system boot, when *_cio_free (see bug 533494) becomes active. Since udev and its rules rely on devices being sensed completely, *_cio_free must wait for the appearance of devices after unmasking them. Since we do not really know if device bus IDs will actually be sensed after unmasking them (they may just not be configured in the system at all), waiting probably has to be done with some timeout mechanism and polling for the device appearance on the ccw bus.
See https://bugzilla.redhat.com/show_bug.cgi?id=533494#c15

> > - Patch device configuration tools (at this moment only system-config-network)
> > to:
> >    Unmask and wait for appearance of devices as needed during user-interactive
> > configuration

Tools such as system-config-network may just call *_cio_free (see bug 533494) which should handle (unmasking and) waiting(!) for the devices transparently.

Comment 17 releng-rhel@redhat.com 2010-01-14 17:19:56 UTC
Fixed in 'anaconda-13.21.0-1'. 'anaconda-13.21.0-1.el6' included in compose 'RHEL6.0-20100114.1'.
Moving to ON_QA.

Comment 18 Jan Stodola 2010-03-25 11:03:54 UTC
cio_ignore is included in generic.prm:
root=/dev/ram0 ro ip=off ramdisk_size=40000 cio_ignore=all,!0.0.0009

cio_ignore=all,!0.0.0009 is included in /etc/zipl.conf after installation:

[defaultboot]
default=linux
target=/boot/
[linux]
        image=/boot/vmlinuz-2.6.32-19.el6.s390x
        ramdisk=/boot/initramfs-2.6.32-19.el6.s390x.img
        parameters="root=/dev/mapper/vg_rtt7-lv_root rd_ZFCP=0.0.a000,0x50050763050b073d,0x4020400100000000 rd_LVM_LV=vg_rtt7/lv_root rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us cio_ignore=all,!0.0.0009 rhgb quiet"

Tested on build RHEL6.0-20100311.5 with anaconda-13.21.20-1.el6

Moving to VERIFIED.
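
A check like the one above could be scripted roughly as follows. This is a hedged Python sketch with hypothetical function names (kernel_parameters, cio_ignore_value); it handles only the simple section layout shown in this comment and takes the file contents as a string.

```python
def kernel_parameters(zipl_conf_text, section):
    """Return the kernel parameters of one zipl.conf section as a list."""
    current = None
    for raw in zipl_conf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
        elif current == section and line.startswith("parameters="):
            value = line[len("parameters="):]
            # The value is one double-quoted string of space-separated items.
            return value.strip('"').split()
    return []


def cio_ignore_value(params):
    """Return the cio_ignore setting from a parameter list, or None."""
    for param in params:
        if param.startswith("cio_ignore="):
            return param[len("cio_ignore="):]
    return None


if __name__ == "__main__":
    with open("/etc/zipl.conf") as handle:
        params = kernel_parameters(handle.read(), "linux")
    print(cio_ignore_value(params))
```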

Comment 19 releng-rhel@redhat.com 2010-07-02 20:39:47 UTC
Red Hat Enterprise Linux Beta 2 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.