Bug 533492
Summary: | [LTC 6.0 FEAT] 201085:zFCP portion of original BZ | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Denise Dumas <ddumas> | ||||
Component: | anaconda | Assignee: | David Cantrell <dcantrell> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Release Test Team <release-test-team-automation> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 6.0 | CC: | bhinson, borgan, brueckner, bugproxy, dhorak, diehl, ejratl, gmuelas, jjarvis, jkachuck, jstodola, maier, pknirsch, rlerch, rpacheco, snagar, syeghiay, tao | ||||
Target Milestone: | rc | Keywords: | FutureFeature, Reopened | ||||
Target Release: | 6.0 | ||||||
Hardware: | s390x | ||||||
OS: | All | ||||||
Whiteboard: | |||||||
Fixed In Version: | anaconda-13.21.47-1 | Doc Type: | Enhancement | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | 463544 | Environment: | |||||
Last Closed: | 2010-07-06 19:08:41 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | 463544, 576015, 589278 | ||||||
Bug Blocks: | 555224, 563347, 582286 | ||||||
Attachments: |
|
Support for cio_ignore with zfcp in anaconda is upstream (and in rhel6-branch) since commit f2de4e76d7f8b8e7f21de371e42427096909a361.

(In reply to comment #2)
> Support for cio_ignore with zfcp in anaconda is upstream (and in rhel6-branch)
> since commit f2de4e76d7f8b8e7f21de371e42427096909a361.

Hmm, so I guess this can be closed then, David?

Looks like it. This fix has been present since anaconda-12.16-1. Moving to MODIFIED. I imagine we should at least get IBM to verify the functionality they want is there (even though the patch came from Steffen at IBM...).

Fixed in 'anaconda-12.16-1'. 'anaconda-12.38.5-1.el6' included in compose 'RHEL6.0-20091118.1'. Moving to ON_QA.

I'm sorry, I only realized this now, but anaconda's zfcp support can only unmask devices; it does NOT wait for the appearance of devices that have just been unmasked. Back when the unmasking was implemented in those places, we were not aware that writing to /proc/cio_ignore was asynchronous and would not block. Not waiting for the device to appear might lead to strange error situations and zFCP disks not becoming available. See also https://bugzilla.redhat.com/show_bug.cgi?id=463544#c15.

Hello, as mentioned by Steffen in comment 9, anaconda doesn't wait for the appearance of zfcp devices. A zfcp device specified in the CMS config file is not available in the anaconda GUI. When adding a zfcp device in the GUI, the first attempt fails and the second attempt succeeds. Moving back to ASSIGNED.

The storage/zfcp.py code needed updating to block on the cio_ignore free operation. I've created a patch that has the code use /sbin/zfcp_cio_free from the s390utils package rather than writing our own thing. I also changed the linuxrc.s390 script to just write out /etc/zfcp.conf rather than writing /tmp/fcpconfig. The zfcp_cio_free script can just read /etc/zfcp.conf and go from there.
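For illustration only, here is a minimal Python sketch of the "unmask, then wait" behavior discussed in the comments above. It is not the patch that went into anaconda (which, per the comment above, calls /sbin/zfcp_cio_free instead). The function name unmask_and_wait, its parameters, and the polling approach are assumptions made for this sketch; the paths are passed in as arguments so it can be exercised without an s390x system.

```python
import os
import time


def unmask_and_wait(devno, cio_ignore="/proc/cio_ignore",
                    sysfs_devices="/sys/bus/ccw/devices",
                    timeout=10.0, interval=0.1):
    """Unmask a CCW device and block until it shows up in sysfs.

    Hypothetical helper for illustration. Returns True once the device
    directory exists, False if it has not appeared within `timeout` seconds.
    """
    # Writing "free <devno>" takes the device off the ignore list. The write
    # returns immediately; the device may not have been sensed yet.
    with open(cio_ignore, "w") as f:
        f.write("free %s\n" % devno)

    # Poll for the device directory instead of assuming it already exists.
    device_path = os.path.join(sysfs_devices, devno)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(device_path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would check the return value before trying to set the device online, for example unmask_and_wait("0.0.3c00") with a placeholder device number, and report an error on False rather than continuing with a device that never appeared.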
In order to fix this, I need the /sbin/zfcp_cio_free command to support specifying zFCP devices at the command line. I filed bug #576015 requesting this and set that bug to block this one.

As for whether or not this bug is a beta 1 blocker, I personally don't think so. The problem here affects people booting where zFCP devices are blacklisted. They can boot up and have those devices excluded from the blacklist to complete an install of beta 1. We can release note that if we want to.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
* The cio ignore implementation has not been completed for zFCP devices. In order to avoid installation problems on those devices, the images/generic.prm file needs to have the following entry instead:
root=/dev/ram0 ro ip=off ramdisk_size=40000

Added to the Beta1 Release notes as a known issue.

*** Bug 587364 has been marked as a duplicate of this bug. ***

*** Bug 589278 has been marked as a duplicate of this bug. ***

Event posted on 05-11-2010 01:54pm EDT by Glen Johnson
------- Comment From MAIER.com 2010-05-11 13:48 EDT-------
(In reply to comment #29)
> When DASD and ZFCP both are attached to the system and if only ZFCP lun is
> selected for installation then I see the error sometimes. However the error is
> not consistent and I am able to proceed with the install other times with ZFCP
> lun.

This nondeterministic part is indeed a duplicate of RH bug 533492.

> Also if I press back button and go to devices screen again then sometimes ZFCP
> lun is not listed. If I try to attach same lun again, it says the device is
> already attached.

However, this part is not a duplicate. Instead, this is what was already pointed out in https://bugzilla.redhat.com/show_bug.cgi?id=587364#c15.
David, could you reopen and use this bug here as the "separate problem" as you named it in https://bugzilla.redhat.com/show_bug.cgi?id=587364#c16 ?
This event sent from IssueTracker by jkachuck issue 840543

Deleted Technical Notes Contents.

Old Contents:
* The cio ignore implementation has not been completed for zFCP devices. In order to avoid installation problems on those devices the images/generic.prm file needs have the following entry instead:
root=/dev/ram0 ro ip=off ramdisk_size=40000

Performed a number of installations in graphical and text mode, with zFCP LUN(s) defined in the CMS config file and also added manually via the "Add Advanced Target" button. Tested with 1 - 7 zFCP LUNs. Also tested in rescue mode.
anaconda-13.21.48-1.el6, build RHEL6.0-20100527.2
Moving to VERIFIED.

Event posted on 06-08-2010 12:51am EDT by Glen Johnson
File uploaded: zfcpAddException (it_file 745773)

Event posted on 06-08-2010 07:31am EDT by Glen Johnson
------- Comment From MAIER.com 2010-06-08 07:25 EDT-------
(In reply to comment #45)
> Created an attachment (id=54385) [details]
> zfcp Add Execption
>
> This problem is seen in RHEL6.0 Snap5 when tried to add zfcp lun

This is already known as https://bugzilla.redhat.com/show_bug.cgi?id=595290 and fixed by http://git.fedorahosted.org/git/?p=anaconda.git;a=commit;h=cbd823f0bf1d74d9c09281cae9b6a4dac9c96eed in anaconda-13.21.46-1. Therefore, it should be fixed in snap6 (RHEL6.0-20100527.2), which contains anaconda-13.21.48-1. However, with that fix, you're going to run into what's already known as https://bugzilla.redhat.com/show_bug.cgi?id=597223, for which no fix exists yet.
anaconda 13.21.48 exception report
Traceback (most recent call first):
  File "/usr/lib/anaconda/iw/filter_gui.py", line 69, in __contains__
    return item["name"] in iter(self)
  File "/usr/lib/anaconda/iw/filter_gui.py", line 438, in <lambda>
    mpaths = filter(lambda d: d not in self._cachedMPaths, new_mpaths)
  File "/usr/lib/anaconda/iw/filter_gui.py", line 438, in _add_advanced_clicked
    mpaths = filter(lambda d: d not in self._cachedMPaths, new_mpaths)
TypeError: list indices must be integers, not str

That said, the attached exception is not at all related to this bug (LTC bug 62837 / RIT 840543 / RH bug 589278) here.

Event posted on 06-15-2010 06:33am EDT by Glen Johnson
------- Comment From holger.dengler.com 2010-06-15 06:27 EDT-------
Installation failed after activating a ZFCP/SCSI device manually. A DASD device and a SCSI device are selected as install targets. For more details, see the logs.

Event posted on 06-15-2010 06:33am EDT by Glen Johnson
File uploaded: logs.tgz (it_file 769603)

Event posted on 06-16-2010 06:36am EDT by Glen Johnson
------- Comment From htengshe.com 2010-06-16 06:30 EDT-------
Hi,
With further testing on RHEL6.0 prebeta I see 2 problems.

1) In the basic devices screen, if I add a ZFCP LUN and then proceed to the partitioning screen, I see the SCSI LUN. But if I press the back button and go to the devices screen again, the ZFCP LUN is not listed. If I try to attach the same LUN again, it says the device is already attached. I need to check if the problem is persistent.

2) In the second case I had some DASDs and I added a ZFCP LUN. It was listed in the "Other SAN Devices" tab. However, when I selected only the ZFCP LUN for installation and pressed next, it again threw the error "No usable disk found". Then the ZFCP LUN vanished from the "Other SAN Devices" tab and it was not present in any other tab either.
At this point if I try to add the same lun, it says "the lun is already attached to the system". However, the lun is not listed on the screen. Uploading images for the same.

Event posted on 06-16-2010 06:36am EDT by Glen Johnson
File uploaded: ZFCPError.JPG (it_file 773743)

Event posted on 06-16-2010 06:36am EDT by Glen Johnson
File uploaded: ZFCPError1.JPG (it_file 773753)

Event posted on 06-16-2010 06:37am EDT by Glen Johnson
File uploaded: ZFCPError2.JPG (it_file 773763)

Open separate Bugzillas to report these bugs.

Event posted on 06-17-2010 04:23am EDT by Glen Johnson
------- Comment From MAIER.com 2010-06-17 04:20 EDT-------
I'm pretty sure, this is all caused by the kernel race between scsi delete and fcp unit_remove and therefore the same bug. The last comments were just a verification attempt that failed.

Event posted on 06-30-2010 06:13am EDT by Glen Johnson
------- Comment From brueckner.ibm.com 2010-06-30 06:05 EDT-------
The patch has been tested and fixes the problem. The patch will be sent upstream for 2.6.36.
With best regards,
Hendrik

Event posted on 06-30-2010 06:13am EDT by Glen Johnson
File uploaded: linux-2.6.32-s390-zfcp-unit-remove.patch (it_file 818133)

Event posted on 06-30-2010 06:13am EDT by Glen Johnson
<cde:attachment>
Comment on attachment: linux-2.6.32-s390-zfcp-unit-remove.patch
------- Comment on attachment From brueckner.ibm.com 2010-06-30 06:01 EDT-------

Description: zfcp: Remove SCSI device during unit_remove

Symptom: When issuing the commands to delete a SCSI device and then to remove the zfcp unit from a script, the zfcp unit remove can fail.

Problem: The unit_remove will fail when the reference count of the unit is not zero. When the SCSI device exists, it holds a reference to the unit. The upstream commit d9a9cdfb078d755e648d53ec25b7370f84ee5729 changed the deletion of a SCSI device to be run asynchronously from a workqueue. With this change, the actual removal of the SCSI device can run after the unit_remove in zfcp and the unit_remove will fail.

Solution: Get a reference to the SCSI device from the unit_remove function and remove the SCSI device from this function. If the SCSI device has already been deleted earlier, unit_remove cannot get the reference and does nothing. If the removal of the SCSI device is running on two threads, this is protected by the scan_mutex and the second one will exit early.
</cde:attachment>

I'm puzzled why I'm seeing a patch being posted to a BZ that is in VERIFIED state when I requested on June 16 that separate bugzillas be opened to track bug reports. Why is this?

John, you can find the reason no further bugzillas were opened in comment 37 from MAIER.com 2010-06-17 04:20 EDT:
> I'm pretty sure, this is all caused by the kernel race between
> scsi delete and fcp unit_remove and therefore the same bug.
> The last comments were just a verification attempt that failed.

But, you're right, this bug should have been reopened.

OK, Hendrik clarified this in comment 42, which is invisible to me; forget my comment 43.

Red Hat Enterprise Linux Beta 2 is now available and should resolve the problem described in this bug report. This report is therefore being closed with a resolution of CURRENTRELEASE. You may reopen this bug report if the solution does not work for you.

Event posted on 07-06-2010 04:33am EDT by Glen Johnson
------- Comment From htengshe.com 2010-07-06 03:54 EDT-------
The problem still exists in RHEL6.0 beta2. I had added DASD and ZFCP LUN in the parmfile using the following parameters:
DASD=exxx
FCP_1="0.0.3xxx 0x500507630303c562 0x4014402200000000"
During installation I selected only the ZFCP LUN. It gave the error "No usable disks have been found".

------- Comment From htengshe.com 2010-07-06 03:56 EDT-------
Reopening Bug

Event posted on 07-06-2010 04:33am EDT by Glen Johnson
File uploaded: ZFCPInstallError.JPG (it_file 832833)

Event posted on 07-06-2010 04:34am EDT by Glen Johnson
<cde:attachment>
Comment on attachment: ZFCPInstallError.JPG
------- Comment (attachment only) From htengshe.com 2010-07-06 03:55 EDT-------
</cde:attachment>

Please attach the /tmp/anaconda* files so we can see what is going on.

Created attachment 429855 [details]
linux-2.6.32-s390-zfcp-unit-remove.patch
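
For reference, the FCP_1 parmfile entry quoted in the beta 2 report above has three whitespace-separated fields: device number, WWPN, and LUN. The following is a rough Python sketch of splitting and sanity-checking such a value. The function name parse_fcp_entry and the validation rules are assumptions for illustration, not code from linuxrc.s390 or anaconda, and the device number 0.0.3c00 in the usage note is a placeholder because the report masks the real one.

```python
import re

# Assumed shapes: bus ID such as 0.0.3c00, and 64-bit hex values for WWPN/LUN.
_DEVNO_RE = re.compile(r"^[0-9a-f]+\.[0-9a-f]\.[0-9a-f]{4}$")
_HEX64_RE = re.compile(r"^0x[0-9a-f]{16}$")


def parse_fcp_entry(value):
    """Split an FCP_n value into (device number, WWPN, LUN).

    Hypothetical helper for illustration. Raises ValueError when the value
    does not have exactly three well-formed fields.
    """
    fields = value.strip().strip('"').lower().split()
    if len(fields) != 3:
        raise ValueError("expected 3 fields, got %d" % len(fields))
    devno, wwpn, lun = fields
    if not _DEVNO_RE.match(devno):
        raise ValueError("bad device number: %r" % devno)
    if not _HEX64_RE.match(wwpn):
        raise ValueError("bad WWPN: %r" % wwpn)
    if not _HEX64_RE.match(lun):
        raise ValueError("bad LUN: %r" % lun)
    return devno, wwpn, lun
```

Calling parse_fcp_entry('"0.0.3c00 0x500507630303c562 0x4014402200000000"') returns the three fields as a tuple. The masked value from the report (0.0.3xxx) is rejected with ValueError, since "xxx" is not a valid hexadecimal device number.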
Hello,
This issue can most likely be closed again. It will be worked through BZ 589278.
Thank You
Joe Kachuck

Thanks, Joe. I've set this bug to depend on bug #589278 and I have moved it back to CLOSED CURRENTRELEASE, which was the state at the time of comment #45. |
This BZ is to address the following portion of the original BZ:

> Unmask and wait for appearance of devices needed (interactive and
> kickstart)[linuxrc.s390 already has support; backport for zfcp in anaconda]

As stated, linuxrc.s390 does this. If zfcp support is missing from this facility, file another bug and detail just that issue.