Bug 1463080
Summary: | Issue while installing RHEL7.1 as Boot From SAN on a 3PAR VV presented to IBM LPAR partition by NPIV (vfc)
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Venkatesan <venkatesan.arumugam>
Component: | anaconda | Assignee: | Anaconda Maintenance Team <anaconda-maint-list>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Release Test Team <release-test-team-automation>
Severity: | urgent | Docs Contact: |
Priority: | urgent
Version: | 7.1 | CC: | coughlan, dhorak, dinesh.surpur, dlehman, hannsj_uhl, jkachuck, mike.pechulis, mknutson, msnitzer, phinchman, secondary-arch-list, srinivas.lingampalli, venkatesan.arumugam
Target Milestone: | rc
Target Release: | ---
Hardware: | ppc64
OS: | Linux
Whiteboard: |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-11-30 16:47:53 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1438583, 1445812
Attachments: |
Description
Venkatesan
2017-06-20 05:57:53 UTC
Switching the component to the installer. If I read it correctly, the issue is in the storage configuration during installation (the PReP partition is made too large). IBM, I would also recommend trying a recent RHEL, either 7.3 or the 7.4 beta.

Hi Dan,
Yes, we are seeing this issue during storage configuration. We are testing this configuration on specific customer requests, so it would be good to get a workaround for RHEL 7.1.
Thanks,
A. Venkatesan

This looks like the result of incorrect alignment information being provided to the OS. There is a code workaround in python-blivet in RHEL-7.3. If the customer needs to stay on 7.1, I think the only option is to fix the firmware so that it reports correct alignment/topology.

Hi David,
Can you please clarify what you mean by "fix the firmware"? Are you suggesting that the 3PAR firmware be fixed so that it provides the correct storage configuration to the OS?

Hi,
As suggested in the previous discussion, we planned to try RHEL 7.3. Unfortunately, the RHEL 7.3 ppc64 ISO image that we have (RHEL-7.3-20161019.0-Server-ppc64-dvd1.iso) is corrupted, so we could not move forward. Is it possible for you to provide us with a RHEL 7.3 GA ppc64 image so we can continue our testing?
We got the ppc images from the following link:
http://blofly.us.rdlabs.hpecorp.net/PPC/RedHat/EL7/Update3/GA/Server/

Error Message:
-------------------
    Starting package installation process
    Error populating transaction, retrying (1/10)
    Error populating transaction, retrying (2/10)
    Error populating transaction, retrying (3/10)
    Error populating transaction, retrying (4/10)
    Error populating transaction, retrying (5/10)
    Error populating transaction, retrying (6/10)
    Error populating transaction, retrying (7/10)
    Error populating transaction, retrying (8/10)
    Error populating transaction, retrying (9/10)
    Error populating transaction, retrying (10/10)
    ================================================================================
    ================================================================================
    Error
    The following error occurred while installing. This is a fatal error and
    installation will be aborted.
    Error populating transaction after 10 retries: failure: Packages/perl-HTTP-Tiny-0.033-3.el7.noarch.rpm from anaconda: [Errno 256] No more mirrors to try.
    Press enter to exit.

Thanks & Regards,
A.Venkatesan

(In reply to Srini from comment #5)
> Hi David,
> Can you please clarify what are you referring as fix the firmware? Are you
> suggesting to fix 3PAR firmware by providing correct storage configuration
> to OS

Yes.

David,
I need more details on why you think this is a storage issue. Here are the sizes that the storage is providing. This is from RHEL 7.3 system data, but it should be the same for RHEL 7.1 as well.

B0 Inquiry Page.
    [root@dl380g9-57 ~]# sg_inq -e -p 0xb0 /dev/sdb
    VPD INQUIRY: Block limits page (SBC)
      Maximum compare and write length: 1 blocks
      Optimal transfer length granularity: 32 blocks
      Maximum transfer length: 32768 blocks
      Optimal transfer length: 32768 blocks
      Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
      Maximum unmap LBA count: 65536
      Maximum unmap block descriptor count: 10
      Optimal unmap granularity: 32
      Unmap granularity alignment valid: 1
      Unmap granularity alignment: 0

    # cat /sys/block/sdg/queue/max_sectors_kb
    16384
    # cat /sys/block/sdg/queue/optimal_io_size
    16777216
    # cat /sys/block/sdg/queue/max_segment_size
    65536
    # cat /sys/block/sdg/queue/minimum_io_size
    16384
    # cat /sys/block/sdg/queue/logical_block_size
    512

    [root@dl380g9-57 queue]# smartctl -a /dev/sdg
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-513.el7.x86_64] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor: 3PARdata
    Product: VV
    Revision: 3310
    User Capacity: 53,687,091,200 bytes [53.6 GB]
    Logical block size: 512 bytes
    Logical Unit id: 0x60002ac000000000000002e10007e025
    Serial number: 4UW0001036
    Device type: disk
    Transport protocol: Fibre channel (FCP-2)
    Local Time is: Thu Jun 22 09:57:48 2017 PDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Disabled
    Temperature Warning: Disabled or Not Supported

    [root@dl380g9-57 queue]# blockdev --getioopt /dev/sdg
    16777216

Anaconda needs to take the alignment from the storage, which is 16 MB. What default value does it use in RHEL 7.1? Also, what fix was made in the RHEL 7.3 python-blivet module, and is there a Red Hat bug reference for it? Also, different arrays provide different alignments, so why do you think this is an issue with 3PAR only?
On a normal ProLiant x86_64 machine with 3PAR, no issues were seen with the RHEL 7.1 install (FYI).

So David, I have set needinfo on you to confirm what problem you suspect on the array side when using IBM PowerPC.

Here is the default PE size used by the installer.

    [root@localhost ~]# vgdisplay
      --- Volume group ---
      VG Name               rhel
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  3
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                2
      Open LV               2
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size               49.26 GiB
      PE Size               4.00 MiB
      Total PE              12611
      Alloc PE / Size       12601 / 49.22 GiB
      Free PE / Size        10 / 40.00 MiB
      VG UUID               b99SBn-Jwgy-PKnI-fsh3-kJN1-EpzO-zLAptf

Received response from IBM bug#1250822
=====================================
From: John Henderson [johnhend.com]
Sent: Tuesday, June 27, 2017 4:22 AM
To: Lingampalli, Srinivas <srinivas.lingampalli>
Subject: Fwd: Re: Fwd: RE: RE: BFS issues for RHEL on Power systems with 3PAR

Srinivas,
Here is the Power Linux team's response to your questions....
-jch-
Sent from my iPhone using IBM Verse
________________________________________
On Jun 26, 2017, 5:45:22 PM, Brian King wrote:
To: johnhend.com
Cc:
Date: Jun 26, 2017, 5:45:22 PM
Subject: Re: Fwd: RE: RE: BFS issues for RHEL on Power systems with 3PAR

The link below is a link to an IBM internal bugzilla which does not have external access.
However, this was mirrored into Red Hat's bugzilla, and the related Red Hat bugzilla is this:
https://bugzilla.redhat.com/show_bug.cgi?id=1250822

Below is the analysis done for the issue we discovered on IBM storage, which was fixed in a firmware update:

Here is what we get when reading the block limits VPD page on V7K with 7.2.X code:

    [root@amp ~]# sg_inq --page=0xB0 /dev/sdm
    VPD INQUIRY: Block limits page (SBC)
      Maximum compare and write length: 128 blocks
      Optimal transfer length granularity: 0 blocks
      Maximum transfer length: 0 blocks
      Optimal transfer length: 0 blocks
      Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
      Maximum unmap LBA count: 0
      Maximum unmap block descriptor count: 0
      Optimal unmap granularity: 0
      Unmap granularity alignment valid: 0
      Unmap granularity alignment: 0

Here is what we get when reading the block limits VPD page on V7K with 7.5.X code:

    [root@amp ~]# sg_inq --page=0xB0 /dev/sdt
    VPD INQUIRY: Block limits page (SBC)
      Maximum compare and write length: 128 blocks
      Optimal transfer length granularity: 64 blocks
      Maximum transfer length: 2097152 blocks
      Optimal transfer length: 131072 blocks
      Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
      Maximum unmap LBA count: 0
      Maximum unmap block descriptor count: 0
      Optimal unmap granularity: 0
      Unmap granularity alignment valid: 0
      Unmap granularity alignment: 0

Two key fields changed here:

First, the Optimal transfer length granularity changed from 0 to 64 blocks, which is 32k. This means Linux will try to send 32k aligned / 32k multiple length ops to the device. This should be OK. 512e drives set this to 4k, and the software RAID stack sets this to its chunk size, which is 512k by default, so Linux should handle this fine.

Secondly, the Optimal transfer length changed from 0 to 131072 blocks, which is 64MB. Most devices don't set this. The software RAID stack sets it to be the chunk size * number of RAID disks. This is the change that is causing grief, as has already been stated.
It looks like the OS is trying to create partitions that are aligned to this boundary and are a multiple of this boundary, but PReP partitions can only be as large as 10MB...

Thanks,
Brian

Brian King
Power Linux I/O Focal Point
Linux Technology Center

If the storage vendor intentionally advertises an optimal transfer length of 16 MiB, then obviously there is nothing to fix there. However, the workaround for this issue in blivet was included in RHEL-7.3. We cannot update the RHEL-7.1 installation media retroactively.

Comment #11 from David indicates no issues on the 3PAR array side. The IBM message also seems to indicate that once their firmware was fixed in 7.5.x, installations of RHEL 7.1 started failing. The OS can use any value below "Optimal transfer length" to perform I/O, and this is what the spec says:

"An OPTIMAL TRANSFER LENGTH field set to a non-zero value indicates the optimal transfer size in logical blocks for a single command shown in table 227. If a device server receives one of these commands with a transfer size greater than this value, then the device server may incur delays in processing the command. An OPTIMAL TRANSFER LENGTH field set to 0000_0000h indicates that the device server does not report an optimal transfer"

What is the fix in the blivet module to determine the alignment offset, and how does it differ from the 7.1 calculation? If a bug number can be provided, that will help us understand.

The change in RHEL-7.3 was the result of bug 1262137. The upstream pull request is here:
https://github.com/rhinstaller/blivet/pull/444

Hello HPE,
This fix is already in RHEL 7.3. It will not be released in RHEL 7.1. I am closing this issue as current release.
Thank You
Joe Kachuck
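
The arithmetic discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not blivet's actual code: the helper names (`blocks_to_bytes`, `round_up`, `aligned_prep_size`) are hypothetical, the 4 MiB requested PReP size is made up for the example, and the assumption that the installer rounds a partition up to a multiple of the device's optimal I/O size is taken from Brian King's description rather than from the installer source. The 512-byte block size, the 16777216-byte optimal I/O size, the 131072-block optimal transfer length, and the 10 MB PReP limit come from the comments above.

```python
MIB = 1024 * 1024
LOGICAL_BLOCK_SIZE = 512   # bytes, from logical_block_size in the thread
PREP_MAX_SIZE = 10 * MIB   # PReP limit quoted in Brian King's message


def blocks_to_bytes(blocks, block_size=LOGICAL_BLOCK_SIZE):
    """Convert a block count from the Block Limits VPD page to bytes."""
    return blocks * block_size


def round_up(size, alignment):
    """Round size up to the next multiple of alignment (0 means 'not reported')."""
    if alignment <= 0:
        return size
    return -(-size // alignment) * alignment


def aligned_prep_size(requested, optimal_io_size):
    """Assumed behavior: partition size becomes a multiple of the optimal I/O size."""
    return round_up(requested, optimal_io_size)


if __name__ == "__main__":
    requested = 4 * MIB  # hypothetical PReP request, for illustration only
    devices = [
        ("3PAR VV", 16777216),                     # optimal_io_size from sysfs
        ("V7K 7.5.X", blocks_to_bytes(131072)),    # optimal transfer length
        ("V7K 7.2.X", blocks_to_bytes(0)),         # not reported
    ]
    for label, optimal in devices:
        size = aligned_prep_size(requested, optimal)
        verdict = "too large" if size > PREP_MAX_SIZE else "ok"
        print(f"{label}: optimal I/O {optimal // MIB} MiB, "
              f"PReP {size // MIB} MiB ({verdict})")
```

Under these assumptions, any device that reports an optimal I/O size above 10 MiB pushes the PReP partition past its limit, which matches the failure described for both the 3PAR VV and the V7K with 7.5.X firmware.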