Bug 1463080

Summary: Issue while installing RHEL7.1 as Boot From SAN on a 3PAR VV presented to IBM LPAR partition by NPIV (vfc)
Product: Red Hat Enterprise Linux 7 Reporter: Venkatesan <venkatesan.arumugam>
Component: anaconda Assignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED CURRENTRELEASE QA Contact: Release Test Team <release-test-team-automation>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.1 CC: coughlan, dhorak, dinesh.surpur, dlehman, hannsj_uhl, jkachuck, mike.pechulis, mknutson, msnitzer, phinchman, secondary-arch-list, srinivas.lingampalli, venkatesan.arumugam
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-30 16:47:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1438583, 1445812    
Attachments:
Description Flags
Error Message snapshot none

Description Venkatesan 2017-06-20 05:57:53 UTC
Created attachment 1289399 [details]
Error Message snapshot

We are seeing an issue with Boot from SAN (BFS) on RHEL 7.1 when 3PAR LUNs are presented to the LPAR partition profile using virtual Fibre Channel (vFC). RHEL 7.1 installation with boot from SAN completes without any issues when the 3PAR LUN is presented through the VIOS using vSCSI. The following issue is seen only when the BFS LUN is presented using vFC. The LUN size used is 50 GB.

Error message: (partition is too large for PPC PReP Boot formatting (allowable size is 4096 KiB to 10 MiB)).

There are two ways to present the 3PAR LUN to the IBM LPAR partition profile. I will explain both scenarios in the Steps to Reproduce section.
 
  - vSCSI presentation of the 3PAR array through the VIOS, where we are able to install the OS.
  - vFC presentation of the 3PAR array through the VIOS, where we are seeing this issue.


Bug details on similar issue from IBM site:
------------------------------------------

  We could find following from IBM sites on the issue: https://www.ibm.com/developerworks/community/forums/html/topic?id=e0a702e8-3325-4bb5-a251-79bff8821d28

Excerpts from the doc :
<<
    “The issue is that certain levels of IBM storage subsystem firmware return inappropriate information to the RHEL installer regarding the characteristics of the LUN.  The issue is documented in Bug 128493 in the IBM Linux Technology Center.  In this particular case, the problem was resolved by upgrading the V7000 firmware level to V7.6.0.0.”
>>  IBM Bug 128493


Workaround Tried:
-----------------
  We also tried creating a PPC PReP Boot partition (type 0x41) of about 8 MB on the boot drive and then installing, but the installation still fails with the error message above.

Reference  :  https://www.ibm.com/support/knowledgecenter/POWER5/iphbi_p5/iphbibook.pdf
  “The disk partition on the virtual disk must be formatted as type PReP Boot (type 0x41) and marked as a device that starts. You can format a disk partition as type PReP Boot by using the Linux fdisk command with the –t option. You can specify that the disk partition starts by using the fdisk command with the –a option.”  

Configuration details:
----------------------

System Model: IBM,8286-41A
Processor Type: PowerPC_POWER8
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 1
Processor Clock Speed: 3026 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
Memory Size: 6144 MB
Good Memory Size: 6144 MB
Platform Firmware level: SV810_146
Firmware Version: IBM,SV810_146
NX Crypto Acceleration: Capable and Enabled
VIOS version: 2.2.3.4
Host OS on LPAR Partition: RHEL 7.1 ppc64 & RHEL 7.2 ppc64 (the issue occurs on both updates)
HBA: IBM lpe12002 (Emulex)
Array : HP 3PAR 8440 (FW version: 3.2.2) 
Attach Protocol: FC

Steps to Reproduce:
-------------------

Case-1: Through vSCSI presentation of 3PAR array to LPAR partition profile.
------
1. On the IBM Power server, install VIOS 2.2.3.4. Connect the Power server to the Hardware Management Console (HMC).
2. Access the Power server from the HMC.
3. Create an LPAR partition profile with two virtual SCSI adapters (note the adapter IDs). Similarly, create virtual SCSI adapters on the VIOS and map them by adapter ID.
4. Create a VV on the 3PAR 8440 array. Create a LUN and map it to the VIOS.
5. Use one vSCSI adapter as a SCSI CD-ROM and load the image into the CD-ROM. Map the LUN to the other vSCSI adapter.
6. Set the boot option in System Management Services (SMS) to SCSI CD-ROM and boot from it.
7. The Base Environment selection menu appears. Select the 3PAR LUN in the Installation Destination section and proceed with the installation.

Case-2: Through vFC presentation of 3PAR array to LPAR partition profile.
------
1. On the IBM Power server, install VIOS 2.2.3.4. Connect the Power server to the Hardware Management Console (HMC).
2. Access the Power server from the HMC.
3. Create an LPAR partition profile with one virtual SCSI adapter and one virtual Fibre Channel adapter (note the adapter IDs). Similarly, create a virtual SCSI adapter and a virtual FC adapter on the VIOS and map them by adapter ID.
4. Create a VV on the 3PAR 8440 array. Create a LUN and map it to the LPAR partition profile using the virtual WWPN created by NPIV.
5. Use the vSCSI adapter as a SCSI CD-ROM and load the image into the CD-ROM. Map the LUN to the virtual Fibre Channel adapter.
6. Set the boot option in System Management Services (SMS) to SCSI CD-ROM and boot from it.
7. The Base Environment selection menu appears. Select the 3PAR LUN in the Installation Destination section. Selecting the 3PAR LUN fails with the error (partition is too large for PPC PReP Boot formatting (allowable size is 4096 KiB to 10 MiB)).

The steps explained here are brief. Please let me know if you face any problems while reproducing the issue. Looking forward to your assistance in resolving it.

Additional info: The RHEL 7.1 image used here is a customized one that supports IBM Power servers (RHEL-7.1-20150219.1-Server-ppc64-dvd1.iso).

Thanks & Regards,
A.Venkatesan

Comment 2 Dan Horák 2017-06-20 07:17:10 UTC
Switching to the installer as a component. If I read it correctly, then the issue is in the storage configuration (PReP partition made too large) during the installation.

IBM, I would also recommend trying with recent RHELs, either 7.3 or 7.4 beta.

Comment 3 Venkatesan 2017-06-20 08:25:16 UTC
Hi Dan,

   Yes, we are seeing this issue during storage configuration. We are testing this configuration in response to specific customer requests, so it would be good to get a workaround for RHEL 7.1.

Thanks,
A. Venkatesan.

Comment 4 David Lehman 2017-06-20 17:08:54 UTC
This looks like a result of incorrect alignment information being provided to the OS. There is a code workaround in python-blivet in RHEL-7.3. If the customer needs to stay on 7.1 I think the only option is to fix the firmware to report correct alignment/topology.

Comment 5 Srini 2017-06-21 09:30:42 UTC
Hi David,
Can you please clarify what you mean by "fix the firmware"? Are you suggesting that the 3PAR firmware be fixed so that it provides the correct storage configuration to the OS?

Comment 6 Venkatesan 2017-06-21 17:10:57 UTC
Hi,

 As suggested in the previous comments, we planned to try RHEL 7.3. Unfortunately, the RHEL 7.3 ppc64 ISO image (RHEL-7.3-20161019.0-Server-ppc64-dvd1.iso) that we have is corrupted, so we could not move forward.

 Would it be possible for you to provide us with a RHEL 7.3 GA ppc64 image so that we can continue our testing? We got the ppc images from the following link: http://blofly.us.rdlabs.hpecorp.net/PPC/RedHat/EL7/Update3/GA/Server/


Error Message:
------------------- 
Starting package installation process
Error populating transaction, retrying (1/10)
Error populating transaction, retrying (2/10)
Error populating transaction, retrying (3/10)
Error populating transaction, retrying (4/10)
Error populating transaction, retrying (5/10)
Error populating transaction, retrying (6/10)
Error populating transaction, retrying (7/10)
Error populating transaction, retrying (8/10)
Error populating transaction, retrying (9/10)
Error populating transaction, retrying (10/10)
================================================================================
================================================================================
Error

The following error occurred while installing.  This is a fatal error and
installation will be aborted.

  Error populating transaction after 10 retries: failure: Packages/perl-HTTP-
Tiny-0.033-3.el7.noarch.rpm from anaconda: [Errno 256] No more mirrors to try.
Press enter to exit.

Thanks & Regards,
A.Venkatesan

Comment 7 David Lehman 2017-06-21 17:36:43 UTC
(In reply to Srini from comment #5)
> Hi David,
> Can you please clarify what are you referring as fix the firmware? Are you
> suggesting to fix 3PAR firmware by providing correct storage configuration
> to OS

Yes.

Comment 8 Dinesh Surpur 2017-06-22 17:05:36 UTC
David, I need more details on why you think this is a storage issue. Here are the sizes that the storage is reporting.

This data is from a RHEL 7.3 system but should be the same for RHEL 7.1 as well.

B0 Inquiry Page.

[root@dl380g9-57 ~]# sg_inq -e -p 0xb0 /dev/sdb
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 32 blocks
  Maximum transfer length: 32768 blocks
  Optimal transfer length: 32768 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
  Maximum unmap LBA count: 65536
  Maximum unmap block descriptor count: 10
  Optimal unmap granularity: 32
  Unmap granularity alignment valid: 1
  Unmap granularity alignment: 0


# cat /sys/block/sdg/queue/max_sectors_kb
16384

# cat /sys/block/sdg/queue/optimal_io_size
16777216

# cat /sys/block/sdg/queue/max_segment_size
65536

# cat /sys/block/sdg/queue/minimum_io_size
16384

# cat /sys/block/sdg/queue/logical_block_size
512


[root@dl380g9-57 queue]# smartctl -a /dev/sdg
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-513.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               3PARdata
Product:              VV
Revision:             3310
User Capacity:        53,687,091,200 bytes [53.6 GB]
Logical block size:   512 bytes
Logical Unit id:      0x60002ac000000000000002e10007e025
Serial number:        4UW0001036
Device type:          disk
Transport protocol:   Fibre channel (FCP-2)
Local Time is:        Thu Jun 22 09:57:48 2017 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Disabled
Temperature Warning:  Disabled or Not Supported


[root@dl380g9-57 queue]# blockdev --getioopt /dev/sdg
16777216


Anaconda needs to take the alignment from the storage, which is 16 MiB (16777216 bytes). What default value does it use in RHEL 7.1? Also, what fix was made in the RHEL 7.3 python-blivet module, and is there a Red Hat bug reference for it?

Also, different arrays provide different alignments. Why do you think this is an issue with 3PAR only?
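
For readers cross-checking the numbers above, here is a minimal Python sketch that converts the block counts from the Block Limits VPD page into bytes and compares them with the sysfs values. It assumes the 512-byte logical block size shown above and the usual Linux behavior of deriving minimum_io_size from the optimal transfer length granularity and optimal_io_size from the optimal transfer length. The helper name blocks_to_bytes is illustrative only.

```python
# Values copied from the sg_inq and sysfs output in this comment.
LOGICAL_BLOCK_SIZE = 512       # bytes, from logical_block_size
OPT_XFER_GRANULARITY = 32      # blocks, "Optimal transfer length granularity"
OPT_XFER_LENGTH = 32768        # blocks, "Optimal transfer length"


def blocks_to_bytes(blocks, block_size=LOGICAL_BLOCK_SIZE):
    """Convert a length in logical blocks to bytes (illustrative helper)."""
    return blocks * block_size


minimum_io_size = blocks_to_bytes(OPT_XFER_GRANULARITY)
optimal_io_size = blocks_to_bytes(OPT_XFER_LENGTH)

print(minimum_io_size)                    # 16384, same as minimum_io_size in sysfs
print(optimal_io_size)                    # 16777216, same as optimal_io_size in sysfs
print(optimal_io_size // (1024 * 1024))   # 16 (MiB)
```

The 16 MiB result is the figure referred to in the questions above.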

Comment 9 Dinesh Surpur 2017-06-22 22:32:47 UTC
On a normal ProLiant x86_64 machine with 3PAR, no issues were seen with the RHEL 7.1 install (FYI). So David, I have set needinfo on you to confirm what problem on the array side you suspect when using IBM PowerPC.

Here is the default PE size used by the installer.

[root@localhost ~]# vgdisplay
  --- Volume group ---
  VG Name               rhel
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               49.26 GiB
  PE Size               4.00 MiB
  Total PE              12611
  Alloc PE / Size       12601 / 49.22 GiB
  Free  PE / Size       10 / 40.00 MiB
  VG UUID               b99SBn-Jwgy-PKnI-fsh3-kJN1-EpzO-zLAptf

Comment 10 Srini 2017-06-27 06:01:39 UTC
Received response from IBM bug#1250822 
=====================================

From: John Henderson [johnhend.com] 
Sent: Tuesday, June 27, 2017 4:22 AM
To: Lingampalli, Srinivas <srinivas.lingampalli>
Subject: Fwd: Re: Fwd: RE: RE: BFS issues for RHEL on Power systems with 3PAR

Srinivas,


Here is the Power Linux team's response to your questions....


-jch-



Sent from my iPhone using IBM Verse 
________________________________________
On Jun 26, 2017, 5:45:22 PM, Brian King wrote: 


To: johnhend.com 
Cc: 
Date: Jun 26, 2017, 5:45:22 PM 
Subject: Re: Fwd: RE: RE: BFS issues for RHEL on Power systems with 3PAR 
The link below is a link to an IBM internal bugzilla which does not have external access. However, this was mirrored into Red Hat's bugzilla and the related Red Hat bugzilla is this: 
  
https://bugzilla.redhat.com/show_bug.cgi?id=1250822 
  
  
Below is the analysis done for the issue we discovered on IBM storage which was fixed in a firmware update: 
  
Here is what we get when reading the block limits VPD page on V7K with 7.2.X code:

[root@amp ~]# sg_inq --page=0xB0 /dev/sdm
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 128 blocks
  Optimal transfer length granularity: 0 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 0
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0

Here is what we get when reading the block limits VPD page on V7K with 7.5.X code:

[root@amp ~]# sg_inq --page=0xB0 /dev/sdt
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 128 blocks
  Optimal transfer length granularity: 64 blocks
  Maximum transfer length: 2097152 blocks
  Optimal transfer length: 131072 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 0
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0

Two key fields changed here:

First, the Optimal transfer length granularity changed from 0 to 64 blocks, which is 32k.
This means Linux will try to send 32k aligned / 32k multiple length ops to the device.
This should be OK. 512e drives set this to 4k, the software RAID stack sets this to its chunk size,
which is 512k by default, so Linux should handle this fine.

Secondly, the Optimal transfer length changed from 0 to 131072 blocks, which is 64MB. Most devices
don't set this. The software RAID stack sets it to be the chunk size * number of RAID disks.
This is the change that is causing grief, as has already been stated. It looks like the OS is trying
to create partitions that are aligned to this boundary and are a multiple of this boundary, but
PReP partitions can only be as large as 10MB...
Thanks, 
  
Brian 

Brian King 
Power Linux I/O Focal Point 
Linux Technology Center
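
The arithmetic behind this analysis can be sketched in Python as follows. This is not the installer's or blivet's actual code. It only assumes, as the analysis above suggests, that the partition length gets rounded up to a multiple of the reported optimal I/O size. The helper names and the 4 MiB requested size are hypothetical, and the 1 MiB grain is included purely as a comparison point for a device that reports no large optimal I/O size.

```python
MIB = 1024 * 1024

# Bounds quoted in the installer error message.
PREP_MIN_SIZE = 4096 * 1024   # 4096 KiB
PREP_MAX_SIZE = 10 * MIB      # 10 MiB


def round_up(size, grain):
    """Round size up to the next multiple of grain (both in bytes)."""
    return -(-size // grain) * grain


def fits_prep(size):
    """True if size is within the allowable PReP Boot partition range."""
    return PREP_MIN_SIZE <= size <= PREP_MAX_SIZE


# Hypothetical 4 MiB request, aligned to different grains.
requested = 4 * MIB
grains = [
    ("1 MiB grain (comparison, assumed)", 1 * MIB),
    ("3PAR VV, optimal I/O 16 MiB", 16 * MIB),
    ("V7000 7.5.x, optimal I/O 64 MiB", 64 * MIB),
]

for label, grain in grains:
    size = round_up(requested, grain)
    verdict = "ok" if fits_prep(size) else "too large"
    print(f"{label}: {size // MIB} MiB -> {verdict}")
```

Under these assumptions, any grain above 10 MiB pushes the smallest aligned PReP partition past the allowed maximum, which matches the error seen with both the 16 MiB and the 64 MiB optimal transfer lengths.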

Comment 11 David Lehman 2017-06-27 14:58:57 UTC
If the storage vendor intentionally advertises an optimal transfer length of 16 MiB then obviously there is nothing to fix there. However, the workaround for this issue in blivet was included in RHEL-7.3. We cannot update the RHEL-7.1 installation media retroactively.

Comment 12 Dinesh Surpur 2017-06-27 22:12:54 UTC
Comment #11 from David indicates that there are no issues on the 3PAR array side. The IBM message also seems to indicate that RHEL 7.1 installations started failing once their firmware was fixed in 7.5.x.

The OS can use any value below the "Optimal transfer length" to perform I/O, which is what the spec says:

"An OPTIMAL TRANSFER LENGTH field set to a non-zero value indicates the optimal transfer size in logical blocks for a single command shown in table 227. If a device server receives one of these commands with a transfer size greater than this value, then the device server may incur delays in processing the command. An OPTIMAL TRANSFER LENGTH field set to 0000_0000h indicates that the device server does not report an optimal transfer"

What is the fix in the blivet module for determining the alignment offset, and how does it differ from the 7.1 calculation? A bug number would help us understand.

Comment 13 David Lehman 2017-06-28 18:13:24 UTC
The change in RHEL-7.3 was the result of bug 1262137. The upstream pull request is here: https://github.com/rhinstaller/blivet/pull/444

Comment 16 Joseph Kachuck 2017-11-30 16:47:53 UTC
Hello HPE,
This fix is already in RHEL 7.3. It will not be released in RHEL 7.1.
I am closing this issue as current release.

Thank You
Joe Kachuck