Bug 142263

Summary: Only 16 EMC powerpath LUNs usable with LVM1
Product: Red Hat Enterprise Linux 3 Reporter: Thomas Uebermeier <uthomas>
Component: lvm Assignee: Doug Ledford <dledford>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: agk, coughlan, hgarcia, jrfuller, kanderso, peterm, petrides, poelstra, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2005-663 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-09-28 14:34:44 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 156320    
Attachments:
Description         Flags
lvmdiskscan -d      none
/proc/partitions    none

Description Thomas Uebermeier 2004-12-08 17:19:29 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20040922

Description of problem:
vgextend uses /dev/emcpowera to /dev/emcpowerp, but stops accepting
devices beyond this (/dev/emcpowerq ...)

Version-Release number of selected component (if applicable):
lvm-1.0.8-8

How reproducible:
Always

Steps to Reproduce:
1. set up LVM, create one or more volume groups, etc.
2. set up EMC PowerPath, create more than 16 LUNs
3. add /dev/emcpowera to /dev/emcpowerp to these VGs
4. vgextend vgsgdb2data01 /dev/emcpowerq /dev/emcpowerr

Actual Results:  # vgextend vgsgdb2data01 /dev/emcpowerq /dev/emcpowerr
vgextend -- INFO: maximum logical volume size is 2 Terabyte
vgextend -- ERROR: no physical volumes usable to extend volume group
"vgsgdb2data01"

Expected Results:  /dev/emcpowerq and /dev/emcpowerr added to VG
vgsgdb2data01

Additional info:

The error message seems to be the default error printed whenever vgextend
fails, and is therefore misleading.

There also seem to be several other problems involving LVM1 + EMC
PowerPath, e.g. LVM cannot distinguish between the "raw" SCSI devices
(sda, sdb, ...) and the bound meta devices (emcpowera, ...) and happily
takes whichever comes first, making the redundancy of multipathing useless.

Comment 3 Heinz Mauelshagen 2004-12-09 10:25:09 UTC
Thomas,
I need to add a filter to the device discovery code in LVM1
in order to avoid access to the 'raw' devices in case PowerPath
ones give access.
The other issue about the limited number of devices (16) needs tweaking
in the same discovery code.
Hopefully will get around to this before Christmas ;)
Still no PowerPath copy here to be able to test myself (EMC claims
they are pushing delivery).
I'll push any fixes to you to test on the Bladecenter configuration first.

Comment 4 Thomas Uebermeier 2004-12-09 10:41:56 UTC
Heinz, 
I just received some more information from the customer, based on which
I would put the "max 16 devices" issue on hold: the devices
do *not* show up in /proc/partitions, sorry for the confusion...
The emcpower*/sd* device issue still remains (and is much nastier).
Shall I create a new bug for that, so we can track it separately?

Comment 5 Heinz Mauelshagen 2004-12-09 11:34:53 UTC
Thomas,

in light of the PowerPath /proc/partitions bug, yes, please create
a new bug in order to track the /dev/sd* filter issue separately
from this one.

Comment 6 Heinz Mauelshagen 2004-12-09 11:41:55 UTC
Background on the 16 max limit issue:

----- Forwarded message from "goggin, edward" <egoggin> -----

Assuming this is a 2.4 kernel ...

the 16 device limit likely applies to emcpower whole device names which
show up in /proc/partitions.  Only 16 emcpower whole device names show
up in this file due to the code in disk_name() in
fs/partitions/check.c, which both (1) enforces a very simplistic naming
policy on all devices other than SCSI, IDE, and a few privileged others,
and (2) assumes a driver can only manage a single major number.

This single major number requirement is not met by the PowerPath driver
since like the scsi class driver, it manages 16 major numbers.  Given
enough LUNs in your SAN, you will see the device name sequence
(emcpowera, ... , emcpowerp) repeated multiple times in /proc/partitions,
each time with a different major number, from 247-232 inclusive.
Sixteen show up for each major number since, like scsi, PowerPath's dev_t
uses 4 minor bits for indicating partition so there are but 4 bits left
in each 8-bit minor to specify whole device instance.
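
[Editor's illustration] The arithmetic above can be sketched in a few lines.
This is a Python model of the idea, not kernel code: the function names are
made up, and generic_name() is only an assumed simplification of a naming
policy that derives the letter from the unit number and ignores the major.

```python
PARTITION_BITS = 4                            # low 4 bits of the minor: partition
UNITS_PER_MAJOR = 1 << (8 - PARTITION_BITS)   # 16 whole devices per major


def split_minor(minor):
    """Return (unit, partition) for an 8-bit minor number."""
    if not 0 <= minor <= 0xFF:
        raise ValueError("2.4 minors are 8 bits wide")
    return minor >> PARTITION_BITS, minor & ((1 << PARTITION_BITS) - 1)


def generic_name(prefix, minor):
    """Hypothetical naming rule that looks at the minor only (assumption)."""
    unit, partition = split_minor(minor)
    name = prefix + chr(ord("a") + unit)
    return name if partition == 0 else name + str(partition)


# The same minor under two different majors yields the same name,
# which is why emcpowera..emcpowerp repeat in /proc/partitions.
names = {major: generic_name("emcpower", 0x10) for major in (232, 233)}
print(names)
```

With only 4 bits for the unit, no letter past "p" can be produced under such a
rule, regardless of how many majors the driver owns.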

This is all done better in the 2.6 kernel since each driver is allowed
to determine the name of the devices the driver manages and record the
name in the gendisk entry (which is per minor not per major) for each
device.

SuSE has addressed this problem in its 2.4 based SLES 8 distribution
by introducing a "driver name" callout in the per-major gendisk structure
and calling this callout in disk_name().
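
[Editor's illustration] A driver-supplied naming callout of the kind described
might behave roughly as below. This is a Python sketch under loud assumptions:
driver_name(), the order in which majors map to device slots, and the
sd-style lettering past "z" are all hypothetical and are not taken from the
SLES 8 patch or from PowerPath itself.

```python
import string

UNITS_PER_MAJOR = 16  # 4 unit bits per 8-bit minor, as described above


def bijective_suffix(index):
    """0 -> 'a', 25 -> 'z', 26 -> 'aa' (assumed sd-style lettering)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = string.ascii_lowercase[rem] + letters
    return letters


def driver_name(major, minor, majors):
    """Name a device using the major's slot as well as the minor.

    `majors` lists the driver's major numbers in the order the driver is
    assumed to fill them; the real ordering is not stated in this report.
    """
    slot = majors.index(major)
    unit, partition = minor >> 4, minor & 0x0F
    name = "emcpower" + bijective_suffix(slot * UNITS_PER_MAJOR + unit)
    return name if partition == 0 else name + str(partition)


majors = list(range(232, 248))            # assumed ascending order
print(driver_name(232, 0xF0, majors))     # last unit of the first major
print(driver_name(233, 0x00, majors))     # first unit of the second major
```

The point of the sketch is only that the name depends on (major, minor)
together, so the seventeenth device gets a name distinct from the first.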

We have been aware of this issue for over 2 1/2 years and have made Red Hat
aware of the issue for that same length of time.  My suspicion has been that
due to Red Hat's goal to provide backward compatibility for kernel modules
in its 2.4 based enterprise distributions, it was more difficult to fix
this issue.

Comment 7 Thomas Uebermeier 2004-12-09 11:55:39 UTC
Created attachment 108192 [details]
/proc/partitions

OK, that makes sense now. It isn't very nice, though, that the device naming
in the EMC admin tools (where you can get emcpowerq and above) differs from
what shows up in /proc/partitions.
Is there any chance of ever getting this fixed within RHEL3 ?
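
[Editor's illustration] One way to check a /proc/partitions dump for the
repeated names described in comment 6 is sketched below in Python. The helper
name emcpower_majors() and the SAMPLE text are invented for illustration (the
sample is not the attached file), and whole devices are detected by assuming
4 partition bits per minor.

```python
from collections import defaultdict


def emcpower_majors(partitions_text):
    """Map each emcpower whole-device name to the set of majors listing it."""
    seen = defaultdict(set)
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data lines start with numeric major and minor; skip header/blank lines.
        if len(fields) < 4 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        major, minor, name = int(fields[0]), int(fields[1]), fields[3]
        # Assumption: minor % 16 == 0 marks a whole device (4 partition bits).
        if name.startswith("emcpower") and minor % 16 == 0:
            seen[name].add(major)
    return dict(seen)


# Made-up sample in the "major minor #blocks name" layout.
SAMPLE = """\
major minor  #blocks  name

 232     0   10485760 emcpowera
 232     1    5242880 emcpowera1
 232    16   10485760 emcpowerb
 233     0   10485760 emcpowera
"""

result = emcpower_majors(SAMPLE)
duplicates = sorted(name for name, majors in result.items() if len(majors) > 1)
print(duplicates)
```

Any name reported in `duplicates` is one that appears under more than one
major, i.e. a name that no longer identifies a single LUN.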

Comment 8 Thomas Uebermeier 2004-12-09 14:04:07 UTC
Follow-up bug for device filtering:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142386 

Comment 22 Marty Wesley 2005-05-26 06:49:11 UTC
PM ACK for U6

Comment 27 Doug Ledford 2005-07-07 22:57:07 UTC

*** This bug has been marked as a duplicate of 79086 ***

Comment 28 Ernie Petrides 2005-07-29 02:24:11 UTC
A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-34.EL).

Propagating acks from bug 79086.


Comment 30 Red Hat Bugzilla 2005-09-28 14:34:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html