Bug 264001 - RFE - Document the ability to remove/add LUNs dynamically
Summary: RFE - Document the ability to remove/add LUNs dynamically
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: Online_Storage_Reconfiguration_Guide
Version: 5.2
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: rc
Target Release: 5.4
Assignee: Don Domingo
QA Contact:
URL:
Whiteboard:
Depends On: 238421
Blocks: 432577 461680 483784
Reported: 2007-08-29 16:17 UTC by David Mair
Modified: 2018-10-19 19:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-08 02:30:38 UTC
Target Upstream Version:
Embargoed:


Attachments
Online Storage Reconfiguration Guide (latest PDF build) (54.91 KB, application/pdf)
2008-02-11 23:23 UTC, Don Domingo

Description David Mair 2007-08-29 16:17:36 UTC
Description of problem:
- RHEL4 is based on a 2.6.9+ kernel. This kernel, by default, does not delete
devices when connectivity is lost. The sdev simply goes into an offline state.

- Later upstream kernels, including 2.6.18 (the RHEL5 base) and later,
actively tear down the devices when connectivity is lost (a side effect of
the fc transport).

- Many bugs were found in the "removal" code paths, several of which involved
  data structures that were erroneously freed and then reallocated by other
  code paths. This happened on target-level and sdev-level structures.

  - The patches to correct this made it into 2.6.19 and 2.6.20, and there
    are still a couple of residual "sdev resurrection" patches, posted
    in June 2007, that have yet to make it upstream.

- Many of these paths were discovered so late in the RHEL5 schedule
  (and the SLES10 schedule, actually) that both distros put non-upstream
  patches in the fc transport to avoid the deletion of devices.

So, although the Linux kernel says it supports hot removal, and the interfaces
exist, as policy, I would not support "LUN removal".
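
For illustration only, the offline state described above can be observed
through sysfs. The sketch below is in Python (an arbitrary choice; nothing in
this report uses it) and assumes each SCSI device exposes a "state" attribute
at /sys/class/scsi_device/<H:C:T:L>/device/state. The function names and the
sysfs_root parameter are hypothetical and exist only for this example.

```python
from pathlib import Path


def scsi_device_states(sysfs_root="/sys"):
    """Map each SCSI device address (H:C:T:L) to its reported state.

    Assumption: the state attribute lives under
    <sysfs_root>/class/scsi_device/<H:C:T:L>/device/state.
    """
    states = {}
    base = Path(sysfs_root) / "class" / "scsi_device"
    if not base.is_dir():
        # No SCSI class directory (or not a Linux host): nothing to report.
        return states
    for entry in sorted(base.iterdir()):
        state_file = entry / "device" / "state"
        try:
            states[entry.name] = state_file.read_text().strip()
        except OSError:
            # The device vanished or has no state attribute; skip it.
            continue
    return states


def offline_devices(sysfs_root="/sys"):
    """Return the addresses of devices whose state is 'offline'."""
    return [addr for addr, state in scsi_device_states(sysfs_root).items()
            if state == "offline"]
```

Calling offline_devices() with no arguments reads the live /sys tree; passing a
different root makes it possible to try the sketch against a copied or mocked
directory layout.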

Additional info:

Customers require the ability to deallocate storage on the fly without
rebooting. After discussion with Red Hat engineering and Emulex engineering, it
appears the kernel in RHEL 5 has bugs in the remove path which prevent this from
working reliably. This feature request is to apply whatever patches are
necessary from upstream and to test thoroughly, so that customers can expect a
way to execute a scan that removes any storage that has been deallocated.

Comment 4 Tom Coughlan 2007-11-02 22:32:05 UTC
The upstream kernel struggled with this problem, as noted in comment 0. This
leads me to think that the changes may be too disruptive to backport to RHEL 5.
On the other hand, I believe there will be a strong demand for this, because
storage arrays these days are dynamically creating and deleting LUNs (e.g.
snapshots) as a matter of normal operation.

Chip, please take a look and see what specific problems exist in 5.1, and what
specific fixes we might backport. Also consider this as part of the larger
effort to discover and manage online storage configuration changes.

Comment 34 Don Domingo 2008-02-11 23:23:15 UTC
Created attachment 294606
Online Storage Reconfiguration Guide (latest PDF build)

as requested. mind the placeholders.

please have all reviewers email me at ddomingo for revisions,
suggestions and other concerns regarding this document. i'd be happy to
integrate as much content as can be supplied. thanks!

Comment 37 Don Domingo 2008-04-09 00:53:16 UTC
Tom,
can you sign off on the document for release with RHEL5.2 (to be made available
online only)? when you do, i will remove all placeholders. the latest build can
be found here:

https://engineering.redhat.com/docbot/en-US/Red_Hat_Enterprise_Linux/0.0/html/Online_Storage_Reconfiguration_Guide/

thanks!

Comment 46 Andrius Benokraitis 2008-10-20 13:23:46 UTC
Putting on the 5.4 list to possibly move this doc from tech preview to fully supported...

Comment 51 Rob Evers 2009-06-16 13:15:56 UTC
The link in comment 37 is no longer available. Use the publicly available link:

http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/html/Online_Storage_Reconfiguration_Guide/index.html

Comment 59 Don Domingo 2009-07-16 02:10:34 UTC
Andrius, Rob,

the updated OSRG is public now on:

html
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/Online_Storage_Reconfiguration_Guide/index.html

pdf
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/pdf/Online_Storage_Reconfiguration_Guide.pdf

Both links can be found on:

http://www.redhat.com/docs/manuals/enterprise/

Note: i just discovered the link to the PDF on that index page was incorrect. i've pushed the correction earlier today and it should be up within the next 24 hours or so.

Comment 62 Andrius Benokraitis 2009-07-16 03:11:10 UTC
Since this has been built and synced to redhat.com (and doesn't need to be built by rel-eng or reviewed by QE), I think this can be closed manually. Will double-check with TomC.

Comment 64 Andrius Benokraitis 2009-07-17 13:23:08 UTC
SCSI rescan script stuff that went into *5.4* needs to be included.

Comment 66 Rob Evers 2009-07-29 17:19:22 UTC
Hi Don,

My suggestion on documenting 'rescan-scsi-bus.sh' and its limitations follows.

Rob

I would add a new chapter to the 'Online Storage Reconfiguration Guide'
after chapter 7, 'Removing Devices'.

I would call the new chapter 'Automated LUN Addition and Removal'
I would refer to this new chapter at the end of chapters 5 and 7.

The new chapter 'Automated LUN Addition and Removal'
should read as:

The script rescan-scsi-bus.sh is available as part of the sg3_utils package.
(May want to add a section here on how a customer gets the sg3_utils
package.  Is it something like:  yum install sg3_utils?).

The script can be used to automatically update the host LUN configuration
following LUN addition and removal.

Limitations of rescan-scsi-bus.sh

(from bz507379 comment 31)

LUN0 should be the first LUN mapped for rescan-scsi-bus.sh to work correctly.  If LUN0 is not the first LUN mapped, the first LUN mapped will not get detected, nor will other LUNs that should be scanned.  Using the --nooptscan option does not work around this.

Due to a bug in the rescan-scsi-bus.sh script, the functionality to recognize a
change in the size of a LUN executes when the --remove option is used.

When LUNs are mapped for the first time, rescan-scsi-bus.sh needs to be run twice for all LUNs to be recognized.
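
As a rough sketch of how the "run twice" limitation might be handled in
practice, the Python wrapper below invokes the script a configurable number of
times. It assumes rescan-scsi-bus.sh is on the PATH and uses only the --remove
option named above. The function names, the passes default, and the runner
parameter are hypothetical; runner exists only so the command can be inspected
without touching real hardware.

```python
import subprocess


def build_rescan_command(remove=False, script="rescan-scsi-bus.sh"):
    """Build the argument list for one invocation of the rescan script."""
    cmd = [script]
    if remove:
        # Per the limitation above, --remove also triggers the LUN
        # size-change detection code in the script.
        cmd.append("--remove")
    return cmd


def rescan(remove=False, passes=2, runner=subprocess.call):
    """Run the rescan script `passes` times and return each exit code.

    Two passes is the default because newly mapped LUNs may not all be
    recognized after a single run.
    """
    cmd = build_rescan_command(remove=remove)
    return [runner(cmd) for _ in range(passes)]
```

For example, rescan() runs two plain scans after new LUNs are mapped, and
rescan(remove=True, passes=1) runs a single scan with --remove.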

Comment 67 Don Domingo 2009-09-08 02:30:38 UTC
i've revised the document as per specification (in source). since the only remaining issue in the doc was replacing "LUN" with "logical units" where applicable, i'll be pushing it to public later today. 

closing this bug as CLOSED -> CURRENTRELEASE.

