Red Hat Bugzilla – Bug 264001
RFE - Document the ability to remove/add LUNs dynamically
Last modified: 2010-10-22 14:13:04 EDT
Description of problem:
- RHEL4 is based on a 2.6.9+ kernel. This kernel, by default, does not delete
devices when connectivity is lost. The sdev simply goes into an offline state.
- Later upstream kernels, including 2.6.18 (the RHEL5 base) and later,
actively tear down the devices when connectivity is lost (a side effect of
the fc transport).
- Many bugs were found in the "removal" code paths, with several bugs being
data structures that were erroneously freed, then reallocated by other
code paths. This happened on target and sdev level structures.
- The patches to correct this made it into 2.6.19 and 2.6.20, and there are
still a couple of residual "sdev resurrection" patches, posted in June 2007,
that have yet to make it upstream.
- Many of these paths were discovered so late in the RHEL5 schedule (and
SLES10's, actually) that both distros put non-upstream patches in the
fc transport to avoid the deletion of devices.
So, although the linux kernel says it supports hot removal, and the interfaces
exist, as policy, I would not support "lun removal".
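To see which behaviour a given host shows, the per-device state can be read from sysfs. The following is a minimal sketch, not part of the original report: it assumes the usual /sys/class/scsi_device/H:C:T:L/device/state layout, and the function names are hypothetical. It only reads state; it does not delete or rescan anything.

```python
from pathlib import Path


def scsi_device_states(sysfs_root="/sys/class/scsi_device"):
    """Return {"H:C:T:L": state} for every SCSI device found under sysfs_root.

    The sysfs layout is an assumption; it may differ on very old kernels.
    """
    states = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return states
    for entry in sorted(root.iterdir()):
        state_file = entry / "device" / "state"
        try:
            states[entry.name] = state_file.read_text().strip()
        except OSError:
            # Device vanished mid-scan or the attribute is unreadable; skip it.
            continue
    return states


def offline_devices(states):
    """Return the addresses whose reported state is exactly 'offline'."""
    return [addr for addr, state in states.items() if state == "offline"]
```

On a kernel that leaves sdevs in place, a lost LUN would be expected to show up in offline_devices(scsi_device_states()); on a kernel that tears devices down, the entry would simply be absent.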
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Customers require the ability to de-allocate storage on the fly without
rebooting. After discussion with Red Hat engineering and Emulex engineering, it
appears the kernel in RHEL 5 has bugs in the remove path which prevent this from
working reliably. This feature request is to apply whatever patches necessary
from upstream and to test thoroughly so that customers can expect a way to
execute a scan that would remove any storage that has been deallocated.
The upstream kernel struggled with this problem, as noted in comment 0. This
leads me to think that the changes may be too disruptive to backport to RHEL 5.
On the other hand, I believe there will be a strong demand for this, because
storage arrays these days are dynamically creating and deleting LUNs (e.g.
snapshots) as a matter of normal operation.
Chip, please take a look and see what specific problems exist in 5.1, and what
specific fixes we might backport. Also consider this as part of the larger
effort to discover and manage on-line storage config changes.
Created attachment 294606
Online Storage Reconfiguration Guide (latest PDF build)
as requested. mind the placeholders.
please have all reviewers email me at email@example.com for revisions,
suggestions and other concerns regarding this document. i'd be happy to
integrate as much content as can be supplied. thanks!
can you sign off on the document for release with RHEL5.2 (to be made available
online only)? when you do, i will remove all placeholders. the latest build can
be found here:
Putting on the 5.4 list to possibly move this doc from tech preview to fully supported...
The link in comment 37 is no longer available. Use the publicly available link:
the updated OSRG is public now on:
Both links can be found on:
Note: i just discovered the link to the PDF on that index page was incorrect. i've pushed the correction earlier today and it should be up within the next 24 hours or so.
Since this has been built and synced to redhat.com (and doesn't need to be built by rel-eng or reviewed by QE), I think this can be closed manually. Will double-check with TomC.
SCSI rescan script stuff that went into *5.4* needs to be included.
My suggestion on documenting 'rescan-scsi-bus.sh' and its limitations follows.
I would add a new chapter to the 'Online Storage Configuration Guide'
after chapter 7. 'Removing Devices'
I would call the new chapter 'Automated LUN Addition and Removal'
I would refer to this new chapter at the end of chapters 5 and 7.
The new chapter 'Automated LUN Addition and Removal'
should read as:
The script rescan-scsi-bus.sh is available as part of the sg3_utils package.
(May want to add a section here on how a customer gets the sg3_utils
package. Is it something like: yum install sg3_utils?).
The script can be used to automatically update the host LUN configuration
following LUN addition and removal.
Limitations of rescan-scsi-bus.sh
(from bz507379 comment 31)
- LUN0 should be the first LUN mapped for rescan-scsi-bus.sh to work correctly. If LUN0 is not the first LUN mapped, the first LUN mapped will not be detected, nor will other LUNs that should be scanned. Using the --nooptscan option does not work around this.
- Due to a bug in the rescan-scsi-bus.sh script, the functionality to recognize a
change in the size of a LUN executes when the --remove option is used.
- rescan-scsi-bus.sh needs to be run twice when LUNs are mapped for the first time for all LUNs to be recognized.
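The "run twice" limitation above could be wrapped as in the sketch below. This is an illustration only, not text proposed for the guide: the helper names are hypothetical, the only option used is the --remove option mentioned above, and the default of two passes comes from the limitation as stated. The runner parameter exists so the logic can be exercised without touching the SCSI bus.

```python
import shutil
import subprocess

SCRIPT = "rescan-scsi-bus.sh"  # shipped in the sg3_utils package


def script_available():
    """Return True if the rescan script is found on PATH."""
    return shutil.which(SCRIPT) is not None


def build_command(remove=False):
    """Build the argument list; --remove is the only option used here."""
    cmd = [SCRIPT]
    if remove:
        # Per the limitation above, the size-change check also runs here.
        cmd.append("--remove")
    return cmd


def rescan(remove=False, passes=2, runner=subprocess.run):
    """Run the script `passes` times (two by default, for newly mapped LUNs).

    `runner` defaults to subprocess.run; check=True makes a non-zero exit
    status raise an exception instead of being silently ignored.
    """
    cmd = build_command(remove)
    for _ in range(passes):
        runner(cmd, check=True)
    return cmd
```

A caller would normally check script_available() first and run rescan() as root. Neither step works around the LUN0 limitation.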
i've revised the document as per specification (in source). since the only remaining issue in the doc was replacing "LUN" with "logical units" where applicable, i'll be pushing it to public later today.
closing this bug as CLOSED -> CURRENTRELEASE.