Bug 111342

Summary: get_parition_list can loose disks
Product: Red Hat Enterprise Linux 2.1 Reporter: Neil Horman <nhorman>
Component: kernelAssignee: Jason Baron <jbaron>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1CC: knoel, riel, tao
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-04-22 01:03:02 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 107565    

Description Neil Horman 2003-12-02 13:22:28 UTC
Description of problem:
 It looks like several specific sd entries are always deleted
when other sd entries whose lun number is within 16 apart
from the lun numbers of these specific sd entries, using
"sdcb: 2 0 0 38" for example:

If removing any sd entries from "sdbm: 2 0 0 23" thru
"sdca: 2 0 0 37", then "sdcb: 2 0 0 38" would also be
removed along the way.

So the same applies to the following:

sdbl: 2 0 0 22
sdav: 2 0 0 6
sdp: 3 0 0 14
sdaf: 3 0 0 30


This is due a bug in linux kernel in get_partition_list() in
drivers/block/genhd.c. This function uses a variable nr_real to loop
through the disks. When a disk is removed using remove-single-device,
all entries(nsect) for the disks are made zero BUT it is not removed
from the list of disks. However, nr_real is decremented by calling
sd_detach().

As a result, whenever a disk is removed, last disk in the list
gets ignored while reporting disks in /proc/partitions - unless ofcourse
the disk being removed is the last disk in that list.

Since VM uses /proc/partitions to get the disk info, it also fails
to report the disk.


Here is a look at the source:
************************************************************
       The following function is causing the problem.

get_partition_list(char *page, char **start, off_t offset, int count)
{
       struct gendisk *gp;
       ...
       ...
       for (gp = gendisk_head; gp; gp = gp->next) {
               for (n = 0; n < (gp->nr_real << gp->minor_shift); n++) {
                       if (gp->part[n].nr_sects == 0)
                               continue;

               }
               ....
               ....
}

For every major number(16 disks), there will be one gendisk
structure.  Look at the second loop in the function.
That second loop is supposed to loop on all devices.
But this is not happening due to the followng.

Lets assume that gendisk structure contains 16 disks (i.e. gp->nr_real
= 16).

Suppose 1st disk is removed,(while detaching it decremnets
gp->nr_real value), then then the loop is executed for 0-14
only.  So, the last disk is not printed(i.e it is also
removed).

The fist disk is anyway removed from list because it sets
gp->part[n].nr_sects to zero, while detaching.

Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
issue tracker ticket #29927 contains a script to show loss of scsi devices
  
Actual results:
scsi devices are not reported in /proc/partitions

Expected results:
All scsi devices should be reported in /proc/partitions.


Additional info:

Comment 1 Jason Baron 2003-12-03 00:19:27 UTC
ok, this is some helpful analysis...seems to me that this is bug
upstream too...ie all linux kernels have this issue, is that your
finding? Also, i looked at the code a bit and it seems that typical
users of the gendisk interface, such as ide, pin the nr_real to its
max value. So here we could have the drivers/scsi/sd.c, simply pin
nr_real to 16. If you agree this, its an easy patch which i can do,
and we can verify its correctness w/your test script.

Comment 2 Jason Baron 2003-12-03 00:40:07 UTC
so something like the following would probably work, although i
havne't tested it.

--- linux/drivers/scsi/sd.c.bak 2003-12-02 19:40:13.000000000 -0500
+++ linux/drivers/scsi/sd.c     2003-12-02 19:41:35.000000000 -0500
@@ -1207,7 +1207,7 @@ static int sd_init()
                        goto cleanup_gendisks_part;
                memset(sd_gendisks[i].part, 0, (SCSI_DISKS_PER_MAJOR
<< 4) * sizeof(struct hd_struct));
                sd_gendisks[i].sizes = sd_sizes + (i *
SCSI_DISKS_PER_MAJOR << 4);
-               sd_gendisks[i].nr_real = 0;
+               sd_gendisks[i].nr_real = SCSI_DISKS_PER_MAJOR;
                sd_gendisks[i].next = sd_gendisks + i + 1;
                sd_gendisks[i].real_devices =
                    (void *) (rscsi_disks + i * SCSI_DISKS_PER_MAJOR);
@@ -1326,7 +1326,6 @@ static int sd_attach(Scsi_Device * SDp)
        rscsi_disks[i].device = SDp;
        rscsi_disks[i].has_part_table = 0;
        sd_template.nr_dev++;
-       SD_GENDISK(i).nr_real++;
         devnum = i % SCSI_DISKS_PER_MAJOR;
         SD_GENDISK(i).de_arr[devnum] = SDp->de;
         if (SDp->removable)
@@ -1447,7 +1446,6 @@ static void sd_detach(Scsi_Device * SDp)
                        SDp->attached--;
                        sd_template.dev_noticed--;
                        sd_template.nr_dev--;
-                       SD_GENDISK(i).nr_real--;
                        return;
                }
        return;


Comment 3 Neil Horman 2003-12-11 15:49:50 UTC
I think that seems like a reasonable an non-invasive fix to me.

Comment 4 Jason Baron 2003-12-11 15:56:46 UTC
Neil, can you test this, or is it easier if i spin a kernel for you to
test?

Comment 5 Neil Horman 2003-12-11 16:43:29 UTC
I can test it, I just have to get some clarification on part of the
test script from the customer, and set up a machine for this.  I'll
post results as soon as I have them.
  

Comment 6 Neil Horman 2003-12-15 15:19:27 UTC
Looks like the patch is a winner.  Under a patched kernel, disks which
previously dissappeared unintentionally when I issue a scsi
remove-single-device now remain in /proc/partitions.  Thanks!


Comment 7 Jason Baron 2004-01-30 17:20:44 UTC
This fix has been committed to the U4 branch. changing to modified.
Test kernel is at:

 http://people.redhat.com/~jbaron/.private/u4/2.4.9-e.37.2.test/

I'd like to push this patch upstream as it looks like the issue is
there as well. is there a simple test script that i can use to
reproduce the problem?


Comment 8 John Flanagan 2004-04-22 01:03:02 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-105.html