Description of problem: Status: 16.10.2003 patch still to be re-send to linux- scsi mailing list, requires review with regard to latest 2.4 release candidate 23.10.2003 adapted patch to latest kernel (2.4.23-pre7), patch re-send to Marcello Tosatti (Linux 2.4 maintainer) and linux-scsi mailing list 25.10.2003 original patch was defective, fixed problems, opened LTC 5135, re-send fixed version of patch to Marcello Tosatti and linux-scsi mailing list 11.11.2003 no response on resubmitted patch ---------------------------------------------------------------------- ---------------- [prev in list] [next in list] [prev in thread] [next in thread] List: linux-scsi Subject: [PATCH] 2.4.23-pre8: fixing deferred SCSI adapter registration (resubmitted) From: "Martin Peschke3" <MPESCHKE () de ! ibm ! com> Date: 2003-10-25 13:27:13 [Download message RAW] Hi, I am resubmitting this fix as 2 issues of the original patch haved surfaced (scsi_host_template needs to be added anyway regardless of the fact that no scsi_host might have been added in the first instance; accounting of detected scsi_hosts needs to be done in scsi_register () / scsi_unregister() in order to avoid miscalculation). The patch has also missed the 2.4.23-pre7 cutoff, fortunately. Please consider applying for 2.4.23-pre8. Cheers, Martin Peschke >>>> please find patch in mailing list ---------------------- Forwarded by Martin Peschke3/Germany/IBM on 25/10/2003 15:20 --------------------------- Martin Peschke3 23/10/2003 19:08 To: marcelo.tosatti cc: linux-scsi.org From: Martin Peschke3/Germany/IBM@IBMDE Subject: [PATCH] 2.4.23-pre: fixing deferred SCSI adapter registration Hi, The following patch coutesy of Stefan Bader fixes the 2.4 SCSI adapter registration, which fatally goes wrong when done at a later time than an HBA driver's registration. This has hampered the addition of SCSI adapters which become available at any later point in time. As the then-maintainer of the SCSI mid-layer confirmed, the deferred registration of SCSI adapters should be possible, (http://marc.theaimsgroup.com/?l=linux-scsi&m=96283633428905&w=2) though it later turned out to be flawed by some misplaced chunks of code, which need to be considered as a bug. The problem with the current 2.4 code is that a couple of setup things are only done when an HBA driver is registered while these steps ought to be exercised each time an adapter is registered, independent of time and circumstance of the emergence of a new adapter. This includes the creation of an error recovery thread and a procfs-entry per adapter. As far as I know, dynamic adapter addition has been mainly a domain of the zSeries HBA driver (zfcp), but it is also a valid requirement for any other technology capable of detecting and adding adapters on-the-fly, like Hotplug PCI. The original patch was posted some time ago last year and received positive feedback. (http://marc.theaimsgroup.com/?l=linux-scsi&m=101559609329580&w=2) It got probably lost because SCSI maintainance in 2.4 is rather unsettled. It was also used as base for similar functionality in 2.5 (http://marc.theaimsgroup.com/?l=linux-scsi&m=103117214824710&w=2) The patch has been proved reliable in SuSE kernels since last year. I am convinced it should go into the mainline kernel. I have adopted the patch to 2.4.23-pre7, regression test included. Please apply. best regards, Martin Peschke ---- original patch removed ---- -
I have been asked too add more information about the problem and patch: The latest posting of the patch to linux-scsi can be found here http://marc.theaimsgroup.com/?l=linux-scsi&m=106708846520490&w=2 Scenario: another SCSI adapter becomes available while other SCSI adapters have already been in use for some time, i.e. SCSI detection was finished earlier (unlikely for systems with PCI cards, but an absolutetly valid case for zSeries!), user wants to use that new adapter and adds entries to his zfcp map and triggers detection of devices attached via the additional SCSI adapter, Result without patch: First, SCSI stack mid-layer allows to add the new attachment ... and crashes later. Result with patch: no crash, scsi_mod works smoothly with adapters added on-the-fly Problem: setup of adapters was partly done outside the routine which sets up a SCSI adapter, hence the mid-layer failed to setup error handling threads and proc-files for "hotplugged" adapters Fix: move setup of per-adapter proc-entries, error handling threads to proper routine and also adapt accounting of present scsi adapters
To IBM. I believe this is a request for hot plug adapter support. We do not have this in our 2.4.21 kernel. This should be moved to RHEL 4 ?
I asked the SCSI developer to answer.
No. Sorry, I didn't explain the scenario clear enough. The distinct feature of real "hotplug", as it is usually understood by Linux developers, is that all consequences of devices coming or going are handled _automagically_, namely by the interplay of kernel hotplug events, hotplug scripts and so on. This is not what I try to shoulder here. The referred patch would just allow a Linux administrator to _manually_ enlarge (not shrink!!) their SCSI configuration. In principle, the Linux SCSI stack provides means to do that, e.g. echo 0x1234 1:0xabc 5:0x5789 > /proc/scsi/zfcp/add_map echo scsi add-single-device 2 0 1 5 > /proc/scsi/scsi (which detects LUN 5 at target ID 1 at bus 0 at adapter 2) Unfortunately, this method does only work in some cases (not by design, but by mistake, I suppose), i.e. if a SCSI device is to be added to an _already known_ SCSI adapter. Given the above example, adapter #2 would be fine, while accessing adapter #3 (a new adapter) would cause inconsistencies in the SCSI mid layer, which make the SCSI subsystem crash sooner or later. The cause is a flaw in the SCSI adapter setup routines, as explained in my previous comment. The fix is straight forward, and it should also be a candidate for Doug's bk tree which he has setup to mend the stuck vanilla 2.4 SCSI maintenance, in my eyes. Let me give you an analogue example. Our DASD driver provides an r/w proc-interface (/proc/dasd/devices), by which the Linux system administrator can instruct the DASD driver to look for another DASD addressed by a specified device number: echo add device 8275 > /proc/dasd/devices This makes the DASD driver detect the device 8275, for instance. There is nothing automagical about it. It simply allows to enlarge the existing device configuration _manually_ ...and without the need for reboot (important!!). Usually this has to be done after that device was added to a Linux guests configuration, i.e. if the z/VM administrator acknowledges the need for growing disk space of Linux guests by dedicating more disks. For availability reasons Linux systems must not be rebooted each time. That is why, there is a real business need to be able to do that "on the fly". The disputed SCSI patch does a very, very similar thing. Nothing more. The term 'hotplug' would untruly suggest much more functionality, wherefore I have been striving to avoid that word in connection with the referred patch.
*** Bug 114941 has been marked as a duplicate of this bug. ***
Martin, I've integrated this patch. In the process, I noticed that the coding style didn't follow the established style in the existing files. Do me a favor, and when modifying an existing file with an existing coding style, please try to use that style in your modifications. There was an additional issue regarding the patch. I'm surprised that it ever worked for more traditional SCSI HBAs since you didn't bother to create the proc_dir for the HBA module until *after* the call to hostt->detect, which for normal scsi drivers is when they call scsi_register and therefore build_proc_dir_entry but there won't be a proc dir to put the entry into yet! Anyway, some reordering later, a lot of hand patching later, and I've got it in the tree. I'm doing a compile/boot test now, and when that's done I'll push this to the bk tree along with the other changes.
OK, I've got the patch working on x86 with both aic79xx and qla2200 drivers, so I think it's ready to be tested. The changes have been pushed to the public BK tree already.
Good points. We certainly overlooked that procfs thing. I tested it only on zSeries (which does forgoes /proc/scsi entries as created by the mid layer). Can't say for sure whether the guy who created the first version of this patch tried some other lldd. Why don't we create the lldd proc entry unconditionally prior to calling detect? Wouldn't it be easier to be able to create a per-hba procfs entries in detect() in any case?
Unfortunately, that doesn't work. A number of scsi host drivers don't set the tpnt->name entry until their detect routine is called, so attempts to create the proc dir prior to detect fails. In the end, I added a new global variable, scsi_in_detection, and whenever we call the hostt->detect() routine of a driver, we first set this to 1, that way calls to scsi_register() with this variable set will skip certain setup activities, then clear it after detect is complete and in the loops just below the call to hostt->detect() in scsi_register_host() we complete those setup activities, including printing out the info() of the host adapter (that oopses if we haven't completed the detect on some drivers) and creating the /proc directories for the specific host adapters. If you download the bk tree you can see what I'm talking about. The next thing I was thinking of doing was to take all the detection and host scanning code out of scsi_register_host() and put it into scsi_setup_host(), loop through the adapters found in scsi_register_host() and call scsi_setup_host() for each one that needs it. Then EXPORT_SYMBOL() scsi_setup_host() and that way when your driver gets an out of band host adapter addition, it could call scsi_register() to create the host, then call scsi_setup_host(shpnt) in order to trigger a typical scan like you would get at module insert time. That should be fairly easy to implement and would be a nice addition since by doing that, you would then be able to modify the proc code in the dasd driver to allow a user to specify just the adapter to add, not the device, and have the adapter and all its devices show up then. Not quite the same as the 2.6 hotplug stuff, but reasonable for 2.4. If you're interested in something like that, I'm pretty sure I could have it ready for you to test in a couple hours. Let me know.
I copied the bk tree onto my disk and this is why I was wondering about the use of scsi_in_detection. Okay, it's probably better to work around lldd issues with tpnt->name. I agree that all this per-lldd work should be pulled out of scsi_register_host. This is basically similar to the cleanup that has been done in 2.6. We tried to keep our original patch as small as possible. As to the scsi_setup_host() idea, I don't think a call from a lldd should be necessary. Scanning should be part of scsi_register(). It would look odd to to call scsi_register() followed by scsi_setup_host() for dynamically added adapters, while scsi_register() is sufficient for adapters accessible at module load time. I guess there is a minor misunderstanding as to the dasd driver you have mentioned. Yes, zfcp requires every LUN to be configured separately, which won't change for other reasons. This can be done by cat-ing some configuration comprising a set of LUNs (attached to various - if so - new adapters) into some zfcp specific proc entry. All the new adapters configured this way will be scsi_register()-ed upon close() of that proc entry. Therewith, all new adapters would be scanned, if scsi_register() provided this service. All LUNs attached to new adapters would be scanned. So far so good. Only new LUNs attached to adapters added earlier, e.g. upon module load, wouldn't be found, since old adapters would not be rescanned. The add-single-device venture would be needed for those LUNs, and only for those. Which doesn't look consistent either. Then a call to something like scsi_setup_host() for all dynamically added LUNs attached to old adapters would be needed to have all LUNs scanned without further user intervention. To make a long story short, scsi_setup_host() seems to be the best way to trigger scans of dynamically added adapters, if it was also capable of rescanning old adapters for new LUNs without causing pain for already accessed LUNs. What do you think, Doug?
scsi_setup_host() can be called from scsi_register() if scsi_in_detection is 0. I will make it so. That way a call to scsi_register() after module load does "The Right Thing" without any additional calls needed. As to rescanning of existing adapters, we have a patch in the tree to do exactly that. The proc code in the zfcp driver could be simplified to something like: for(ptr=new_host_list; ptr; ptr=ptr->next) { zfcp_setup_new_host(ptr); /* assuming this calls scsi_register */ /* scsi_register would scan host */ } scsi_scan_new_devices(); /* scan for new devices on hosts */ return;
OK, I pushed a set of changes to the bk tree today. These changes implement what we were talking about. At module load time, scsi_register_host() is called (poor name, it should be scsi_register_driver or something like that, but too late now). When we are in this function, all calls to scsi_register() will not complete the final host setup, the same as it has always been. Instead, after all the host adapters are registered, then we loop through them to set them up, then we loop through them again to scan them. Now, scanning can *not* be done during scsi_register regardless of whether or not we are in the detect routine. Drivers don't have the host adapter ready for scanning at the point we call scsi_register. So, when we are not in the detect routine, scsi_register will call scsi_setup_host, but not call scsi_scan_host. It is up to the low level driver to call scsi_scan_host(host_ptr); after it has completed all setup and is ready for scanning. This implements a fairly complete dynamic host adapter addition patch that the zfcp driver can now be modified to take advantage of. In addition, since the scsi_scan_new_devices() export was already there, you have the ability to dynamically add new hosts and to rescan existing hosts any time you want from within the driver now.
I pushed two more changesets to the tree this weekend that also implement the ability to remove host adapters. You can now do echo "scsi remove-hba <number>" > /proc/scsi/scsi and the subsystem will attempt to find a host with host_no matching <number> and remove it. You can also call the function that does this directly from your driver, eg. scsi_remove_hba(struct Scsi_Host *host); The locking for this can't be done on a module basis, it has to be done on a per device basis, so if you have 4 disks on a host, and 3 of them aren't in use, the function will free up the 3 free disks, fail on the 1 busy disk, refuse to remove the host itself, and return. The return code is the number of failed device removals. So, return code 0 means that the host struct has been nuked and you shouldn't try to reference it again (in fact, the driver's revoke() and release() routines, if set in the host template, will get called during the remove hba process). Return code > 0 is the count of devices still active on the host. Anyway, it's there, but it's also a big change for a RHEL update, so even though I sent the patch to our internal list for review, unless someone says "OK" to such a big change, it may not go into the next update.
Just to confirm my understanding of your changes: What I could do now, after some new LUNs and potentially adapters have emerged, i.e. have been configured via zfcp's proc interface, is either for_all_adapters(curr, first_adapter) { if (curr->scsi_host == NULL) { curr->scsi_host = scsi_register(my_scsi_template); /* ... setup various members of curr->scsi_host ... */ } if (curr->new_luns_added) { scsi_scan_host(curr->scsi_host); curr->new_luns_added = 0; } } or for_all_adapters(curr, first_adapter) { if (curr->scsi_host == NULL) { curr->scsi_host = scsi_register(my_scsi_template); /* ... setup various members of curr->scsi_host ... */ } scsi_scan_host(curr->scsi_host); } or for_all_adapters(curr, first_adapter) { if (curr->scsi_host == NULL) { curr->scsi_host = scsi_register(my_scsi_template); /* ... setup various members of curr->scsi_host ... */ } } scsi_scan_new_devices(); So, no need for me to call scsi_setup_host() at any time, right? Neither scsi_scan_host() nor scsi_scan_new_devices() cause harm for already known devices, do they? As for the hba removal, this won't occur with zfcp on 2.4. The zfcp driver in 2.4, unlike its 2.6 incarnation, is not prepared for removing stuff safely on the fly. Unless you see other potential exploiters of this feature, dynamic hba removal should not consume too much of your time, I think. Anyway, the way it works as described above sounds reasonable. I have not yet looked at the actual code. But the crucial bit must be the identification of devices still in use, i.e. with ongoing I/O.
I'm reverting the status back to "assigned" until I commit Doug's changes to the RHEL3 U3 patch pool (at which time I'll change it to "modified" with the associated kernel version).
Doug's implementation of dynamic adapter addition has just been committed to the RHEL3 U3 patch pool (kernel version 2.4.21-15.7.EL).
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-433.html
I think that these changes cause a regression. I get panic when sending "scsi add-single-device" to /proc/scsi/scsi in 2.4.21-20.ELsmp kernel when new disk is on an Adaptec aic79xx controler. This does not happen for example when adding new disk on a Qlugic qla2300 controler. This did not happen with 2.4.21-15.ELsmp kernel either. I'm attaching hy oops and ksymoops dumps together with sysreport.
Created attachment 103820 [details] OOPS dump
Created attachment 103821 [details] OOPS dump processed with ksymoops
Created attachment 103822 [details] sysreport output for the system in question
Peter, thanks for the report, but it has to be a new bug.
Thanks for opening a new bug report, Peter. The new bug is #132547.