Description of problem:
This problem occurs on x86_64 systems when multiple kernel modules attempt to handle ioctls in compatibility mode (CONFIG_COMPAT). One module registers itself via the register_ioctl32_conversion() call, registering at the same time a handler to handle ioctls in that mode. An application daemon then issues an ioctl call, which sleeps in the kernel module to handle asynchronous events. While this ioctl call is outstanding, if another module tries to register to handle ioctls via register_ioctl32_conversion() at its load time, that call will hang, causing the module loading to hang.

How reproducible:
Very

Steps to Reproduce:
1. Load a module which registers itself to handle ioctls via register_ioctl32_conversion(), registering at the same time a handler.
2. Issue an ioctl to this module and leave that ioctl outstanding.
3. Attempt to load another module that tries in its module_init function to register itself to handle ioctls via register_ioctl32_conversion().

Actual results:
Loading the second module causes a hang.

Expected results:
The second module should be allowed to load and register itself to handle ioctls.

Additional info:
The problem appears to be with the semaphore "ioctl32_sem", used globally in fs/compat.c. When an ioctl is handled in compatibility mode, it goes through compat_sys_ioctl(). If a module has registered a handler to handle the ioctls, compat_sys_ioctl() acquires the read variant of the semaphore and then calls the module's handler function. The module then might sleep holding the semaphore. This is all valid, because many read variants of the semaphore can be outstanding at once. The problem arises when a new module comes in and attempts to register itself via register_ioctl32_conversion(). That call attempts to acquire the write variant of the same semaphore. This causes a hang, because the write variant of a semaphore cannot be acquired as long as any read variants of the same semaphore are outstanding.
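To make the locking rule above concrete, here is a minimal single-threaded sketch in userspace C. It is a toy model, not the kernel's rw_semaphore: the toy_* names are hypothetical, and the "trylock" helpers report failure where the real down_write() would sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reader/writer semaphore (hypothetical names, not kernel code). */
struct toy_rwsem {
    int readers;   /* number of outstanding read holders */
    bool writer;   /* true while a writer holds the semaphore */
};

/* Stand-in for down_read(): succeeds unless a writer holds the semaphore. */
static bool toy_down_read_trylock(struct toy_rwsem *s)
{
    if (s->writer)
        return false;
    s->readers++;
    return true;
}

/* Stand-in for up_read(). */
static void toy_up_read(struct toy_rwsem *s)
{
    assert(s->readers > 0);
    s->readers--;
}

/* Stand-in for down_write(): needs zero readers and no writer.
 * The real call would block here instead of returning false. */
static bool toy_down_write_trylock(struct toy_rwsem *s)
{
    if (s->writer || s->readers > 0)
        return false;
    s->writer = true;
    return true;
}

/* Stand-in for up_write(). */
static void toy_up_write(struct toy_rwsem *s)
{
    assert(s->writer);
    s->writer = false;
}
```

In this model, an outstanding ioctl corresponds to a successful toy_down_read_trylock() with no matching toy_up_read() yet, and a module load corresponds to toy_down_write_trylock(), which keeps failing until every reader has released.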
hmmm. have you seen this issue with the stock Red Hat kernels? Or are you playing with new modules outside our tree?
I have seen this w/ the stock 2.6.9-22.ELsmp kernel (RHEL4-U2) on x86_64 systems. In the fs/compat.c file, in compat_sys_ioctl() a module's t->handler(...) function is called holding the read-variant of ioctl32_sem ( after down_read(&ioctl32_sem);). If the module in its ioctl handler sleeps waiting to handle some asynchronous events, then it sleeps holding the read-variant of the semaphore. The register_ioctl32_conversion() function, called by another module, tries to get the write-variant of the same lock (down_write(&ioctl32_sem);). This causes a hang though because the write-variant of this semaphore cannot be acquired while there are read-variants acquired for the same semaphore. This whole implementation scheme means that as long as a module has registered a handler to handle ioctls in 32-64 bit compatibility mode, and there are ioctl calls outstanding to this module, NO OTHER module can register itself to handle ioctls in this mode. Maybe re-implementing the safeguard mechanism w/ ref counts would fix this?
hmmm, but the guy in the ioctl holding the read semaphore should complete at some point, and then the register would complete. what am i missing? is the guy holding the read semaphore sleeping indefinitely?
i looked a bit at the upstream kernel, and there they have completely removed this semaphore, referencing issues that it caused, although i'm not sure what exactly. unfortunately this is going to be difficult to do in rhel4 since the register_ioctl32_conversion interface is exported.
The read semaphores (could be more than one) are held by a daemon process, which has outstanding ioctl requests to the ioctl module. This is used to handle asynchronous event processing down in the kernel module. The thread holding the read semaphore can sleep, wake up to handle an event, and sleep again. Yes, the upstream kernel has reworked the compatibility support, they do not use global register/unregister functions. It also now uses a ref count scheme rather than holding a semaphore across the ioctl call. That's why I suggested Red Hat might want to look into changing the locking mechanism here in RHEL4 from semaphores to ref counts.
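As a rough illustration of the ref count idea, the sketch below pins a handler entry with a reference and drops the table lock before the handler would run, so registration never waits on an outstanding ioctl. This is a single-threaded userspace model under stated assumptions: all toy_* names are hypothetical, the "lock" is just a flag, and it is not the upstream code or any Red Hat patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HANDLERS 8

typedef int (*toy_handler_t)(unsigned int cmd, unsigned long arg);

struct toy_entry {
    unsigned int cmd;
    toy_handler_t handler;
    int refcount;   /* ioctl calls currently inside the handler */
    bool in_use;
};

static struct toy_entry toy_table[TOY_MAX_HANDLERS];
static bool toy_table_locked;   /* models a lock that is only held briefly */

static void toy_lock(void)   { assert(!toy_table_locked); toy_table_locked = true; }
static void toy_unlock(void) { assert(toy_table_locked);  toy_table_locked = false; }

/* Hypothetical handler; a real one might sleep waiting for events. */
static int toy_sleepy_handler(unsigned int cmd, unsigned long arg)
{
    (void)cmd; (void)arg;
    return 0;
}

/* Registration holds the lock only while it edits the table. */
static int toy_register(unsigned int cmd, toy_handler_t h)
{
    int ret = -1;
    toy_lock();
    for (size_t i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i] = (struct toy_entry){ cmd, h, 0, true };
            ret = 0;
            break;
        }
    }
    toy_unlock();
    return ret;
}

/* Look up under the lock, take a reference, release the lock.
 * The caller then runs the handler with no lock held. */
static struct toy_entry *toy_get(unsigned int cmd)
{
    struct toy_entry *found = NULL;
    toy_lock();
    for (size_t i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (toy_table[i].in_use && toy_table[i].cmd == cmd) {
            toy_table[i].refcount++;
            found = &toy_table[i];
            break;
        }
    }
    toy_unlock();
    return found;
}

/* Drop the reference once the handler has returned. */
static void toy_put(struct toy_entry *e)
{
    toy_lock();
    assert(e->refcount > 0);
    e->refcount--;
    toy_unlock();
}

/* Unregistration is refused (returns -1) while calls are still in flight. */
static int toy_unregister(unsigned int cmd)
{
    int ret = -1;
    toy_lock();
    for (size_t i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (toy_table[i].in_use && toy_table[i].cmd == cmd) {
            if (toy_table[i].refcount == 0) {
                toy_table[i].in_use = false;
                ret = 0;
            }
            break;
        }
    }
    toy_unlock();
    return ret;
}
```

The design point is that only removal of a busy entry has to care about outstanding calls; adding a new entry for a different module touches the table briefly and returns.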
the upstream method is simply to have a 'compat_ioctl' field in the file_operations. Thus, nothing is dynamically registered or unregistered. We just follow pointers. backporting this breaks our kernel binary interface however, so this isn't going to be so simple...
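For readers unfamiliar with the pointer-following approach, here is a simplified userspace sketch. The toy_* structures are hypothetical stand-ins for the kernel's struct file and struct file_operations, and returning -ENOTTY for a missing handler is a simplification of what the real compat path does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_file;

/* Simplified stand-in for struct file_operations (hypothetical). */
struct toy_file_operations {
    long (*unlocked_ioctl)(struct toy_file *f, unsigned int cmd, unsigned long arg);
    long (*compat_ioctl)(struct toy_file *f, unsigned int cmd, unsigned long arg);
};

/* Simplified stand-in for struct file (hypothetical). */
struct toy_file {
    const struct toy_file_operations *f_op;
};

/* A 32-bit ioctl is routed by following the file's own pointer:
 * no global handler table and no global semaphore are involved. */
static long toy_compat_sys_ioctl(struct toy_file *f, unsigned int cmd,
                                 unsigned long arg)
{
    assert(f != NULL);
    if (f->f_op == NULL || f->f_op->compat_ioctl == NULL)
        return -ENOTTY;
    return f->f_op->compat_ioctl(f, cmd, arg);
}

/* Example driver handler (hypothetical): echoes the command number. */
static long toy_driver_compat_ioctl(struct toy_file *f, unsigned int cmd,
                                    unsigned long arg)
{
    (void)f;
    (void)arg;
    return (long)cmd;
}

/* The driver "registers" simply by filling in the field at build time. */
static const struct toy_file_operations toy_driver_fops = {
    .unlocked_ioctl = NULL,
    .compat_ioctl   = toy_driver_compat_ioctl,
};
```

Because each driver carries its handler in its own operations table, loading a second driver cannot be blocked by an ioctl that is sleeping in the first one.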
Can you please provide a test case. thanks.
I dug up, from when the case was reported to us, the two configurations that can be used as test cases. They both involve the Emulex HBAnyware/apps kit (and of course the corresponding LPFC 8.0.16.x driver) being present, and installing either the Veritas SF4.1 application or PowerPath 4.5.0-b87.

1. Test case installing Veritas SF4.1 application.

This was first reported by an OEM when installing the application Veritas SF 4.1 on RHEL4-U2, on a system that had the LPFC 8.0.17.17 driver and corresponding HBAnyware (elxlinuxapps-2.1a25-8.0.16.17-1-1.tar) packages already installed. When going through the initial installer (from version 4.1), the installation hangs when installing the VRTSvxvmplatform rpm. The actual hanging part is when the system tries to do an "insmod vxdmp". When you try to install again, it will experience the same problem when trying to insmod the vxfs module. When the insmod is hanging, trying to communicate with the fibre cards (via lputil) hangs in a similar fashion, i.e. the modprobe cannot be killed, nor can the modprobe process be connected to via strace. Loading only the lpfc driver and stopping the lpfcdfc and the Emulex application kit fixed the problem. It looks like there is some conflict going on between the application kit and lpfcdfc driver and Veritas SF.

Some more details on the test case:
- The hang occurs when the HBAnyware rmserver daemons (/etc/init.d/ElxRMSrv and /etc/init.d/ElxDiscSrv) are started.
- Starting both daemons and trying to install Veritas fails: it hangs when trying to insmod vxdmp.
- Stopping ElxDiscSrv (via /etc/init.d/ElxDiscSrv stop) doesn't change the state.
- However, stopping the rmservers (via /etc/init.d/ElxRMSrv stop) permits the insmod to complete, and the installation continues normally.

Keep in mind that upon HBAnyware installation, by default, only the rmserver daemon will be started. The discovery daemon (elxdiscoveryd) will start up on the first invocation of the HBAnyware application. When the system is rebooted, the rmserver daemon will start automatically. The discovery daemon will start up on the first invocation of HBAnyware.

2. Test case installing/uninstalling PowerPath

The PowerPath installation hangs when modprobing the emcp module. Uninstalling PowerPath causes a similar problem, only it's trying to unload the emcph driver when it hangs. It does appear that PowerPath seems to be OK after rebooting. The drivers are loaded and the devices are all created fine. In addition, the hba drivers are all the most recent available on the vendor/oem site. The installation was attempted when the system had a single path to storage (as well as with no external storage).

Here is the info about the two systems on which we have encountered this problem:
PowerPath: 4.5.0-b87
OS: RHEL 4.0-U2-x86_64 - 2.6.9-22.ELsmp
Sys 1: Opteron IDE EMLX-1050 N
Sys 2: Opteron IDE EMLX-10000 N

The systems do not have a problem booting. The problem exists at RPM installation time. Running "rpm -i EMCpower.LINUX-4.5.0-087.rhel.x86_64.rpm" never returns, so the RPM installation never fully completes. If you connect to another session, you see that the RPM installation script is running and trying to run the command "modprobe -r emcp". This process is "unstoppable": it doesn't respond to signals. If you strace it, the strace gets hung. However, if you reboot the system, the system comes online without any problems and the PowerPath drivers are loaded normally.

It was concluded that this problem manifests itself only during the installation phase of the second module (i.e. emcp) when the HBAnyware rmserver daemon is running.
Another customer report:

On x86_64 machines, certain modules attempt to handle ioctls in compatibility mode by using register_ioctl32_conversion(). In this case, it is the lpfcdfc module and the llt module. The problem arises when you have a module handling an ioctl in compatibility mode. The ioctl is called via compat_sys_ioctl() while holding the ioctl32_sem semaphore (for example, lpfc_ioctl32_handler for the Red Hat provided lpfcdfc module). This blocks any other module which tries to register itself to handle ioctls using register_ioctl32_conversion(). The loading of that module stops while the first module handling the ioctl is running.

In the customer's case, when calling the ioctl handler for the Red Hat module lpfcdfc, compat_sys_ioctl holds the ioctl32_sem while handling the ioctl.

    asmlinkage long compat_sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
    {
        ..
        down_read(&ioctl32_sem);    <--- Semaphore grabbed
        ..
        if (t) {
            if (t->handler) {
                lock_kernel();
                error = t->handler(fd, cmd, arg, filp);    <-- Ioctl handled
                unlock_kernel();
                up_read(&ioctl32_sem);    <-- Semaphore released.
            } else {
                up_read(&ioctl32_sem);
        ...

    kernel: Call Trace:<ffffffff8015b240>{find_get_page+65} <ffffffff8015bd8f>{filemap_nopage+384}
    kernel: <ffffffffa09cce47>{:lpfcdfc:lpfc_sleep+94} <ffffffff8013465d>{default_wake_function+0}
    kernel: <ffffffffa09c9222>{:lpfcdfc:lpfc_ioctl_hba_set_event+492}
    kernel: <ffffffffa09ca399>{:lpfcdfc:lpfc_process_ioctl_util+942}
    kernel: <ffffffffa09c5357>{:lpfcdfc:lpfcdiag_ioctl+129} <ffffffff8018cc75>{sys_ioctl+853}
    kernel: <ffffffffa09c545b>{:lpfcdfc:lpfc_ioctl32_handler+208}
    kernel: <ffffffff801a0f28>{compat_sys_ioctl+235} <ffffffff801265b3>{sysenter_do_call+27}

While this is happening, the Veritas provided module llt cannot be loaded. The insmod (loading the LLT driver) is hung at register_ioctl32_conversion():

    Aug 11 18:38:06 rdg2950-25 kernel: insmod D 0000002a9558b010 0 21325 21260 (NOTLB)
    Aug 11 18:38:06 rdg2950-25 kernel: 00000100c1dc3eb8 0000000000000002 0000010001043a20 000000000000001d
    Aug 11 18:38:06 rdg2950-25 kernel: 000001010b44c030 000000000000038a 000002d069cbe00a 0000000000000246
    Aug 11 18:38:06 rdg2950-25 kernel: 000001010b44c030 0000000000010b1b
    Aug 11 18:38:06 rdg2950-25 kernel: Call Trace:<ffffffff80162a3e>{cache_alloc_refill+390} <ffffffff8030fe58>{__down_write+134}
    Aug 11 18:38:06 rdg2950-25 kernel: <ffffffff801a0caf>{register_ioctl32_conversion+151}
    Aug 11 18:38:06 rdg2950-25 kernel: <ffffffffa0a7603c>{:llt:llt_mod_init+60} <ffffffff80150892>{sys_init_module+278}
    Aug 11 18:38:06 rdg2950-25 kernel: <ffffffff8011026a>{system_call+126}
It does not seem likely that we will re-implement the way ioctl semaphores are done in RHEL 4. That seems too risky. Laurie, is it possible for Emulex to modify lpfcdfc, so that it does not take the semaphore and hold it indefinitely? Are there work-arounds for the issue? Comment 10 seems to say that installation of Veritas will succeed if HBAnyware is stopped first. Once that is done, are you able to get both running satisfactorily?
Tom, Recall that Emulex logged this bug over 2 years ago now and the bug is a problem in the RHEL4 x86_64 kernel in how it handles ioctls in 32-64 compatibility (CONFIG_COMPAT) mode. SLES9 and other RHEL4 architectures do not have this problem. Later kernels (which were adopted by RHEL5) modified the implementation to remove the bug. Furthermore, this is not a problem w/ just the LPFCDFC/ioctl driver (this is the module that registers in compatibility mode). LPFCDFC does not directly take any semaphores, it simply uses the kernel API for compatibility mode. As long as ANY module has registered a handler to handle ioctls in 32-64 bit compatibility mode, and there are ioctl calls outstanding to this module, NO OTHER module can register itself to handle ioctls in this mode. The two examples we provided of this are Veritas and PowerPath. If there are any utilities that use these third party modules it's entirely possible they'll hit this problem. The lpfc driver is also in maintenance mode; we are doing customer critical fixes and new hardware support only at this point. It doesn't make sense for us to work around this issue when the root issue is elsewhere. The workaround we've been suggesting in the absence of a RH fix is to stop all hbanyware threads from running before any modules that require ioctl handling in 32-64 compatibility mode are installed. A possible alternative to RH fixing the core issue and Emulex putting a workaround in our driver is for RH to create a knowledge base article on this issue, providing a workaround similar to what is noted above, so that customers don't keep hitting it. Kind Regards Laurie
Updating PM score.
Created attachment 316622 [details] re-implement (un)registering 32-bit compat ioctls using refcounts x86_64
ok, i believe we have a kernel that fixes this issue. Can you please test the kernel at: http://people.redhat.com/~jbaron/bz185585/ Also any feedback on the patch (from comment #18) is much appreciated. thanks.
posted for 4.8 inclusion, 12/16
Committed in 78.30.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
~~ Attention Partners! ~~ RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should be a fix present in the Beta, which addresses this URGENT priority bug. If you haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. Thanks, more information about Beta testing to come. - Red Hat QE Partner Management
~~ Attention Partners! ~~ RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. Thanks, more information about Beta testing to come. - Red Hat QE Partner Management
~~ Attention Partners! ~~ RHEL 4.8 Beta has been released on partners.redhat.com. There should be a fix present, which addresses this bug. Please test and report back results on this OtherQA Partner bug at your earliest convenience. If you encounter any issues, please set the bug back to the ASSIGNED state and describe any issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you've encountered. Further questions can be directed to your Red Hat Partner Manager. If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test results details. Include which arches tested, package version and any applicable logs. - Red Hat QE Partner Management
~~ Attention Partners! Snap 1 Released ~~ RHEL 4.8 Snapshot 1 has been released on partners.redhat.com. There should be a fix present, which addresses this bug. NOTE: there is only a short time left to test; please test and report back results on this OtherQA Partner bug at your earliest convenience. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test results details. Include which arches tested, package version and any applicable logs. - Red Hat QE Partner Management
~~ Attention Partners! Snap 2 *Kernel Only* Released ~~ RHEL 4.8 Snapshot 2 *kernel* has been released on partners.redhat.com. There should be a fix present, which addresses this bug. NOTE: there is only a short amount of time left to test; please test and report back results on this OtherQA Partner bug at your earliest convenience. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test results details. Include which arches tested, package version and any applicable logs.
Marking as verified. Test case with insmod/rmmod of a module which uses the register_ioctl32_conversion() call while the 32-bit lpfc ioctls are blocking passes on 2.6.9-85.ELsmp. The same test fails on 2.6.9-22.ELsmp; that is, the Emulex daemons must be stopped before another module which uses the ioctl32 support can insmod or rmmod. -85.ELsmp no longer requires the daemons to be stopped. Package: kernel-smp-2.6.9-85.EL Architecture: x86_64 (this bug only affects x86_64)
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1024.html