Description of problem:
scsi hotplug add/remove panics in sysfs_hash_and_remove() due to a NULL dentry pointer dereference when tearing down the sysfs entry for the sg device node. We have a reliable reproducer script and a verified patch from upstream, both (will be) attached to this BZ.

Version-Release number of selected component (if applicable):
All RHEL4 kernels. RHEL5 is not affected.

How reproducible:
The attached reproducer script can reliably reproduce this in seconds.

Steps to Reproduce:
1. Boot up any RHEL4 kernel on a system with at least one unused scsi device. Note: qemu virtual scsi drives cannot be used due to a different bug, but a scsi_debug device is OK and so is a real scsi device.
2. Run the reproducer script.
3. splat

Actual results:
RIP: 0010:[<ffffffff801b5203>] <ffffffff801b5203>{sysfs_hash_and_remove+14}
...
Call Trace:
<ffffffff8024e2ba>{class_device_del+156}
<ffffffff8024e33e>{class_device_unregister+9}
<ffffffffa0009f3e>{:scsi_mod:scsi_remove_device+78}
<ffffffffa0009fd3>{:scsi_mod:sdev_store_delete+16}
<ffffffff8024c6a7>{dev_attr_store+29}
<ffffffff801b554f>{sysfs_write_file+194}
<ffffffff8017af0e>{vfs_write+207}
<ffffffff8017aff6>{sys_write+69}
<ffffffff8011026a>{system_call+126}

Expected results:
reproducer.sh runs forever
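
The attached reproducer is not quoted in this report. For readers without access to it, the following is only a rough sketch of the kind of add/remove loop the steps describe, not the attached script. It assumes the device is removed through its sysfs "delete" attribute (the write path that ends in sdev_store_delete in the trace) and re-added with a wildcard host rescan. The function names, the sysfs_root parameter, and the host/H:C:T:L values are hypothetical, and it is written for a current Python 3, which RHEL4 does not ship, so it illustrates the sequence rather than being a drop-in script.

```python
import os
import time


def write_attr(path, value):
    """Write a string to a sysfs attribute file."""
    with open(path, "w") as f:
        f.write(value)


def cycle(sysfs_root, host, hctl, iterations=1000, timeout=5.0):
    """Remove and re-add one SCSI device repeatedly; return completed cycles.

    sysfs_root is a parameter (normally "/sys") so the loop can be
    exercised against a scratch directory instead of a live system.
    """
    delete_attr = os.path.join(sysfs_root, "bus", "scsi", "devices", hctl, "delete")
    scan_attr = os.path.join(sysfs_root, "class", "scsi_host", host, "scan")
    done = 0
    for _ in range(iterations):
        # Removal: a write to "delete" is what reaches sdev_store_delete.
        write_attr(delete_attr, "1")
        # Re-add: "- - -" asks the host for a wildcard channel/target/LUN rescan.
        write_attr(scan_attr, "- - -")
        # Wait for the device to reappear before the next removal.
        deadline = time.time() + timeout
        while not os.path.exists(delete_attr):
            if time.time() > deadline:
                return done
            time.sleep(0.05)
        done += 1
    return done
```

On a test machine this would be called as root with something like cycle("/sys", "host2", "2:0:0:0"), where host2 and 2:0:0:0 stand in for an unused device such as one provided by scsi_debug. On an affected kernel the loop is expected to end in the panic above rather than return.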
Created attachment 367759 [details] reproducer script
Created attachment 367763 [details] fix BZ533299 crash in sysfs_hash_and_remove when scsi device is removed

Patch based on three upstream commits, back-ported to RHEL4:
32aeef605aa01e1fee45e052eceffb00e72ba2b0
b365b3daf2a9e2a8b002ea9fef877af1c71513fd
9d9307dabb3de8140fb3801bf6eb01f231dbd83d
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Committed in 91.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Reproduced this issue on kernel-2.6.9-89.EL and got the same error with the reproducer script. The new kernel-2.6.9-95.EL has fixed this bug. Changing the bug to VERIFIED status.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0263.html