I added a PERC3/QC to a system installed with kernel 2.4.2-0.1.28. The boot controller is an AIC7899. In a few cases kudzu didn't see the card, so I had to "insmod megaraid" manually. Then, when I tried to "rmmod megaraid", I got a "Segmentation fault". The system was booted with the smp kernel.

Here is the Oops message right after "rmmod megaraid":

Oops: 0000
CPU: 0
EIP: 0010:[<c0173813>]
EFLAGS: 00010202
eax: 00000000 ebx: f5bd0084 ecx: 00000004 edx: 5a5a5a5a
esi: f5bd0000 edi: 5a5a5a8a ebp: c2b2e000 esp: f60e7ea4
ds: 0018 es: 0018 ss: 0018
Process rmmod (pid: 9549, stackpage=f60e7000)
Stack: f89a9b6c f5bd0084 f89ac000 f89a6f0c f89a9b6c 5a5a5a5a f6f7167c 00000003
       f6f71734 00000000 c0173891 0000fe0a 00000000 00000000 01000000 f5bd0000
       00000003 f60e7f60 00000000 f8802cad f5bd0000 f71c7e84 00000003 f68acc64
Call Trace: [<f89a9b6c>] [<f89ac000>] [<f89a6f0c>] [<f89a9b6c>] [<c0173891>] [<f8802cad>] [<c0117d36>] [<c0117b90>]
       [<c02bc050>] [<c0245800>] [<f89a5000>] [<f88032c6>] [<f89aa320>] [<f89a8f8c>] [<f89aa320>] [<c011fe7e>]
       [<f89a5000>] [<c011efaf>] [<f89a5000>] [<c01099eb>]
Code: 8b 52 30 89 cb 85 d2 0f 84 b0 00 00 00 8b 07 50 8b 4c 24 04
Segmentation fault

Here is what "lsmod" shows before "rmmod megaraid":

Module                  Size  Used by
megaraid               23632   0

After "rmmod megaraid", "lsmod" shows:

Module                  Size  Used by
megaraid                   0   0 (deleted)

I reproduced this on two systems (PE2450 and PE4400) with a PERC3/QC.
Here's the decoded oops. It fails similarly with 2.4.2-0.1.32smp.

ksymoops 2.4.0 on i686 2.4.2-0.1.28smp. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.2-0.1.28smp/ (default)
     -m /boot/System.map-2.4.2-0.1.28smp (default)

Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/aic7xxx.o) for aic7xxx
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) not found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01df490, System.map says c0176cc0. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol mega_hbas , megaraid says f88d56e0, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/megaraid.o says f88d5440. Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/megaraid.o entry
Warning (compare_maps): mismatch on symbol usb_devfs_handle , usbcore says f8897be0, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/usb/usbcore.o says f8897640. Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/usb/usbcore.o entry
Warning (compare_maps): mismatch on symbol sd , sd_mod says f881ee84, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/sd_mod.o says f881ed20. Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/sd_mod.o entry
Warning (compare_maps): mismatch on symbol proc_scsi , scsi_mod says f881ac7c, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o says f8819540. Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol scsi_logging_level , scsi_mod says f881ac78, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o says f881953c. Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o entry

Mar 21 12:19:21 localhost kernel: cpu: 0, clocks: 1329190, slice: 443063
Mar 21 12:19:21 localhost kernel: cpu: 1, clocks: 1329190, slice: 443063
Mar 21 12:19:25 localhost kernel: Receiver lock-up bug exists -- enabling work-around.
Mar 22 12:56:48 localhost kernel: cpu: 0, clocks: 1328945, slice: 442981
Mar 22 12:56:48 localhost kernel: cpu: 1, clocks: 1328945, slice: 442981
Mar 22 12:56:52 localhost kernel: Receiver lock-up bug exists -- enabling work-around.
Mar 22 12:59:38 localhost kernel: Unable to handle kernel paging request at virtual address 5a5a5a8a
Mar 22 12:59:38 localhost kernel: c0173813
Mar 22 12:59:38 localhost kernel: Oops: 0000
Mar 22 12:59:38 localhost kernel: CPU: 1
Mar 22 12:59:38 localhost kernel: EIP: 0010:[remove_proc_entry+67/272]
Mar 22 12:59:38 localhost kernel: EIP: 0010:[<c0173813>]
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 22 12:59:38 localhost kernel: EFLAGS: 00010202
Mar 22 12:59:38 localhost kernel: eax: 00000000 ebx: f6950084 ecx: 00000004 edx: 5a5a5a5a
Mar 22 12:59:38 localhost kernel: esi: f6950000 edi: 5a5a5a8a ebp: c2b2c800 esp: f6967ea4
Mar 22 12:59:38 localhost kernel: ds: 0018 es: 0018 ss: 0018
Mar 22 12:59:38 localhost kernel: Process rmmod (pid: 1068, stackpage=f6967000)
Mar 22 12:59:38 localhost kernel: Stack: f88d4b6c f6950084 f6967ee0 f88d1f0c f88d4b6c 5a5a5a5a f6dc0304 00000003
Mar 22 12:59:38 localhost kernel: f6dc03bc 00000000 c0173891 0000fe0a 00000000 00000000 01000000 f6950000
Mar 22 12:59:38 localhost kernel: 00000003 f6967f60 00000000 f8802cad f6950000 f772b194 00000003 f7231d24
Mar 22 12:59:38 localhost kernel: Call Trace: [<f88d4b6c>] [<f88d1f0c>] [<f88d4b6c>] [remove_proc_entry+193/272] [eepro100:__insmod_eepro100_O/lib/modules/2.4.2-0.1.28smp/kernel/driv+- 770899/96] [do_page_fault+422/1392] [do_page_fault+0/1392]
Mar 22 12:59:38 localhost kernel: Call Trace: [<f88d4b6c>] [<f88d1f0c>] [<f88d4b6c>] [<c0173891>] [<f8802cad>] [<c0117d36>] [<c0117b90>]
Mar 22 12:59:38 localhost kernel: [<c02bc720>] [<c0245800>] [<f88d0000>] [<f88032c6>] [<f88d5320>] [<f88d3f8c>] [<f88d5320>] [<c011fe7e>]
Mar 22 12:59:38 localhost kernel: [<f88d0000>] [sys_delete_module+511/1232] [<f88d0000>] [system_call+51/56]
Mar 22 12:59:38 localhost kernel: [<f88d0000>] [<c011efaf>] [<f88d0000>] [<c01099eb>]
Mar 22 12:59:38 localhost kernel: Code: 8b 52 30 89 cb 85 d2 0f 84 b0 00 00 00 8b 07 50 8b 4c 24 04

>>EIP; c0173813 <remove_proc_entry+43/110>   <=====
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d1f0c <[megaraid]megaraid_release+ac/170>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d1f0c <[megaraid]megaraid_release+ac/170>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; c0173891 <remove_proc_entry+c1/110>
Trace; f8802cad <[scsi_mod]scsi_unregister_host+45d/680>
Trace; c0117d36 <do_page_fault+1a6/570>
Trace; c0117b90 <do_page_fault+0/570>
Trace; c02bc720 <__kstrtab_tcp_write_wakeup+0/1f>
Trace; c0245800 <__generic_copy_to_user+30/40>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; f88032c6 <[scsi_mod]scsi_unregister_module+26/30>
Trace; f88d5320 <[megaraid]driver_template+0/7f>
Trace; f88d3f8c <[megaraid]exit_this_scsi_driver+c/10>
Trace; f88d5320 <[megaraid]driver_template+0/7f>
Trace; c011fe7e <free_module+1e/160>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; c011efaf <sys_delete_module+1ff/4d0>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; c01099eb <system_call+33/38>
Code; c0173813 <remove_proc_entry+43/110>
00000000 <_EIP>:
Code; c0173813 <remove_proc_entry+43/110>   <=====
   0:   8b 52 30                  mov    0x30(%edx),%edx   <=====
Code; c0173816 <remove_proc_entry+46/110>
   3:   89 cb                     mov    %ecx,%ebx
Code; c0173818 <remove_proc_entry+48/110>
   5:   85 d2                     test   %edx,%edx
Code; c017381a <remove_proc_entry+4a/110>
   7:   0f 84 b0 00 00 00         je     bd <_EIP+0xbd> c01738d0 <remove_proc_entry+100/110>
Code; c0173820 <remove_proc_entry+50/110>
   d:   8b 07                     mov    (%edi),%eax
Code; c0173822 <remove_proc_entry+52/110>
   f:   50                        push   %eax
Code; c0173823 <remove_proc_entry+53/110>
  10:   8b 4c 24 04               mov    0x4(%esp,1),%ecx

Mar 22 13:05:44 localhost kernel: cpu: 0, clocks: 1328945, slice: 442981
Mar 22 13:05:44 localhost kernel: cpu: 1, clocks: 1328945, slice: 442981
Mar 22 13:05:48 localhost kernel: Receiver lock-up bug exists -- enabling work-around.

8 warnings and 3 errors issued. Results may not be reliable.
Nice catch. The slab poisoning caught an obvious (and I think easily fixed) megaraid bug. Move the scsi_unregister(pSHost) call to after the proc entries are freed. megaCfg lives in hostdata, so it was freed by the unregister and then referenced.
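For anyone following along, here is a rough userspace sketch of the ordering problem. It is not the actual megaraid code. Every name in it (struct host, struct adapter_cfg, host_unregister, proc_remove, release_buggy, release_fixed) is a made-up stand-in, and static storage is used in place of the slab allocator so that the "freed" memory can still be inspected safely. The 0x5a fill byte matches the 5a5a5a5a pattern in edx in the oops, where the faulting address 5a5a5a8a is that value plus the 0x30 offset from the faulting instruction.

```c
#include <assert.h>
#include <string.h>

#define POISON_BYTE 0x5a   /* fill pattern seen in edx/edi in the oops */

/* Hypothetical stand-ins, not the real kernel or megaraid definitions. */
struct proc_entry { int registered; };

struct adapter_cfg {                 /* plays the role of megaCfg */
    struct proc_entry *proc_dir;
};

struct host {                        /* plays the role of pSHost */
    struct adapter_cfg hostdata;     /* config is embedded in the host */
};

/* Static storage so "freed" memory stays readable for the demonstration. */
static struct host host_pool;
static struct proc_entry proc_slot;

static struct host *host_register(void)
{
    proc_slot.registered = 1;
    host_pool.hostdata.proc_dir = &proc_slot;
    return &host_pool;
}

/* Stand-in for the unregister call: poison the host as a debug slab would. */
static void host_unregister(struct host *h)
{
    memset(h, POISON_BYTE, sizeof *h);
}

/* Compare the stored pointer's bytes without loading it as a pointer. */
static int ptr_is_poisoned(struct proc_entry *const *slot)
{
    unsigned char pattern[sizeof *slot];
    memset(pattern, POISON_BYTE, sizeof pattern);
    return memcmp(slot, pattern, sizeof pattern) == 0;
}

static void proc_remove(struct proc_entry *e)
{
    assert(e->registered);
    e->registered = 0;
}

/* Buggy order: unregister first, then read the proc pointer from hostdata. */
static int release_buggy(struct host *h)
{
    host_unregister(h);
    if (ptr_is_poisoned(&h->hostdata.proc_dir))
        return -1;                   /* the kernel would oops at this point */
    proc_remove(h->hostdata.proc_dir);
    return 0;
}

/* Fixed order: remove the proc entry while hostdata is still valid. */
static int release_fixed(struct host *h)
{
    proc_remove(h->hostdata.proc_dir);
    host_unregister(h);
    return 0;
}
```

In this sketch release_buggy() reports -1 because the pointer it needs has already been overwritten with the poison pattern, and the proc entry is left registered. release_fixed() removes the entry first and only then gives up the host.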
Fixed for the next build.
After the "Segmentation fault", the system hangs when "reboot" or "init 6" is issued. A power reset is required.