Bug 32737 - "rmmod megaraid" OOPS with Segmentation fault
Summary: "rmmod megaraid" OOPS with Segmentation fault
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kudzu
Version: 7.1
Hardware: i386
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Alan Cox
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-03-22 19:57 UTC by Tesfamariam Michael
Modified: 2005-10-31 22:00 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-03-22 20:13:31 UTC
Embargoed:



Description Tesfamariam Michael 2001-03-22 19:57:21 UTC
I added a PERC3/QC to a system installed with kernel 2.4.2-0.1.28. The 
boot controller is an AIC7899. In a few cases kudzu did not detect the card, 
so I had to "insmod megaraid" manually. Then, when I tried to "rmmod megaraid", 
I got a "Segmentation fault". The system was booted with the SMP kernel.
Here is the Oops message printed right after "rmmod megaraid":
Oops: 0000
CPU:    0
EIP:    0010:[<c0173813>]
EFLAGS: 00010202
eax: 00000000   ebx: f5bd0084   ecx: 00000004   edx: 5a5a5a5a
esi: f5bd0000   edi: 5a5a5a8a   ebp: c2b2e000   esp: f60e7ea4
ds: 0018   es: 0018   ss: 0018
Process rmmod (pid: 9549, stackpage=f60e7000)
Stack: f89a9b6c f5bd0084 f89ac000 f89a6f0c f89a9b6c 5a5a5a5a f6f7167c 
00000003
       f6f71734 00000000 c0173891 0000fe0a 00000000 00000000 01000000 
f5bd0000
       00000003 f60e7f60 00000000 f8802cad f5bd0000 f71c7e84 00000003 
f68acc64
Call Trace: [<f89a9b6c>] [<f89ac000>] [<f89a6f0c>] [<f89a9b6c>] 
[<c0173891>] [<f8802cad>] [<c0117d36>]
       [<c0117b90>] [<c02bc050>] [<c0245800>] [<f89a5000>] [<f88032c6>] 
[<f89aa320>] [<f89a8f8c>] [<f89aa320>]
       [<c011fe7e>] [<f89a5000>] [<c011efaf>] [<f89a5000>] [<c01099eb>]

Code: 8b 52 30 89 cb 85 d2 0f 84 b0 00 00 00 8b 07 50 8b 4c 24 04
Segmentation fault


Here is what "lsmod" shows before "rmmod megaraid":
Module                  Size  Used by
megaraid               23632   0 

After "rmmod megaraid", "lsmod" shows:
Module                  Size  Used by
megaraid                   0   0  (deleted)


I reproduced this on two systems (PE2450 and PE4400), both with a PERC3/QC.

Comment 1 Matt Domsch 2001-03-22 20:01:13 UTC
Here's the decoded oops.  It fails similarly with 2.4.2-0.1.32smp.

ksymoops 2.4.0 on i686 2.4.2-0.1.28smp.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.2-0.1.28smp/ (default)
     -m /boot/System.map-2.4.2-0.1.28smp (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/aic7xxx.o) for aic7xxx
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) 
not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says 
c01df490, System.map says c0176cc0.  Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol mega_hbas  , megaraid says 
f88d56e0, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/megaraid.o says 
f88d5440.  Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/megaraid.o 
entry
Warning (compare_maps): mismatch on symbol usb_devfs_handle  , usbcore says 
f8897be0, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/usb/usbcore.o says 
f8897640.  Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/usb/usbcore.o 
entry
Warning (compare_maps): mismatch on symbol sd  , sd_mod says 
f881ee84, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/sd_mod.o says 
f881ed20.  Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/sd_mod.o 
entry
Warning (compare_maps): mismatch on symbol proc_scsi  , scsi_mod says 
f881ac7c, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o says 
f8819540.  Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o 
entry
Warning (compare_maps): mismatch on symbol scsi_logging_level  , scsi_mod says 
f881ac78, /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o says 
f881953c.  Ignoring /lib/modules/2.4.2-0.1.28smp/kernel/drivers/scsi/scsi_mod.o 
entry
Mar 21 12:19:21 localhost kernel: cpu: 0, clocks: 1329190, slice: 443063
Mar 21 12:19:21 localhost kernel: cpu: 1, clocks: 1329190, slice: 443063
Mar 21 12:19:25 localhost kernel:   Receiver lock-up bug exists -- enabling 
work-around.
Mar 22 12:56:48 localhost kernel: cpu: 0, clocks: 1328945, slice: 442981
Mar 22 12:56:48 localhost kernel: cpu: 1, clocks: 1328945, slice: 442981
Mar 22 12:56:52 localhost kernel:   Receiver lock-up bug exists -- enabling 
work-around.
Mar 22 12:59:38 localhost kernel: Unable to handle kernel paging request at 
virtual address 5a5a5a8a
Mar 22 12:59:38 localhost kernel: c0173813
Mar 22 12:59:38 localhost kernel: Oops: 0000
Mar 22 12:59:38 localhost kernel: CPU:    1
Mar 22 12:59:38 localhost kernel: EIP:    0010:[remove_proc_entry+67/272]
Mar 22 12:59:38 localhost kernel: EIP:    0010:[<c0173813>]
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 22 12:59:38 localhost kernel: EFLAGS: 00010202
Mar 22 12:59:38 localhost kernel: eax: 00000000   ebx: f6950084   ecx: 
00000004   edx: 5a5a5a5a
Mar 22 12:59:38 localhost kernel: esi: f6950000   edi: 5a5a5a8a   ebp: 
c2b2c800   esp: f6967ea4
Mar 22 12:59:38 localhost kernel: ds: 0018   es: 0018   ss: 0018
Mar 22 12:59:38 localhost kernel: Process rmmod (pid: 1068, stackpage=f6967000)
Mar 22 12:59:38 localhost kernel: Stack: f88d4b6c f6950084 f6967ee0 f88d1f0c 
f88d4b6c 5a5a5a5a f6dc0304 00000003 
Mar 22 12:59:38 localhost kernel:        f6dc03bc 00000000 c0173891 0000fe0a 
00000000 00000000 01000000 f6950000 
Mar 22 12:59:38 localhost kernel:        00000003 f6967f60 00000000 f8802cad 
f6950000 f772b194 00000003 f7231d24 
Mar 22 12:59:38 localhost kernel: Call Trace: [<f88d4b6c>] [<f88d1f0c>] 
[<f88d4b6c>] [remove_proc_entry+193/272] 
[eepro100:__insmod_eepro100_O/lib/modules/2.4.2-0.1.28smp/kernel/driv+-
770899/96] [do_page_fault+422/1392] [do_page_fault+0/1392] 
Mar 22 12:59:38 localhost kernel: Call Trace: [<f88d4b6c>] [<f88d1f0c>] 
[<f88d4b6c>] [<c0173891>] [<f8802cad>] [<c0117d36>] [<c0117b90>] 
Mar 22 12:59:38 localhost kernel:        [<c02bc720>] [<c0245800>] [<f88d0000>] 
[<f88032c6>] [<f88d5320>] [<f88d3f8c>] [<f88d5320>] [<c011fe7e>] 
Mar 22 12:59:38 localhost kernel:        [<f88d0000>] 
[sys_delete_module+511/1232] [<f88d0000>] [system_call+51/56] 
Mar 22 12:59:38 localhost kernel:        [<f88d0000>] [<c011efaf>] [<f88d0000>] 
[<c01099eb>] 
Mar 22 12:59:38 localhost kernel: Code: 8b 52 30 89 cb 85 d2 0f 84 b0 00 00 00 
8b 07 50 8b 4c 24 04 

>>EIP; c0173813 <remove_proc_entry+43/110>   <=====
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d1f0c <[megaraid]megaraid_release+ac/170>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; f88d1f0c <[megaraid]megaraid_release+ac/170>
Trace; f88d4b6c <[megaraid].rodata.start+a6c/10bf>
Trace; c0173891 <remove_proc_entry+c1/110>
Trace; f8802cad <[scsi_mod]scsi_unregister_host+45d/680>
Trace; c0117d36 <do_page_fault+1a6/570>
Trace; c0117b90 <do_page_fault+0/570>
Trace; c02bc720 <__kstrtab_tcp_write_wakeup+0/1f>
Trace; c0245800 <__generic_copy_to_user+30/40>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; f88032c6 <[scsi_mod]scsi_unregister_module+26/30>
Trace; f88d5320 <[megaraid]driver_template+0/7f>
Trace; f88d3f8c <[megaraid]exit_this_scsi_driver+c/10>
Trace; f88d5320 <[megaraid]driver_template+0/7f>
Trace; c011fe7e <free_module+1e/160>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; c011efaf <sys_delete_module+1ff/4d0>
Trace; f88d0000 <[autofs].data.end+777d/77dd>
Trace; c01099eb <system_call+33/38>
Code;  c0173813 <remove_proc_entry+43/110>
00000000 <_EIP>:
Code;  c0173813 <remove_proc_entry+43/110>   <=====
   0:   8b 52 30                  mov    0x30(%edx),%edx   <=====
Code;  c0173816 <remove_proc_entry+46/110>
   3:   89 cb                     mov    %ecx,%ebx
Code;  c0173818 <remove_proc_entry+48/110>
   5:   85 d2                     test   %edx,%edx
Code;  c017381a <remove_proc_entry+4a/110>
   7:   0f 84 b0 00 00 00         je     bd <_EIP+0xbd> c01738d0 
<remove_proc_entry+100/110>
Code;  c0173820 <remove_proc_entry+50/110>
   d:   8b 07                     mov    (%edi),%eax
Code;  c0173822 <remove_proc_entry+52/110>
   f:   50                        push   %eax
Code;  c0173823 <remove_proc_entry+53/110>
  10:   8b 4c 24 04               mov    0x4(%esp,1),%ecx

Mar 22 13:05:44 localhost kernel: cpu: 0, clocks: 1328945, slice: 442981
Mar 22 13:05:44 localhost kernel: cpu: 1, clocks: 1328945, slice: 442981
Mar 22 13:05:48 localhost kernel:   Receiver lock-up bug exists -- enabling 
work-around.

8 warnings and 3 errors issued.  Results may not be reliable.
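
The faulting address in the decode above can be cross-checked by hand. The failing instruction is "mov 0x30(%edx),%edx" with edx = 5a5a5a5a, so the access lands at 5a5a5a5a + 0x30 = 5a5a5a8a, which matches the address in the "Unable to handle kernel paging request" line. A word made of repeated 0x5a bytes is what one would expect to read back from memory that was poisoned after being freed. The sketch below only redoes that arithmetic in user-space C. POISON_BYTE and the function names are illustrative assumptions, not kernel code.

```c
#include <stdint.h>

/* Assumption: 0x5a is the byte written over freed slab objects when
 * poisoning is enabled. It is chosen here to match edx in the oops. */
#define POISON_BYTE 0x5aU

/* Build a 32-bit word in which every byte is the poison byte. */
uint32_t poison_word(void)
{
    return POISON_BYTE * 0x01010101U;
}

/* Effective address of "mov disp(%reg),..." given the register value. */
uint32_t faulting_address(uint32_t reg_value, uint32_t displacement)
{
    return reg_value + displacement;
}

/* Non-zero if a register value looks like it was read from poisoned memory. */
int looks_poisoned(uint32_t value)
{
    return value == poison_word();
}
```

Calling faulting_address(0x5a5a5a5a, 0x30) gives 0x5a5a5a8a, and looks_poisoned() is true for the edx value in the register dump but not for ebx or esi, which still hold ordinary kernel addresses.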


Comment 2 Alan Cox 2001-03-22 20:08:03 UTC
Nice catch. The slab poisoning caught an obvious (and I think easily fixed)
megaraid bug.

Move the scsi_unregister(pSHost); call to after the proc entry is freed. megaCfg
lives in hostdata, so it was freed and then referenced.
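
The ordering problem described here can be modelled in user space. What follows is a minimal sketch and not the megaraid driver source: every mock_* name, the struct layout, and the handle value are hypothetical. mock_unregister() stands in for scsi_unregister() and overwrites the host with 0x5a bytes instead of calling free(), so that the later read stays well-defined in this model while still showing what the proc cleanup would see.

```c
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0x5a  /* assumed poison value, matching the oops */

/* Hypothetical stand-in for megaCfg: per-adapter state kept in hostdata. */
struct mock_cfg {
    uint32_t proc_handle;  /* what the proc cleanup needs to read */
};

/* Hypothetical stand-in for the SCSI host; cfg is embedded as hostdata. */
struct mock_host {
    struct mock_cfg cfg;
};

/* Models scsi_unregister(): the host and its hostdata are released.
 * The memory is poisoned rather than freed to keep the model defined. */
static void mock_unregister(struct mock_host *host)
{
    memset(host, POISON_BYTE, sizeof *host);
}

/* Models the proc entry cleanup: it has to read state from hostdata. */
static uint32_t mock_remove_proc(const struct mock_host *host)
{
    return host->cfg.proc_handle;
}

/* Buggy order: unregister first, then touch hostdata. */
uint32_t release_buggy(struct mock_host *host)
{
    mock_unregister(host);
    return mock_remove_proc(host);  /* reads poisoned data */
}

/* Fixed order: clean up proc entries first, unregister last. */
uint32_t release_fixed(struct mock_host *host)
{
    uint32_t handle = mock_remove_proc(host);
    mock_unregister(host);
    return handle;
}
```

With a host whose proc_handle is 0x1234, release_fixed() returns 0x1234, while release_buggy() returns 0x5a5a5a5a, the same pattern that shows up in edx in the oops.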


	

Comment 3 Arjan van de Ven 2001-03-22 20:11:20 UTC
Fixed for the next build.

Comment 4 Tesfamariam Michael 2001-03-22 20:13:27 UTC
After the "Segmentation fault", the system hangs when "reboot" or "init 6" is 
issued. A power reset is required.

