Bug 172784

Summary:	kernel BUG at mm/prio_tree.c:528 (invalid operand: 0000 [#1])
Product:	[Fedora] Fedora	Reporter:	Jez Tucker <jez>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED WONTFIX	QA Contact:	Brian Brock <bbrock>
Severity:	high	Docs Contact:
Priority:	medium
Version:	3	CC:	wtogami
Target Milestone:	---
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2005-11-09 23:53:03 UTC	Type:	---

Description Jez Tucker 2005-11-09 18:19:03 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
Machine hard locks and kernel dumps.
Have to power cycle.

Occurs when running Maya 7.0.
I suspect the bug exists in the kernel itself and Maya merely triggers it.

Version-Release number of selected component (if applicable):
kernel-2.6.9-1.667smp

How reproducible:
Always

Steps to Reproduce:
1. Render a scene using Maya.
2. Await kernel dump.
  

Actual Results:  RENDERNODE (base config):

Nov  8 17:13:25 blade2-1 kernel: ------------[ cut here ]------------
Nov  8 17:13:25 blade2-1 kernel: kernel BUG at mm/prio_tree.c:528!
Nov  8 17:13:25 blade2-1 kernel: invalid operand: 0000 [#1]
Nov  8 17:13:25 blade2-1 kernel: SMP
Nov  8 17:13:25 blade2-1 kernel: Modules linked in: joydev md5 ipv6 parport_pc lp parport autofs4 nfs lockd sunrpc dm_mod button battery ac ohci_hcd cfi_probe gen_probe scb2_flash mtdcore chipreg map_funcs tg3 ext3 jbd
Nov  8 17:13:25 blade2-1 kernel: CPU:    0
Nov  8 17:13:25 blade2-1 kernel: EIP:    0060:[<0213f145>]    Not tainted VLI
Nov  8 17:13:25 blade2-1 kernel: EFLAGS: 00010206   (2.6.9-1.667smp)
Nov  8 17:13:25 blade2-1 kernel: EIP is at vma_prio_tree_add+0x36/0x95
Nov  8 17:13:25 blade2-1 kernel: eax: 00000016   ebx: f79406a4   ecx: 00000000   edx: 00000c27
Nov  8 17:13:25 blade2-1 kernel: esi: f796164c   edi: f795dba8   ebp: f7940eb4   esp: e7f42f3c
Nov  8 17:13:25 blade2-1 kernel: ds: 007b   es: 007b   ss: 0068
Nov  8 17:13:25 blade2-1 kernel: Process maya.bin (pid: 16876, threadinfo=e7f42000 task=0e58a030)
Nov  8 17:13:25 blade2-1 kernel: Stack: f79406a4 057f6980 02147dd6 f79406a4 00100077 00000000 68b53680 0214892b
Nov  8 17:13:25 blade2-1 kernel:        f7940eb4 f7940ea8 00000c28 00000001 00000000 f795daf8 057f6980 00c28000
Nov  8 17:13:25 blade2-1 kernel:        eb1ad000 c963a754 f7940eb4 f7940ea8 057f6980 057f69b0 00000002 68b53680
Nov  8 17:13:25 blade2-1 kernel: Call Trace:
Nov  8 17:13:25 blade2-1 kernel:  [<02147dd6>] vma_link+0x9c/0xbc
Nov  8 17:13:25 blade2-1 kernel:  [<0214892b>] do_mmap_pgoff+0x523/0x673
Nov  8 17:13:25 blade2-1 kernel:  [<0210b55c>] sys_mmap2+0x80/0xb4
Nov  8 17:13:25 blade2-1 kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

WORKSTATION (base config + gfx):

Nov  9 15:12:33 shakeysanchez kernel: ------------[ cut here ]------------
Nov  9 15:12:33 shakeysanchez kernel: kernel BUG at mm/prio_tree.c:528!
Nov  9 15:12:33 shakeysanchez kernel: invalid operand: 0000 [#1]
Nov  9 15:12:33 shakeysanchez kernel: SMP
Nov  9 15:12:33 shakeysanchez kernel: Modules linked in: nvidia(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd sunrpc dm_mod uhci_hcd ehci_hcd hw_random snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore e1000 floppy ext3 jbd
Nov  9 15:12:33 shakeysanchez kernel: CPU:    1
Nov  9 15:12:33 shakeysanchez kernel: EIP:    0060:[<0213f145>]    Tainted: P   VLI
Nov  9 15:12:33 shakeysanchez kernel: EFLAGS: 00210216   (2.6.9-1.667smp)
Nov  9 15:12:33 shakeysanchez kernel: EIP is at vma_prio_tree_add+0x36/0x95
Nov  9 15:12:33 shakeysanchez kernel: eax: 00000008   ebx: 31e9ad2c   ecx: 00000000   edx: 00000012
Nov  9 15:12:33 shakeysanchez kernel: esi: 815ab6fc   edi: 8134ed64   ebp: 52164a3c   esp: 2dd22f3c
Nov  9 15:12:33 shakeysanchez kernel: ds: 007b   es: 007b   ss: 0068
Nov  9 15:12:33 shakeysanchez kernel: Process maya.bin (pid: 7947, threadinfo=2dd22000 task=6f5ae620)
Nov  9 15:12:33 shakeysanchez kernel: Stack: 31e9ad2c 71110980 02147dd6 31e9ad2c 00100077 00000000 6e2a7080 0214892b
Nov  9 15:12:33 shakeysanchez kernel:        52164a3c 52164a30 00000013 00000001 00000000 8134ecb4 71110980 00013000
Nov  9 15:12:33 shakeysanchez kernel:        ed4db000 461ae334 52164a3c 52164a30 71110980 711109b0 00000002 6e2a7080
Nov  9 15:12:33 shakeysanchez kernel: Call Trace:
Nov  9 15:12:33 shakeysanchez kernel:  [<02147dd6>] vma_link+0x9c/0xbc
Nov  9 15:12:33 shakeysanchez kernel:  [<0214892b>] do_mmap_pgoff+0x523/0x673
Nov  9 15:12:33 shakeysanchez kernel:  [<0210b55c>] sys_mmap2+0x80/0xb4
Nov  9 15:12:33 shakeysanchez kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


Expected Results:  No kernel dump.

Additional info:

Maya is 'qualified' for this kernel.
Catch-22: this might be fixed in a later kernel release (I couldn't locate any posting suggesting that a fix had been made).
But if we upgrade, we are using an 'unsupported' kernel.

Using Maya7_0-7.0-374 and AWCommon-9.5-1 rpms.

Have tried various Intel Xeon III and AMD Athlon MP hardware with 2-4 GB RAM.
Each machine has the same base kickstart.  Tried also with workstation config (base + gfx) as opposed to rendernode.
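
To check that the rendernode and workstation dumps really are the same crash, the two traces above can be compared mechanically. The following is only a rough sketch in Python, not part of the original report: the function name summarize and the regular expressions are my own assumptions about the i386 2.6 oops layout shown above, and the excerpts contain only the lines the parser needs.

```python
import re

# Excerpts copied from the two oops reports above (only the lines the parser needs).
RENDERNODE = """\
Nov  8 17:13:25 blade2-1 kernel: kernel BUG at mm/prio_tree.c:528!
Nov  8 17:13:25 blade2-1 kernel: EIP:    0060:[<0213f145>]    Not tainted VLI
Nov  8 17:13:25 blade2-1 kernel: EIP is at vma_prio_tree_add+0x36/0x95
Nov  8 17:13:25 blade2-1 kernel: Call Trace:
Nov  8 17:13:25 blade2-1 kernel:  [<02147dd6>] vma_link+0x9c/0xbc
Nov  8 17:13:25 blade2-1 kernel:  [<0214892b>] do_mmap_pgoff+0x523/0x673
Nov  8 17:13:25 blade2-1 kernel:  [<0210b55c>] sys_mmap2+0x80/0xb4
"""

WORKSTATION = """\
Nov  9 15:12:33 shakeysanchez kernel: kernel BUG at mm/prio_tree.c:528!
Nov  9 15:12:33 shakeysanchez kernel: EIP:    0060:[<0213f145>]    Tainted: P   VLI
Nov  9 15:12:33 shakeysanchez kernel: EIP is at vma_prio_tree_add+0x36/0x95
Nov  9 15:12:33 shakeysanchez kernel: Call Trace:
Nov  9 15:12:33 shakeysanchez kernel:  [<02147dd6>] vma_link+0x9c/0xbc
Nov  9 15:12:33 shakeysanchez kernel:  [<0214892b>] do_mmap_pgoff+0x523/0x673
Nov  9 15:12:33 shakeysanchez kernel:  [<0210b55c>] sys_mmap2+0x80/0xb4
"""


def summarize(log):
    """Hypothetical helper: extract BUG site, faulting symbol, taint state, call trace."""
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", log)
    eip = re.search(r"EIP is at (\S+)", log)
    taint = re.search(r"EIP:\s+\S+\s+(Not tainted|Tainted: \S+)", log)
    # Call trace entries have exactly one space between "]" and the symbol,
    # which keeps the "EIP: 0060:[<...>]" line out of the result.
    trace = re.findall(r"\[<[0-9a-f]+>\] (\S+)", log)
    return {
        "bug": (bug.group(1), int(bug.group(2))),
        "eip": eip.group(1),
        "taint": taint.group(1),
        "trace": trace,
    }


node = summarize(RENDERNODE)
ws = summarize(WORKSTATION)

# Same BUG location, same faulting instruction, same call path?
same_crash = all(node[key] == ws[key] for key in ("bug", "eip", "trace"))
print("same crash site:", same_crash)
print("taint:", node["taint"], "vs", ws["taint"])
```

On these excerpts the BUG location, faulting symbol and call trace match, and only the taint state differs: the rendernode, which does not load the nvidia module, is "Not tainted".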

Comment 1 Dave Jones 2005-11-09 20:01:40 UTC

If you can't upgrade, what exactly do you expect to happen with this bug?

Even if it wasn't tainted by the nvidia module, there's nothing we can do if
you're stuck on an old kernel.

Comment 2 Jez Tucker 2005-11-09 23:50:33 UTC

Some thoughts;

- has it been fixed in a later release, but I can't find out when or in which 
kernel?

- it is a bug, and may exist in later kernels.

- if it could be fixed, we would most likely use the unqualified kernel.  Alias
would have to requalify the fix and it would be in their interest to do so.