Bug 710921 - ftrace: ftrace_mod_code not disabling interrupts while writing NOPs during boot
Summary: ftrace: ftrace_mod_code not disabling interrupts while writing NOPs during boot
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-06-05 20:05 UTC by Richard W.M. Jones
Modified: 2011-06-16 00:03 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-16 00:03:14 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
revert removing of interrupts disabling (612 bytes, patch)
2011-06-07 13:06 UTC, Steven Rostedt
no flags Details | Diff

Description Richard W.M. Jones 2011-06-05 20:05:02 UTC
Description of problem:

During testing of libguestfs, I get this kernel error:

[    0.109586] Initializing cgroup subsys cpuacct
[    0.111765] Initializing cgroup subsys memory
[    0.116440] Initializing cgroup subsys devices
[    0.117262] Initializing cgroup subsys freezer
[    0.117623] Initializing cgroup subsys net_cls
[    0.118299] Initializing cgroup subsys blkio
[    0.120101] Initializing cgroup subsys perf_event
[    0.128609] mce: CPU supports 10 MCE banks
[    0.136986] SMP alternatives: switching to UP code
[    0.217839] Freeing SMP alternatives: 12k freed
[    0.222323] ftrace: allocating 23539 entries in 47 pages
[    0.265968] BUG: unable to handle kernel paging request at fffffffe
[    0.265968] IP: [<c043cd93>] printk_tick+0x7/0x3b
[    0.265968] *pde = 017fc067 *pte = 00000000 
[    0.265968] Oops: 0000 [#1] SMP 
[    0.265968] Modules linked in:
[    0.265968] 
[    0.265968] Pid: 0, comm: swapper Not tainted 3.0-0.rc1.git0.2.fc16.i686 #1 Bochs Bochs
[    0.265968] EIP: 0060:[<c043cd93>] EFLAGS: 00000046 CPU: 0
[    0.265968] EIP is at printk_tick+0x7/0x3b
[    0.265968] EAX: 00000046 EBX: 00000000 ECX: 00000002 EDX: fffffed4
[    0.265968] ESI: 00000000 EDI: c0ac5fe0 EBP: de411f58 ESP: de411f54
[    0.265968]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[    0.265968] Process swapper (pid: 0, ti=de410000 task=c0ac5fe0 task.ti=c0a30000)
[    0.265968] Stack:
[    0.265968]  00000046 de411f6c c04486a2 c0ac8f00 c0a35bd0 c0ac8780 de411f74 c0462d3f
[    0.265968]  de411f94 c0462daf c0465d04 de411f94 00000000 c0ac8780 c0a35bd0 c0ac8780
[    0.265968]  de411f9c c0404a89 de411fd4 c0491b8a c0ac5fe0 00000046 de411fc8 c0468c77
[    0.265968] Call Trace:
[    0.265968]  [<c04486a2>] update_process_times+0x37/0x5d
[    0.265968]  [<c0462d3f>] tick_periodic+0x6b/0x77
[    0.265968]  [<c0462daf>] tick_handle_periodic+0x1e/0x6c
[    0.265968]  [<c0465d04>] ? trace_hardirqs_off+0xb/0xd
[    0.265968]  [<c0404a89>] timer_interrupt+0x19/0x20
[    0.265968]  [<c0491b8a>] handle_irq_event_percpu+0x4a/0x1a4
[    0.265968]  [<c0468c77>] ? lock_release+0x163/0x184
[    0.265968]  [<c0491d15>] handle_irq_event+0x31/0x48
[    0.265968]  [<c04937e3>] ? unmask_irq+0x23/0x23
[    0.265968]  [<c0493838>] handle_level_irq+0x55/0x73
[    0.265968]  <IRQ> 
[    0.265968]  [<c04042d0>] ? do_IRQ+0x3c/0x92
[    0.265968]  [<c0465ca1>] ? trace_hardirqs_off_caller+0x2e/0x86
[    0.265968]  [<c043cd91>] ? printk_tick+0x5/0x3b
[    0.265968]  [<c08257f5>] ? common_interrupt+0x35/0x3c
[    0.265968]  [<c043cd91>] ? printk_tick+0x5/0x3b
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c05f850a>] ? __copy_to_user_ll+0x4d/0x51
[    0.265968]  [<c04be01f>] ? __probe_kernel_write+0x1f/0x3c
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c041c67c>] ? ftrace_mod_code+0x18/0x28
[    0.265968]  [<c041c713>] ? ftrace_modify_code+0x87/0xc9
[    0.265968]  [<c041c83f>] ? ftrace_make_nop+0x2e/0x30
[    0.265968]  [<c049bcc9>] ? ftrace_process_locs+0x141/0x1ec
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c0b45ab0>] ? ftrace_init+0x126/0x13e
[    0.265968]  [<c0b2d790>] ? start_kernel+0x366/0x371
[    0.265968]  [<c0b2d1b3>] ? loglevel+0x18/0x18
[    0.265968]  [<c0b2d0a9>] ? i386_start_kernel+0xa9/0xaf
[    0.265968] Code: be d8 77 d4 c0 00 75 ab 83 cb ff 83 c4 0c 89 d8 5b 5e 5f 5d c3 55 89 e5 e8 87 8a 3e 00 a1 c0 77 d4 c0 5d c3 55 89 e5 50 3e 77 8a <3e> 00 64 a1 fc e6 bb c0 85 c0 74 26 c7 04 24 00 00 00 00 b9 01 
[    0.265968] EIP: [<c043cd93>] printk_tick+0x7/0x3b SS:ESP 0068:de411f54
[    0.265968] CR2: 00000000fffffffe
[    0.265968] ---[ end trace a7919e7f17c0a725 ]---
[    0.265968] Kernel panic - not syncing: Fatal exception in interrupt
[    0.265968] Pid: 0, comm: swapper Tainted: G      D     3.0-0.rc1.git0.2.fc16.i686 #1
[    0.265968] Call Trace:
[    0.265968]  [<c0816356>] ? printk+0x2d/0x2f
[    0.265968]  [<c081622a>] panic+0x5c/0x15b
[    0.265968]  [<c0820ac3>] oops_end+0x9b/0xa8
[    0.265968]  [<c0815c1d>] no_context+0x128/0x130
[    0.265968]  [<c0815d44>] __bad_area_nosemaphore+0x11f/0x127
[    0.265968]  [<c04075b4>] ? paravirt_read_tsc+0x8/0x10
[    0.265968]  [<c0421b08>] ? pgd_val+0x8/0x10
[    0.265968]  [<c0815844>] ? pte_offset_kernel+0x12/0x2a
[    0.265968]  [<c0815d63>] bad_area_nosemaphore+0x17/0x19
[    0.265968]  [<c082247b>] do_page_fault+0x17a/0x365
[    0.265968]  [<c045a94b>] ? sched_clock_cpu+0x134/0x144
[    0.265968]  [<c0465ca1>] ? trace_hardirqs_off_caller+0x2e/0x86
[    0.265968]  [<c0822301>] ? spurious_fault+0xa9/0xa9
[    0.265968]  [<c08200c4>] error_code+0x6c/0x74
[    0.265968]  [<c043cd93>] ? printk_tick+0x7/0x3b
[    0.265968]  [<c04486a2>] update_process_times+0x37/0x5d
[    0.265968]  [<c0462d3f>] tick_periodic+0x6b/0x77
[    0.265968]  [<c0462daf>] tick_handle_periodic+0x1e/0x6c
[    0.265968]  [<c0465d04>] ? trace_hardirqs_off+0xb/0xd
[    0.265968]  [<c0404a89>] timer_interrupt+0x19/0x20
[    0.265968]  [<c0491b8a>] handle_irq_event_percpu+0x4a/0x1a4
[    0.265968]  [<c0468c77>] ? lock_release+0x163/0x184
[    0.265968]  [<c0491d15>] handle_irq_event+0x31/0x48
[    0.265968]  [<c04937e3>] ? unmask_irq+0x23/0x23
[    0.265968]  [<c0493838>] handle_level_irq+0x55/0x73
[    0.265968]  <IRQ>  [<c04042d0>] ? do_IRQ+0x3c/0x92
[    0.265968]  [<c0465ca1>] ? trace_hardirqs_off_caller+0x2e/0x86
[    0.265968]  [<c043cd91>] ? printk_tick+0x5/0x3b
[    0.265968]  [<c08257f5>] ? common_interrupt+0x35/0x3c
[    0.265968]  [<c043cd91>] ? printk_tick+0x5/0x3b
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c05f850a>] ? __copy_to_user_ll+0x4d/0x51
[    0.265968]  [<c04be01f>] ? __probe_kernel_write+0x1f/0x3c
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c041c67c>] ? ftrace_mod_code+0x18/0x28
[    0.265968]  [<c041c713>] ? ftrace_modify_code+0x87/0xc9
[    0.265968]  [<c041c83f>] ? ftrace_make_nop+0x2e/0x30
[    0.265968]  [<c049bcc9>] ? ftrace_process_locs+0x141/0x1ec
[    0.265968]  [<c043cd90>] ? printk_tick+0x4/0x3b
[    0.265968]  [<c0b45ab0>] ? ftrace_init+0x126/0x13e
[    0.265968]  [<c0b2d790>] ? start_kernel+0x366/0x371
[    0.265968]  [<c0b2d1b3>] ? loglevel+0x18/0x18
[    0.265968]  [<c0b2d0a9>] ? i386_start_kernel+0xa9/0xaf
[    0.265968] Rebooting in 1 seconds..libguestfs: error: unexpected end of file when reading from daemon

Version-Release number of selected component (if applicable):

Linux version 3.0-0.rc1.git0.2.fc16.i686 (mockbuild.fedoraproject.org) (gcc version 4.6.0 20110530 (Red Hat 4.6.0-9) (GCC) ) #1 SMP Fri Jun 3 12:51:45 UTC 2011

How reproducible:

Not always.

Steps to Reproduce:
1. ?? Compile and test libguestfs on Rawhide.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Richard W.M. Jones 2011-06-05 21:52:46 UTC
Same problem seems to happen on x86-64 too.

http://koji.fedoraproject.org/koji/getfile?taskID=3112179&name=build.log

[    0.243459] ftrace: allocating 24624 entries in 97 pages
[    0.307962] BUG: unable to handle kernel paging request at ffffffffffffff8d
[    0.307962] IP: [<ffffffff8100b264>] timer_interrupt+0x4/0x24
[    0.307962] PGD 1a05067 PUD 1a06067 PMD 0 
[    0.307962] Oops: 0000 [#1] SMP 
[    0.307962] CPU 0 
[    0.307962] Modules linked in:
[    0.307962] 
[    0.307962] Pid: 0, comm: swapper Not tainted 3.0-0.rc1.git0.2.fc16.x86_64 #1 Bochs Bochs
[    0.307962] RIP: 0010:[<ffffffff8100b264>]  [<ffffffff8100b264>] timer_interrupt+0x4/0x24
[    0.307962] RSP: 0000:ffff88001f003e98  EFLAGS: 00000046
[    0.307962] RAX: 00000000001d5f4c RBX: ffffffff81a0e900 RCX: 00000000125b2249
[    0.307962] RDX: 00000000026b026a RSI: 0000000000000000 RDI: 0000000000000000
[    0.307962] RBP: ffff88001f003e98 R08: 0000000000000002 R09: ffffffff82703e60
[    0.307962] R10: ffffffff82703e60 R11: ffffffff82703e60 R12: ffff88001ec08548
[    0.307962] R13: ffff88001ec08490 R14: 0000000000000000 R15: 0000000000000000
[    0.307962] FS:  0000000000000000(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[    0.307962] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.307962] CR2: ffffffffffffff8d CR3: 0000000001a03000 CR4: 00000000000006f0
[    0.307962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.307962] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[    0.307962] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a0b020)
[    0.307962] Stack:
[    0.307962]  ffff88001f003ef8 ffffffff810ba8b7 ffffffff81a0e900 ffff88001ec08548
[    0.307962]  ffff88001ec08548 0000000081a0e900 ffffffff81a0b020 ffff88001ec08490
[    0.307962]  ffff88001ec08548 ffffffff81a0e900 ffffffff81a01d38 ffffffff814a09b2
[    0.307962] Call Trace:
[    0.307962]  <IRQ> 
[    0.307962]  [<ffffffff810ba8b7>] handle_irq_event_percpu+0x56/0x1c7
[    0.307962]  [<ffffffff814a09b2>] ? csum_partial_copy_from_user+0xf/0xed
[    0.307962]  [<ffffffff810baa6f>] handle_irq_event+0x47/0x68
[    0.307962]  [<ffffffff810bcad9>] handle_level_irq+0x6a/0x8e
[    0.307962]  [<ffffffff8100ab2c>] handle_irq+0x8b/0x91
[    0.307962]  [<ffffffff814cc93d>] do_IRQ+0x4d/0xa5
[    0.307962]  [<ffffffff8100b264>] ? timer_interrupt+0x4/0x24
[    0.307962]  [<ffffffff814c45d3>] common_interrupt+0x13/0x13
[    0.307962]  <EOI> 
[    0.307962]  [<ffffffff8125c8bb>] ? copy_user_generic_unrolled+0x9b/0xb0
[    0.307962]  [<ffffffff810ee710>] ? __probe_kernel_write+0x36/0x56
[    0.307962]  [<ffffffff8102537e>] ftrace_mod_code+0x1c/0x2d
[    0.307962]  [<ffffffff8102540b>] do_ftrace_mod_code+0x7c/0xaa
[    0.307962]  [<ffffffff8100b264>] ? timer_interrupt+0x4/0x24
[    0.307962]  [<ffffffff810254f2>] ftrace_modify_code+0x56/0x70
[    0.307962]  [<ffffffff81229ec4>] ? fips_enable+0x4/0x4c
[    0.307962]  [<ffffffff8102549c>] ? ftrace_mod_jmp+0x63/0x63
[    0.307962]  [<ffffffff810255fb>] ftrace_make_nop+0x37/0x39
[    0.307962]  [<ffffffff810c6887>] ftrace_process_locs+0x190/0x25d
[    0.307962]  [<ffffffff81d5faf3>] ftrace_init+0x19f/0x1c3
[    0.307962]  [<ffffffff81d49823>] ? alternative_instructions+0x136/0x13b
[    0.307962]  [<ffffffff81d41ba4>] start_kernel+0x3e3/0x3f3
[    0.307962]  [<ffffffff81d412c4>] x86_64_start_reservations+0xaf/0xb3
[    0.307962]  [<ffffffff81d41140>] ? early_idt_handlers+0x140/0x140
[    0.307962]  [<ffffffff81d413ca>] x86_64_start_kernel+0x102/0x111
[    0.307962] Code: e8 42 f9 4b 00 48 89 fb e8 76 b0 0f 00 31 d2 66 8b 0b 31 c0 85 d2 75 0a 31 c0 66 81 f9 0f 0b 0f 94 c0 5a 5b 5d c3 90 55 48 89 e5 
[    0.307962]  66 66 4b 00 48 8b 05 30 35 e3 00 65 ff 04 25 c8 29 1d 00 48 
[    0.307962] RIP  [<ffffffff8100b264>] timer_interrupt+0x4/0x24
[    0.307962]  RSP <ffff88001f003e98>
[    0.307962] CR2: ffffffffffffff8d
[    0.307962] ---[ end trace a7919e7f17c0a725 ]---
[    0.307962] Kernel panic - not syncing: Fatal exception in interrupt
[    0.307962] Pid: 0, comm: swapper Tainted: G      D     3.0-0.rc1.git0.2.fc16.x86_64 #1
[    0.307962] Call Trace:
[    0.307962]  <IRQ>  [<ffffffff814b9bf5>] panic+0x91/0x1a1
[    0.307962]  [<ffffffff8105a373>] ? kmsg_dump+0x89/0x132
[    0.307962]  [<ffffffff814c5599>] oops_end+0xb4/0xc5
[    0.307962]  [<ffffffff814b94b3>] no_context+0x203/0x212
[    0.307962]  [<ffffffff8100eecf>] ? native_sched_clock+0x39/0x3b
[    0.307962]  [<ffffffff8100eeda>] ? sched_clock+0x9/0xd
[    0.307962]  [<ffffffff814b965b>] __bad_area_nosemaphore+0x199/0x1bc
[    0.307962]  [<ffffffff814b8d83>] ? pmd_offset+0x19/0x3f
[    0.307962]  [<ffffffff814b9691>] bad_area_nosemaphore+0x13/0x15
[    0.307962]  [<ffffffff814c766a>] do_page_fault+0x1df/0x3d5
[    0.307962]  [<ffffffff8100eeda>] ? sched_clock+0x9/0xd
[    0.307962]  [<ffffffff81088e5a>] ? mark_lock+0x2d/0x220
[    0.307962]  [<ffffffff81086ac0>] ? trace_hardirqs_off_caller+0x33/0x90
[    0.307962]  [<ffffffff8125dabd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[    0.307962]  [<ffffffff814c4915>] page_fault+0x25/0x30
[    0.307962]  [<ffffffff8100b264>] ? timer_interrupt+0x4/0x24
[    0.307962]  [<ffffffff810ba8b7>] handle_irq_event_percpu+0x56/0x1c7
[    0.307962]  [<ffffffff814a09b2>] ? csum_partial_copy_from_user+0xf/0xed
[    0.307962]  [<ffffffff810baa6f>] handle_irq_event+0x47/0x68
[    0.307962]  [<ffffffff810bcad9>] handle_level_irq+0x6a/0x8e
[    0.307962]  [<ffffffff8100ab2c>] handle_irq+0x8b/0x91
[    0.307962]  [<ffffffff814cc93d>] do_IRQ+0x4d/0xa5
[    0.307962]  [<ffffffff8100b264>] ? timer_interrupt+0x4/0x24
[    0.307962]  [<ffffffff814c45d3>] common_interrupt+0x13/0x13
[    0.307962]  <EOI>  [<ffffffff8125c8bb>] ? copy_user_generic_unrolled+0x9b/0xb0
[    0.307962]  [<ffffffff810ee710>] ? __probe_kernel_write+0x36/0x56
[    0.307962]  [<ffffffff8102537e>] ftrace_mod_code+0x1c/0x2d
[    0.307962]  [<ffffffff8102540b>] do_ftrace_mod_code+0x7c/0xaa
[    0.307962]  [<ffffffff8100b264>] ? timer_interrupt+0x4/0x24
[    0.307962]  [<ffffffff810254f2>] ftrace_modify_code+0x56/0x70
[    0.307962]  [<ffffffff81229ec4>] ? fips_enable+0x4/0x4c
[    0.307962]  [<ffffffff8102549c>] ? ftrace_mod_jmp+0x63/0x63
[    0.307962]  [<ffffffff810255fb>] ftrace_make_nop+0x37/0x39
[    0.307962]  [<ffffffff810c6887>] ftrace_process_locs+0x190/0x25d
[    0.307962]  [<ffffffff81d5faf3>] ftrace_init+0x19f/0x1c3
[    0.307962]  [<ffffffff81d49823>] ? alternative_instructions+0x136/0x13b
[    0.307962]  [<ffffffff81d41ba4>] start_kernel+0x3e3/0x3f3
[    0.307962]  [<ffffffff81d412c4>] x86_64_start_reservations+0xaf/0xb3
[    0.307962]  [<ffffffff81d41140>] ? early_idt_handlers+0x140/0x140
[    0.307962]  [<ffffffff81d413ca>] x86_64_start_kernel+0x102/0x111
[    0.307962] Rebooting in 1 seconds..libguestfs: error: unexpected end of file when reading from daemon

Comment 2 Steven Rostedt 2011-06-06 13:03:22 UTC
Which kernel version is this exactly, and which version of the compiler was used to build it?

Comment 3 Richard W.M. Jones 2011-06-06 13:47:27 UTC
Hi Steven, this is:

kernel-3.0-0.rc1.git0.2.fc16 (tried both i686 and x86-64)

from Fedora Rawhide.  This kernel was compiled using:

gcc-4.6.0-9.fc16

I'm currently installing a Fedora Rawhide VM to see if I
can reproduce this issue outside Koji.

Comment 4 Steven Rostedt 2011-06-07 01:57:41 UTC
This is a virtual machine correct? It may have issues with the "ideal_nop" that was chosen. Looking at the text dump I see:

55 89 e5 50 3e 77 8a <3e> 00 64 a1 fc

This tells a lot, as looking at the objdump of printk_tick, this should have been:

  55                push %ebp
  89 e5             mov %esp, %ebp
  50                push %eax
  3e 77 8a 3e 00    nop

But that "3e 77 8a 3e 00" does not look like a nop to me. The starting 3e may be, but the rest looks funny.

On boot up, that line is a call to mcount (e8 <offset>), and the ftrace code converts it into a 5 byte nop. But different x86 hardware may not handle all 5 byte nops, so we have a function called earlier in arch/x86/kernel/alternative.c: arch_init_ideal_nops().

But that doesn't look like any of the nops that we may pick. I wonder if something funny happened in the way it was converted.

Can you give me an objdump of that code from this kernel vmlinux?

objdump -dr --start-addr 0xc043cd8c vmlinux | head

Comment 5 Richard W.M. Jones 2011-06-07 07:21:09 UTC
I am able to reproduce this on Rawhide, so I'm now sure
that this is a real bug, not an artifact of Koji or libguestfs.

I booted the kernel (3.0-0.rc1.git0.2.fc16.x86_64) in a qemu
guest (qemu-system-x86-0.14.0-8.fc16.x86_64).

I did this in a loop, 100 times in a row.  It failed 16 / 100
times.

In 14 of the failure cases, the stack trace contained the
string "ftrace_make_nop".

In the other 2 cases, the boot hung with the last line being:

[    0.202464] ftrace: allocating 24624 entries in 97 pages

Comment 6 Richard W.M. Jones 2011-06-07 07:40:12 UTC
(In reply to comment #4)
> Can you give me an objdump of that code from this kernel vmlinux?
> 
> objdump -dr --start-addr 0xc043cd8c vmlinux | head

I don't have the 32 bit kernel handy (but it's available
from Koji: http://koji.fedoraproject.org/koji/buildinfo?buildID=246189)

However printk_tick also failed in the same way in my
test of the 64 bit kernel.  This is the disassembly of that
function in the kernel image:

ffffffff81059762:       55                      push   %rbp
ffffffff81059763:       48 89 e5                mov    %rsp,%rbp
ffffffff81059766:       e8 15 14 47 00          callq  0xffffffff814cab80
ffffffff8105976b:       65 8b 04 25 50 e6 00    mov    %gs:0xe650,%eax
ffffffff81059772:       00 
ffffffff81059773:       85 c0                   test   %eax,%eax
ffffffff81059775:       74 24                   je     0xffffffff8105979b
ffffffff81059777:       31 c9                   xor    %ecx,%ecx
ffffffff81059779:       ba 01 00 00 00          mov    $0x1,%edx
ffffffff8105977e:       be 01 00 00 00          mov    $0x1,%esi
ffffffff81059783:       48 c7 c7 60 a2 a1 81    mov    $0xffffffff81a1a260,%rdi
ffffffff8105978a:       65 c7 04 25 50 e6 00    movl   $0x0,%gs:0xe650
ffffffff81059791:       00 00 00 00 00 
ffffffff81059796:       e8 47 b8 fe ff          callq  0xffffffff81044fe2
ffffffff8105979b:       5d                      pop    %rbp
ffffffff8105979c:       c3                      retq   

Here is the code in memory from the stack trace:

[    0.245971] Code: 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5 e8 26 14 47 00 8b
 05 60 d0 e4 00 5d c3 55 48 89 e5 66 66 66 66 00 65 8b 04 25 50 e6 00 
[    0.245971]  85 c0 74 24 31 c9 ba 01 00 00 00 be 01 00 00 00 48 c7 c7 60 
[    0.245971] RIP  [<ffffffff81059772>] printk_tick+0x10/0x3b
[    0.245971]  RSP <ffff88001f003e10>
[    0.245971] CR2: ffff88005024b2d8
[    0.245971] ---[ end trace a7919e7f17c0a725 ]---

Comment 7 Richard W.M. Jones 2011-06-07 07:48:57 UTC
Looking at the nops from the kernel, it strikes me
that "66 66 66 66 00" should be "66 66 66 66 90".

Comment 8 Richard W.M. Jones 2011-06-07 10:00:21 UTC
What seems to happen is that the copy_user_generic_unrolled
function (which is copying in the NOP) gets interrupted.  The
interrupt code calls some function that has a partial NOP,
causing mayhem.

Here is another example.  Disassembly of rcu_enter_nohz
in vmlinux:

ffffffff810bfb25 <.text+0xbfb25>:
ffffffff810bfb25:       55                      push   %rbp
ffffffff810bfb26:       48 89 e5                mov    %rsp,%rbp
ffffffff810bfb29:       41 54                   push   %r12
ffffffff810bfb2b:       53                      push   %rbx
ffffffff810bfb2c:       e8 4f b0 40 00          callq  0xffffffff814cab80
ffffffff810bfb31:       49 c7 c4 e0 ee 1c 00    mov    $0x1ceee0,%r12
ffffffff810bfb38:       e8 71 e7 ff ff          callq  0xffffffff810be2ae

At the time of the panic, the code in memory looks like:

[    0.243972] Code: 10 ef 1c 00 48 63 ff 48 03 04 fd d0 29 b6 81 48 8b 50 08 48
 ff ca 48 89 50 10 c6 40 18 01 5d c3 55 48 89 e5 41 54 53 66 4f b0 40 
[    0.243972]  49 c7 c4 e0 ee 1c 00 e8 71 e7 ff ff 48 89 c3 e8 d8 6f fc ff 

Note that only the first byte of the instruction has been
overwritten by the NOP.

   Original: e8 4f b0 40 00
  In memory: 66 4f b0 40 [00]*
        NOP: 66 66 66 66 90

*Mystery: why is the trailing 00 omitted from the Code print out?

Comment 9 Steven Rostedt 2011-06-07 12:58:50 UTC
I believe I know the answer. It is assumed that nothing else will run during this time, and things that do will not be affected by the change. But I see now that interrupts are enabled, and active. If an interrupt happens when we copy the nop into place, but the copy is not atomic, we can copy just part of the instruction. If that gets executed, it will crash.

The fix should be to have ftrace disable interrupts before doing the modifications.

I'll attach a patch soon. Would you be able to test it?

Comment 10 Richard W.M. Jones 2011-06-07 13:03:22 UTC
(In reply to comment #9)
> I'll attach a patch soon. Would you be able to test it?

Actually I was just compiling a kernel that used cli/sti around
the relevant code.  However I'd prefer to test your patch
instead.  So please send me your patch / attach it here and
I will definitely be able to test it for you.

Comment 11 Steven Rostedt 2011-06-07 13:05:35 UTC
This is a mainline bug caused by my patch:

commit 8ab2b7efd3e2ccf2c2dda3206b8171ecdbd0af40
Author: Steven Rostedt <srostedt>
Date:   Thu Apr 21 22:41:35 2011 -0400

    ftrace: Remove unnecessary disabling of irqs


The comment was wrong, and the code was removed because the reason stated by the comment no longer is required. But interrupts must still be disabled for other reasons. And is the bug you have.

Revert this commit and it should work.

Comment 12 Steven Rostedt 2011-06-07 13:06:47 UTC
Created attachment 503477 [details]
revert removing of interrupts disabling

Comment 13 Richard W.M. Jones 2011-06-07 16:28:57 UTC
Steven, I patched my Rawhide kernel with the patch in comment 12.
I then reran the boot test 100 times (as in comment 5).  The boot
was successful every time.

*** Therefore this bug appears to be FIXED ***

Please can we get this patch into the upstream kernel, soon as poss?
It's breaking libguestfs builds, and likely many other users of
Rawhide too.

Comment 14 Steven Rostedt 2011-06-07 16:38:33 UTC
I have it queued, and I'm just finishing my standard tests on it.

Comment 15 Chuck Ebbert 2011-06-16 00:03:14 UTC
This was fixed in kernel-3.0-0.rc2.git0.2.fc16


Note You need to log in before you can comment on or make changes to this bug.