Bug 702408

Summary: kernel BUG at fs/buffer.c:3228!
Product: [Fedora] Fedora
Reporter: Satish Balay <balay>
Component: kernel
Assignee: Eric Sandeen <esandeen>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 15
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-11 15:21:00 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
new stacktrace from /var/log/messages none

Description Satish Balay 2011-05-05 15:24:22 UTC
Description of problem:

The OS crashed with a kernel stack trace on the screen, showing the error:

kernel BUG at fs/buffer.c:3228!

Version-Release number of selected component (if applicable):

kernel-2.6.38.5-22.fc14.x86_64

How reproducible:

Occurred once

Steps to Reproduce:
1. It happened suddenly, with no apparent trigger, after an uptime of 2 days [and a few suspend/resume cycles]. This is on a Thinkpad T61 laptop.

This is an F14 machine running the F15 kernel, grabbed from koji as a source RPM and rebuilt, i.e.:

wget http://kojipkgs.fedoraproject.org/packages/kernel/2.6.38.5/22.fc15/src/kernel-2.6.38.5-22.fc15.src.rpm
rpmbuild --rebuild --with baseonly --without debuginfo --target=`uname -m` kernel-2.6.38.5-22.fc15.src.rpm


Additional info:

May  5 09:53:48 asterix kernel: [113524.816882] ------------[ cut here ]------------
May  5 09:53:48 asterix kernel: [113524.816955] kernel BUG at fs/buffer.c:3228!
May  5 09:53:48 asterix kernel: [113524.817012] invalid opcode: 0000 [#1] SMP
May  5 09:53:48 asterix kernel: [113524.817015] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
May  5 09:53:48 asterix kernel: [113524.817015] CPU 1
May  5 09:53:48 asterix kernel: [113524.817015] Modules linked in: nls_utf8 vfat fat usb_storage uas tcp_lp fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc rfcomm sco bnep l2cap sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm arc4 btusb bluetooth snd_timer thinkpad_acpi iwl3945 iwlcore snd mac80211 cfg80211 iTCO_wdt iTCO_vendor_support wmi i2c_i801 microcode e1000e rfkill soundcore snd_page_alloc yenta_socket i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
May  5 09:53:48 asterix kernel: [113524.817015]
May  5 09:53:48 asterix kernel: [113524.817015] Pid: 30, comm: khugepaged Not tainted 2.6.38.5-22.fc14.x86_64 #1 LENOVO 8897CTO/8897CTO
May  5 09:53:48 asterix kernel: [113524.817015] RIP: 0010:[<ffffffff8114ef99>]  [<ffffffff8114ef99>] free_buffer_head+0x16/0x33
May  5 09:53:48 asterix kernel: [113524.817015] RSP: 0018:ffff88012fecf7e0  EFLAGS: 00010207
May  5 09:53:48 asterix kernel: [113524.817015] RAX: ffff880131abbc10 RBX: ffff880131abbbc8 RCX: 0020000000000009
May  5 09:53:48 asterix kernel: [113524.817015] RDX: 0000000000000000 RSI: ffff880131abbbc8 RDI: ffff880131abbbc8
May  5 09:53:48 asterix kernel: [113524.817015] RBP: ffff88012fecf7e0 R08: ffffea0001bb4410 R09: ffff8800830dfcd8
May  5 09:53:48 asterix kernel: [113524.817015] R10: ffff88012fecf8b0 R11: ffff880098a71d70 R12: ffff880095847858
May  5 09:53:48 asterix kernel: [113524.817015] R13: ffff88012ee76b9c R14: ffff880131abbbc8 R15: 0000000000000000
May  5 09:53:48 asterix kernel: [113524.817015] FS:  0000000000000000(0000) GS:ffff8800bf500000(0000) knlGS:0000000000000000
May  5 09:53:48 asterix kernel: [113524.817015] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  5 09:53:48 asterix kernel: [113524.817015] CR2: 0000000003baf001 CR3: 0000000001a03000 CR4: 00000000000006e0
May  5 09:53:48 asterix kernel: [113524.817015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May  5 09:53:48 asterix kernel: [113524.817015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May  5 09:53:48 asterix kernel: [113524.817015] Process khugepaged (pid: 30, threadinfo ffff88012fece000, task ffff880132449720)
May  5 09:53:48 asterix kernel: [113524.817015] Stack:
May  5 09:53:48 asterix kernel: [113524.817015]  ffff88012fecf820 ffffffff8114f25e ffff88012fecf800 ffff880100000001
May  5 09:53:48 asterix kernel: [113524.817015]  ffff88012fecf820 ffff880131abbbc8 ffff880131abbbc8 ffffea0001bb43e8
May  5 09:53:48 asterix kernel: [113524.817015]  ffff88012fecf870 ffffffff811d5bd5 ffff88012fecf928 0000000000000000
May  5 09:53:48 asterix kernel: [113524.817015] Call Trace:
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff8114f25e>] try_to_free_buffers+0x93/0xa5
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff811d5bd5>] jbd2_journal_try_to_free_buffers+0xdb/0xee
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff811a20fa>] ext4_releasepage+0x67/0x7a
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810df897>] try_to_release_page+0x34/0x3d
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810edafd>] shrink_page_list+0x305/0x478
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810ee096>] shrink_inactive_list+0x239/0x38b
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810ee7f7>] shrink_zone+0x362/0x464
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810eec35>] do_try_to_free_pages+0xdd/0x2dd
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810ef077>] try_to_free_pages+0xa0/0xe3
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810e71aa>] __alloc_pages_nodemask+0x4c8/0x75a
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810655d2>] ? try_to_del_timer_sync+0x77/0x85
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff81085756>] ? arch_local_irq_save+0x18/0x1e
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff81113a03>] alloc_pages_vma+0xec/0xf1
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff811213e1>] khugepaged+0x561/0xeeb
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff8107366f>] ? autoremove_wake_function+0x0/0x3d
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff81120e80>] ? khugepaged+0x0/0xeeb
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff810731ce>] kthread+0x82/0x8a
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff8100ba64>] kernel_thread_helper+0x4/0x10
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff8107314c>] ? kthread+0x0/0x8a
May  5 09:53:48 asterix kernel: [113524.817015]  [<ffffffff8100ba60>] ? kernel_thread_helper+0x0/0x10
May  5 09:53:48 asterix kernel: [113524.817015] Code: 76 4b c1 00 0f 9f c0 89 05 5d 4b c1 00 5a 5b 41 5c 41 5d c9 c3 55 48 89 e5 0f 1f 44 00 00 48 8d 47 48 48 39 47 48 48 89 fe 74 02 <0f> 0b 48 8b 3d 3e 4b c1 00 e8 21 ca fc ff 65 ff 0c 25 40 0c 01
May  5 09:53:48 asterix kernel: [113524.817015] RIP  [<ffffffff8114ef99>] free_buffer_head+0x16/0x33
May  5 09:53:48 asterix kernel: [113524.817015]  RSP <ffff88012fecf7e0>
May  5 09:53:48 asterix kernel: [113524.847416] ---[ end trace fe9aff3113b26bee ]---
May  5 09:54:10 asterix abrt: Kerneloops: Reported 1 kernel oopses to Abrt
May  5 09:54:10 asterix abrtd: Directory 'kerneloops-1304607250-1256-1' creation detected
May  5 09:54:10 asterix abrtd: New crash /var/spool/abrt/kerneloops-1304607250-1256-1, processing
May  5 09:54:10 asterix abrtd: RunApp('/var/spool/abrt/kerneloops-1304607250-1256-1','test x"`cat component`" = x"xorg-x11-server-Xorg" && cp /var/log/Xorg.0.log .')

Comment 1 Satish Balay 2011-05-05 16:05:19 UTC
One additional note: this machine had been running 2.6.38 kernels for a little over a month without hitting this issue.

kernel-2.6.38.5-22.fc14.x86_64
kernel-2.6.38.4-20.fc14.x86_64
kernel-2.6.38.3-15.rc1.fc14.x86_64
kernel-2.6.38.2-8.fc14.x86_64

Comment 2 Satish Balay 2011-06-01 13:31:16 UTC
I think the above is related to hibernate. Sometimes the machine goes into hibernate when idle on battery. [It's set to suspend when idle on battery, but that never happens. Perhaps a gnome2 bug.]

I had 2 other crashes related to this unintentional hibernate. The stack traces looked different for these, and one of them wasn't captured in /var/log/messages [so I don't have that].

Now I've upgraded to F15, and it looks like the 'unintentional hibernate' issue is fixed [i.e. the laptop goes into suspend when idle on battery, as intended]. So I'm hoping these crashes don't occur anymore.

I'm attaching the trace here for completeness, and will report again if I see any crashes in F15.

Comment 3 Satish Balay 2011-06-01 13:32:23 UTC
Created attachment 502265 [details]
new stacktrace from /var/log/messages

Comment 4 Dave Jones 2011-07-11 22:33:47 UTC
        BUG_ON(!list_empty(&bh->b_assoc_buffers));

Eric, any ideas?
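
For readers unfamiliar with the assertion quoted above, here is a minimal user-space sketch of what it checks. It is a Python model, not kernel code: `ListHead`, `BufferHead`, `KernelBug` and the `private_list` variable are illustrative stand-ins I made up for this note, and the model assumes only the usual kernel convention that an unlinked `list_head` points to itself. The idea is that a buffer head must be unlinked from any associated-buffers list before it is freed; if it is still linked, the check fires.

```python
class ListHead:
    """Illustrative stand-in for a circular, doubly linked list_head."""

    def __init__(self):
        # An unlinked node points to itself in both directions.
        self.next = self
        self.prev = self


def list_empty(head):
    """True when the node is linked to nothing but itself."""
    return head.next is head


def list_add(new, head):
    """Link 'new' right after 'head'."""
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_del_init(entry):
    """Unlink 'entry' and reset it to the self-pointing state."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry
    entry.prev = entry


class KernelBug(Exception):
    """Hypothetical exception standing in for the BUG() trap."""


class BufferHead:
    """Hypothetical model holding only the field the check looks at."""

    def __init__(self):
        self.b_assoc_buffers = ListHead()


def free_buffer_head(bh):
    # Models BUG_ON(!list_empty(&bh->b_assoc_buffers)).
    if not list_empty(bh.b_assoc_buffers):
        raise KernelBug("buffer head freed while still on an assoc list")
    return True


if __name__ == "__main__":
    bh = BufferHead()
    private_list = ListHead()  # illustrative owner list
    list_add(bh.b_assoc_buffers, private_list)
    try:
        free_buffer_head(bh)
    except KernelBug as err:
        print("check fired:", err)
    list_del_init(bh.b_assoc_buffers)
    print("freed after unlink:", free_buffer_head(bh))
```

In this model the check can fire for two reasons: the node really is still linked, or its pointers have been overwritten so that it no longer points to itself. The sketch does not say which of the two happened here.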

Comment 5 Dave Jones 2012-04-11 15:21:00 UTC
This is probably the i915 corruption bug that recently got fixed in 2.6.43.1.