Bug 591466

Summary:

[abrt] WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x82/0xa0()

Product:

Red Hat Enterprise Linux 6

Reporter:

Stefan Assmann <sassmann>

Component:

kernel

Assignee:

Edward Shishkin <edward>

Status:

CLOSED ERRATA

QA Contact:

Eryu Guan <eguan>

Severity:

medium

Priority:

low

Version:

6.0

CC:

babu.moger, case-diagnostics, edward, eguan, esandeen, kzhang, pmcdonou, rlary, rwheeler, tao

Hardware:

x86_64

OS:

Linux

Whiteboard:

abrt_hash:16221087

Fixed In Version:

kernel-2.6.32-112.el6

Doc Type:

Bug Fix

Last Closed:

2011-05-23 20:21:31 UTC

Bug Blocks:

645454

Attachments:

Description	Flags
File: backtrace	none
sosreport	none
patch against rc-1 (2.6.32-71.el6)	none
File: backtrace	none

Description Stefan Assmann 2010-05-12 10:29:34 UTC

abrt 1.1.0 detected a crash.

architecture: x86_64
cmdline: not_applicable
comment: umounted and powered down external usb hdd approx 1 minute before this happened
component: kernel
executable: kernel
kernel: 2.6.32-25.el6.x86_64
package: kernel
reason: ------------[ cut here ]------------
release: Red Hat Enterprise Linux Workstation release 6.0 Beta (Santiago)

backtrace
-----
------------[ cut here ]------------
WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x82/0xa0() (Not tainted)
Hardware name: 2241B48
Modules linked in: tun(U) fuse(U) ipt_MASQUERADE(U) iptable_nat(U) nf_nat(U) rfcomm(U) sco(U) bridge(U) stp(U) llc(U) bnep(U) l2cap(U) autofs4(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) xt_physdev(U) ip6t_REJECT(U) nf_conntrack_ipv6(U) ip6table_filter(U) ip6_tables(U) ipv6(U) ext3(U) jbd(U) dm_mirror(U) dm_region_hash(U) dm_log(U) kvm_intel(U) kvm(U) uinput(U) snd_hda_codec_conexant(U) arc4(U) ecb(U) snd_hda_intel(U) snd_hda_codec(U) snd_hwdep(U) iwlagn(U) snd_seq(U) snd_seq_device(U) iwlcore(U) snd_pcm(U) mac80211(U) uvcvideo(U) btusb(U) ppdev(U) snd_timer(U) thinkpad_acpi(U) videodev(U) cfg80211(U) bluetooth(U) parport_pc(U) snd(U) hwmon(U) v4l2_compat_ioctl32(U) ricoh_mmc(U) parport(U) iTCO_wdt(U) i2c_i801(U) e1000e(U) soundcore(U) rfkill(U) sg(U) iTCO_vendor_support(U) wmi(U) snd_page_alloc(U) ext4(U) mbcache(U) jbd2(U) cryptd(U) aes_x86_64(U) aes_generic(U) xts(U) gf128mul(U) dm_crypt(U) ums_cypress(U) usb_storage(U) sr_mod(U) cdrom(U) sd_mod(U) crc_t10dif(U) sdhci_pci(U) sdhci(U) firewire_ohci(U) mmc_core(U) firewire_core(U) crc_itu_t(U) yenta_socket(U) rsrc_nonstatic(U) ahci(U) i915(U) drm_kms_helper(U) drm(U) i2c_algo_bit(U) i2c_core(U) video(U) output(U) dm_mod(U) [last unloaded: microcode]
Pid: 7798, comm: umount Not tainted 2.6.32-25.el6.x86_64 #1
Call Trace:
[<ffffffff810672a3>] warn_slowpath_common+0x83/0xc0
[<ffffffff810672f4>] warn_slowpath_null+0x14/0x20
[<ffffffff8118ff32>] mark_buffer_dirty+0x82/0xa0
[<ffffffffa0497fb5>] ext3_put_super+0x1a5/0x280 [ext3]
[<ffffffff811642f6>] generic_shutdown_super+0x56/0xd0
[<ffffffff811643a1>] kill_block_super+0x31/0x50
[<ffffffff811653ea>] deactivate_super+0x6a/0x80
[<ffffffff8118057f>] mntput_no_expire+0xaf/0x100
[<ffffffff81180973>] sys_umount+0x63/0x3b0
[<ffffffff81013172>] system_call_fastpath+0x16/0x1b

Comment 1 Stefan Assmann 2010-05-12 10:29:39 UTC

Created attachment 413391 [details]
File: backtrace

Comment 2 Stefan Assmann 2010-05-12 10:37:12 UTC

things I found in /var/log/messages regarding the usb hdd:
May 12 10:38:10 t500 kernel: EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0

lots of these:
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Add. Sense: ATA pass through information available
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Sense Key : Recovered Error [current] [descriptor]
May 12 12:19:00 t500 kernel: Descriptor sense data with sense descriptors (in hex):
May 12 12:19:00 t500 kernel:        72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
May 12 12:19:00 t500 kernel:        00 4f 00 c2 e0 50
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Add. Sense: ATA pass through information available
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Sense Key : Recovered Error [current] [descriptor]
May 12 12:19:00 t500 kernel: Descriptor sense data with sense descriptors (in hex):
May 12 12:19:00 t500 kernel:        72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
May 12 12:19:00 t500 kernel:        00 4f 00 c2 e0 50
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Add. Sense: ATA pass through information available
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Sense Key : Recovered Error [current] [descriptor]
May 12 12:19:00 t500 kernel: Descriptor sense data with sense descriptors (in hex):
May 12 12:19:00 t500 kernel:        72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
May 12 12:19:00 t500 kernel:        00 4f 00 c2 e0 50
May 12 12:19:00 t500 kernel: sd 5:0:0:0: [sdc] Add. Sense: ATA pass through information available

The kernel warning happened at:
May 12 12:26:39 t500 kernel: WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x82/0xa0() (Not tainted)
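
The repeated sense buffers quoted above can be decoded by hand. Below is a minimal userspace C sketch, assuming the standard layout of descriptor-format sense data (response code 0x72) and of the SAT "ATA Status Return" descriptor (type 0x09). The struct and function names are made up for illustration and are not kernel APIs.

```c
/* Userspace sketch: decode descriptor-format SCSI sense data as logged above.
 * Offsets assume the SPC descriptor sense header and the SAT
 * "ATA Status Return" descriptor; all names here are illustrative only. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The 22 bytes printed in the log (two hex dump lines joined). */
static const uint8_t sample_sense[22] = {
    0x72, 0x01, 0x00, 0x1d, 0x00, 0x00, 0x00, 0x0e,
    0x09, 0x0c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x4f, 0x00, 0xc2, 0xe0, 0x50
};

struct ata_return {
    int     valid;              /* 1 if an ATA Status Return descriptor was found */
    uint8_t sense_key;          /* low nibble of byte 1 */
    uint8_t asc, ascq;          /* additional sense code and qualifier */
    uint8_t error;              /* ATA error register */
    uint8_t lba_mid, lba_high;  /* low bytes of the LBA mid/high registers */
    uint8_t device, status;     /* ATA device and status registers */
};

static struct ata_return decode_sense(const uint8_t *buf, size_t len)
{
    struct ata_return r = {0};

    if (len < 8 || (buf[0] & 0x7f) != 0x72)
        return r;                       /* not current descriptor-format sense */
    r.sense_key = buf[1] & 0x0f;
    r.asc = buf[2];
    r.ascq = buf[3];

    size_t end = 8 + (size_t)buf[7];    /* byte 7: additional sense length */
    if (end > len)
        end = len;

    /* Walk the descriptor list: each entry is type, length, payload. */
    for (size_t off = 8; off + 2 <= end; off += 2 + (size_t)buf[off + 1]) {
        if (buf[off] == 0x09 && buf[off + 1] >= 0x0c && off + 14 <= end) {
            r.valid = 1;
            r.error = buf[off + 3];
            r.lba_mid = buf[off + 9];
            r.lba_high = buf[off + 11];
            r.device = buf[off + 12];
            r.status = buf[off + 13];
            break;
        }
    }
    return r;
}

#ifdef SENSE_DEMO_MAIN  /* build with -DSENSE_DEMO_MAIN to run standalone */
int main(void)
{
    struct ata_return r = decode_sense(sample_sense, sizeof sample_sense);

    assert(r.valid);
    printf("sense key 0x%x, asc/ascq 0x%02x/0x%02x\n",
           (unsigned)r.sense_key, (unsigned)r.asc, (unsigned)r.ascq);
    printf("ATA status 0x%02x (ERR bit %s), error 0x%02x, lba mid/high 0x%02x/0x%02x\n",
           (unsigned)r.status, (r.status & 0x01) ? "set" : "clear",
           (unsigned)r.error, (unsigned)r.lba_mid, (unsigned)r.lba_high);
    return 0;
}
#endif
```

For the bytes logged here this gives sense key 0x1 (Recovered Error), ASC/ASCQ 0x00/0x1d, an ATA status of 0x50 with the ERR bit clear, and LBA mid/high of 0x4f/0xc2. That pattern is consistent with a successful ATA pass-through command such as a SMART status query, although the log does not say what issued it.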

Comment 4 RHEL Program Management 2010-05-12 12:00:10 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 5 Eric Sandeen 2010-05-12 15:59:11 UTC

The warning is from:

void mark_buffer_dirty(struct buffer_head *bh)
{
        WARN_ON_ONCE(!buffer_uptodate(bh));

The buffer in question is the one containing the superblock:

ext3_put_super():
                mark_buffer_dirty(sbi->s_sbh);

All of your scsi errors were on sdc, but you mentioned an ext3 error on sdb.  The warning was generated from the unmount task, but you said you had completed (?) unmount 1 minute prior to the trace?  Which device got unmounted, sdb or sdc - I'm guessing sdb?

> May 12 10:38:10 t500 kernel: EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0

Anything else related to sdb in the logs?
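
To make the check quoted in this comment concrete, here is a small userspace model of it. It is a sketch and not kernel code: `model_bh`, `model_mark_buffer_dirty` and the flag names are hypothetical stand-ins for `struct buffer_head`, `mark_buffer_dirty()` and the buffer state bits. It mirrors two properties of the real function: the warning is reported only the first time the condition is hit at that call site (the `WARN_ON_ONCE` behaviour), and the buffer is marked dirty regardless.

```c
/* Userspace model of the uptodate check in mark_buffer_dirty(); not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum {
    MODEL_BH_UPTODATE = 1u << 0,    /* buffer holds valid data */
    MODEL_BH_DIRTY    = 1u << 1     /* buffer needs writing back */
};

struct model_bh {
    unsigned state;
};

static int warnings_emitted;        /* how many times the warning was printed */

static bool model_buffer_uptodate(const struct model_bh *bh)
{
    return (bh->state & MODEL_BH_UPTODATE) != 0;
}

static void model_mark_buffer_dirty(struct model_bh *bh)
{
    static bool warned;             /* one flag per call site, as with WARN_ON_ONCE */

    if (!model_buffer_uptodate(bh) && !warned) {
        warned = true;
        warnings_emitted++;
        fprintf(stderr, "WARNING: dirtying a buffer that is not uptodate\n");
    }
    bh->state |= MODEL_BH_DIRTY;    /* the buffer is dirtied either way */
}

#ifdef MODEL_DEMO_MAIN  /* build with -DMODEL_DEMO_MAIN to run standalone */
int main(void)
{
    struct model_bh sb = { 0 };     /* superblock buffer that lost its uptodate bit */

    model_mark_buffer_dirty(&sb);   /* warns */
    model_mark_buffer_dirty(&sb);   /* silent: already warned once */
    assert(warnings_emitted == 1);
    printf("warnings emitted: %d\n", warnings_emitted);
    return 0;
}
#endif
```

Because the report is once-only, a single trace in the logs does not tell you how many times the condition was actually hit.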

Comment 6 Stefan Assmann 2010-05-14 06:51:21 UTC

Hi Eric,

Sorry, I didn't see that there are both sdb and sdc. I'm using only one external drive, so my guess at what happened is this: I unmounted (and unplugged) sdb1, which caused the first "EXT3-fs error (device sdb1): ext3_find_entry:" message. I didn't really notice it at the time. I replugged the device later and it appeared as sdc. Later I unmounted it again, and approximately 1 minute after that I got the abrt notification.

No other messages regarding sd* in the log except lots of these
sd 4:0:0:0: [sdb] Sense Key : Recovered Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00 
        00 4f 00 c2 e0 50
messages after I attach the device.

Comment 7 Eric Sandeen 2010-05-14 15:41:42 UTC

Are you certain that you got the error -after- you unmounted?  It doesn't make any sense to get an ext3_find_entry message on an unmounted filesystem; nothing could get to that point in the code...

Is there any chance there was user error here? :)

Comment 8 RHEL Program Management 2010-07-15 14:39:21 UTC

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 9 Eric Sandeen 2010-07-15 16:53:28 UTC

Ok, this looks similar to Bug 614206 - WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x82/0xa0() after device removal

It seems that after the sequence IO error -> ext3 error -> superblock write, we sometimes end up with a superblock buffer that is not uptodate. I'll have to look into this one.

It's not super-critical because it's only on error paths...
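
The sequence described in this comment can be sketched as a userspace model. The assumptions are stated loudly: the model takes it that a failed buffer write clears the uptodate bit and records a write error on the buffer, which matches my reading of the write completion path in kernels of this era. The `guard` branch shows one plausible way to avoid the warning (treat the in-memory superblock as still valid and retry). Whether the eventual fix for this bug works that way is not confirmed anywhere in this report, and all names below are hypothetical.

```c
/* Userspace model of: IO error -> superblock write fails -> redirty warns. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct sb_buffer {
    bool uptodate;          /* buffer contents are considered valid */
    bool write_io_error;    /* a previous write of this buffer failed */
    bool dirty;             /* buffer is queued for writeback */
};

/* Write completion (assumed behaviour): failure clears uptodate. */
static void model_end_write(struct sb_buffer *bh, bool ok)
{
    if (ok) {
        bh->uptodate = true;
    } else {
        bh->write_io_error = true;
        bh->uptodate = false;
    }
}

/* Redirty the superblock buffer, as done at unmount time.
 * Returns true if the uptodate check would have warned. */
static bool model_commit_super(struct sb_buffer *bh, bool guard)
{
    if (guard && bh->write_io_error) {
        /* Hypothetical guard: the in-memory copy is still good, so
         * forget the old failure and allow the write to be retried. */
        bh->write_io_error = false;
        bh->uptodate = true;
    }

    bool would_warn = !bh->uptodate;
    bh->dirty = true;
    return would_warn;
}

#ifdef SB_DEMO_MAIN  /* build with -DSB_DEMO_MAIN to run standalone */
int main(void)
{
    struct sb_buffer a = { .uptodate = true };
    struct sb_buffer b = { .uptodate = true };

    model_end_write(&a, false);     /* device yanked: superblock write fails */
    model_end_write(&b, false);
    printf("without guard: warn=%d\n", model_commit_super(&a, false));
    printf("with guard:    warn=%d\n", model_commit_super(&b, true));
    return 0;
}
#endif
```

In this model the warning depends only on an earlier failed write of the same buffer, which fits the observation that it shows up on error paths such as device removal.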

Comment 10 Eric Sandeen 2010-07-15 16:54:33 UTC

*** Bug 614206 has been marked as a duplicate of this bug. ***

Comment 12 RHEL Program Management 2011-01-07 04:27:48 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 13 Suzanne Logcher 2011-01-07 16:17:07 UTC

This request was erroneously denied for the current release of Red Hat
Enterprise Linux.  The error has been fixed and this request has been
re-proposed for the current release.

Comment 14 Edward Shishkin 2011-01-20 16:29:03 UTC

There was an upstream commit for this issue:
dff6825e9fde93891e60751e01480337a991235e

I've sent a backport of this commit as well as a backport of dependent commit
4cf46b67eb6de94532c1bea11d2479d085229d0e to rhkernel-list

Edward.

Comment 15 RHEL Program Management 2011-01-26 19:31:02 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 16 Eric Sandeen 2011-01-26 20:17:37 UTC

*** Bug 640419 has been marked as a duplicate of this bug. ***

Comment 17 Aristeu Rozanski 2011-02-03 16:46:34 UTC

Patch(es) available on kernel-2.6.32-112.el6

Comment 24 Eric Sandeen 2011-03-04 17:28:45 UTC

*** Bug 682209 has been marked as a duplicate of this bug. ***

Comment 25 Pat McDonough 2011-03-10 17:57:40 UTC

Package: kernel
Architecture: x86_64
OS Release: Red Hat Enterprise Linux Server release 6.0 (Santiago)


How to reproduce
-----
Not sure about this exact sequence, but something like the following:

1. Plug-in a USB storage device with an Ext2 partition (this device also has an Ext4 partition)
2. Hibernate the machine
3. Wake the machine up
4. Greetings from ABRT

Comment
-----
I've got some other funky behavior going on as well, so I have the following /proc/cmdline (note the intel_iommu=igfx_off parameter):

[pmcdonou@patredhat /]$ cat /proc/cmdline 
ro root=/dev/mapper/vg_thinkpad--01-lv_root rd_LVM_LV=vg_thinkpad-01/lv_root rd_LVM_LV=vg_thinkpad-01/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us crashkernel=128M rhgb quiet intel_iommu=igfx_off

Comment 26 Eric Sandeen 2011-03-16 17:15:29 UTC

Bug for ext2 seems to be bug #679930

Comment 27 Ben 2011-03-16 20:05:47 UTC

Comment #26: Yes, at least in IBM's opinion.

Comment 28 IBM Bug Proxy 2011-03-26 14:45:41 UTC

------- Comment From djwong.com 2011-02-03 12:41 EDT-------
There's some confusion going on in LTC64882 as to whether or not this bug (LTC65071) is a duplicate.

To clarify, 65071 is a fix for the badness warning when ext3 produces it.  64882 is a fix for the badness warning when ext2 produces it.  These two bugs are not the same, they are not duplicates of each other, and I hope this is true on the RH end too, though I don't know how to confirm that.  In the past, the two LTC bugs were marked as duplicates of each other, but this should no longer be the case.

Comment 29 IBM Bug Proxy 2011-03-26 14:45:51 UTC

Created attachment 487779 [details]
sosreport

Comment 30 IBM Bug Proxy 2011-03-26 14:45:57 UTC

Created attachment 487780 [details]
patch against rc-1 (2.6.32-71.el6)

Comment 31 Eric Sandeen 2011-03-28 17:02:39 UTC

RH Bug 679930 for ext2 is still valid & open, it will be fixed.

The patch attached in comment #30 for ext3 is already integrated into RHEL6, see comment #17.

Comment 32 IBM Bug Proxy 2011-03-29 10:41:41 UTC

------- Comment From afox.com 2011-03-29 06:39 EDT-------
fix verified with kernel-2.6.32-118.el6.s390x

Comment 33 Red Hat Case Diagnostics 2011-04-01 07:06:28 UTC

Created attachment 489309 [details]
File: backtrace

Comment 35 Eryu Guan 2011-04-06 12:19:41 UTC

Reproduced on the -71 kernel by unmounting a USB disk that had already been physically removed after some buffered IO (though the issue is not easy to hit).

dd if=/dev/zero of=/mnt/testfile bs=1M count=100
# unplug USB disk
umount /mnt

usb 1-4: USB disconnect, address 10
------------[ cut here ]------------
WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x6a/0x80() (Not tainted)
Hardware name: 26681BC
Modules linked in: ext3 jbd usb_storage xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev thinkpad_acpi hwmon rfkill parport_pc parport ipw2200 libipw lib80211 sg i2c_i801 iTCO_wdt iTCO_vendor_support snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc tg3 ext4 mbcache jbd2 video output yenta_socket rsrc_nonstatic sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_generic ahci ata_piix radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: microcode]
Pid: 3089, comm: umount Not tainted 2.6.32-71.el6.i686 #1
Call Trace:
 [<c04501c1>] ? warn_slowpath_common+0x81/0xc0
 [<c054719a>] ? mark_buffer_dirty+0x6a/0x80
 [<c054719a>] ? mark_buffer_dirty+0x6a/0x80
 [<c045021b>] ? warn_slowpath_null+0x1b/0x20
 [<c054719a>] ? mark_buffer_dirty+0x6a/0x80
 [<f98b48fd>] ? ext3_put_super+0x15d/0x230 [ext3]
 [<c053469d>] ? invalidate_inodes+0xbd/0x120
 [<c05647c0>] ? vfs_quota_off+0x0/0x10
 [<c051fee5>] ? generic_shutdown_super+0x45/0xc0
 [<c05150b6>] ? free_percpu+0x66/0x100
 [<c051ff82>] ? kill_block_super+0x22/0x40
 [<c0520d0b>] ? deactivate_super+0x5b/0x90
 [<c0537d8a>] ? sys_umount+0x6a/0x350
 [<c0538087>] ? sys_oldumount+0x17/0x20
 [<c04099fb>] ? sysenter_do_call+0x12/0x28
---[ end trace 2f8381ad354ffc4c ]---

On the -128 kernel I tried about 30 times and saw no such message. Setting to VERIFIED.

Comment 36 errata-xmlrpc 2011-05-23 20:21:31 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html