Bug 680105

Summary:	[ext4/xfstests] kernel BUG at fs/jbd2/transaction.c:1027!
Product:	Red Hat Enterprise Linux 6
Component:	kernel
Version:	6.2
Hardware:	All
OS:	Unspecified
Status:	CLOSED ERRATA
Severity:	high
Priority:	unspecified
Target Milestone:	rc
Keywords:	Regression
Reporter:	Boris Ranto <branto>
Assignee:	Lukáš Czerner <lczerner>
QA Contact:	Eryu Guan <eguan>
CC:	eguan, esandeen, lczerner, rwheeler, syeghiay
Fixed In Version:	kernel-2.6.32-130.el6
Doc Type:	Bug Fix
Last Closed:	2011-05-19 12:43:47 UTC

Description Boris Ranto 2011-02-24 11:17:29 UTC

Description of problem:
When running xfstests in Beaker, I'm seeing a kernel panic on the ppc64 platform with an ext4 filesystem. I didn't manage to reproduce the problem manually outside Beaker, but at least Beaker was able to provide the call trace (see Additional info). This is most probably a regression.

Version-Release number of selected component (if applicable):
2.6.32-117.el6.ppc64

How reproducible:
Not sure; it happens fairly regularly in Beaker.

Steps to Reproduce:
1. Clone job J:56079 in beaker
2. Watch the results for ext4
  
Actual results:
Kernel panic due to 'kernel BUG at fs/jbd2/transaction.c:1027!'

Expected results:
No panic.

Additional info:
The last test that Beaker recorded was test no. 233, so the problem should come from one of the tests 234-248 (most probably 234).
Related beaker jobs/recipes:
https://beaker.engineering.redhat.com/recipes/109908
https://beaker.engineering.redhat.com/recipes/112445

The call traces are the same in both recipes (but from different machines):
------------[ cut here ]------------ 
kernel BUG at fs/jbd2/transaction.c:1027! 
Oops: Exception in kernel mode, sig: 5 [#1] 
SMP NR_CPUS=1024 NUMA pSeries 
Modules linked in: ext3 jbd ext2 sunrpc ipv6 dm_mirror dm_region_hash dm_log ibmveth sg ext4 jbd2 mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt dm_mod [last unloaded: scsi_wait_scan] 
NIP: d000000002260d00 LR: d0000000023d6dbc CTR: d000000002260c70 
REGS: c0000000a973f3f0 TRAP: 0700   Not tainted  (2.6.32-117.el6.ppc64) 
MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24008482  XER: 20000000 
TASK = c0000000a9b84da0[19290] 'setquota' THREAD: c0000000a973c000 CPU: 3 
GPR00: 0000000000000001 c0000000a973f670 d00000000227de30 c0000000ad5960a0  
GPR04: c000000013f80ae0 0000000000000000 c000000013f80ae0 0000000000000001  
GPR08: c0000000afff1f00 c0000000a81df080 0000000000000000 0000000000000000  
GPR12: d0000000023ecb80 c000000000fa2c80 0000000000000000 0000000000000000  
GPR16: d0000000023efef0 c000000013f80ae0 c0000000ad89f4d0 0000000000000008  
GPR20: 0000000000000018 c0000000a973f820 c00000004aee1180 c0000000ad89f418  
GPR24: 0000000000000018 0000000000000004 0000000000000000 c000000074a70ac0  
GPR28: c0000000ad5960a0 c0000000a8163b00 d000000002406688 c000000013f80ae0  
NIP [d000000002260d00] .jbd2_journal_dirty_metadata+0x90/0x1c0 [jbd2] 
LR [d0000000023d6dbc] .__ext4_handle_dirty_metadata+0xac/0x170 [ext4] 
Call Trace: 
[c0000000a973f670] [c0000000a973f710] 0xc0000000a973f710 (unreliable) 
[c0000000a973f710] [d0000000023d6dbc] .__ext4_handle_dirty_metadata+0xac/0x170 [ext4] 
[c0000000a973f7b0] [d0000000023c556c] .ext4_quota_write+0x18c/0x300 [ext4] 
[c0000000a973f8c0] [c00000000023280c] .v2_write_file_info+0x13c/0x1a0 
[c0000000a973f990] [c00000000022d4bc] .dquot_commit+0x22c/0x250 
[c0000000a973fa30] [d0000000023ca8dc] .ext4_write_dquot+0x6c/0xc0 [ext4] 
[c0000000a973fac0] [c00000000022ff60] .dqput+0x100/0x390 
[c0000000a973fb90] [c000000000231390] .vfs_set_dqblk+0x240/0x430 
[c0000000a973fc40] [c0000000002358d0] .do_quotactl+0x450/0x6a0 
[c0000000a973fd70] [c000000000235dbc] .SyS_quotactl+0x29c/0x4d0 
[c0000000a973fe30] [c000000000008564] syscall_exit+0x0/0x40 
Instruction dump: 
796a57e3 40c200c8 801b0010 2f800000 409e002c 38000001 901b0010 e97c000a  
380bffff 7c005b78 54000ffe 7c0007b4 <0b000000> 396bffff 917c0008 e81b0028  
Kernel panic - not syncing: Fatal exception 
Call Trace: 
[c0000000a973efd0] [c000000000012e04] .show_stack+0x74/0x1c0 (unreliable) 
[c0000000a973f080] [c0000000005a335c] .panic+0x80/0x1b4 
[c0000000a973f110] [c00000000002fbcc] .die+0x21c/0x2a0 
[c0000000a973f1c0] [c000000000030000] ._exception+0x110/0x220 
[c0000000a973f380] [c000000000004b9c] program_check_common+0x11c/0x180 
--- Exception: 700 at .jbd2_journal_dirty_metadata+0x90/0x1c0 [jbd2] 
    LR = .__ext4_handle_dirty_metadata+0xac/0x170 [ext4] 
[c0000000a973f670] [c0000000a973f710] 0xc0000000a973f710 (unreliable) 
[c0000000a973f710] [d0000000023d6dbc] .__ext4_handle_dirty_metadata+0xac/0x170 [ext4] 
[c0000000a973f7b0] [d0000000023c556c] .ext4_quota_write+0x18c/0x300 [ext4] 
[c0000000a973f8c0] [c00000000023280c] .v2_write_file_info+0x13c/0x1a0 
[c0000000a973f990] [c00000000022d4bc] .dquot_commit+0x22c/0x250 
[c0000000a973fa30] [d0000000023ca8dc] .ext4_write_dquot+0x6c/0xc0 [ext4] 
[c0000000a973fac0] [c00000000022ff60] .dqput+0x100/0x390 
[c0000000a973fb90] [c000000000231390] .vfs_set_dqblk+0x240/0x430 
[c0000000a973fc40] [c0000000002358d0] .do_quotactl+0x450/0x6a0 
[c0000000a973fd70] [c000000000235dbc] .SyS_quotactl+0x29c/0x4d0 
[c0000000a973fe30] [c000000000008564] syscall_exit+0x0/0x40

Comment 3 Boris Ranto 2011-02-24 11:47:38 UTC

I've finally managed to reproduce the problem manually by running 'while true;do ./check 234;done' for about an hour. Therefore, test no. 234 causes the panic.

Comment 4 Eric Sandeen 2011-02-24 15:12:32 UTC

1020         if (jh->b_modified == 0) {
1021                 /*
1022                  * This buffer's got modified and becoming part
1023                  * of the transaction. This needs to be done
1024                  * once a transaction -bzzz
1025                  */
1026                 jh->b_modified = 1;
1027                 J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
1028                 handle->h_buffer_credits--;
1029         }

test 234 does quota work...

# FS QA Test No. 234
#
# Stress setquota and setinfo handling.


and:

        /* Number of remaining buffers we are allowed to dirty: */
        int                     h_buffer_credits;

sounds like perhaps we under-reserved for the quota metadata...
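
To make the suspected failure mode concrete, here is a minimal user-space sketch of the credit accounting quoted above. It is not the jbd2 code: `model_handle`, `model_journal_head`, `model_start`, and `model_dirty_metadata` are hypothetical names introduced only for illustration, and only the two fields quoted in this bug (`h_buffer_credits`, `b_modified`) are modeled. The function returns false at the point where the kernel's J_ASSERT_JH would fire.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the jbd2 structures; only the
 * fields quoted in this bug report are modeled. */
struct model_handle {
    int h_buffer_credits;   /* remaining buffers we are allowed to dirty */
};

struct model_journal_head {
    int b_modified;         /* set on first dirtying within a transaction */
};

/* Start a handle with a fixed credit reservation (assumed interface). */
static struct model_handle model_start(int credits)
{
    struct model_handle h = { .h_buffer_credits = credits };
    return h;
}

/*
 * Mirrors the quoted check: the first time a buffer is dirtied in a
 * transaction it costs one credit; dirtying it again is free.
 * Returns false where the kernel assertion at transaction.c:1027 would
 * trigger (no credits left), true otherwise.
 */
static bool model_dirty_metadata(struct model_handle *handle,
                                 struct model_journal_head *jh)
{
    if (jh->b_modified == 0) {
        jh->b_modified = 1;
        if (handle->h_buffer_credits <= 0)
            return false;   /* under-reserved: kernel would BUG here */
        handle->h_buffer_credits--;
    }
    return true;
}
```

With a reservation of one credit, dirtying one buffer succeeds and dirtying it again costs nothing, but dirtying a second, previously unmodified buffer in the same handle returns false. That is the situation the under-reservation theory describes: the quota write path touches one more metadata buffer than the handle was sized for.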

Comment 5 Eryu Guan 2011-03-18 06:25:36 UTC

*** Bug 688817 has been marked as a duplicate of this bug. ***

Comment 6 Eryu Guan 2011-03-18 06:29:06 UTC

I saw this on x86_64 and i386 too; please see bug 688817. Changing the platform to All.

Comment 9 RHEL Program Management 2011-03-29 09:59:40 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 12 Aristeu Rozanski 2011-04-07 13:52:51 UTC

Patch(es) available on kernel-2.6.32-130.el6

Comment 15 Eryu Guan 2011-04-11 07:29:43 UTC

Ran xfstests 234 in a loop for more than 1 hour on the -130 kernel; no issues found.
Tested on x86_64, i386, and s390x.

Setting it to VERIFIED.

Comment 16 errata-xmlrpc 2011-05-19 12:43:47 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html