Bug 2220985

Summary: BUG_ON "kernel BUG at fs/inode.c:518" with nonzero inode->i_data.nrpages in clear_inode()
Product: Red Hat Enterprise Linux 8
Reporter: Frank Sorenson <fsorenso>
Component: kernel
Assignee: fs-maint Bot <fs-maint>
kernel sub component: VFS
QA Contact: Kun Wang <kunwan>
Status: CLOSED INSUFFICIENT_DATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: aquini, dhowells, ikent, llong, mszeredi, swhiteho, xzhou
Version: 8.4
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-07-18 17:28:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Frank Sorenson 2023-07-06 20:11:48 UTC
Description of problem:

System crashed on the BUG_ON() in clear_inode() because the mapping's nrpages counter was nonzero, even though its page tree (i_pages) was empty:
[8004333.840863] kernel BUG at fs/inode.c:518!

PID: 1        TASK: ffff973100c19ec0  CPU: 66   COMMAND: "systemd"
    [exception RIP: clear_inode+0x81]
 #7 [ffffbcd5c0037e20] evict at ffffffff93f36e6b
 #8 [ffffbcd5c0037e40] __dentry_kill at ffffffff93f32975
 #9 [ffffbcd5c0037e60] dentry_kill at ffffffff93f335cd
#10 [ffffbcd5c0037e88] dput at ffffffff93f337e9
#11 [ffffbcd5c0037ea0] __fput at ffffffff93f1a3af
#12 [ffffbcd5c0037ee8] task_work_run at ffffffff93d01eaa
#13 [ffffbcd5c0037f20] exit_to_usermode_loop at ffffffff93c03bbb
#14 [ffffbcd5c0037f38] do_syscall_64 at ffffffff93c04348
#15 [ffffbcd5c0037f50] entry_SYSCALL_64_after_hwframe at ffffffff946000ad

void clear_inode(struct inode *inode)
...
        xa_lock_irq(&inode->i_data.i_pages);
        BUG_ON(inode->i_data.nrpages);    <<<< location of crash

called from evict():
         if (op->evict_inode) {
                op->evict_inode(inode);
        } else {
                truncate_inode_pages_final(&inode->i_data);
                clear_inode(inode);

nrpages is indeed nonzero; however, the page tree is empty:

crash> inode.i_data ffff9730011f30b0 -ox
struct inode {
  [ffff9730011f3228] struct address_space i_data;
}
crash> address_space.nrpages ffff9730011f3228
  nrpages = 32,


crash> address_space.i_pages ffff9730011f3228
  i_pages = {
    xa_lock = {
        rlock = {
          raw_lock = {
              val = {
                counter = 1        <<<--------- locked on our codepath
...
    xa_flags = 33, ---> XA_FLAGS_LOCK_IRQ|XA_FLAGS_ACCOUNT
    xa_head = 0x0,     <<<----------- empty tree !
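
The mismatch above (nrpages = 32 while xa_head = 0x0) can be stated as a small userspace model. This is only a sketch for reasoning about the vmcore, not kernel code: struct toy_mapping, toy_classify() and toy_would_bug() are hypothetical names, and the only thing taken from the report is that clear_inode() tests nrpages and that an xa_head of 0x0 means an empty tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two address_space fields examined above. */
struct toy_mapping {
    void *xa_head;          /* NULL models an empty page tree (xa_head = 0x0) */
    unsigned long nrpages;  /* page counter kept alongside the tree */
};

enum toy_verdict {
    TOY_EMPTY_OK,      /* empty tree, zero counter: the check would pass */
    TOY_POPULATED_OK,  /* non-empty tree, nonzero counter */
    TOY_STALE_COUNT,   /* empty tree, nonzero counter: the state in this vmcore */
    TOY_LOST_COUNT     /* non-empty tree, zero counter */
};

/* Classify how the counter and the tree relate to each other. */
enum toy_verdict toy_classify(const struct toy_mapping *m)
{
    assert(m != NULL);
    if (m->xa_head == NULL)
        return m->nrpages == 0 ? TOY_EMPTY_OK : TOY_STALE_COUNT;
    return m->nrpages == 0 ? TOY_LOST_COUNT : TOY_POPULATED_OK;
}

/* Same condition as BUG_ON(inode->i_data.nrpages): only the counter is tested. */
bool toy_would_bug(const struct toy_mapping *m)
{
    assert(m != NULL);
    return m->nrpages != 0;
}
```

Filling the model with the values from this vmcore (xa_head NULL, nrpages 32) gives TOY_STALE_COUNT, and toy_would_bug() is true. Since only the counter is tested, emptying the tree does not avoid the BUG if the counter is left stale.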


crash> inode.i_fop,i_sb,i_state ffff9730011f30b0
  i_fop = 0xffffffff94a3c740 <pipefifo_fops>,
  i_sb = 0xffff975eaf23c800,
  i_state = 0x27,  -------------> I_DIRTY_SYNC|I_DIRTY_DATASYNC|I_DIRTY_PAGES|I_FREEING

crash> super_block.s_id 0xffff975eaf23c800
  s_id = "pipefs",
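
The xa_flags and i_state annotations above were decoded by hand. A minimal userspace sketch of the same decoding follows. The bit values in the two tables are assumptions chosen to match the annotations in this report (33 and 0x27) and should be checked against the headers of the kernel being debugged; struct flag_name and decode_flags() are hypothetical helper names, not part of crash or the kernel.

```c
#include <stddef.h>
#include <stdio.h>

struct flag_name {
    unsigned long bit;
    const char *name;
};

/* Assumed values, consistent with "xa_flags = 33" as annotated above. */
const struct flag_name xa_flag_names[] = {
    { 0x01, "XA_FLAGS_LOCK_IRQ" },
    { 0x20, "XA_FLAGS_ACCOUNT" },
};

/* Assumed values, consistent with "i_state = 0x27" as annotated above. */
const struct flag_name i_state_names[] = {
    { 0x01, "I_DIRTY_SYNC" },
    { 0x02, "I_DIRTY_DATASYNC" },
    { 0x04, "I_DIRTY_PAGES" },
    { 0x20, "I_FREEING" },
};

/*
 * Write the names of the bits set in value into buf as "A|B|C".
 * Returns the bits that were not printed, either because the table
 * has no name for them or because buf ran out of space.
 */
unsigned long decode_flags(unsigned long value,
                           const struct flag_name *table, size_t n,
                           char *buf, size_t buflen)
{
    size_t used = 0;

    if (buflen > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (!(value & table[i].bit))
            continue;
        int w = snprintf(buf + used, buflen - used, "%s%s",
                         used ? "|" : "", table[i].name);
        if (w < 0 || (size_t)w >= buflen - used)
            break;  /* output truncated; stop and report the rest as leftover */
        used += (size_t)w;
        value &= ~table[i].bit;
    }
    return value;
}
```

Calling decode_flags(0x27, i_state_names, 4, buf, sizeof buf) fills buf with the four names and returns 0. A nonzero return value means some bits are unknown to the table, which is a hint that the table does not match the kernel in question.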



Version-Release number of selected component (if applicable):

kernel-4.18.0-305.57.1.el8_4


How reproducible:

Unknown; crash experienced once thus far


Steps to Reproduce:

unknown


Actual results:

kernel crash


Expected results:

no crash


Additional info:

Comment 5 Eric Sandeen 2023-07-14 15:15:52 UTC
Reassigning this open bug to fs-maint.redhat.com as the old fs-maint mailing list has been deprecated.

Comment 6 Frank Sorenson 2023-07-18 17:28:57 UTC
closing this for now...  if they or someone else hits this in the future, we'll know it's not the first, and perhaps there will be more information

Comment 7 Ian Kent 2023-07-30 00:57:23 UTC
(In reply to Frank Sorenson from comment #6)
> closing this for now...  if they or someone else hits this in the future,
> we'll know it's not the first, and perhaps there will be more information

If we see this again we will need to include memory management folks on the
cc.

It looks like this could be related to the relatively new page folios
implementation.