Bug 2220985 - BUG_ON "kernel bug at fs/inode.c:518" with nonzero inode->i_data.i_pages in clear_inode()
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: fs-maint Bot
QA Contact: Kun Wang
 
Reported: 2023-07-06 20:11 UTC by Frank Sorenson
Modified: 2023-07-30 01:00 UTC
CC: 7 users

Last Closed: 2023-07-18 17:28:57 UTC
Type: Bug




Links:
Red Hat Issue Tracker RHELPLAN-161711 (last updated 2023-07-06 20:12:45 UTC)

Description Frank Sorenson 2023-07-06 20:11:48 UTC
Description of problem:

System crashed with a BUG due to a nonzero page count (nrpages), despite the page tree being empty:
[8004333.840863] kernel BUG at fs/inode.c:518!

PID: 1        TASK: ffff973100c19ec0  CPU: 66   COMMAND: "systemd"
    [exception RIP: clear_inode+0x81]
 #7 [ffffbcd5c0037e20] evict at ffffffff93f36e6b
 #8 [ffffbcd5c0037e40] __dentry_kill at ffffffff93f32975
 #9 [ffffbcd5c0037e60] dentry_kill at ffffffff93f335cd
#10 [ffffbcd5c0037e88] dput at ffffffff93f337e9
#11 [ffffbcd5c0037ea0] __fput at ffffffff93f1a3af
#12 [ffffbcd5c0037ee8] task_work_run at ffffffff93d01eaa
#13 [ffffbcd5c0037f20] exit_to_usermode_loop at ffffffff93c03bbb
#14 [ffffbcd5c0037f38] do_syscall_64 at ffffffff93c04348
#15 [ffffbcd5c0037f50] entry_SYSCALL_64_after_hwframe at ffffffff946000ad

void clear_inode(struct inode *inode)
...
        xa_lock_irq(&inode->i_data.i_pages);
        BUG_ON(inode->i_data.nrpages);    <<<< location of crash

called from evict():
         if (op->evict_inode) {
                op->evict_inode(inode);
        } else {
                truncate_inode_pages_final(&inode->i_data);
                clear_inode(inode);
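
For reference, the fuller clear_inode() in upstream kernels of that era looks
roughly like this (a simplified sketch, not the exact RHEL 8.4 backport; some
checks are omitted and the exact line numbers may differ):

void clear_inode(struct inode *inode)
{
        /*
         * Cycle the i_pages lock so that a racing
         * __delete_from_page_cache() can finish before the mapping is freed.
         */
        xa_lock_irq(&inode->i_data.i_pages);
        BUG_ON(inode->i_data.nrpages);    <<<< fs/inode.c:518 in this kernel
        xa_unlock_irq(&inode->i_data.i_pages);
        BUG_ON(!list_empty(&inode->i_data.private_list));
        BUG_ON(!(inode->i_state & I_FREEING));
        BUG_ON(inode->i_state & I_CLEAR);
        inode->i_state = I_FREEING | I_CLEAR;
}

In the else branch quoted above, truncate_inode_pages_final() is expected to
leave both nrpages and the xarray at zero before clear_inode() runs, so a
nonzero nrpages alongside an empty xa_head suggests the counter got out of
sync with the tree rather than pages actually being left behind.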

nrpages is indeed nonzero; however, the page tree is empty:

crash> inode.i_data ffff9730011f30b0 -ox
struct inode {
  [ffff9730011f3228] struct address_space i_data;
}
crash> address_space.nrpages ffff9730011f3228
  nrpages = 32,


crash> address_space.i_pages ffff9730011f3228
  i_pages = {
    xa_lock = {
        rlock = {
          raw_lock = {
              val = {
                counter = 1        <<<--------- locked on our codepath
...
    xa_flags = 33, ---> XA_FLAGS_LOCK_IRQ|XA_FLAGS_ACCOUNT
    xa_head = 0x0,     <<<----------- empty tree !
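
(Here "empty tree" means what xa_empty() checks in upstream
include/linux/xarray.h, quoted for reference:

static inline bool xa_empty(const struct xarray *xa)
{
        return xa->xa_head == NULL;
}

so xa_head = 0x0 means the xarray holds no entries at all, even though
nrpages still claims 32.)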


crash> inode.i_fop,i_sb,i_state ffff9730011f30b0
  i_fop = 0xffffffff94a3c740 <pipefifo_fops>,
  i_sb = 0xffff975eaf23c800,
  i_state = 0x27,  -------------> I_DIRTY_SYNC|I_DIRTY_DATASYNC|I_DIRTY_PAGES|I_FREEING

crash> super_block.s_id 0xffff975eaf23c800
  s_id = "pipefs",



Version-Release number of selected component (if applicable):

kernel-4.18.0-305.57.1.el8_4


How reproducible:

Unknown; crash experienced once thus far


Steps to Reproduce:

Unknown


Actual results:

kernel crash


Expected results:

no crash


Additional info:

Comment 5 Eric Sandeen 2023-07-14 15:15:52 UTC
Reassigning this open bug to fs-maint.redhat.com as the old fs-maint mailing list has been deprecated.

Comment 6 Frank Sorenson 2023-07-18 17:28:57 UTC
closing this for now...  if they or someone else hits this in the future, we'll know it's not the first, and perhaps there will be more information

Comment 7 Ian Kent 2023-07-30 00:57:23 UTC
(In reply to Frank Sorenson from comment #6)
> closing this for now...  if they or someone else hits this in the future,
> we'll know it's not the first, and perhaps there will be more information

If we see this again we will need to include memory management folks on the
cc.

It looks like this could be related to the relatively new page folios
implementation.

