Bug 667707 - [xfs] Stress testing with swap usage resulted in unresponsive processes
Summary: [xfs] Stress testing with swap usage resulted in unresponsive processes
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: x86_64
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dave Chinner
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 640580
 
Reported: 2011-01-06 15:18 UTC by Boris Ranto
Modified: 2017-04-04 20:44 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-04 20:44:33 UTC
Target Upstream Version:
Embargoed:


Attachments
Few calltraces from the machine (12.73 KB, text/plain)
2011-01-07 16:05 UTC, Boris Ranto


Links
Red Hat Bugzilla 666477 (medium, CLOSED): test XFS/FUSE heavy writeback workloads (also swapping too). Last updated: 2021-02-22 00:41:40 UTC

Internal Links: 666477

Description Boris Ranto 2011-01-06 15:18:24 UTC
Description of problem:
Based on the testing suggested in bug 666477 for the xfs filesystem, with memory filled by memhog and several (10) dd's writing to the filesystem, the dd's and memhog get stuck after a while (within an hour) and become unresponsive.

Version-Release number of selected component (if applicable):
kernel-2.6.18-238.el5

How reproducible:
I have only tried twice, but I managed to reproduce the issue both times.

Steps to Reproduce:
1. Create ~2 TB sparse file
dd if=/dev/zero of=xfs.img count=0 bs=4096 seek=500M
2. Start running memhog in a while loop so that memory (and swap) gets completely filled
while true; do memhog 11g; done
# Machine I used had 8 GB ram and 10 GB swap
3. Make an xfs filesystem on the sparse file, mount it and change into the mount point
mkfs.xfs xfs.img && mount -o loop xfs.img /xfs && cd /xfs
4. Run several (in my case 10) dd's in the background
for i in $(seq 1 10); do dd if=/dev/zero of=test$i bs=4096 count=2M & done
5. After a while (approximately in the middle of the writes) kill the dd's (I'm not sure this step is needed but both times it reproduced after I did this)
killall -9 dd
6. Run the dd's again (with the same command) so that the files get overwritten (repeat if necessary until the hang reproduces)
7. While the dd's are running, run ll -sh /xfs to see whether the test$i files are still growing
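
For convenience, the whole reproduction can be driven from one shell session; the following is only a sketch of the steps above (mount point, sizes and the timing of the kill are what I used and are not exact requirements):

# sketch of steps 1-7; assumes /xfs exists and xfs.img lives on the host (ext3) filesystem
dd if=/dev/zero of=xfs.img count=0 bs=4096 seek=500M         # ~2 TB sparse file
( while true; do memhog 11g; done ) &                        # keep RAM + swap full
mkfs.xfs xfs.img && mount -o loop xfs.img /xfs && cd /xfs
for i in $(seq 1 10); do dd if=/dev/zero of=test$i bs=4096 count=2M & done   # 10 writers, 8 GB each
sleep 1800; killall -9 dd                                    # optional (step 5): kill roughly mid-write
for i in $(seq 1 10); do dd if=/dev/zero of=test$i bs=4096 count=2M & done   # overwrite the files
watch ls -lsh /xfs                                           # sizes stop updating => hang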

Actual results:
After a while (within an hour), memhog gets stuck (no further output) and the dd's get stuck (the size of the test$i files no longer changes). Both the dd and memhog processes are unkillable (not even with the -9 option).

Expected results:
dd and memhog will finish successfully.

Additional info:
Machine: hp-bl685cg6-01.rhts.eng.bos.redhat.com

Comment 1 Qian Cai 2011-01-07 03:19:41 UTC
Boris, can you please generate sysrq-t output when this is happening?

Comment 2 Boris Ranto 2011-01-07 12:05:17 UTC
I've managed to reproduce it the same way on a different machine, so I assume it is not machine specific.
The problem is that after another while the machine becomes completely unresponsive (say you had an unused terminal open: it still accepts input, but that is all it can do; if you try to run any command, nothing happens and the command is unkillable, even with ctrl-c). That makes it quite difficult to generate sysrq-t: I can't enter any command, and I have no idea how to send the key combination remotely.
I'll try to catch the window between the memhog + dd freeze and the complete freeze the next time I reproduce the problem. If I catch it, I'll post the sysrq-t output here.

Comment 3 Boris Ranto 2011-01-07 16:05:05 UTC
Created attachment 472253 [details]
Few calltraces from the machine

I didn't manage to catch the window to run echo t > /proc/sysrq-trigger, but I can at least provide the call traces that showed up because of the hung_task_timeout.
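
For the record, lowering the hung-task timeout should make these traces appear sooner on the next attempt; a minimal sketch, assuming this kernel exposes the usual sysctl (the hung_task messages above suggest it does):

# report blocked tasks after 30 s instead of the default 120 s
echo 30 > /proc/sys/kernel/hung_task_timeout_secs
# keep draining the kernel log to a file so the traces survive a later freeze
while sleep 60; do dmesg -c >> hung-task-traces.txt; done &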

Comment 7 Ric Wheeler 2011-01-10 21:22:45 UTC
Eric, can you summarize where we are with this? thanks!

Comment 8 Eric Sandeen 2011-01-10 21:41:28 UTC
I don't see this as a blocker; I doubt that it is a regression (can that be tested?), and it is a bit of an odd use case.  How often will we badly stress a sparse loopback file containing xfs?

BTW, what filesystem hosts the sparse file?

Dave, do the backtraces speak to you at all?

Comment 9 Qian Cai 2011-01-11 00:34:19 UTC
It would be better to use a real XFS filesystem partition rather than a loopback device.

It might also be possible to generate sysrq-t and sysrq-m via conserver like this.

# echo 1 >/proc/sys/kernel/sysrq

Then, from the conserver serial console,
ctrl-e, c, l, 0, t

ctrl-e, c, l, 0, m
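
If the conserver escape sequence does not work, the same dumps can also be triggered from any shell that still responds and saved to a file so they survive a later freeze, for example:

echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger     # task dump, same as sysrq-t
echo m > /proc/sysrq-trigger     # memory info, same as sysrq-m
dmesg > sysrq-dump.txt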

Comment 10 Boris Ranto 2011-01-11 13:06:43 UTC
The sparse file is hosted on ext3 filesystem as installed by default on rhel5.

I reproduced it again and it seems that the step where the dd's are killed is not required for reproduction (although it might shorten the time needed). I also tried the conserver console commands but they didn't generate anything:
[root@nec-em19 ~]# cat /proc/sys/kernel/sysrq 
1
[root@nec-em19 ~]# [halt sent]
[halt sent]
t
-bash: t: command not found
[root@nec-em19 ~]# [halt sent]
m

Comment 11 Qian Cai 2011-01-11 13:42:01 UTC
> reproduction). I also tried the conserver console commands but they didn't
> generate anything:
> [root@nec-em19 ~]# cat /proc/sys/kernel/sysrq 
> 1
> [root@nec-em19 ~]# [halt sent]
> [halt sent]
> t
> -bash: t: command not found
> [root@nec-em19 ~]# [halt sent]
> m
It is unlucky that it was reproduced on one of those boxes that does not support sysrq via conserver.

Comment 12 Dave Chinner 2011-01-12 05:31:44 UTC
Looks familiar. Probably a different manifestation of the problem that the upstream commit below (which is in RHEL 6) fixes. In this case, it is writeback holding the ilock while waiting for metadata buffer IO completion, which can't occur because all the IO completion queues are blocked on the ilock held by writeback.

$ gl 77d7a0c2eeb285c9069e15396703d0cb9690ac50 -n 1
commit 77d7a0c2eeb285c9069e15396703d0cb9690ac50
Author: Dave Chinner <david>
Date:   Wed Feb 17 05:36:29 2010 +0000

    xfs: Non-blocking inode locking in IO completion
    
    The introduction of barriers to loop devices has created a new IO
    order completion dependency that XFS does not handle. The loop
    device implements barriers using fsync and so turns a log IO in the
    XFS filesystem on the loop device into a data IO in the backing
    filesystem. That is, the completion of log IOs in the loop
    filesystem are now dependent on completion of data IO in the backing
    filesystem.
    
    This can cause deadlocks when a flush daemon issues a log force with
    an inode locked because the IO completion of IO on the inode is
    blocked by the inode lock. This in turn prevents further data IO
    completion from occuring on all XFS filesystems on that CPU (due to
    the shared nature of the completion queues). This then prevents the
    log IO from completing because the log is waiting for data IO
    completion as well.
    
    The fix for this new completion order dependency issue is to make
    the IO completion inode locking non-blocking. If the inode lock
    can't be grabbed, simply requeue the IO completion back to the work
    queue so that it can be processed later. This prevents the
    completion queue from being blocked and allows data IO completion on
    other inodes to proceed, hence avoiding completion order dependent
    deadlocks.
    
    Signed-off-by: Dave Chinner <david>
    Reviewed-by: Christoph Hellwig <hch>
    Signed-off-by: Alex Elder <aelder>
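
One quick way to check whether a given kernel build already carries this commit is the package changelog; the exact changelog wording is a guess, so adjust the pattern as needed:

# look for the fix in the installed kernel's RPM changelog (search string is an assumption)
rpm -q --changelog kernel-$(uname -r) | grep -i "non-blocking inode locking"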

Comment 16 RHEL Program Management 2011-06-20 22:26:42 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 18 RHEL Program Management 2012-01-09 14:18:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.8, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 21 Chris Williams 2017-04-04 20:44:33 UTC
Red Hat Enterprise Linux 5 shipped its last minor release, 5.11, on September 14th, 2014. On March 31st, 2017, RHEL 5 exits Production Phase 3 and enters the Extended Life Phase. For RHEL releases in the Extended Life Phase, Red Hat will provide limited ongoing technical support. No bug fixes, security fixes, hardware enablement or root-cause analysis will be available during this phase, and support will be provided on existing installations only. If the customer purchases the Extended Life-cycle Support (ELS) Add-on, certain critical-impact security fixes and selected urgent-priority bug fixes for the last minor release will be provided. The specific support and services provided during each phase are described in detail at http://redhat.com/rhel/lifecycle

This BZ does not appear to meet ELS criteria, so it is being closed WONTFIX. If this BZ is critical for your environment and you have an Extended Life-cycle Support Add-on entitlement, please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification and ask that the BZ be re-opened for consideration of an errata. Please note, only certain critical-impact security fixes and selected urgent-priority bug fixes for the last minor release can be considered.

