RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1417257 - xfs hang task during shutdown once docker thinp is full
Summary: xfs hang task during shutdown once docker thinp is full
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.3
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: smahajan@redhat.com
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-01-27 17:19 UTC by Qian Cai
Modified: 2019-03-06 01:07 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-03 15:13:43 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Qian Cai 2017-01-27 17:19:21 UTC
Description of problem:
caiqian> [ 1554.834403] XFS (dm-4): xfs_log_force: error -5 returned.
<caiqian> [ 1560.557408] INFO: task dockerd-current:8562 blocked for more than 120 seconds.
<caiqian> [ 1560.558532] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<caiqian> [ 1560.559145] dockerd-current D ffff880228b53c90     0  8562      1 0x00000004
<caiqian> [ 1560.559732]  ffff88023029bd90 0000000000000082 ffff880033738000 ffff88023029bfd8
<caiqian> [ 1560.560365]  ffff88023029bfd8 ffff88023029bfd8 ffff880033738000 ffff880228b53c80
<caiqian> [ 1560.560979]  ffff88022c312d10 ffff880228b53cc0 ffff880228b53ce8 ffff880228b53c90
<caiqian> [ 1560.561597] Call Trace:
<caiqian> [ 1560.561798]  [<ffffffff8168b879>] schedule+0x29/0x70
<caiqian> [ 1560.562225]  [<ffffffffa02debb1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
<caiqian> [ 1560.562754]  [<ffffffff810b1720>] ? wake_up_atomic_t+0x30/0x30
<caiqian> [ 1560.563223]  [<ffffffffa02c7b61>] xfs_unmountfs+0x71/0x1c0 [xfs]
<caiqian> [ 1560.563709]  [<ffffffffa02c866b>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
<caiqian> [ 1560.564249]  [<ffffffffa02cacc2>] xfs_fs_put_super+0x32/0x90 [xfs]
<caiqian> [ 1560.564736]  [<ffffffff812007f2>] generic_shutdown_super+0x72/0xf0
<caiqian> [ 1560.565218]  [<ffffffff81200c37>] kill_block_super+0x27/0x70
<caiqian> [ 1560.565664]  [<ffffffff81200f79>] deactivate_locked_super+0x49/0x60
<caiqian> [ 1560.566153]  [<ffffffff81201576>] deactivate_super+0x46/0x60
<caiqian> [ 1560.566600]  [<ffffffff8121e9b5>] mntput_no_expire+0xc5/0x120
<caiqian> [ 1560.567056]  [<ffffffff8121faf0>] SyS_umount+0xa0/0x3b0
<caiqian> [ 1560.567471]  [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b
<caiqian> [ 1584.914502] XFS (dm-4): xfs_log_force: error -5 returned.
<caiqian> [ 1614.994477] XFS (dm-4): xfs_log_force: error -5 returned.

Vivek mentioned that we may need to backport --storage-opt dm.xfs_nospace_max_retries=1 to take care of it.

Version-Release number of selected component (if applicable):
docker-1.12.5-14.el7.x86_64

How reproducible:
once

Comment 1 Vivek Goyal 2017-01-27 17:32:48 UTC
Shishir, can you please backport following

https://github.com/docker/docker/pull/26212
https://github.com/docker/docker/pull/26918

Comment 2 smahajan@redhat.com 2017-01-27 17:59:49 UTC
https://github.com/projectatomic/docker/pull/221

Shishir

Comment 3 Daniel Walsh 2017-01-28 10:36:59 UTC
Fixed in docker-1.12.6-2.el7

Comment 4 Qian Cai 2017-02-27 22:05:25 UTC
Use --storage-opt dm.xfs_nospace_max_retries=0 works fine that I can stop the container and delete it after filled up the docker thinp.

I added --storage-opt dm.xfs_nospace_max_retries=1 but the kernel still hang during the shutdown.

root     26138  4.4  0.6 1121400 53768 ?       Ssl  16:27   0:06 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --selinux-enabled --log-driver=journald --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/dockervg-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.xfs_nospace_max_retries=1 --add-registry registry.access.redhat.com


[1884139.169446] device-mapper: thin: 252:3: reached low water mark for data device: sending event.
[1884139.174002] device-mapper: thin: 252:3: switching pool to out-of-data-space (queue IO) mode
[1884199.276966] device-mapper: thin: 252:3: switching pool to out-of-data-space (error IO) mode
[1884199.479266] Buffer I/O error on dev dm-5, logical block 250411, lost async page write
[1884199.481664] Buffer I/O error on dev dm-5, logical block 250412, lost async page write
[1884199.483015] Buffer I/O error on dev dm-5, logical block 250413, lost async page write
[1884199.484323] Buffer I/O error on dev dm-5, logical block 250414, lost async page write
[1884199.485571] Buffer I/O error on dev dm-5, logical block 250415, lost async page write
[1884199.486811] Buffer I/O error on dev dm-5, logical block 250416, lost async page write
[1884199.488040] Buffer I/O error on dev dm-5, logical block 250417, lost async page write
[1884199.489250] Buffer I/O error on dev dm-5, logical block 250418, lost async page write
[1884199.490471] Buffer I/O error on dev dm-5, logical block 250419, lost async page write
[1884199.491702] Buffer I/O error on dev dm-5, logical block 250420, lost async page write
[1884199.600667] XFS (dm-5): metadata I/O error: block 0x1400 ("xfs_buf_iodone_callback_error") error 28 numblks 32
[1884199.668986] XFS (dm-5): Failing async write on buffer block 0xb3dc08. Retrying async write.
[1884199.670522] XFS (dm-5): Failing async write on buffer block 0xb3dc10. Retrying async write.
[1884199.671849] XFS (dm-5): Failing async write on buffer block 0xb3dc01. Retrying async write.
[1884199.673141] XFS (dm-5): Failing async write on buffer block 0x13fc08. Retrying async write.
[1884199.674401] XFS (dm-5): Failing async write on buffer block 0x13fc10. Retrying async write.
[1884199.675714] XFS (dm-5): Failing async write on buffer block 0x13fc01. Retrying async write.
[1884199.676933] XFS (dm-5): Failing async write on buffer block 0x3bf408. Retrying async write.
[1884199.678180] XFS (dm-5): Failing async write on buffer block 0x3bf410. Retrying async write.
[1884199.679365] XFS (dm-5): Failing async write on buffer block 0x3bf401. Retrying async write.
[1884199.680604] XFS (dm-5): Failing async write on buffer block 0xc7d808. Retrying async write.
[1884199.682056] XFS (dm-5): xfs_do_force_shutdown(0x1) called from line 1131 of file fs/xfs/xfs_buf_item.c.  Return address = 0xffffffffa0219f82
[1884199.684916] XFS (dm-5): I/O Error Detected. Shutting down filesystem
[1884199.686066] XFS (dm-5): Please umount the filesystem and rectify the problem(s)
[1884200.046224] docker0: port 1(veth440048c) entered disabled state
[1884200.048850] docker0: port 1(veth440048c) entered disabled state
[1884200.052327] device veth440048c left promiscuous mode
[1884200.053449] docker0: port 1(veth440048c) entered disabled state
[1884200.067657] XFS (dm-5): xfs_log_force: error -5 returned.
[1884200.573416] XFS (dm-5): Unmounting Filesystem
[1884200.575421] XFS (dm-5): xfs_log_force: error -5 returned.
[[32m  OK  [0m] Stopped Permit User Sessions.
[[32m  OK  [0m] Stopped Login Service.
[[32m  OK  [0m] Started Show Plymouth Reboot Screen.
[[32m  OK  [0m] Stopped Dynamic System Tuning Daemon.
[[32m  OK  [0m] Stopped LSB: Starts the Spacewalk Daemon.
[[32m  OK  [0m] Stopped target Remote File Systems.
[[32m  OK  [0m] Stopped target Network is Online.
[1884218.477052] XFS (dm-5): xfs_log_force: error -5 returned.
[1884248.557080] XFS (dm-5): xfs_log_force: error -5 returned.
[1884278.637071] XFS (dm-5): xfs_log_force: error -5 returned.
[1884308.717090] XFS (dm-5): xfs_log_force: error -5 returned.
[[32m  OK  [0m] Stopped Docker Application Container Engine.
[[32m  OK  [0m] Stopped target Network.
         Stopping LSB: Bring up/down networking...
[[32m  OK  [0m] Stopped LSB: Bring up/down networking.
         Stopping Network Manager...
[[32m  OK  [0m] Stopped Network Manager.
         Stopping firewalld - dynamic firewall daemon...
[1884570.693311] Ebtables v2.0 unregistered
[[32m  OK  [0m] Stopped firewalld - dynamic firewall daemon.
         Stopping D-Bus System Message Bus...
         Stopping Authorization Manager...
[[32m  OK  [0m] Stopped Authorization Manager.
[[32m  OK  [0m] Stopped D-Bus System Message Bus.
[[32m  OK  [0m] Stopped target Basic System.
[[32m  OK  [0m] Stopped target Paths.
[[32m  OK  [0m] Stopped target Slices.
[[32m  OK  [0m] Removed slice User and Session Slice.
[[32m  OK  [0m] Stopped target Sockets.
[[32m  OK  [0m] Closed Docker Block RHEL push plugin Socket for the API.
[[32m  OK  [0m] Closed D-Bus System Message Bus Socket.
[[32m  OK  [0m] Stopped target System Initialization.
[[32m  OK  [0m] Stopped Setup Virtual Console.
         Stopping Setup Virtual Console...
         Stopping Load/Save Random Seed...
[[32m  OK  [0m] Stopped target Encrypted Volumes.
         Stopping Update UTMP about System Boot/Shutdown...
[[32m  OK  [0m] Stopped Apply Kernel Variables.
         Stopping Apply Kernel Variables...
[[32m  OK  [0m] Stopped target Swap.
[[32m  OK  [0m] Stopped Load/Save Random Seed.
[[32m  OK  [0m] Stopped Update UTMP about System Boot/Shutdown.
         Stopping Security Auditing Service...
[1884570.829666] type=1305 audit(1488231336.227:8257): audit_pid=0 old=31821 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[1884570.836112] type=1130 audit(1488231336.234:8258): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=auditd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[1884570.839176] type=1131 audit(1488231336.237:8259): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=auditd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[1884570.842617] type=1130 audit(1488231336.240:8260): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[1884570.846093] type=1131 audit(1488231336.244:8261): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[[32m  OK  [0m] Stopped Security Auditing Service.
[[32m  OK  [0m] Stopped Create Volatile Files and Directories.
[1884570.851042] type=1130 audit(1488231336.249:8262): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rhel-import-state comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[1884570.851051] type=1131 audit(1488231336.249:8263): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rhel-import-state comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
         Stopping Create Volatile Files and Directories...
[[32m  OK  [0m] Stopped Import network configuration from initramfs.
         Stopping Import network configuration from initramfs...
[[32m  OK  [0m] Stopped target Local File Systems.
         Unmounting /run/user/1000...
[1884570.862628] type=1130 audit(1488231336.260:8264): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rhel-readonly comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[1884570.862643] type=1131 audit(1488231336.260:8265): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rhel-readonly comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[[32m  OK  [0m] Stopped Configure read-only root support.
         Stopping Configure read-only root support...
         Unmounting /root/rootfs...
[[32m  OK  [0m] Unmounted /run/user/1000.
[[32m  OK  [0m] Unmounted /root/rootfs.
[[32m  OK  [0m] Reached target Unmount All Filesystems.
[[32m  OK  [0m] Stopped target Local File Systems (Pre).
[[1884570.884102] type=1130 audit(1488231336.282:8266): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup-dev comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[32m  OK  [0m] Stopped Create Static Device Nodes in /dev.
         Stopping Create Static Device Nodes in /dev...
[[32m  OK  [0m] Stopped Remount Root and Kernel File Systems.
         Stopping Remount Root and Kernel File Systems...
[[32m  OK  [0m] Reached target Shutdown.
[1884579.436970] XFS (dm-5): xfs_log_force: error -5 returned.
[1884600.566979] INFO: task systemd:26462 blocked for more than 120 seconds.
[1884600.569263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1884600.571502] systemd         D ffffffff8122fd80     0 26462      0 0x00000084
[1884600.573723]  ffff8800b685be38 0000000000000086 ffff88022b253ec0 ffff8800b685bfd8
[1884600.575870]  ffff8800b685bfd8 ffff8800b685bfd8 ffff88022b253ec0 ffff88022b253ec0
[1884600.578050]  ffff88008eac3868 fffffffeffffffff ffff88008eac3870 ffffffff8122fd80
[1884600.580236] Call Trace:
[1884600.582041]  [<ffffffff8122fd80>] ? generic_write_sync+0x60/0x60
[1884600.584014]  [<ffffffff8168b879>] schedule+0x29/0x70
[1884600.585849]  [<ffffffff8168d225>] rwsem_down_read_failed+0xf5/0x170
[1884600.587753]  [<ffffffff81326d48>] call_rwsem_down_read_failed+0x18/0x30
[1884600.589632]  [<ffffffff8168aa00>] down_read+0x20/0x30
[1884600.591453]  [<ffffffff81201dec>] iterate_supers+0x9c/0x110
[1884600.593265]  [<ffffffff81230054>] sys_sync+0x44/0xb0
[1884600.595119]  [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b
[1884600.596930] INFO: task dockerd-current:26143 blocked for more than 120 seconds.
[1884600.598874] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1884600.600731] dockerd-current D ffff8801d6ff8410     0 26143      1 0x00000084
[1884600.602540]  ffff880232e3bd90 0000000000000086 ffff880036bace70 ffff880232e3bfd8
[1884600.604395]  ffff880232e3bfd8 ffff880232e3bfd8 ffff880036bace70 ffff8801d6ff8400
[1884600.606176]  ffff8801eda18ab0 ffff8801d6ff8440 ffff8801d6ff8468 ffff8801d6ff8410
[1884600.607948] Call Trace:
[1884600.609320]  [<ffffffff8168b879>] schedule+0x29/0x70
[1884600.610930]  [<ffffffffa0223bb1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
[1884600.612568]  [<ffffffff810b1720>] ? wake_up_atomic_t+0x30/0x30
[1884600.614127]  [<ffffffffa020cb61>] xfs_unmountfs+0x71/0x1c0 [xfs]
[1884600.615679]  [<ffffffffa020d66b>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
[1884600.617234]  [<ffffffffa020fcc2>] xfs_fs_put_super+0x32/0x90 [xfs]
[1884600.618780]  [<ffffffff812007f2>] generic_shutdown_super+0x72/0xf0
[1884600.620232]  [<ffffffff81200c37>] kill_block_super+0x27/0x70
[1884600.621603]  [<ffffffff81200f79>] deactivate_locked_super+0x49/0x60
[1884600.623057]  [<ffffffff81201576>] deactivate_super+0x46/0x60
[1884600.625697]  [<ffffffff8121e9b5>] mntput_no_expire+0xc5/0x120
[1884600.627107]  [<ffffffff8121faf0>] SyS_umount+0xa0/0x3b0
[1884600.628466]  [<ffffffff8169171b>] ? do_async_page_fault+0x1b/0xd0
[1884600.629919]  [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b
[1884609.516987] XFS (dm-5): xfs_log_force: error -5 returned.

Comment 5 Qian Cai 2017-02-27 22:07:50 UTC
Lokesh, If you could, please add this to the errata.


Note You need to log in before you can comment on or make changes to this bug.