Bug 182935
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | I/O to a snapshot origin causes an OOM condition that may hang the entire system | | |
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Corey Marthaler \<cmarthal\> |
| Component: | kernel | Assignee: | Alasdair Kergon \<agk\> |
| Status: | CLOSED WONTFIX | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.0 | CC: | agk, dwysocha, jbaron, jbrassow, lwoodman, mbroz |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-02-10 00:47:16 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Corey Marthaler, 2006-02-24 16:27:09 UTC)
Please capture as much info as you can: which process did the OOM killer kill? The standard dm diagnostics (from just before you started the dd): dmsetup table and dmsetup info -c. Also run 'dmsetup status' continuously (e.g. in a loop with sleep 1) and report the last few occurrences before the machine hangs. When it's hung, if you can, please capture the process state *for every process on the system, not just the dm ones* via sysrq. If you can't, run a detailed 'ps' continuously with sleep 1 and report the last output before it hangs [i.e. how much memory each process is using; monitoring slabinfo is useful too]. (A sketch of such a monitoring loop appears after the data below.) Another test: slow the 'dd' down by breaking it up into smaller dd's with 'sync' (or sleep) between them, and see if you can slow it down enough that it no longer fails. We need to understand why the complete machine hung (not just the 'dd'), and whether the OOM code changed recently.

Only once have I been able to get OOM info; all other times, once the system is stuck, there's nothing that can be gathered.

Before starting the dd:

```
[root@link-08 ~]# dmsetup table
snapper-lvol4: 0 8388608 snapshot 253:3 253:16 P 64
snapper-lvol5-cow: 0 5242880 linear 8:65 45089152
snapper-lvol3: 0 8388608 snapshot 253:3 253:14 P 64
snapper-fs_snap3-cow: 0 5242880 linear 8:65 50332032
snapper-lvol2: 0 8388608 snapshot 253:3 253:12 P 64
snapper-lvol1: 0 8388608 snapshot 253:3 253:10 P 64
snapper-block_snap16-cow: 0 5242880 linear 8:65 8388992
snapper-lvol1-cow: 0 5242880 linear 8:65 24117632
snapper-block_snap16: 0 8388608 snapshot 253:3 253:4 P 64
snapper-origin-real: 0 8388608 linear 8:65 384
snapper-lvol2-cow: 0 5242880 linear 8:65 29360512
snapper-fs_snap3: 0 8388608 snapshot 253:3 253:20 P 64
VolGroup00-LogVol01: 0 4063232 linear 3:6 110756224
snapper-lvol3-cow: 0 5242880 linear 8:65 34603392
snapper-fs_snap1-cow: 0 5242880 linear 8:65 13631872
snapper-fs_snap2: 0 8388608 snapshot 253:3 253:8 P 64
VolGroup00-LogVol00: 0 110755840 linear 3:6 384
snapper-fs_snap1: 0 8388608 snapshot 253:3 253:6 P 64
snapper-lvol4-cow: 0 5242880 linear 8:65 39846272
snapper-lvol5: 0 8388608 snapshot 253:3 253:18 P 64
snapper-origin: 0 8388608 snapshot-origin 253:3
snapper-fs_snap2-cow: 0 5242880 linear 8:65 18874752

[root@link-08 ~]# dmsetup info -c
Name                      Maj Min Stat Open Targ Event UUID
snapper-lvol4             253  17 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAnrngHzPAvGNNckXR84AFbdLaj0vNx7gk
snapper-lvol5-cow         253  18 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAVMFqisU4PcQsc08zrOlNaFP1GeUCXRtk-cow
snapper-lvol3             253  15 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAWi5Wk3AtT0Qe6pZ74HrS5UnIfoinFEZ6
snapper-fs_snap3-cow      253  20 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAtSEgZuQaP3wBJA0CAIsT4Q73KihDAtdT-cow
snapper-lvol2             253  13 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAf577gVx32xEQ2pIEHqtS4oeV73gFEoiv
snapper-lvol1             253  11 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAI9cxvp6G7nz5THCjX5c45iwgeblZ8v07
snapper-block_snap16-cow  253   4 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phASm3SAqUw1jrIgOBgD2S8j7HL6PDevqSo-cow
snapper-lvol1-cow         253  10 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAI9cxvp6G7nz5THCjX5c45iwgeblZ8v07-cow
snapper-block_snap16      253   5 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phASm3SAqUw1jrIgOBgD2S8j7HL6PDevqSo
snapper-origin-real       253   3 L--w   10    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAVrzSEFk2NZ72pQfgGcY4Kw1Me5YI7LHl-real
snapper-lvol2-cow         253  12 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAf577gVx32xEQ2pIEHqtS4oeV73gFEoiv-cow
snapper-fs_snap3          253  21 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAtSEgZuQaP3wBJA0CAIsT4Q73KihDAtdT
VolGroup00-LogVol01       253   1 L--w    1    1     0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS
snapper-lvol3-cow         253  14 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAWi5Wk3AtT0Qe6pZ74HrS5UnIfoinFEZ6-cow
snapper-fs_snap1-cow      253   6 L--w    1    1     2 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAme51HdGB4OULlMijXQ9YJlRjLVhXYjWs-cow
snapper-fs_snap2          253   9 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAw2omRxMF3HRhRXaDk2ClUlyIdD7jFV6t
VolGroup00-LogVol00       253   0 L--w    1    1     0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg
snapper-fs_snap1          253   7 L--w    0    1     2 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAme51HdGB4OULlMijXQ9YJlRjLVhXYjWs
snapper-lvol4-cow         253  16 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAnrngHzPAvGNNckXR84AFbdLaj0vNx7gk-cow
snapper-lvol5             253  19 L--w    0    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAVMFqisU4PcQsc08zrOlNaFP1GeUCXRtk
snapper-origin            253   2 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAVrzSEFk2NZ72pQfgGcY4Kw1Me5YI7LHl
snapper-fs_snap2-cow      253   8 L--w    1    1     0 LVM-YK3wqOcHPDyu25RU96XW2YFSw1lU0phAw2omRxMF3HRhRXaDk2ClUlyIdD7jFV6t-cow
```

After the dd, the last dmsetup table:

```
snapper-lvol4: 0 8388608 snapshot 253:3 253:16 P 64
snapper-lvol5-cow: 0 5242880 linear 8:65 45089152
snapper-lvol3: 0 8388608 snapshot 253:3 253:14 P 64
snapper-fs_snap3-cow: 0 5242880 linear 8:65 50332032
snapper-lvol2: 0 8388608 snapshot 253:3 253:12 P 64
snapper-lvol1: 0 8388608 snapshot 253:3 253:10 P 64
snapper-block_snap16-cow: 0 5242880 linear 8:65 8388992
snapper-lvol1-cow: 0 5242880 linear 8:65 24117632
snapper-block_snap16: 0 8388608 snapshot 253:3 253:4 P 64
snapper-origin-real: 0 8388608 linear 8:65 384
snapper-lvol2-cow: 0 5242880 linear 8:65 29360512
snapper-fs_snap3: 0 8388608 snapshot 253:3 253:20 P 64
VolGroup00-LogVol01: 0 4063232 linear 3:6 110756224
snapper-lvol3-cow: 0 5242880 linear 8:65 34603392
snapper-fs_snap1-cow: 0 5242880 linear 8:65 13631872
snapper-fs_snap2: 0 8388608 snapshot 253:3 253:8 P 64
VolGroup00-LogVol00: 0 110755840 linear 3:6 384
snapper-fs_snap1: 0 8388608 snapshot 253:3 253:6 P 64
snapper-lvol4-cow: 0 5242880 linear 8:65 39846272
snapper-lvol5: 0 8388608 snapshot 253:3 253:18 P 64
snapper-origin: 0 8388608 snapshot-origin 253:3
snapper-fs_snap2-cow: 0 5242880 linear 8:65 18874752
```

Last top:

```
top - 09:07:35 up 42 min, 6 users, load average: 1.39, 0.49, 0.23
Tasks: 77 total, 2 running, 75 sleeping, 0 stopped, 0 zombie
Cpu(s): 18.2% us, 79.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 2.3% si
Mem: 1024432k total, 985052k used, 39380k free, 25396k buffers
Swap: 4055788k total, 0k used, 4055788k free, 791080k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
11528 root  25   0 49920  424  348 R 80.1  0.0 0:04.51 dd
 4020 root  15   0 37280 2948 2300 S 17.8  0.3 0:28.13 sshd
 4022 root  25   0 53992 1612 1192 S 13.3  0.2 0:15.80 bash
 5202 root  16   0     0    0    0 S 13.3  0.0 0:00.58 pdflush
 4376 root   5 -10     0    0    0 S  4.4  0.0 0:04.14 kcopyd
    1 root  16   0  4752  572  476 S  0.0  0.1 0:00.57 init
    2 root  RT   0     0    0    0 S  0.0  0.0 0:00.00 migration/0
    3 root  34  19     0    0    0 S  0.0  0.0 0:00.00 ksoftirqd/0
    4 root  RT   0     0    0    0 S  0.0  0.0 0:00.00 migration/1
    5 root  34  19     0    0    0 S  0.0  0.0 0:00.00 ksoftirqd/1
    6 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 events/0
    7 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 events/1
    8 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 khelper
    9 root  15 -10     0    0    0 S  0.0  0.0 0:00.00 kacpid
   38 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 kblockd/0
   39 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 kblockd/1
   40 root  15   0     0    0    0 S  0.0  0.0 0:00.00 khubd
   52 root  15   0     0    0    0 S  0.0  0.0 0:00.02 pdflush
   56 root  13 -10     0    0    0 S  0.0  0.0 0:00.00 aio/0
   54 root  15   0     0    0    0 S  0.0  0.0 0:00.00 kswapd1
   55 root  25   0     0    0    0 S  0.0  0.0 0:00.00 kswapd0
   57 root   5 -10     0    0    0 S  0.0  0.0 0:00.00 aio/1
  130 root  24   0     0    0    0 S  0.0  0.0 0:00.00 kseriod
  216 root  17   0     0    0    0 S  0.0  0.0 0:00.00 scsi_eh_0
  218 root  16   0     0    0    0 S  0.0  0.0 0:00.00 scsi_eh_1
[...]
```

Last /proc/slabinfo:

```
slabinfo - version: 2.0
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kcopyd-jobs 168377 168377 360 11 1 : tunables 54 27 8 : slabdata 15307 15307 0
dm-io-5 16 16 4096 1 1 : tunables 24 12 8 : slabdata 16 16 0
dm-io-4 32 32 2048 2 1 : tunables 24 12 8 : slabdata 16 16 0
dm-io-3 64 64 1024 4 1 : tunables 54 27 8 : slabdata 16 16 0
dm-io-2 525 525 256 15 1 : tunables 120 60 8 : slabdata 35 35 60
dm-io-1 256 305 64 61 1 : tunables 120 60 8 : slabdata 5 5 0
dm-io-0 512 675 16 225 1 : tunables 120 60 8 : slabdata 3 3 0
dm-io-bio 906 930 128 31 1 : tunables 120 60 8 : slabdata 30 30 60
fib6_nodes 5 61 64 61 1 : tunables 120 60 8 : slabdata 1 1 0
ip6_dst_cache 4 12 320 12 1 : tunables 54 27 8 : slabdata 1 1 0
ndisc_cache 1 15 256 15 1 : tunables 120 60 8 : slabdata 1 1 0
rawv6_sock 4 4 1024 4 1 : tunables 54 27 8 : slabdata 1 1 0
udpv6_sock 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
tcpv6_sock 6 8 1728 4 2 : tunables 24 12 8 : slabdata 2 2 0
rpc_buffers 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4 0
rpc_tasks 8 12 320 12 1 : tunables 54 27 8 : slabdata 1 1 0
rpc_inode_cache 6 8 832 4 1 : tunables 54 27 8 : slabdata 2 2 0
ip_fib_alias 10 119 32 119 1 : tunables 120 60 8 : slabdata 1 1 0
ip_fib_hash 10 61 64 61 1 : tunables 120 60 8 : slabdata 1 1 0
dm-snapshot-in 176470 176470 112 35 1 : tunables 120 60 8 : slabdata 5042 5042 0
dm-snapshot-ex 69865 69972 32 119 1 : tunables 120 60 8 : slabdata 588 588 0
ext3_inode_cache 1246 1532 840 4 1 : tunables 54 27 8 : slabdata 370 383 216
ext3_xattr 0 0 88 45 1 : tunables 120 60 8 : slabdata 0 0 0
journal_handle 8 81 48 81 1 : tunables 120 60 8 : slabdata 1 1 0
journal_head 100 135 88 45 1 : tunables 120 60 8 : slabdata 3 3 0
revoke_table 4 225 16 225 1 : tunables 120 60 8 : slabdata 1 1 0
revoke_record 0 0 32 119 1 : tunables 120 60 8 : slabdata 0 0 0
dm_tio 44148 44148 24 156 1 : tunables 120 60 8 : slabdata 283 283 0
dm_io 26748 26784 40 96 1 : tunables 120 60 8 : slabdata 279 279 0
qla2xxx_srbs 164 165 256 15 1 : tunables 120 60 8 : slabdata 11 11 0
scsi_cmd_cache 42 42 512 7 1 : tunables 54 27 8 : slabdata 6 6 0
sgpool-128 46 46 4096 1 1 : tunables 24 12 8 : slabdata 46 46 0
sgpool-64 44 44 2048 2 1 : tunables 24 12 8 : slabdata 22 22 0
sgpool-32 44 44 1024 4 1 : tunables 54 27 8 : slabdata 11 11 0
sgpool-16 48 48 512 8 1 : tunables 54 27 8 : slabdata 6 6 0
sgpool-8 74 75 256 15 1 : tunables 120 60 8 : slabdata 5 5 0
unix_sock 45 50 768 5 1 : tunables 54 27 8 : slabdata 10 10 0
ip_mrt_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
tcp_tw_bucket 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0 0
tcp_bind_bucket 6 119 32 119 1 : tunables 120 60 8 : slabdata 1 1 0
tcp_open_request 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
inet_peer_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
secpath_cache 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0 0
xfrm_dst_cache 0 0 384 10 1 : tunables 54 27 8 : slabdata 0 0 0
ip_dst_cache 25 40 384 10 1 : tunables 54 27 8 : slabdata 4 4 0
arp_cache 2 15 256 15 1 : tunables 120 60 8 : slabdata 1 1 0
raw_sock 3 9 832 9 2 : tunables 54 27 8 : slabdata 1 1 0
udp_sock 7 27 832 9 2 : tunables 54 27 8 : slabdata 3 3 0
tcp_sock 5 5 1536 5 2 : tunables 24 12 8 : slabdata 1 1 0
flow_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
mqueue_inode_cache 1 4 896 4 1 : tunables 54 27 8 : slabdata 1 1 0
relayfs_inode_cache 0 0 576 7 1 : tunables 54 27 8 : slabdata 0 0 0
isofs_inode_cache 0 0 616 6 1 : tunables 54 27 8 : slabdata 0 0 0
hugetlbfs_inode_cache 1 6 608 6 1 : tunables 54 27 8 : slabdata 1 1 0
ext2_inode_cache 487 490 736 5 1 : tunables 54 27 8 : slabdata 98 98 0
ext2_xattr 0 0 88 45 1 : tunables 120 60 8 : slabdata 0 0 0
dquot 0 0 224 17 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_pwq 1 54 72 54 1 : tunables 120 60 8 : slabdata 1 1 0
eventpoll_epi 1 20 192 20 1 : tunables 120 60 8 : slabdata 1 1 0
kioctx 0 0 384 10 1 : tunables 54 27 8 : slabdata 0 0 0
kiocb 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
dnotify_cache 2 96 40 96 1 : tunables 120 60 8 : slabdata 1 1 0
fasync_cache 0 0 24 156 1 : tunables 120 60 8 : slabdata 0 0 0
shmem_inode_cache 356 360 800 5 1 : tunables 54 27 8 : slabdata 72 72 0
posix_timers_cache 0 0 184 21 1 : tunables 120 60 8 : slabdata 0 0 0
uid_cache 5 31 128 31 1 : tunables 120 60 8 : slabdata 1 1 0
cfq_pool 138 138 56 69 1 : tunables 120 60 8 : slabdata 2 2 0
crq_pool 216 216 72 54 1 : tunables 120 60 8 : slabdata 4 4 0
deadline_drq 0 0 96 41 1 : tunables 120 60 8 : slabdata 0 0 0
as_arq 0 0 112 35 1 : tunables 120 60 8 : slabdata 0 0 0
blkdev_ioc 105 119 32 119 1 : tunables 120 60 8 : slabdata 1 1 0
blkdev_queue 49 54 848 9 2 : tunables 54 27 8 : slabdata 6 6 0
blkdev_requests 180 180 264 15 1 : tunables 54 27 8 : slabdata 12 12 0
biovec-(256) 256 256 4096 1 1 : tunables 24 12 8 : slabdata 256 256 0
biovec-128 20080 20080 2048 2 1 : tunables 24 12 8 : slabdata 10040 10040 0
biovec-64 260 260 1024 4 1 : tunables 54 27 8 : slabdata 65 65 0
biovec-16 660 660 256 15 1 : tunables 120 60 8 : slabdata 44 44 60
biovec-4 288 305 64 61 1 : tunables 120 60 8 : slabdata 5 5 0
biovec-1 40785 40950 16 225 1 : tunables 120 60 8 : slabdata 182 182 0
bio 60915 60915 128 31 1 : tunables 120 60 8 : slabdata 1965 1965 0
file_lock_cache 9 50 160 25 1 : tunables 120 60 8 : slabdata 2 2 0
sock_inode_cache 64 70 704 5 1 : tunables 54 27 8 : slabdata 14 14 0
skbuff_head_cache 325 336 320 12 1 : tunables 54 27 8 : slabdata 28 28 81
sock 8 18 640 6 1 : tunables 54 27 8 : slabdata 3 3 0
proc_inode_cache 595 612 600 6 1 : tunables 54 27 8 : slabdata 102 102 189
sigqueue 92 92 168 23 1 : tunables 120 60 8 : slabdata 4 4 0
radix_tree_node 3877 4970 536 7 1 : tunables 54 27 8 : slabdata 709 710 216
bdev_cache 54 65 768 5 1 : tunables 54 27 8 : slabdata 13 13 0
mnt_cache 28 40 192 20 1 : tunables 120 60 8 : slabdata 2 2 0
audit_watch_cache 0 0 88 45 1 : tunables 120 60 8 : slabdata 0 0 0
inode_cache 995 1183 568 7 1 : tunables 54 27 8 : slabdata 165 169 216
dentry_cache 2644 5216 240 16 1 : tunables 120 60 8 : slabdata 326 326 480
filp 795 825 256 15 1 : tunables 120 60 8 : slabdata 55 55 0
names_cache 26 26 4096 1 1 : tunables 24 12 8 : slabdata 26 26 0
avc_node 13 486 72 54 1 : tunables 120 60 8 : slabdata 9 9 0
key_jar 10 20 192 20 1 : tunables 120 60 8 : slabdata 1 1 0
idr_layer_cache 77 77 528 7 1 : tunables 54 27 8 : slabdata 11 11 0
buffer_head 195390 197055 88 45 1 : tunables 120 60 8 : slabdata 4379 4379 0
mm_struct 77 77 1152 7 2 : tunables 24 12 8 : slabdata 11 11 0
vm_area_struct 2008 2420 176 22 1 : tunables 120 60 8 : slabdata 110 110 60
fs_cache 180 183 64 61 1 : tunables 120 60 8 : slabdata 3 3 0
files_cache 108 108 832 9 2 : tunables 54 27 8 : slabdata 12 12 0
signal_cache 149 150 256 15 1 : tunables 120 60 8 : slabdata 10 10 0
sighand_cache 93 93
```

Last ps:

```
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 1 0 0 76 0 - 1188 - 08:25 ? 00:00:00 init [3]
1 S root 2 1 0 -40 - - 0 migrat 08:25 ? 00:00:00 [migration/0]
1 S root 3 1 0 94 19 - 0 ksofti 08:25 ? 00:00:00 [ksoftirqd/0]
1 S root 4 1 0 -40 - - 0 migrat 08:25 ? 00:00:00 [migration/1]
1 S root 5 1 0 94 19 - 0 ksofti 08:25 ? 00:00:00 [ksoftirqd/1]
1 S root 6 1 0 65 -10 - 0 worker 08:25 ? 00:00:00 [events/0]
1 S root 7 1 0 65 -10 - 0 worker 08:25 ? 00:00:00 [events/1]
1 S root 8 6 0 65 -10 - 0 worker 08:25 ? 00:00:00 [khelper]
1 S root 9 6 0 75 -10 - 0 worker 08:25 ? 00:00:00 [kacpid]
1 S root 38 6 0 65 -10 - 0 worker 08:25 ? 00:00:00 [kblockd/0]
1 S root 39 6 0 65 -10 - 0 worker 08:25 ? 00:00:00 [kblockd/1]
1 S root 40 1 0 75 0 - 0 hub_th 08:25 ? 00:00:00 [khubd]
1 S root 52 6 0 75 0 - 0 pdflus 08:25 ? 00:00:00 [pdflush]
1 S root 56 6 0 73 -10 - 0 worker 08:25 ? 00:00:00 [aio/0]
1 S root 54 1 0 75 0 - 0 kswapd 08:25 ? 00:00:00 [kswapd1]
1 S root 55 1 0 75 0 - 0 kswapd 08:25 ? 00:00:00 [kswapd0]
1 S root 57 6 0 65 -10 - 0 worker 08:25 ? 00:00:00 [aio/1]
1 S root 130 1 0 84 0 - 0 serio_ 08:25 ? 00:00:00 [kseriod]
1 S root 216 1 0 77 0 - 0 160455 08:25 ? 00:00:00 [scsi_eh_0]
1 S root 218 1 0 76 0 - 0 160455 08:25 ? 00:00:00 [scsi_eh_1]
1 S root 235 1 0 80 0 - 0 160455 08:25 ? 00:00:00 [scsi_eh_2]
1 S root 236 1 0 60 -20 - 0 160455 08:25 ? 00:00:00 [qla2300_2_dpc]
1 S root 279 6 0 68 -10 - 0 worker 08:25 ? 00:00:00 [kmirrord]
1 S root 280 6 0 68 -10 - 0 worker 08:25 ? 00:00:00 [kmir_mon]
1 S root 292 1 0 75 0 - 0 kjourn 08:25 ? 00:00:00 [kjournald]
4 S root 1270 1 0 66 -10 - 901 - 08:25 ? 00:00:00 udevd
1 S root 1621 7 0 66 -10 - 0 kaudit 08:25 ? 00:00:00 [kauditd]
1 S root 1899 1 0 66 -10 - 1106 - 08:25 ? 00:00:00 /sbin/dhclient -1 -q -lf /var/lib/dhcp/dhclient
1 S root 2161 1 0 79 0 - 0 kjourn 08:25 ? 00:00:00 [kjournald]
5 S root 2660 1 0 75 0 - 1106 - 08:25 ? 00:00:00 /sbin/dhclient -1 -q -lf /var/lib/dhcp/dhclient
5 S root 2704 1 0 76 0 - 1445 - 08:25 ? 00:00:00 syslogd -m 0
5 S root 2708 1 0 76 0 - 634 syslog 08:25 ? 00:00:00 klogd -x
5 S root 2719 1 0 76 0 - 633 - 08:25 ? 00:00:00 irqbalance
5 S rpc 2730 1 0 75 0 - 1187 - 08:25 ? 00:00:00 portmap
5 S rpcuser 2750 1 0 78 0 - 1989 - 08:25 ? 00:00:00 rpc.statd
1 S root 2785 1 0 76 0 - 4965 - 08:25 ? 00:00:00 rpc.idmapd
5 S root 2871 1 0 77 0 - 728 - 08:25 ? 00:00:00 /usr/sbin/smartd
5 S root 2881 1 0 79 0 - 635 - 08:25 ? 00:00:00 /usr/sbin/acpid
5 S root 2893 1 0 76 0 - 18053 - 08:25 ? 00:00:00 cupsd
5 S root 2957 1 0 75 0 - 5490 - 08:25 ? 00:00:00 /usr/sbin/sshd
5 S root 2972 1 0 76 0 - 2178 - 08:25 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
5 S root 2992 1 0 76 0 - 9010 - 08:25 ? 00:00:00 sendmail: accepting connections
1 S smmsp 3000 1 0 80 0 - 7209 pause 08:25 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/
5 S root 3034 1 0 79 0 - 1044 - 08:25 ? 00:00:00 gpm -m /dev/input/mice -t exps2
5 S root 3141 1 0 76 0 - 14280 - 08:25 ? 00:00:00 crond
5 S xfs 3164 1 0 76 0 - 2565 - 08:25 ? 00:00:00 xfs -droppriv -daemon
5 S root 3183 1 0 76 0 - 2238 - 08:25 ? 00:00:00 /usr/sbin/atd
5 S dbus 3199 1 0 76 0 - 2421 - 08:25 ? 00:00:00 dbus-daemon-1 --system
5 S root 3210 1 0 78 0 - 12476 - 08:25 ? 00:00:00 rhnsd --interval 240
5 S root 3220 1 0 79 0 - 2248 - 08:25 ? 00:00:00 cups-config-daemon
5 S root 3231 1 0 76 0 - 5183 - 08:25 ? 00:00:01 hald
4 S root 3240 1 0 76 0 - 6902 wait 08:25 ? 00:00:00 login -- root
4 S root 3241 1 0 78 0 - 631 - 08:25 tty1 00:00:00 /sbin/mingetty tty1
4 S root 3242 1 0 78 0 - 631 - 08:25 tty2 00:00:00 /sbin/mingetty tty2
4 S root 3243 1 0 78 0 - 631 - 08:25 tty3 00:00:00 /sbin/mingetty tty3
4 S root 3244 1 0 78 0 - 631 - 08:25 tty4 00:00:00 /sbin/mingetty tty4
4 S root 3245 1 0 80 0 - 631 - 08:25 tty5 00:00:00 /sbin/mingetty tty5
4 S root 3246 1 0 78 0 - 631 - 08:25 tty6 00:00:00 /sbin/mingetty tty6
4 S root 3918 2957 0 76 0 - 9321 - 08:26 ? 00:00:00 sshd: root@pts/0
4 S root 3920 3918 0 77 0 - 13497 wait 08:26 pts/0 00:00:00 -bash
4 S root 3952 2957 0 76 0 - 9281 - 08:26 ? 00:00:00 sshd: root@pts/1
4 S root 3954 3952 0 76 0 - 13498 wait 08:26 pts/1 00:00:00 -bash
4 S root 3986 2957 0 75 0 - 9281 - 08:26 ? 00:00:00 sshd: root@pts/2
4 S root 3988 3986 0 76 0 - 13497 wait 08:26 pts/2 00:00:00 -bash
4 S root 4020 2957 1 76 0 - 9320 - 08:26 ? 00:00:28 sshd: root@pts/3
4 S root 4022 4020 0 85 0 - 13498 wait 08:26 pts/3 00:00:15 -bash
4 S root 4277 3240 0 76 0 - 13495 wait 08:27 ttyS0 00:00:00 -bash
4 S root 4308 4277 0 76 0 - 13435 - 08:27 ttyS0 00:00:00 tail -f /var/log/messages
1 S root 4376 7 0 65 -10 - 0 worker 08:28 ? 00:00:04 [kcopyd]
4 S root 4632 3920 0 76 0 - 1537 - 08:30 pts/0 00:00:02 top
1 S root 5202 7 0 76 0 - 0 pdflus 08:37 ? 00:00:00 [pdflush]
4 S root 19162 2957 0 76 0 - 9321 - 09:06 ? 00:00:00 sshd: root@pts/4
4 S root 19218 19162 0 76 0 - 13497 wait 09:06 pts/4 00:00:00 -bash
4 R root 11528 3954 80 85 0 - 12480 - 09:07 pts/1 00:00:05 dd if /dev/zero of /mnt/origin/ddfile count 100
4 R root 13379 3988 0 78 0 - 1361 - 09:07 pts/2 00:00:00 ps -efl
4 R root 13403 4022 0 85 0 - 0 - 09:07 pts/3 00:00:00 [dmsetup]
4 R root 13405 19218 0 78 0 - 0 - 09:07 pts/4 00:00:00 [cat]
```

FWIW, I've only been able to hit this so far while dd'ing to a file in a filesystem on top of the origin volume. When I attempt to dd straight to the block device, I don't see it. I must not be using the same block sizes or transfer sizes that the filesystem does.
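For reference, a minimal sketch of the continuous-monitoring loop requested at the top of this report; the commands are the standard ones named in that request, while the output paths and the one-second interval are arbitrary choices:

```bash
#!/bin/bash
# Capture dm status, per-process memory, and slab usage once per second.
# After the hang, the highest-numbered files show the last state reached.
i=0
while true; do
    i=$((i + 1))
    dmsetup status > /tmp/dmstatus.$i
    ps -eo pid,vsz,rss,stat,wchan:20,comm > /tmp/ps.$i
    cat /proc/slabinfo > /tmp/slabinfo.$i
    sleep 1
done
```

If the console still responds once the machine is stuck, `echo t > /proc/sysrq-trigger` dumps the state of every task to the kernel log, which covers the "process state for every process" part of the request (it requires the sysrq facility to be enabled).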
Reproduces it every time:

```
dd if=/dev/zero of=/mnt/origin/ddfile count=10000000
```

The following, to the device itself, does not:

```
[root@link-08 bin]# dd if=/dev/zero of=/dev/snapper/origin count=10000000 bs=256
dd: writing `/dev/snapper/origin': No space left on device
8388609+0 records in
8388608+0 records out
[root@link-08 bin]# dd if=/dev/zero of=/dev/snapper/origin count=10000000
dd: writing to `/dev/snapper/origin': No space left on device
4194305+0 records in
4194304+0 records out
[root@link-08 bin]# dd if=/dev/zero of=/dev/snapper/origin count=10000000 bs=1024
dd: writing `/dev/snapper/origin': No space left on device
2097153+0 records in
2097152+0 records out
[root@link-08 bin]# dd if=/dev/zero of=/dev/snapper/origin count=10000000 bs=2048
dd: writing `/dev/snapper/origin': No space left on device
1048577+0 records in
1048576+0 records out
```

Here's a historical look at /proc/meminfo as this issue occurs. In summary:

- MemFree drops from 687260 kB to 15316 kB
- Buffers drops from 27976 kB to 980 kB
- Cached rises from 232668 kB to 801240 kB
- SwapCached rises from 0 kB to 112 kB
- LowFree drops from 687260 kB to 15316 kB
- Dirty rises from 112 kB to 203544 kB

The fourteen snapshots are shown oldest first, all values in kB. Fields that never changed are omitted from the table: MemTotal 1024432, HighTotal/HighFree 0, LowTotal 1024432 (LowFree tracks MemFree exactly), SwapTotal 4055788, VmallocTotal 536870911, VmallocUsed 13516, VmallocChunk 536857363, HugePages_Total/Free 0, Hugepagesize 2048. PageTables was 1936 in the first snapshot and 1964 thereafter.

| # | MemFree | Buffers | Cached | SwapCached | Active | Inactive | SwapFree | Dirty | Writeback | Mapped | Slab | Committed_AS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 687260 | 27976 | 232668 | 0 | 257900 | 22760 | 4055788 | 112 | 0 | 27408 | 23188 | 41024 |
| 2 | 650068 | 28024 | 268796 | 0 | 294108 | 22764 | 4055788 | 36316 | 0 | 27508 | 24056 | 41260 |
| 3 | 583124 | 28088 | 334080 | 0 | 359428 | 22824 | 4055788 | 101656 | 0 | 27508 | 25668 | 41260 |
| 4 | 515348 | 28152 | 400112 | 0 | 425508 | 22880 | 4055788 | 167800 | 0 | 27508 | 27300 | 41260 |
| 5 | 451220 | 28216 | 458528 | 0 | 483904 | 22908 | 4055788 | 205520 | 20724 | 27508 | 33368 | 41260 |
| 6 | 370900 | 28276 | 523340 | 0 | 548784 | 22900 | 4055788 | 204092 | 87028 | 27508 | 48784 | 41260 |
| 7 | 300564 | 28336 | 579992 | 0 | 605520 | 22948 | 4055788 | 202848 | 145044 | 27508 | 62200 | 41224 |
| 8 | 237716 | 28388 | 631348 | 0 | 656980 | 22904 | 4055788 | 204500 | 194772 | 27508 | 73696 | 41224 |
| 9 | 165908 | 28444 | 689704 | 0 | 715432 | 22852 | 4055788 | 204924 | 252788 | 27508 | 87216 | 41224 |
| 10 | 98260 | 28504 | 744656 | 0 | 770464 | 22856 | 4055788 | 206092 | 306660 | 27508 | 99732 | 41224 |
| 11 | 24404 | 28560 | 804100 | 0 | 829952 | 22868 | 4055788 | 203432 | 368820 | 27508 | 114092 | 41224 |
| 12 | 16212 | 6076 | 823048 | 0 | 636536 | 212656 | 4055788 | 205996 | 422692 | 27508 | 125704 | 41224 |
| 13 | 16084 | 908 | 813144 | 112 | 249220 | 584372 | 4054932 | 206408 | 501552 | 26392 | 141612 | 41224 |
| 14 | 15316 | 980 | 801240 | 112 | 261168 | 560600 | 4054932 | 203544 | 559568 | 26396 | 153936 | 41224 |
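Along the lines of the earlier suggestion to slow the dd down, here is a sketch of a throttled variant of the reproducer; the 100MB chunk size and the chunk count are arbitrary choices, and /mnt/origin/ddfile is the same target as above:

```bash
#!/bin/bash
# Write roughly the same total amount as the failing dd, but in 100MB
# pieces with a sync between them, to see whether pacing the writes
# slows things down enough that the hang no longer occurs.
for i in $(seq 0 40); do
    dd if=/dev/zero of=/mnt/origin/ddfile bs=1M count=100 \
       seek=$((i * 100)) conv=notrunc 2> /dev/null
    sync
done
```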
How many snapshots did you have against the origin when you hit this bug? (I suppose I could count, so let me ask a different way....) What is the minimum number of snapshots against the origin required to trigger this bug? Are there certain sizes that trigger it more easily? Can you get this down to a cleaner/smaller setup and still recreate it? Also (perhaps not as important), what filesystem are you using? Did you create any of your snapshots during the test?

Are you blowing off the end of your snapshot devices? 'cause your origin is 4G and your snaps 2.5G, and you are changing all the blocks. Perhaps this is a corner case that occurs when you fill up your snapshots?

10 snapshots, 4G origin, 2.5G snapshots, ext3... all seems fine so far (30 min). How long does it usually take to reproduce?

It takes about 5 seconds after running the dd command for the system to hang. Here's a little more detail about the snapshots at the time of failure; none of them are near the full limit:

```
LV           VG      Attr   LSize Origin Snap%  Move Log Copy%
block_snap16 snapper swi-a- 2.50G origin  28.77
fs_snap1     snapper swi-a- 2.50G origin  28.85
fs_snap2     snapper swi-a- 2.50G origin  28.85
fs_snap3     snapper swi-a- 2.50G origin  26.27
lvol1        snapper swi-a- 2.50G origin  28.85
lvol2        snapper swi-a- 2.50G origin  28.85
lvol3        snapper swi-a- 2.50G origin  28.85
lvol4        snapper swi-a- 2.50G origin  28.85
lvol5        snapper swi-a- 2.50G origin  28.85
origin       snapper owi-ao 4.00G
```

I'm currently working on narrowing this case down for you guys...

I attempted to recreate this issue on two other machines, one with 8G of memory (taft-04) and a 1G memory machine (link-02) just like link-08, and I was only able to recreate the problem on the 1G machine. The 8G memory machine ran just fine.

Can you reproduce on a 32-bit machine?

I've narrowed this down to just a few snapshots of the origin with an fs on it. The I/O in this case does not cause the system to hang, but it does cause lvm to run very slowly and to appear hung for 20 seconds or so. An strace shows that lvm spends most of its time waiting in "ioctl(10, DM_DEV_STATUS".
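One way to capture that timing, using standard strace options (-tt for timestamps, -T for time spent in each syscall); the grep filter assumes the delay really is in the status ioctls, as the trace above suggests:

```bash
# Trace only ioctl calls made by lvs and show how long each one takes
strace -f -tt -T -e trace=ioctl lvs 2>&1 | grep DM_DEV_STATUS
```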
```
[root@link-08 ~]# lvs
LV       VG      Attr   LSize Origin Snap%  Move Log Copy%
fs_snap1 snapper Swi-I- 2.50G origin 100.00
fs_snap2 snapper Swi-I- 2.50G origin 100.00
fs_snap3 snapper Swi-I- 2.50G origin 100.00
origin   snapper owi-ao 4.00G
[root@link-08 ~]# lvscan
inactive Original '/dev/snapper/origin' [4.00 GB] inherit
inactive Snapshot '/dev/snapper/fs_snap1' [2.50 GB] inherit
inactive Snapshot '/dev/snapper/fs_snap2' [2.50 GB] inherit
inactive Snapshot '/dev/snapper/fs_snap3' [2.50 GB] inherit
```

Also, why do these volumes show up as inactive? They should be active. And don't let the 100%-full snapshots confuse you; they only filled up because all the dd I/O was able to finish and fill the origin filesystem. This isn't a full-snapshot issue.

In response to comment #11, the snaps are inactive because they are full.

I too have noticed the long delay between issuing the lvs command and receiving the results... not sure whether that's related to this bug or not, though...

I can reproduce the lvm slowdown, but not the complete lockup, on 32-bit machines. However, both of the machines I attempted this on have either 2G or 4G of memory.

No update for 5 years - closing.