Description of problem: 3 replica volume. Started untarring linux kernel and immidietly brought two bricks down. After many files have been created on the mount point, brought the bricks up (via volume start force), and started self-healing. While self-healing was going in the CPU usage of the bricks was very high going up 330% (4 core CPU). Noramal i/o on the mount point (without any self-heal going on) does not make brick processes use that much CPU. Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. create a 3 replica volume, start it and mount it. 2. start untarring linux kernel and immedietly bring 2 bricks down. 3. after many files and directories have been created, bring the bricks up and start self-healing via volume heal command (or both volume heal as well as find | xargs stat). 4. Observe the CPU usage of the brick processes using top command. Actual results: The CPU usage of the brick processes is very high sometimes going upto 330% Expected results: Self-heal should not make brick processes use high CPU. Additional info: gluster volume profile output: Brick: hyperspace:/mnt/sda7/export3 ----------------------------------- Cumulative Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 0 0 0 No. of Writes: 2 7 3 Block Size: 32b+ 64b+ 128b+ No. of Reads: 0 0 0 No. of Writes: 3 9 18 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 0 0 0 No. of Writes: 47 86 173 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 0 0 No. of Writes: 274 279 199 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 0 0 0 No. of Writes: 108 31 15 Block Size: 131072b+ No. of Reads: 0 No. of Writes: 1 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1258 RELEASE 0.00 0.00 us 0.00 us 0.00 us 109 RELEASEDIR 0.00 101.25 us 14.00 us 206.00 us 4 GETXATTR 0.00 889.75 us 27.00 us 3215.00 us 4 READDIR 0.00 226.05 us 59.00 us 806.00 us 110 OPENDIR 0.01 48.48 us 11.00 us 2238.00 us 1254 FLUSH 0.02 84.22 us 24.00 us 1847.00 us 1255 WRITE 0.02 532.73 us 12.00 us 83930.00 us 220 READDIRP 0.03 130.33 us 28.00 us 44072.00 us 1254 FTRUNCATE 0.04 146.86 us 37.00 us 1639.00 us 1258 OPEN 0.04 956.81 us 15.00 us 197329.00 us 219 ENTRYLK 0.05 95.46 us 12.00 us 93841.00 us 2512 FSTAT 0.06 222.69 us 29.00 us 38979.00 us 1365 SETXATTR 0.08 279.55 us 65.00 us 22710.00 us 1474 XATTROP 0.11 146.94 us 25.00 us 25829.00 us 3766 FXATTROP 0.16 81.21 us 12.00 us 50296.00 us 10266 INODELK 6.93 316790.25 us 286.00 us 638685.00 us 112 MKDIR 22.35 21277.84 us 37.00 us 1686348.00 us 5375 SETATTR 24.34 98222.00 us 141.00 us 1723258.00 us 1268 MKNOD 45.75 41585.95 us 24.00 us 2227451.00 us 5630 LOOKUP Duration: 67 seconds Data Read: 0 bytes Data Written: 10417753 bytes Interval 0 Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 0 0 0 No. of Writes: 2 7 3 Block Size: 32b+ 64b+ 128b+ No. of Reads: 0 0 0 No. of Writes: 3 9 18 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 0 0 0 No. of Writes: 47 86 173 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 0 0 No. of Writes: 274 279 199 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 0 0 0 No. of Writes: 108 31 15 Block Size: 131072b+ No. of Reads: 0 No. of Writes: 1 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1258 RELEASE 0.00 0.00 us 0.00 us 0.00 us 109 RELEASEDIR 0.00 101.25 us 14.00 us 206.00 us 4 GETXATTR 0.00 889.75 us 27.00 us 3215.00 us 4 READDIR 0.00 226.05 us 59.00 us 806.00 us 110 OPENDIR 0.01 48.48 us 11.00 us 2238.00 us 1254 FLUSH 0.02 84.22 us 24.00 us 1847.00 us 1255 WRITE 0.02 532.73 us 12.00 us 83930.00 us 220 READDIRP 0.03 130.33 us 28.00 us 44072.00 us 1254 FTRUNCATE 0.04 146.86 us 37.00 us 1639.00 us 1258 OPEN 0.04 956.81 us 15.00 us 197329.00 us 219 ENTRYLK 0.05 95.46 us 12.00 us 93841.00 us 2512 FSTAT 0.06 222.69 us 29.00 us 38979.00 us 1365 SETXATTR 0.08 279.55 us 65.00 us 22710.00 us 1474 XATTROP 0.11 146.94 us 25.00 us 25829.00 us 3766 FXATTROP 0.16 81.21 us 12.00 us 50296.00 us 10266 INODELK 6.93 316790.25 us 286.00 us 638685.00 us 112 MKDIR 22.35 21277.84 us 37.00 us 1686348.00 us 5375 SETATTR 24.34 98222.00 us 141.00 us 1723258.00 us 1268 MKNOD 45.75 41585.95 us 24.00 us 2227451.00 us 5630 LOOKUP Duration: 67 seconds Data Read: 0 bytes Data Written: 10417753 bytes Brick: hyperspace:/mnt/sda8/export3 ----------------------------------- Cumulative Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 0 0 0 No. of Writes: 2 7 3 Block Size: 32b+ 64b+ 128b+ No. of Reads: 1 0 1 No. of Writes: 2 9 17 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 3 10 9 No. of Writes: 44 76 164 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 9 9 No. of Writes: 262 270 190 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 5 0 0 No. of Writes: 103 31 15 Block Size: 131072b+ No. of Reads: 0 No. of Writes: 1 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1267 RELEASE 0.00 0.00 us 0.00 us 0.00 us 109 RELEASEDIR 0.00 128.50 us 37.00 us 345.00 us 4 READDIR 0.00 153.21 us 12.00 us 5966.00 us 133 GETXATTR 0.00 215.69 us 60.00 us 581.00 us 110 OPENDIR 0.00 152.16 us 11.00 us 3708.00 us 220 READDIRP 0.00 615.58 us 23.00 us 10037.00 us 59 READ 0.01 59.79 us 9.00 us 9456.00 us 1254 FLUSH 0.01 82.29 us 20.00 us 6382.00 us 1196 WRITE 0.02 133.58 us 35.00 us 9972.00 us 1267 OPEN 0.03 150.61 us 12.00 us 120670.00 us 1258 FSTAT 0.04 1209.46 us 13.00 us 256704.00 us 221 ENTRYLK 0.04 240.53 us 26.00 us 184150.00 us 1195 FTRUNCATE 0.06 312.15 us 28.00 us 227518.00 us 1302 SETXATTR 0.06 274.79 us 57.00 us 15597.00 us 1517 XATTROP 0.10 192.96 us 25.00 us 191736.00 us 3766 FXATTROP 0.13 91.06 us 12.00 us 218203.00 us 10286 INODELK 6.44 430751.18 us 297.00 us 766979.00 us 110 MKDIR 22.73 138777.25 us 141.00 us 2219134.00 us 1206 MKNOD 31.85 45196.90 us 36.00 us 2108323.00 us 5188 SETATTR 38.48 50954.00 us 17.00 us 2281124.00 us 5560 LOOKUP Duration: 67 seconds Data Read: 350181 bytes Data Written: 10067572 bytes Interval 0 Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 0 0 0 No. of Writes: 2 7 3 Block Size: 32b+ 64b+ 128b+ No. of Reads: 1 0 1 No. of Writes: 2 9 17 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 3 10 9 No. of Writes: 44 76 164 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 9 9 No. of Writes: 262 270 190 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 5 0 0 No. of Writes: 103 31 15 Block Size: 131072b+ No. of Reads: 0 No. of Writes: 1 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1267 RELEASE 0.00 0.00 us 0.00 us 0.00 us 109 RELEASEDIR 0.00 128.50 us 37.00 us 345.00 us 4 READDIR 0.00 153.21 us 12.00 us 5966.00 us 133 GETXATTR 0.00 215.69 us 60.00 us 581.00 us 110 OPENDIR 0.00 152.16 us 11.00 us 3708.00 us 220 READDIRP 0.00 615.58 us 23.00 us 10037.00 us 59 READ 0.01 59.79 us 9.00 us 9456.00 us 1254 FLUSH 0.01 82.29 us 20.00 us 6382.00 us 1196 WRITE 0.02 133.58 us 35.00 us 9972.00 us 1267 OPEN 0.03 150.61 us 12.00 us 120670.00 us 1258 FSTAT 0.04 1209.46 us 13.00 us 256704.00 us 221 ENTRYLK 0.04 240.53 us 26.00 us 184150.00 us 1195 FTRUNCATE 0.06 312.15 us 28.00 us 227518.00 us 1302 SETXATTR 0.06 274.79 us 57.00 us 15597.00 us 1517 XATTROP 0.10 192.96 us 25.00 us 191736.00 us 3766 FXATTROP 0.13 91.06 us 12.00 us 218203.00 us 10286 INODELK 6.44 430751.18 us 297.00 us 766979.00 us 110 MKDIR 22.73 138777.25 us 141.00 us 2219134.00 us 1206 MKNOD 31.85 45196.90 us 36.00 us 2108323.00 us 5188 SETATTR 38.48 50954.00 us 17.00 us 2281124.00 us 5560 LOOKUP Duration: 67 seconds Data Read: 350181 bytes Data Written: 10067572 bytes Brick: hyperspace:/mnt/sda10/export3 ------------------------------------ Cumulative Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 6 36 30 No. of Writes: 11 83 184 Block Size: 32b+ 64b+ 128b+ No. of Reads: 39 111 195 No. of Writes: 277 638 959 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 383 715 1107 No. of Writes: 1781 4292 6018 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 1549 1523 1107 No. of Writes: 7115 6417 3346 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 707 185 58 No. of Writes: 1240 289 94 Block Size: 131072b+ No. of Reads: 5 No. of Writes: 0 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21196 FORGET 0.00 0.00 us 0.00 us 0.00 us 30997 RELEASE 0.00 0.00 us 0.00 us 0.00 us 2067 RELEASEDIR 0.01 243.01 us 75.00 us 720.00 us 108 OPENDIR 0.01 909.28 us 49.00 us 2267.00 us 29 READDIR 0.03 72.36 us 21.00 us 8120.00 us 1196 READ 0.05 579.51 us 15.00 us 13165.00 us 215 READDIRP 0.08 164.16 us 50.00 us 7185.00 us 1280 OPEN 0.10 55.86 us 10.00 us 8981.00 us 4526 FSTAT 0.18 99.87 us 9.00 us 38436.00 us 4577 FLUSH 0.20 107.71 us 18.00 us 3017.00 us 4772 WRITE 0.28 297.70 us 33.00 us 99920.00 us 2322 STAT 0.51 174.08 us 11.00 us 456775.00 us 7343 ENTRYLK 2.16 281.26 us 9.00 us 471228.00 us 19260 FINODELK 2.17 457.72 us 32.00 us 537076.00 us 11917 SETATTR 3.26 34630.90 us 189.00 us 3570334.00 us 236 MKDIR 5.31 347.79 us 10.00 us 824600.00 us 38325 INODELK 5.82 964.51 us 21.00 us 1654664.00 us 15154 GETXATTR 16.07 4763.41 us 22.00 us 314613.00 us 8471 LOOKUP 16.48 12452.35 us 168.00 us 2889597.00 us 3323 CREATE 21.84 4120.66 us 30.00 us 3455974.00 us 13310 FXATTROP 25.43 2049.73 us 53.00 us 1918829.00 us 31148 XATTROP Duration: 5654 seconds Data Read: 58794837 bytes Data Written: 154488832 bytes Interval 0 Stats: Block Size: 4b+ 8b+ 16b+ No. of Reads: 6 36 30 No. of Writes: 11 83 184 Block Size: 32b+ 64b+ 128b+ No. of Reads: 39 111 195 No. of Writes: 277 638 959 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 383 715 1107 No. of Writes: 1781 4292 6018 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 1549 1523 1107 No. of Writes: 7115 6417 3346 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 707 185 58 No. of Writes: 1240 289 94 Block Size: 131072b+ No. of Reads: 5 No. of Writes: 0 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21196 FORGET 0.00 0.00 us 0.00 us 0.00 us 30997 RELEASE 0.00 0.00 us 0.00 us 0.00 us 2067 RELEASEDIR 0.01 243.01 us 75.00 us 720.00 us 108 OPENDIR 0.01 909.28 us 49.00 us 2267.00 us 29 READDIR 0.03 72.36 us 21.00 us 8120.00 us 1196 READ 0.05 579.51 us 15.00 us 13165.00 us 215 READDIRP 0.08 164.16 us 50.00 us 7185.00 us 1280 OPEN 0.10 55.86 us 10.00 us 8981.00 us 4526 FSTAT 0.18 99.87 us 9.00 us 38436.00 us 4577 FLUSH 0.20 107.71 us 18.00 us 3017.00 us 4772 WRITE 0.28 297.70 us 33.00 us 99920.00 us 2322 STAT 0.51 174.08 us 11.00 us 456775.00 us 7343 ENTRYLK 2.16 281.26 us 9.00 us 471228.00 us 19260 FINODELK 2.17 457.72 us 32.00 us 537076.00 us 11917 SETATTR 3.26 34630.90 us 189.00 us 3570334.00 us 236 MKDIR 5.31 347.79 us 10.00 us 824600.00 us 38325 INODELK 5.82 964.51 us 21.00 us 1654664.00 us 15154 GETXATTR 16.07 4763.41 us 22.00 us 314613.00 us 8471 LOOKUP 16.48 12452.35 us 168.00 us 2889597.00 us 3323 CREATE 21.84 4120.66 us 30.00 us 3455974.00 us 13310 FXATTROP 25.43 2049.73 us 53.00 us 1918829.00 us 31148 XATTROP Duration: 5654 seconds Data Read: 58794837 bytes Data Written: 154488832 bytes
Awaiting performance results from test run that includes patch at http://review.gluster.com/#change,3186 to determine whether it can be accepted.
CHANGE: http://review.gluster.com/3186 (performance/io-threads: Change the thread scaling logic.) merged in master by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/3187 (mgmt/glusterd: Make priority threads configurable) merged in master by Vijay Bellur (vijay)
The issue is observed because the scaling of threads happens for the number of fops in iot queue. But Only 1 thread is allowed (because of the limit on least-prio-queue) to process the fops, so threads are spawned only to be terminated. The new change considers the limits on the queues before scaling up.
Checked with glusterfs-3.3.0qa40 and now CPU usage has been reduced while self-heal is going on.