Bug 812515

Summary: [glusterfs-3.3.0qa34]: self-heal operation makes brick processes CPU usage very high
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: replicateAssignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: mainlineCC: gluster-bugs, rfortier, shaines, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-24 17:28:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: glusterfs-3.3.0qa40 Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 817967    

Description Raghavendra Bhat 2012-04-14 09:56:14 UTC
Description of problem:

3 replica volume. Started untarring linux kernel and immidietly brought two bricks down. After many files have been created on the mount point, brought the bricks up (via volume start force), and started self-healing. While self-healing was going in the CPU usage of the bricks was very high going up 330% (4 core CPU).

Noramal i/o on the mount point (without any self-heal going on) does not make brick processes use that much CPU.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. create a 3 replica volume, start it and mount it.
2. start untarring linux kernel and immedietly bring 2 bricks down.
3. after many files and directories have been created, bring the bricks up and start self-healing via volume heal command (or both volume heal as well as find | xargs stat).
4. Observe the CPU usage of the brick processes using top command.

  
Actual results:

The CPU usage of the brick processes is very high sometimes going upto 330%

Expected results:

Self-heal should not make brick processes use high CPU.

Additional info:

gluster volume profile output:

Brick: hyperspace:/mnt/sda7/export3
-----------------------------------
Cumulative Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    2                     7                     3 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    3                     9                    18 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                   47                    86                   173 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                  274                   279                   199 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                  108                    31                    15 
 
   Block Size:             131072b+ 
 No. of Reads:                    0 
No. of Writes:                    1 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us           1258     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            109  RELEASEDIR
      0.00     101.25 us      14.00 us     206.00 us              4    GETXATTR
      0.00     889.75 us      27.00 us    3215.00 us              4     READDIR
      0.00     226.05 us      59.00 us     806.00 us            110     OPENDIR
      0.01      48.48 us      11.00 us    2238.00 us           1254       FLUSH
      0.02      84.22 us      24.00 us    1847.00 us           1255       WRITE
      0.02     532.73 us      12.00 us   83930.00 us            220    READDIRP
      0.03     130.33 us      28.00 us   44072.00 us           1254   FTRUNCATE
      0.04     146.86 us      37.00 us    1639.00 us           1258        OPEN
      0.04     956.81 us      15.00 us  197329.00 us            219     ENTRYLK
      0.05      95.46 us      12.00 us   93841.00 us           2512       FSTAT
      0.06     222.69 us      29.00 us   38979.00 us           1365    SETXATTR
      0.08     279.55 us      65.00 us   22710.00 us           1474     XATTROP
      0.11     146.94 us      25.00 us   25829.00 us           3766    FXATTROP
      0.16      81.21 us      12.00 us   50296.00 us          10266     INODELK
      6.93  316790.25 us     286.00 us  638685.00 us            112       MKDIR
     22.35   21277.84 us      37.00 us 1686348.00 us           5375     SETATTR
     24.34   98222.00 us     141.00 us 1723258.00 us           1268       MKNOD
     45.75   41585.95 us      24.00 us 2227451.00 us           5630      LOOKUP
 
    Duration: 67 seconds
   Data Read: 0 bytes
Data Written: 10417753 bytes
 
Interval 0 Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    2                     7                     3 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    3                     9                    18 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                   47                    86                   173 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                  274                   279                   199 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                  108                    31                    15 
 
   Block Size:             131072b+ 
 No. of Reads:                    0 
No. of Writes:                    1 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us           1258     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            109  RELEASEDIR
      0.00     101.25 us      14.00 us     206.00 us              4    GETXATTR
      0.00     889.75 us      27.00 us    3215.00 us              4     READDIR
      0.00     226.05 us      59.00 us     806.00 us            110     OPENDIR
      0.01      48.48 us      11.00 us    2238.00 us           1254       FLUSH
      0.02      84.22 us      24.00 us    1847.00 us           1255       WRITE
      0.02     532.73 us      12.00 us   83930.00 us            220    READDIRP
      0.03     130.33 us      28.00 us   44072.00 us           1254   FTRUNCATE
      0.04     146.86 us      37.00 us    1639.00 us           1258        OPEN
      0.04     956.81 us      15.00 us  197329.00 us            219     ENTRYLK
      0.05      95.46 us      12.00 us   93841.00 us           2512       FSTAT
      0.06     222.69 us      29.00 us   38979.00 us           1365    SETXATTR
      0.08     279.55 us      65.00 us   22710.00 us           1474     XATTROP
      0.11     146.94 us      25.00 us   25829.00 us           3766    FXATTROP
      0.16      81.21 us      12.00 us   50296.00 us          10266     INODELK
      6.93  316790.25 us     286.00 us  638685.00 us            112       MKDIR
     22.35   21277.84 us      37.00 us 1686348.00 us           5375     SETATTR
     24.34   98222.00 us     141.00 us 1723258.00 us           1268       MKNOD
     45.75   41585.95 us      24.00 us 2227451.00 us           5630      LOOKUP
 
    Duration: 67 seconds
   Data Read: 0 bytes
Data Written: 10417753 bytes
 
Brick: hyperspace:/mnt/sda8/export3
-----------------------------------
Cumulative Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    2                     7                     3 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                    1                     0                     1 
No. of Writes:                    2                     9                    17 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                    3                    10                     9 
No. of Writes:                   44                    76                   164 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                   12                     9                     9 
No. of Writes:                  262                   270                   190 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                    5                     0                     0 
No. of Writes:                  103                    31                    15 
 
   Block Size:             131072b+ 
 No. of Reads:                    0 
No. of Writes:                    1 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us           1267     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            109  RELEASEDIR
      0.00     128.50 us      37.00 us     345.00 us              4     READDIR
      0.00     153.21 us      12.00 us    5966.00 us            133    GETXATTR
      0.00     215.69 us      60.00 us     581.00 us            110     OPENDIR
      0.00     152.16 us      11.00 us    3708.00 us            220    READDIRP
      0.00     615.58 us      23.00 us   10037.00 us             59        READ
      0.01      59.79 us       9.00 us    9456.00 us           1254       FLUSH
      0.01      82.29 us      20.00 us    6382.00 us           1196       WRITE
      0.02     133.58 us      35.00 us    9972.00 us           1267        OPEN
      0.03     150.61 us      12.00 us  120670.00 us           1258       FSTAT
      0.04    1209.46 us      13.00 us  256704.00 us            221     ENTRYLK
      0.04     240.53 us      26.00 us  184150.00 us           1195   FTRUNCATE
      0.06     312.15 us      28.00 us  227518.00 us           1302    SETXATTR
      0.06     274.79 us      57.00 us   15597.00 us           1517     XATTROP
      0.10     192.96 us      25.00 us  191736.00 us           3766    FXATTROP
      0.13      91.06 us      12.00 us  218203.00 us          10286     INODELK
      6.44  430751.18 us     297.00 us  766979.00 us            110       MKDIR
     22.73  138777.25 us     141.00 us 2219134.00 us           1206       MKNOD
     31.85   45196.90 us      36.00 us 2108323.00 us           5188     SETATTR
     38.48   50954.00 us      17.00 us 2281124.00 us           5560      LOOKUP
 
    Duration: 67 seconds
   Data Read: 350181 bytes
Data Written: 10067572 bytes
 
Interval 0 Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    2                     7                     3 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                    1                     0                     1 
No. of Writes:                    2                     9                    17 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                    3                    10                     9 
No. of Writes:                   44                    76                   164 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                   12                     9                     9 
No. of Writes:                  262                   270                   190 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                    5                     0                     0 
No. of Writes:                  103                    31                    15 
 
   Block Size:             131072b+ 
 No. of Reads:                    0 
No. of Writes:                    1 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us           1267     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            109  RELEASEDIR
      0.00     128.50 us      37.00 us     345.00 us              4     READDIR
      0.00     153.21 us      12.00 us    5966.00 us            133    GETXATTR
      0.00     215.69 us      60.00 us     581.00 us            110     OPENDIR
      0.00     152.16 us      11.00 us    3708.00 us            220    READDIRP
      0.00     615.58 us      23.00 us   10037.00 us             59        READ
      0.01      59.79 us       9.00 us    9456.00 us           1254       FLUSH
      0.01      82.29 us      20.00 us    6382.00 us           1196       WRITE
      0.02     133.58 us      35.00 us    9972.00 us           1267        OPEN
      0.03     150.61 us      12.00 us  120670.00 us           1258       FSTAT
      0.04    1209.46 us      13.00 us  256704.00 us            221     ENTRYLK
      0.04     240.53 us      26.00 us  184150.00 us           1195   FTRUNCATE
      0.06     312.15 us      28.00 us  227518.00 us           1302    SETXATTR
      0.06     274.79 us      57.00 us   15597.00 us           1517     XATTROP
      0.10     192.96 us      25.00 us  191736.00 us           3766    FXATTROP
      0.13      91.06 us      12.00 us  218203.00 us          10286     INODELK
      6.44  430751.18 us     297.00 us  766979.00 us            110       MKDIR
     22.73  138777.25 us     141.00 us 2219134.00 us           1206       MKNOD
     31.85   45196.90 us      36.00 us 2108323.00 us           5188     SETATTR
     38.48   50954.00 us      17.00 us 2281124.00 us           5560      LOOKUP
 
    Duration: 67 seconds
   Data Read: 350181 bytes
Data Written: 10067572 bytes
 
Brick: hyperspace:/mnt/sda10/export3
------------------------------------
Cumulative Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    6                    36                    30 
No. of Writes:                   11                    83                   184 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                   39                   111                   195 
No. of Writes:                  277                   638                   959 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                  383                   715                  1107 
No. of Writes:                 1781                  4292                  6018 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                 1549                  1523                  1107 
No. of Writes:                 7115                  6417                  3346 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                  707                   185                    58 
No. of Writes:                 1240                   289                    94 
 
   Block Size:             131072b+ 
 No. of Reads:                    5 
No. of Writes:                    0 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us          21196      FORGET
      0.00       0.00 us       0.00 us       0.00 us          30997     RELEASE
      0.00       0.00 us       0.00 us       0.00 us           2067  RELEASEDIR
      0.01     243.01 us      75.00 us     720.00 us            108     OPENDIR
      0.01     909.28 us      49.00 us    2267.00 us             29     READDIR
      0.03      72.36 us      21.00 us    8120.00 us           1196        READ
      0.05     579.51 us      15.00 us   13165.00 us            215    READDIRP
      0.08     164.16 us      50.00 us    7185.00 us           1280        OPEN
      0.10      55.86 us      10.00 us    8981.00 us           4526       FSTAT
      0.18      99.87 us       9.00 us   38436.00 us           4577       FLUSH
      0.20     107.71 us      18.00 us    3017.00 us           4772       WRITE
      0.28     297.70 us      33.00 us   99920.00 us           2322        STAT
      0.51     174.08 us      11.00 us  456775.00 us           7343     ENTRYLK
      2.16     281.26 us       9.00 us  471228.00 us          19260    FINODELK
      2.17     457.72 us      32.00 us  537076.00 us          11917     SETATTR
      3.26   34630.90 us     189.00 us 3570334.00 us            236       MKDIR
      5.31     347.79 us      10.00 us  824600.00 us          38325     INODELK
      5.82     964.51 us      21.00 us 1654664.00 us          15154    GETXATTR
     16.07    4763.41 us      22.00 us  314613.00 us           8471      LOOKUP
     16.48   12452.35 us     168.00 us 2889597.00 us           3323      CREATE
     21.84    4120.66 us      30.00 us 3455974.00 us          13310    FXATTROP
     25.43    2049.73 us      53.00 us 1918829.00 us          31148     XATTROP
 
    Duration: 5654 seconds
   Data Read: 58794837 bytes
Data Written: 154488832 bytes
 
Interval 0 Stats:
   Block Size:                  4b+                   8b+                  16b+ 
 No. of Reads:                    6                    36                    30 
No. of Writes:                   11                    83                   184 
 
   Block Size:                 32b+                  64b+                 128b+ 
 No. of Reads:                   39                   111                   195 
No. of Writes:                  277                   638                   959 
 
   Block Size:                256b+                 512b+                1024b+ 
 No. of Reads:                  383                   715                  1107 
No. of Writes:                 1781                  4292                  6018 
 
   Block Size:               2048b+                4096b+                8192b+ 
 No. of Reads:                 1549                  1523                  1107 
No. of Writes:                 7115                  6417                  3346 
 
   Block Size:              16384b+               32768b+               65536b+ 
 No. of Reads:                  707                   185                    58 
No. of Writes:                 1240                   289                    94 
 
   Block Size:             131072b+ 
 No. of Reads:                    5 
No. of Writes:                    0 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us          21196      FORGET
      0.00       0.00 us       0.00 us       0.00 us          30997     RELEASE
      0.00       0.00 us       0.00 us       0.00 us           2067  RELEASEDIR
      0.01     243.01 us      75.00 us     720.00 us            108     OPENDIR
      0.01     909.28 us      49.00 us    2267.00 us             29     READDIR
      0.03      72.36 us      21.00 us    8120.00 us           1196        READ
      0.05     579.51 us      15.00 us   13165.00 us            215    READDIRP
      0.08     164.16 us      50.00 us    7185.00 us           1280        OPEN
      0.10      55.86 us      10.00 us    8981.00 us           4526       FSTAT
      0.18      99.87 us       9.00 us   38436.00 us           4577       FLUSH
      0.20     107.71 us      18.00 us    3017.00 us           4772       WRITE
      0.28     297.70 us      33.00 us   99920.00 us           2322        STAT
      0.51     174.08 us      11.00 us  456775.00 us           7343     ENTRYLK
      2.16     281.26 us       9.00 us  471228.00 us          19260    FINODELK
      2.17     457.72 us      32.00 us  537076.00 us          11917     SETATTR
      3.26   34630.90 us     189.00 us 3570334.00 us            236       MKDIR
      5.31     347.79 us      10.00 us  824600.00 us          38325     INODELK
      5.82     964.51 us      21.00 us 1654664.00 us          15154    GETXATTR
     16.07    4763.41 us      22.00 us  314613.00 us           8471      LOOKUP
     16.48   12452.35 us     168.00 us 2889597.00 us           3323      CREATE
     21.84    4120.66 us      30.00 us 3455974.00 us          13310    FXATTROP
     25.43    2049.73 us      53.00 us 1918829.00 us          31148     XATTROP
 
    Duration: 5654 seconds
   Data Read: 58794837 bytes
Data Written: 154488832 bytes

Comment 1 Vijay Bellur 2012-04-19 19:34:18 UTC
Awaiting performance results from test run that includes patch at http://review.gluster.com/#change,3186 to determine whether it can be accepted.

Comment 2 Anand Avati 2012-04-20 16:15:37 UTC
CHANGE: http://review.gluster.com/3186 (performance/io-threads: Change the thread scaling logic.) merged in master by Vijay Bellur (vijay)

Comment 3 Anand Avati 2012-04-20 16:15:57 UTC
CHANGE: http://review.gluster.com/3187 (mgmt/glusterd: Make priority threads configurable) merged in master by Vijay Bellur (vijay)

Comment 4 Pranith Kumar K 2012-04-20 17:41:56 UTC
The issue is observed because the scaling of threads happens for the number of fops in iot queue. But Only 1 thread is allowed (because of the limit on least-prio-queue) to process the fops, so threads are spawned only to be terminated. The new change considers the limits on the queues before scaling up.

Comment 5 Raghavendra Bhat 2012-05-09 09:26:29 UTC
Checked with glusterfs-3.3.0qa40 and now CPU usage has been reduced while self-heal is going on.