Bug 1251446 - Disperse volume: fuse mount hung after self healing
Summary: Disperse volume: fuse mount hung after self healing
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1236050
Reported: 2015-08-07 10:05 UTC by Xavi Hernandez
Modified: 2016-06-16 13:29 UTC (History)
5 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1236050
Environment:
Last Closed: 2016-06-16 13:29:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2015-08-07 10:05:21 UTC
+++ This bug was initially created as a clone of Bug #1236050 +++

Description of problem:
In a 3 x (4 + 2) = 18 distributed disperse volume, the FUSE mount point hung after self-healing the files and folders of a failed disk.

Version-Release number of selected component (if applicable):
glusterfs 3.7.2 built on Jun 19 2015 16:33:27
Repository revision: git://git.gluster.com/glusterfs.git
<http://git.gluster.com/glusterfs.git>
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU
General Public License


How reproducible:
100%

Steps to Reproduce:

1. Create a 3 x (4 + 2) distributed disperse volume across the nodes.
2. FUSE mount on the client and start creating files/directories in the following hierarchy:
   /mountpoint/folder1/file1
   /mountpoint/folder2/file2
   /mountpoint/folder3/file3

3. Simulate a disk failure by killing the PID of the brick holding file2 on any one node, then add the same disk back after formatting the drive.
4. Start the volume with force.
5. Self-healing creates file2 with 0 bytes on the newly formatted drive.
6. Wait for self-healing to finish; it never completes, and file2 remains at 0 bytes.
7. Read file2 from the client; now the 0-byte file is recovered and recovery completes. The md5sum of file2 matches on all storage nodes.
8. Bring down 2 of the nodes other than the one with the failed drive.
9. Run ls on the mount point; the mount point hangs (a scripted sketch of these steps follows below).
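
For convenience, a condensed shell sketch of the steps above; the volume name (vaulttest21), host addresses, brick paths, client mount point and the use of xfs for the replacement disk are assumptions taken from the volume info further below, and the angle-bracket placeholders must be filled in from gluster volume status:

# 1-2. create and mount the 3 x (4+2) distributed disperse volume (names assumed)
gluster volume create vaulttest21 disperse-data 4 redundancy 2 \
    10.1.2.{1..6}:/media/disk1 10.1.2.{1..6}:/media/disk2 10.1.2.{1..6}:/media/disk3 force
gluster volume start vaulttest21
mount -t glusterfs 10.1.2.1:/vaulttest21 /mnt/gluster
mkdir -p /mnt/gluster/folder{1,2,3}
dd if=/dev/urandom of=/mnt/gluster/folder2/file2 bs=1M count=2

# 3-4. simulate the disk failure: kill the brick that stores file2, reformat, restart with force
kill -9 <pid-of-file2-brick>            # PID taken from `gluster volume status`
mkfs.xfs -f /dev/<failed-device>        # assumption: xfs-formatted brick
mount /dev/<failed-device> /media/disk2
gluster volume start vaulttest21 force

# 5-7. after waiting for self-heal, read the file from the client to trigger recovery
md5sum /mnt/gluster/folder2/file2

# 8-9. bring down two bricks on other nodes, then list the mount point (the reported hang)
kill -9 <two-brick-pids-on-other-nodes>
ls /mnt/gluster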

Actual results:
mount point hung

Expected results:

Mount point should list all the folders

Additional info:
admin@node001:~$ sudo gluster volume info

Volume Name: vaulttest21
Type: Distributed-Disperse
Volume ID: ac6a374d-a0a2-405c-823d-0672fd92f0af
Status: Started
Number of Bricks: 3 x (4 + 2) = 18
Transport-type: tcp
Bricks:
Brick1: 10.1.2.1:/media/disk1
Brick2: 10.1.2.2:/media/disk1
Brick3: 10.1.2.3:/media/disk1
Brick4: 10.1.2.4:/media/disk1
Brick5: 10.1.2.5:/media/disk1
Brick6: 10.1.2.6:/media/disk1
Brick7: 10.1.2.1:/media/disk2
Brick8: 10.1.2.2:/media/disk2
Brick9: 10.1.2.3:/media/disk2
Brick10: 10.1.2.4:/media/disk2
Brick11: 10.1.2.5:/media/disk2
Brick12: 10.1.2.6:/media/disk2
Brick13: 10.1.2.1:/media/disk3
Brick14: 10.1.2.2:/media/disk3
Brick15: 10.1.2.3:/media/disk3
Brick16: 10.1.2.4:/media/disk3
Brick17: 10.1.2.5:/media/disk3
Brick18: 10.1.2.6:/media/disk3
Options Reconfigured:
performance.readdir-ahead: on

root@mas03:/mnt/gluster# ls -R
.:
test1  test2  test3

./test1:
testfile1

./test2:
testfile8

./test3:
testfile10

Try to simulate a disk failure and add the same disk back again. After recovery, run ls on the client mount point; the mount point hangs.


node001:~$ sudo gluster volume get vaulttest21 all
Option                                  Value
------                                  -----
cluster.lookup-unhashed                 on
cluster.lookup-optimize                 off
cluster.min-free-disk                   10%
cluster.min-free-inodes                 5%
cluster.rebalance-stats                 off
cluster.subvols-per-directory           (null)
cluster.readdir-optimize                off
cluster.rsync-hash-regex                (null)
cluster.extra-hash-regex                (null)
cluster.dht-xattr-name                  trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid    off
cluster.rebal-throttle                  normal
cluster.local-volume-name               (null)
cluster.weighted-rebalance              on
cluster.entry-change-log                on
cluster.read-subvolume                  (null)
cluster.read-subvolume-index            -1
cluster.read-hash-mode                  1
cluster.background-self-heal-count      16
cluster.metadata-self-heal              on
cluster.data-self-heal                  on
cluster.entry-self-heal                 on
cluster.self-heal-daemon                on
cluster.heal-timeout                    600
cluster.self-heal-window-size           1
cluster.data-change-log                 on
cluster.metadata-change-log             on
cluster.data-self-heal-algorithm        (null)
cluster.eager-lock                      on
cluster.quorum-type                     none
cluster.quorum-count                    (null)
cluster.choose-local                    true
cluster.self-heal-readdir-size          1KB
cluster.post-op-delay-secs              1
cluster.ensure-durability               on
cluster.consistent-metadata             no
cluster.stripe-block-size               128KB
cluster.stripe-coalesce                 true
diagnostics.latency-measurement         off
diagnostics.dump-fd-stats               off
diagnostics.count-fop-hits              off
diagnostics.brick-log-level             INFO
diagnostics.client-log-level            INFO
diagnostics.brick-sys-log-level         CRITICAL
diagnostics.client-sys-log-level        CRITICAL
diagnostics.brick-logger                (null)
diagnostics.client-logger               (null)
diagnostics.brick-log-format            (null)
diagnostics.client-log-format           (null)
diagnostics.brick-log-buf-size          5
diagnostics.client-log-buf-size         5
diagnostics.brick-log-flush-timeout     120
diagnostics.client-log-flush-timeout    120
performance.cache-max-file-size         0
performance.cache-min-file-size         0
performance.cache-refresh-timeout       1
performance.cache-priority
performance.cache-size                  32MB
performance.io-thread-count             16
performance.high-prio-threads           16
performance.normal-prio-threads         16
performance.low-prio-threads            16
performance.least-prio-threads          1
performance.enable-least-priority       on
performance.least-rate-limit            0
performance.cache-size                  128MB
performance.flush-behind                on
performance.nfs.flush-behind            on
performance.write-behind-window-size    1MB
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct             off
performance.nfs.strict-o-direct         off
performance.strict-write-ordering       off
performance.nfs.strict-write-ordering   off
performance.lazy-open                   yes
performance.read-after-open             no
performance.read-ahead-page-count       4
performance.md-cache-timeout            1
features.encryption                     off
encryption.master-key                   (null)
encryption.data-key-size                256
encryption.block-size                   4096
network.frame-timeout                   1800
network.ping-timeout                    42
network.tcp-window-size                 (null)
features.lock-heal                      off
features.grace-timeout                  10
network.remote-dio                      disable
client.event-threads                    2
network.ping-timeout                    42
network.tcp-window-size                 (null)
network.inode-lru-limit                 16384
auth.allow                              *
auth.reject                             (null)
transport.keepalive                     (null)
server.allow-insecure                   (null)
server.root-squash                      off
server.anonuid                          65534
server.anongid                          65534
server.statedump-path                   /var/run/gluster
server.outstanding-rpc-limit            64
features.lock-heal                      off
features.grace-timeout                  (null)
server.ssl                              (null)
auth.ssl-allow                          *
server.manage-gids                      off
client.send-gids                        on
server.gid-timeout                      300
server.own-thread                       (null)
server.event-threads                    2
performance.write-behind                on
performance.read-ahead                  on
performance.readdir-ahead               on
performance.io-cache                    on
performance.quick-read                  on
performance.open-behind                 on
performance.stat-prefetch               on
performance.client-io-threads           off
performance.nfs.write-behind            on
performance.nfs.read-ahead              off
performance.nfs.io-cache                off
performance.nfs.quick-read              off
performance.nfs.stat-prefetch           off
performance.nfs.io-threads              off
performance.force-readdirp              true
features.file-snapshot                  off
features.uss                            off
features.snapshot-directory             .snaps
features.show-snapshot-directory        off
network.compression                     off
network.compression.window-size         -15
network.compression.mem-level           8
network.compression.min-size            0
network.compression.compression-level   -1
network.compression.debug               false
features.limit-usage                    (null)
features.quota-timeout                  0
features.default-soft-limit             80%
features.soft-timeout                   60
features.hard-timeout                   5
features.alert-time                     86400
features.quota-deem-statfs              off
geo-replication.indexing                off
geo-replication.indexing                off
geo-replication.ignore-pid-check        off
geo-replication.ignore-pid-check        off
features.quota                          off
features.inode-quota                    off
features.bitrot                         disable
debug.trace                             off
debug.log-history                       no
debug.log-file                          no
debug.exclude-ops                       (null)
debug.include-ops                       (null)
debug.error-gen                         off
debug.error-failure                     (null)
debug.error-number                      (null)
debug.random-failure                    off
debug.error-fops                        (null)
nfs.enable-ino32                        no
nfs.mem-factor                          15
nfs.export-dirs                         on
nfs.export-volumes                      on
nfs.addr-namelookup                     off
nfs.dynamic-volumes                     off
nfs.register-with-portmap               on
nfs.outstanding-rpc-limit               16
nfs.port                                2049
nfs.rpc-auth-unix                       on
nfs.rpc-auth-null                       on
nfs.rpc-auth-allow                      all
nfs.rpc-auth-reject                     none
nfs.ports-insecure                      off
nfs.trusted-sync                        off
nfs.trusted-write                       off
nfs.volume-access                       read-write
nfs.export-dir
nfs.disable                             false
nfs.nlm                                 on
nfs.acl                                 on
nfs.mount-udp                           off
nfs.mount-rmtab                         /var/lib/glusterd/nfs/rmtab
nfs.rpc-statd                           /sbin/rpc.statd
nfs.server-aux-gids                     off
nfs.drc                                 off
nfs.drc-size                            0x20000
nfs.read-size                           (1 * 1048576ULL)
nfs.write-size                          (1 * 1048576ULL)
nfs.readdir-size                        (1 * 1048576ULL)
nfs.exports-auth-enable                 (null)
nfs.auth-refresh-interval-sec           (null)
nfs.auth-cache-ttl-sec                  (null)
features.read-only                      off
features.worm                           off
storage.linux-aio                       off
storage.batch-fsync-mode                reverse-fsync
storage.batch-fsync-delay-usec          0
storage.owner-uid                       -1
storage.owner-gid                       -1
storage.node-uuid-pathinfo              off
storage.health-check-interval           30
storage.build-pgfid                     off
storage.bd-aio                          off
cluster.server-quorum-type              off
cluster.server-quorum-ratio             0
changelog.changelog                     off
changelog.changelog-dir                 (null)
changelog.encoding                      ascii
changelog.rollover-time                 15
changelog.fsync-interval                5
changelog.changelog-barrier-timeout     120
changelog.capture-del-path              off
features.barrier                        disable
features.barrier-timeout                120
features.trash                          off
features.trash-dir                      .trashcan
features.trash-eliminate-path           (null)
features.trash-max-filesize             5MB
features.trash-internal-op              off
cluster.enable-shared-storage           disable
features.ctr-enabled                    off
features.record-counters                off
features.ctr_link_consistency           off
locks.trace                             (null)
cluster.disperse-self-heal-daemon       enable
cluster.quorum-reads                    no
client.bind-insecure                    (null)
ganesha.enable                          off
features.shard                          off
features.shard-block-size               4MB
features.scrub-throttle                 lazy
features.scrub-freq                     biweekly
features.expiry-time                    120
features.cache-invalidation             off
features.cache-invalidation-timeout     60

--- Additional comment from Pranith Kumar K on 2015-08-05 03:54:28 CEST ---

hi Backer,
    Could you try this test with 3.7.3, please? We fixed 2-3 hang bugs, so it would be great if you could let us know whether it still happens. Meanwhile Xavi and I are going to work on bug 1235964 that you raised. Do you hang out on the #gluster IRC channel? It would be great to hear your feedback on 3.7.3 and what you think about the stability of EC. Based on our lab tests, we feel EC is almost ready for production with the 3.7.3 release.

Pranith

--- Additional comment from Backer on 2015-08-05 10:09:17 CEST ---

I have tested 3.7.3 as well as the 3.7.2 nightly build (glusterfs-3.7.2-20150726.b639cb9.tar.gz) for the I/O error and hang issue. I found that 3.7.3 has a data corruption issue which is not present in the 3.7.2 nightly build (glusterfs-3.7.2-20150707.36f24f5.tar.gz). Data is corrupted after replacing a failed drive and running the self heal. We also see data corruption after recovery from a node failure, when the unavailable data chunks have been copied by the proactive self-heal daemon. You can reproduce the bug with the following steps:

Steps to reproduce:
1. Create a 3 x (4+2) disperse volume across the nodes.
2. FUSE mount on the client and start creating files/directories with mkdir and rsync/dd.
3. Bring down 2 of the nodes (nodes 5 & 6).
4. Write some files (e.g. filenew1, filenew2). The files will be available only on 4 nodes (nodes 1, 2, 3 & 4).
5. Calculate the md5sum of filenew1 and filenew2.
6. Bring up the 2 failed/down nodes (nodes 5 & 6).
7. Proactive self-healing will create the unavailable data chunks on the 2 nodes (nodes 5 & 6).
8. Once self-healing finishes, bring down another two nodes (nodes 1 & 2).
9. Try to get the md5sum of the same recovered file; there will be a mismatch in the md5sum value.

But this bug is not present in the 3.7.2 nightly build (glusterfs-3.7.2-20150707.36f24f5.tar.gz).

Also, I would like to know why proactive self-healing does not happen after replacing the failed drives. I have to manually run the volume heal command to heal the unavailable files (see the sketch below).
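
A condensed sketch of that corruption scenario, including the manual full heal mentioned above; the volume name (vaulttest21), client mount point and the way bricks are taken down are assumptions, and the same flow is exercised on a single host in the next comment:

# steps 1-2: volume already created and FUSE-mounted at /mnt/gluster (assumption)
# steps 3-5: with the bricks on nodes 5 and 6 killed, write and checksum new files
dd if=/dev/urandom of=/mnt/gluster/filenew1 bs=1M count=2
dd if=/dev/urandom of=/mnt/gluster/filenew2 bs=1M count=2
md5sum /mnt/gluster/filenew1 /mnt/gluster/filenew2

# steps 6-7: bring the down bricks back; after a disk replacement the proactive
# self-heal daemon does not pick the files up on its own, so launch a full heal manually
gluster volume start vaulttest21 force
gluster volume heal vaulttest21 full
gluster volume heal vaulttest21 info

# step 8: with healing finished, take down two of the originally-up nodes (1 and 2)
# and re-check; a changed md5sum here is the reported corruption
md5sum /mnt/gluster/filenew2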

--- Additional comment from Pranith Kumar K on 2015-08-05 11:24:47 CEST ---

hi Backer,
     Thanks for the quick reply. Based on your comment, I am assuming no hangs were observed. Auto-healing after replace-brick/disk replacement is something we are working on for 3.7.4; until then you need to execute "gluster volume heal ec2 full".

As for the data corruption bug, I am not able to re-create it:
Let me know if I missed any step:

root@localhost - ~ 
14:48:24 :) ⚡ glusterd && gluster volume create ec2 disperse 6 redundancy 2 `hostname`:/home/gfs/ec_{0..5} force && gluster volume start ec2 && mount -t glusterfs `hostname`:/ec2 /mnt/ec2
volume create: ec2: success: please start the volume to access data
volume start: ec2: success

#I disabled perf-xlators so that reads are served from the bricks always

root@localhost - ~ 
14:48:38 :( ⚡ ~/.scripts/disable-perf-xl.sh ec2
+ gluster volume set ec2 performance.quick-read off
volume set: success
+ gluster volume set ec2 performance.io-cache off
volume set: success
+ gluster volume set ec2 performance.write-behind off
volume set: success
+ gluster volume set ec2 performance.stat-prefetch off
volume set: success
+ gluster volume set ec2 performance.read-ahead off
volume set: success
+ gluster volume set ec2 performance.open-behind off
volume set: success

root@localhost - ~ 
14:48:47 :) ⚡ cd /mnt/ec2/

root@localhost - /mnt/ec2 
14:48:59 :) ⚡ gluster v status
Status of volume: ec2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick localhost.localdomain:/home/gfs/ec_0  49152     0          Y       14828
Brick localhost.localdomain:/home/gfs/ec_1  49153     0          Y       14846
Brick localhost.localdomain:/home/gfs/ec_2  49155     0          Y       14864
Brick localhost.localdomain:/home/gfs/ec_3  49156     0          Y       14882
Brick localhost.localdomain:/home/gfs/ec_4  49157     0          Y       14900
Brick localhost.localdomain:/home/gfs/ec_5  49158     0          Y       14918
NFS Server on localhost                     2049      0          Y       14937
 
Task Status of Volume ec2
------------------------------------------------------------------------------
There are no active volume tasks
 
root@localhost - /mnt/ec2 
14:49:02 :) ⚡ kill -9 14918 14900

root@localhost - /mnt/ec2 
14:49:11 :) ⚡ dd if=/dev/urandom of=1.txt bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.153835 s, 13.6 MB/s

root@localhost - /mnt/ec2 
14:49:15 :) ⚡ md5sum 1.txt
5ead68d0a60b8134f7daf0e8d1afe19c  1.txt

root@localhost - /mnt/ec2 
14:49:23 :) ⚡ gluster v start ec2 force
volume start: ec2: success

root@localhost - /mnt/ec2 
14:49:35 :) ⚡ gluster v heal ec2
Launching heal operation to perform index self heal on volume ec2 has been successful 
Use heal info commands to check status

root@localhost - /mnt/ec2 
14:49:39 :) ⚡ gluster v heal ec2 info
Brick localhost.localdomain:/home/gfs/ec_0/
/1.txt 
Number of entries: 1

Brick localhost.localdomain:/home/gfs/ec_1/
/1.txt 
Number of entries: 1

Brick localhost.localdomain:/home/gfs/ec_2/
/1.txt 
Number of entries: 1

Brick localhost.localdomain:/home/gfs/ec_3/
/1.txt 
Number of entries: 1

Brick localhost.localdomain:/home/gfs/ec_4/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_5/
Number of entries: 0


root@localhost - /mnt/ec2 
14:49:45 :) ⚡ gluster v heal ec2
Launching heal operation to perform index self heal on volume ec2 has been successful 
Use heal info commands to check status

root@localhost - /mnt/ec2 
14:49:47 :) ⚡ gluster v heal ec2 info
Brick localhost.localdomain:/home/gfs/ec_0/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_1/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_2/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_3/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_4/
Number of entries: 0

Brick localhost.localdomain:/home/gfs/ec_5/
Number of entries: 0


root@localhost - /mnt/ec2 
14:49:51 :) ⚡ gluster v status
Status of volume: ec2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick localhost.localdomain:/home/gfs/ec_0  49152     0          Y       14828
Brick localhost.localdomain:/home/gfs/ec_1  49153     0          Y       14846
Brick localhost.localdomain:/home/gfs/ec_2  49155     0          Y       14864
Brick localhost.localdomain:/home/gfs/ec_3  49156     0          Y       14882
Brick localhost.localdomain:/home/gfs/ec_4  49157     0          Y       15173
Brick localhost.localdomain:/home/gfs/ec_5  49158     0          Y       15191
NFS Server on localhost                     2049      0          Y       15211
 
Task Status of Volume ec2
------------------------------------------------------------------------------
There are no active volume tasks
 

root@localhost - /mnt/ec2 
14:49:56 :) ⚡ kill -9 14828 14846

root@localhost - /mnt/ec2 
14:50:03 :) ⚡ md5sum 1.txt
5ead68d0a60b8134f7daf0e8d1afe19c  1.txt

root@localhost - /mnt/ec2 
14:50:06 :) ⚡ cd

root@localhost - ~ 
14:50:13 :) ⚡ umount /mnt/ec2

root@localhost - ~ 
14:50:16 :) ⚡ mount -t glusterfs `hostname`:/ec2 /mnt/ec2

root@localhost - ~ 
14:50:19 :) ⚡ md5sum /mnt/ec2/1.txt
5ead68d0a60b8134f7daf0e8d1afe19c  /mnt/ec2/1.txt

--- Additional comment from Backer on 2015-08-06 10:46:11 CEST ---



--- Additional comment from Backer on 2015-08-06 10:49:10 CEST ---

I am getting inconsistent test results after disabling and enabling the perf-xlators. Please refer to the attachment.

root@gfs-tst-08:/home/qubevaultadmin# gluster --version
glusterfs 3.7.3 built on Jul 31 2015 17:03:01
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.


root@gfs-tst-08:/home/gfsadmin# gluster volume  info

Volume Name: vaulttest39
Type: Disperse
Volume ID: fcbed6b5-0654-489c-a29e-d18f737ac2f7
Status: Started
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.2.238:/media/disk1
Brick2: 10.1.2.238:/media/disk2
Brick3: 10.1.2.238:/media/disk3
Brick4: 10.1.2.238:/media/disk4
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.io-cache: off
performance.write-behind: off
performance.stat-prefetch: off
performance.read-ahead: off
performance.open-behind: off




gfsadmin@gfs-tst-08:~$ sudo gluster volume status
Status of volume: vaulttest39
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49152     0          Y       1560
Brick 10.1.2.238:/media/disk2               49153     0          Y       1568
Brick 10.1.2.238:/media/disk3               49154     0          Y       1576
Brick 10.1.2.238:/media/disk4               49155     0          Y       1582
NFS Server on localhost                     2049      0          Y       1544

Task Status of Volume vaulttest39
------------------------------------------------------------------------------
There are no active volume tasks

gfsadmin@gfs-tst-08:~$ sudo kill -9 1560
gfsadmin@gfs-tst-08:~$ sudo gluster volume status
Status of volume: vaulttest39
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               N/A       N/A        N       N/A
Brick 10.1.2.238:/media/disk2               49153     0          Y       1568
Brick 10.1.2.238:/media/disk3               49154     0          Y       1576
Brick 10.1.2.238:/media/disk4               49155     0          Y       1582
NFS Server on localhost                     2049      0          Y       1544

Task Status of Volume vaulttest39
------------------------------------------------------------------------------
There are no active volume tasks


root@gfs-tst-09:/mnt/gluster# dd if=/dev/urandom of=2.txt bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.226147 s, 9.3 MB/s
root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
cd9db53f9c090958ff8c033161576b95  2.txt


gfsadmin@gfs-tst-08:~$ ls -l -h /media/disk{1..4}
/media/disk1:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt

/media/disk2:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

/media/disk3:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

/media/disk4:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt




root@gfs-tst-08:/home/gfsadmin# gluster v start vaulttest39 force
volume start: vaulttest39: success
root@gfs-tst-08:/home/gfsadmin#  gluster v heal  vaulttest39
Launching heal operation to perform index self heal on volume vaulttest39 has been successful
Use heal info commands to check status
root@gfs-tst-08:/home/gfsadmin# gluster v heal  vaulttest39 info
Brick gfs-tst-08:/media/disk1/
Number of entries: 0

Brick gfs-tst-08:/media/disk2/
Number of entries: 0

Brick gfs-tst-08:/media/disk3/
Number of entries: 0

Brick gfs-tst-08:/media/disk4/
Number of entries: 0

root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest39
Launching heal operation to perform index self heal on volume vaulttest39 has been successful
Use heal info commands to check status
root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest39 info
Brick gfs-tst-08:/media/disk1/
Number of entries: 0

Brick gfs-tst-08:/media/disk2/
Number of entries: 0

Brick gfs-tst-08:/media/disk3/
Number of entries: 0

Brick gfs-tst-08:/media/disk4/
Number of entries: 0

root@gfs-tst-08:/home/gfsadmin#  ls -l -h /media/disk{1..4}
/media/disk1:
total 1004K
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

/media/disk2:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

/media/disk3:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

/media/disk4:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 13:58 1.txt
-rw-r--r-- 2 root root 683K Aug  6 13:59 2.txt

root@gfs-tst-08:/home/gfsadmin# gluster volume status
Status of volume: vaulttest39
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49152     0          Y       1721
Brick 10.1.2.238:/media/disk2               49153     0          Y       1568
Brick 10.1.2.238:/media/disk3               49154     0          Y       1576
Brick 10.1.2.238:/media/disk4               49155     0          Y       1582
NFS Server on localhost                     2049      0          Y       1740

Task Status of Volume vaulttest39
------------------------------------------------------------------------------
There are no active volume tasks


root@gfs-tst-08:/home/gfsadmin# kill -9 1582
root@gfs-tst-08:/home/gfsadmin# gluster volume status
Status of volume: vaulttest39
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49152     0          Y       1721
Brick 10.1.2.238:/media/disk2               49153     0          Y       1568
Brick 10.1.2.238:/media/disk3               49154     0          Y       1576
Brick 10.1.2.238:/media/disk4               N/A       N/A        N       N/A
NFS Server on localhost                     2049      0          Y       1740

Task Status of Volume vaulttest39
------------------------------------------------------------------------------
There are no active volume tasks


root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
cd9db53f9c090958ff8c033161576b95  2.txt
root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
cd9db53f9c090958ff8c033161576b95  2.txt
root@gfs-tst-09:/mnt/gluster# ls
1.txt  2.txt
root@gfs-tst-09:/mnt/gluster# ls
1.txt  2.txt
root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
70b40a7e3f5dc85345e466968416cde1  2.txt
root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
70b40a7e3f5dc85345e466968416cde1  2.txt
root@gfs-tst-09:/mnt/gluster# md5sum 2.txt
70b40a7e3f5dc85345e466968416cde1  2.txt
root@gfs-tst-09:/mnt/gluster#

--- Additional comment from Backer on 2015-08-06 16:07:10 CEST ---

I have created a new volume once again and confirmed the bug.

root@gfs-tst-08:/home/gfsadmin# gluster volume create vaulttest52 disperse-data 3 redundancy 1 10.1.2.238:/media/disk{1..4} force
root@gfs-tst-08:/home/gfsadmin# gluster v start vaulttest52

root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1574
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1590
NFS Server on localhost                     2049      0          Y       1558

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# gluster v info

Volume Name: vaulttest52
Type: Disperse
Volume ID: 0b0b3f8f-acb9-4e2c-a029-fcb89f85b1e7
Status: Started
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.2.238:/media/disk1
Brick2: 10.1.2.238:/media/disk2
Brick3: 10.1.2.238:/media/disk3
Brick4: 10.1.2.238:/media/disk4
Options Reconfigured:
performance.readdir-ahead: on


gfsadmin@gfs-tst-09:/mnt/gluster$ sudo dd if=/dev/urandom of=1.txt bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.208704 s, 10.0 MB/s
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 1.txt
1233b5321315c05abb4668cc9a1d9d25  1.txt

root@gfs-tst-08:/home/gfsadmin# ls -l -h /media/disk{1..4}
/media/disk1:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt

/media/disk2:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt

/media/disk3:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt

/media/disk4:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt



root@gfs-tst-08:/home/gfsadmin# kill -9 1574
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               N/A       N/A        N       N/A
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1590
NFS Server on localhost                     2049      0          Y       1558

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks


gfsadmin@gfs-tst-09:/mnt/gluster$ sudo dd if=/dev/urandom of=2.txt bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.205401 s, 10.2 MB/s

gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
9c8b37847622efbf2ec75c683166de97  2.txt


root@gfs-tst-08:/home/gfsadmin# ls -l -h /media/disk{1..4}
/media/disk1:
total 960K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt

/media/disk2:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

/media/disk3:
total 1.9M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

/media/disk4:
total 1.4M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

root@gfs-tst-08:/home/gfsadmin# gluster v start vaulttest52 force
volume start: vaulttest52: success
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1590
NFS Server on localhost                     2049      0          Y       1758

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest52
Launching heal operation to perform index self heal on volume vaulttest52 has been successful
Use heal info commands to check status
root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest52 info
Brick gfs-tst-08:/media/disk1/
Number of entries: 0

Brick gfs-tst-08:/media/disk2/
Number of entries: 0

Brick gfs-tst-08:/media/disk3/
Number of entries: 0

Brick gfs-tst-08:/media/disk4/
Number of entries: 0

root@gfs-tst-08:/home/gfsadmin# ls -l -h /media/disk{1..4}
/media/disk1:
total 728K
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

/media/disk2:
total 1.4M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

/media/disk3:
total 1.4M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

/media/disk4:
total 1.4M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt

root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1590
NFS Server on localhost                     2049      0          Y       1758

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# kill -9 1590
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               N/A       N/A        N       N/A
NFS Server on localhost                     2049      0          Y       1758

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks


gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
96f6f469f4b743b4a575fdc408b5f007  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
96f6f469f4b743b4a575fdc408b5f007  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
96f6f469f4b743b4a575fdc408b5f007  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
96f6f469f4b743b4a575fdc408b5f007  2.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 2.txt
96f6f469f4b743b4a575fdc408b5f007  2.txt

=====================================
The MD5SUM has changed
====================================


root@gfs-tst-08:/home/gfsadmin# gluster v start vaulttest52 force
volume start: vaulttest52: success

root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1852
NFS Server on localhost                     2049      0          Y       1871

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks


======================================
disabled perf-xlators
=====================================

root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.quick-read off
gluster volume set vaulttest52 performance.io-cache off
gluster volume set vaulttest52 performance.write-behind off
gluster volume set vaulttest52 performance.stat-prefetch off
gluster volume set vaulttest52 performance.read-ahead off
gluster volume set vaulttest52 performance.open-behind off
volume set: success
root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.io-cache off
volume set: success
root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.write-behind off
volume set: success
root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.stat-prefetch off
volume set: success
root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.read-ahead off
volume set: success
root@gfs-tst-08:/home/gfsadmin# gluster volume set vaulttest52 performance.open-behind off
volume set: success

root@gfs-tst-08:/home/gfsadmin# gluster v info

Volume Name: vaulttest52
Type: Disperse
Volume ID: 0b0b3f8f-acb9-4e2c-a029-fcb89f85b1e7
Status: Started
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.2.238:/media/disk1
Brick2: 10.1.2.238:/media/disk2
Brick3: 10.1.2.238:/media/disk3
Brick4: 10.1.2.238:/media/disk4
Options Reconfigured:
performance.open-behind: off
performance.read-ahead: off
performance.stat-prefetch: off
performance.write-behind: off
performance.io-cache: off
performance.quick-read: off
performance.readdir-ahead: on


root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       1852
NFS Server on localhost                     2049      0          Y       1871

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# kill -9 1852
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               N/A       N/A        N       N/A
NFS Server on localhost                     2049      0          Y       1871

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks


gfsadmin@gfs-tst-09:/mnt/gluster$ sudo dd if=/dev/urandom of=3.txt bs=5M count=10
10+0 records in
10+0 records out
52428800 bytes (52 MB) copied, 5.40714 s, 9.7 MB/s

gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
fa9d9d3e298d01c8cf54855968784b83  3.txt

root@gfs-tst-08:/home/gfsadmin# gluster v start vaulttest52 force
volume start: vaulttest52: success
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       2017
NFS Server on localhost                     N/A       N/A        N       N/A

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest52
Launching heal operation to perform index self heal on volume vaulttest52 has been successful
Use heal info commands to check status
root@gfs-tst-08:/home/gfsadmin# gluster v heal vaulttest52  info
Brick gfs-tst-08:/media/disk1/
Number of entries: 0

Brick gfs-tst-08:/media/disk2/
Number of entries: 0

Brick gfs-tst-08:/media/disk3/
Number of entries: 0

Brick gfs-tst-08:/media/disk4/
Number of entries: 0

root@gfs-tst-08:/home/gfsadmin# ls -l -h /media/disk{1..4}
/media/disk1:
total 33M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt
-rw-r--r-- 2 root root  17M Aug  6 19:26 3.txt

/media/disk2:
total 34M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt
-rw-r--r-- 2 root root  17M Aug  6 19:26 3.txt

/media/disk3:
total 34M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt
-rw-r--r-- 2 root root  17M Aug  6 19:26 3.txt

/media/disk4:
total 1.4M
-rw-r--r-- 2 root root 683K Aug  6 19:14 1.txt
-rw-r--r-- 2 root root 683K Aug  6 19:16 2.txt
-rw-r--r-- 2 root root  17M Aug  6 19:26 3.txt

root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               49173     0          Y       1582
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       2017
NFS Server on localhost                     2049      0          Y       2036

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks

root@gfs-tst-08:/home/gfsadmin# kill -9 1582
root@gfs-tst-08:/home/gfsadmin# gluster v status
Status of volume: vaulttest52
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.1.2.238:/media/disk1               49172     0          Y       1739
Brick 10.1.2.238:/media/disk2               N/A       N/A        N       N/A
Brick 10.1.2.238:/media/disk3               49174     0          Y       1595
Brick 10.1.2.238:/media/disk4               49175     0          Y       2017
NFS Server on localhost                     2049      0          Y       2036

Task Status of Volume vaulttest52
------------------------------------------------------------------------------
There are no active volume tasks


gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
fa9d9d3e298d01c8cf54855968784b83  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
fa9d9d3e298d01c8cf54855968784b83  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
fa9d9d3e298d01c8cf54855968784b83  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
fa9d9d3e298d01c8cf54855968784b83  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ ls
1.txt  2.txt  3.txt
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
ea50603ce500b29c73dca6a9c733eb7a  3.txt


gfsadmin@gfs-tst-09:/$ sudo umount /mnt/gluster
gfsadmin@gfs-tst-09:/$ sudo mount -t glusterfs 10.1.2.238:/vaulttest52 /mnt/gluster/

gfsadmin@gfs-tst-09:/$ cd /mnt/gluster/
gfsadmin@gfs-tst-09:/mnt/gluster$ md5sum 3.txt
ea50603ce500b29c73dca6a9c733eb7a  3.txt

After running the ls command in the mounted directory, the md5sum hash changed.

Comment 1 Anand Avati 2015-08-07 10:41:43 UTC
REVIEW: http://review.gluster.org/11862 (cluster/ec: Fix write size in self-heal) posted (#1) for review on master by Xavier Hernandez (xhernandez)

Comment 2 Anand Avati 2015-08-14 09:09:52 UTC
COMMIT: http://review.gluster.org/11862 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 289d00369f0ddb78f534735f7d3bf86268adac60
Author: Xavier Hernandez <xhernandez>
Date:   Fri Aug 7 12:37:52 2015 +0200

    cluster/ec: Fix write size in self-heal
    
    Self-heal was always using a fixed block size to heal a file. This
    was incorrect for dispersed volumes with a number of data bricks not
    being a power of 2.
    
    This patch adjusts the block size to a multiple of the stripe size
    of the volume. It also propagates errors detected during the data
    heal to stop healing the file and not mark it as healed.
    
    Change-Id: I9ee3fde98a9e5d6116fd096ceef88686fd1d28e2
    BUG: 1251446
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/11862
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
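
The commit above replaces a fixed self-heal block size with one aligned to the volume's stripe size. A rough numeric illustration of that rounding, with all sizes assumed for the example (none are taken from the patch):

data_bricks=3                                  # e.g. the 3+1 volume from the corruption reports
fragment=512                                   # assumed per-brick fragment size
stripe=$((data_bricks * fragment))             # 1536 bytes per full stripe
fixed_block=$((128 * 1024))                    # an assumed fixed heal write size
echo "remainder: $((fixed_block % stripe))"    # 512 -> 128KiB is not stripe-aligned
aligned=$((fixed_block / stripe * stripe))     # round down to a stripe multiple: 130560
echo "aligned heal block: $aligned"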

Comment 3 Niels de Vos 2016-06-16 13:29:42 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

