Description of problem:
=======================
Though heal works on a disperse volume from the server side, the "unsuccessful" message is misleading, as shown below:

[root@vertigo bricks]# gluster v heal testvol full
Launching heal operation to perform full self heal on volume testvol has been unsuccessful

If the same command is run on the peer, a "Commit failed" message is thrown:

[root@ninja ~]# gluster v heal testvol full
Commit failed on 10.70.34.56. Please check log file for details.

[root@ninja ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.34.56
Uuid: 5656b0cb-9f99-4e7b-9125-95ea80b0c9a1
State: Peer in Cluster (Connected)
[root@ninja ~]#

Version-Release number of selected component (if applicable):
=============================================================
[root@vertigo bricks]# gluster --version
glusterfs 3.7dev built on Mar 12 2015 01:40:59
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@vertigo bricks]#

How reproducible:
=================
100%

Steps to Reproduce:
1. Create a disperse volume and create files and directories from a client
2. Bring down 2 of the bricks and let the IO continue
3. Bring the bricks back up after some time and trigger heal with "gluster v heal <volname> full"
(A scripted sketch of these steps follows the volume info below.)

Actual results:
Heal is triggered and works from the server side, but the CLI reports
"Launching heal operation to perform full self heal on volume testvol has been unsuccessful",
and "Commit failed on 10.70.34.56" when the command is run from the peer node.

Expected results:
================
The command should report that the heal operation was launched successfully.

Additional info:
================
Sosreport of the node will be attached.

Gluster volume options :
========================
[root@ninja ~]# gluster v get testvol all
Option                                   Value
------                                   -----
cluster.lookup-unhashed                  on
cluster.min-free-disk                    10%
cluster.min-free-inodes                  5%
cluster.rebalance-stats                  off
cluster.subvols-per-directory            (null)
cluster.readdir-optimize                 off
cluster.rsync-hash-regex                 (null)
cluster.extra-hash-regex                 (null)
cluster.dht-xattr-name                   trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid     off
cluster.local-volume-name                (null)
cluster.weighted-rebalance               on
cluster.switch-pattern                   (null)
cluster.entry-change-log                 on
cluster.read-subvolume                   (null)
cluster.read-subvolume-index             -1
cluster.read-hash-mode                   1
cluster.background-self-heal-count       16
cluster.metadata-self-heal               on
cluster.data-self-heal                   on
cluster.entry-self-heal                  on
cluster.self-heal-daemon                 on
cluster.heal-timeout                     600
cluster.self-heal-window-size            1
cluster.data-change-log                  on
cluster.metadata-change-log              on
cluster.data-self-heal-algorithm         (null)
cluster.eager-lock                       on
cluster.quorum-type                      none
cluster.quorum-count                     (null)
cluster.choose-local                     true
cluster.self-heal-readdir-size           1KB
cluster.post-op-delay-secs               1
cluster.ensure-durability                on
cluster.stripe-block-size                128KB
cluster.stripe-coalesce                  true
diagnostics.latency-measurement          off
diagnostics.dump-fd-stats                off
diagnostics.count-fop-hits               off
diagnostics.brick-log-level              INFO
diagnostics.client-log-level             INFO
diagnostics.brick-sys-log-level          CRITICAL
diagnostics.client-sys-log-level         CRITICAL
diagnostics.brick-logger                 (null)
diagnostics.client-logger                (null)
diagnostics.brick-log-format             (null)
diagnostics.client-log-format            (null)
diagnostics.brick-log-buf-size           5
diagnostics.client-log-buf-size          5
diagnostics.brick-log-flush-timeout      120
diagnostics.client-log-flush-timeout     120
performance.cache-max-file-size          0
performance.cache-min-file-size          0
performance.cache-refresh-timeout        1
performance.cache-priority
performance.cache-size                   32MB
performance.io-thread-count              16
performance.high-prio-threads            16
performance.normal-prio-threads          16
performance.low-prio-threads             16
performance.least-prio-threads           1
performance.enable-least-priority        on
performance.least-rate-limit             0
performance.cache-size                   128MB
performance.flush-behind                 on
performance.nfs.flush-behind             on
performance.write-behind-window-size     1MB
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct              off
performance.nfs.strict-o-direct          off
performance.strict-write-ordering        off
performance.nfs.strict-write-ordering    off
performance.lazy-open                    yes
performance.read-after-open              no
performance.read-ahead-page-count        4
performance.md-cache-timeout             1
features.encryption                      off
encryption.master-key                    (null)
encryption.data-key-size                 256
encryption.block-size                    4096
network.frame-timeout                    1800
network.ping-timeout                     42
network.tcp-window-size                  (null)
features.lock-heal                       off
features.grace-timeout                   10
network.remote-dio                       disable
client.event-threads                     4
network.tcp-window-size                  (null)
network.inode-lru-limit                  16384
auth.allow                               *
auth.reject                              (null)
transport.keepalive                      (null)
server.allow-insecure                    (null)
server.root-squash                       off
server.anonuid                           65534
server.anongid                           65534
server.statedump-path                    /var/run/gluster
server.outstanding-rpc-limit             64
features.lock-heal                       off
features.grace-timeout                   (null)
server.ssl                               (null)
auth.ssl-allow                           *
server.manage-gids                       off
client.send-gids                         on
server.gid-timeout                       2
server.own-thread                        (null)
server.event-threads                     4
performance.write-behind                 on
performance.read-ahead                   on
performance.readdir-ahead                off
performance.io-cache                     on
performance.quick-read                   on
performance.open-behind                  on
performance.stat-prefetch                on
performance.client-io-threads            off
performance.nfs.write-behind             on
performance.nfs.read-ahead               off
performance.nfs.io-cache                 off
performance.nfs.quick-read               off
performance.nfs.stat-prefetch            off
performance.nfs.io-threads               off
performance.force-readdirp               true
features.file-snapshot                   off
features.uss                             on
features.snapshot-directory              .snaps
features.show-snapshot-directory         off
network.compression                      off
network.compression.window-size          -15
network.compression.mem-level            8
network.compression.min-size             0
network.compression.compression-level    -1
network.compression.debug                false
features.limit-usage                     (null)
features.quota-timeout                   0
features.default-soft-limit              80%
features.soft-timeout                    60
features.hard-timeout                    5
features.alert-time                      86400
features.quota-deem-statfs               on
geo-replication.indexing                 off
geo-replication.indexing                 off
geo-replication.ignore-pid-check         off
geo-replication.ignore-pid-check         off
features.quota                           on
debug.trace                              off
debug.log-history                        no
debug.log-file                           no
debug.exclude-ops                        (null)
debug.include-ops                        (null)
debug.error-gen                          off
debug.error-failure                      (null)
debug.error-number                       (null)
debug.random-failure                     off
debug.error-fops                         (null)
nfs.enable-ino32                         no
nfs.mem-factor                           15
nfs.export-dirs                          on
nfs.export-volumes                       on
nfs.addr-namelookup                      off
nfs.dynamic-volumes                      off
nfs.register-with-portmap                on
nfs.outstanding-rpc-limit                16
nfs.port                                 2049
nfs.rpc-auth-unix                        on
nfs.rpc-auth-null                        on
nfs.rpc-auth-allow                       all
nfs.rpc-auth-reject                      none
nfs.ports-insecure                       off
nfs.trusted-sync                         off
nfs.trusted-write                        off
nfs.volume-access                        read-write
nfs.export-dir
nfs.disable                              false
nfs.nlm                                  on
nfs.acl                                  on
nfs.mount-udp                            off
nfs.mount-rmtab                          /var/lib/glusterd/nfs/rmtab
nfs.rpc-statd                            /sbin/rpc.statd
nfs.server-aux-gids                      off
nfs.drc                                  off
nfs.drc-size                             0x20000
nfs.read-size                            (1 * 1048576ULL)
nfs.write-size                           (1 * 1048576ULL)
nfs.readdir-size                         (1 * 1048576ULL)
features.read-only                       off
features.worm                            off
storage.linux-aio                        off
storage.batch-fsync-mode                 reverse-fsync
storage.batch-fsync-delay-usec           0
storage.owner-uid                        -1
storage.owner-gid                        -1
storage.node-uuid-pathinfo               off
storage.health-check-interval            30
storage.build-pgfid                      off
storage.bd-aio                           off
cluster.server-quorum-type               off
cluster.server-quorum-ratio              0
changelog.changelog                      off
changelog.changelog-dir                  (null)
changelog.encoding                       ascii
changelog.rollover-time                  15
changelog.fsync-interval                 5
changelog.changelog-barrier-timeout      120
features.barrier                         disable
features.barrier-timeout                 120
locks.trace                              (null)
cluster.disperse-self-heal-daemon        enable
cluster.quorum-reads                     no
client.bind-insecure                     (null)
[root@ninja ~]#

Gluster volume status & info :
==============================
[root@ninja ~]# gluster v status
Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vertigo:/rhs/brick1/b1                49152     0          Y       4237
Brick ninja:/rhs/brick1/b1                  49152     0          Y       4161
Brick vertigo:/rhs/brick2/b2                49153     0          Y       4255
Brick ninja:/rhs/brick2/b2                  49153     0          Y       4176
Brick vertigo:/rhs/brick3/b3                49154     0          Y       2748
Brick ninja:/rhs/brick3/b3                  49154     0          Y       2524
Brick vertigo:/rhs/brick4/b4                49155     0          Y       2761
Brick ninja:/rhs/brick4/b4                  49155     0          Y       2537
Brick vertigo:/rhs/brick1/b1-1              49156     0          Y       2203
Brick ninja:/rhs/brick1/b1-1                49156     0          Y       2550
Brick vertigo:/rhs/brick2/b2-1              49157     0          Y       2218
Brick ninja:/rhs/brick2/b2-1                49157     0          Y       2563
Snapshot Daemon on localhost                49158     0          Y       2577
NFS Server on localhost                     2049      0          Y       4192
Quota Daemon on localhost                   N/A       N/A        Y       4210
Snapshot Daemon on 10.70.34.56              49158     0          Y       2801
NFS Server on 10.70.34.56                   2049      0          Y       682
Quota Daemon on 10.70.34.56                 N/A       N/A        Y       701

Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@ninja ~]#
=========================================================================
[root@ninja ~]# gluster v info

Volume Name: testvol
Type: Disperse
Volume ID: 7393260c-51d1-4dca-8fc8-e1f5ad6fee14
Status: Started
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Bricks:
Brick1: vertigo:/rhs/brick1/b1
Brick2: ninja:/rhs/brick1/b1
Brick3: vertigo:/rhs/brick2/b2
Brick4: ninja:/rhs/brick2/b2
Brick5: vertigo:/rhs/brick3/b3
Brick6: ninja:/rhs/brick3/b3
Brick7: vertigo:/rhs/brick4/b4
Brick8: ninja:/rhs/brick4/b4
Brick9: vertigo:/rhs/brick1/b1-1
Brick10: ninja:/rhs/brick1/b1-1
Brick11: vertigo:/rhs/brick2/b2-1
Brick12: ninja:/rhs/brick2/b2-1
Options Reconfigured:
features.quota-deem-statfs: on
cluster.disperse-self-heal-daemon: enable
features.uss: on
client.event-threads: 4
server.event-threads: 4
features.quota: on
[root@ninja ~]#
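Reproduction sketch :
=====================
For convenience, a minimal shell sketch of the steps to reproduce above. The volume name, peer hostnames (vertigo, ninja) and brick paths are taken from the volume info above; the client mount point, the IO workload, and the way the bricks are taken offline (killing the brick processes and restarting them with "start force") are assumptions, since the report does not state them.

#!/bin/bash
VOL=testvol

# 1. Create a 1 x (8 + 4) disperse volume across the two peers and start it.
#    (Append "force" to the create command if the CLI warns about having
#    several bricks of the disperse set on the same server.)
gluster volume create $VOL disperse 12 redundancy 4 \
    vertigo:/rhs/brick1/b1   ninja:/rhs/brick1/b1 \
    vertigo:/rhs/brick2/b2   ninja:/rhs/brick2/b2 \
    vertigo:/rhs/brick3/b3   ninja:/rhs/brick3/b3 \
    vertigo:/rhs/brick4/b4   ninja:/rhs/brick4/b4 \
    vertigo:/rhs/brick1/b1-1 ninja:/rhs/brick1/b1-1 \
    vertigo:/rhs/brick2/b2-1 ninja:/rhs/brick2/b2-1
gluster volume start $VOL

# 2. Mount the volume on a client (e.g. mount -t glusterfs vertigo:/$VOL /mnt/$VOL,
#    assumed mount point) and create files and directories there. While the IO is
#    running, bring down 2 bricks, e.g. by killing their glusterfsd processes on
#    the nodes that host them (assumed method).

# 3. After some time, bring the killed bricks back up and trigger a full heal.
gluster volume start $VOL force
gluster volume heal $VOL full     # reports "unsuccessful" although heal is launched

# Heal does proceed on the server side; its progress can be watched with:
gluster volume heal $VOL info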
REVIEW: http://review.gluster.org/11267 (ec: Display correct message after successful heal start) posted (#1) for review on master by Ashish Pandey (aspandey)
This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user