Bug 1324239

Summary: Inconsistent Volume Heal Outputs
Product: [Community] GlusterFS
Reporter: Nathan Hill <Sustugriel>
Component: replicate
Assignee: Krutika Dhananjay <kdhananj>
Status: CLOSED NEXTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.7.10
CC: bugs, kdhananj
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-01 12:30:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: glustershd for server with chatty output. Heals are as a result of glusterd restarts.
Flags: none

Description Nathan Hill 2016-04-05 22:56:04 UTC
Created attachment 1144029
glustershd for server with chatty output. Heals are as a result of glusterd restarts.

Description of problem:

Since upgrading to 3.7.10, I have been seeing inconsistent and possibly erroneous output from the "volume heal <vol> info" command.


Version-Release number of selected component (if applicable): 3.7.10-1


How reproducible: Unknown


Steps to Reproduce:
1. With a synchronized 3.7.9 volume, upgrade all bricks to 3.7.10.
2. Allow all bricks to synchronize.
3. Start using the volume in production (in this case, as virtual machine storage).
4. Run "gluster volume heal <vol> info".

Actual results:

Inconsistent or chatty output. This could indicate an out-of-sync volume, but that seems unlikely: no bricks went down, and this was not experienced in 3.7.9.

Expected results:

Empty entry list in command output.

Additional info:

Volume Name: TriDistRepl
Type: Distributed-Replicate
Volume ID: 08a2dad5-0d43-4458-8c5b-a17add9cb3e4
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: rhev.styx.local:/virtual/TriDistRepl
Brick2: host.styx.local:/virtual/TriDistRepl
Brick3: gfs1.styx.local:/virtual/TriDistRepl
Brick4: repl.styx.local:/virtual/TriDistRepl
Brick5: fs1.styx.local:/media/DataPartRaid5/virtual/TriDistRepl
Brick6: gfs2.styx.local:/virtual/TriDistRepl
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
server.allow-insecure: on
storage.owner-gid: 36
network.ping-timeout: 10
features.shard-block-size: 512MB
features.shard: on
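
When reading the heal output that follows, it may help to see how the six bricks of a "2 x 3 = 6" volume would fall into replica sets. The sketch below assumes the usual GlusterFS convention that consecutive bricks in the listing form one replica set; the helper name replica_sets is hypothetical and is not part of any Gluster tooling.

```python
# Minimal sketch: group an ordered brick list into replica sets.
# Assumption: consecutive bricks in "volume info" order form one replica set.

BRICKS = [
    "rhev.styx.local:/virtual/TriDistRepl",
    "host.styx.local:/virtual/TriDistRepl",
    "gfs1.styx.local:/virtual/TriDistRepl",
    "repl.styx.local:/virtual/TriDistRepl",
    "fs1.styx.local:/media/DataPartRaid5/virtual/TriDistRepl",
    "gfs2.styx.local:/virtual/TriDistRepl",
]


def replica_sets(bricks, replica_count):
    """Split an ordered brick list into consecutive groups of replica_count."""
    if replica_count <= 0 or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a positive multiple of replica_count")
    return [
        bricks[i:i + replica_count]
        for i in range(0, len(bricks), replica_count)
    ]


if __name__ == "__main__":
    # 2 distribute subvolumes x 3 replicas = 6 bricks
    for number, group in enumerate(replica_sets(BRICKS, 3), start=1):
        print(f"replica set {number}:")
        for brick in group:
            print(f"  {brick}")
```

Under that assumption, Brick1 to Brick3 form one replica set and Brick4 to Brick6 the other.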

Sample Volume Heal Output:
gluster> vol heal TriDistRepl info
Brick rhev.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick host.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick gfs1.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick repl.styx.local:/virtual/TriDistRepl
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/20475fd1-d0eb-4cb6-82a9-09d43a07397f/ba91e4f9-d144-41a0-901e-b7b78ae395b5
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/1d9867ba-37ce-4b27-af1e-2d965e5338c0/7a518bca-a2e3-4b50-a007-b2a8913a5ce4
Status: Connected
Number of entries: 2

Brick fs1.styx.local:/media/DataPartRaid5/virtual/TriDistRepl
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/20475fd1-d0eb-4cb6-82a9-09d43a07397f/ba91e4f9-d144-41a0-901e-b7b78ae395b5
Status: Connected
Number of entries: 1

Brick gfs2.styx.local:/virtual/TriDistRepl
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/f19279f0-2839-44d6-b36c-09f0995ea47f/be06560c-2edf-44fa-ae4b-cbf68961e838
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/c0c9d41c-e354-4e34-9e8f-d03066bd0d5c/3e9e133e-248a-404a-a93e-5ead81d43fa4
Status: Connected
Number of entries: 2


Same command run from a different brick in the same volume:

gluster> vol heal TriDistRepl info
Brick rhev.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick host.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick gfs1.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick repl.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick fs1.styx.local:/media/DataPartRaid5/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick gfs2.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0
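
The two runs above can also be compared mechanically. The following is a minimal sketch, assuming the output layout is exactly as in the samples in this report (a "Brick ..." line, zero or more entry lines, then "Status:" and "Number of entries:"); other GlusterFS versions may print something different. The function names parse_heal_info and diff_runs are hypothetical helpers, and the sample text is trimmed to two bricks from this report.

```python
# Minimal sketch: parse two captured "vol heal <vol> info" outputs and
# report the bricks whose pending-heal entry lists differ between runs.
# Assumption: the text layout matches the samples in this report.

SAMPLE_A = """\
gluster> vol heal TriDistRepl info
Brick rhev.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick repl.styx.local:/virtual/TriDistRepl
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/20475fd1-d0eb-4cb6-82a9-09d43a07397f/ba91e4f9-d144-41a0-901e-b7b78ae395b5
/5d8b9ed2-7dae-472f-90f7-904085d4dbf9/images/1d9867ba-37ce-4b27-af1e-2d965e5338c0/7a518bca-a2e3-4b50-a007-b2a8913a5ce4
Status: Connected
Number of entries: 2
"""

SAMPLE_B = """\
gluster> vol heal TriDistRepl info
Brick rhev.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0

Brick repl.styx.local:/virtual/TriDistRepl
Status: Connected
Number of entries: 0
"""


def parse_heal_info(text):
    """Return {brick: [entry, ...]} from heal-info style output."""
    result = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            result[brick] = []
        elif brick is None or line.startswith(("Status:", "Number of entries:")):
            # Skip the prompt line and the per-brick summary lines.
            continue
        else:
            result[brick].append(line)
    return result


def diff_runs(run_a, run_b):
    """Return {brick: (only_in_a, only_in_b)} for bricks whose entries differ."""
    diffs = {}
    for brick in sorted(set(run_a) | set(run_b)):
        a = set(run_a.get(brick, []))
        b = set(run_b.get(brick, []))
        if a != b:
            diffs[brick] = (sorted(a - b), sorted(b - a))
    return diffs


if __name__ == "__main__":
    differences = diff_runs(parse_heal_info(SAMPLE_A), parse_heal_info(SAMPLE_B))
    for brick, (only_a, only_b) in differences.items():
        print(f"{brick}: {len(only_a)} entries only in run A, {len(only_b)} only in run B")
```

A difference between two runs does not by itself show which run is wrong; entries can legitimately appear and disappear while files are being written, so the comparison is only a starting point for investigation.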


Other troubleshooting steps taken:
- Restarted the bricks experiencing the chatty output.

Comment 1 Nathan Hill 2016-07-01 12:30:51 UTC
Resolved in 3.7.12. Not sure why.