Bug 1524325 - wrong healing source after upgrade
Summary: wrong healing source after upgrade
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: bitrot
Version: 3.10
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-12-11 09:25 UTC by Dmitry Melekhov
Modified: 2018-06-20 18:24 UTC
CC List: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-06-20 18:24:51 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
logs (266.67 KB, application/x-gzip), 2017-12-11 09:25 UTC, Dmitry Melekhov

Description Dmitry Melekhov 2017-12-11 09:25:50 UTC
Created attachment 1365830
logs

Description of problem:

We run a 2-node cluster with a replicated volume; yes, this is not a recommended setup, but...
The node names are father and son.
VMs and gluster run on these nodes.

We moved all VMs to one node (namely father).
We upgraded gluster from 3.10.7 to 3.10.8 on one of the nodes (namely son) and rebooted it.
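For reference, the upgrade itself was just the usual package update; a minimal sketch, assuming the stock CentOS gluster packages:

[root@son ~]# yum update 'glusterfs*'
[root@son ~]# reboot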
After this we saw that healing for one of the VM images was running from son to father:

[root@son ~]# gluster volume heal pool info
Brick father:/wall/pool/brick
/shador.img 
/balamak.img 
/devaron.img 
/talita.img 
Status: Connected
Number of entries: 4

Brick son:/wall/pool/brick
/endor.img 
Status: Connected
Number of entries: 1


And the image became broken.

Bitrot detection was enabled on this volume, and it looks like it is the only process which accessed the local data on son during boot (please look into the attached logs).

We have disabled bitrot detection for now.
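For reference, bitrot detection is toggled per volume; a minimal sketch with our volume name (pool):

[root@son ~]# gluster volume bitrot pool disable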

Version-Release number of selected component (if applicable):

CentOS 7.4, gluster 3.10.7 and 3.10.8.


How reproducible:

We don't know how to reproduce it reliably.

Steps to Reproduce:
1. Install a 2-node gluster cluster with a replicated volume (a setup sketch follows below)
2. Put VMs on it
3. Upgrade gluster on one node
4. Reboot that node

Maybe just a reboot is enough; we don't know.
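A minimal sketch of step 1, using our names (volume pool, nodes father and son, bricks under /wall/pool/brick); it assumes glusterd is already running on both nodes:

[root@father ~]# gluster peer probe son
[root@father ~]# gluster volume create pool replica 2 father:/wall/pool/brick son:/wall/pool/brick
[root@father ~]# gluster volume start pool
[root@father ~]# gluster volume bitrot pool enable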

Actual results:
Some (one in our case) VM images are broken, because they were healed from stale data.


Expected results:

Healthy data on the cluster.


Thank you!

Comment 1 Dmitry Melekhov 2017-12-11 12:03:26 UTC
BTW, we upgraded and rebooted the second node (father) with bitrot detection turned off,
and everything is fine.

Comment 2 Dmitry Melekhov 2017-12-13 04:35:25 UTC
Also, I don't think this is a replicate issue.
I guess this is caused by bitrot: if there was no I/O in the VM, then it may change metadata on the wrong server.
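If it helps to cross-check, the scrubber state can be inspected per volume; a minimal sketch with our volume name:

[root@son ~]# gluster volume bitrot pool scrub status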

Comment 3 Shyamsundar 2018-06-20 18:24:51 UTC
This bug was reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.

