Bug 1524325

Summary:

wrong healing source after upgrade

Product:

[Community] GlusterFS

Reporter:

Dmitry Melekhov <dm>

Component:

bitrot

Assignee:

bugs <bugs>

Status:

CLOSED EOL

QA Contact:

Severity:

high

Docs Contact:

bugs <bugs>

Priority:

unspecified

Version:

3.10

CC:

amukherj, bugs

Target Milestone:

---

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2018-06-20 18:24:51 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
logs	none

Description Dmitry Melekhov 2017-12-11 09:25:50 UTC

Created attachment 1365830
logs

Description of problem:

We run a 2-node cluster with a replicated volume. Yes, this is not a recommended setup, but...
The nodes are named father and son.
The VMs and gluster both run on these nodes.

We moved all VMs to one node (father).
We upgraded gluster from 3.10.7 to 3.10.8 on the other node (son) and rebooted it.
After this we saw that healing for one of the VM images was running from son to father:

[root@son ~]# gluster volume heal pool info
Brick father:/wall/pool/brick
/shador.img 
/balamak.img 
/devaron.img 
/talita.img 
Status: Connected
Number of entries: 4

Brick son:/wall/pool/brick
/endor.img 
Status: Connected
Number of entries: 1
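
For anyone comparing this output between nodes or over time, the listing above can be turned into a per-brick mapping. The following is only a minimal sketch: the function name `parse_heal_info` is hypothetical, the sample text is copied from the output above, and it assumes the plain-text layout shown here (a `Brick` line, one entry per line, then `Status:` and `Number of entries:`).

```python
def parse_heal_info(text):
    """Map each brick to the list of entries it reports in heal info.

    Assumes the plain-text layout shown in this report; other gluster
    versions may format the output differently.
    """
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            # Start of a new brick section, e.g. "Brick father:/wall/pool/brick"
            current = line[len("Brick "):]
            bricks[current] = []
        elif line.startswith(("Status:", "Number of entries:")):
            # Summary lines carry no entry names
            continue
        elif current is not None:
            bricks[current].append(line)
    return bricks


# Sample copied from the heal info output in this report
SAMPLE = """\
Brick father:/wall/pool/brick
/shador.img
/balamak.img
/devaron.img
/talita.img
Status: Connected
Number of entries: 4

Brick son:/wall/pool/brick
/endor.img
Status: Connected
Number of entries: 1
"""

if __name__ == "__main__":
    for brick, entries in parse_heal_info(SAMPLE).items():
        print(brick, len(entries), entries)
```

Here the only entry listed under the son brick is /endor.img, which is the image that ended up broken.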


And the image became broken.

Bitrot detection was enabled on this volume and, it looks like, it was the only process that accessed local data on son during boot (please look at the logs).

We disabled bitrot detection for now.

Version-Release number of selected component (if applicable):

CentOS 7.4, gluster 3.10.7 and 3.10.8.


How reproducible:

We don't know how to reproduce it.

Steps to Reproduce:
1. Install a 2-node gluster cluster with a replicated volume
2. Set up VMs on it
3. Upgrade gluster on one node
4. Reboot that node

Maybe a reboot alone is enough; we don't know.

Actual results:
Some VM images (one in our case) are broken because they were healed from old data.


Expected results:

Healthy data on the cluster.


Thank you!

Comment 1 Dmitry Melekhov 2017-12-11 12:03:26 UTC

By the way, we upgraded and rebooted the second node (father) with bitrot detection turned off,
and everything is fine.

Comment 2 Dmitry Melekhov 2017-12-13 04:35:25 UTC

Also, I don't think this is caused by replicate.
I guess it is caused by bitrot: if there was no I/O in the VM, then bitrot may have changed metadata on the wrong server.
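
One way to check a guess like this is to look at the replicate changelog xattrs (`trusted.afr.*`) on each brick's copy of the file, since those are what the heal source is chosen from. The sketch below only decodes such a value. It assumes the usual layout of three 32-bit big-endian counters (data, metadata, entry); the function name `decode_afr_pending` and the sample value are hypothetical and are not taken from the attached logs.

```python
import struct


def decode_afr_pending(hex_value):
    """Split a trusted.afr.* xattr value into its pending counters.

    Assumption: the value is 12 bytes holding three 32-bit big-endian
    integers in the order data, metadata, entry.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Hypothetical value for illustration only, in the hex form that
    # getfattr prints with "-e hex"
    print(decode_afr_pending("0x000000020000000100000000"))
```

A non-zero counter on one brick means that brick recorded operations still pending on its peer, so comparing the decoded values from both bricks shows which copy replicate would treat as the source.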

Comment 3 Shyamsundar 2018-06-20 18:24:51 UTC

This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.