Bug 802522

Summary: CHILD_UP/CHILD_DOWN behavior breaks quorum calculations
Product: [Community] GlusterFS
Component: replicate
Version: mainline
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jeff Darcy <jdarcy>
Assignee: Jeff Darcy <jdarcy>
QA Contact: Raghavendra Bhat <rabhat>
CC: amarts, gluster-bugs
Fixed In Version: glusterfs-3.4.0
Verified Versions: glusterfs-3.3.0qa45
Doc Type: Bug Fix
Last Closed: 2013-07-24 17:32:35 UTC
Bug Blocks: 817967

Description Jeff Darcy 2012-03-12 18:05:47 UTC
When the quorum-enforcement patch was first written, AFR would get a CHILD_UP event for each client subvolume that was available at startup, and nothing at all for a client subvolume that was down.  Some time between then and now, the behavior changed so that we'd get a CHILD_DOWN during startup for a subvolume that had never been available.  This causes the down count to be one greater than it should be, and quorum calculations will be incorrect unless/until that subvolume comes up.
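Roughly, the accounting needs to be transition-based: a CHILD_DOWN for a subvolume that was never up should not bump the down count, and repeated events for the same state should be ignored.  The snippet below is a minimal sketch of that idea; the names (child_state, on_child_event, and so on) are hypothetical and do not come from the AFR source.

enum child_state { CHILD_UNKNOWN = -1, CHILD_IS_DOWN = 0, CHILD_IS_UP = 1 };

struct counts {
        int up_count;    /* subvolumes currently known to be up            */
        int down_count;  /* subvolumes that went down after having been up */
};

static void
on_child_event (enum child_state *state, struct counts *c,
                int child, int child_is_up)
{
        enum child_state old = state[child];

        if (child_is_up) {
                if (old != CHILD_IS_UP)
                        c->up_count++;
                state[child] = CHILD_IS_UP;
        } else {
                /* Only count a loss for a subvolume that was actually up.
                 * A CHILD_DOWN at startup for a subvolume that has never
                 * been available must not inflate down_count, otherwise
                 * quorum is computed against one phantom failure. */
                if (old == CHILD_IS_UP) {
                        c->up_count--;
                        c->down_count++;
                }
                state[child] = CHILD_IS_DOWN;
        }
}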

Comment 1 Anand Avati 2012-03-19 22:19:56 UTC
CHANGE: http://review.gluster.com/2927 (replicate: fix a glitch in up_count/down_count updates.) merged in master by Anand Avati (avati)

Comment 2 Raghavendra Bhat 2012-06-04 12:14:45 UTC
Checked with glusterfs 3.3.0qa45. With all three bricks up, a write on the mount succeeds:


dd if=k of=/tmp/kkk bs=10k count=22
dd: opening `k': No such file or directory
root@hyperspace:/mnt/client# dd if=/dev/urandom of=k bs=10k count=22
22+0 records in
22+0 records out
225280 bytes (225 kB) copied, 0.0412702 s, 5.5 MB/s


Killed 2 of the 3 bricks:

gluster volume status
Status of volume: mirror
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick hyperspace:/mnt/sda7/export3			49152	N	N/A
Brick hyperspace:/mnt/sda8/export3			49153	N	N/A
Brick hyperspace:/mnt/sda7/last35			49173	Y	31887
NFS Server on localhost					38467	Y	4960
Self-heal Daemon on localhost				N/A	Y	4967

gluster volume info mirror
 
Volume Name: mirror
Type: Replicate
Volume ID: 3382aaa7-37d0-4fab-bd3c-dc9a7a350acf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: hyperspace:/mnt/sda7/export3
Brick2: hyperspace:/mnt/sda8/export3
Brick3: hyperspace:/mnt/sda7/last35
Options Reconfigured:
cluster.quorum-type: auto
cluster.quorum-count: 2
features.lock-heal: on
features.quota: on
features.limit-usage: /:22GB
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
geo-replication.indexing: on
performance.stat-prefetch: on
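
Given cluster.quorum-type: auto on this 1 x 3 replica, at least two of the three bricks must be up for the volume to stay writable; with two bricks killed only one is up, so the write below is expected to fail with "Read-only file system", which is the correct quorum-enforcement behavior once the up/down counts are right.  As a rough illustration of that majority check (hypothetical helper, not AFR's actual quorum code):

#include <stdbool.h>
#include <stdio.h>

/* Illustrative "auto" quorum rule: writes are allowed only while a strict
 * majority of the replicas is up (for an even replica count, exactly half
 * is enough if the first brick is among them). */
static bool
have_quorum_auto (int child_count, int up_count, bool first_child_up)
{
        if (up_count * 2 > child_count)
                return true;                  /* strict majority is up */
        if (child_count % 2 == 0 && up_count * 2 == child_count)
                return first_child_up;        /* tie goes to the first brick */
        return false;
}

int
main (void)
{
        /* replica 3 volume from the output above, two bricks killed */
        printf ("quorum met: %s\n",
                have_quorum_auto (3, 1, false) ? "yes" : "no");   /* "no" */
        return 0;
}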

root@hyperspace: dd if=/dev/urandom of=k bs=10k count=22
dd: opening `k': Read-only file system
root@hyperspace:/mnt/client#