Description of problem:
=====================
It appears that the volumes must be remounted on the clients for the compound fops feature to take effect; just enabling the option is not sufficient.

I had a 4-node cluster in which I created a 1x2 volume spanning n1 and n2. I FUSE-mounted the volume on clients c1 and c2 using the IPs of n3 and n4 respectively. I then enabled compound fops on the volume and restarted the volume. The brick logs show cfops enabled in the volume graph, but there is no regeneration of the volume graph on the client side. I discussed this with the Glusterd team (Atin) and learned that restarting a volume does not regenerate the client-side volume graph.

That means compound fops will not be in effect: turning on the cfops option alone does not suffice, and the user has to remount the volume on all clients. It also means that any customer who enables cfops after an upgrade must remount on every client, so the volume will be unavailable for some time.

We need to either fix this by regenerating the volume graph on the client side or, if that is not the right approach, document the remount requirement.

Version-Release number of selected component (if applicable):
===============
3.8.4-6
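For reference, the enable-then-remount sequence described above would look roughly like the following. This is only a sketch: <volname>, n3 and the mount point are placeholders from the setup above, and the option key is assumed to be cluster.use-compound-fops (the AFR option whose value shows up as use_compound_fops in the client graph).

# on any server node: enable compound fops on the volume
gluster volume set <volname> cluster.use-compound-fops on

# on each client: remount so the FUSE process fetches a fresh client volfile
umount /mnt/<volname>
mount -t glusterfs n3:/<volname> /mnt/<volname>

Note that each remount briefly interrupts access to the volume from that client, which is the unavailability concern raised above.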
Nag and I checked his setup. This is not a bug. The volfile will be regenerated only when there is a graph switch. Just to double-check, we attached gdb to the client process and inspected afr's private member, and saw that compound-fops was on:

<snip>
Breakpoint 1, afr_lookup (frame=0x7f08e79dccd8, this=0x7f08d8009060, loc=0x7f08e722f678,
    xattr_req=0x7f08e717d35c) at afr-common.c:2858
2858    {
(gdb) p this->private
$1 = (void *) 0x7f08d804f220
(gdb) p (afr_private_t *)this->private
$2 = (afr_private_t *) 0x7f08d804f220
(gdb) p *$2
$3 = {lock = {spinlock = 1, mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0,
        __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = "\001", '\000' <repeats 38 times>, __align = 1}}, child_count = 2, arbiter_count = 0,
  children = 0x7f08d804a3b0, root_inode = 0x7f08d43ce06c,
  child_up = 0x7f08d804a350 "\001\001\r", <incomplete sequence \360\255\272>,
  local = 0x7f08d804a8f0 "", pending_key = 0x7f08d804a410, data_self_heal = 0x7f08d7de9e34 "on",
  data_self_heal_algorithm = 0x0, data_self_heal_window_size = 1,
  heal_waiting = {next = 0x7f08d804f290, prev = 0x7f08d804f290}, heal_wait_qlen = 128,
  heal_waiters = 0, healing = {next = 0x7f08d804f2a8, prev = 0x7f08d804f2a8},
  background_self_heal_count = 8, healers = 0, metadata_self_heal = _gf_true,
  entry_self_heal = _gf_true, data_change_log = _gf_true, metadata_change_log = _gf_true,
  entry_change_log = _gf_true, metadata_splitbrain_forced_heal = _gf_false, read_child = -1,
  hash_mode = 1, favorite_child = -1, fav_child_policy = AFR_FAV_CHILD_NONE,
  inodelk_trace = _gf_false, entrylk_trace = _gf_false, wait_count = 1, timer = 0x0,
  optimistic_change_log = _gf_true, eager_lock = _gf_true, pre_op_compat = _gf_true,
  post_op_delay_secs = 1, quorum_count = 0, quorum_reads = _gf_false,
  vol_uuid = '\000' <repeats 36 times>, last_event = 0x7f08d804f4b0, event_generation = 6,
  choose_local = _gf_true, did_discovery = _gf_true, sh_readdir_size = 1024,
  ensure_durability = _gf_true, sh_domain = 0x7f08d804f440 "rep2-replicate-0:self-heal",
  afr_dirty = 0x7f08d7deccd4 "trusted.afr.dirty", shd = {iamshd = _gf_false, enabled = _gf_true,
    timeout = 600, index_healers = 0x7f08d804fc80, full_healers = 0x7f08d804fe30,
    split_brain = 0x7f08d804ffe0, statistics = 0x7f08d8052150, max_threads = 1,
    wait_qlength = 1024}, consistent_metadata = _gf_false, spb_choice_timeout = 300,
  need_heal = _gf_false, pump_private = 0x0, use_afr_in_pump = _gf_false,
  locking_scheme = 0x7f08d7deb341 "full", esh_granular = _gf_false, use_compound_fops = _gf_true}
(gdb)
</snip>

So closing the BZ.
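For completeness, the <snip> above starts at the breakpoint; the attach-and-inspect steps leading up to it would look roughly like this. This is only a sketch: <client-pid> stands for the glusterfs FUSE client process of the mount, and the glusterfs debuginfo packages need to be installed for the afr symbols to resolve.

gdb -p <client-pid> -ex 'break afr_lookup' -ex 'continue'
# trigger a lookup from the mount point (e.g. stat a file), then at the breakpoint:
(gdb) p ((afr_private_t *) this->private)->use_compound_fops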