Bug 1275158

Summary: Data Tiering: Getting lookup failed on files in hot tier, when volume is restarted
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: tier
Assignee: Dan Lambright <dlambrig>
Status: CLOSED ERRATA
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: medium
Docs Contact:
Priority: urgent
Version: rhgs-3.1
CC: dlambrig, jbyers, rhinduja, rhs-bugs, sankarshan, storage-qa-internal
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.5-5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1275382 1275383
Environment:
Last Closed: 2016-03-01 05:44:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1260783, 1260923, 1275382, 1275383

Description Nag Pavan Chilakam 2015-10-26 07:31:16 UTC
Description of problem:
========================
When we restart a tier volume, we consistently get "lookup failed" errors in the tier log for all hot tier files.


Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.7.5-0.3.el7rhgs.x86_64



Steps to Reproduce:
=====================
1. Create and start a tier volume, with files in both the hot and cold tiers.
2. Note down all the files in each tier.
3. Watch the tier log while restarting the volume.
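
The restart in step 3 could be scripted roughly as below. This is only a sketch, not the reporter's procedure: the volume name "spain" is inferred from the log prefix (0-spain-tier-dht), the helper names are hypothetical, and the use of "gluster --mode=script" to skip the stop confirmation prompt is an assumption about how one would run it unattended. By default nothing is executed; the commands are only printed.

```python
import subprocess


def restart_commands(volume):
    """Build the stop/start command pair for a volume (hypothetical helper)."""
    return [
        # --mode=script suppresses the interactive confirmation on stop.
        ["gluster", "--mode=script", "volume", "stop", volume],
        ["gluster", "--mode=script", "volume", "start", volume],
    ]


def restart_volume(volume, dry_run=True):
    """Print the restart commands, or run them when dry_run is False."""
    commands = restart_commands(volume)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if gluster exits non-zero.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # "spain" is the volume name seen in the log sample; adjust as needed.
    restart_volume("spain")
```

Running it with dry_run=False on a storage node would restart the volume while the tier log is being watched in another terminal.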



LOG SAMPLE:
===========
[2015-10-23 12:57:58.907027] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-spain-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-10-23 12:57:58.907356] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-spain-client-4: Connected to spain-client-4, attached to remote volume '/rhs/brick6/spain_hot'.
[2015-10-23 12:57:58.907396] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-spain-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2015-10-23 12:57:58.907474] I [MSGID: 108005] [afr-common.c:3842:afr_notify] 0-spain-replicate-3: Subvolume 'spain-client-4' came back up; going online.
[2015-10-23 12:57:58.907512] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-spain-client-4: Server lk version = 1
[2015-10-23 12:57:58.918357] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-spain-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-10-23 12:57:58.918381] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-spain-client-7: changing port to 49175 (from 0)
[2015-10-23 12:57:58.918418] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-spain-client-3: disconnected from spain-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2015-10-23 12:57:58.918611] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-spain-client-5: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-10-23 12:57:58.918660] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-spain-client-5: disconnected from spain-client-5. Client process will keep trying to connect to glusterd until brick's port is available
[2015-10-23 12:57:58.923646] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-spain-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-10-23 12:57:58.928203] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-spain-client-7: Connected to spain-client-7, attached to remote volume '/rhs/brick7/spain_hot'.
[2015-10-23 12:57:58.928243] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-spain-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-10-23 12:57:58.929179] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-spain-client-7: Server lk version = 1
[2015-10-23 12:57:58.937254] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-0: selecting local read_child spain-client-0
[2015-10-23 12:57:58.938353] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-1: selecting local read_child spain-client-2
[2015-10-23 12:57:58.939454] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-2: selecting local read_child spain-client-6
[2015-10-23 12:57:58.940294] I [dht-rebalance.c:2950:gf_defrag_start_crawl] 0-spain-tier-dht: gf_defrag_start_crawl using commit hash 2974851209
[2015-10-23 12:57:58.940360] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-3: selecting local read_child spain-client-4
[2015-10-23 12:57:58.941772] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /
[2015-10-23 12:57:58.941809] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.941822] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.943583] W [afr-inode-read.c:745:afr_getxattr_node_uuid_cbk] 0-spain-replicate-3: op_ret (-1): Re-querying afr-child (1/2)
[2015-10-23 12:57:58.943889] W [dict.c:612:dict_ref] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/distribute.so(dht_find_local_subvol_cbk+0x1b8) [0x7f562bb5af78] -->/lib64/libglusterfs.so.0(syncop_getxattr_cbk+0x34) [0x7f563dc7c894] -->/lib64/libglusterfs.so.0(dict_ref+0x79) [0x7f563dc322a9] ) 0-dict: dict is NULL [Invalid argument]
[2015-10-23 12:57:58.944065] I [MSGID: 0] [dht-rebalance.c:3028:gf_defrag_start_crawl] 0-spain-tier-dht: local subvols are spain-cold-dht
[2015-10-23 12:57:58.944089] I [MSGID: 0] [dht-rebalance.c:3028:gf_defrag_start_crawl] 0-spain-tier-dht: local subvols are spain-hot-dht
[2015-10-23 12:57:58.944132] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[0] creation successful
[2015-10-23 12:57:58.944164] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 7
[2015-10-23 12:57:58.944181] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[1] creation successful
[2015-10-23 12:57:58.944279] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 6
[2015-10-23 12:57:58.944297] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[2] creation successful
[2015-10-23 12:57:58.944354] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 5
[2015-10-23 12:57:58.944370] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[3] creation successful
[2015-10-23 12:57:58.944427] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 4
[2015-10-23 12:57:58.944444] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[4] creation successful
[2015-10-23 12:57:58.944477] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[5] creation successful
[2015-10-23 12:57:58.945148] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[6] creation successful
[2015-10-23 12:57:58.945180] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[7] creation successful
[2015-10-23 12:57:58.949235] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /.trashcan
[2015-10-23 12:57:58.949270] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.949283] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.954350] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /.trashcan/internal_op
[2015-10-23 12:57:58.954384] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.954397] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.964590] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /dir1
[2015-10-23 12:57:58.964630] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.964643] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.979458] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.stat lookup failed
[2015-10-23 12:57:58.980400] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.1 lookup failed
[2015-10-23 12:57:58.980986] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.17 lookup failed
[2015-10-23 12:57:58.981677] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.24 lookup failed
[2015-10-23 12:57:58.982320] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.25 lookup failed
[2015-10-23 12:57:58.983451] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.28 lookup failed
[2015-10-23 12:57:58.984092] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.3 lookup failed
[2015-10-23 12:57:58.984724] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.31 lookup failed
[2015-10-23 12:57:58.985347] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.34 lookup failed
[2015-10-23 12:57:58.985960] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.36 lookup failed
[2015-10-23 12:57:58.986565] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.37 lookup failed
[2015-10-23 12:57:58.988495] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.38 lookup failed
[2015-10-23 12:57:58.989163] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.4 lookup failed
[2015-10-23 12:57:58.989829] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.40 lookup failed
[2015-10-23 12:57:58.990481] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.45 lookup failed
[2015-10-23 12:57:58.991111] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.46 lookup failed
[2015-10-23 12:57:58.991730] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.47 lookup failed
[2015-10-23 12:57:58.992345] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.5 lookup failed
[2015-10-23 12:57:58.992897] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.7 lookup failed
[2015-10-23 12:57:58.993419] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.9 lookup failed
[2015-10-23 12:57:58.993923] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.10 lookup failed
[2015-10-23 12:57:58.994498] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.11 lookup failed
[2015-10-23 12:57:58.995020] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.12 lookup failed
[2015-10-23 12:57:58.995566] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.14 lookup failed
[2015-10-23 12:57:58.996067] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.15 lookup failed
[2015-10-23 12:57:58.996607] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.18 lookup failed
[2015-10-23 12:57:58.997181] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.19 lookup failed
[2015-10-23 12:57:58.997703] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.2 lookup failed
[2015-10-23 12:57:58.998297] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.20 lookup failed
[2015-10-23 12:57:58.998848] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.21 lookup failed
[2015-10-23 12:57:58.999389] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.22 lookup failed
[2015-10-23 12:57:58.999918] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.23 lookup failed
[2015-10-23 12:57:59.000416] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.26 lookup failed
[2015-10-23 12:57:59.000908] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.27 lookup failed
[2015-10-23 12:57:59.001392] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.32 lookup failed
[2015-10-23 12:57:59.001853] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.33 lookup failed
[2015-10-23 12:57:59.002414] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.35 lookup failed
[2015-10-23 12:57:59.002918] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.41 lookup failed
[2015-10-23 12:57:59.003420] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.42 lookup failed
[2015-10-23 12:57:59.003943] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.43 lookup failed
[2015-10-23 12:57:59.004438] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.50 lookup failed
[2015-10-23 12:57:59.004924] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.6 lookup failed
[2015-10-23 12:57:59.005417] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.8 lookup failed
[2015-10-23 12:57:59.010263] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /1gb lookup failed
[2015-10-23 12:57:59.010792] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf1 lookup failed
[2015-10-23 12:57:59.011357] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf2 lookup failed
[2015-10-23 12:57:59.011912] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf5 lookup failed
[2015-10-23 12:57:59.012506] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf9 lookup failed
[2015-10-23 12:57:59.013047] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf3 lookup failed
[2015-10-23 12:57:59.013640] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf4 lookup failed
[2015-10-23 12:57:59.014279] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf7 lookup failed
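
To compare the failed lookups against the list of hot tier files noted in step 2, the affected paths can be pulled out of the tier log. The following is a minimal sketch, assuming the line format shown in the sample above (error level E, MSGID 109037, message ending in "lookup failed"); the function names are hypothetical and the three sample lines are copied from this report.

```python
import re
from collections import Counter

# Matches entries such as:
# [ts] E [MSGID: 109037] [dht-rebalance.c:...] 0-spain-tier-dht: /dir1/cz.1 lookup failed
LOOKUP_FAILED = re.compile(
    r"^\[(?P<ts>[^\]]+)\] E \[MSGID: 109037\] \[[^\]]+\] "
    r"(?P<xlator>\S+): (?P<path>.+) lookup failed$"
)


def failed_lookups(lines):
    """Return the paths reported as 'lookup failed', in log order."""
    paths = []
    for line in lines:
        match = LOOKUP_FAILED.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


def failures_by_directory(paths):
    """Count failed lookups per parent directory ('/' for top-level files)."""
    return Counter(path.rsplit("/", 1)[0] or "/" for path in paths)


SAMPLE = [
    "[2015-10-23 12:57:58.964590] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /dir1",
    "[2015-10-23 12:57:58.979458] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.stat lookup failed",
    "[2015-10-23 12:57:59.010263] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /1gb lookup failed",
]

if __name__ == "__main__":
    paths = failed_lookups(SAMPLE)
    print(paths)
    print(dict(failures_by_directory(paths)))
```

Feeding the whole tier log through failed_lookups and diffing the result against the hot tier file list would show whether every hot tier file is affected, as the description states.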

Comment 3 Dan Lambright 2015-10-31 14:24:27 UTC
*** Bug 1275602 has been marked as a duplicate of this bug. ***

Comment 4 Nag Pavan Chilakam 2015-11-03 12:54:42 UTC
Restarted the volume and did not hit this issue. Moving the bug to verified.
The log follows:

[2015-11-03 12:48:42.646721] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.647196] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-4: Connected to quota_one-client-4, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:42.647220] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.647298] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-quota_one-replicate-3: Subvolume 'quota_one-client-4' came back up; going online.
[2015-11-03 12:48:42.647496] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-4: Server lk version = 1
[2015-11-03 12:48:42.656562] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-7: changing port to 49185 (from 0)
[2015-11-03 12:48:42.656630] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656662] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-5: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656705] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-3: disconnected from quota_one-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.656920] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-5: disconnected from quota_one-client-5. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.662326] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.667060] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-7: Connected to quota_one-client-7, attached to remote volume '/dummy/brick101/quota_one_hot'.
[2015-11-03 12:48:42.667095] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.667518] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-7: Server lk version = 1
[2015-11-03 12:48:42.675603] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-0: selecting local read_child quota_one-client-0
[2015-11-03 12:48:42.677016] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-1: selecting local read_child quota_one-client-2
[2015-11-03 12:48:42.678384] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-2: selecting local read_child quota_one-client-6
[2015-11-03 12:48:42.679546] I [dht-rebalance.c:3229:gf_defrag_start_crawl] 0-quota_one-tier-dht: gf_defrag_start_crawl using commit hash 2982399137
[2015-11-03 12:48:42.679818] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-3: selecting local read_child quota_one-client-4
[2015-11-03 12:48:42.681370] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /
[2015-11-03 12:48:42.681402] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.681435] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.683915] W [afr-inode-read.c:745:afr_getxattr_node_uuid_cbk] 0-quota_one-replicate-3: op_ret (-1): Re-querying afr-child (1/2)
[2015-11-03 12:48:42.684421] W [dict.c:612:dict_ref] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/distribute.so(dht_find_local_subvol_cbk+0x1b8) [0x7fd041f6a898] -->/lib64/libglusterfs.so.0(syncop_getxattr_cbk+0x34) [0x7fd04ff5a894] -->/lib64/libglusterfs.so.0(dict_ref+0x79) [0x7fd04ff102a9] ) 0-dict: dict is NULL [Invalid argument]
[2015-11-03 12:48:42.684477] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-cold-dht
[2015-11-03 12:48:42.684501] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-hot-dht
[2015-11-03 12:48:42.684557] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[0] creation successful
[2015-11-03 12:48:42.684660] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[1] creation successful
[2015-11-03 12:48:42.684664] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 7
[2015-11-03 12:48:42.685360] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[2] creation successful
[2015-11-03 12:48:42.685404] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 6
[2015-11-03 12:48:42.685419] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[3] creation successful
[2015-11-03 12:48:42.685469] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 5
[2015-11-03 12:48:42.685489] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[4] creation successful
[2015-11-03 12:48:42.685535] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 4
[2015-11-03 12:48:42.685552] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[5] creation successful
[2015-11-03 12:48:42.685586] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[6] creation successful
[2015-11-03 12:48:42.685619] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[7] creation successful
[2015-11-03 12:48:42.686304] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.686331] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.686469] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.686487] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.691876] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan
[2015-11-03 12:48:42.691910] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.691943] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.695073] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.695102] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.695226] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.695244] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.700325] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan/internal_op
[2015-11-03 12:48:42.700361] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.700373] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.703445] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.703475] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.703654] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.703680] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.742156] I [MSGID: 109038] [tier.c:1355:tier_start] 0-quota_one-tier-dht: Begin run tier promote 0 demote 0
[2015-11-03 12:48:50.434189] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-1: changing port to 49182 (from 0)
[2015-11-03 12:48:50.434274] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-3: changing port to 49183 (from 0)
[2015-11-03 12:48:50.439355] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-5: changing port to 49184 (from 0)
[2015-11-03 12:48:50.442740] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443178] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443274] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-3: Connected to quota_one-client-3, attached to remote volume '/rhs/brick2/quota_one'.
[2015-11-03 12:48:50.443293] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.443638] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-3: Server lk version = 1
[2015-11-03 12:48:50.443673] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-1: Connected to quota_one-client-1, attached to remote volume '/rhs/brick1/quota_one'.
[2015-11-03 12:48:50.443684] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.444012] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-1: Server lk version = 1
[2015-11-03 12:48:50.446705] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.447202] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-5: Connected to quota_one-client-5, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:50.447235] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.447684] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-5: Server lk version = 1



[root@zod ~]# gluster v info quota_one
 
Volume Name: quota_one
Type: Tier
Volume ID: 1f7be42a-0213-4e7c-9721-392a3747a19a
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/dummy/brick101/quota_one_hot
Brick2: zod:/dummy/brick101/quota_one_hot
Brick3: yarrow:/dummy/brick100/quota_one_hot
Brick4: zod:/dummy/brick100/quota_one_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/quota_one
Brick6: yarrow:/rhs/brick1/quota_one
Brick7: zod:/rhs/brick2/quota_one
Brick8: yarrow:/rhs/brick2/quota_one
Options Reconfigured:
diagnostics.brick-log-level: TRACE
features.quota-deem-statfs: on
features.ctr-enabled: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root@zod ~]# gluster v status quota_one
Status of volume: quota_one
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/dummy/brick101/quota_one_hot  49185     0          Y       18811
Brick zod:/dummy/brick101/quota_one_hot     49185     0          Y       20257
Brick yarrow:/dummy/brick100/quota_one_hot  49184     0          Y       18854
Brick zod:/dummy/brick100/quota_one_hot     49184     0          Y       20275
Cold Bricks:
Brick zod:/rhs/brick1/quota_one             49182     0          Y       20293
Brick yarrow:/rhs/brick1/quota_one          49182     0          Y       18883
Brick zod:/rhs/brick2/quota_one             49183     0          Y       20311
Brick yarrow:/rhs/brick2/quota_one          49183     0          Y       18901
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       20347
Quota Daemon on localhost                   N/A       N/A        Y       20356
NFS Server on 10.70.34.43                   N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.34.43             N/A       N/A        Y       19003
Quota Daemon on 10.70.34.43                 N/A       N/A        Y       19012
 
Task Status of Volume quota_one
------------------------------------------------------------------------------
Task                 : Tier migration      
ID                   : eae47ea7-aea5-4220-8f1d-c6cfc145875d
Status               : in progress         
 
[root@zod ~]# rpm -qa|grep gluster
glusterfs-libs-3.7.5-5.el7rhgs.x86_64
glusterfs-fuse-3.7.5-5.el7rhgs.x86_64
glusterfs-3.7.5-5.el7rhgs.x86_64
glusterfs-server-3.7.5-5.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-5.el7rhgs.x86_64
glusterfs-cli-3.7.5-5.el7rhgs.x86_64
glusterfs-api-3.7.5-5.el7rhgs.x86_64
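
As part of this kind of verification, the brick rows of the status output can be checked mechanically. The sketch below assumes the six-column row layout shown in the "gluster v status" output above (Brick, host:path, TCP port, RDMA port, Online, Pid); the function name is hypothetical, and rows that gluster wraps across two lines because of long brick paths are not handled.

```python
def offline_bricks(status_output):
    """Return the bricks whose Online column is not 'Y'."""
    offline = []
    for line in status_output.splitlines():
        fields = line.split()
        # Brick rows look like: Brick <host:path> <tcp> <rdma> <online> <pid>
        if len(fields) == 6 and fields[0] == "Brick":
            name, online = fields[1], fields[4]
            if online != "Y":
                offline.append(name)
    return offline


SAMPLE = """\
Hot Bricks:
Brick yarrow:/dummy/brick101/quota_one_hot  49185     0          Y       18811
Brick zod:/dummy/brick101/quota_one_hot     49185     0          Y       20257
Cold Bricks:
Brick zod:/rhs/brick1/quota_one             49182     0          Y       20293
Brick yarrow:/rhs/brick1/quota_one          49182     0          Y       18883
Self-heal Daemon on localhost               N/A       N/A        Y       20347
"""

if __name__ == "__main__":
    # An empty list means every brick row in the sample reports Online = Y.
    print(offline_bricks(SAMPLE))
```

Daemon rows such as the NFS server or self-heal daemon are skipped on purpose, since only rows starting with "Brick" are inspected.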

Comment 6 errata-xmlrpc 2016-03-01 05:44:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html