Description of problem:
========================
When we restart a tier volume, we consistently get "lookup failed" on all hot tier files.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.7.5-0.3.el7rhgs.x86_64

Steps to Reproduce:
=====================
1. Create and start a tier volume, with files in both the hot and cold tiers.
2. Note down all the files in each tier.
3. Keep watching the tier log and restart the volume.

LOG SAMPLE:
===========
===================
[2015-10-23 12:57:58.907027] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-spain-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-10-23 12:57:58.907356] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-spain-client-4: Connected to spain-client-4, attached to remote volume '/rhs/brick6/spain_hot'.
[2015-10-23 12:57:58.907396] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-spain-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2015-10-23 12:57:58.907474] I [MSGID: 108005] [afr-common.c:3842:afr_notify] 0-spain-replicate-3: Subvolume 'spain-client-4' came back up; going online.
[2015-10-23 12:57:58.907512] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-spain-client-4: Server lk version = 1
[2015-10-23 12:57:58.918357] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-spain-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-10-23 12:57:58.918381] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-spain-client-7: changing port to 49175 (from 0)
[2015-10-23 12:57:58.918418] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-spain-client-3: disconnected from spain-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2015-10-23 12:57:58.918611] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-spain-client-5: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-10-23 12:57:58.918660] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-spain-client-5: disconnected from spain-client-5. Client process will keep trying to connect to glusterd until brick's port is available
[2015-10-23 12:57:58.923646] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-spain-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-10-23 12:57:58.928203] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-spain-client-7: Connected to spain-client-7, attached to remote volume '/rhs/brick7/spain_hot'.
[2015-10-23 12:57:58.928243] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-spain-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-10-23 12:57:58.929179] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-spain-client-7: Server lk version = 1
[2015-10-23 12:57:58.937254] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-0: selecting local read_child spain-client-0
[2015-10-23 12:57:58.938353] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-1: selecting local read_child spain-client-2
[2015-10-23 12:57:58.939454] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-2: selecting local read_child spain-client-6
[2015-10-23 12:57:58.940294] I [dht-rebalance.c:2950:gf_defrag_start_crawl] 0-spain-tier-dht: gf_defrag_start_crawl using commit hash 2974851209
[2015-10-23 12:57:58.940360] I [MSGID: 108031] [afr-common.c:1783:afr_local_discovery_cbk] 0-spain-replicate-3: selecting local read_child spain-client-4
[2015-10-23 12:57:58.941772] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /
[2015-10-23 12:57:58.941809] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.941822] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.943583] W [afr-inode-read.c:745:afr_getxattr_node_uuid_cbk] 0-spain-replicate-3: op_ret (-1): Re-querying afr-child (1/2)
[2015-10-23 12:57:58.943889] W [dict.c:612:dict_ref] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/distribute.so(dht_find_local_subvol_cbk+0x1b8) [0x7f562bb5af78] -->/lib64/libglusterfs.so.0(syncop_getxattr_cbk+0x34) [0x7f563dc7c894] -->/lib64/libglusterfs.so.0(dict_ref+0x79) [0x7f563dc322a9] ) 0-dict: dict is NULL [Invalid argument]
[2015-10-23 12:57:58.944065] I [MSGID: 0] [dht-rebalance.c:3028:gf_defrag_start_crawl] 0-spain-tier-dht: local subvols are spain-cold-dht
[2015-10-23 12:57:58.944089] I [MSGID: 0] [dht-rebalance.c:3028:gf_defrag_start_crawl] 0-spain-tier-dht: local subvols are spain-hot-dht
[2015-10-23 12:57:58.944132] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[0] creation successful
[2015-10-23 12:57:58.944164] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 7
[2015-10-23 12:57:58.944181] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[1] creation successful
[2015-10-23 12:57:58.944279] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 6
[2015-10-23 12:57:58.944297] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[2] creation successful
[2015-10-23 12:57:58.944354] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 5
[2015-10-23 12:57:58.944370] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[3] creation successful
[2015-10-23 12:57:58.944427] I [dht-rebalance.c:1917:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 4
[2015-10-23 12:57:58.944444] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[4] creation successful
[2015-10-23 12:57:58.944477] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[5] creation successful
[2015-10-23 12:57:58.945148] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[6] creation successful
[2015-10-23 12:57:58.945180] I [dht-rebalance.c:3063:gf_defrag_start_crawl] 0-DHT: Thread[7] creation successful
[2015-10-23 12:57:58.949235] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /.trashcan
[2015-10-23 12:57:58.949270] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.949283] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.954350] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /.trashcan/internal_op
[2015-10-23 12:57:58.954384] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.954397] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.964590] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /dir1
[2015-10-23 12:57:58.964630] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 0 (spain-cold-dht): 1897198 chunks
[2015-10-23 12:57:58.964643] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-spain-tier-dht: subvolume 1 (spain-hot-dht): 1176241 chunks
[2015-10-23 12:57:58.979458] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.stat lookup failed
[2015-10-23 12:57:58.980400] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.1 lookup failed
[2015-10-23 12:57:58.980986] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.17 lookup failed
[2015-10-23 12:57:58.981677] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.24 lookup failed
[2015-10-23 12:57:58.982320] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.25 lookup failed
[2015-10-23 12:57:58.983451] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.28 lookup failed
[2015-10-23 12:57:58.984092] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.3 lookup failed
[2015-10-23 12:57:58.984724] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.31 lookup failed
[2015-10-23 12:57:58.985347] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.34 lookup failed
[2015-10-23 12:57:58.985960] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.36 lookup failed
[2015-10-23 12:57:58.986565] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.37 lookup failed
[2015-10-23 12:57:58.988495] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.38 lookup failed
[2015-10-23 12:57:58.989163] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.4 lookup failed
[2015-10-23 12:57:58.989829] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.40 lookup failed
[2015-10-23 12:57:58.990481] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.45 lookup failed
[2015-10-23 12:57:58.991111] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.46 lookup failed
[2015-10-23 12:57:58.991730] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.47 lookup failed
[2015-10-23 12:57:58.992345] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.5 lookup failed
[2015-10-23 12:57:58.992897] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.7 lookup failed
[2015-10-23 12:57:58.993419] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.9 lookup failed
[2015-10-23 12:57:58.993923] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.10 lookup failed
[2015-10-23 12:57:58.994498] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.11 lookup failed
[2015-10-23 12:57:58.995020] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.12 lookup failed
[2015-10-23 12:57:58.995566] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.14 lookup failed
[2015-10-23 12:57:58.996067] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.15 lookup failed
[2015-10-23 12:57:58.996607] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.18 lookup failed
[2015-10-23 12:57:58.997181] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.19 lookup failed
[2015-10-23 12:57:58.997703] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.2 lookup failed
[2015-10-23 12:57:58.998297] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.20 lookup failed
[2015-10-23 12:57:58.998848] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.21 lookup failed
[2015-10-23 12:57:58.999389] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.22 lookup failed
[2015-10-23 12:57:58.999918] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.23 lookup failed
[2015-10-23 12:57:59.000416] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.26 lookup failed
[2015-10-23 12:57:59.000908] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.27 lookup failed
[2015-10-23 12:57:59.001392] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.32 lookup failed
[2015-10-23 12:57:59.001853] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.33 lookup failed
[2015-10-23 12:57:59.002414] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.35 lookup failed
[2015-10-23 12:57:59.002918] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.41 lookup failed
[2015-10-23 12:57:59.003420] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.42 lookup failed
[2015-10-23 12:57:59.003943] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.43 lookup failed
[2015-10-23 12:57:59.004438] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.50 lookup failed
[2015-10-23 12:57:59.004924] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.6 lookup failed
[2015-10-23 12:57:59.005417] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.8 lookup failed
[2015-10-23 12:57:59.010263] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /1gb lookup failed
[2015-10-23 12:57:59.010792] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf1 lookup failed
[2015-10-23 12:57:59.011357] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf2 lookup failed
[2015-10-23 12:57:59.011912] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf5 lookup failed
[2015-10-23 12:57:59.012506] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf9 lookup failed
[2015-10-23 12:57:59.013047] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf3 lookup failed
[2015-10-23 12:57:59.013640] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf4 lookup failed
[2015-10-23 12:57:59.014279] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /hf7 lookup failed
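
To compare the failing paths against the list of hot tier files noted in step 2, the "lookup failed" entries can be pulled out of the tier log with a short script. The following is only a sketch: the entry layout is inferred from the sample in this report, the function name `failed_lookups` is made up for illustration, and the log file location is deliberately left to the caller because it is not given here.

```python
import re

# Entry layout inferred from the log sample in this report, e.g.:
# [2015-10-23 12:57:58.979458] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.stat lookup failed
LOOKUP_FAILED = re.compile(
    r"\[(?P<ts>[^\]]+)\] E \[MSGID: 109037\] \[[^\]]+\] \S+: (?P<path>.+) lookup failed$"
)


def failed_lookups(lines):
    """Return the file paths of all 'lookup failed' (MSGID 109037) entries."""
    paths = []
    for line in lines:
        match = LOOKUP_FAILED.search(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


# Three entries copied from the log sample: two failures and one unrelated line.
sample = [
    "[2015-10-23 12:57:58.979458] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /dir1/cz.stat lookup failed",
    "[2015-10-23 12:57:58.964590] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-spain-tier-dht: fixing the layout of /dir1",
    "[2015-10-23 12:57:59.010263] E [MSGID: 109037] [dht-rebalance.c:2666:gf_fix_layout_tier_attach_lookup] 0-spain-tier-dht: /1gb lookup failed",
]

print(failed_lookups(sample))
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines. The resulting paths can then be checked against the per-tier file list from step 2.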
*** Bug 1275602 has been marked as a duplicate of this bug. ***
Did a restart of volume and didn't find this issue. Moving bug to verified.

Following is the log:
[2015-11-03 12:48:42.646721] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.647196] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-4: Connected to quota_one-client-4, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:42.647220] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.647298] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-quota_one-replicate-3: Subvolume 'quota_one-client-4' came back up; going online.
[2015-11-03 12:48:42.647496] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-4: Server lk version = 1
[2015-11-03 12:48:42.656562] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-7: changing port to 49185 (from 0)
[2015-11-03 12:48:42.656630] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656662] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-5: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656705] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-3: disconnected from quota_one-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.656920] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-5: disconnected from quota_one-client-5. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.662326] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.667060] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-7: Connected to quota_one-client-7, attached to remote volume '/dummy/brick101/quota_one_hot'.
[2015-11-03 12:48:42.667095] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.667518] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-7: Server lk version = 1
[2015-11-03 12:48:42.675603] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-0: selecting local read_child quota_one-client-0
[2015-11-03 12:48:42.677016] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-1: selecting local read_child quota_one-client-2
[2015-11-03 12:48:42.678384] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-2: selecting local read_child quota_one-client-6
[2015-11-03 12:48:42.679546] I [dht-rebalance.c:3229:gf_defrag_start_crawl] 0-quota_one-tier-dht: gf_defrag_start_crawl using commit hash 2982399137
[2015-11-03 12:48:42.679818] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-3: selecting local read_child quota_one-client-4
[2015-11-03 12:48:42.681370] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /
[2015-11-03 12:48:42.681402] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.681435] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.683915] W [afr-inode-read.c:745:afr_getxattr_node_uuid_cbk] 0-quota_one-replicate-3: op_ret (-1): Re-querying afr-child (1/2)
[2015-11-03 12:48:42.684421] W [dict.c:612:dict_ref] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/distribute.so(dht_find_local_subvol_cbk+0x1b8) [0x7fd041f6a898] -->/lib64/libglusterfs.so.0(syncop_getxattr_cbk+0x34) [0x7fd04ff5a894] -->/lib64/libglusterfs.so.0(dict_ref+0x79) [0x7fd04ff102a9] ) 0-dict: dict is NULL [Invalid argument]
[2015-11-03 12:48:42.684477] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-cold-dht
[2015-11-03 12:48:42.684501] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-hot-dht
[2015-11-03 12:48:42.684557] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[0] creation successful
[2015-11-03 12:48:42.684660] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[1] creation successful
[2015-11-03 12:48:42.684664] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 7
[2015-11-03 12:48:42.685360] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[2] creation successful
[2015-11-03 12:48:42.685404] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 6
[2015-11-03 12:48:42.685419] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[3] creation successful
[2015-11-03 12:48:42.685469] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 5
[2015-11-03 12:48:42.685489] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[4] creation successful
[2015-11-03 12:48:42.685535] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 4
[2015-11-03 12:48:42.685552] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[5] creation successful
[2015-11-03 12:48:42.685586] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[6] creation successful
[2015-11-03 12:48:42.685619] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[7] creation successful
[2015-11-03 12:48:42.686304] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.686331] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.686469] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.686487] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.691876] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan
[2015-11-03 12:48:42.691910] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.691943] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.695073] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.695102] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.695226] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.695244] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.700325] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan/internal_op
[2015-11-03 12:48:42.700361] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.700373] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.703445] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.703475] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.703654] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.703680] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.742156] I [MSGID: 109038] [tier.c:1355:tier_start] 0-quota_one-tier-dht: Begin run tier promote 0 demote 0
[2015-11-03 12:48:50.434189] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-1: changing port to 49182 (from 0)
[2015-11-03 12:48:50.434274] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-3: changing port to 49183 (from 0)
[2015-11-03 12:48:50.439355] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-5: changing port to 49184 (from 0)
[2015-11-03 12:48:50.442740] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443178] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443274] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-3: Connected to quota_one-client-3, attached to remote volume '/rhs/brick2/quota_one'.
[2015-11-03 12:48:50.443293] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.443638] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-3: Server lk version = 1
[2015-11-03 12:48:50.443673] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-1: Connected to quota_one-client-1, attached to remote volume '/rhs/brick1/quota_one'.
[2015-11-03 12:48:50.443684] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.444012] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-1: Server lk version = 1
[2015-11-03 12:48:50.446705] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.447202] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-5: Connected to quota_one-client-5, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:50.447235] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.447684] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-5: Server lk version = 1

[root@zod ~]# gluster v info quota_one

Volume Name: quota_one
Type: Tier
Volume ID: 1f7be42a-0213-4e7c-9721-392a3747a19a
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/dummy/brick101/quota_one_hot
Brick2: zod:/dummy/brick101/quota_one_hot
Brick3: yarrow:/dummy/brick100/quota_one_hot
Brick4: zod:/dummy/brick100/quota_one_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/quota_one
Brick6: yarrow:/rhs/brick1/quota_one
Brick7: zod:/rhs/brick2/quota_one
Brick8: yarrow:/rhs/brick2/quota_one
Options Reconfigured:
diagnostics.brick-log-level: TRACE
features.quota-deem-statfs: on
features.ctr-enabled: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on

[root@zod ~]# gluster v status quota_one
Status of volume: quota_one
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/dummy/brick101/quota_one_hot  49185     0          Y       18811
Brick zod:/dummy/brick101/quota_one_hot     49185     0          Y       20257
Brick yarrow:/dummy/brick100/quota_one_hot  49184     0          Y       18854
Brick zod:/dummy/brick100/quota_one_hot     49184     0          Y       20275
Cold Bricks:
Brick zod:/rhs/brick1/quota_one             49182     0          Y       20293
Brick yarrow:/rhs/brick1/quota_one          49182     0          Y       18883
Brick zod:/rhs/brick2/quota_one             49183     0          Y       20311
Brick yarrow:/rhs/brick2/quota_one          49183     0          Y       18901
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       20347
Quota Daemon on localhost                   N/A       N/A        Y       20356
NFS Server on 10.70.34.43                   N/A       N/A        N       N/A
Self-heal Daemon on 10.70.34.43             N/A       N/A        Y       19003
Quota Daemon on 10.70.34.43                 N/A       N/A        Y       19012

Task Status of Volume quota_one
------------------------------------------------------------------------------
Task                 : Tier migration
ID                   : eae47ea7-aea5-4220-8f1d-c6cfc145875d
Status               : in progress

[root@zod ~]# rpm -qa|grep gluster
glusterfs-libs-3.7.5-5.el7rhgs.x86_64
glusterfs-fuse-3.7.5-5.el7rhgs.x86_64
glusterfs-3.7.5-5.el7rhgs.x86_64
glusterfs-server-3.7.5-5.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-5.el7rhgs.x86_64
glusterfs-cli-3.7.5-5.el7rhgs.x86_64
glusterfs-api-3.7.5-5.el7rhgs.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html