Bug 1275158
| Field | Value |
|---|---|
| Summary | Data Tiering: Getting lookup failed on files in hot tier, when volume is restarted |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | tier |
| Version | rhgs-3.1 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | urgent |
| Reporter | Nag Pavan Chilakam <nchilaka> |
| Assignee | Dan Lambright <dlambrig> |
| QA Contact | Nag Pavan Chilakam <nchilaka> |
| CC | dlambrig, jbyers, rhinduja, rhs-bugs, sankarshan, storage-qa-internal |
| Keywords | ZStream |
| Target Release | RHGS 3.1.2 |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | glusterfs-3.7.5-5 |
| Doc Type | Bug Fix |
| Clones | 1275382, 1275383 |
| Last Closed | 2016-03-01 05:44:57 UTC |
| Type | Bug |
| Bug Blocks | 1260783, 1260923, 1275382, 1275383 |
Description
Nag Pavan Chilakam
2015-10-26 07:31:16 UTC
*** Bug 1275602 has been marked as a duplicate of this bug. ***

Did a restart of volume and didn't find this issue. Moving bug to verified. Following is the log:

[2015-11-03 12:48:42.646721] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.647196] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-4: Connected to quota_one-client-4, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:42.647220] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.647298] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-quota_one-replicate-3: Subvolume 'quota_one-client-4' came back up; going online.
[2015-11-03 12:48:42.647496] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-4: Server lk version = 1
[2015-11-03 12:48:42.656562] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-7: changing port to 49185 (from 0)
[2015-11-03 12:48:42.656630] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656662] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-quota_one-client-5: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-11-03 12:48:42.656705] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-3: disconnected from quota_one-client-3. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.656920] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-quota_one-client-5: disconnected from quota_one-client-5. Client process will keep trying to connect to glusterd until brick's port is available
[2015-11-03 12:48:42.662326] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:42.667060] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-7: Connected to quota_one-client-7, attached to remote volume '/dummy/brick101/quota_one_hot'.
[2015-11-03 12:48:42.667095] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:42.667518] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-7: Server lk version = 1
[2015-11-03 12:48:42.675603] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-0: selecting local read_child quota_one-client-0
[2015-11-03 12:48:42.677016] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-1: selecting local read_child quota_one-client-2
[2015-11-03 12:48:42.678384] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-2: selecting local read_child quota_one-client-6
[2015-11-03 12:48:42.679546] I [dht-rebalance.c:3229:gf_defrag_start_crawl] 0-quota_one-tier-dht: gf_defrag_start_crawl using commit hash 2982399137
[2015-11-03 12:48:42.679818] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-quota_one-replicate-3: selecting local read_child quota_one-client-4
[2015-11-03 12:48:42.681370] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /
[2015-11-03 12:48:42.681402] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.681435] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.683915] W [afr-inode-read.c:745:afr_getxattr_node_uuid_cbk] 0-quota_one-replicate-3: op_ret (-1): Re-querying afr-child (1/2)
[2015-11-03 12:48:42.684421] W [dict.c:612:dict_ref] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/distribute.so(dht_find_local_subvol_cbk+0x1b8) [0x7fd041f6a898] -->/lib64/libglusterfs.so.0(syncop_getxattr_cbk+0x34) [0x7fd04ff5a894] -->/lib64/libglusterfs.so.0(dict_ref+0x79) [0x7fd04ff102a9] ) 0-dict: dict is NULL [Invalid argument]
[2015-11-03 12:48:42.684477] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-cold-dht
[2015-11-03 12:48:42.684501] I [MSGID: 0] [dht-rebalance.c:3307:gf_defrag_start_crawl] 0-quota_one-tier-dht: local subvols are quota_one-hot-dht
[2015-11-03 12:48:42.684557] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[0] creation successful
[2015-11-03 12:48:42.684660] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[1] creation successful
[2015-11-03 12:48:42.684664] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 7
[2015-11-03 12:48:42.685360] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[2] creation successful
[2015-11-03 12:48:42.685404] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 6
[2015-11-03 12:48:42.685419] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[3] creation successful
[2015-11-03 12:48:42.685469] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 5
[2015-11-03 12:48:42.685489] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[4] creation successful
[2015-11-03 12:48:42.685535] I [dht-rebalance.c:2074:gf_defrag_task] 0-DHT: Thread sleeping. defrag->current_thread_count: 4
[2015-11-03 12:48:42.685552] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[5] creation successful
[2015-11-03 12:48:42.685586] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[6] creation successful
[2015-11-03 12:48:42.685619] I [dht-rebalance.c:3342:gf_defrag_start_crawl] 0-DHT: Thread[7] creation successful
[2015-11-03 12:48:42.686304] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.686331] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.686469] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.686487] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2015-11-03 12:48:42.691876] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan
[2015-11-03 12:48:42.691910] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.691943] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.695073] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.695102] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.695226] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.695244] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan, gfid = 00000000-0000-0000-0000-000000000005
[2015-11-03 12:48:42.700325] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-quota_one-tier-dht: fixing the layout of /.trashcan/internal_op
[2015-11-03 12:48:42.700361] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 0 (quota_one-cold-dht): 1897198 chunks
[2015-11-03 12:48:42.700373] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-quota_one-tier-dht: subvolume 1 (quota_one-hot-dht): 1947 chunks
[2015-11-03 12:48:42.703445] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-cold-dht; inode layout - 0 - 4289564677 - 1; disk layout - 0 - 3608801279 - 1
[2015-11-03 12:48:42.703475] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.703654] I [MSGID: 109064] [dht-layout.c:808:dht_layout_dir_mismatch] 0-quota_one-tier-dht: subvol: quota_one-hot-dht; inode layout - 4289564678 - 4294967295 - 1; disk layout - 3608801280 - 4294967295 - 1
[2015-11-03 12:48:42.703680] I [MSGID: 109018] [dht-common.c:811:dht_revalidate_cbk] 0-quota_one-tier-dht: Mismatching layouts for /.trashcan/internal_op, gfid = 00000000-0000-0000-0000-000000000006
[2015-11-03 12:48:42.742156] I [MSGID: 109038] [tier.c:1355:tier_start] 0-quota_one-tier-dht: Begin run tier promote 0 demote 0
[2015-11-03 12:48:50.434189] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-1: changing port to 49182 (from 0)
[2015-11-03 12:48:50.434274] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-3: changing port to 49183 (from 0)
[2015-11-03 12:48:50.439355] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-quota_one-client-5: changing port to 49184 (from 0)
[2015-11-03 12:48:50.442740] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443178] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.443274] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-3: Connected to quota_one-client-3, attached to remote volume '/rhs/brick2/quota_one'.
[2015-11-03 12:48:50.443293] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.443638] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-3: Server lk version = 1
[2015-11-03 12:48:50.443673] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-1: Connected to quota_one-client-1, attached to remote volume '/rhs/brick1/quota_one'.
[2015-11-03 12:48:50.443684] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.444012] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-1: Server lk version = 1
[2015-11-03 12:48:50.446705] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-quota_one-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-11-03 12:48:50.447202] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-quota_one-client-5: Connected to quota_one-client-5, attached to remote volume '/dummy/brick100/quota_one_hot'.
[2015-11-03 12:48:50.447235] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-quota_one-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2015-11-03 12:48:50.447684] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-quota_one-client-5: Server lk version = 1

[root@zod ~]# gluster v info quota_one
Volume Name: quota_one
Type: Tier
Volume ID: 1f7be42a-0213-4e7c-9721-392a3747a19a
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/dummy/brick101/quota_one_hot
Brick2: zod:/dummy/brick101/quota_one_hot
Brick3: yarrow:/dummy/brick100/quota_one_hot
Brick4: zod:/dummy/brick100/quota_one_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/quota_one
Brick6: yarrow:/rhs/brick1/quota_one
Brick7: zod:/rhs/brick2/quota_one
Brick8: yarrow:/rhs/brick2/quota_one
Options Reconfigured:
diagnostics.brick-log-level: TRACE
features.quota-deem-statfs: on
features.ctr-enabled: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on

[root@zod ~]# gluster v status quota_one
Status of volume: quota_one
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/dummy/brick101/quota_one_hot  49185     0          Y       18811
Brick zod:/dummy/brick101/quota_one_hot     49185     0          Y       20257
Brick yarrow:/dummy/brick100/quota_one_hot  49184     0          Y       18854
Brick zod:/dummy/brick100/quota_one_hot     49184     0          Y       20275
Cold Bricks:
Brick zod:/rhs/brick1/quota_one             49182     0          Y       20293
Brick yarrow:/rhs/brick1/quota_one          49182     0          Y       18883
Brick zod:/rhs/brick2/quota_one             49183     0          Y       20311
Brick yarrow:/rhs/brick2/quota_one          49183     0          Y       18901
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       20347
Quota Daemon on localhost                   N/A       N/A        Y       20356
NFS Server on 10.70.34.43                   N/A       N/A        N       N/A
Self-heal Daemon on 10.70.34.43             N/A       N/A        Y       19003
Quota Daemon on 10.70.34.43                 N/A       N/A        Y       19012

Task Status of Volume quota_one
------------------------------------------------------------------------------
Task   : Tier migration
ID     : eae47ea7-aea5-4220-8f1d-c6cfc145875d
Status : in progress

[root@zod ~]# rpm -qa|grep gluster
glusterfs-libs-3.7.5-5.el7rhgs.x86_64
glusterfs-fuse-3.7.5-5.el7rhgs.x86_64
glusterfs-3.7.5-5.el7rhgs.x86_64
glusterfs-server-3.7.5-5.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-5.el7rhgs.x86_64
glusterfs-cli-3.7.5-5.el7rhgs.x86_64
glusterfs-api-3.7.5-5.el7rhgs.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html
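
The log pasted in the verification comment is mostly informational handshake output, with a few `E` and `W` entries mixed in (the portmap failures for client-3 and client-5, and the `dict is NULL` warning). A minimal Python sketch for pulling those entries out of a pasted log might look like the following. The entry pattern is inferred only from the lines quoted in this report, not from the GlusterFS logging code, and `split_entries`, `parse_entry`, and `problems` are hypothetical helper names.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed entry shape, inferred from the lines quoted in this bug report:
#   [timestamp] LEVEL [MSGID: n] [file:line:function] message
# The MSGID block is optional.
ENTRY_START = re.compile(
    r"(?=\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]\s+[A-Z]\s+\[)"
)
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<message>.*)",
    re.DOTALL,
)


class Entry(NamedTuple):
    ts: str
    level: str
    msgid: Optional[str]
    file: str
    line: int
    func: str
    message: str


def split_entries(blob: str) -> List[str]:
    """Split a pasted log at each timestamp, even if it was flattened onto one line."""
    return [part.strip() for part in ENTRY_START.split(blob) if part.strip()]


def parse_entry(text: str) -> Optional[Entry]:
    """Parse one entry; return None if it does not fit the assumed pattern."""
    m = ENTRY.match(text)
    if m is None:
        return None
    # Collapse internal whitespace so wrapped messages read as one line.
    message = " ".join(m["message"].split())
    return Entry(m["ts"], m["level"], m["msgid"], m["file"],
                 int(m["line"]), m["func"], message)


def problems(blob: str) -> List[Entry]:
    """Return only the error (E) and warning (W) entries."""
    parsed = (parse_entry(e) for e in split_entries(blob))
    return [e for e in parsed if e is not None and e.level in ("E", "W")]


if __name__ == "__main__":
    sample = (
        "[2015-11-03 12:48:42.656562] I [rpc-clnt.c:1851:rpc_clnt_reconfig] "
        "0-quota_one-client-7: changing port to 49185 (from 0) "
        "[2015-11-03 12:48:42.656630] E [MSGID: 114058] "
        "[client-handshake.c:1524:client_query_portmap_cbk] "
        "0-quota_one-client-3: failed to get the port number for remote subvolume."
    )
    for e in problems(sample):
        print(e.ts, e.level, e.msgid, f"{e.file}:{e.line}", e.message)
```

Any text that follows the last log entry in the paste (a shell prompt, for example) ends up in that entry's message, so trim the input to the log portion first.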