Bug 1276062
Summary: | Getting IO error while VM instance is migrating from source to destination brick | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RajeshReddy <rmekala>
Component: | distribute | Assignee: | Nithya Balachandran <nbalacha>
Status: | CLOSED WONTFIX | QA Contact: | SATHEESARAN <sasundar>
Severity: | unspecified | Docs Contact: |
Priority: | medium | |
Version: | rhgs-3.1 | CC: | annair, mzywusko, nbalacha, ravishankar, rgowdapp, rhs-bugs, sankarshan, sasundar, spalai, tdesala
Target Milestone: | --- | Keywords: | ZStream
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | dht-IO-rebalance, dht-fops-while-rebal, dht-retest | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-04-16 17:59:22 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
RajeshReddy
2015-10-28 14:44:58 UTC
sosreports are available on rhsqe-repo.lab.eng.blr.redhat.com at the following location: /home/repo/sosreports/bug.1276062

NOTE: You get into this issue ONLY if you remove all the bricks in a replica set (which people may never do consciously). Hence, this is really an edge case and may be a candidate for documentation (if it is not documented already).

The rebalance logs show some EIO messages returned by AFR:

[2015-10-27 09:22:39.886186] E [MSGID: 108008] [afr-read-txn.c:76:afr_read_txn_refresh_done] 0-glance1-replicate-3: Failing GETXATTR on gfid 00000000-0000-0000-0000-000000000000: split-brain observed. [Input/output error]
[2015-10-27 09:22:39.886260] W [MSGID: 109023] [dht-rebalance.c:1076:dht_migrate_file] 0-glance1-dht: Migrate file failed:/glance/images/45edba02-7b69-4161-ade7-047a1d5f2e9b: failed to get xattr from glance1-replicate-3 (Invalid argument)
[2015-10-27 09:22:39.886396] W [MSGID: 109023] [dht-rebalance.c:546:__dht_rebalance_create_dst_file] 0-glance1-dht: /glance/images/d6fb9845-fdfe-4139-83c7-7e90b3072824: failed to set xattr on glance1-replicate-0 (Cannot allocate memory)
[2015-10-27 09:22:39.887923] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-glance1-replicate-3: Failing SETXATTR on gfid 00000000-0000-0000-0000-000000000000: split-brain observed. [Input/output error]
[2015-10-27 09:22:39.888262] E [MSGID: 109023] [dht-rebalance.c:792:__dht_rebalance_open_src_file] 0-glance1-dht: failed to set xattr on /glance/images/ad92693e-3c51-408e-ae5a-85ce73a9dc62 in glance1-replicate-3 (Input/output error)
[2015-10-27 09:22:39.888288] E [MSGID: 109023] [dht-rebalance.c:1098:dht_migrate_file] 0-glance1-dht: Migrate file failed: failed to open /glance/images/ad92693e-3c51-408e-ae5a-85ce73a9dc62 on glance1-replicate-3
[2015-10-27 09:22:39.888319] E [MSGID: 101046] [afr-inode-write.c:1534:afr_fsetxattr] 0-glance1-replicate-0: setxattr dict is null
[2015-10-27 09:22:39.888533] W [MSGID: 109023] [dht-rebalance.c:546:__dht_rebalance_create_dst_file] 0-glance1-dht: /glance/images/45edba02-7b69-4161-ade7-047a1d5f2e9b: failed to set xattr on glance1-replicate-0 (Cannot allocate memory)
[2015-10-27 09:22:39.889482] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-glance1-replicate-3: Failing SETXATTR on gfid 00000000-0000-0000-0000-000000000000: split-brain observed. [Input/output error]
[2015-10-27 09:22:39.889855] E [MSGID: 109023] [dht-rebalance.c:792:__dht_rebalance_open_src_file] 0-glance1-dht: failed to set xattr on /glance/images/d6fb9845-fdfe-4139-83c7-7e90b3072824 in glance1-replicate-3 (Input/output error)
[2015-10-27 09:22:39.889873] E [MSGID: 109023] [dht-rebalance.c:1098:dht_migrate_file] 0-glance1-dht: Migrate file failed: failed to open /glance/images/d6fb9845-fdfe-4139-83c7-7e90b3072824 on glance1-replicate-3
[2015-10-27 09:22:39.891221] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-glance1-replicate-3: Failing SETXATTR on gfid 00000000-0000-0000-0000-000000000000: split-brain observed. [Input/output error]
[2015-10-27 09:22:39.891487] E [MSGID: 109023] [dht-rebalance.c:792:__dht_rebalance_open_src_file] 0-glance1-dht: failed to set xattr on /glance/images/45edba02-7b69-4161-ade7-047a1d5f2e9b in glance1-replicate-3 (Input/output error)

Setting a NeedInfo on Ravi to see if this is a known issue.

It is difficult to figure out the exact failure as the client logs are not available.

(In reply to Nithya Balachandran from comment #6)
> The rebalance logs show some EIO messages returned by AFR:
>
> [2015-10-27 09:22:39.886186] E [MSGID: 108008]
> [afr-read-txn.c:76:afr_read_txn_refresh_done] 0-glance1-replicate-3: Failing
> GETXATTR on gfid 00000000-0000-0000-0000-000000000000: split-brain observed.
> [Input/output error]

> Setting a NeedInfo on Ravi to see if this is a known issue.

We had some known spurious split-brain logs that were fixed via BZ 1411625 some time back, where getfattr failed with EIO spuriously. But here the gfid is all zeroes, which is strange. This probably needs to be tested with the latest gluster bits to see if the issue is reproducible.

Created attachment 1416622
Fuse mount logs logrotated
Fuse mount logs (logrotated ones)
Created attachment 1416623
Fuse mount logs continued
Here is the rest of the fuse mount logs. Refer to the previous attachment for the logrotated logs. Most of the errors can be seen in these fuse mount logs.
Created attachment 1416625
sosreport from hypervisor
The Gluster volume is fuse mounted on the hypervisor at /mnt/test.
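For anyone triaging similar logs, the following is a minimal sketch of how the split-brain entries that carry an all-zero gfid could be picked out of a rebalance or fuse mount log. It assumes the line layout seen in the excerpts quoted in this report (timestamp, level, MSGID, source location, translator name, message); the regular expression is inferred from those excerpts only, and the names `LOG_RE`, `parse_line` and `null_gfid_split_brain` are hypothetical helpers, not part of any Gluster tooling.

```python
import re

# Layout inferred from the log excerpts in this report (an assumption):
# [timestamp] LEVEL [MSGID: n] [file:line:function] translator: message
LOG_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[MSGID: (?P<msgid>\d+)\]\s+"
    r"\[(?P<src>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+"
    r"(?P<msg>.*)"
)

# The gfid value that the comments in this report call out as strange.
NULL_GFID = "00000000-0000-0000-0000-000000000000"


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def null_gfid_split_brain(lines):
    """Yield parsed entries that report split-brain on the all-zero gfid."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if NULL_GFID in entry["msg"] and "split-brain" in entry["msg"]:
            yield entry


# Two lines copied from the rebalance log excerpt in this report.
SAMPLE = [
    "[2015-10-27 09:22:39.886186] E [MSGID: 108008] "
    "[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-glance1-replicate-3: "
    "Failing GETXATTR on gfid 00000000-0000-0000-0000-000000000000: "
    "split-brain observed. [Input/output error]",
    "[2015-10-27 09:22:39.888319] E [MSGID: 101046] "
    "[afr-inode-write.c:1534:afr_fsetxattr] 0-glance1-replicate-0: "
    "setxattr dict is null",
]
```

Calling `list(null_gfid_split_brain(SAMPLE))` keeps only the first sample line. To scan one of the attached logs, an open file object can be passed in place of `SAMPLE`, since the function only iterates over lines.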