Bug 2096267
| Summary: | HostedEngine .shard file size=0 in all nodes | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-hosted-engine-setup | Reporter: | Corrado Zabeo <corrado.zabeo> |
| Component: | General | Assignee: | Gobinda Das <godas> |
| Status: | CLOSED UPSTREAM | QA Contact: | meital avital <mavital> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | --- | CC: | bugs, stirabos, vharihar |
| Target Milestone: | --- | Flags: | godas: needinfo? (vharihar) |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-01-16 10:13:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Gluster | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description Corrado Zabeo 2022-06-13 12:06:08 UTC
@Vinayak Can you please help?

(In reply to Corrado Zabeo from comment #0)
> Description of problem:
> Hi team,
> my configuration is as follows: 3 replica Gluster servers containing the VM LVM storage and the HostedEngine VM (3.8.10), plus 1 server running all the VMs.
> Due to a prolonged nighttime power failure and UPS battery exhaustion the system shut down. Looking at the logs after startup I saw 2 boots in 3 minutes, so I assume the power was cut several times.
> HostedEngine was paused after the complete restart of all services.
> I checked the situation with "gluster volume heal engine info": 3 connected nodes in split-brain on .shard (15 files), and all the files had size = 0 on node1.
> I recovered 14 files and aligned their gfid from the replicas, while one file has size = 0 on all nodes, so the split-brain remains active.
> I would like to know how I can fix this and recreate the shard with the correct size.
> Thanks in advance
>
> Version-Release number of selected component (if applicable): 3.8.10

Hi, sorry for not replying earlier. I solved the split-brain problem in the following way:

1. I identified the zero-size shard files with the command "gluster volume heal engine info" and checked the differences with "getfattr -d -m . -e hex", in my case under /bricks/engine/brick/.shard.
2. I deleted the zero-size shard files in the .shard folder and the corresponding hard links in the .glusterfs folder.
3. The shard files were automatically recreated.
4. I was left with one last problem: shard 7013 was zero-size on all nodes. I deleted the file and the related links on all 3 nodes; they were automatically recreated and the split-brain disappeared.

The operating system restarted correctly, so fortunately the shard was empty. However, I don't understand why such a failure happened. Below is the "volume heal" output.

Greetings

```
[root@vmgluster01 zones]# gluster volume heal engine info
Brick 192.170.254.3:/bricks/engine/brick
/.shard - Is in split-brain
/__DIRECT_IO_TEST__
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7013
Status: Connected
Number of entries: 3

Brick 192.170.254.4:/bricks/engine/brick
/.shard - Is in split-brain
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7015
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7016
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7017
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7018
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7019
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7020
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7021
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7022
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7024
/__DIRECT_IO_TEST__
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7013
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7023
Status: Connected
Number of entries: 13

Brick 192.170.254.6:/bricks/engine/brick
/.shard - Is in split-brain
/__DIRECT_IO_TEST__
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7013
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7015
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7016
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7017
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7018
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7019
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7020
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7021
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7022
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7023
/.shard/6a48d9f7-8aaa-4763-84ef-98adee5781d9.7024
Status: Connected
Number of entries: 13
```

Moved to GitHub: https://github.com/oVirt/ovirt-hosted-engine-setup/issues/73
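The manual cleanup described in the comments (find the zero-size shard files, then remove each file together with its .glusterfs hard link) can be sketched as a small shell helper. This is an illustrative sketch, not a tool from the report: the function names are made up, the brick path is the one mentioned in the comments, and the gfid-to-link mapping follows the standard GlusterFS on-disk layout (`.glusterfs/<hex 1-2>/<hex 3-4>/<full gfid>`).

```shell
#!/bin/sh
# Hedged sketch: list zero-size shard files on a brick and the
# .glusterfs hard link that must be removed together with each one.
# Function names and the brick path are illustrative.

gfid_link_path() {
    # Derive the .glusterfs hard-link path from a brick path and a gfid:
    # <brick>/.glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>
    brick=$1
    gfid=$2
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" \
        "$(printf '%s' "$gfid" | cut -c1-2)" \
        "$(printf '%s' "$gfid" | cut -c3-4)" \
        "$gfid"
}

list_zero_size_shards() {
    brick=$1
    # Zero-size shard files are the ones stuck in split-brain here
    find "$brick/.shard" -type f -size 0 2>/dev/null | while read -r f; do
        # trusted.gfid is readable by root; the value comes back as
        # trusted.gfid=0x<32 hex chars>
        hex=$(getfattr -n trusted.gfid -e hex --absolute-names "$f" 2>/dev/null \
              | sed -n 's/^trusted.gfid=0x//p')
        # Reinsert the dashes to get the canonical gfid form
        gfid=$(printf '%s' "$hex" \
               | sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)/\1-\2-\3-\4-/')
        printf '%s -> %s\n' "$f" "$(gfid_link_path "$brick" "$gfid")"
    done
}

# Example (brick path from the report); run as root on each node,
# then delete each listed pair and let self-heal recreate the shard:
# list_zero_size_shards /bricks/engine/brick
```

Deleting only the file under `.shard` without the matching `.glusterfs` link (or vice versa) leaves a dangling hard link, which is why the comments stress removing both on every affected node.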