Bug 1596626
Summary: | Heketi stops all further operations on finding "Id Not Found" errors due to db inconsistent state | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Saravanakumar <sarumuga> |
Component: | heketi | Assignee: | Michael Adam <madam> |
Status: | CLOSED ERRATA | QA Contact: | Rachael <rgeorge> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | cns-3.10 | CC: | akrishna, hchiramm, jmulligan, madam, pprakash, rhs-bugs, rtalur, sankarshan, sarumuga, sselvan, storage-qa-internal, vinug |
Target Milestone: | --- | ||
Target Release: | CNS 3.10 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | heketi-7.0.0-2.el7rhgs | Doc Type: | Bug Fix |
Doc Text: |
Previously, when the heketi database contained entries with broken references, various operations failed with the error "Id not found". With this fix, heketi ignores a broken reference when deleting a block hosting volume, cleaning up bricks with empty paths, or starting the heketi service, as long as removing that reference would not lead to any additional broken references.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-09-12 09:23:45 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1568862 |
Description
Saravanakumar
2018-06-29 10:12:29 UTC
Prasanth, we don't need a rebase; I will change the description and title. However, upstream has identified three PRs that are critical for stability and for recovering from a bad db state:

https://github.com/heketi/heketi/pull/1216
https://github.com/heketi/heketi/pull/1213/
https://github.com/heketi/heketi/pull/1206

I would like to use this bug for pulling in the three PRs mentioned.

To complement the update in the title: Earlier, whenever heketi encountered "Id not found" errors from the database, it aborted the operation. With the patches mentioned in comment 7, heketi can skip such errors when they are not critical. This allows users to continue using heketi.

1. Kill the heketi pod while creating a bunch of PVCs in a loop.
2. When the pod is restarted, ensure that there are some pending operations. You can see pending operations either by using heketi-cli dump db op or by using the heketi db export command and looking into the JSON.
3. Once the existence of pending operations is confirmed, use the db tool to delete the pending operations.
4. Perform node replace or device replace operations and ensure they all pass.

Updated doc text in the Doc Text field. Please review for technical accuracy.

Doc Text looks OK

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2686
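
For step 2 of the verification steps above ("looking into the json"), the check for pending operations could be scripted roughly as follows. This is only a sketch, not part of the fix: it assumes the export is a JSON object whose pending operations sit under a top-level key called "pendingoperations", keyed by operation id. The key name, the helper names load_export and pending_operations, and the sample data are assumptions for illustration; adjust them to match what your export actually contains.

```python
import json


def load_export(path):
    """Load a JSON file written by the heketi db export command."""
    with open(path) as fh:
        return json.load(fh)


def pending_operations(db, key="pendingoperations"):
    """Return the pending-operation entries of an exported db as a dict.

    ASSUMPTION: the export is a JSON object and pending operations live
    under the top-level key named by `key`, keyed by operation id.
    A missing or null key is treated as "no pending operations".
    """
    return dict(db.get(key) or {})


# Synthetic stand-in for an export; the ids and fields are made up.
sample = {
    "volumeentries": {},
    "pendingoperations": {"op-1": {"Type": 1}, "op-2": {"Type": 4}},
}

# Print the id of every pending operation found in the sample.
for op_id in sorted(pending_operations(sample)):
    print(op_id)
```

Against a real export, pass the result of load_export("<path to exported json>") instead of sample. An empty result after step 3 would suggest the pending operations were removed.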