Bug 739194
Summary: | No Version#0 changeset created for drift configurations created during network outage, even after the outage is repaired (scenario #7) | | |
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Mike Foley <mfoley> |
Component: | drift | Assignee: | John Sanda <jsanda> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 4.1 | CC: | jsanda |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2012-02-07 19:29:09 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 707225 | | |
Attachments: | | | |
Description
Mike Foley
2011-09-16 18:22:29 UTC
Created attachment 523604
Agent log
This issue should be resolved now with changes introduced around error handling.

commit hash: 3f3397557aedabbd11420c022344776d08f76e2e

From the commit log...

This commit introduces several changes and a changed workflow to address some boundary conditions that can arise when the server fails to receive a change set report. The issue stems from the way we stream the change set report to the server. Because the request is processed asynchronously, we cannot know with certainty if/when errors arise in the comm layer.

When DriftDetector runs, a new snapshot file is generated, and now a copy of the previous version snapshot is maintained as well. After the server processes the change set, it now sends an ack to the agent. This lets the agent know that the change set was successfully persisted on the server. The agent then cleans up, deleting the previous version snapshot and the change set zip file. If drift detection runs again before the agent receives the acknowledgement, drift detection is skipped. The most likely scenario for not receiving an acknowledgement would be a network error or a down server.

If any errors occur during drift detection, which includes sending the change set to the server, the agent will attempt to revert to the previous version snapshot. This is to ensure we have a consistent snapshot on disk with which to work.

This commit also fixes a bug in the drift inventory sync code. In situations where there are existing change sets on the server, and the agent has to fetch a snapshot from the server, the snapshot version was getting set incorrectly. This is because the snapshot was not being built correctly: change sets were being applied out of order. This is fixed now.

Documenting the behavior here:

1) I did *not* get a version #0 changeset after repairing the outage
2) I did *not* get a version #0 changeset after repairing the outage and clicking the "detect now" button
3) It was only after a subsequent change was drift detected ... and this change was picked up as version #0

This is different from what I expected. I expected a version #0 changeset after the network outage was repaired and clicking the "detect now" button. jsanda ... can you clarify whether the behavior I am seeing is correct or not?

You definitely should get that initial change set at some point after the agent reconnects with the server. I think I see the problem. I forgot to handle the base case. For any version greater than zero, the agent checks to see if there is a copy of the previous snapshot before doing a drift scan. That previous snapshot only gets removed when the server acknowledges that it processed the change set successfully, so its presence lets the agent know that something may have gone wrong. This would likely be the case during a network outage. When agent and server reconnect, an inventory sync runs, and the agent will revert to the previous snapshot, which will trigger the agent to resend the change set to the server (assuming that drift is still present). I need to put similar logic in place for the initial change set.

I have retested this and got the expected results. I think this was fixed some time ago and I just forgot to update the BZ.

Changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE.

Marking VERIFIED BZs as CLOSED/CURRENTRELEASE.
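
For readers following the thread, the acknowledgement workflow from the commit log, together with the missing base case for version #0, can be modeled roughly as below. This is only an illustrative sketch in Python, not the RHQ agent code (which is Java). The names `DriftAgent`, `Snapshot`, `pending_initial`, `on_ack` and `on_inventory_sync` are hypothetical, and tracking the unacknowledged initial change set with a flag is an assumption, since the thread does not say how the fix was actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Snapshot:
    version: int
    files: Dict[str, str]  # path -> content hash (hypothetical representation)


class DriftAgent:
    """Hypothetical model of agent-side state for one drift configuration."""

    def __init__(self) -> None:
        self.current: Optional[Snapshot] = None   # latest snapshot on disk
        self.previous: Optional[Snapshot] = None  # kept until the server acks
        # Assumed base-case marker: version 0 was sent but not yet acked.
        self.pending_initial = False

    def awaiting_ack(self) -> bool:
        return self.previous is not None or self.pending_initial

    def detect(self, files: Dict[str, str],
               send: Callable[[Snapshot], None]) -> str:
        # Skip the scan while an earlier change set is still unacknowledged.
        if self.awaiting_ack():
            return "skipped"
        if self.current is None:
            new = Snapshot(0, dict(files))        # initial change set
        elif files == self.current.files:
            return "no-drift"
        else:
            new = Snapshot(self.current.version + 1, dict(files))
        old = self.current
        self.previous, self.current = old, new
        self.pending_initial = old is None
        try:
            send(new)
        except ConnectionError:
            # Any error while sending: fall back to a consistent snapshot.
            self.revert()
            return "reverted"
        return "sent"

    def revert(self) -> None:
        # Restore the previous snapshot (None when version 0 was pending).
        self.current = self.previous
        self.previous = None
        self.pending_initial = False

    def on_ack(self) -> None:
        # Server persisted the change set: clean up the previous snapshot.
        self.previous = None
        self.pending_initial = False

    def on_inventory_sync(self) -> None:
        # After a reconnect, an unacked change set is assumed lost, so the
        # agent reverts and the next scan regenerates and resends it.
        if self.awaiting_ack():
            self.revert()
```

In this model, a version #0 change set that was sent during an outage and never acknowledged is regenerated on the first scan after `on_inventory_sync()` runs, which is the behavior expected in scenario #7.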