Bug 1118754
Summary: | Dist-geo-rep : after upgrade from RHS2.1(3.4.0.59rhs) to RHS3.0(3.6.0.24-1), geo-rep logs get ChangelogException: [Errno 2] No such file or directory" | ||||||
---|---|---|---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Vijaykumar Koppad <vkoppad> | ||||
Component: | geo-replication | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | amainkar | ||||
Severity: | high | Docs Contact: | |||||
Priority: | low | ||||||
Version: | rhgs-3.0 | CC: | aavati, avishwan, csaba, david.macdonald, mzywusko, nlevinki, nsathyan, vagarwal, vshankar | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | usability | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | |||||||
: | 1146397 (view as bug list) | Environment: | |||||
Last Closed: | 2015-08-06 15:00:43 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1146397 | ||||||
Attachments: |
|
Description
Vijaykumar Koppad
2014-07-11 13:09:10 UTC
Since it fails to do history crawl after upgrade, it might affect renames and deletes done during the upgrade(During the time geo-rep was stopped) Vijaykumar, please upload sosreports. Created attachment 917734 [details]
sosreport of the all the nodes.
The bug is a genuinly acceptable issue, it was being misunderstood because of traceback and errno. EXPLANATION: Geo-rep "start", after upgrade, called history with a start time(start time is the moment master gluster was stopped) which is not recorded in htime(because htimes are recorded in the upgraded version), hence no linkages found. This causes history to return -1, which causes agent to raise the exception. What needs to be done: No logical code-base change but logging improvements could help in debugging in future. It happened in two other scenarios, which didn't involve upgrade, but doesn't happen consistently. First scenario ======================================== 1. create and start geo-rep relationship between master and slave. 2. disable changelog. 3. create data on master. 4. check geo-rep logs, there could have traceback as given in description. ======================================== second scenario ========================================= 1. create and start geo-rep relationship between master and slave. 2. kill monitor, feedback and agent processes from one of the active nodes. 3. create data on master. 4. start force geo-rep. 5. Check geo-rep logs for traceback. It doesn't happen everytime. ========================================= Changelog agent(`ps -ax | grep gsyncd | grep agent`) interacts with changelogapi and raises exception in case of any error. geo-rep worker communicates with agent using RPC. Changelog Exceptions are handled in worker. Since RPC propagates traceback from agent to worker, Exception is logged in log files. These exceptions are no effect as these are handled in worker. But it confuses users. No new Changelogs index file(HTIME) is created after upgrade/brick node reboots. HTIME file will be created only when Changelog disabled and enabled. (BZ 1211327) This issue is not seen during upgrade tests of RHGS 3.1. Closing this bug. Please reopen if this issue found again. |