Bug 1232216
| Summary: | [geo-rep]: UnboundLocalError: local variable 'fd' referenced before assignment | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.1 | CC: | annair, asriram, asrivast, avishwan, chrisw, csaba, divya, khiremat, nlevinki, nsathyan |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.1.1 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.7.1-12 | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | | |
| Cloned To: | 1233411 (view as bug list) | Environment: | |
| Last Closed: | 2015-10-05 07:11:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1216951, 1233411, 1240607, 1251815 | | |

Doc Text:

> Previously, if a meta-volume was configured, there was a small race window in which the geo-replication worker could access an unreferenced file descriptor of the lock file maintained on the shared storage volume. As a consequence, the geo-replication worker died and restarted. The worker has been fixed to always obtain the correct file descriptor, so it no longer dies and restarts.
Hit this multiple times during Rename and Remove operations:

```
[2015-07-04 15:04:45.75320] E [syncdutils(/rhs/brick1/b1):276:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 659, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1438, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 531, in crawlwrap
    crawl = self.should_crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 485, in should_crawl
    return self.mgmt_lock()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 469, in mgmt_lock
    os.close(fd)
UnboundLocalError: local variable 'fd' referenced before assignment
[2015-07-04 15:04:45.78900] I [syncdutils(/rhs/brick1/b1):220:finalize] <top>: exiting.
```
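The crash is a classic Python scoping pitfall: in `mgmt_lock()` (master.py line 469 in the traceback), `fd` is assigned inside the `try` block, but the error path calls `os.close(fd)` unconditionally. If `os.open()` on the lock file fails before `fd` is bound, the handler itself raises `UnboundLocalError`, masking the original error and killing the worker. Below is a minimal, hypothetical sketch of the pattern and its guard; it is not the actual syncdaemon code, and `LOCK_PATH` and the function names are illustrative. The upstream patches linked in the comments below carry the real fix.

```python
import errno
import fcntl
import os

# Hypothetical lock-file path on the shared meta volume; illustrative only.
LOCK_PATH = "/var/run/gluster/shared_storage/geo-rep/lock"


def mgmt_lock_buggy():
    """Failure pattern: 'fd' is bound only if os.open() succeeds."""
    try:
        fd = os.open(LOCK_PATH, os.O_CREAT | os.O_RDWR)
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True  # lock acquired; fd stays open to hold the lock
    except OSError as e:
        # If os.open() itself raised (e.g. the lock file's parent
        # directory vanished in the race window), 'fd' was never
        # assigned and this line raises UnboundLocalError, masking
        # the original error -- the traceback seen in this bug.
        os.close(fd)
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return False  # lock held by another worker
        raise


def mgmt_lock_fixed():
    """Guarded pattern: 'fd' is always bound, closed only if valid."""
    fd = None
    try:
        fd = os.open(LOCK_PATH, os.O_CREAT | os.O_RDWR)
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True  # lock acquired; fd stays open to hold the lock
    except OSError as e:
        if fd is not None:
            os.close(fd)
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise
```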
Doc text is edited. Please sign off to be included in Known Issues.

Edited DocText. Please check.

Included the edited text.

Upstream Patch (Master): http://review.gluster.org/#/c/11318/
Upstream Patch (3.7): http://review.gluster.org/#/c/11563/

Merged in upstream (master) and upstream (3.7). Hence moving it to POST.

Downstream Patch: https://code.engineering.redhat.com/gerrit/#/c/55050/

Verified with build glusterfs-3.7.1-14.el7rhgs.x86_64. Didn't see the traceback; moving the bug to verified state.

```
[root@georep1 ~]# grep -i "UnboundLocalError" /var/log/glusterfs/geo-replication/master/ssh%3A%2F%2Froot%4010.70.46.167%3Agluster%3A%2F%2F127.0.0.1%3Aslave.*
[root@georep1 ~]#
[root@georep2 ~]# grep -i "UnboundLocalError" /var/log/glusterfs/geo-replication/master/ssh%3A%2F%2Froot%4010.70.46.167%3Agluster%3A%2F%2F127.0.0.1%3Aslave.*
[root@georep2 ~]#
[root@georep3 ~]# grep -i "UnboundLocalError" /var/log/glusterfs/geo-replication/master/ssh%3A%2F%2Froot%4010.70.46.167%3Agluster%3A%2F%2F127.0.0.1%3Aslave.*
[root@georep3 ~]#
[root@georep4 ~]# grep -i "UnboundLocalError" /var/log/glusterfs/geo-replication/master/ssh%3A%2F%2Froot%4010.70.46.167%3Agluster%3A%2F%2F127.0.0.1%3Aslave.*
[root@georep4 ~]#
[root@georep1 ~]# gluster volume list
gluster_shared_storage
master
[root@georep1 ~]#
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html
Description of problem:
=======================
Happened to see the below traceback in the geo-rep log with the use of meta_volume.

```
[2015-06-16 20:11:43.911356] I [master(/bricks/brick1/master_brick2):528:crawlwrap] _GMaster: crawl interval: 1 seconds
[2015-06-16 20:11:43.912364] I [master(/bricks/brick0/master_brick0):519:crawlwrap] _GMaster: primary master with volume id a6de699b-83c5-4747-95a7-81c36f5d79c6 ...
[2015-06-16 20:11:43.917547] I [master(/bricks/brick0/master_brick0):528:crawlwrap] _GMaster: crawl interval: 1 seconds
[2015-06-16 20:11:43.918723] I [master(/bricks/brick1/master_brick2):455:mgmt_lock] _GMaster: Creating geo-rep directory in meta volume...
[2015-06-16 20:11:43.923168] I [master(/bricks/brick0/master_brick0):455:mgmt_lock] _GMaster: Creating geo-rep directory in meta volume...
[2015-06-16 20:11:43.938770] E [syncdutils(/bricks/brick0/master_brick0):276:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 659, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1438, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 531, in crawlwrap
    crawl = self.should_crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 485, in should_crawl
    return self.mgmt_lock()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 469, in mgmt_lock
    os.close(fd)
UnboundLocalError: local variable 'fd' referenced before assignment
[2015-06-16 20:11:43.941852] I [syncdutils(/bricks/brick0/master_brick0):220:finalize] <top>: exiting.
[2015-06-16 20:11:43.945887] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-06-16 20:11:43.946471] I [syncdutils(agent):220:finalize] <top>: exiting.
```

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.1-3.el7rhgs.x86_64