Created attachment 823875 [details]
geo-rep master log from the node where crash was found.

Description of problem:
While a geo-rep session was running and some nodes were rebooted, the worker crashed three times with a "ValueError: filedescriptor out of range in select()" traceback in the logs. The monitor restarted the worker each time, but the crash recurred at irregular intervals.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
Not consistently reproducible. Hit three times, all on a single setup.

Steps to Reproduce:
1. Create and start a geo-rep session between a 2x2 dist-rep master and a 2x2 dist-rep slave.
2. Start creating small files and tar files using smallfiles.py and crefi.py respectively.
3. Reboot a passive node.
4. Bring down an active node for 2 hours, then bring it back up.
5. After about 5-6 hours, reboot the same active node that was down for 2 hours.

Actual results:
[2013-11-14 12:56:54.371954] E [syncdutils(/rhs/bricks/brick0):207:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 540, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1157, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 461, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1027, in crawl
    self.process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 815, in process
    if self.process_change(change, done, retry):
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 804, in process_change
    if self.syncdata(datas):
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 223, in syncdata
    self.syncer.sync()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1249, in sync
    self.syncjob()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1241, in syncjob
    po = self.slave.tarssh(pb)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1266, in tarssh
    return sup(self, files, self.slaveurl)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 35, in sup
    return getattr(super(type(x), x), sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 785, in tarssh
    p1.terminate_geterr(fail_on_err = False)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 240, in terminate_geterr
    if not select([self.stderr],[],[],0.1)[0]:
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 293, in select
    return eintr_wrap(oselect.select, oselect.error, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 286, in eintr_wrap
    return func(*a)
ValueError: filedescriptor out of range in select()
[2013-11-14 12:56:54.910108] I [syncdutils(/rhs/bricks/brick0):159:finalize] <top>: exiting.
[2013-11-14 12:56:54.931972] I [monitor(monitor):81:set_state] Monitor: new state: faulty
[2013-11-14 12:57:07.114229] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------

Status goes to faulty with the above crash, but the monitor starts the worker again.

Expected results:
There should be no worker crashes.

Additional info:
Will attach the complete log file.

NOTE: The crash was found on a node which was never rebooted.
I observed this crash as well; the backtrace looked like this:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-12-02 14:19:13.673443] E [syncdutils(/bricks/brick1):207:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 233, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1363, in syncjob
    po = self.sync_engine(pb)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1266, in tarssh
    return sup(self, files, self.slaveurl)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 35, in sup
    return getattr(super(type(x), x), sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 785, in tarssh
    p1.terminate_geterr(fail_on_err = False)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 240, in terminate_geterr
    if not select([self.stderr],[],[],0.1)[0]:
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 293, in select
    return eintr_wrap(oselect.select, oselect.error, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 286, in eintr_wrap
    return func(*a)
ValueError: filedescriptor out of range in select()
[2013-12-02 14:19:13.721365] I [syncdutils(/bricks/brick1):159:finalize] <top>: exiting.
[2013-12-02 14:19:13.734653] I [monitor(monitor):81:set_state] Monitor: new state: faulty
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
This backtrace is the result of an fd leak when tar+ssh is used as the sync engine. Missing cleanup of the tar process caused the file descriptor numbers assigned to subsequent "tar" processes to grow without bound. On its own this would eventually exhaust the per-process limit on open fds and crash, but Python's select() has its own, lower limit -- it can only handle fd numbers up to 1023 -- so the failure is hit earlier, as shown below:

In [2]: select.select([1024], [], [])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-9f305b4e42e8> in <module>()
----> 1 select.select([1024], [], [])

ValueError: filedescriptor out of range in select()

Therefore, as time progressed and changes occurred in the file system, geo-replication kept invoking the sync engine ("tar+ssh" in this case) without ever cleaning up properly. The fix is to clean up properly (wait on the child and close its open fds) so that file descriptor numbers get recycled.
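Both halves of the problem can be demonstrated in a few lines of standalone Python (Linux-only, since it counts fds via /proc; the helper name is ours, not gsyncd's): select() rejects any fd number at or above 1024 regardless of whether that fd is even open, and a subprocess whose pipe is never closed keeps its fd slot occupied forever, pushing later fds toward that limit.

```python
import os
import select
import subprocess

def open_fd_count():
    # Linux-specific: each entry in /proc/self/fd is an open descriptor.
    return len(os.listdir("/proc/self/fd"))

# 1. select() fails on any fd number >= FD_SETSIZE (1024 on Linux),
#    before even checking whether the fd is actually open.
try:
    select.select([1024], [], [], 0)
except ValueError as e:
    print(e)  # filedescriptor out of range in select()

# 2. Leaky pattern: wait() reaps the child, but the stdout pipe fd
#    stays open, so every spawn permanently consumes one fd slot.
baseline = open_fd_count()
leaked = []
for _ in range(5):
    p = subprocess.Popen(["true"], stdout=subprocess.PIPE)
    p.wait()          # child reaped, yet p.stdout is still open
    leaked.append(p)
assert open_fd_count() >= baseline + 5

# 3. Fixed pattern: closing the pipe lets the fd numbers be recycled.
for p in leaked:
    p.stdout.close()
assert open_fd_count() == baseline
```

Run long enough, the leaky pattern walks the next free fd number past 1023, at which point the worker's next select() call raises exactly the ValueError seen in the backtraces above.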
review link: https://code.engineering.redhat.com/gerrit/#/c/18580/
It looks like this was fixed in a build already. Can we confirm whether this will be included in 3.0?
Setting flags required to add BZs to RHS 3.0 Errata
Verified on the build glusterfs-3.6.0.15-1.el6rhs.

Verification steps:
1. Create and start a geo-rep relationship between master and slave.
2. Set the config option use-tarssh to true.
3. Start creating data on the master.
4. Track the fds opened by the geo-rep worker process in /proc/<worker_PID>/fd/.
5. The number of fds does not keep growing.

With the bug, the fd count keeps growing; e.g., entries like these accumulate:
lrwx------ 1 root root 64 Jun 10 16:17 23 -> pipe:[17609585]
lr-x------ 1 root root 64 Jun 10 16:19 28 -> pipe:[17609582]
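The fd-tracking check in step 4 can be scripted. A small sketch (Linux /proc-based; `fd_count` is our own helper, and the example runs against the current process rather than a real worker PID):

```python
import os

def fd_count(pid):
    """Number of open file descriptors of a process (Linux /proc)."""
    return len(os.listdir("/proc/%d/fd" % pid))

# Sampled against the current process: opening a file raises the count
# by exactly one, and closing it brings the count back down. Polled
# periodically against the geo-rep worker PID, a healthy worker should
# similarly stay flat instead of accumulating pipe: entries.
before = fd_count(os.getpid())
f = open("/dev/null")
assert fd_count(os.getpid()) == before + 1
f.close()
assert fd_count(os.getpid()) == before
```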
*** Bug 1050754 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html