Created attachment 1622497
cmvlt script

Description of problem:
======================
On adding bricks and starting rebalance on a disperse volume, the cmvlt script fails with an IO error.

Version-Release number of selected component (if applicable):
==============================================================
[root@dhcp35-146 ~]# rpm -qa|grep gluster
glusterfs-libs-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-cli-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.1.x86_64
vdsm-gluster-4.30.18-1.0.el7rhgs.x86_64
glusterfs-api-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-fuse-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-geo-replication-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-events-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-rdma-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-client-xlators-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
python2-gluster-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
glusterfs-server-6.0-15.2.git02dd9a3ad.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
[root@dhcp35-146 ~]#

Test build provided for bug https://bugzilla.redhat.com/show_bug.cgi?id=1756325 and https://bugzilla.redhat.com/show_bug.cgi?id=1744881

How reproducible:
==================
2/2

Steps to Reproduce:
===================
1. Mounted a disperse volume.
2. Started the cmvlt script from 2 clients (one from /mnt/EC and another from /mnt/EC/dir1).
3. Added bricks and started rebalance.
4. Rebalance completed.

IO failed on one client (10.70.41.186) with the error below and then resumed:

Thread [6] starting
Thread [3] starting
Thread [3], Iteration [0] starting
Thread [6], Iteration [0] starting
Thread [3], Exception: [Errno 77] File descriptor in bad state
Thread [3], Traceback: Traceback (most recent call last):
  File "cmvlt.py", line 443, in run
    Thread.run (self)
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "cmvlt.py", line 405, in main
    if not sf.verify ():
  File "cmvlt.py", line 361, in verify
    bytes = f.read (7)
IOError: [Errno 77] File descriptor in bad state
close failed in file object destructor:
IOError: [Errno 77] File descriptor in bad state
Thread [5], Iteration [0] completed
Thread [5], Iteration [1] starting
Thread [1], Iteration [0] completed
Thread [1], Iteration [1] starting

Actual results:
================
IO error observed.

Expected results:
=================
IO error should not be seen.

Additional info:
===================
[root@dhcp35-129 ~]# gluster v status vol_-2-11
Status of volume: vol_-2-11
Gluster process                               TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/gluster/brick2/vol_-2-11   49154     0          Y       27052
Brick 10.70.35.227:/gluster/brick2/vol_-2-11  49154     0          Y       8721
Brick 10.70.35.146:/gluster/brick2/vol_-2-11  49154     0          Y       24948
Brick 10.70.35.129:/gluster/brick2/vol_-2-11  49154     0          Y       18662
Brick 10.70.35.111:/gluster/brick2/vol_-2-11  49154     0          Y       16863
Brick 10.70.35.232:/gluster/brick2/vol_-2-11  49154     0          Y       17553
Brick 10.70.35.45:/gluster/brick2/addnew1     49154     0          Y       27052
Brick 10.70.35.227:/gluster/brick2/addnew2    49154     0          Y       8721
Brick 10.70.35.146:/gluster/brick2/addnew3    49154     0          Y       24948
Brick 10.70.35.129:/gluster/brick2/addnew4    49154     0          Y       18662
Brick 10.70.35.111:/gluster/brick2/addnew5    49154     0          Y       16863
Brick 10.70.35.232:/gluster/brick2/addnew6    49154     0          Y       17553
Self-heal Daemon on localhost                 N/A       N/A        Y       23604
Self-heal Daemon on 10.70.35.146              N/A       N/A        Y       29837
Self-heal Daemon on 10.70.35.227              N/A       N/A        Y       16248
Self-heal Daemon on 10.70.35.111              N/A       N/A        Y       24529
Self-heal Daemon on 10.70.35.45               N/A       N/A        Y       1484
Self-heal Daemon on 10.70.35.232              N/A       N/A        Y       23001

Task Status of Volume vol_-2-11
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 84cd8bbd-8aab-4f2b-aba5-d8dc54d76e7f
Status               : completed

[root@dhcp35-129 ~]#

Client where the IO failed: 10.70.41.186 (root-redhat)

Client logs -

[2019-10-03 10:11:44.028541] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-6: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028629] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-7: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028656] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-8: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028681] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-9: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028711] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-10: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028779] W [MSGID: 114061] [client-common.c:2871:client_pre_fstat_v2] 2-vol_-2-11-client-11: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.028823] W [fuse-bridge.c:1269:fuse_attr_cbk] 0-glusterfs-fuse: 70782: FSTAT() ERR => -1 (File descriptor in bad state)
[2019-10-03 10:11:44.033748] W [MSGID: 122033] [ec-common.c:1914:ec_locked] 2-vol_-2-11-disperse-0: Failed to complete preop lock [Stale file handle]
[2019-10-03 10:11:44.038882] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-6: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.039147] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-7: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.039200] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-9: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.039245] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-8: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.039277] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-10: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.039315] I [MSGID: 114024] [client-helpers.c:96:this_fd_set_ctx] 2-vol_-2-11-client-11: /DrillHoleTest.log (1f697779-794a-4f9e-b4de-56310b23b9e3): trying duplicate remote fd set.
[2019-10-03 10:11:44.375281] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-6: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375331] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-7: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375359] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-8: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375384] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-9: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375407] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-10: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375431] W [MSGID: 114061] [client-common.c:2625:client_pre_flush_v2] 2-vol_-2-11-client-11: (5a4e775b-afc4-4872-8269-10f7f9c9cf5f) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-10-03 10:11:44.375471] W [fuse-bridge.c:1826:fuse_err_cbk] 0-glusterfs-fuse: 70811: FLUSH() ERR => -1 (File descriptor in bad state)

Server details - 10.70.35.45 (root-redhat)
Client details - 10.70.41.186 & 10.70.43.113
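
For triage, the "remote_fd is -1" warnings in the client log above can be grouped by file operation and GFID to see which client subvolumes reported EBADFD for the same fd. The following is only a minimal sketch, not part of the cmvlt script or of glusterfs: the function name `summarize_ebadfd` and the regular expression are hypothetical, and the pattern assumes the MSGID 114061 line layout exactly as it appears in the log excerpt in this report.

```python
import re
from collections import Counter

# Assumed line layout (taken from the log excerpt in this report):
# [<timestamp>] W [MSGID: 114061] [client-common.c:<line>:client_pre_<fop>_v2]
#   <client-xlator>: (<gfid>) remote_fd is -1. EBADFD [...]
EBADFD_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] W \[MSGID: 114061\] "
    r"\[client-common\.c:\d+:client_pre_(?P<fop>\w+)_v2\] "
    r"(?P<client>\S+): \((?P<gfid>[0-9a-f-]+)\) remote_fd is -1\. EBADFD"
)


def summarize_ebadfd(log_text):
    """Group 'remote_fd is -1' warnings by (fop, gfid).

    Returns a Counter of warnings per (fop, gfid) and a dict mapping the
    same key to the set of client xlator names that logged the warning.
    """
    counts = Counter()
    clients = {}
    for match in EBADFD_RE.finditer(log_text):
        key = (match.group("fop"), match.group("gfid"))
        counts[key] += 1
        clients.setdefault(key, set()).add(match.group("client"))
    return counts, clients
```

To use it, read the FUSE client log into a string and pass it to `summarize_ebadfd`; the log path depends on the mount point and is not stated in this report. In the excerpt above, both the fstat and the flush warnings carry the same GFID and come from client-6 through client-11.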
Created attachment 1624187
script logs
Created attachment 1624678
drill logs from client
Created attachment 1624679
drill logs from client
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:3249