Bug 1308892 - [geo-rep]: Observed "Operation not supported" error with traceback on slave log
Summary: [geo-rep]: Observed "Operation not supported" error with traceback on slave log
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: hari gowtham
QA Contact: Rahul Hinduja
URL:
Whiteboard: rebase
Depends On:
Blocks: 1499180 1500396
TreeView+ depends on / blocked
 
Reported: 2016-02-16 11:33 UTC by Rahul Hinduja
Modified: 2020-06-10 12:12 UTC (History)
14 users (show)

Fixed In Version: glusterfs-3.12.2-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1499180 (view as bug list)
Environment:
Last Closed: 2020-03-10 09:09:11 UTC
Embargoed:


Attachments (Terms of Use)

Description Rahul Hinduja 2016-02-16 11:33:59 UTC
Description of problem:
=======================

During the automation run found a traceback with Operation not supported errors as follows: 

[2016-02-12 10:06:56.109858] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-02-12 10:06:56.110126] I [syncdutils(slave):220:finalize] <top>: exiting.
[2016-02-12 10:07:02.720356] E [repce(slave):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 685, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 662, in recursive_rmdir
    recursive_rmdir(gfid, entry, fullname)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 650, in recursive_rmdir
    names = errno_wrap(os.listdir, [path], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 95] Operation not supported: '.gfid/63d4c437-7990-4b4b-a6fa-c0c6ef942d3f/level88/symlink_to_files'


Other logs reported during readdir as:


[2016-02-12 09:57:51.995660] W [MSGID: 114031]
[client-rpc-fops.c:2669:client3_3_readdirp_cbk] 0-slave-client-8: remote
operation failed [Operation not permitted]
[2016-02-12 09:57:51.996228] W [MSGID: 114031]
[client-rpc-fops.c:2669:client3_3_readdirp_cbk] 0-slave-client-11:
remote operation failed [Operation not permitted]
[2016-02-12 09:57:51.996475] W [MSGID: 114031]
[client-rpc-fops.c:2669:client3_3_readdirp_cbk] 0-slave-client-9: remote
operation failed [Operation not permitted]
[2016-02-12 09:57:51.996700] W [MSGID: 114031]
[client-rpc-fops.c:2669:client3_3_readdirp_cbk] 0-slave-client-4: remote
operation failed [Operation not permitted]

Syncing is done to slave and arequal checksum matches.


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-19.el6rhs.x86_64


How reproducible:
=================

Observed once

Steps to Reproduce:
===================
1. on 6x2 master and slave volume run the geo-replication distaf automation of 34 cases with rsync and fuse protocol

Actual results:
===============

Operation not supported errors observed in slave logs

Comment 3 Kotresh HR 2016-02-17 10:36:49 UTC
Analysis:

Geo-replication slave logs on Slave node is as follows:

[2016-02-12 09:58:34.276675] I [gsyncd(slave):652:main_i] <top>: syncing: gluster://localhost:slave
[2016-02-12 09:58:35.422693] I [resource(slave):844:service_loop] GLUSTER: slave listening
[2016-02-12 10:00:01.626035] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 685, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 662, in recursive_rmdir
    recursive_rmdir(gfid, entry, fullname)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 650, in recursive_rmdir
    names = errno_wrap(os.listdir, [path], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 95] Operation not supported: '.gfid/0fd92c95-df7b-4e64-a0c6-04d6cf01e9ee/level68/symlink_to_files'


So it is during os.lisdir from python.

If we see the corresponding aux-mount logs during same time:
...
[2016-02-12 10:00:01.625414] W [fuse-bridge.c:982:fuse_fd_cbk] 0-glusterfs-fuse: 7065: OPENDIR() /.gfid/519f470b-b488-4da2-8a8d-a968d5394fc7 => -1 (Operation not supported)
...

We can see OPENDIR on /.gfid/<GFID> is being called, which is not supported.
gfid access translator will return with 'Operatoin not supported' on such opendirs.

But yet to find out who called OPENDIR on .gfid/<GFID>

Comment 4 Rahul Hinduja 2016-05-23 09:18:26 UTC
For reference: Hitting this on 3.1.3 (glusterfs-3.7.9-5)

Slave Log:
==========

[2016-05-22 15:56:37.597151] E [repce(slave):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 714, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 654, in recursive_rmdir
    names = errno_wrap(os.listdir, [path], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap
    return call(*arg)
OSError: [Errno 95] Operation not supported: '.gfid/201b5726-579d-4158-a72d-8b30ed275553/symlink_to_files'
[2016-05-22 15:56:37.631872] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.

Comment 5 Rahul Hinduja 2016-11-14 07:33:19 UTC
For reference: Hitting this on 3.2.0 (glusterfs-3.8.4-3.el7rhgs.x86_64)

Slave:
======

[2016-11-13 11:36:06.387545] E [repce(slave):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 726, in entry_ops
    [], [ENOTEMPTY, ESTALE, ENODATA])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 495, in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 666, in recursive_rmdir
    names = errno_wrap(os.listdir, [path], [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 495, in errno_wrap
    return call(*arg)
OSError: [Errno 95] Operation not supported: '.gfid/28693d18-b73b-47c7-986a-b0e1608f8ab5/level410'
[2016-11-13 11:36:06.449531] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.


Master:
=======
[2016-11-13 11:36:06.389225] E [syncdutils(/bricks/brick1/master_brick4):296:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 204, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 747, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1535, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 571, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1132, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1107, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 992, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 933, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__
    raise res
OSError: [Errno 95] Operation not supported: '.gfid/28693d18-b73b-47c7-986a-b0e1608f8ab5/level410'
[2016-11-13 11:36:06.418972] I [syncdutils(/bricks/brick1/master_brick4):237:finalize] <top>: exiting.

Comment 7 Sunil Kumar Acharya 2017-10-06 10:31:29 UTC
upstream patch : https://review.gluster.org/18443

Comment 20 Anand Paladugu 2018-07-18 19:22:33 UTC
Is it worth documenting this issue (as a KCS) or known issue in 3.4 Release notes , so customer can try to remedy the issue instead of opening a support case ?

Comment 21 Kotresh HR 2018-07-20 12:53:11 UTC
(In reply to Anand Paladugu from comment #20)
> Is it worth documenting this issue (as a KCS) or known issue in 3.4 Release
> notes , so customer can try to remedy the issue instead of opening a support
> case ?

No, not required. There is no manual admint intervention required. The geo-rep worker restarts automatically and fixes it self.


Note You need to log in before you can comment on or make changes to this bug.