Bug 1311881 - VM paused with permission denied and Invalid argument errors in fuse mount logs
Status: CLOSED WORKSFORME
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: replicate
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Krutika Dhananjay
QA Contact: storage-qa-internal@redhat.com
Keywords: ZStream
Depends On:
Blocks: Gluster-HC-1
Reported: 2016-02-25 04:26 EST by Sahina Bose
Modified: 2016-09-17 08:18 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-17 01:10:49 EDT
Type: Bug


Description Sahina Bose 2016-02-25 04:26:28 EST
Description of problem:

After running replace-brick on a replica 3 volume, the VM running on that Gluster volume goes into a paused state.

Details:

Nodes rhsdev9, rhsdev-docker1, and rhsdev-docker2 host the volume "engine" (replica 3):

Volume Name: engine
Type: Replicate
Volume ID: 7830b6f2-4cdf-4ec5-bb2b-c8804196a554
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhsdev9.xxx:/rhgs/engine/brick1
Brick2: rhsdev-docker1.xxx:/rhgs/engine/brick1
Brick3: rhsdev-docker2.xxx:/rhgs/engine/brick1
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on


1. Peer-probed a new node (rhsdev14) into this cluster.
2. gluster volume replace-brick engine rhsdev-docker1.lab.eng.blr.redhat.com:/rhgs/engine/brick1 rhsdev14.lab.eng.blr.redhat.com:/rhgs/engine/brick1 commit force
3. gluster volume heal engine info reported unsynced entries, so I waited for the heal to complete.
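
The wait in step 3 can be scripted instead of checked by hand. The following is a minimal sketch, assuming that `gluster volume heal <volume> info` prints one `Number of entries: N` line per brick. The helper names `heal_entry_counts` and `wait_for_heal`, the polling interval, and the timeout are illustrative and not part of this report.

```python
import re
import subprocess
import time

# Matches the per-brick counter line in heal info output.
# A non-numeric value (for example "-" for an unreachable brick) is ignored.
ENTRY_RE = re.compile(r"^Number of entries:[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def heal_entry_counts(heal_info_output):
    """Return the list of per-brick pending-entry counts found in the output."""
    return [int(n) for n in ENTRY_RE.findall(heal_info_output)]


def wait_for_heal(volume, interval=10, timeout=3600):
    """Poll heal info until every reported brick shows 0 entries.

    Returns True once the heal looks complete, False on timeout.
    interval and timeout are in seconds (assumed defaults).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        out = subprocess.run(
            ["gluster", "volume", "heal", volume, "info"],
            check=True, capture_output=True, text=True,
        ).stdout
        counts = heal_entry_counts(out)
        # Require at least one counter line so empty output is not
        # mistaken for a finished heal.
        if counts and sum(counts) == 0:
            return True
        time.sleep(interval)
    return False
```

For the volume in this report the call would be `wait_for_heal("engine")`. A zero count only says that no entries are pending; it does not by itself rule out the permission errors shown in the mount log.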

The QEMU VM that was running on rhsdev-docker1 goes into a paused state with the EOTHER error code.

Mount log on rhsdev-docker1:

[2016-02-23 11:33:44.759843] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 3-engine-client-1: remote operation failed. Path: <gfid:5947ba69-bd6b-48f9-9bee-8cc60f8da2fc> (5947ba69-bd6b-48f9-9bee-8cc60f8da2fc) [Permission denied]
[2016-02-23 11:33:44.779746] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed data selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc. source=0 sinks=1
[2016-02-23 11:33:44.784343] W [MSGID: 114031] [client-rpc-fops.c:2325:client3_3_setattr_cbk] 3-engine-client-1: remote operation failed [Operation not permitted]
[2016-02-23 11:33:44.784716] W [MSGID: 114031] [client-rpc-fops.c:1164:client3_3_getxattr_cbk] 3-engine-client-1: remote operation failed. Path: <gfid:5947ba69-bd6b-48f9-9bee-8cc60f8da2fc> (5947ba69-bd6b-48f9-9bee-8cc60f8da2fc). Key: (null) [Permission denied]
[2016-02-23 11:33:44.785114] W [MSGID: 114031] [client-rpc-fops.c:1088:client3_3_setxattr_cbk] 3-engine-client-1: remote operation failed [Permission denied]
[2016-02-23 11:33:44.788631] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed metadata selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc. source=0 sinks=
[2016-02-23 11:33:47.561145] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 3-engine-replicate-0: performing metadata selfheal on 7ebdb94b-26fa-40b5-92d1-581c6f15063a
[2016-02-23 11:33:47.563518] W [MSGID: 114031] [client-rpc-fops.c:2325:client3_3_setattr_cbk] 3-engine-client-1: remote operation failed [Operation not permitted]
[2016-02-23 11:33:47.563999] W [MSGID: 114031] [client-rpc-fops.c:1164:client3_3_getxattr_cbk] 3-engine-client-1: remote operation failed. Path: /5f3618cd-cf2f-4f11-9ee4-670b2841616f/images/cc4c6f3d-f71e-4380-850f-7ee0cab016a5/ffcd824c-f473-43a7-97e8-2bd08053e6b8 (7ebdb94b-26fa-40b5-92d1-581c6f15063a). Key: (null) [Permission denied]
[2016-02-23 11:33:47.564546] W [MSGID: 114031] [client-rpc-fops.c:1088:client3_3_setxattr_cbk] 3-engine-client-1: remote operation failed [Permission denied]
[2016-02-23 11:33:47.569011] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed metadata selfheal on 7ebdb94b-26fa-40b5-92d1-581c6f15063a. source=0 sinks=
[2016-02-23 11:33:44.782783] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 3-engine-replicate-0: performing metadata selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc
[2016-02-23 11:45:14.475350] W [fuse-bridge.c:1284:fuse_err_cbk] 0-glusterfs-fuse: 64097409: FSYNC() ERR => -1 (Invalid argument) 
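
To see at a glance which translator and operation are failing, the E and W lines of a mount log like the one above can be grouped by translator, callback function, and error text. This is a minimal sketch that assumes the line layout seen in this excerpt (timestamp, level, optional MSGID, source location, translator name, message); the function name `summarize_failures` is illustrative.

```python
import re
from collections import Counter

# [timestamp] LEVEL [MSGID: n] [file:line:function] translator: message
# The MSGID field is optional (the fuse-bridge line has none).
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<xlator>[^:\s]+): (?P<msg>.*)$"
)

# Error text is the last bracketed or parenthesized token of the message,
# e.g. "[Permission denied]" or "(Invalid argument)".
ERR_RE = re.compile(r"[\[(]([^\])]+)[\])]\s*$")


def summarize_failures(lines):
    """Count error/warning lines per (translator, function, error text)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("level") not in ("E", "W"):
            continue  # skip informational and unparseable lines
        err = ERR_RE.search(m.group("msg"))
        func = m.group("src").rsplit(":", 1)[-1]
        key = (m.group("xlator"), func, err.group(1) if err else "unknown")
        counts[key] += 1
    return counts
```

Applied to the excerpt above, every failed client operation is attributed to 3-engine-client-1, and the only other failure is the FSYNC error reported by 0-glusterfs-fuse. If client translators are numbered from 0 in brick order, as is usual, 3-engine-client-1 would correspond to Brick2, the brick that was replaced.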

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-18.32.giteb76d88.el7rhgs.x86_64


How reproducible:
Not always

Steps to Reproduce:
As above


Additional info:
Logs from the 3 nodes will be attached
Comment 2 Sahina Bose 2016-02-25 05:03:30 EST
Log files are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1311881/
Comment 5 Sahina Bose 2016-05-17 01:10:49 EDT
Could not reproduce the permission denied errors when retrying with the latest glusterfs-3.7.9-4 (rhgs). However, a new issue was encountered and logged as Bug 1336295.

Will re-open this if encountered again.
