Description of problem:
On running replace-brick on a replica 3 volume, the VM running on the gluster volume goes to a paused state.

Details:
rhsdev9, rhsdev-docker1, rhsdev-docker2 - running volume engine (replica 3)

Volume Name: engine
Type: Replicate
Volume ID: 7830b6f2-4cdf-4ec5-bb2b-c8804196a554
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhsdev9.xxx:/rhgs/engine/brick1
Brick2: rhsdev-docker1.xxx:/rhgs/engine/brick1
Brick3: rhsdev-docker2.xxx:/rhgs/engine/brick1
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on

1. Peer probed the new node (rhsdev14) into this cluster
2. gluster volume replace-brick engine rhsdev-docker1.lab.eng.blr.redhat.com:/rhgs/engine/brick1 rhsdev14.lab.eng.blr.redhat.com:/rhgs/engine/brick1 commit force
3. gluster volume heal engine info - reported unsynced entries; waited for heal to complete

The qemu VM that was running on rhsdev-docker1 went to a paused state with the EOTHER error code.

Mount log on rhsdev-docker1:

[2016-02-23 11:33:44.759843] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 3-engine-client-1: remote operation failed. Path: <gfid:5947ba69-bd6b-48f9-9bee-8cc60f8da2fc> (5947ba69-bd6b-48f9-9bee-8cc60f8da2fc) [Permission denied]
[2016-02-23 11:33:44.779746] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed data selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc. source=0 sinks=1
[2016-02-23 11:33:44.784343] W [MSGID: 114031] [client-rpc-fops.c:2325:client3_3_setattr_cbk] 3-engine-client-1: remote operation failed [Operation not permitted]
[2016-02-23 11:33:44.784716] W [MSGID: 114031] [client-rpc-fops.c:1164:client3_3_getxattr_cbk] 3-engine-client-1: remote operation failed. Path: <gfid:5947ba69-bd6b-48f9-9bee-8cc60f8da2fc> (5947ba69-bd6b-48f9-9bee-8cc60f8da2fc). Key: (null) [Permission denied]
[2016-02-23 11:33:44.785114] W [MSGID: 114031] [client-rpc-fops.c:1088:client3_3_setxattr_cbk] 3-engine-client-1: remote operation failed [Permission denied]
[2016-02-23 11:33:44.788631] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed metadata selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc. source=0 sinks=
[2016-02-23 11:33:47.561145] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 3-engine-replicate-0: performing metadata selfheal on 7ebdb94b-26fa-40b5-92d1-581c6f15063a
[2016-02-23 11:33:47.563518] W [MSGID: 114031] [client-rpc-fops.c:2325:client3_3_setattr_cbk] 3-engine-client-1: remote operation failed [Operation not permitted]
[2016-02-23 11:33:47.563999] W [MSGID: 114031] [client-rpc-fops.c:1164:client3_3_getxattr_cbk] 3-engine-client-1: remote operation failed. Path: /5f3618cd-cf2f-4f11-9ee4-670b2841616f/images/cc4c6f3d-f71e-4380-850f-7ee0cab016a5/ffcd824c-f473-43a7-97e8-2bd08053e6b8 (7ebdb94b-26fa-40b5-92d1-581c6f15063a). Key: (null) [Permission denied]
[2016-02-23 11:33:47.564546] W [MSGID: 114031] [client-rpc-fops.c:1088:client3_3_setxattr_cbk] 3-engine-client-1: remote operation failed [Permission denied]
[2016-02-23 11:33:47.569011] I [MSGID: 108026] [afr-self-heal-common.c:662:afr_log_selfheal] 3-engine-replicate-0: Completed metadata selfheal on 7ebdb94b-26fa-40b5-92d1-581c6f15063a. source=0 sinks=
[2016-02-23 11:33:44.782783] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 3-engine-replicate-0: performing metadata selfheal on 5947ba69-bd6b-48f9-9bee-8cc60f8da2fc
[2016-02-23 11:45:14.475350] W [fuse-bridge.c:1284:fuse_err_cbk] 0-glusterfs-fuse: 64097409: FSYNC() ERR => -1 (Invalid argument)

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-18.32.giteb76d88.el7rhgs.x86_64

How reproducible:
Not always

Steps to Reproduce:
As above

Additional info:
Logs from the 3 nodes will be attached
Log files are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1311881/
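The reproduction steps above can be sketched as a script. The hostnames, brick paths, and gluster commands are the ones from this report; the heal_entries helper and the polling loop are illustrative assumptions, based on `gluster volume heal <vol> info` printing a "Number of entries: N" line per brick.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report. The heal_entries
# helper and the 10-second poll interval are assumptions, not part of
# the original report.

VOL=engine
OLD_BRICK=rhsdev-docker1.lab.eng.blr.redhat.com:/rhgs/engine/brick1
NEW_BRICK=rhsdev14.lab.eng.blr.redhat.com:/rhgs/engine/brick1

# Sum the per-brick "Number of entries:" counts from `heal info` output.
heal_entries() {
    awk -F': ' '/^Number of entries:/ {sum += $2} END {print sum + 0}'
}

# Guard the cluster-side steps so the sketch is harmless off-cluster.
if command -v gluster >/dev/null 2>&1; then
    # 1. Add the new node to the trusted storage pool.
    gluster peer probe rhsdev14.lab.eng.blr.redhat.com

    # 2. Replace the brick in a single step.
    gluster volume replace-brick "$VOL" "$OLD_BRICK" "$NEW_BRICK" commit force

    # 3. Wait until self-heal reports no unsynced entries.
    while [ "$(gluster volume heal "$VOL" info | heal_entries)" -gt 0 ]; do
        sleep 10
    done
fi
```

Per the report, the VM pause was observed while waiting in step 3, i.e. while heal was still catching up on the replaced brick.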
Could not reproduce the permission denied errors when retrying with the latest glusterfs-3.7.9-4 (RHGS). However, a new issue was encountered and logged as Bug 1336295. Will re-open this if encountered again.