Description of problem:
=======================
Observation: With brick multiplexing enabled, I am seeing EIO for the .trashcan folder on the mount of an EC volume, as below:

[root@dhcp35-103 ecv82]# ls -la
total 12
drwxr-xr-x.  4 root root 4096 Apr 20 15:22 .
drwxr-xr-x. 15 root root 4096 Apr 20 15:17 ..
drwxr-xr-x.  2 root root 4096 Apr 20 15:22 dir1
[root@dhcp35-103 ecv82]# ls -lA
ls: cannot access .trashcan: Input/output error
total 4
drwxr-xr-x. 2 root root 4096 Apr 20 15:22 dir1
d?????????? ? ?    ?       ?            ? .trashcan
[root@dhcp35-103 ecv82]#

Steps
=====
Step 1:  Set up 6 nodes, on which I created the volumes below.
Step 2:  Enabled brick multiplexing.
Step 3:  Created the following volumes:
         ecv82    --> an EC volume of 2 x (8 + 2), spanning nodes n1..n5
         distrep3 --> a 2 x 3 distributed-replicate volume, spanning nodes n1..n3
         Now, as expected, the brick PIDs for all bricks hosted by a given node are the same, since brick multiplexing is enabled (check under logs).
Step 4:  Changed the log level to DEBUG for distrep3 (which should use the same brick log as ecv82).
Step 5:  Mounted distrep3 on a FUSE client.
         =======> NOTE: I am not seeing the .trashcan folder on this mount; don't know why.
Step 6:  Did some I/O.
Step 7:  Set the min-free-disk limit to 50% for distrep3.
Step 8:  Did I/O to see if I get a warning for breaching min-free-disk, and got the below in the client FUSE log:

[2017-04-20 09:45:58.717617] W [MSGID: 109033] [dht-diskusage.c:263:dht_is_subvol_filled] 0-distrep3-dht: disk space on subvolume 'distrep3-replicate-1' is getting full (55.00 %), consider adding more bricks
[2017-04-20 09:46:50.749409] W [MSGID: 109033] [dht-diskusage.c:263:dht_is_subvol_filled] 0-distrep3-dht: disk space on subvolume 'distrep3-replicate-2' is getting full (54.00 %), consider adding more bricks

Step 9:  Mounted ecv82 on a FUSE client.
Step 10: Did an ls -lA and got the EIO:

[root@dhcp35-103 ecv82]# ls -lA
ls: cannot access .trashcan: Input/output error
total 4
drwxr-xr-x. 2 root root 4096 Apr 20 15:22 dir1
d?????????? ? ? ? ? ? .trashcan
[root@dhcp35-103 ecv82]#

############# logs ##############

[root@dhcp35-45 ~]# gluster v status
Status of volume: distrep3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.138:/rhs/brick11/distrep3    49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick11/distrep3    49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick11/distrep3    49152     0          Y       26260
Brick 10.70.35.138:/rhs/brick12/distrep3    49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick12/distrep3    49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick12/distrep3    49152     0          Y       26260
Brick 10.70.35.138:/rhs/brick13/distrep3    49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick13/distrep3    49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick13/distrep3    49152     0          Y       26260
Self-heal Daemon on localhost               N/A       N/A        Y       27630
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       11436
Self-heal Daemon on 10.70.35.112            N/A       N/A        Y       27693
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       26404
Self-heal Daemon on 10.70.35.138            N/A       N/A        Y       26077
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       6247

Task Status of Volume distrep3
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: ecv82
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.138:/rhs/brick1/ecv82        49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick1/ecv82        49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick1/ecv82        49152     0          Y       26260
Brick 10.70.35.23:/rhs/brick1/ecv82         49152     0          Y       11328
Brick 10.70.35.112:/rhs/brick1/ecv82        49152     0          Y       27485
Brick 10.70.35.138:/rhs/brick2/ecv82        49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick2/ecv82        49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick2/ecv82        49152     0          Y       26260
Brick 10.70.35.23:/rhs/brick2/ecv82         49152     0          Y       11328
Brick 10.70.35.112:/rhs/brick2/ecv82        49152     0          Y       27485
Brick 10.70.35.138:/rhs/brick3/ecv82        49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick3/ecv82        49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick3/ecv82        49152     0          Y       26260
Brick 10.70.35.23:/rhs/brick3/ecv82         49152     0          Y       11328
Brick 10.70.35.112:/rhs/brick3/ecv82        49152     0          Y       27485
Brick 10.70.35.138:/rhs/brick4/ecv82        49152     0          Y       25929
Brick 10.70.35.130:/rhs/brick4/ecv82        49152     0          Y       6100
Brick 10.70.35.122:/rhs/brick4/ecv82        49152     0          Y       26260
Brick 10.70.35.23:/rhs/brick4/ecv82         49152     0          Y       11328
Brick 10.70.35.112:/rhs/brick4/ecv82        49152     0          Y       27485
Self-heal Daemon on localhost               N/A       N/A        Y       27630
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       26404
Self-heal Daemon on 10.70.35.138            N/A       N/A        Y       26077
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       11436
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       6247
Self-heal Daemon on 10.70.35.112            N/A       N/A        Y       27693

Task Status of Volume ecv82
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-45 ~]# gluster v info

Volume Name: distrep3
Type: Distributed-Replicate
Volume ID: 28a6c08e-b7a0-4135-88fa-4b9ae250d609
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 10.70.35.138:/rhs/brick11/distrep3
Brick2: 10.70.35.130:/rhs/brick11/distrep3
Brick3: 10.70.35.122:/rhs/brick11/distrep3
Brick4: 10.70.35.138:/rhs/brick12/distrep3
Brick5: 10.70.35.130:/rhs/brick12/distrep3
Brick6: 10.70.35.122:/rhs/brick12/distrep3
Brick7: 10.70.35.138:/rhs/brick13/distrep3
Brick8: 10.70.35.130:/rhs/brick13/distrep3
Brick9: 10.70.35.122:/rhs/brick13/distrep3
Options Reconfigured:
cluster.min-free-disk: 50
cluster.quorum-count: 1
diagnostics.brick-log-level: DEBUG
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable

Volume Name: ecv82
Type: Distributed-Disperse
Volume ID: c2a84a0f-a95f-4264-984b-2e0879da7f99
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (8 + 2) = 20
Transport-type: tcp
Bricks:
Brick1: 10.70.35.138:/rhs/brick1/ecv82
Brick2: 10.70.35.130:/rhs/brick1/ecv82
Brick3: 10.70.35.122:/rhs/brick1/ecv82
Brick4: 10.70.35.23:/rhs/brick1/ecv82
Brick5: 10.70.35.112:/rhs/brick1/ecv82
Brick6: 10.70.35.138:/rhs/brick2/ecv82
Brick7: 10.70.35.130:/rhs/brick2/ecv82
Brick8: 10.70.35.122:/rhs/brick2/ecv82
Brick9: 10.70.35.23:/rhs/brick2/ecv82
Brick10: 10.70.35.112:/rhs/brick2/ecv82
Brick11: 10.70.35.138:/rhs/brick3/ecv82
Brick12: 10.70.35.130:/rhs/brick3/ecv82
Brick13: 10.70.35.122:/rhs/brick3/ecv82
Brick14: 10.70.35.23:/rhs/brick3/ecv82
Brick15: 10.70.35.112:/rhs/brick3/ecv82
Brick16: 10.70.35.138:/rhs/brick4/ecv82
Brick17: 10.70.35.130:/rhs/brick4/ecv82
Brick18: 10.70.35.122:/rhs/brick4/ecv82
Brick19: 10.70.35.23:/rhs/brick4/ecv82
Brick20: 10.70.35.112:/rhs/brick4/ecv82
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
[root@dhcp35-45 ~]#

Rationale of the testing:
I wanted to check the behavior when brick multiplexing is in effect and we try to change some per-brick settings.

Version-Release number of selected component (if applicable):
=============================================================
3.8.4-22
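For reference, the setup steps above can be sketched as gluster CLI commands. This is a sketch only: node IPs, brick paths, volume geometry, and option values are taken from the report, while the mount point /mnt/ecv82 and mounting from node 10.70.35.138 are assumptions of mine.

```shell
# Step 2: enable brick multiplexing (a cluster-wide option, set on "all")
gluster volume set all cluster.brick-multiplex enable

# Step 3: create the 3x3 distributed-replicate volume
gluster volume create distrep3 replica 3 \
    10.70.35.138:/rhs/brick11/distrep3 10.70.35.130:/rhs/brick11/distrep3 10.70.35.122:/rhs/brick11/distrep3 \
    10.70.35.138:/rhs/brick12/distrep3 10.70.35.130:/rhs/brick12/distrep3 10.70.35.122:/rhs/brick12/distrep3 \
    10.70.35.138:/rhs/brick13/distrep3 10.70.35.130:/rhs/brick13/distrep3 10.70.35.122:/rhs/brick13/distrep3
gluster volume start distrep3

# Step 3 (cont.): build the 20-brick list for the 2 x (8 + 2) disperse volume
bricks=""
for b in 1 2 3 4; do
    for host in 10.70.35.138 10.70.35.130 10.70.35.122 10.70.35.23 10.70.35.112; do
        bricks="$bricks $host:/rhs/brick$b/ecv82"
    done
done
gluster volume create ecv82 disperse-data 8 redundancy 2 $bricks
gluster volume start ecv82

# Step 4: change the brick log level for distrep3 only
gluster volume set distrep3 diagnostics.brick-log-level DEBUG

# Step 7: set the min-free-disk limit to 50% for distrep3
gluster volume set distrep3 cluster.min-free-disk 50

# Steps 9-10: mount the EC volume on a client and list it
mount -t glusterfs 10.70.35.138:/ecv82 /mnt/ecv82    # mount point is hypothetical
ls -lA /mnt/ecv82    # reproduced: ls: cannot access .trashcan: Input/output error
```

Since the multiplexed brick process is shared across volumes, the per-volume log-level change in Step 4 is the interesting part: both volumes' bricks live in one glusterfsd per node, so a setting intended for distrep3 can affect the process serving ecv82 as well.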
Logs are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1443941/
fuse mount log:

[2017-04-20 09:48:19.851779] W [fuse-resolve.c:61:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/.trashcan: failed to resolve (Input/output error)
[2017-04-20 09:48:19.854563] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-ecv82-dht: Found anomalies in /.trashcan (gfid = 00000000-0000-0000-0000-000000000005). Holes=1 overlaps=0
[2017-04-20 09:48:19.855996] W [MSGID: 109065] [dht-selfheal.c:1410:dht_selfheal_dir_mkdir_lock_cbk] 0-ecv82-dht: acquiring inodelk failed for /.trashcan [Input/output error]
[2017-04-20 09:48:19.856077] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 23: LOOKUP() /.trashcan => -1 (Input/output error)
[2017-04-20 09:52:35.145993] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-ecv82-dht: Found anomalies in /.trashcan (gfid = 00000000-0000-0000-0000-000000000005). Holes=1 overlaps=0
[2017-04-20 09:52:35.147543] W [MSGID: 109065] [dht-selfheal.c:1410:dht_selfheal_dir_mkdir_lock_cbk] 0-ecv82-dht: acquiring inodelk failed for /.trashcan [Input/output error]
[2017-04-20 09:52:35.147583] W [fuse-resolve.c:61:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/.trashcan: failed to resolve (Input/output error)
[2017-04-20 09:52:35.152434] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 866: LOOKUP() /.trashcan => -1 (Input/output error)
[2017-04-20 09:52:35.150938] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-ecv82-dht: Found anomalies in /.trashcan (gfid = 00000000-0000-0000-0000-000000000005). Holes=1 overlaps=0
[2017-04-20 09:52:35.152401] W [MSGID: 109065] [dht-selfheal.c:1410:dht_selfheal_dir_mkdir_lock_cbk] 0-ecv82-dht: acquiring inodelk failed for /.trashcan [Input/output error]
upstream patch : https://review.gluster.org/#/c/17225
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106137
Looks like we have an issue with this patch; moving this bug to POST.
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/108021/
Tested on 3.8.4-27: with the above steps, the problem is no longer seen, hence moving to VERIFIED. Also, I now see .trashcan on all volumes with brick multiplexing enabled.
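The verification can be spot-checked from a client and a server node. This is a hypothetical sketch, not the tester's exact commands: the mount point is assumed, and the awk/sort pipeline is just one way to summarize the PIDs from `gluster volume status`.

```shell
# Hypothetical verification sketch for 3.8.4-27:
# .trashcan should now resolve without EIO on the EC volume's mount.
mount -t glusterfs 10.70.35.138:/ecv82 /mnt/ecv82    # mount point is assumed
ls -lA /mnt/ecv82            # .trashcan should appear as a normal directory
stat /mnt/ecv82/.trashcan    # should succeed rather than return EIO

# Confirm brick multiplexing is still in effect: in the status output each
# "Brick <host>:<path> <port> <rdma-port> <online> <pid>" row ends with the
# brick PID, so one unique PID per node means one glusterfsd serves all bricks.
gluster volume status ecv82 | awk '/^Brick/ {print $NF}' | sort -u
```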
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774