+++ This bug was initially created as a clone of Bug #1726673 +++

Description of problem:
While performing a remove-brick operation to convert a 3x3 volume to a 2x3 volume, the remove-brick rebalance reported failures such as:

E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk] 0-vol4-client-8: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43 (69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]

Version-Release number of selected component (if applicable):
6.0.7

How reproducible:
1/1

Steps to Reproduce:
1. Created a 1x3 volume.
2. Fuse mounted the volume and started I/O on it.
3. Converted it into a 2x3 volume and triggered rebalance.
4. Let the rebalance complete, then converted it into a 3x3 volume and triggered rebalance again.
5. Started a remove-brick operation on the volume to convert it back into a 2x3 volume.
6. Checked the remove-brick status.
(An illustrative command sequence is sketched after the Expected results below.)

Actual results:
There are failures in the remove-brick rebalance.

Errors from the rebalance logs:
E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk] 0-vol4-client-2: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43 (69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk] 0-vol4-client-8: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43 (69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-vol4-client-8: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43/level53/5d1b1579%%P3TRO7PG35 (558423e2-478e-40e9-9958-31c710e50b89) [Input/output error]
W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-vol4-client-2: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43 (69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]

Expected results:
Remove-brick should complete successfully.
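For clarity, the steps above roughly correspond to the following gluster CLI sequence. This is only an illustrative sketch: the hostnames, brick paths, and mount point are placeholders, not the exact layout used in this setup (the actual volume spreads bricks across six servers, as the volume status below shows).

<code>
# 1. Create a 1x3 (replica 3) volume and start it (hostnames/paths are placeholders)
gluster volume create vol4 replica 3 server1:/bricks/brick1/vol4-b1 server2:/bricks/brick1/vol4-b1 server3:/bricks/brick1/vol4-b1
gluster volume start vol4

# 2. Fuse mount the volume and start I/O on it
mount -t glusterfs server1:/vol4 /mnt/vol4

# 3. Expand to 2x3 and rebalance
gluster volume add-brick vol4 replica 3 server1:/bricks/brick2/vol4-b2 server2:/bricks/brick2/vol4-b2 server3:/bricks/brick2/vol4-b2
gluster volume rebalance vol4 start

# 4. After rebalance completes, expand to 3x3 and rebalance again
gluster volume rebalance vol4 status
gluster volume add-brick vol4 replica 3 server1:/bricks/brick3/vol4-b3 server2:/bricks/brick3/vol4-b3 server3:/bricks/brick3/vol4-b3
gluster volume rebalance vol4 start

# 5-6. Shrink back to 2x3 with remove-brick and check its status
gluster volume remove-brick vol4 replica 3 server1:/bricks/brick3/vol4-b3 server2:/bricks/brick3/vol4-b3 server3:/bricks/brick3/vol4-b3 start
gluster volume remove-brick vol4 replica 3 server1:/bricks/brick3/vol4-b3 server2:/bricks/brick3/vol4-b3 server3:/bricks/brick3/vol4-b3 status
</code>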
Remove-brick rebalance status:
==============================

# gluster v remove-brick vol4 replica 3 10.70.47.88:/bricks/brick2/vol4-b2 10.70.47.190:/bricks/brick2/vol4-b2 10.70.47.5:/bricks/brick2/vol4-b2 status
        Node   Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
   ---------        -----------   -----------   -----------   -----------   -----------         ------------     --------------
10.70.47.190               3463         3.5MB         18425            23             0            completed        0:37:14
  10.70.47.5               3308         3.7MB         21920           136             0            completed        0:32:59
   localhost               3397         3.3MB         21977           138             0            completed        0:33:35

On checking the volume status, it showed that two bricks are down:
==================================================================

# gluster v status vol4
Status of volume: vol4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.88:/bricks/brick2/vol4-b1    49159     0          Y       30394
Brick 10.70.47.190:/bricks/brick2/vol4-b1   49159     0          Y       29191
Brick 10.70.47.5:/bricks/brick2/vol4-b1     N/A       N/A        N       N/A
Brick 10.70.46.246:/bricks/brick2/vol4-b1   49158     0          Y       22598
Brick 10.70.47.188:/bricks/brick2/vol4-b1   49158     0          Y       22865
Brick 10.70.46.63:/bricks/brick2/vol4-b1    49158     0          Y       21036
Brick 10.70.47.88:/bricks/brick2/vol4-b2    49160     0          Y       5938
Brick 10.70.47.190:/bricks/brick2/vol4-b2   49160     0          Y       4825
Brick 10.70.47.5:/bricks/brick2/vol4-b2     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       6330
Self-heal Daemon on 10.70.46.246            N/A       N/A        Y       5672
Self-heal Daemon on 10.70.47.5              N/A       N/A        Y       5600
Self-heal Daemon on 10.70.46.63             N/A       N/A        Y       4593
Self-heal Daemon on 10.70.47.188            N/A       N/A        Y       4501
Self-heal Daemon on 10.70.47.190            N/A       N/A        Y       5352

Task Status of Volume vol4
------------------------------------------------------------------------------
Task                 : Remove brick
ID                   : 273f04c3-b8bb-4613-a403-0c655de86ca3
Removed bricks:
10.70.47.88:/bricks/brick2/vol4-b2
10.70.47.190:/bricks/brick2/vol4-b2
10.70.47.5:/bricks/brick2/vol4-b2
Status               : completed

dmesg:
=====
[161039.214245] XFS (dm-66): Metadata CRC error detected at xfs_dir3_block_read_verify+0x5e/0x110 [xfs], xfs_dir3_block block 0x1dd8568
[161039.214912] XFS (dm-66): Unmount and run xfs_repair
[161039.215126] XFS (dm-66): First 64 bytes of corrupted metadata buffer:
[161039.215426] ffffbb1db27a6000: 20 20 20 20 20 23 20 51 75 69 63 6b 20 4d 61 69       # Quick Mai
[161039.215729] ffffbb1db27a6010: 6c 20 54 72 61 6e 73 66 65 72 20 50 72 6f 74 6f  l Transfer Proto
[161039.216110] ffffbb1db27a6020: 63 6f 6c 0a 71 6d 74 70 20 20 20 20 20 20 20 20  col.qmtp
[161039.216527] ffffbb1db27a6030: 20 20 20 20 32 30 39 2f 75 64 70 20 20 20 20 20      209/udp
[161039.217200] XFS (dm-66): metadata I/O error: block 0x1dd8568 ("xfs_trans_read_buf_map") error 74 numblks 16
[161039.217937] XFS (dm-66): xfs_do_force_shutdown(0x1) called from line 370 of file fs/xfs/xfs_trans_buf.c. Return address = 0xffffffffc057de9a
[161039.344196] XFS (dm-66): I/O Error Detected. Shutting down filesystem
[161039.344495] XFS (dm-66): Please umount the filesystem and rectify the problem(s)

---> Because of the disk corruption, one brick is down in two of the volume's replica pairs; however, since this is a distributed-replicated volume, the rebalance should still not have seen failures.
Failure reason:
"[2019-07-02 08:32:01.514139] W [MSGID: 109023] [dht-rebalance.c:626:__is_file_migratable] 0-vol4-dht: Migrate file failed: /dir1/thread0/level04/level14/level24/level34/level44/level54/level64/level74/level84/symlink_to_files/5d1b15ed%%XS3OMQKQBN: Unable to get lock count for file"

The key GLUSTERFS_POSIXLK_COUNT is used to fetch the lock count from the posix-lock translator. This information is used to decide whether or not to migrate the file.

In the current scenario, as Sayalee mentioned, one disk is corrupted on server *.5, rendering both participating bricks from that server unresponsive (all operations lead to I/O errors). Given that only one of the bricks in each affected replica set was down, DHT should still have received a valid response. Instead, the key was missing from the dictionary entirely.

Moving to the AFR component for analysis. Adding a needinfo on Rafi, as he had done some investigation on the same.

--- Additional comment from Mohammed Rafi KC on 2019-07-10 16:08:22 UTC ---

RCA:

As mentioned in comment 6, the operation failed because the lookup could not return the lock count requested through GLUSTERFS_POSIXLK_COUNT. While processing afr_lookup_cbk, if the entry requires a name heal, AFR performs it in afr_lookup_selfheal_wrap, wiping all of the current lookup data, and returns the fresh data once the subsequent lookup finishes. But that healing lookup is issued without the original xdata_req, so posix is never asked to populate the lock count.

<code>
int
afr_lookup_selfheal_wrap(void *opaque)
{
    int ret = 0;
    call_frame_t *frame = opaque;
    afr_local_t *local = NULL;
    xlator_t *this = NULL;
    inode_t *inode = NULL;
    uuid_t pargfid = {
        0,
    };

    local = frame->local;
    this = frame->this;
    loc_pargfid(&local->loc, pargfid);

    ret = afr_selfheal_name(frame->this, pargfid, local->loc.name,
                            &local->cont.lookup.gfid_req, local->xattr_req);
    if (ret == -EIO)
        goto unwind;

    afr_local_replies_wipe(local, this->private);

    inode = afr_selfheal_unlocked_lookup_on(frame, local->loc.parent,
                                            local->loc.name, local->replies,
                                            local->child_up, NULL);
    if (inode)
        inode_unref(inode);

    afr_lookup_metadata_heal_check(frame, this);
    return 0;

unwind:
    AFR_STACK_UNWIND(lookup, frame, -1, EIO, NULL, NULL, NULL, NULL);
    return 0;
}
</code>
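Judging from the review below, the fix is essentially to stop dropping the caller's xattr_req on this path. A minimal sketch of the relevant change in afr_lookup_selfheal_wrap (not the verbatim merged patch) could look like this:

<code>
    /* Before: the unlocked lookup issued from the name-heal path drops the
     * caller's xdata, so keys such as GLUSTERFS_POSIXLK_COUNT never reach
     * posix and __is_file_migratable later fails with
     * "Unable to get lock count for file". */
    inode = afr_selfheal_unlocked_lookup_on(frame, local->loc.parent,
                                            local->loc.name, local->replies,
                                            local->child_up, NULL);

    /* After (sketch): forward the original request's xattr_req so the fresh
     * replies carry the xdata the client asked for. */
    inode = afr_selfheal_unlocked_lookup_on(frame, local->loc.parent,
                                            local->loc.name, local->replies,
                                            local->child_up, local->xattr_req);
</code>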
REVIEW: https://review.gluster.org/23024 (afr/lookup: Pass xattr_req in while doing a selfheal in lookup) posted (#1) for review on master by mohammed rafi kc
REVIEW: https://review.gluster.org/23024 (afr/lookup: Pass xattr_req in while doing a selfheal in lookup) merged (#15) on master by Ravishankar N