Description of problem:
=======================
Remove-brick failed on a Distributed-Disperse volume while rm -rf was in progress.

Version-Release number of selected component (if applicable):
3.12.2-8.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a Distributed-Disperse volume and start it.
2) FUSE mount it on multiple clients.
3) From the mount point, create files and directories.
4) Start removing the data set using rm -rf * from multiple clients.
5) Now, initiate the remove-brick operation.

Out of 6 nodes, remove-brick failed on 5 nodes with different errors. If the root cause is different for these errors, let me know and I'll file new BZs to track them separately.

Node-1:
[2018-04-24 11:41:52.185492] W [MSGID: 109073] [dht-common.c:10519:dht_notify] 0-ec_new-dht: Received CHILD_DOWN. Exiting
The message "W [MSGID: 109073] [dht-common.c:10519:dht_notify] 0-ec_new-dht: Received CHILD_DOWN. Exiting" repeated 2 times between [2018-04-24 11:41:52.185492] and [2018-04-24 11:41:52.185703]
[2018-04-24 11:41:52.200419] E [MSGID: 101046] [dht-common.c:1501:dht_lookup_dir_cbk] 0-ec_new-dht: dict is null
[2018-04-24 11:41:52.290707] E [MSGID: 109027] [dht-rebalance.c:4422:gf_defrag_start_crawl] 0-ec_new-dht: Failed to start rebalance: look up on / failed [Transport endpoint is not connected]

Node-2:
[2018-04-24 11:43:47.921584] E [MSGID: 109016] [dht-rebalance.c:3840:gf_defrag_fix_layout] 0-ec_new-dht: Fix layout failed for /linux-4.9.27/Documentation/devicetree/bindings/power/supply

Node-3:
[2018-04-24 11:42:27.198197] W [dht-rebalance.c:3386:gf_defrag_process_dir] 0-ec_new-dht: Found error from gf_defrag_get_entry
[2018-04-24 11:42:27.199605] E [MSGID: 109111] [dht-rebalance.c:3903:gf_defrag_fix_layout] 0-ec_new-dht: gf_defrag_process_dir failed for directory: /linux-4.9.27/Documentation/devicetree/bindings/arm
[2018-04-24 11:42:27.210743] E [MSGID: 109016] [dht-rebalance.c:3840:gf_defrag_fix_layout] 0-ec_new-dht: Fix layout failed for /linux-4.9.27/Documentation/devicetree/bindings/arm

Node-4:
[2018-04-24 11:44:20.565793] E [MSGID: 109110] [dht-rebalance.c:3926:gf_defrag_fix_layout] 0-ec_new-dht: Settle hash failed for /linux-4.9.27/Documentation/devicetree/bindings/powerpc/nintendo
[2018-04-24 11:44:20.571375] E [MSGID: 109016] [dht-rebalance.c:3840:gf_defrag_fix_layout] 0-ec_new-dht: Fix layout failed for /linux-4.9.27/Documentation/devicetree/bindings/powerpc/nintendo

Node-5:
[2018-04-24 11:42:05.224439] W [MSGID: 122040] [ec-common.c:1144:ec_prepare_update_cbk] 0-ec_new-disperse-3: Failed to get size and version [Input/output error]
[2018-04-24 11:42:05.224712] E [MSGID: 109039] [dht-common.c:4078:dht_find_local_subvol_cbk] 0-ec_new-dht: getxattr err for dir [Input/output error]
[2018-04-24 11:42:05.323977] W [MSGID: 122040] [ec-common.c:1144:ec_prepare_update_cbk] 0-ec_new-disperse-0: Failed to get size and version [Input/output error]
[2018-04-24 11:42:05.324146] E [MSGID: 109039] [dht-common.c:4078:dht_find_local_subvol_cbk] 0-ec_new-dht: getxattr err for dir [Input/output error]
[2018-04-24 11:42:05.324991] W [MSGID: 122040] [ec-common.c:1144:ec_prepare_update_cbk] 0-ec_new-disperse-8: Failed to get size and version [Input/output error]
[2018-04-24 11:42:05.325125] E [MSGID: 109039] [dht-common.c:4078:dht_find_local_subvol_cbk] 0-ec_new-dht: getxattr err for dir [Input/output error]
[2018-04-24 11:42:05.333846] E [MSGID: 0] [dht-rebalance.c:4279:dht_get_local_subvols_and_nodeuuids] 0-ec_new-dht: local subvolume determination failed with error: 5 [Input/output error]

Node-6:
Remove-brick completed successfully.

Actual results:
==============
Remove-brick failed on 5 of 6 nodes while rm -rf was in progress.

Expected results:
=================
Remove-brick should complete without failure.
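
Since each node failed with a different set of messages, it may help to group the log lines by severity, MSGID, and reporting function before deciding whether separate BZs are needed. The following is only a rough triage sketch, not part of GlusterFS: the regular expression is inferred from the excerpts in this report, and the names `LOG_RE`, `summarize`, and `NODE1_SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the excerpts in this report:
#   [timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message
# The MSGID field is optional (the Node-3 warning has none).
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines per (level, msgid, function) key."""
    counts = Counter()
    for raw in lines:
        match = LOG_RE.match(raw.strip())
        if match is None:
            # Lines that do not fit the pattern, such as the
            # 'The message "..." repeated N times' summaries, are skipped.
            continue
        counts[(match["level"], match["msgid"], match["func"])] += 1
    return counts


# Node-1 excerpt copied from this report.
NODE1_SAMPLE = [
    '[2018-04-24 11:41:52.185492] W [MSGID: 109073] [dht-common.c:10519:dht_notify] 0-ec_new-dht: Received CHILD_DOWN. Exiting',
    'The message "W [MSGID: 109073] [dht-common.c:10519:dht_notify] 0-ec_new-dht: Received CHILD_DOWN. Exiting" repeated 2 times between [2018-04-24 11:41:52.185492] and [2018-04-24 11:41:52.185703]',
    '[2018-04-24 11:41:52.200419] E [MSGID: 101046] [dht-common.c:1501:dht_lookup_dir_cbk] 0-ec_new-dht: dict is null',
    '[2018-04-24 11:41:52.290707] E [MSGID: 109027] [dht-rebalance.c:4422:gf_defrag_start_crawl] 0-ec_new-dht: Failed to start rebalance: look up on / failed [Transport endpoint is not connected]',
]

for (level, msgid, func), n in sorted(summarize(NODE1_SAMPLE).items()):
    print(level, msgid, func, n)
```

Running the same function over each node's rebalance log would show which (MSGID, function) pairs are shared across nodes, for example 109016 in gf_defrag_fix_layout on Node-2, Node-3, and Node-4, and which appear on only one node.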
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
Is there any pending work needed before this bug can be closed?