Bug 1676400
Field | Value
---|---
Summary | rm -rf fails with "Directory not empty"
Product | [Community] GlusterFS
Component | distribute
Version | mainline
Status | CLOSED NEXTRELEASE
Severity | unspecified
Priority | unspecified
Hardware | Unspecified
OS | Unspecified
Reporter | Nithya Balachandran <nbalacha>
Assignee | Nithya Balachandran <nbalacha>
CC | bugs
Type | Bug
Last Closed | 2019-02-13 18:20:26 UTC
Clone Of |
 | 1677260 1695403
Bug Blocks | 1458215, 1661258, 1677260, 1686272, 1695403
Description
Nithya Balachandran
2019-02-12 08:06:21 UTC
RCA for the invisible directory left behind with concurrent rm -rf:
--------------------------------------------------------------------

dht_selfheal_dir_mkdir_lookup_cbk (...)
{
...
1381         this_call_cnt = dht_frame_return (frame);
1382
1383         LOCK (&frame->lock);
1384         {
1385                 if ((op_ret < 0) &&
1386                     (op_errno == ENOENT || op_errno == ESTALE)) {
1387                         local->selfheal.hole_cnt = !local->selfheal.hole_cnt ? 1
1388                                                 : local->selfheal.hole_cnt + 1;
1389                 }
1390
1391                 if (!op_ret) {
1392                         dht_iatt_merge (this, &local->stbuf, stbuf, prev);
1393                 }
1394                 check_mds = dht_dict_get_array (xattr, conf->mds_xattr_key,
1395                                                 mds_xattr_val, 1, &errst);
1396                 if (dict_get (xattr, conf->mds_xattr_key) && check_mds && !errst) {
1397                         dict_unref (local->xattr);
1398                         local->xattr = dict_ref (xattr);
1399                 }
1400
1401         }
1402         UNLOCK (&frame->lock);
1403
1404         if (is_last_call (this_call_cnt)) {
1405                 if (local->selfheal.hole_cnt == layout->cnt) {
1406                         gf_msg_debug (this->name, op_errno,
1407                                       "Lookup failed, an rmdir could have "
1408                                       "deleted this entry %s", loc->name);
1409                         local->op_errno = op_errno;
1410                         goto err;
1411                 } else {
1412                         for (i = 0; i < layout->cnt; i++) {
1413                                 if (layout->list[i].err == ENOENT ||
1414                                     layout->list[i].err == ESTALE ||
1415                                     local->selfheal.force_mkdir)
1416                                         missing_dirs++;
1417                         }

There are 2 problems here:

1. The layout is not updated with the new subvol status on error. In this case, the initial lookup found a directory on the hashed subvol, so only 2 entries in the layout indicate missing directories. However, by the time the selfheal code is executed, the racing rmdir has deleted the directory from all the subvols. At this point, the directory does not exist on any subvol and dht_selfheal_dir_mkdir_lookup_cbk gets an error from all 3 subvols, but this new status is not updated in the layout, which still has only 2 missing dirs marked.

2. this_call_cnt = dht_frame_return (frame); is called before processing the frame. So with a call count of 3, it is possible that the second response has reached line 1404 before the third one has started processing the return values. At this point, local->selfheal.hole_cnt != layout->cnt, so control goes to line 1412. At line 1412, since we are still using the old layout, only the directories on the non-hashed subvols are considered when incrementing missing_dirs and for the healing.

The combination of these two causes the selfheal to start healing the directories on the non-hashed subvols. It succeeds in creating the dirs on the non-hashed subvols. However, to set the layout, dht takes an inodelk on the hashed subvol, which fails because the directory does not exist there. We therefore end up with directories on the non-hashed subvols with no layouts set.

REVIEW: https://review.gluster.org/22195 (cluster/dht: Fix lookup selfheal and rmdir race) posted (#1) for review on master by N Balachandran

REVIEW: https://review.gluster.org/22195 (cluster/dht: Fix lookup selfheal and rmdir race) merged (#3) on master by Raghavendra G
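
The two problems in the RCA can be illustrated with a small standalone model. This is only a sketch and not the merged patch: `heal_state`, `record_response`, `dirs_to_heal` and `SUBVOL_CNT` are hypothetical names invented for this example, and the real DHT code would do the bookkeeping under `frame->lock`. The sketch assumes two things: each lookup response refreshes its own layout entry, and the outstanding-call counter is decremented only after the response has been fully processed.

```c
#include <assert.h>
#include <errno.h>

#define SUBVOL_CNT 3 /* hypothetical: one hashed + two non-hashed subvols */

/* Hypothetical, simplified stand-in for a DHT layout entry. */
struct layout_entry {
    int err; /* 0 = directory present, otherwise the errno seen */
};

/* Hypothetical, simplified stand-in for the selfheal state. */
struct heal_state {
    struct layout_entry list[SUBVOL_CNT];
    int call_cnt; /* lookup responses still outstanding */
    int hole_cnt; /* subvols that reported ENOENT or ESTALE */
};

/*
 * Record one lookup response. Returns 1 if this was the last response.
 * In multithreaded code the whole body would run under a lock.
 */
int record_response(struct heal_state *st, int idx, int op_ret, int op_errno)
{
    assert(idx >= 0 && idx < SUBVOL_CNT);

    if (op_ret < 0 && (op_errno == ENOENT || op_errno == ESTALE)) {
        st->hole_cnt++;
        /* Problem 1: refresh the stale layout entry with the new status. */
        st->list[idx].err = op_errno;
    } else if (op_ret == 0) {
        st->list[idx].err = 0;
    }

    /* Problem 2: decrement only after this response has been processed. */
    st->call_cnt--;
    return st->call_cnt == 0;
}

/*
 * Decide what to heal once the last response has arrived.
 * Returns -1 if the directory is gone everywhere (nothing to heal),
 * otherwise the number of subvols missing the directory.
 */
int dirs_to_heal(const struct heal_state *st)
{
    int missing = 0;

    if (st->hole_cnt == SUBVOL_CNT)
        return -1;

    for (int i = 0; i < SUBVOL_CNT; i++) {
        if (st->list[i].err == ENOENT || st->list[i].err == ESTALE)
            missing++;
    }
    return missing;
}
```

In this model, a state that starts with the hashed subvol marked present and then receives three failed lookups makes `dirs_to_heal` return -1, so no directories are recreated on the non-hashed subvols.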