Bug 1793390

Summary: Pre-validation failure does not provide any hints on the reason for the failure
Product: [Community] GlusterFS
Reporter: Yaniv Kaul <ykaul>
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED UPSTREAM
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: mainline
CC: bugs, moagrawa, pasik, srakonde
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-12 12:20:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yaniv Kaul 2020-01-21 08:52:35 UTC
Description of problem:

For example, stop prevalidation failed. All I'm seeing is:
[2020-01-21 07:37:52.657023] W [MSGID: 106121] [glusterd-mgmt.c:186:gd_mgmt_v3_pre_validate_fn] 0-management: Volume stop prevalidation failed.
[2020-01-21 07:37:52.657058] E [MSGID: 106121] [glusterd-mgmt.c:968:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Stop on local node 
[2020-01-21 07:37:52.657066] E [MSGID: 106121] [glusterd-mgmt.c:2328:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre Validation Failed 

I don't know why it failed: the return code is not printed, and nothing more specific
is logged that would help me find out what the issue is. Perhaps there is more information in debug mode?
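
To illustrate the kind of message that would help here, the following is a minimal sketch, not GlusterFS code. It assumes a hypothetical stage function (stage_stop_volume_stub) that reports a reason string, and a hypothetical wrapper (pre_validate_sketch) that puts both the return code and the reason into the failure line. The names, the message wording, and the reason text are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a per-operation stage function such as
 * glusterd_op_stage_stop_volume(). It always fails here so that the
 * error path can be demonstrated. */
static int
stage_stop_volume_stub(char *errbuf, size_t errlen)
{
    snprintf(errbuf, errlen, "volume name missing from request");
    return -1;
}

/* Hypothetical wrapper. It formats the failure line into 'out' instead of
 * a real log file so that the result is easy to inspect, and returns the
 * stage function's return code unchanged. */
static int
pre_validate_sketch(const char *op_name, char *out, size_t outlen)
{
    char errbuf[256] = "";
    int ret;

    assert(op_name != NULL && out != NULL);

    ret = stage_stop_volume_stub(errbuf, sizeof(errbuf));
    if (ret) {
        /* Carry both the return code and the reason, which the
         * messages quoted above do not include. */
        snprintf(out, outlen, "%s prevalidation failed (ret=%d): %s",
                 op_name, ret,
                 strlen(errbuf) > 0 ? errbuf : "no reason recorded");
    }
    return ret;
}
```

Calling pre_validate_sketch("Volume stop", line, sizeof(line)) fills line with a single message that names the operation, the return code, and the reason reported by the stub.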

Comment 1 Sanju 2020-01-28 09:35:03 UTC
In the case of volume stop, gd_mgmt_v3_pre_validate_fn() calls glusterd_op_stage_stop_volume(). glusterd_op_stage_stop_volume() in turn calls several functions, but not all of them log a message at every failure check.

For example,

glusterd_op_stage_stop_volume() calls glusterd_op_stop_volume_args_get() and logs nothing if that call fails.

    ret = glusterd_op_stop_volume_args_get(dict, &volname, &flags);
    if (ret)
        goto out;

Also, glusterd_op_stop_volume_args_get() has a path where it returns -1 but logs nothing.

int
glusterd_op_stop_volume_args_get(dict_t *dict, char **volname, int *flags)
{
    int ret = -1;
    xlator_t *this = NULL;

    this = THIS;
    GF_ASSERT(this);

    if (!dict || !volname || !flags)
        goto out; /* returns -1 but logs nothing */

    ret = dict_get_strn(dict, "volname", SLEN("volname"), volname);
    if (ret) {
        gf_msg(this->name, GF_LOG_ERROR, 0, GD_MSG_DICT_GET_FAILED,
               "Unable to get volume name");
        goto out;
    }

    ret = dict_get_int32n(dict, "flags", SLEN("flags"), flags);
    if (ret) {
        gf_msg(this->name, GF_LOG_ERROR, 0, GD_MSG_DICT_GET_FAILED,
               "Unable to get flags");
        goto out;
    }
out:
    return ret;
}

In such cases, we have no logs that say what went wrong. Although DEBUG logs help, I agree that there is scope for improvement.
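
As a rough illustration of that improvement, here is a minimal sketch of the same argument-fetching logic with a log line on every early exit, including the argument check that is currently silent. It is not the actual fix. So that it compiles on its own, it uses hypothetical stand-ins (struct dict_stub, log_error_stub, stop_volume_args_get_sketch) in place of dict_t, gf_msg() and the real function, and the wording of the new message is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for dict_t; the real code looks values up by key. */
struct dict_stub {
    const char *volname; /* NULL means the "volname" key is missing */
    int flags;
    int has_flags;       /* 0 means the "flags" key is missing */
};

static int log_count; /* number of error lines emitted, for inspection */

/* Hypothetical stand-in for gf_msg(); writes one error line to stderr. */
static void
log_error_stub(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    fputs("E ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);
    log_count++;
}

static int
stop_volume_args_get_sketch(struct dict_stub *dict, const char **volname,
                            int *flags)
{
    int ret = -1;

    if (!dict || !volname || !flags) {
        /* The early exit that is silent in the original function. */
        log_error_stub("Invalid argument: dict=%p volname=%p flags=%p",
                       (void *)dict, (void *)volname, (void *)flags);
        goto out;
    }

    if (!dict->volname) {
        log_error_stub("Unable to get volume name");
        goto out;
    }
    *volname = dict->volname;

    if (!dict->has_flags) {
        log_error_stub("Unable to get flags");
        goto out;
    }
    *flags = dict->flags;

    ret = 0;
out:
    return ret;
}
```

With this shape, every path that returns -1 leaves exactly one error line behind, so the caller's generic "prevalidation failed" message is always preceded by a specific cause.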

I'm inclined to assign this to a newcomer, as it is not complex and it also gives them an opportunity to read the code.

Thanks,
Sanju

Comment 2 Worker Ant 2020-03-12 12:20:22 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/874 and will be tracked there from now on. Visit the GitHub issue URL for further details.