Bug 1716812

Summary: Failed to create volume whose transport_type is "tcp,rdma"
Product: [Community] GlusterFS
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: high
Priority: unspecified
Reporter: guolei <guol-fnst>
Assignee: bugs <bugs>
CC: amukherj, bugs, guol-fnst, pgurusid, srakonde
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Doc Type: If docs needed, set a value
Cloned As: 1721105, 1721106, 1721109 (view as bug list)
Bug Blocks: 1721105, 1721106, 1721109
Last Closed: 2019-06-17 10:31:00 UTC
Type: Bug

Description guolei 2019-06-04 07:52:34 UTC
Description of problem:

gluster volume create 11 transport tcp,rdma 193.168.141.101:/tmp/11 193.168.141.101:/tmp/12 force
volume create: 11: failed: Failed to create volume files


Version-Release number of selected component (if applicable):

# gluster --version
glusterfs 4.1.8
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:9c:8b:a9 brd ff:ff:ff:ff:ff:ff
    inet 193.168.141.101/16 brd 193.168.255.255 scope global dynamic ens192
       valid_lft 2591093sec preferred_lft 2591093sec
    inet6 fe80::250:56ff:fe9c:8ba9/64 scope link
       valid_lft forever preferred_lft forever
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:9c:53:58 brd ff:ff:ff:ff:ff:ff


How reproducible:


Steps to Reproduce:
1. rxe_cfg start
2. rxe_cfg add ens192
3. gluster volume create 11 transport tcp,rdma 193.168.141.101:/tmp/11 193.168.141.101:/tmp/12 force


Actual results:
volume create: 11: failed: Failed to create volume files


Expected results:
Volume is created successfully.

Additional info:

[2019-06-04 07:36:45.966125] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-glusterd: Started running glusterd version 4.1.8 (args: glusterd --xlator-option *.upgrade=on -N)
[2019-06-04 07:36:45.970884] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2019-06-04 07:36:45.970900] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2019-06-04 07:36:45.970906] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2019-06-04 07:36:45.973455] E [rpc-transport.c:284:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/4.1.8/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2019-06-04 07:36:45.973468] W [rpc-transport.c:288:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2019-06-04 07:36:45.973473] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2019-06-04 07:36:45.973478] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2019-06-04 07:36:45.976348] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 31202
[2019-06-04 07:36:45.977372] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: 79e7e129-d041-48b6-b1d0-746c55d148fc
[2019-06-04 07:36:45.989706] I [MSGID: 106194] [glusterd-store.c:3850:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option upgrade on
  9:     option event-threads 1
 10:     option ping-timeout 0
 11:     option transport.socket.read-fail-log off
 12:     option transport.socket.keepalive-interval 2
 13:     option transport.socket.keepalive-time 10
 14:     option transport-type rdma
 15:     option working-directory /var/lib/glusterd
 16: end-volume
 17:
+------------------------------------------------------------------------------+
[2019-06-04 07:36:46.005401] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-06-04 07:36:46.006879] W [glusterfsd.c:1514:cleanup_and_exit] (-->/usr/lib64/libpthread.so.0(+0x7dd5) [0x7f55547bbdd5] -->glusterd(glusterfs_sigwaiter+0xe5) [0x55c659e7dd65] -->glusterd(cleanup_and_exit+0x6b) [0x55c659e7db8b] ) 0-: received signum (15), shutting down
[2019-06-04 07:36:46.006997] E [rpcsvc.c:1536:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
[2019-06-04 07:36:46.007004] E [rpcsvc.c:1662:rpcsvc_program_unregister] 0-rpc-service: portmap unregistration of program failed
[2019-06-04 07:36:46.007008] E [rpcsvc.c:1708:rpcsvc_program_unregister] 0-rpc-service: Program unregistration failed: GlusterD svc cli, Num: 1238463, Ver: 2, Port: 0
[2019-06-04 07:36:46.007061] E [rpcsvc.c:1536:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
[2019-06-04 07:36:46.007066] E [rpcsvc.c:1662:rpcsvc_program_unregister] 0-rpc-service: portmap unregistration of program failed
[2019-06-04 07:36:46.007070] E [rpcsvc.c:1708:rpcsvc_program_unregister] 0-rpc-service: Program unregistration failed: Gluster Handshake, Num: 14398633, Ver: 2, Port: 0
[2019-06-04 07:37:18.784525] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 4.1.8 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2019-06-04 07:37:18.787926] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2019-06-04 07:37:18.787944] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2019-06-04 07:37:18.787950] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2019-06-04 07:37:18.814752] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2019-06-04 07:37:18.814780] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
[2019-06-04 07:37:18.814786] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2019-06-04 07:37:18.814844] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2019-06-04 07:37:18.814852] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2019-06-04 07:37:19.617049] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 31202
[2019-06-04 07:37:19.617342] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: 79e7e129-d041-48b6-b1d0-746c55d148fc
[2019-06-04 07:37:19.626546] I [MSGID: 106194] [glusterd-store.c:3850:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2019-06-04 07:37:19.626791] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-06-04 07:37:20.874611] W [MSGID: 101095] [xlator.c:181:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/4.1.8/xlator/nfs/server.so: cannot open shared object file: No such file or directory
[2019-06-04 07:37:20.889571] E [MSGID: 106068] [glusterd-volgen.c:1034:volgen_write_volfile] 0-management: failed to create volfile
[2019-06-04 07:37:20.889588] E [glusterd-volgen.c:6727:glusterd_create_volfiles] 0-management: Could not generate gfproxy client volfiles
[2019-06-04 07:37:20.889601] E [MSGID: 106122] [glusterd-syncop.c:1482:gd_commit_op_phase] 0-management: Commit of operation 'Volume Create' failed on localhost : Failed to create volume files
[2019-06-04 07:38:49.194175] W [MSGID: 101095] [xlator.c:181:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/4.1.8/xlator/nfs/server.so: cannot open shared object file: No such file or directory
[2019-06-04 07:38:49.211380] E [MSGID: 106068] [glusterd-volgen.c:1034:volgen_write_volfile] 0-management: failed to create volfile
[2019-06-04 07:38:49.211407] E [glusterd-volgen.c:6727:glusterd_create_volfiles] 0-management: Could not generate gfproxy client volfiles
[2019-06-04 07:38:49.211433] E [MSGID: 106122] [glusterd-syncop.c:1482:gd_commit_op_phase] 0-management: Commit of operation 'Volume Create' failed on localhost : Failed to create volume files

Comment 1 guolei 2019-06-04 07:58:44 UTC
The test passes on glusterfs 3.12.9 and fails on glusterfs 3.13.2 and later versions.



static int
generate_client_volfiles (glusterd_volinfo_t *volinfo,
                          glusterd_client_type_t client_type)
{
        int                i                  = 0;
        int                ret                = -1;
        char               filepath[PATH_MAX] = {0,};
        char               *types[]           = {NULL, NULL, NULL};
        dict_t             *dict              = NULL;
        xlator_t           *this              = NULL;
        gf_transport_type  type               = GF_TRANSPORT_TCP;

        this = THIS;

        enumerate_transport_reqs (volinfo->transport_type, types);
        dict = dict_new ();
        if (!dict)
                goto out;
        for (i = 0; types[i]; i++) {
                memset (filepath, 0, sizeof (filepath));
                ret = dict_set_str (dict, "client-transport-type", types[i]);
                if (ret)
                        goto out;
                type = transport_str_to_type (types[i]);

                ret = dict_set_uint32 (dict, "trusted-client", client_type);
                if (ret)
                        goto out;

                if (client_type == GF_CLIENT_TRUSTED) {
                        ret = glusterd_get_trusted_client_filepath (filepath,
                                                                    volinfo,
                                                                    type);
                } else if (client_type == GF_CLIENT_TRUSTED_PROXY) {
                        glusterd_get_gfproxy_client_volfile (volinfo,
                                                             filepath,
                                                             PATH_MAX);
                        /* <-- Maybe this is the problem? The transport type
                           should be passed to glusterd_get_gfproxy_client_volfile,
                           otherwise filepath is left empty. */
                        ret = dict_set_str (dict, "gfproxy-client", "on");
                } else {
                        ret = glusterd_get_client_filepath (filepath,
                                                            volinfo,
                                                            type);
                }
                if (ret) {
                        gf_msg (this->name, GF_LOG_ERROR, EINVAL,
                                GD_MSG_INVALID_ENTRY,
                                "Received invalid transport-type");
                        goto out;
                }

                ret = generate_single_transport_client_volfile (volinfo,
                                                                filepath,
                                                                dict);
                if (ret)
                        goto out;
        }

        /* Generate volfile for rebalance process */
        glusterd_get_rebalance_volfile (volinfo, filepath, PATH_MAX);
        ret = build_rebalance_volfile (volinfo, filepath, dict);

        if (ret) {
                gf_msg (this->name, GF_LOG_ERROR, 0,
                        GD_MSG_VOLFILE_CREATE_FAIL,
                        "Failed to create rebalance volfile for %s",
                        volinfo->volname);
                goto out;
        }

out:
        if (dict)
                dict_unref (dict);

        gf_msg_trace ("glusterd", 0, "Returning %d", ret);
        return ret;
}

void
glusterd_get_gfproxy_client_volfile (glusterd_volinfo_t *volinfo,
                                        char *path, int path_len)
{
        char                    workdir[PATH_MAX]      = {0, };
        glusterd_conf_t        *priv                    = THIS->private;

        GLUSTERD_GET_VOLUME_DIR (workdir, volinfo, priv);

        switch (volinfo->transport_type) {
        case GF_TRANSPORT_TCP:
                snprintf (path, path_len,
                                "%s/trusted-%s.tcp-gfproxy-fuse.vol",
                                workdir, volinfo->volname);
                break;

        case GF_TRANSPORT_RDMA:
                snprintf (path, path_len,
                                "%s/trusted-%s.rdma-gfproxy-fuse.vol",
                                workdir, volinfo->volname);
                break;
        default:
                break;
        }
}

Comment 2 Atin Mukherjee 2019-06-10 12:18:36 UTC
This happens because type GF_TRANSPORT_BOTH_TCP_RDMA isn't handled in the function.

Poornima - Was this intentionally done or a bug? I feel it's the latter. Looking at glusterd_get_dummy_client_filepath () we just need to club GF_TRANSPORT_TCP & GF_TRANSPORT_BOTH_TCP_RDMA in the same place. Please confirm.

Comment 3 Sanju 2019-06-10 17:17:57 UTC
Looking at the code, I feel we missed handling GF_TRANSPORT_BOTH_TCP_RDMA. Since we provide the option to create a volume using tcp,rdma, we should handle GF_TRANSPORT_BOTH_TCP_RDMA in glusterd_get_gfproxy_client_volfile().

This issue exists in the latest master too.

Thanks,
Sanju

Comment 4 Worker Ant 2019-06-11 04:25:25 UTC
REVIEW: https://review.gluster.org/22851 (glusterd: add GF_TRANSPORT_BOTH_TCP_RDMA in glusterd_get_gfproxy_client_volfile) posted (#1) for review on master by Atin Mukherjee

Comment 5 Worker Ant 2019-06-17 10:31:00 UTC
REVIEW: https://review.gluster.org/22851 (glusterd: add GF_TRANSPORT_BOTH_TCP_RDMA in glusterd_get_gfproxy_client_volfile) merged (#5) on master by Amar Tumballi

Comment 6 Red Hat Bugzilla 2023-09-14 05:29:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days