Description of problem:
When enabling quota on a volume that has TLS enabled, the crawler is unable to mount and subsequently exits, leaving the stale mountpoint behind

Version-Release number of selected component (if applicable):
glusterfs-server-3.8.4-54.el7rhgs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a 1x3 volume w/ both management and data TLS enabled
   --> place tls keys (using common CA)
   --> touch secure_access
   --> start glusterd
   --> create volume
   --> set client.ssl and server.ssl
2. sudo gluster vol quota supervol01 enable
3. Note presence of stale mountpoint from quota

Actual results:
$ grep supervol01-brick /proc/mounts
localhost:client_per_brick/supervol01.client.node3.bricks-supervol01-brick.vol /run/gluster/tmp/mntxqfB0P fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072 0 0
$ ls /run/gluster/tmp/mntxqfB0P
ls: cannot access /run/gluster/tmp/mntxqfB0P: Transport endpoint is not connected

From glusterd.log:
[2018-03-16 20:40:07.615411] I [MSGID: 106567] [glusterd-svc-mgmt.c:197:glusterd_svc_start] 0-management: Starting quotad service
[2018-03-16 20:40:14.861880] W [MSGID: 106033] [glusterd-quota.c:331:_glusterd_quota_initiate_fs_crawl] 0-management: chdir /var/run/gluster/tmp/mntxqfB0P failed [Transport endpoint is not connected]
[2018-03-16 20:41:43.716241] E [socket.c:2631:socket_poller] 0-socket.management: socket_poller 127.0.0.1:1020 failed (Input/output error)

From brick log:
[2018-03-16 20:40:11.757728] I [glusterfsd-mgmt.c:54:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2018-03-16 20:40:11.761844] I [MSGID: 101173] [graph.c:269:gf_add_cmdline_options] 0-supervol01-posix: adding option 'glusterd-uuid' for volume 'supervol01-posix' with value 'fed60a35-4c4a-4a31-8ed4-c261561b1196'
[2018-03-16 20:40:11.761860] I [MSGID: 115034] [server.c:406:_check_for_auth_option] 0-supervol01-io-stats: skip format check for non-addr auth option auth.login./bricks/supervol01/brick.allow
[2018-03-16 20:40:11.761876] I [MSGID: 115034] [server.c:406:_check_for_auth_option] 0-supervol01-io-stats: skip format check for non-addr auth option auth.login.e8f666df-0e75-458f-96ed-098c418ea43f.password
[2018-03-16 20:40:11.761930] I [addr.c:55:compare_addr_and_update] 0-/bricks/supervol01/brick: allowed = "*", received addr = "127.0.0.1"
[2018-03-16 20:40:11.761936] I [login.c:34:gf_auth] 0-auth/login: connecting user name: node3
[2018-03-16 20:40:11.761943] I [addr.c:55:compare_addr_and_update] 0-/bricks/supervol01/brick: allowed = "*", received addr = "192.168.121.199"
[2018-03-16 20:40:11.761947] I [login.c:34:gf_auth] 0-auth/login: connecting user name: node4
[2018-03-16 20:40:11.761954] I [addr.c:55:compare_addr_and_update] 0-/bricks/supervol01/brick: allowed = "*", received addr = "192.168.121.70"
[2018-03-16 20:40:11.761957] I [login.c:34:gf_auth] 0-auth/login: connecting user name: node5
[2018-03-16 20:40:11.762049] I [MSGID: 121037] [changetimerecorder.c:1978:reconfigure] 0-supervol01-changetimerecorder: set
[2018-03-16 20:40:11.762124] I [MSGID: 0] [gfdb_sqlite3.c:1398:gf_sqlite3_set_pragma] 0-sqlite3: Value set on DB wal_autocheckpoint : 25000
[2018-03-16 20:40:11.762581] I [MSGID: 0] [gfdb_sqlite3.c:1398:gf_sqlite3_set_pragma] 0-sqlite3: Value set on DB cache_size : 12500
[2018-03-16 20:40:11.762721] I [socket.c:4242:socket_init] 0-supervol01-quota: SSL support for glusterd is ENABLED
[2018-03-16 20:40:11.762782] E [socket.c:4320:socket_init] 0-supervol01-quota: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2018-03-16 20:40:11.763036] W [socket.c:3911:reconfigure] 0-supervol01-quota: disabling non-blocking IO
[2018-03-16 20:40:11.763416] I [MSGID: 101190] [event-epoll.c:602:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2018-03-16 20:40:11.777503] I [addr.c:55:compare_addr_and_update] 0-/bricks/supervol01/brick: allowed = "*", received addr = "127.0.0.1"
[2018-03-16 20:40:11.777516] I [login.c:34:gf_auth] 0-auth/login: connecting user name: node3
[2018-03-16 20:40:11.777522] I [MSGID: 115029] [server-handshake.c:778:server_setvolume] 0-supervol01-server: accepted client from node3-2802-2018/03/16-20:40:07:620796-supervol01-client-0-0-0 (version: 3.8.4)
[2018-03-16 20:40:14.860687] E [socket.c:358:ssl_setup_connection] 0-tcp.supervol01-server: SSL connect error (client: 127.0.0.1:1015) (server: 127.0.0.1:49154)
[2018-03-16 20:40:14.860715] E [socket.c:202:ssl_dump_error_stack] 0-tcp.supervol01-server: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number
[2018-03-16 20:40:14.860727] E [socket.c:2510:socket_poller] 0-tcp.supervol01-server: server setup failed
... (and every 4 seconds in the log until the mount is removed manually) ...
[2018-03-16 20:50:08.266288] E [socket.c:358:ssl_setup_connection] 0-tcp.supervol01-server: SSL connect error (client: 127.0.0.1:1018) (server: 127.0.0.1:49154)
[2018-03-16 20:50:08.266337] E [socket.c:202:ssl_dump_error_stack] 0-tcp.supervol01-server: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number
[2018-03-16 20:50:08.266351] E [socket.c:2510:socket_poller] 0-tcp.supervol01-server: server setup failed

Expected results:
Quota scan should be able to mount and unmount on volumes using TLS

Additional info:
$ sudo gluster vol info supervol01

Volume Name: supervol01
Type: Replicate
Volume ID: 25e8bbd7-1634-4030-91d7-5a2d922d680b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node3:/bricks/supervol01/brick
Brick2: node4:/bricks/supervol01/brick
Brick3: node5:/bricks/supervol01/brick
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
client.ssl: on
server.ssl: on
performance.cache-refresh-timeout: 60
performance.cache-size: 134217728
performance.nl-cache: on
performance.md-cache-timeout: 300
transport.address-family: inet
nfs.disable: on
auto-delete: enable
cluster.enable-shared-storage: enable
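
For anyone checking other nodes for the same leftover, the manual grep/ls check from the actual results could be scripted roughly as follows. This is only a sketch: the helper names gluster_tmp_mounts and is_stale are made up for illustration, and the /run/gluster/tmp/ prefix is assumed from the single mountpoint shown in this report.

```python
import errno
import os


def gluster_tmp_mounts(mounts_text, prefix="/run/gluster/tmp/"):
    """Return fuse.glusterfs mountpoints under `prefix` from /proc/mounts text.

    The prefix is an assumption taken from the one mountpoint in this report.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] == "fuse.glusterfs" \
                and fields[1].startswith(prefix):
            found.append(fields[1])
    return found


def is_stale(path):
    """True if stat() fails with ENOTCONN (Transport endpoint is not connected)."""
    try:
        os.stat(path)
    except OSError as exc:
        return exc.errno == errno.ENOTCONN
    return False


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mountpoint in gluster_tmp_mounts(fh.read()):
            state = "STALE" if is_stale(mountpoint) else "ok"
            print(mountpoint, state)
```

This only reports the stale mountpoints; removing them is still left to the admin.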
Quota uses a per-brick volfile to do the per-brick crawl (earlier this used a volume-level volfile). The function that generates it is glusterd_generate_client_per_brick_volfile. The volfile this function generates does not contain "option transport.socket.ssl-enabled on", so the crawler's client connects without TLS, which is consistent with the SSL connect errors in the brick log. The fix is to correct the volfile generation.
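
To confirm whether a generated per-brick volfile is affected, something like the sketch below could be used. It scans volfile text for protocol/client blocks that lack the option quoted above. The function name client_blocks_missing_ssl and the sample volfile contents are illustrative assumptions, not copied from an actual generated file.

```python
def client_blocks_missing_ssl(volfile_text):
    """Return names of protocol/client blocks without ssl-enabled set to on."""
    missing = []
    name, is_client, has_ssl = None, False, False
    for raw in volfile_text.splitlines():
        words = raw.split()
        if len(words) >= 2 and words[0] == "volume":
            # Start of a new translator block; reset the per-block state.
            name, is_client, has_ssl = words[1], False, False
        elif words == ["type", "protocol/client"]:
            is_client = True
        elif words == ["option", "transport.socket.ssl-enabled", "on"]:
            has_ssl = True
        elif words == ["end-volume"]:
            if name is not None and is_client and not has_ssl:
                missing.append(name)
            name = None
    return missing


# Illustrative volfile fragment (assumed contents, modelled on this volume).
SAMPLE = """\
volume supervol01-client-0
    type protocol/client
    option remote-host node3
    option remote-subvolume /bricks/supervol01/brick
    option transport-type tcp
end-volume
"""

if __name__ == "__main__":
    print(client_blocks_missing_ssl(SAMPLE))
```

On a fixed build, the same check against the per-brick volfile should return an empty list.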
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days