Description of problem: After upgrade from 5.3 to 6.1, gluster refuses to start bricks that apparently have 'crypt' and 'bd' xlators. None of these have been provided at creation and according to 'gluster get VOLUME all' they are not used. Version-Release number of selected component (if applicable): 6.1 [2019-04-23 10:36:44.325141] I [MSGID: 100030] [glusterfsd.c:2849:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.1 (args: /usr/sbin/glusterd --pid-file=/run/glusterd.pid) [2019-04-23 10:36:44.325505] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 31705 [2019-04-23 10:36:44.327314] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536 [2019-04-23 10:36:44.327354] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd as working directory [2019-04-23 10:36:44.327363] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory [2019-04-23 10:36:44.330126] I [socket.c:931:__socket_server_bind] 0-socket.management: process started listening on port (36203) [2019-04-23 10:36:44.330258] E [rpc-transport.c:297:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/6.1/rpc-transport/rdma.so: cannot open shared object file: No such file or directory [2019-04-23 10:36:44.330267] W [rpc-transport.c:301:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine [2019-04-23 10:36:44.330274] W [rpcsvc.c:1985:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed [2019-04-23 10:36:44.330281] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2019-04-23 10:36:44.331976] I [socket.c:902:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 13 [2019-04-23 10:36:46.805843] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 50000 [2019-04-23 10:36:46.878878] I [MSGID: 106544] [glusterd.c:152:glusterd_uuid_init] 0-management: retrieved UUID: 5104ed01-f959-4a82-bbd6-17d4dd177ec2 [2019-04-23 10:36:46.881463] E [mem-pool.c:351:__gf_free] (-->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x49190) [0x7fb0ecb64190] -->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x48f72) [0x7fb0ecb63f 72] -->/usr/lib64/libglusterfs.so.0(__gf_free+0x21d) [0x7fb0f25091dd] ) 0-: Assertion failed: mem_acct->rec[header->type].size >= header->size [2019-04-23 10:36:46.908134] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2019-04-23 10:36:46.910052] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2019-04-23 10:36:46.910135] W [MSGID: 106061] [glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout [2019-04-23 10:36:46.910167] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2019-04-23 10:36:46.911425] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 Final graph: +------------------------------------------------------------------------------+ 1: volume management 2: type mgmt/glusterd 3: option rpc-auth.auth-glusterfs on 4: option rpc-auth.auth-unix on 5: option rpc-auth.auth-null on 6: option rpc-auth-allow-insecure on 7: option 
transport.listen-backlog 1024 8: option event-threads 1 9: option ping-timeout 0 10: option transport.socket.read-fail-log off 11: option transport.socket.keepalive-interval 2 12: option transport.socket.keepalive-time 10 13: option transport-type rdma 14: option working-directory /var/lib/glusterd 15: end-volume 16: +------------------------------------------------------------------------------+ [2019-04-23 10:36:46.911405] W [MSGID: 106061] [glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout [2019-04-23 10:36:46.914845] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2019-04-23 10:36:47.265981] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3, host: 10.10.0.25, port: 0 [2019-04-23 10:36:47.271481] I [glusterd-utils.c:6312:glusterd_brick_start] 0-management: starting a fresh brick process for brick /local.mnt/glfs/brick [2019-04-23 10:36:47.273759] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2019-04-23 10:36:47.336220] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600 [2019-04-23 10:36:47.336328] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped [2019-04-23 10:36:47.336383] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is stopped [2019-04-23 10:36:47.336735] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600 [2019-04-23 10:36:47.337733] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: glustershd already stopped [2019-04-23 10:36:47.337755] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-04-23 10:36:47.337804] I [MSGID: 106567] [glusterd-svc-mgmt.c:220:glusterd_svc_start] 0-management: Starting glustershd service [2019-04-23 10:36:48.340193] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600 [2019-04-23 10:36:48.340446] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: quotad already stopped [2019-04-23 10:36:48.340482] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: quotad service is stopped [2019-04-23 10:36:48.340525] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600 [2019-04-23 10:36:48.340662] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: bitd already stopped [2019-04-23 10:36:48.340686] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: bitd service is stopped [2019-04-23 10:36:48.340721] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600 [2019-04-23 10:36:48.340851] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: scrub already stopped [2019-04-23 10:36:48.340865] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: scrub service is stopped [2019-04-23 10:36:48.340913] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600 [2019-04-23 10:36:48.341005] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600 [2019-04-23 10:36:48.342056] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3 
[2019-04-23 10:36:48.342125] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4, host: 10.10.0.208, port: 0 [2019-04-23 10:36:48.378690] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4 [2019-04-23 10:37:15.410095] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-04-23 10:37:15.410095] and [2019-04-23 10:37:15.410162] [2019-04-23 10:37:15.417228] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-04-23 10:37:15.417228] and [2019-04-23 10:37:15.417319] [2019-04-23 10:37:15.449809] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/storage/bd.so: cannot open shared object file: No such file or directory [2019-04-23 12:23:14.757482] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory [2019-04-23 12:23:14.765810] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api [2019-04-23 12:23:14.801394] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/storage/bd.so: cannot open shared object file: No such file or directory The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-04-23 12:23:14.757482] and [2019-04-23 12:23:14.757578] The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-04-23 12:23:14.765810] and [2019-04-23 12:23:14.765864] [2019-04-23 12:29:45.957524] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2019-04-23 12:30:06.917403] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2019-04-23 12:38:25.514866] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory [2019-04-23 12:38:25.522473] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api [2019-04-23 12:38:25.555952] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/storage/bd.so: cannot open shared object file: No such file 
or directory The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-04-23 12:38:25.514866] and [2019-04-23 12:38:25.514931] The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-04-23 12:38:25.522473] and [2019-04-23 12:38:25.522545] [2019-04-23 12:52:00.569988] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7504) [0x7fb0f1310504] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xd5) [0x409f45] -->/usr/sbin/glusterd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum (15), shutting down Option Value ------ ----- cluster.lookup-unhashed on cluster.lookup-optimize on cluster.min-free-disk 10% cluster.min-free-inodes 5% cluster.rebalance-stats off cluster.subvols-per-directory (null) cluster.readdir-optimize on cluster.rsync-hash-regex (null) cluster.extra-hash-regex (null) cluster.dht-xattr-name trusted.glusterfs.dht cluster.randomize-hash-range-by-gfid off cluster.rebal-throttle normal cluster.lock-migration off cluster.force-migration off cluster.local-volume-name (null) cluster.weighted-rebalance on cluster.switch-pattern (null) cluster.entry-change-log on cluster.read-subvolume (null) cluster.read-subvolume-index -1 cluster.read-hash-mode 1 cluster.background-self-heal-count 8 cluster.metadata-self-heal on cluster.data-self-heal on cluster.entry-self-heal on cluster.self-heal-daemon enable cluster.heal-timeout 600 cluster.self-heal-window-size 1 cluster.data-change-log on cluster.metadata-change-log on cluster.data-self-heal-algorithm (null) cluster.eager-lock on disperse.eager-lock on disperse.other-eager-lock on disperse.eager-lock-timeout 1 disperse.other-eager-lock-timeout 1 cluster.quorum-type auto cluster.quorum-count (null) cluster.choose-local true cluster.self-heal-readdir-size 1KB cluster.post-op-delay-secs 1 cluster.ensure-durability on cluster.consistent-metadata no cluster.heal-wait-queue-length 128 cluster.favorite-child-policy none cluster.full-lock yes cluster.stripe-block-size 128KB cluster.stripe-coalesce true diagnostics.latency-measurement off diagnostics.dump-fd-stats off diagnostics.count-fop-hits off diagnostics.brick-log-level CRITICAL diagnostics.client-log-level CRITICAL diagnostics.brick-sys-log-level CRITICAL diagnostics.client-sys-log-level CRITICAL diagnostics.brick-logger (null) diagnostics.client-logger (null) diagnostics.brick-log-format (null) diagnostics.client-log-format (null) diagnostics.brick-log-buf-size 5 diagnostics.client-log-buf-size 5 diagnostics.brick-log-flush-timeout 120 diagnostics.client-log-flush-timeout 120 diagnostics.stats-dump-interval 0 diagnostics.fop-sample-interval 0 diagnostics.stats-dump-format json diagnostics.fop-sample-buf-size 65535 diagnostics.stats-dnscache-ttl-sec 86400 performance.cache-max-file-size 0 performance.cache-min-file-size 0 performance.cache-refresh-timeout 1 performance.cache-priority performance.cache-size 32MB performance.io-thread-count 16 performance.high-prio-threads 16 performance.normal-prio-threads 16 performance.low-prio-threads 16 performance.least-prio-threads 1 performance.enable-least-priority on performance.iot-watchdog-secs (null) performance.iot-cleanup-disconnected-reqsoff performance.iot-pass-through false performance.io-cache-pass-through false 
performance.cache-size 128MB performance.qr-cache-timeout 1 performance.cache-invalidation on performance.ctime-invalidation false performance.flush-behind on performance.nfs.flush-behind on performance.write-behind-window-size 1MB performance.resync-failed-syncs-after-fsyncoff performance.nfs.write-behind-window-size1MB performance.strict-o-direct off performance.nfs.strict-o-direct off performance.strict-write-ordering off performance.nfs.strict-write-ordering off performance.write-behind-trickling-writeson performance.aggregate-size 128KB performance.nfs.write-behind-trickling-writeson performance.lazy-open yes performance.read-after-open yes performance.open-behind-pass-through false performance.read-ahead-page-count 4 performance.read-ahead-pass-through false performance.readdir-ahead-pass-through false performance.md-cache-pass-through false performance.md-cache-timeout 600 performance.cache-swift-metadata true performance.cache-samba-metadata false performance.cache-capability-xattrs true performance.cache-ima-xattrs true performance.md-cache-statfs off performance.xattr-cache-list performance.nl-cache-pass-through false features.encryption off encryption.master-key (null) encryption.data-key-size 256 encryption.block-size 4096 network.frame-timeout 1800 network.ping-timeout 42 network.tcp-window-size (null) network.remote-dio disable client.event-threads 2 client.tcp-user-timeout 0 client.keepalive-time 20 client.keepalive-interval 2 client.keepalive-count 9 network.tcp-window-size (null) network.inode-lru-limit 200000 auth.allow * auth.reject (null) transport.keepalive 1 server.allow-insecure on server.root-squash off server.anonuid 65534 server.anongid 65534 server.statedump-path /var/run/gluster server.outstanding-rpc-limit 64 server.ssl (null) auth.ssl-allow * server.manage-gids off server.dynamic-auth on client.send-gids on server.gid-timeout 300 server.own-thread (null) server.event-threads 1 server.tcp-user-timeout 0 server.keepalive-time 20 server.keepalive-interval 2 server.keepalive-count 9 transport.listen-backlog 1024 ssl.own-cert (null) ssl.private-key (null) ssl.ca-list (null) ssl.crl-path (null) ssl.certificate-depth (null) ssl.cipher-list (null) ssl.dh-param (null) ssl.ec-curve (null) transport.address-family inet performance.write-behind on performance.read-ahead on performance.readdir-ahead on performance.io-cache on performance.quick-read on performance.open-behind on performance.nl-cache off performance.stat-prefetch on performance.client-io-threads off performance.nfs.write-behind on performance.nfs.read-ahead off performance.nfs.io-cache off performance.nfs.quick-read off performance.nfs.stat-prefetch off performance.nfs.io-threads off performance.force-readdirp true performance.cache-invalidation on features.uss off features.snapshot-directory .snaps features.show-snapshot-directory off features.tag-namespaces off network.compression off network.compression.window-size -15 network.compression.mem-level 8 network.compression.min-size 0 network.compression.compression-level -1 network.compression.debug false features.default-soft-limit 80% features.soft-timeout 60 features.hard-timeout 5 features.alert-time 86400 features.quota-deem-statfs off geo-replication.indexing off geo-replication.indexing off geo-replication.ignore-pid-check off geo-replication.ignore-pid-check off features.quota off features.inode-quota off features.bitrot disable debug.trace off debug.log-history no debug.log-file no debug.exclude-ops (null) debug.include-ops (null) debug.error-gen 
off debug.error-failure (null) debug.error-number (null) debug.random-failure off debug.error-fops (null) nfs.enable-ino32 no nfs.mem-factor 15 nfs.export-dirs on nfs.export-volumes on nfs.addr-namelookup off nfs.dynamic-volumes off nfs.register-with-portmap on nfs.outstanding-rpc-limit 16 nfs.port 2049 nfs.rpc-auth-unix on nfs.rpc-auth-null on nfs.rpc-auth-allow all nfs.rpc-auth-reject none nfs.ports-insecure off nfs.trusted-sync off nfs.trusted-write off nfs.volume-access read-write nfs.export-dir nfs.disable on nfs.nlm on nfs.acl on nfs.mount-udp off nfs.mount-rmtab /var/lib/glusterd/nfs/rmtab nfs.rpc-statd /sbin/rpc.statd nfs.server-aux-gids off nfs.drc off nfs.drc-size 0x20000 nfs.read-size (1 * 1048576ULL) nfs.write-size (1 * 1048576ULL) nfs.readdir-size (1 * 1048576ULL) nfs.rdirplus on nfs.event-threads 1 nfs.exports-auth-enable (null) nfs.auth-refresh-interval-sec (null) nfs.auth-cache-ttl-sec (null) features.read-only off features.worm off features.worm-file-level off features.worm-files-deletable on features.default-retention-period 120 features.retention-mode relax features.auto-commit-period 180 storage.linux-aio off storage.batch-fsync-mode reverse-fsync storage.batch-fsync-delay-usec 0 storage.owner-uid -1 storage.owner-gid -1 storage.node-uuid-pathinfo off storage.health-check-interval 30 storage.build-pgfid off storage.gfid2path on storage.gfid2path-separator : storage.reserve 1 storage.health-check-timeout 10 storage.fips-mode-rchecksum off storage.force-create-mode 0000 storage.force-directory-mode 0000 storage.create-mask 0777 storage.create-directory-mask 0777 storage.max-hardlinks 100 storage.ctime off storage.bd-aio off config.gfproxyd off cluster.server-quorum-type off cluster.server-quorum-ratio 0 changelog.changelog off changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs changelog.encoding ascii changelog.rollover-time 15 changelog.fsync-interval 5 changelog.changelog-barrier-timeout 120 changelog.capture-del-path off features.barrier disable features.barrier-timeout 120 features.trash off features.trash-dir .trashcan features.trash-eliminate-path (null) features.trash-max-filesize 5MB features.trash-internal-op off cluster.enable-shared-storage disable locks.trace off locks.mandatory-locking off cluster.disperse-self-heal-daemon enable cluster.quorum-reads no client.bind-insecure (null) features.timeout 45 features.failover-hosts (null) features.shard off features.shard-block-size 64MB features.shard-lru-limit 16384 features.shard-deletion-rate 100 features.scrub-throttle lazy features.scrub-freq biweekly features.scrub false features.expiry-time 120 features.cache-invalidation on features.cache-invalidation-timeout 600 features.leases off features.lease-lock-recall-timeout 60 disperse.background-heals 8 disperse.heal-wait-qlength 128 cluster.heal-timeout 600 dht.force-readdirp on disperse.read-policy gfid-hash cluster.shd-max-threads 1 cluster.shd-wait-qlength 1024 cluster.locking-scheme full cluster.granular-entry-heal no features.locks-revocation-secs 0 features.locks-revocation-clear-all false features.locks-revocation-max-blocked 0 features.locks-monkey-unlocking false features.locks-notify-contention no features.locks-notify-contention-delay 5 disperse.shd-max-threads 1 disperse.shd-wait-qlength 1024 disperse.cpu-extensions auto disperse.self-heal-window-size 1 cluster.use-compound-fops off performance.parallel-readdir off performance.rda-request-size 131072 performance.rda-low-wmark 4096 performance.rda-high-wmark 128KB 
performance.rda-cache-limit 10MB performance.nl-cache-positive-entry false performance.nl-cache-limit 10MB performance.nl-cache-timeout 60 cluster.brick-multiplex off cluster.max-bricks-per-process 0 disperse.optimistic-change-log on disperse.stripe-cache 4 cluster.halo-enabled False cluster.halo-shd-max-latency 99999 cluster.halo-nfsd-max-latency 5 cluster.halo-max-latency 5 cluster.halo-max-replicas 99999 cluster.halo-min-replicas 2 cluster.daemon-log-level INFO debug.delay-gen off delay-gen.delay-percentage 10% delay-gen.delay-duration 100000 delay-gen.enable disperse.parallel-writes on features.sdfs on features.cloudsync off features.utime off ctime.noatime on feature.cloudsync-storetype (null)
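
The 'gluster volume get' output above already shows features.encryption off and storage.bd-aio off; whether the crypt or bd xlators are actually part of any generated graph can also be checked directly in the volfiles glusterd keeps on disk. A hedged sketch (the volume name jf-vol0 is taken from later in this report; adjust to your volume):

# Should print nothing if no brick/client graph references the crypt or bd xlators
grep -E 'encryption/crypt|storage/bd' /var/lib/glusterd/vols/jf-vol0/*.vol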
Geo-replication is failing as well due to this:

==> cli.log <==
[2019-04-23 19:37:29.048169] I [cli.c:845:main] 0-cli: Started running gluster with version 6.1
[2019-04-23 19:37:29.108778] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-04-23 19:37:29.109073] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1

==> cmd_history.log <==
[2019-04-23 19:37:30.341565] : volume geo-replication mariadb 11.22.33.44::mariadb create push-pem : FAILED : Passwordless ssh login has not been setup with 11.22.33.44 for user root.

==> cli.log <==
[2019-04-23 19:37:30.341932] I [input.c:31:cli_batch] 0-: Exiting with: -1

==> glusterd.log <==
The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-04-23 19:36:27.419582] and [2019-04-23 19:36:27.419641]
The message "E [MSGID: 106316] [glusterd-geo-rep.c:2890:glusterd_verify_slave] 0-management: Not a valid slave" repeated 2 times between [2019-04-23 19:35:42.340661] and [2019-04-23 19:37:30.340518]
The message "E [MSGID: 106316] [glusterd-geo-rep.c:3282:glusterd_op_stage_gsync_create] 0-management: 11.22.33.44::mariadb is not a valid slave volume. Error: Passwordless ssh login has not been setup with 11.22.33.44 for user root." repeated 2 times between [2019-04-23 19:35:42.340803] and [2019-04-23 19:37:30.340611]
The message "E [MSGID: 106301] [glusterd-syncop.c:1317:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication Create' failed on localhost : Passwordless ssh login has not been setup with 11.22.33.44 for user root." repeated 2 times between [2019-04-23 19:35:42.340842] and [2019-04-23 19:37:30.340618]
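
Note that the actual geo-replication failure recorded in cmd_history.log is a prerequisite check: root on this node cannot ssh to 11.22.33.44 without a password. A minimal sketch of the usual prerequisite setup before retrying the create, assuming root-to-root geo-replication (host and volume names are taken from the log above; nothing here is verified against this particular cluster):

# On the master node: create an SSH key if needed and push it to the slave host
ssh-keygen -t rsa                 # accept defaults
ssh-copy-id root@11.22.33.44      # 'ssh root@11.22.33.44' must then work without a password

# Distribute the geo-rep pem keys across the trusted pool, then retry the create
gluster system:: execute gsec_create
gluster volume geo-replication mariadb 11.22.33.44::mariadb create push-pem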
I don't think the upgrade failure or the geo-replication session issue is caused by the missing xlators you highlighted in the report. In the following log snippet, the cleanup_and_exit (which is a shutdown trigger of glusterd) happens much later than the messages complaining about the missing xlators, and I can confirm that those messages are benign.

[2019-04-23 12:38:25.514866] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory
[2019-04-23 12:38:25.522473] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api
[2019-04-23 12:38:25.555952] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/storage/bd.so: cannot open shared object file: No such file or directory
The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.1/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-04-23 12:38:25.514866] and [2019-04-23 12:38:25.514931]
The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.1/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-04-23 12:38:25.522473] and [2019-04-23 12:38:25.522545]

################################# There's a gap of ~14 minutes here ###################################################

[2019-04-23 12:52:00.569988] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7504) [0x7fb0f1310504] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xd5) [0x409f45] -->/usr/sbin/glusterd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum (15), shutting down

You'd need to provide us the brick logs along with the glusterd logs, plus 'gluster volume status' and 'gluster get-state' output from the node where you see this happening. For the geo-rep failures, I'd suggest you file a separate bug once this stabilises.
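
For reference, a sketch of how the requested information is usually collected (default log and output locations assumed; the get-state path matches what glusterd reports later in this thread):

gluster volume status
gluster get-state              # writes /var/run/gluster/glusterd_state_<timestamp> by default
# glusterd and brick logs, default locations:
less /var/log/glusterfs/glusterd.log
less /var/log/glusterfs/bricks/local.mnt-glfs-brick.log   # brick log name derives from the brick path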
Hi, I tried upgrading one of the nodes again: 1) shutdown glusterd 5.6 2) install 6.1 3) start glusterd 6.1 4) no working brick 5) shutdown glusterd 6.1 6) downgrade to 5.6 7) start glusterd 5.6 8) brick is working fine again The volume status is showing only the other nodes as the node running 6.1 is failing the brick process: === START volume status === Status of volume: jf-vol0 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 10.10.0.25:/local.mnt/glfs/brick 49153 0 Y 20952 Brick 10.10.0.208:/local.mnt/glfs/brick 49153 0 Y 29631 Self-heal Daemon on localhost N/A N/A Y 3487 Self-heal Daemon on 10.10.0.208 N/A N/A Y 27031 Task Status of Volume jf-vol0 ------------------------------------------------------------------------------ There are no active volume tasks === END volume status === === START glusterd.log === [2019-05-08 07:23:26.043605] I [MSGID: 100030] [glusterfsd.c:2849:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.1 (args: /usr/sbin/glusterd --pid-file=/run/glusterd.pid) [2019-05-08 07:23:26.044499] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 21399 [2019-05-08 07:23:26.047235] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536 [2019-05-08 07:23:26.047270] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd as working directory [2019-05-08 07:23:26.047284] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory [2019-05-08 07:23:26.051068] I [socket.c:931:__socket_server_bind] 0-socket.management: process started listening on port (44950) [2019-05-08 07:23:26.051268] E [rpc-transport.c:297:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/6.1/rpc-transport/rdma.so: cannot open shared object file: No such file or directory [2019-05-08 07:23:26.051282] W [rpc-transport.c:301:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine [2019-05-08 07:23:26.051292] W [rpcsvc.c:1985:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed [2019-05-08 07:23:26.051302] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2019-05-08 07:23:26.053127] I [socket.c:902:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 13 [2019-05-08 07:23:28.584285] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 50000 [2019-05-08 07:23:28.650177] I [MSGID: 106544] [glusterd.c:152:glusterd_uuid_init] 0-management: retrieved UUID: 5104ed01-f959-4a82-bbd6-17d4dd177ec2 [2019-05-08 07:23:28.656448] E [mem-pool.c:351:__gf_free] (-->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x49190) [0x7fa26784e190] -->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x48f72) [0x7fa26784df72] -->/usr/lib64/libglusterfs.so.0(__gf_free+0x21d) [0x7fa26d1f31dd] ) 0-: Assertion failed: mem_acct->rec[header->type].size >= header->size [2019-05-08 07:23:28.683589] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2019-05-08 07:23:28.686748] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2019-05-08 07:23:28.686787] W [MSGID: 106061] 
[glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout [2019-05-08 07:23:28.686819] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2019-05-08 07:23:28.687629] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 Final graph: +------------------------------------------------------------------------------+ 1: volume management 2: type mgmt/glusterd 3: option rpc-auth.auth-glusterfs on 4: option rpc-auth.auth-unix on 5: option rpc-auth.auth-null on 6: option rpc-auth-allow-insecure on 7: option transport.listen-backlog 1024 8: option event-threads 1 9: option ping-timeout 0 10: option transport.socket.read-fail-log off 11: option transport.socket.keepalive-interval 2 12: option transport.socket.keepalive-time 10 13: option transport-type rdma 14: option working-directory /var/lib/glusterd 15: end-volume 16: +------------------------------------------------------------------------------+ [2019-05-08 07:23:28.687625] W [MSGID: 106061] [glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout [2019-05-08 07:23:28.689771] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2019-05-08 07:23:29.388437] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4, host: 10.10.0.208, port: 0 [2019-05-08 07:23:29.393409] I [glusterd-utils.c:6312:glusterd_brick_start] 0-management: starting a fresh brick process for brick /local.mnt/glfs/brick [2019-05-08 07:23:29.395426] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2019-05-08 07:23:29.460728] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600 [2019-05-08 07:23:29.460868] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped [2019-05-08 07:23:29.460911] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is stopped [2019-05-08 07:23:29.461360] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600 [2019-05-08 07:23:29.462857] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: glustershd already stopped [2019-05-08 07:23:29.462902] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-05-08 07:23:29.462959] I [MSGID: 106567] [glusterd-svc-mgmt.c:220:glusterd_svc_start] 0-management: Starting glustershd service [2019-05-08 07:23:30.465107] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600 [2019-05-08 07:23:30.465293] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: quotad already stopped [2019-05-08 07:23:30.465314] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: quotad service is stopped [2019-05-08 07:23:30.465351] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600 [2019-05-08 07:23:30.465477] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: bitd already stopped [2019-05-08 07:23:30.465489] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: bitd service is stopped [2019-05-08 07:23:30.465517] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600 [2019-05-08 07:23:30.465633] I [MSGID: 
106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: scrub already stopped [2019-05-08 07:23:30.465645] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: scrub service is stopped [2019-05-08 07:23:30.465689] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600 [2019-05-08 07:23:30.465772] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600 [2019-05-08 07:23:30.466776] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4 [2019-05-08 07:23:30.466822] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3, host: 10.10.0.25, port: 0 [2019-05-08 07:23:30.490461] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3 [2019-05-08 07:23:47.540967] I [MSGID: 106584] [glusterd-handler.c:5995:__glusterd_handle_get_state] 0-management: Received request to get state for glusterd [2019-05-08 07:23:47.541003] I [MSGID: 106061] [glusterd-handler.c:5517:glusterd_get_state] 0-management: Default output directory: /var/run/gluster/ [2019-05-08 07:23:47.541052] I [MSGID: 106061] [glusterd-handler.c:5553:glusterd_get_state] 0-management: Default filename: glusterd_state_20190508_092347 === END glusterd.log === === START glustershd.log === [2019-05-08 07:23:29.465963] I [MSGID: 100030] [glusterfsd.c:2849:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 6.1 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/dc47fa45e83d2326.socket --xlator-option *replicate*.node-uuid=5104ed01-f959-4a82-bbd6-17d4dd177ec2 --process-name glustershd --client-pid=-6) [2019-05-08 07:23:29.466783] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 29165 [2019-05-08 07:23:29.469726] I [socket.c:902:__socket_server_bind] 0-socket.glusterfsd: closing (AF_UNIX) reuse check socket 10 [2019-05-08 07:23:29.471280] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2019-05-08 07:23:29.471317] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost [2019-05-08 07:23:29.471326] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers [2019-05-08 07:23:29.471518] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-05-08 07:23:29.471540] W [glusterfsd.c:1570:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7f8e5adb37b3] -->/usr/sbin/glusterfs() [0x411629] -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum (1), shutting down === END glustershd.log === === START local.mnt-glfs-brick.log === [2019-05-08 07:23:29.396753] I [MSGID: 100030] [glusterfsd.c:2849:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 6.1 (args: /usr/sbin/glusterfsd -s 10.10.0.177 --volfile-id jf-vol0.10.10.0.177.local.mnt-glfs-brick -p /var/run/gluster/vols/jf-vol0/10.10.0.177-local.mnt-glfs-brick.pid -S /var/run/gluster/ccdac309d72f1df7.socket --brick-name /local.mnt/glfs/brick -l /var/log/glusterfs/bricks/local.mnt-glfs-brick.log --xlator-option 
*-posix.glusterd-uuid=5104ed01-f959-4a82-bbd6-17d4dd177ec2 --process-name brick --brick-port 49153 --xlator-option jf-vol0-server.listen-port=49153) [2019-05-08 07:23:29.397519] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 28996 [2019-05-08 07:23:29.400575] I [socket.c:902:__socket_server_bind] 0-socket.glusterfsd: closing (AF_UNIX) reuse check socket 10 [2019-05-08 07:23:29.401901] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-05-08 07:23:29.402622] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2019-05-08 07:23:29.402631] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: 10.10.0.177 [2019-05-08 07:23:29.402649] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers [2019-05-08 07:23:29.402770] W [glusterfsd.c:1570:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7fe46b1f77b3] -->/usr/sbin/glusterfsd() [0x411629] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum (1), shutting down [2019-05-08 07:23:29.403338] I [socket.c:3754:socket_submit_outgoing_msg] 0-glusterfs: not connected (priv->connected = 0) [2019-05-08 07:23:29.403353] W [rpc-clnt.c:1704:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (unique: 0, XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs) [2019-05-08 07:23:29.403420] W [glusterfsd.c:1570:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7fe46b1f77b3] -->/usr/sbin/glusterfsd() [0x411629] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum (1), shutting down === END local.mnt-glfs-brick.log === === START glusterd_state_20190508_092347 === [Global] MYUUID: 5104ed01-f959-4a82-bbd6-17d4dd177ec2 op-version: 50000 [Global options] [Peers] Peer1.primary_hostname: 10.10.0.208 Peer1.uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4 Peer1.state: Peer in Cluster Peer1.connected: Connected Peer1.othernames: Peer2.primary_hostname: 10.10.0.25 Peer2.uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3 Peer2.state: Peer in Cluster Peer2.connected: Connected Peer2.othernames: [Volumes] Volume1.name: jf-vol0 Volume1.id: f90d35dd-b2a4-461b-9ae9-dcfc68dac322 Volume1.type: Replicate Volume1.transport_type: tcp Volume1.status: Started Volume1.profile_enabled: 0 Volume1.brickcount: 3 Volume1.Brick1.path: 10.10.0.177:/local.mnt/glfs/brick Volume1.Brick1.hostname: 10.10.0.177 Volume1.Brick1.port: 49153 Volume1.Brick1.rdma_port: 0 Volume1.Brick1.port_registered: 0 Volume1.Brick1.status: Stopped Volume1.Brick1.spacefree: 1891708428288Bytes Volume1.Brick1.spacetotal: 1891966050304Bytes Volume1.Brick2.path: 10.10.0.25:/local.mnt/glfs/brick Volume1.Brick2.hostname: 10.10.0.25 Volume1.Brick3.path: 10.10.0.208:/local.mnt/glfs/brick Volume1.Brick3.hostname: 10.10.0.208 Volume1.snap_count: 0 Volume1.stripe_count: 1 Volume1.replica_count: 3 Volume1.subvol_count: 1 Volume1.arbiter_count: 0 Volume1.disperse_count: 0 Volume1.redundancy_count: 0 Volume1.quorum_status: not_applicable Volume1.snapd_svc.online_status: Offline Volume1.snapd_svc.inited: True Volume1.rebalance.id: 00000000-0000-0000-0000-000000000000 Volume1.rebalance.status: not_started Volume1.rebalance.failures: 0 Volume1.rebalance.skipped: 0 Volume1.rebalance.lookedup: 0 Volume1.rebalance.files: 0 Volume1.rebalance.data: 0Bytes Volume1.time_left: 0 Volume1.gsync_count: 0 
Volume1.options.cluster.readdir-optimize: on Volume1.options.cluster.self-heal-daemon: enable Volume1.options.cluster.lookup-optimize: on Volume1.options.network.inode-lru-limit: 200000 Volume1.options.performance.md-cache-timeout: 600 Volume1.options.performance.cache-invalidation: on Volume1.options.performance.stat-prefetch: on Volume1.options.features.cache-invalidation-timeout: 600 Volume1.options.features.cache-invalidation: on Volume1.options.diagnostics.brick-sys-log-level: INFO Volume1.options.diagnostics.brick-log-level: INFO Volume1.options.diagnostics.client-log-level: INFO Volume1.options.transport.address-family: inet Volume1.options.nfs.disable: on Volume1.options.performance.client-io-threads: off [Services] svc1.name: glustershd svc1.online_status: Offline svc2.name: nfs svc2.online_status: Offline svc3.name: bitd svc3.online_status: Offline svc4.name: scrub svc4.online_status: Offline svc5.name: quotad svc5.online_status: Offline [Misc] Base port: 49152 Last allocated port: 49153 === END glusterd_state_20190508_092347 ===
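
One more data point worth noting before the reply below: both glustershd.log and the brick log above end with "disconnected from remote-host" followed by "Exhausted all volfile servers", i.e. the child processes never manage to fetch their volfiles from the local glusterd, and glusterd.log itself reports "process started listening on port (44950)", which may indicate the management socket did not end up on the usual port. A hedged diagnostic sketch to confirm what glusterd is actually listening on (assumes iproute2's ss is available):

# Is glusterd listening on the default management port 24007?
ss -ltnp | grep ':24007'

# Which TCP ports has this glusterd process actually bound?
ss -ltnp | grep glusterd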
[2019-05-08 07:23:29.471317] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost
[2019-05-08 07:23:29.471326] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers

The two log lines above (the brick log shows the same pattern) are the cause: the brick process is unable to talk to glusterd. Could you please check the contents of the glusterd.vol file on this node (locate the file and paste the 'cat glusterd.vol' output)? Do you see an entry 'option transport.socket.listen-port 24007' in the glusterd.vol file? If not, could you add it, restart the node and see if that makes any difference?
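
A sketch of the suggested check and fix, assuming the stock configuration path /etc/glusterfs/glusterd.vol (the exact path is not stated above, so adjust if your packaging differs):

# Is the management listen port pinned?
grep -n 'listen-port' /etc/glusterfs/glusterd.vol

# If not, add the following line inside the 'volume management' block:
#     option transport.socket.listen-port 24007
# then restart glusterd and re-check the brick:
systemctl restart glusterd
gluster volume status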
That was it! The brick now starts up OK. Thanks a lot!

=== START glusterd.vol ===
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    # Adding this line made it work:
    option transport.socket.listen-port 24007
    option ping-timeout 0
    option event-threads 1
#   option transport.address-family inet6
#   option base-port 49152
end-volume
=== END glusterd.vol ===
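
For anyone landing here with the same symptom, a quick way to confirm the fix has taken effect on the upgraded node (a hedged sketch; the volume name is the one from this report):

ss -ltn | grep 24007           # glusterd should now be listening on the default management port
gluster volume status jf-vol0  # the local brick should show Online 'Y' with a PID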
I just got the same problem during an upgrade from 5 to 6, and the same solution fixed it. It is not clear to me why this is closed as 'not a bug': there is nothing about it in the 6.x release notes, so the upgrade should work with the default glusterd.vol values. Thank you!
I also see this with glusterfs-6.5-1.el7.x86_64 on CentOS 7.7:

[2019-10-07 09:17:37.071409] I [run.c:242:runner_log] (-->/usr/lib64/glusterfs/6.5/xlator/mgmt/glusterd.so(+0xe8faa) [0x7fd6204d3faa] -->/usr/lib64/glusterfs/6.5/xlator/mgmt/glusterd.so(+0xe8a75) [0x7fd6204d3a75] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fd62c360495] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=IT-RELATED --first=no --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2019-10-07 09:17:37.099416] I [run.c:242:runner_log] (-->/usr/lib64/glusterfs/6.5/xlator/mgmt/glusterd.so(+0xe8faa) [0x7fd6204d3faa] -->/usr/lib64/glusterfs/6.5/xlator/mgmt/glusterd.so(+0xe8a75) [0x7fd6204d3a75] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fd62c360495] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=IT-RELATED --first=no --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2019-10-07 09:42:26.314045] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.5/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory
[2019-10-07 09:42:26.328413] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.5/rpc-transport/socket.so: undefined symbol: xlator_api
[2019-10-07 09:42:26.330640] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.5/xlator/nfs/server.so: cannot open shared object file: No such file or directory
[2019-10-07 09:42:26.348399] W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.5/xlator/storage/bd.so: cannot open shared object file: No such file or directory
The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.5/xlator/encryption/crypt.so: cannot open shared object file: No such file or directory" repeated 2 times between [2019-10-07 09:42:26.314045] and [2019-10-07 09:42:26.314307]
The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.5/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-10-07 09:42:26.328413] and [2019-10-07 09:42:26.328590]
The message "W [MSGID: 101095] [xlator.c:210:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/6.5/xlator/nfs/server.so: cannot open shared object file: No such file or directory" repeated 30 times between [2019-10-07 09:42:26.330640] and [2019-10-07 09:42:26.331499]
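
These loader messages look like the same benign ones discussed earlier in this thread (glusterd probing option tables for xlators that are not shipped on the node); on their own they do not indicate a broken volume. What matters is whether the bricks come online after the upgrade. A hedged way to check, reusing the earlier diagnosis (stock paths assumed):

gluster volume status                               # are all bricks Online 'Y'?
grep -n 'listen-port' /etc/glusterfs/glusterd.vol   # is the 24007 management port pinned?
grep 'Exhausted all volfile servers' /var/log/glusterfs/bricks/*.log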