Bug 1694139
| Summary: | Error waiting for job 'heketi-storage-copy-job' to complete on one-node k3s deployment. | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | it.sergm |
| Component: | glusterd | Assignee: | Raghavendra Talur <rtalur> |
| Status: | CLOSED WONTFIX | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.1 | CC: | amukherj, assen.sharlandjiev, avishwan, bugs, rtalur |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-10-28 22:15:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
it.sergm
2019-03-29 15:46:29 UTC
Could you elaborate on the problem a bit more? Are you seeing the volume mount failing, or is something wrong with the clustering? From a quick scan through the bug report, I don't see anything problematic on the glusterd end.

The thing is, I don't see any exception there either, and the heketidbstorage volume mounts manually (no files inside), but it still doesn't work with k3s and the pod apparently cannot mount what it needs before starting. K3s itself works fine and can deploy other workloads with no errors.

I could be wrong, but there is only one volume listed from the gluster pod:

```
[root@k3s-gluster /]# gluster volume list
heketidbstorage
```

But according to the pod's error there should be more:

```
3d12h  Warning  FailedMount  Pod  Unable to mount volumes for pod "heketi-storage-copy-job-qzpr7_kube-system(36e1b013-5200-11e9-a826-227e2ba50104)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"heketi-storage-copy-job-qzpr7". list of unmounted volumes=[heketi-storage]. list of unattached volumes=[heketi-storage heketi-storage-secret default-token-98jvk]
```

Here is the list of volumes on the gluster pod:

```
[root@k3s-gluster /]# df -h
Filesystem                                                                               Size  Used Avail Use% Mounted on
overlay                                                                                  9.8G  6.9G  2.5G  74% /
udev                                                                                     3.9G     0  3.9G   0% /dev
/dev/vda2                                                                                9.8G  6.9G  2.5G  74% /run
tmpfs                                                                                    798M  1.3M  797M   1% /run/lvm
tmpfs                                                                                    3.9G     0  3.9G   0% /dev/shm
tmpfs                                                                                    3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                                                                                    3.9G   12K  3.9G   1% /run/secrets/kubernetes.io/serviceaccount
/dev/mapper/vg_fef96eab984d116ab3815e7479781110-brick_65d5aa6369e265d641f3557e6c9736b7  2.0G   33M  2.0G   2% /var/lib/heketi/mounts/vg_fef96eab984d116ab3815e7479781110/brick_65d5aa6369e265d641f3557e6c9736b7

[root@k3s-gluster /]# blkid
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/vda1: PARTUUID="258e4699-a592-442c-86d7-3d7ee4a0dfb7"
/dev/vda2: UUID="b394d2be-6b9e-11e8-82ca-22c5fe683ae4" TYPE="ext4" PARTUUID="97104384-f79f-4a39-b3d4-56d717673a18"
/dev/vdb: UUID="RUR8Cw-eVYg-H26e-yQ4g-7YCe-NzNg-ocJazb" TYPE="LVM2_member"
/dev/mapper/vg_fef96eab984d116ab3815e7479781110-brick_65d5aa6369e265d641f3557e6c9736b7: UUID="ab0e969f-ae85-459c-914f-b008aeafb45e" TYPE="xfs"
```

Also, here is what I found on the main node - the host IP is null (by the way, I changed the topology before, switching between external and private IPs - it made no difference here):

```
root@k3s-gluster:~# cat /var/log/glusterfs/cli.log.1
[2019-03-29 08:54:07.634136] I [cli.c:773:main] 0-cli: Started running gluster with version 4.1.7
[2019-03-29 08:54:07.678012] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-03-29 08:54:07.678105] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2019-03-29 08:54:07.678268] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0
[2019-03-29 08:54:07.721606] I [cli-rpc-ops.c:1169:gf_cli_create_volume_cbk] 0-cli: Received resp to create volume
[2019-03-29 08:54:07.721773] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2019-03-29 08:54:07.817416] I [cli.c:773:main] 0-cli: Started running gluster with version 4.1.7
[2019-03-29 08:54:07.861767] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-03-29 08:54:07.861943] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2019-03-29 08:54:07.862016] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0
[2019-03-29 08:54:08.009116] I [cli-rpc-ops.c:1472:gf_cli_start_volume_cbk] 0-cli: Received resp to start volume
[2019-03-29 08:54:08.009314] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2019-03-29 14:18:51.209759] I [cli.c:773:main] 0-cli: Started running gluster with version 4.1.7
[2019-03-29 14:18:51.256846] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-03-29 14:18:51.256985] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2019-03-29 14:18:51.257093] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0
[2019-03-29 14:18:51.259408] I [cli-rpc-ops.c:875:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0
[2019-03-29 14:18:51.259587] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0
[2019-03-29 14:18:51.260102] I [cli-rpc-ops.c:875:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0
[2019-03-29 14:18:51.260143] I [input.c:31:cli_batch] 0-: Exiting with: 0
```
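The manual mount check mentioned above can be sketched roughly as follows. This is only a sketch: `<gluster-node-ip>` is a placeholder for the address used in the heketi topology, and it assumes glusterfs-fuse is installed on the k3s host; neither value comes from this report.

```bash
# Separate a gluster-side failure from a k3s-side one.

# Inside the gluster pod: confirm the volume exists and is started.
gluster volume info heketidbstorage

# On the k3s host: try a manual fuse mount of the same volume.
mkdir -p /mnt/heketidbstorage
mount -t glusterfs <gluster-node-ip>:/heketidbstorage /mnt/heketidbstorage
ls -la /mnt/heketidbstorage

# If the manual mount succeeds (as reported above), glusterd is serving the
# volume and the failure is in how k3s mounts it for the pod.
umount /mnt/heketidbstorage
```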
Hi, I ran into the same problem. I checked the syslog on the k3s host node with `tail -f /var/log/syslog`, which shows:

```
May 8 16:26:53 k3s-node2 k3s[922]: E0508 16:26:53.466167 922 desired_state_of_world_populator.go:298] Failed to add volume "heketi-storage" (specName: "heketi-storage") for pod "9a3ec318-718e-11e9-9557-3e1cb9b46815" to desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "heketi-storage" err=no volume plugin matched
May 8 16:26:53 k3s-node2 k3s[922]: E0508 16:26:53.569733 922 desired_state_of_world_populator.go:298] Failed to add volume "heketi-storage" (specName: "heketi-storage") for pod "9a3ec318-718e-11e9-9557-3e1cb9b46815" to desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "heketi-storage" err=no volume plugin matched
```

I guess we are missing something on the k3s agent node. Hope this info helps.

Talur - could you check what's going wrong here?

I believe this isn't a (core) gluster problem as such. The error message "list of unmounted volumes=[heketi-storage]" indicates that k3s wasn't able to mount the "glusterfs" volume type; look at the other error message: err=failed to get Plugin from volumeSpec for volume "heketi-storage" err=no volume plugin matched. From what I could find online, k3s does not have the in-tree volume plugin for gluster. We (the heketi developers) have not tested heketi with anything other than k8s, so you are in uncharted territory. A simple workaround I can think of is to split the deployment steps and do the copy job manually; you can find the copy job definition here: https://github.com/heketi/heketi/blob/master/client/cli/go/cmds/heketi_storage.go#L256. I will close the bug as we don't really support k3s yet.
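A rough manual equivalent of that copy job might look like the sketch below. It assumes the objects created by the standard heketi setup (the heketi-storage-secret secret with a heketi.db key, the heketidbstorage volume) and a reachable gluster node address; none of these were verified against this particular deployment.

```bash
# Manual stand-in for heketi-storage-copy-job: the job only copies heketi.db
# from heketi-storage-secret into the heketidbstorage gluster volume.

# 1. Pull heketi.db out of the secret (key name "heketi.db" is assumed here).
kubectl -n kube-system get secret heketi-storage-secret \
  -o jsonpath='{.data.heketi\.db}' | base64 -d > heketi.db

# 2. Mount the gluster volume the job would have mounted via its glusterfs
#    volume spec (the step k3s cannot do without the in-tree plugin).
mkdir -p /mnt/heketidbstorage
mount -t glusterfs <gluster-node-ip>:/heketidbstorage /mnt/heketidbstorage

# 3. Copy the database in, then unmount and continue with the remaining
#    deployment steps by hand.
cp heketi.db /mnt/heketidbstorage/heketi.db
umount /mnt/heketidbstorage
```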