Description of problem:
While trying to create 100 PVCs backed by Gluster, 30 of them got stuck in the Pending state. I see the following error in the Heketi pod:

[cmdexec] ERROR 2018/05/04 15:23:57 /src/github.com/heketi/heketi/executors/cmdexec/brick.go:141: Unable to execute command on glusterfs-storage-7pr79: runtime/cgo: pthread_create failed: Resource temporarily unavailable

Version-Release number of selected component (if applicable):
openshift v3.10.0-0.32.0
kubernetes v1.10.0+b81c8f8
etcd 3.2.16
registry.reg-aws.openshift.com:443/rhgs3/rhgs-server-rhel7:3.3.1-13
registry.reg-aws.openshift.com:443/rhgs3/rhgs-volmanager-rhel7:3.3.1-10

How reproducible:
Try to create 100 PVCs using this template (a rough oc-only equivalent is sketched under Additional info below):
https://raw.githubusercontent.com/openshift/svt/master/openshift_scalability/content/pvc-templates/pvc-parameters.yaml

Actual results:
Errors in the Heketi logs and PVCs stuck in the Pending state.

Expected results:
All PVCs should be created and bound.

Master Log:
Attached: master logs, master pod logs, heketi node logs and heketi pod logs.

Additional info:
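For anyone reproducing without the svt tooling, here is a rough oc-only equivalent of the run. This is only a sketch: the claim names and the 1Gi size are arbitrary, while the storage class and namespace are the ones that show up in the provisioner logs below.

# Create 100 small PVCs against the glusterfs-storage storage class and count
# how many end up stuck in Pending. Claim names and size are arbitrary here.
for i in $(seq 1 100); do
  cat <<EOF | oc create -n pvcproject0 -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-scale-test-$i
spec:
  storageClassName: glusterfs-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
done
oc get pvc -n pvcproject0 --no-headers | grep -c Pending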
Created attachment 1431475 [details] master controllers pod log
Created attachment 1431476 [details] master api pod log
Created attachment 1431477 [details] heketi pod log
Created attachment 1431478 [details] heketi node log
There are also crashes coming from the glusterfs provisioner:

E0504 12:50:14.745979 1 glusterfs.go:770] failed to create volume: Unable to execute command on glusterfs-storage-2r2vk: runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7fd218f71207 m=0 sigcode=18446744073709551610
goroutine 0 [idle]:
goroutine 1 [running]:
runtime.systemstack_switch()
        /usr/lib/golang/src/runtime/asm_amd64.s:298 fp=0xc420030780 sp=0xc420030778 pc=0x55d55861ab00
runtime.main()
        /usr/lib/golang/src/runtime/proc.go:128 +0x67 fp=0xc4200307e0 sp=0xc420030780 pc=0x55d5585f00e7
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc4200307e8 sp=0xc4200307e0 pc=0x55d55861d741
rax 0x0 rbx 0x7fd219302868 rcx 0xffffffffffffffff rdx 0x6 rdi 0xfb3f rsi 0xfb3f rbp 0x55d5588f2556 rsp 0x7ffc1ad0c618 r8 0xa r9 0x7fd219b7b740 r10 0x8 r11 0x206 r12 0x55d55a3b1050 r13 0xf1 r14 0x11 r15 0x0 rip 0x7fd218f71207 rflags 0x206 cs 0x33 fs 0x0 gs 0x0
exec failed: container_linux.go:348: starting container process caused "read init-p: connection reset by peer"

E0504 12:50:14.746095 1 glusterfs.go:708] failed to create volume: failed to create volume: Unable to execute command on glusterfs-storage-2r2vk: runtime/cgo: pthread_create failed: Resource temporarily unavailable
(same SIGABRT backtrace, register dump and "exec failed: container_linux.go:348" message as above)

I0504 12:50:14.746142 1 pv_controller.go:1433] failed to provision volume for claim "pvcproject0/pvcol0jhvtxsq" with StorageClass "glusterfs-storage": failed to create volume: failed to create volume: Unable to execute command on glusterfs-storage-2r2vk: runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7fd218f71207 m=0 sigcode=18446744073709551610

I think the glusterfs client does not handle these failures very well.
The SIGABRT is not in the provisioner; rather, it happened when heketi exec'd into the gluster pod and tried to execute the volume create command.

--snip-- from heketi logs:

[kubeexec] ERROR 2018/05/04 14:55:21 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:240: Failed to run command [mkdir -p /var/lib/heketi/mounts/vg_8f6e7c119094ce46cd66607ace149438/brick_975687bed1b3389410e79d84f586fca2] on glusterfs-storage-vzxrd: Err[command terminated with exit code 1]: Stdout []: Stderr [runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7fc4ebcf0207 m=0 sigcode=18446744073709551610
goroutine 0 [idle]:
goroutine 1 [running]:
runtime.systemstack_switch()
        /usr/lib/golang/src/runtime/asm_amd64.s:298 fp=0xc420030780 sp=0xc420030778 pc=0x5612c4e1cb00
runtime.main()
        /usr/lib/golang/src/runtime/proc.go:128 +0x67 fp=0xc4200307e0 sp=0xc420030780 pc=0x5612c4df20e7
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc4200307e8 sp=0xc4200307e0 pc=0x5612c4e1f741
rax 0x0 rbx 0x7fc4ec081868 rcx 0xffffffffffffffff rdx 0x6 rdi 0x3516 rsi 0x3516 rbp 0x5612c50f4556 rsp 0x7ffc3ec06578 r8 0xa r9 0x7fc4ec8fa740 r10 0x8 r11 0x202 r12 0x5612c6d90050 r13 0xf1 r14 0x11 r15 0x0 rip 0x7fc4ebcf0207 rflags 0x202 cs 0x33 fs 0x0 gs 0x0
exec failed: container_linux.go:348: starting container process caused "read init-p: connection reset by peer" ]

[kubeexec] ERROR 2018/05/04 14:55:21 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:240: Failed to run command [mkdir -p /var/lib/heketi/mounts/vg_9ae9ceb16e15f542dbb3ae1281c9bdc7/brick_bf819a1b07374161a4d691011028f9e6] on glusterfs-storage-7pr79: Err[command terminated with exit code 1]: Stdout []: Stderr [runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f0ed9213207 m=0 sigcode=18446744073709551610

------snip--

A good number of volumes had already been created in this setup before we received the 'resource congestion or restriction' error on pthread_create(). We need to see why we hit the resource limit.
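For what it's worth, when this reproduces it may help to compare the thread count inside a gluster pod against the limits visible from inside the container. A rough way to do that (pod name is the one from the heketi log above, namespace assumed to be glusterfs; adjust to whichever pod is failing):

# Compare running threads inside one gluster pod with the limits it sees.
oc exec -n glusterfs glusterfs-storage-7pr79 -- bash -c '
  echo "threads in pod:  $(ps -eLf --no-headers | wc -l)"
  echo "threads-max:     $(cat /proc/sys/kernel/threads-max)"
  echo "ulimit -u:       $(ulimit -u)"
  # pids cgroup limit, if the pids controller is mounted in the container
  cat /sys/fs/cgroup/pids/pids.max 2>/dev/null
'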
Madhu, I think this is related to the throttling feature you worked on. Could you check that? If it is part of cns-3.10.0, this bug may be moved to POST/MODIFIED; otherwise we should move it to cns-3.11.0 (and back to NEW/ASSIGNED).
I think it can be fixed by throttling requests, but throttling involves both a server-side change and a client-side change. So the kubernetes provisioner has to use the new heketi client (which has the retry mechanism) to fix this issue.
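To illustrate the client-side part, here is a rough sketch of the retry/backoff idea, shown with heketi-cli rather than the Go client the provisioner actually uses. The server URL, credentials, volume size and retry counts are placeholders, not the real provisioner change:

# Sketch only: retry a volume create with backoff when heketi/gluster is busy.
for attempt in $(seq 1 5); do
  if heketi-cli -s http://heketi-storage:8080 --user admin --secret "$HEKETI_ADMIN_KEY" volume create --size=1; then
    break
  fi
  echo "volume create attempt $attempt failed, backing off"
  sleep $((attempt * 10))   # simple linear backoff between retries
done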
What is special about the gluster node settings with respect to resource limits? Please provide the output of:

cat /proc/sys/kernel/threads-max
bash -c "ulimit -u"
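If it's easier, the same two values can be collected from all the gluster pods (one per storage node) in one go. A sketch; the label selector is an assumption about how the daemonset labels its pods, so adjust it to match your deployment:

# Gather threads-max and the nproc ulimit from every gluster pod.
for pod in $(oc get pods -n glusterfs -l glusterfs=storage-pod -o name); do
  echo "== ${pod} =="
  oc exec -n glusterfs "${pod#pod/}" -- bash -c 'cat /proc/sys/kernel/threads-max; ulimit -u'
done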
Stuck at 41 PVCs out of a target of 100.

# yum list installed | grep openshift
atomic-openshift.x86_64    3.10.0-0.58.0.git.0.4ebb416.el7

# oc describe node | grep -i run
 runtime=cri-o
 Container Runtime Version:  cri-o://1.10.2-dev

# oc get pod -n glusterfs -o yaml | grep "image:" | sort -u
    image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-gluster-block-prov-rhel7:3.3.1-7.1527091787
    image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-server-rhel7:3.3.1-10.1527091766
    image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-volmanager-rhel7:3.3.1-8.1527091742

# oc get pod -n glusterfs -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP              NODE
glusterblock-storage-provisioner-dc-2-7jj8v   1/1     Running   0          13m   172.20.2.6      ip-172-31-12-22.us-west-2.compute.internal
glusterfs-storage-lcvn9                       1/1     Running   0          26m   172.31.54.222   ip-172-31-54-222.us-west-2.compute.internal
glusterfs-storage-mqgrs                       1/1     Running   0          26m   172.31.57.96    ip-172-31-57-96.us-west-2.compute.internal
glusterfs-storage-nwxnx                       1/1     Running   0          26m   172.31.30.13    ip-172-31-30-13.us-west-2.compute.internal
heketi-storage-2-578b6                        1/1     Running   0          12m   172.20.2.8      ip-172-31-12-22.us-west-2.compute.internal

root@ip-172-31-35-224: ~ # ssh ip-172-31-12-22.us-west-2.compute.internal
Warning: Permanently added 'ip-172-31-12-22.us-west-2.compute.internal,172.31.12.22' (ECDSA) to the list of known hosts.
Last login: Tue Jun 5 19:42:18 2018 from 66.187.233.202
root@ip-172-31-12-22: ~ # cat /proc/sys/kernel/threads-max
513648
root@ip-172-31-12-22: ~ # bash -c "ulimit -u"
256824
root@ip-172-31-12-22: ~ # exit
logout
Connection to ip-172-31-12-22.us-west-2.compute.internal closed.

root@ip-172-31-35-224: ~ # ssh ip-172-31-54-222.us-west-2.compute.internal
Warning: Permanently added 'ip-172-31-54-222.us-west-2.compute.internal,172.31.54.222' (ECDSA) to the list of known hosts.
Last login: Tue Jun 5 20:22:55 2018 from 172.31.55.221
root@ip-172-31-54-222: ~ # bash -c "ulimit -u"
256824
root@ip-172-31-54-222: ~ # cat /proc/sys/kernel/threads-max
513648
root@ip-172-31-54-222: ~ # exit
logout
Connection to ip-172-31-54-222.us-west-2.compute.internal closed.

root@ip-172-31-35-224: ~ # ssh ip-172-31-57-96.us-west-2.compute.internal
Warning: Permanently added 'ip-172-31-57-96.us-west-2.compute.internal,172.31.57.96' (ECDSA) to the list of known hosts.
Last login: Tue Jun 5 20:22:55 2018 from 172.31.55.221
root@ip-172-31-57-96: ~ # cat /proc/sys/kernel/threads-max
513648
root@ip-172-31-57-96: ~ # bash -c "ulimit -u"
256824
root@ip-172-31-57-96: ~ # exit
logout
Connection to ip-172-31-57-96.us-west-2.compute.internal closed.

root@ip-172-31-35-224: ~ # ssh ip-172-31-30-13.us-west-2.compute.internal
Warning: Permanently added 'ip-172-31-30-13.us-west-2.compute.internal,172.31.30.13' (ECDSA) to the list of known hosts.
Last login: Tue Jun 5 20:22:55 2018 from 172.31.55.221
root@ip-172-31-30-13: ~ # cat /proc/sys/kernel/threads-max
513648
root@ip-172-31-30-13: ~ # bash -c "ulimit -u"
256824