Description of problem: 503 errors are seen coming from router when making a connection. The OpenShift router with nbthreads configured is routing to old endpoints that no longer exist, due to creating new sessions linked to old pids that were send the FINISH (SIGTTOU and SIGUSR1) signal. At this time this is only seen when nbthread is configured. Version-Release number of selected component (if applicable): OpenShift 3.9 -> 3.11 haproxy router Haproxy 1.8.8 Additional info: - PID 51938 in pod: 51938 Mar05 250:12 /usr/sbin/haproxy | v - Links to pid 4104 on the host 1000000+ 4104 2.9 0.1 355740 60468 ? Ssl Mar05 250:10 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -x /var/lib/haproxy/run/haproxy.sock -sf 21158 51857 | v - lsof has the following established connections haproxy 4104 10000000 15u IPv4 1005375260 0t0 TCP y.y.y.18:https->x.x.x.14:40834 (ESTABLISHED) haproxy 4104 10000000 19u IPv4 1005375226 0t0 TCP y.y.y.18:https->x.x.x.13:47778 (ESTABLISHED) haproxy 4104 10000000 21u IPv4 1005365865 0t0 TCP y.y.y.18:https->x.x.x.13:46924 (ESTABLISHED) haproxy 4104 10000000 27u IPv4 1005379692 0t0 TCP y.y.y.18:https->x.x.x.14:40676 (ESTABLISHED) haproxy 4104 10000000 29u IPv4 1005366039 0t0 TCP y.y.y.18:https->x.x.x.13:46978 (ESTABLISHED) haproxy 4104 10000000 52u IPv4 1005345463 0t0 TCP y.y.y.18:https->x.x.x.14:59186 (ESTABLISHED) - The PID for 51938 was sent finish signal (likely by another pid but remains in list). The finish signal was sent at or before Mar 9th: 1000000+ 41881 2.7 0.1 355488 56384 ? Ssl Mar09 73:43 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -x /var/lib/haproxy/run/haproxy.sock -sf 21158 51938 65730 - Haproxy logs show new sessions with this old pid: Mar 11 15:49:23 localhost haproxy[51938]: x.x.x.14:42972 [11/Mar/2019:15:49:23.872] fe_sni~ be_edge_http:myapp2:myapp2-bg/<NOSRV> 0/-1/-1/-1/0 503 3278 - - SCNN 1/1/0/0/0 0/0 "HEAD /lae2cae.html HTTP/1.1" Mar 11 15:49:19 localhost haproxy[51938]: x.x.x.13:50820 [11/Mar/2019:15:49:19.651] fe_sni~ be_edge_http:myapp:myapp-test/<NOSRV> 0/-1/-1/-1/0 503 3278 - - SCNN 2/1/0/0/0 0/0 "HEAD /lae2cae.html HTTP/1.1" - Info from haproxy $ echo "show info" | socat /var/lib/haproxy/run/haproxy.sock stdio Name: HAProxy Version: 1.8.8 Release_date: 2018/04/19 Nbthread: 4 Nbproc: 1 Process_num: 1 Pid: 44561
Thanks for the detailed report, Ryan. Will let you know if we need any more information to continue diagnosis.