Description of problem:
One of two router pods intermittently goes into CrashLoopBackOff.

Version-Release number of selected component (if applicable):
3.2.0. They customized the image to modify the header size, but made no other changes.

How reproducible:
Unverified for us, consistent for them.

Actual results:
CrashLoopBackOff

Expected results:
Running pod

Additional info:
Uploading logs and events momentarily.
Events show many messages like:

1m 1m 1 router-1-7mkii Pod spec.containers{router} Normal Killing {kubelet apsrp6468.example.com} Killing container with docker id f20ad1d47f84: pod "router-1-7mkii_default(d0306181-b799-11e6-8fc2-0050568704dd)" container "router" is unhealthy, it will be killed and re-created.
19s 19s 1 router-1-7mkii Pod spec.containers{router} Normal Created {kubelet apsrp6468.example.com} Created container with docker id 51e261c75a16
19s 19s 1 router-1-7mkii Pod spec.containers{router} Normal Started {kubelet apsrp6468.example.com} Started container with docker id 51e261c75a16
37m 24s 37 router-1-syzb1 Pod spec.containers{router} Warning Unhealthy {kubelet apsrp6469.example.com} Readiness probe failed: Get http://localhost:1936/healthz: net/http: request canceled while waiting for connection
37m 24s 36 router-1-syzb1 Pod spec.containers{router} Warning Unhealthy {kubelet apsrp6469.example.com} Liveness probe failed: Get http://localhost:1936/healthz: net/http: request canceled while waiting for connection

However, nothing in these events or in the logs points to the cause (e.g. there is no indication of port conflicts).
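Since both probes fail while waiting for a connection to http://localhost:1936/healthz, one way to narrow this down might be to poll that endpoint by hand and record how long each request takes. The following is only a rough sketch, not a confirmed diagnostic procedure. The helper names `probe` and `sample` are hypothetical, the 1-second timeout is an assumption (the probe timeout actually configured on this router is not shown in the events), and the script is assumed to run somewhere that can reach the router's stats port, such as the node itself.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout):
    """Issue one GET and return (ok, elapsed_seconds, detail).

    A 2xx/3xx answer within the timeout counts as healthy; an HTTP error
    status, a refused connection, or a timeout counts as a failure.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ok = 200 <= resp.status < 400
            detail = "HTTP %d" % resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        ok, detail = False, "HTTP %d" % exc.code
    except (urllib.error.URLError, OSError, http.client.HTTPException) as exc:
        # Connection refused, timeout, reset, or malformed response.
        ok, detail = False, repr(exc)
    return ok, time.monotonic() - start, detail


def sample(url, timeout, count, interval):
    """Probe `url` `count` times, sleeping `interval` seconds between tries."""
    results = []
    for _ in range(count):
        results.append(probe(url, timeout))
        time.sleep(interval)
    return results
```

For example, `sample("http://localhost:1936/healthz", timeout=1.0, count=30, interval=1.0)` would collect thirty samples. If the failures cluster near the timeout value rather than failing immediately, that would suggest the endpoint is slow to accept connections rather than absent, which a port conflict would not explain.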