Description of problem:
When the router (haproxy) reloads its configuration while serving a large number of routes (about 10,000), the reload is slow enough that the router fails its health check. As a result, the router is killed and restarted continuously.

Version-Release number of selected component (if applicable):
ocp 3.9.14
router 3.9.14

How reproducible:
Always

Steps to Reproduce:
1. Create a router.
2. Scale up to 3 router pods.
3. Create 10000+ routes.

Actual results:
The router pods keep restarting.

Expected results:
The router pods run normally.

Additional info:
Increasing the haproxy backend check interval to 300s avoids the problem.
Can we get details on the VM/machine they are running the router on?

However, it sounds like they have identified a workaround for the time being. Changes that are going into 3.11 may help with this situation.
The infra node running the router has 8 cores and 32 GB. Please let me know if you need any additional information. Thanks.
(In reply to Ben Bennett from comment #1)
> Can we get details on the vm/machine they are running the router on.
>
> However, it sounds like they have identified a work-around for the time
> being. Changes that are going into 3.11 may help with this situation.

Thanks, Bennett, for your reply. The root cause is probably that haproxy is very slow to reload when there is a large number of routes, so a health check cannot complete within the default readiness and liveness probe cycle. That needs an actual improvement to fix. For now, increasing the number of health checks and the check interval only works around the problem; it does not resolve it.
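To illustrate why lengthening the probe cycle hides the restarts, here is a rough sketch of the time budget a liveness probe allows. It is only an approximation of kubelet behavior: the field names mirror the Kubernetes probe settings, the default values are the generic Kubernetes defaults (not necessarily what the 3.9 router deployment uses), and the 45-second reload time is a made-up number, not a measurement from this cluster.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    # Generic Kubernetes probe defaults; the router's real values may differ.
    period_seconds: int = 10
    failure_threshold: int = 3


def min_tolerated_outage(probe: Probe) -> int:
    """Longest unresponsive stretch (seconds) that can never produce
    failure_threshold consecutive failed probes, ignoring probe timeouts."""
    return (probe.failure_threshold - 1) * probe.period_seconds


def survives_reload(probe: Probe, reload_seconds: float) -> bool:
    """True if a reload of this length is guaranteed not to trigger a restart."""
    return reload_seconds < min_tolerated_outage(probe)


if __name__ == "__main__":
    hypothetical_reload = 45  # seconds; assumed for illustration only
    default_probe = Probe()
    relaxed_probe = Probe(period_seconds=30, failure_threshold=5)
    print(survives_reload(default_probe, hypothetical_reload))
    print(survives_reload(relaxed_probe, hypothetical_reload))
```

Under these assumptions the default probe tolerates only about 20 seconds of unresponsiveness, while the relaxed one tolerates about 120 seconds. This matches the observation that raising the intervals avoids the restarts without making the reload itself any faster.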
I'm closing this bug, feel free to reopen it if the problem persists.