Bug 1464475
| Summary: | [GSS] [OCP 3.4] haproxy config files are missing some servers | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Francesco Marchioni <fmarchio> |
| Component: | Networking | Assignee: | Phil Cameron <pcameron> |
| Networking sub component: | router | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | unspecified | CC: | aos-bugs, atragler, bbennett, bperkins, clichybi, eparis, pcameron, rkhan, sukulkar, trankin |
| Version: | 3.4.1 | | |
| Target Milestone: | --- | | |
| Target Release: | 3.4.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-02 14:10:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Francesco Marchioni
2017-06-23 14:06:57 UTC
Created attachment 1291102 [details]
Router configurations
Created attachment 1291103 [details]
router log files
Created attachment 1291105 [details]
Deployment config and ha-proxy config template
Per @fmarchio: I've checked through the router files and found the following "broken" configurations:

./router-cpyi0018-10-5255r/haproxy.config
./router-cpyi0087-11-rbfed/haproxy.config

In more detail, they are missing the following servers:

< server 7dd6fb3777b32ca1383ef13f0622de63 10.1.22.144:8080 check inter 5000ms cookie 7dd6fb3777b32ca1383ef13f0622de63 weight 100
4606d4604
< server 7dd6fb3777b32ca1383ef13f0622de63 10.1.22.144:8080 check inter 5000ms cookie 7dd6fb3777b32ca1383ef13f0622de63 weight 100
4670d4667
< server 7dd6fb3777b32ca1383ef13f0622de63 10.1.22.144:8080 check inter 5000ms cookie 7dd6fb3777b32ca1383ef13f0622de63 weight 100
routes

It could be https://bugzilla.redhat.com/show_bug.cgi?id=1464567 (or one of the other event queue bugs that we fixed). I know you aren't seeing the panics, but there are other queue problems that were identified when we fixed that bug.

Can they try enabling the debugging endpoint as the comment here outlines: https://github.com/openshift/ose/pull/700

Then get me the output from that curl command and we can check whether anything looks off there. Alternatively, would they be willing to run the router with an elevated log level?

How to enable the debugging endpoints:
This implements an HTTP endpoint that is enabled by setting
OPENSHIFT_PROFILE=web. You can override the address it listens
on (default is 127.0.0.1) and the port (default 6061) using the
OPENSHIFT_PROFILE_HOST and OPENSHIFT_PROFILE_PORT environment
variables, respectively.
The endpoint stays disabled until OPENSHIFT_PROFILE=web is set.
With the default setup, you can do:
curl http://127.0.0.1:6061/debug/pprof/goroutine?debug=1
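
For anyone scripting the collection of this output, here is a minimal Python sketch of how the endpoint URL follows from the three environment variables described above. The function name `pprof_goroutine_url` is hypothetical and not part of OpenShift; the sketch only assumes the defaults quoted in this comment (127.0.0.1 and 6061) and that the endpoint is off unless OPENSHIFT_PROFILE=web.

```python
import os


def pprof_goroutine_url(env=None):
    """Build the goroutine-dump URL from the router's environment.

    Returns None when OPENSHIFT_PROFILE is not set to "web", since the
    endpoint is described as disabled in that case.
    """
    env = os.environ if env is None else env
    if env.get("OPENSHIFT_PROFILE") != "web":
        return None
    # Defaults taken from the comment above; overrides are optional.
    host = env.get("OPENSHIFT_PROFILE_HOST", "127.0.0.1")
    port = env.get("OPENSHIFT_PROFILE_PORT", "6061")
    return f"http://{host}:{port}/debug/pprof/goroutine?debug=1"


if __name__ == "__main__":
    # Example with an explicit environment instead of the real one.
    print(pprof_goroutine_url({"OPENSHIFT_PROFILE": "web"}))
```

Because the default listen address is loopback, the request has to be made from the same network namespace as the router process unless OPENSHIFT_PROFILE_HOST is changed.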
bbennett comment 4 lists the same server 3 times. This suggests the router didn't see the event for it. Can we get the current pod spec please?

*** This bug has been marked as a duplicate of bug 1464567 ***
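
For reference, the comparison from the description (server lines present in a healthy router's haproxy.config but absent from a broken one) can be sketched in Python as below. This is an illustration only, not the method actually used on the attachments: the helper names `server_lines` and `missing_servers` are hypothetical, and the sketch assumes that a plain line-level comparison of `server ...` entries is enough. It counts occurrences rather than using a set, because the same server line can appear several times in one config, as in the diff above.

```python
from collections import Counter


def server_lines(config_text):
    """Count each stripped 'server ...' line in an haproxy config."""
    stripped = (line.strip() for line in config_text.splitlines())
    return Counter(line for line in stripped if line.startswith("server "))


def missing_servers(reference_text, suspect_text):
    """Return {server line: number of occurrences missing from suspect}."""
    # Counter subtraction keeps only entries with a positive remainder.
    return dict(server_lines(reference_text) - server_lines(suspect_text))


if __name__ == "__main__":
    import sys

    # Usage: script.py good/haproxy.config broken/haproxy.config
    with open(sys.argv[1]) as good, open(sys.argv[2]) as bad:
        for line, count in missing_servers(good.read(), bad.read()).items():
            print(f"missing x{count}: {line}")
```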