Bug 1963243

Summary: HAproxy pod logs showing error "another server named 'pod:httpd-7c7ccfffdc-wdkvk:httpd:8080-tcp:10.128.x.x:8080' was already defined at line 326, please use distinct names"
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Routing
Assignee: Stephen Greene <sgreene>
Status: CLOSED ERRATA
QA Contact: Arvind iyengar <aiyengar>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.6
CC: aos-bugs, bperkins, dapark, hongli, mmasters
Target Milestone: ---
Target Release: 4.7.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The selector was removed from a service exposed via a route.
Consequence: Duplicate EndpointSlices were created for the service's pods, and the resulting duplicate server entries caused HAProxy reload errors.
Fix: The router now filters out accidental duplicate server lines when writing the HAProxy config file.
Result: Removing the selector from a service no longer breaks the router.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-15 09:28:20 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1961550
Bug Blocks: 1965329
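
The fix described in the Doc Text (dropping duplicate server lines before the HAProxy config is written) can be illustrated with a minimal sketch. This is not the router's actual code: the `Endpoint` type, the function names, and the sample server names and addresses are all hypothetical, and the sketch assumes that two entries count as duplicates when their server names match.

```python
from typing import Iterable, List, NamedTuple


class Endpoint(NamedTuple):
    """Hypothetical stand-in for one backend server entry."""
    name: str  # server name as it would appear on an HAProxy `server` line
    ip: str
    port: int


def dedupe_endpoints(endpoints: Iterable[Endpoint]) -> List[Endpoint]:
    """Keep the first endpoint seen for each server name, preserving order."""
    seen = set()
    unique = []
    for ep in endpoints:
        if ep.name in seen:
            # A second entry with the same name would make HAProxy reject
            # the config with "please use distinct names", so skip it.
            continue
        seen.add(ep.name)
        unique.append(ep)
    return unique


def render_server_lines(endpoints: Iterable[Endpoint]) -> List[str]:
    """Render simplified `server` lines with duplicates filtered out."""
    return [f"server {ep.name} {ep.ip}:{ep.port}" for ep in dedupe_endpoints(endpoints)]


if __name__ == "__main__":
    # Illustrative data only: the same pod reported twice, as could happen
    # when two EndpointSlices list the same address.
    sample = [
        Endpoint("pod:example-a:web:8080-tcp", "10.128.0.10", 8080),
        Endpoint("pod:example-a:web:8080-tcp", "10.128.0.10", 8080),
        Endpoint("pod:example-b:web:8080-tcp", "10.128.0.11", 8080),
    ]
    for line in render_server_lines(sample):
        print(line)
```

Keeping the first occurrence and preserving order is a design choice of this sketch; it keeps the rendered config stable across reloads when the duplicate entries are identical.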

Comment 1 Arvind iyengar 2021-05-25 07:47:09 UTC
Verified in the "4.7.0-0.ci.test-2021-05-25-065027-ci-ln-756x4st-latest" CI release. With this payload, the router no longer reports reload errors when the selector is removed from a service mapped to a route:
--------
oc get all   
NAME                  READY   STATUS    RESTARTS   AGE
pod/hello-openshift   1/1     Running   0          16s

NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/hello-openshift   ClusterIP   172.30.164.38   <none>        8080/TCP   12s

NAME                                       HOST/PORT                                                                           PATH   SERVICES          PORT   TERMINATION   WILDCARD
route.route.openshift.io/hello-openshift   hello-openshift-test.apps.ci-ln-756x4st-f76d1.origin-ci-int-gce.dev.openshift.com          hello-openshift   8080                 None

oc get svc hello-openshift -o yaml    
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-05-25T07:36:04Z"
  labels:
    name: hello-openshift
  name: hello-openshift
  namespace: test
  resourceVersion: "31482"
  selfLink: /api/v1/namespaces/test/services/hello-openshift
  uid: 2c54627e-0f33-4577-953e-a2437b56e552
spec:
  clusterIP: 172.30.164.38
  clusterIPs:
  - 172.30.164.38
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    name: hello-openshift
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
  
  
oc patch service hello-openshift --patch '{"spec":{"selector":null}}'
service/hello-openshift patched

oc get svc hello-openshift -o yaml     
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-05-25T07:36:04Z"
  labels:
    name: hello-openshift
  name: hello-openshift
  namespace: test
  resourceVersion: "31932"
  selfLink: /api/v1/namespaces/test/services/hello-openshift
  uid: 2c54627e-0f33-4577-953e-a2437b56e552
spec:
  clusterIP: 172.30.164.38
  clusterIPs:
  - 172.30.164.38
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}


oc -n openshift-ingress logs router-default-5f75648458-gk29z --tail 100
I0525 07:11:35.747242       1 template.go:433] router "msg"="starting router"  "version"="majorFromGit: \nminorFromGit: \ncommitFromGit: e321412e\nversionFromGit: v0.0.0-unknown\ngitTreeState: dirty\nbuildDate: 2021-05-25T06:46:48Z\n"
I0525 07:11:35.749920       1 metrics.go:154] metrics "msg"="router health and metrics port listening on HTTP and HTTPS"  "address"="0.0.0.0:1936"
I0525 07:11:35.756219       1 router.go:191] template "msg"="creating a new template router"  "writeDir"="/var/lib/haproxy"
I0525 07:11:35.756280       1 router.go:270] template "msg"="router will coalesce reloads within an interval of each other"  "interval"="5s"
I0525 07:11:35.756719       1 router.go:332] template "msg"="watching for changes"  "path"="/etc/pki/tls/private"
I0525 07:11:35.756780       1 router.go:262] router "msg"="router is including routes in all namespaces"  
E0525 07:11:35.862446       1 haproxy.go:418] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
I0525 07:11:35.925451       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:11:52.692601       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:11:57.673407       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:02.699695       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:07.675952       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:12.684067       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:19.063320       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:24.067060       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:12:56.715933       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:14:11.594934       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:14:45.977172       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:14:50.975239       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
I0525 07:36:09.840351       1 router.go:579] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
---------
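
Checking a longer log capture by eye is error-prone, so a small script can search saved router logs for the error quoted in the Summary. The following is a sketch only: the function name is hypothetical, and the pattern assumes the error text appears exactly as worded in the Summary of this bug.

```python
import re
from typing import List, Tuple

# Pattern taken from the error message quoted in the bug Summary.
DUPLICATE_SERVER_RE = re.compile(
    r"another server named '([^']+)' was already defined at line (\d+)"
)


def find_duplicate_server_errors(log_text: str) -> List[Tuple[str, int]]:
    """Return (server name, config line number) for each duplicate-server error."""
    return [
        (match.group(1), int(match.group(2)))
        for match in DUPLICATE_SERVER_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage (assumed workflow): pipe the output of `oc logs` into this script.
    errors = find_duplicate_server_errors(sys.stdin.read())
    for name, line_no in errors:
        print(f"duplicate server {name!r} (first defined at line {line_no})")
    sys.exit(1 if errors else 0)
```

An empty result for the log above would be consistent with the verification in this comment; a non-empty result indicates the router is still emitting duplicate server entries.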

Comment 3 Hongan Li 2021-05-31 01:12:14 UTC
The PR was merged into 4.7.0-0.nightly-2021-05-27-150845. Not sure why the robot didn't move this bug to verified,
so moving it to verified manually.

Comment 5 Siddharth Sharma 2021-06-04 18:40:34 UTC
This bug will ship as part of the next z-stream release, 4.7.15, on June 14th, because 4.7.14 was dropped due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1967614

Comment 9 errata-xmlrpc 2021-06-15 09:28:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2286