We have a Prometheus instance in us-east-1 that, when updated, takes anywhere from 20 minutes to 3 hours before each router instance starts including it again. The endpoints for the service are in place (the new pod IP is listed in the endpoints), but only some of the routers pick up the route.
Project: openshift-devops-monitor, route: prometheus
Only some of the router instances return the app - the others return a 503 for a very long time. This is 3.6.126/8
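To see which router instances are affected, something like the following rough sketch could be used. It sends the same request to each router IP with the route's hostname in the Host header and groups the routers by response. The function names are made up for illustration, and it assumes the route answers over plain HTTP on port 80; a TLS-only route would need an HTTPS variant.

```python
import http.client
from typing import Dict, List, Optional


def probe_router(router_ip: str, route_host: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status one router gives for route_host, or None if unreachable.

    Assumption: the route is served over plain HTTP on port 80.
    """
    conn = http.client.HTTPConnection(router_ip, 80, timeout=timeout)
    try:
        # Connect to the router by IP, but ask for the route by hostname.
        conn.request("GET", "/", headers={"Host": route_host})
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def summarize(statuses: Dict[str, Optional[int]]) -> Dict[str, List[str]]:
    """Group router IPs by outcome: 503, no answer at all, or anything else."""
    result: Dict[str, List[str]] = {"serving": [], "missing": [], "unreachable": []}
    for ip, status in sorted(statuses.items()):
        if status is None:
            result["unreachable"].append(ip)
        elif status == 503:
            # The symptom in this report: the router answers 503 for the route.
            result["missing"].append(ip)
        else:
            # Any other status suggests the router forwarded to a backend.
            result["serving"].append(ip)
    return result


# Example usage (router IPs and hostname are placeholders, not from this report):
#   ips = ["10.0.0.1", "10.0.0.2"]
#   print(summarize({ip: probe_router(ip, "prometheus.example.com") for ip in ips}))
```

Running this repeatedly after an update would show how long each router stays in the "missing" group.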
Please see the comment at https://bugzilla.redhat.com/show_bug.cgi?id=1471899#c2 for a way to tune things to work around this problem for the short term.
*** This bug has been marked as a duplicate of bug 1471899 ***