Bug 1263136 - [networking_124]F5 router can not be running
Status: CLOSED WORKSFORME
Product: OpenShift Origin
Classification: Red Hat
Component: Routing
Version: 3.x
Hardware: All
OS: All
Priority: high
Severity: high
Assigned To: Rajat Chopra
QA Contact: zhaozhanqi
Reported: 2015-09-15 04:06 EDT by zhaozhanqi
Modified: 2015-10-20 02:37 EDT
Doc Type: Bug Fix
Last Closed: 2015-10-20 02:37:32 EDT
Type: Bug
Comment 2 Miciah Dashiel Butler Masters 2015-09-15 09:34:22 EDT
I would expect there to be a "kubernetes" service, created by the master, which would explain why the "openshift_default_kubernetes" pool exists.  There should be no route defined for the "kubernetes" service, so the F5 route synchronizer should not configure F5 BIG-IP to use the "openshift_default_kubernetes" pool for anything.

Similarly, I have seen a "router" service, so I would expect an "openshift_default_router" pool to exist.  It is not clear why that pool exists in some cases but not in yours, but like the "openshift_default_kubernetes" pool, it should not be used for anything anyway.

More important is what other pools exist.  If you have created services and routes for which the F5 route synchronizer is failing to create corresponding pools and policy rules, can you provide details on those routes and services, and the corresponding log output (if any) from the router? (`oc get routes -o json` and `oc get services -o json` for the former.)
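To make that comparison easier, here is a minimal Python sketch that takes the output of the two commands above and lists the pool name each service would be expected to get, along with whether any route points at that service. It rests on assumptions: the `openshift_<namespace>_<service>` naming is inferred only from the two pool names mentioned in this comment, and the route-to-service lookup (`spec.to.name`, with a fallback to a top-level `serviceName`) is a guess at the JSON layout that may not match your API version. The function name `expected_pools` is hypothetical.

```python
import json


def expected_pools(services_json, routes_json):
    """Map each expected pool name to True if a route targets its service.

    Both arguments are JSON strings as printed by `oc get services -o json`
    and `oc get routes -o json` (a list object with an "items" array).
    """
    services = json.loads(services_json).get("items", [])
    routes = json.loads(routes_json).get("items", [])

    # Collect (namespace, service name) pairs that some route points at.
    routed = set()
    for route in routes:
        namespace = route.get("metadata", {}).get("namespace", "default")
        # Assumption: the target service is at spec.to.name; older API
        # versions may instead carry a top-level "serviceName" field.
        target = route.get("spec", {}).get("to", {}).get("name")
        if target is None:
            target = route.get("serviceName")
        if target:
            routed.add((namespace, target))

    pools = {}
    for service in services:
        meta = service.get("metadata", {})
        namespace = meta.get("namespace", "default")
        name = meta.get("name", "")
        # Assumption: pool naming inferred from "openshift_default_kubernetes".
        pools["openshift_{}_{}".format(namespace, name)] = (namespace, name) in routed
    return pools
```

Pools that map to `False` (such as the ones for "kubernetes" and "router") should not be referenced by any policy rule. Pools that map to `True` but are missing on the BIG-IP side are the ones worth reporting here.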
Comment 3 zhaozhanqi 2015-09-15 23:12:39 EDT
I can see the other services in the F5 pool list. openshift_default_router is also there, but it keeps being deleted and recreated because the router pod is unhealthy and is repeatedly killed and re-created:

I0916 03:01:45.399543     808 manager.go:1492] pod "router-1-0ljxl_default" container "router" is unhealthy (probe result: failure), it will be killed and re-created.
I0916 03:02:05.420802     808 manager.go:1492] pod "router-1-0ljxl_default" container "router" is unhealthy (probe result: failure), it will be killed and re-created.
I0916 03:02:15.429068     808 manager.go:1492] pod "router-1-0ljxl_default" container "router" is unhealthy (probe result: failure), it will be killed and re-created.
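To show how often the probe fails, a small Python sketch like the following could count these messages per pod and container. The regular expression is written against the exact wording of the three lines above and is only an assumption about the message format. It is not a stable log format, and the function name `count_probe_failures` is hypothetical.

```python
import re
from collections import Counter

# Matches the "is unhealthy (probe result: ...)" message as it appears in
# the log lines quoted above; other versions may word it differently.
PROBE_FAILURE = re.compile(
    r'pod "(?P<pod>[^"]+)" container "(?P<container>[^"]+)" '
    r'is unhealthy \(probe result: (?P<result>\w+)\)'
)


def count_probe_failures(lines):
    """Count unhealthy-probe messages per (pod, container) pair."""
    counts = Counter()
    for line in lines:
        match = PROBE_FAILURE.search(line)
        if match:
            counts[(match.group("pod"), match.group("container"))] += 1
    return counts
```

Feeding it the node log (for example, lines read from a saved log file) gives one counter entry per pod and container, which makes a restart loop like the one above easy to spot.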


Here are some logs from journalctl; I hope they help you analyse the root cause.

8:42 ip-172-18-10-71 systemd-udevd[247]: error: /dev/dm-4: No such device or address
Sep 16 03:08:42 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:42.601160080Z" level=info msg="DELETE /containers/c148af20f47886bf82886e6550d16b39f9d332f23d4b790fbadcd403b54a2ac2?v=1"
Sep 16 03:08:42 ip-172-18-10-71 kernel: EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
Sep 16 03:08:42 ip-172-18-10-71 kernel: SELinux: initialized (dev dm-4, type ext4), uses xattr
Sep 16 03:08:43 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:43.488763474Z" level=info msg="GET /version"
Sep 16 03:08:47 ip-172-18-10-71 systemd-udevd[247]: error: /dev/dm-5: No such device or address
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.491186801Z" level=info msg="GET /version"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.868613664Z" level=info msg="GET /containers/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.871177953Z" level=info msg="GET /containers/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.873113538Z" level=info msg="GET /containers/json?all=1"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.875927957Z" level=info msg="GET /containers/60e2fdb1edca7249b667e7fa0d79cf77acaf4b01ccdeebc5a8d16f9d28547862/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.877493151Z" level=info msg="GET /containers/6700e6912c5b13b4f0b9a87c41fc32416d81b36821657fefac20da127dee517e/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.881071068Z" level=info msg="GET /containers/6700e6912c5b13b4f0b9a87c41fc32416d81b36821657fefac20da127dee517e/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.883292458Z" level=info msg="GET /containers/json?all=1"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.885953010Z" level=info msg="GET /containers/60e2fdb1edca7249b667e7fa0d79cf77acaf4b01ccdeebc5a8d16f9d28547862/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.887532882Z" level=info msg="GET /containers/6700e6912c5b13b4f0b9a87c41fc32416d81b36821657fefac20da127dee517e/json"
Sep 16 03:08:48 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:48.971541160Z" level=info msg="GET /containers/json"
Sep 16 03:08:49 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:49.074441202Z" level=info msg="GET /containers/json"
Sep 16 03:08:49 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:49.177522263Z" level=info msg="GET /containers/json"
Sep 16 03:08:49 ip-172-18-10-71 docker[679]: time="2015-09-16T03:08:49.280566313Z" level=info msg="GET /containers/json"
Comment 4 zhaozhanqi 2015-10-20 02:37:32 EDT
The F5 router can now run, so this issue has been fixed.
Marking this issue as 'verified'.
