Bug 1419086 - Docker pull fails when accessing exposed registry through Load Balancer
Summary: Docker pull fails when accessing exposed registry through Load Balancer
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.3.1
Assignee: Ben Bennett
QA Contact: Meng Bo
Depends On:
Reported: 2017-02-03 15:21 UTC by Vladislav Walek
Modified: 2018-01-08 19:16 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2017-07-26 13:20:16 UTC
Target Upstream Version:

Description Vladislav Walek 2017-02-03 15:21:46 UTC
Description of problem:

The customer recently upgraded the environment from 3.2 to 3.3 and is trying to pull an image from the exposed registry. The registry runs on OpenShift, and its service is securely exposed using a re-encrypt route. Between the client and the OpenShift router there is a load balancer provided by the cluster provider (the customer can't reconfigure it), where some encryption is done.
When they try to pull an image with docker on an external client, using a token from a service account, it fails:

docker pull registry.com:443/namespace/image:latest

This gives a truncated response and then docker fails. If they send the same request with curl, the response is complete:

curl -i -s -k -X $'GET' -H $'User-Agent: docker/1.10.3 go/go1.6.3 git-commit/3999ccb-unsupported kernel/3.10.0-514.2.2.el7.x86_64 os/linux arch/amd64' -H $'Authorization: Bearer <long_token_here>' $'https://registry.com/v2/namespace/image/manifests/latest'

When they bypassed the load balancer and used one of the routers directly, the pull worked normally, even though the service is re-encrypt.
Something done on the load balancer breaks the docker pull (docker push also doesn't work).
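
To compare the load balancer path with the direct router path, the manifest request above can be repeated from a script that reports whether the body received matches the declared Content-Length. This is only a minimal sketch: the function names and the REGISTRY_TOKEN environment variable are hypothetical, certificate verification is disabled to mirror curl -k, and a connection that is reset outright may raise a different exception than the one handled here.

```python
import http.client
import os
import ssl
import sys


def body_is_complete(content_length, received_bytes):
    """Compare the declared Content-Length with the number of bytes received.

    Returns None when the server sent no Content-Length (e.g. chunked encoding).
    """
    if content_length is None:
        return None
    return int(content_length) == received_bytes


def fetch_manifest(host, repository, tag, token, port=443):
    """GET /v2/<repository>/manifests/<tag> and return (status, declared, body)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # mirrors curl -k; do not use in production
    ctx.verify_mode = ssl.CERT_NONE
    conn = http.client.HTTPSConnection(host, port, context=ctx, timeout=30)
    try:
        conn.request(
            "GET",
            f"/v2/{repository}/manifests/{tag}",
            headers={"Authorization": f"Bearer {token}"},
        )
        resp = conn.getresponse()
        declared = resp.getheader("Content-Length")
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            # The peer closed the connection before sending the full body.
            body = exc.partial
        return resp.status, declared, body
    finally:
        conn.close()


# Hypothetical usage: REGISTRY_TOKEN=... python check_manifest.py registry.com namespace/image latest
if __name__ == "__main__" and len(sys.argv) == 4:
    status, declared, body = fetch_manifest(
        sys.argv[1], sys.argv[2], sys.argv[3], os.environ.get("REGISTRY_TOKEN", "")
    )
    print("status:", status)
    print("declared Content-Length:", declared)
    print("bytes received:", len(body))
    print("complete:", body_is_complete(declared, len(body)))
```

Running it once against the load balancer address and once against a router would show whether the body is already cut short at the HTTP level on one of the two paths.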

Version-Release number of selected component (if applicable):

OpenShift 3.3

How reproducible:

Comment 7 Ben Bennett 2017-02-06 15:34:18 UTC
And, if worst comes to worst, we may need to get some wireshark traces from various points to see if one end is tearing down the connection abruptly.

Comment 10 Ruben Romero Montes 2017-02-06 15:49:47 UTC
I have requested more details about the load balancer and some tcpdumps between client -> load balancer and load balancer -> node.

Comment 11 Vladislav Walek 2017-02-08 12:28:04 UTC
Hello, I have a reply from the customer.

The test with encrypted traffic from outside to the LB and unencrypted traffic from the LB to the router works fine too.

Also, here is the response from Noris about the load balancer:
> What kind of balancer is used?

A10 hardware appliance.

> How does it encrypt the traffic, and what steps are done?

TLS from the client-side is terminated on the load balancer.
Towards the servers a new TLS connection is opened.

> Which SNI is used on load balancer?

I don't understand this question.
What exactly do they want to know?
SNI is supported on the Load balancer.
However there's only one certificate in use for the virtual-server.
So SNI is not needed.

Unfortunately, the customer can't provide the tcpdumps, because the load balancer is operated by the provider and they can't decrypt the tcpdumps.
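
Regarding the SNI question in the reply above: one way to make it concrete, without needing packet captures, is to compare the certificate a TLS endpoint presents when the client sends an SNI name with the one it presents when no SNI is sent. The sketch below is only an illustration; the helper names are hypothetical, and whether the load balancer actually omits SNI towards the router is an assumption that this bug never confirmed.

```python
import hashlib
import socket
import ssl
import sys


def fingerprint(der_bytes):
    """Return the SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def fetch_cert_der(host, port=443, sni=None):
    """Connect to host:port and return the presented certificate in DER form.

    When sni is None, no server_name extension is sent in the ClientHello.
    Verification is disabled because only the presented certificate matters here.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            return tls.getpeercert(binary_form=True)


# Hypothetical usage: python sni_check.py <router-address> registry.com
if __name__ == "__main__" and len(sys.argv) == 3:
    target, name = sys.argv[1], sys.argv[2]
    with_sni = fingerprint(fetch_cert_der(target, sni=name))
    without_sni = fingerprint(fetch_cert_der(target, sni=None))
    print("with SNI:   ", with_sni)
    print("without SNI:", without_sni)
    print("same certificate" if with_sni == without_sni else "different certificates")
```

If the two fingerprints differ when run against a router, the router is choosing its certificate based on SNI, which would be one more detail to ask the load balancer provider about.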

Comment 20 Ben Bennett 2017-07-26 13:20:16 UTC
Closing due to insufficient data. Everything points to the external load balancer as the problem, because the pull works when they hit the OpenShift router directly. If more information arises, please re-open.
