1677420 – when we have multiple "tcp-connect port" in the backend, only first tcp-check is validated, the last one never gets validated.


Bug 1677420 - when we have multiple "tcp-connect port" in the backend, only first tcp-check is validated, the last one never gets validated.


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	haproxy
Sub Component:
Version:	7.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Ryan O'Hara
QA Contact:	Brandon Perkins
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-02-14 20:08 UTC by Sangam
Modified:	2019-09-26 14:57 UTC
CC List:	7 users
Fixed In Version:	haproxy-1.5.18-9.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-08-06 13:13:34 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:2287	0	None	None	None	2019-08-06 13:13:35 UTC

Description Sangam 2019-02-14 20:08:08 UTC

Description of problem: when we have multiple "tcp-connect port" in the backend, only first tcp-check is validated, the last one never gets validated.


Version-Release number of selected component (if applicable):
[root@keepalived1 ~]# haproxy -v
HA-Proxy version 1.5.14 2015/07/02
Copyright 2000-2015 Willy Tarreau <willy>


How reproducible:

Backend configuration
======================
backend HA101
    mode http
    balance roundrobin
    option tcp-check
    tcp-check connect port 443
    tcp-check connect port 80    # this one is never checked for.
    server HA_100 192.168.124.159:443 check maxconn 3000
    server HA_102 192.168.124.84:80 check maxconn 3000


Steps to Reproduce:
1. Apply an iptables rule to reject all traffic on port 80 on the real server.
2. All the tcp-checks on port 80 should ideally fail.
3. But HAProxy does not remove the server from the backend farm.

Actual results:
HAProxy does not perform a health check on port 80 (for the backend block above), and hence it never removes the server from the backend farm.

Expected results:
HAProxy should perform the health check and remove the specific server from the backend farm.

Additional info:
I see it's fixed upstream: http://git.haproxy.org/?p=haproxy.git;a=commit;h=248f1173f272b10bf7ed2aa7061bfafd34160f70

I tried searching for the commit in our engineering tree but I was unable to find it.

Comment 1 Ryan O'Hara 2019-02-14 21:53:10 UTC

I'm confused -- why do you want to use 'tcp-check connect' here? The way your config is set up, the server lines are already performing a health check to each port. Your current config is doing individual health checks on each server. Note that, based on your configuration, server "HA_100" is only getting traffic on port 442 and "HA_102" is only getting traffic on port 80. This doesn't make sense. Do you want both backend servers to handle both SSL (443) and non-SSL (80)? If so, your configuration is wrong.

I'm assuming that what you want is both servers, HA_100 and HA_102, to handle both port 80 and 443. Note that you did not show the frontend, so I do not know which port you are listening on, if you are doing SSL termination, etc. Anyway, if you want both servers to get a tcp-connect health check on both ports, you want to remove the ports from the server lines. Example:

server HA_100 192.168.124.159 check maxconn 3000
server HA_102 192.168.124.84 check maxconn 3000

Also, you might need to add 'ssl' to the end of 'tcp-check connect port 443'. See here:

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4.2-tcp-check%20connect
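Putting the two suggestions in this comment together, the original backend might look like the sketch below. It is an untested reading of the advice, not a verified configuration: the server lines drop their ports, and 'ssl' is added to the port 443 rule on the assumption that the real servers speak SSL on that port.

```haproxy
backend HA101
    mode http
    balance roundrobin
    option tcp-check
    # Assumption: port 443 on the real servers expects an SSL handshake.
    tcp-check connect port 443 ssl
    tcp-check connect port 80
    # Ports removed from the server lines, as suggested above, so the
    # tcp-check rules decide which ports are probed on each server.
    server HA_100 192.168.124.159 check maxconn 3000
    server HA_102 192.168.124.84 check maxconn 3000
```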

Finally, I am not sure that the patch you referenced is a fix for the issue you're seeing. In the patch comments, it specifically says 'single "tcp-check connect" rule', which is definitely not the case here. Also, upstream typically will tag something to be backported to 1.5 as needed, but this was only tagged for 1.8. I can ask upstream if this is a problem in 1.5, but I think the first step is to clean up the configuration and try to make it work.

Comment 2 Ryan O'Hara 2019-02-14 21:59:59 UTC

(In reply to Ryan O'Hara from comment #1)
> Note that, based on your configuration, server "HA_100" is only
> getting traffic on port 442 ...

Sorry, I meant 443 of course.

Comment 3 Ryan O'Hara 2019-02-18 16:44:51 UTC

I talked with upstream and we do not think that the bug referenced above is causing this. We also agree that the configuration is incorrect, but before I can provide advice on how to correct it I would have to know exactly what is trying to be achieved.

Also, the bug report stated that this is haproxy-1.5.14, yet this bug is filed against RHEL7.6. Note that RHEL7.6 contains haproxy-1.5.18.

Comment 8 Daniel Arena 2019-02-25 17:25:55 UTC

Hi Ryan,

I am the actual reporter of this bug. I think there is some confusion added since it is being relayed through Sangam. The example configuration is a bit different than mine and adds confusion by mentioning ports 80 and 443 in the backend servers.

My configuration example looks like this:

backend test1
    mode tcp
    option tcp-check
    tcp-check connect port 4567
    tcp-check connect port 6789
    server  server1 172.19.2.50:4567 check inter 500ms
    server  server2 172.19.2.51:4567 check inter 500ms
    server  server3 172.19.2.52:4567 check inter 500ms

and the purpose of this is to mark any of the servers "DOWN" unless both ports 4567 and 6789 are accepting connections. The reason for this is that the service running on port 4567 on all of the servers relies on the service on port 6789 on the same server also being up, so I do not want to forward connections to port 4567 on a server when port 6789 is not accepting connections even if port 4567 still is.

I also specifically mentioned in the support case how the commit I linked to may be confusing at first since they mention a configuration with only 1 tcp-check connect line, but if you read the description, it says:

"The main reason for this issue is that the piece of code which validates that we're not at the end of the chained list (of rules) prevents executing the validation of the establishment of the TCP connection.
Since validation is not executed, the rule is terminated and the report says no errors were encountered, hence the server is UP all the time."

As I confirmed by testing, the description explains how the bug is not actually specific to "a single tcp-check connect", but causes the LAST tcp-check connect to not be validated. Of course when you have a list of length 1, that 1 in the list is also the last one. If for example I have 5 tcp-check connect lines, like

backend test1
    mode tcp
    option tcp-check
    tcp-check connect port 4567
    tcp-check connect port 4568
    tcp-check connect port 4569
    tcp-check connect port 4570
    tcp-check connect port 4571
    server  server1 172.19.2.50:4567 check inter 500ms
    server  server2 172.19.2.51:4567 check inter 500ms
    server  server3 172.19.2.52:4567 check inter 500ms
 

a server will be marked DOWN if any of the ports 4567 through 4570 go down, but not port 4571.

If you know of how to make this work without using tcp-check connect, then I am willing to try it. I do not see how removing the port from the backend server line will work though. How does haproxy know the ports to check? They are not being specified anywhere else.

I am also indeed using the latest version, haproxy-1.5.18-8.el7.x86_64

Thanks,
Dan

Comment 9 Ryan O'Hara 2019-02-25 18:33:34 UTC

Daniel, I am aware of the bug and am working on a fix. I mentioned this in a post earlier but it is marked private so perhaps you can't see it? In short, the last 'tcp-check connect' rule will be run but the results of that check will have no effect. The simple workaround is to put in a dummy 'tcp-check connect' as the last rule.
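A minimal sketch of that workaround, applied to the test1 backend from comment 8. Which port the dummy rule connects to is an assumption here (it simply repeats port 4567); the comment above does not specify one.

```haproxy
backend test1
    mode tcp
    option tcp-check
    tcp-check connect port 4567
    tcp-check connect port 6789
    # Dummy last rule (assumed port: repeats 4567). It exists only so that
    # the rule for port 6789 is no longer the last one in the list.
    tcp-check connect port 4567
    server  server1 172.19.2.50:4567 check inter 500ms
    server  server2 172.19.2.51:4567 check inter 500ms
    server  server3 172.19.2.52:4567 check inter 500ms
```

With the dummy rule in place, the result of the port 6789 check should count toward the server state, since only the final rule's result is ignored. The extra line can be dropped once a fixed package is installed.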

On a related note, I am not entirely sure what you are trying to achieve with multiple 'tcp-check connect' rules without any send/expect logic. For example, perhaps you want to redirect HTTP to HTTPS on the frontend? Are you wanting SSL traffic to be SSL terminated or SSL passed all the way to the backend servers?

I am working on a backport of the patch.

Comment 10 Daniel Arena 2019-02-25 19:58:25 UTC

Thanks Ryan. What I am trying to achieve is a tcp connect check, without the send/expect, to multiple ports on every server, the same way the default check works. It makes sure the port is open/accepting connections, but does not fully establish a connection to send anything. The service I am load balancing is not HTTP or HTTPS, it is a proprietary encrypted protocol and currently does not have a way to get a health check reply.

I had already thought of the "dummy tcp-check connect" workaround and so have been using that for now, but still wanted to report the bug. I did not mention the workaround because I still wanted to get the issue fixed. I don't like having the dummy check line there. Hope you understand... :)

and yeah, I don't think I can see private comments. The last comment I can see that you made is comment #3 and then my comment is #8.

Thanks,
Dan

Comment 11 Ryan O'Hara 2019-02-25 22:12:58 UTC

(In reply to Daniel Arena from comment #10)
> Thanks Ryan. What I am trying to achieve is a tcp connect check, without the
> send/expect, to multiple ports on every server, the same way the default
> check works. It makes sure the port is open/accepting connections, but does
> not fully establish a connection to send anything. The service I am load
> balancing is not HTTP or HTTPS, it is a proprietary encrypted protocol and
> currently does not have a way to get a health check reply.

OK. That makes more sense. Typically, as you seem to understand, there are better ways to do this if we are talking about HTTP and HTTPS being load balanced.

> I had already thought of the "dummy tcp-check connect" workaround and so
> have been using that for now, but still wanted to report the bug. I did not
> mention the workaround because I still wanted to get the issue fixed. I
> don't like having the dummy check line there. Hope you understand... :)

I absolutely understand. The workaround is there only until we can get the fix in place. Thanks for your patience.

> and yeah, I don't think I can see private comments. The last comment I can
> see that you made is comment #3 and then my comment is #8.

Got it. Stay tuned.

Comment 21 errata-xmlrpc 2019-08-06 13:13:34 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2287
