Bug 119304
Summary: | mysql query hangs at _dl_sysinfo_int80 () from /lib/ld-linux.so.2 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Alois Treindl <alois> |
Component: | mysql | Assignee: | Tom Lane <tgl> |
Status: | CLOSED NOTABUG | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | hhorak |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-03-30 14:34:09 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Alois Treindl
2004-03-29 09:14:20 UTC
Tom Lane:
Hm, does the connection to the remote database pass through a firewall? Some firewalls will drop connections after a certain period of inactivity. If you can't adjust the firewall's behavior, the only solution may be not to assume that you can hold the remote connection open indefinitely.

Alois Treindl:
Yes, the connection passes through a firewall, actually ipvsadm in a load balancer. But before we moved the application to RHEL 3, it had been running for several years on HPUX and used exactly the same connection to the remote mysql database. We never had this problem there. By the way, the remote mysql server is a 3.23.37 release, while the local mysql client libraries are from the 4.0 tree. I am not sure whether that has anything to do with the problem, and I understand that Red Hat does not support mysql 4.x for RHEL 3.

Tom Lane:
Indeed, there's only a limited amount that I can do with mysql 4 problem reports. (I do hope they fix the licensing problem soon, because it's pretty silly that we can't ship mysql 4, but I'm just an engineer...) It seems like you have quite a lot of variables here: you have a platform move and possibly a mysql version move (were you running 4.0.17 or something older when you were on HPUX?). This makes it really hard to tell whether the issue is an RHEL bug or a mysql bug. If it is in mysql 4, there's not much I can do except send you to MySQL AB. If it is RHEL, I will have to reassign it to a kernel or glibc hacker, but I'm not going to bother those guys until I have enough evidence to at least tell whose problem it's likely to be. In short, we need more data. One thing I am wondering about is whether the command that's supposed to be sent to the remote database is actually getting flushed out completely; obviously, if it doesn't get sent in full, the remote server won't respond. One way to look into that is to watch the remote connection with a packet sniffer such as tcpdump or ethereal. Are you in a position to do that?
Alois Treindl:
Yes, I can watch the network. I have never used a sniffer and would appreciate some advice. My setup is like this:

Client machine A ------ LVS load balancer + NAT ------ mysql host B

The mysql server listens on host B, port 3306, to which I send packets from A from some random port; A then receives reply packets from B, port 3306. Where should I listen, and how do I set up the sniffer properly? Host B receives a lot of other mysql traffic from within the load-balanced cluster, but that is easy to differentiate because it comes from different IP addresses. The packets from A will appear at B under the IP address of the load balancer, which sends no other requests. There are several simultaneous users on machine A making occasional mysql queries, but they all log their traffic in log files I have set up, so I should be able to sort out later, via the timestamps, what came from whom.

PS: before the move from HPUX to RHEL 3, the mysql client was 3.23.37 too. As we were moving to RHEL, I thought it a good idea to update, even at the risk of losing Red Hat support. The load balancer runs Red Hat 7.0 and the mysql host Red Hat 7.1; I assume this is not relevant to the problem.

Tom Lane:
I think you'd be best off sniffing the single connection at the client machine, so as to minimize the amount of data you have to look through. I'm not much of a tcpdump expert myself, but I know you can filter by port number to watch just one connection. It would go something like this: start the client; determine what port is being used for the remote database connection (lsof should work if nothing else); start tcpdump; do something (to have a comparison point for what a working ping looks like); wait 20 minutes; do something again; repeat until failure; compare traces. BTW, if you use tcpdump, be aware that the default output format omits most of the non-header content of packets; you'll need to select options that show you the whole content of each packet.
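A concrete capture session along the lines Tom describes might look like this (the interface name eth0 is an assumption; substitute the interface and host names from your own setup):

```shell
# 1. Find which local port the client is using for the remote connection.
#    lsof output will show something like "TCP A:45678->B:3306 (ESTABLISHED)".
lsof -i TCP:3306

# 2. Capture full packet contents on that connection.
#    -s 0 : capture entire packets, not just the default header snaplen
#    -X   : print each packet's payload in hex and ASCII
#    "port 3306" restricts the trace to the mysql connection.
tcpdump -i eth0 -s 0 -X port 3306
```

Running the same capture on both the client and the server, as the reporter does below, makes it possible to see on which side of the firewall the packets disappear.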
Not sure what ethereal's defaults are like; I've hardly used it.

Alois Treindl:
I have put a tcpdump sniffer on both sides, at the client and at the server. After running the sniffer for a while, I found this: if the client sends no packets for some time (a few minutes) and then sends mysql packets again (via mysql_ping), those packets never arrive at the mysql server. The client keeps trying but never gets a reply. Who is responsible for terminating the connection, I do not know yet. Because it used to work before, with the HPUX client and the same firewall/load balancer, I assume it is not the load balancer. I have decided to program a workaround into our application: the client terminates each mysql connection voluntarily after making a short sequence of queries, and opens a new connection when needed; it does not try to keep a permanent connection. That way, the problem is solved for me. The bugzilla case can be closed, as solved by workaround.

Tom Lane:
It sure sounds like the firewall is at fault. That's an unusually short timeout, though. Is it possible that the firewall was configured not to drop connections from your old client machine, and that that preference didn't get changed over to point to your new client's IP address? This is a bizarre theory, but I have no others at the moment. Anyway, I'm going to close this bug, since it's evidently not mysql's or RHEL's fault. Feel free to reopen it if you get new evidence pointing back at us.

Alois Treindl:
The firewall rules were not touched, and there is none there that should create different behaviour. Actually, the new client inherited the old HPUX client's IP address. Anyway, my workaround works. I implemented a last_time_talked flag which remembers when the mysql connection was last used. If a new request sees that last_time_talked is too far in the past, it closes the connection and opens a new one. Apparently it can close the connection without sending a single packet, so there is no danger of a hang.
That way, we avoid using old connections which might have gone stale.
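The last_time_talked workaround described above could be sketched as follows. This is a minimal illustration, not the reporter's actual code: Python is used for clarity, and the class name, the injected connect callable, and the 300-second idle limit are all assumptions.

```python
import time

IDLE_LIMIT = 300  # seconds; assumed threshold after which the link may be stale


class ReconnectingConnection:
    """Wraps a connection factory and reopens the link when it may have gone stale."""

    def __init__(self, connect, idle_limit=IDLE_LIMIT, clock=time.monotonic):
        self._connect = connect        # callable returning a fresh connection
        self._idle_limit = idle_limit
        self._clock = clock            # injectable for testing
        self._conn = None
        self._last_time_talked = None  # when the connection was last used

    def get(self):
        """Return a usable connection, reconnecting if the old one sat idle too long."""
        now = self._clock()
        stale = (self._conn is not None
                 and now - self._last_time_talked > self._idle_limit)
        if stale:
            # Close locally without talking to the server first: trying to
            # use a dead connection is exactly what caused the hang.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
        self._last_time_talked = now
        return self._conn
```

The key design point, matching the reporter's observation, is that closing a possibly-dead connection must not require a round trip to the server; the wrapper simply discards it and dials a new one.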