Bug 1104957
| Summary: | unable to turn off max_connect_errors | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Fabio Massimo Di Nitto <fdinitto> |
| Component: | mariadb | Assignee: | Honza Horak <hhorak> |
| Status: | CLOSED NOTABUG | QA Contact: | qe-baseos-daemons |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.0 | CC: | databases-maint, fdinitto |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-06-11 11:48:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1083890 | ||
|
Description
Fabio Massimo Di Nitto
2014-06-05 06:03:38 UTC
Well, the way upstream works around it is the following (http://lists.mysql.com/mysql/161223):

To disable for practical purposes, set it to 2^32-1 = 4294967295. On top, once a day run FLUSH HOSTS

However, I'll try to ask mariadb upstream whether they would accept a new-option solution.

I've tried to simulate the described behaviour by randomly misusing the socket, but I'm still not able to achieve this. Can you please provide a reproducer or a code snippet which you use for performing the regular checks?

(In reply to Honza Horak from comment #2)
> Well, the way upstream works around it is the following (http://lists.mysql.com/mysql/161223):
> To disable for practical purposes, set it to 2^32-1 = 4294967295. On top, once a day run FLUSH HOSTS

Right, I set it to 10000000 or something. I am having issues setting up a custom cron job just for FLUSH HOSTS, though.

Would you consider shipping a "disabled" cron job by default that checks /etc/sysconfig/mariadb for CLEAN_HOSTS=yes and sets proper values automatically?

> However, I'll try to ask mariadb upstream whether they would accept a new-option solution.
>
> I've tried to simulate the described behaviour by randomly misusing the socket, but I'm still not able to achieve this. Can you please provide a reproducer or a code snippet which you use for performing the regular checks?

I am using ha-proxy to check the mariadb network socket every second.

You can take a look at the ha-proxy config here:
http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-lb

and mariadb:
http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-db

Don't get too scared by the cluster setup :) the db is running on only one machine.

The lb nodes will poll the node running mariadb every second.

If at all possible I'd prefer not to rely on timer changes, since they don't solve the problem permanently but just expand the window.
(In reply to Fabio Massimo Di Nitto from comment #3)
> Right, I set it to 10000000 or something. I am having issues setting up a custom cron job just for FLUSH HOSTS, though.
>
> Would you consider shipping a "disabled" cron job by default that checks /etc/sysconfig/mariadb for CLEAN_HOSTS=yes and sets proper values automatically?

This seems to me like too much over-engineering. I'd be more willing to either add a new option or, better, change the behavior so that the error check is not performed when max_connect_errors is set to 0.

> I am using ha-proxy to check the mariadb network socket every second.
>
> You can take a look at the ha-proxy config here:
> http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-lb
>
> and mariadb:
> http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-db
>
> Don't get too scared by the cluster setup :) the db is running on only one machine.
>
> The lb nodes will poll the node running mariadb every second.
>
> If at all possible I'd prefer not to rely on timer changes, since they don't solve the problem permanently but just expand the window.

Understood. I'll take a closer look at it and will let you know. I guess, since there is a workaround (not very nice, but it should work), this is not something that would block you totally, right?
(In reply to Honza Horak from comment #4)
> This seems to me like too much over-engineering. I'd be more willing to either add a new option or, better, change the behavior so that the error check is not performed when max_connect_errors is set to 0.

Good point :) whatever works for you as long as we achieve the goal.

> Understood. I'll take a closer look at it and will let you know. I guess, since there is a workaround (not very nice, but it should work), this is not something that would block you totally, right?

It can potentially block the Openstack 5 release because it requires ha-proxy to monitor the db. I'll check with the OSP guys if we can deploy a workaround while we look into a fix.

Thanks!

(In reply to Fabio Massimo Di Nitto from comment #5)
> It can potentially block the Openstack 5 release because it requires ha-proxy to monitor the db. I'll check with the OSP guys if we can deploy a workaround while we look into a fix.

Well, setting the maximum possible value (4294967295 for 32-bit; a 64-bit arch should accept even bigger) means the window would be over 1 year even if you permanently have 100 connections per second. That should not be a blocker IMHO, but I may be missing some consequences. I will definitely try to solve it properly, though.

Steps to reproduce blocking:
1. Set `max_connect_errors=2` in /etc/my.cnf
2. Configure mariadb so it accepts connections from a different machine
3. On another machine run `for i in {1..3} ; do echo "" | telnet mariadbserver 3306 ; done`
4. Run `mysql -h mariadbserver` on that same machine
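
The blocking behaviour in the steps above, and the window estimate from the earlier comment, can be illustrated with a small Python sketch. This is a toy model, not MariaDB's actual code: the class `HostErrorTracker`, the helper `seconds_until_blocked`, and the host name `lb-node` are hypothetical, and it assumes a host is refused once its count of consecutive interrupted handshakes exceeds `max_connect_errors`.

```python
from collections import defaultdict


class HostErrorTracker:
    """Toy model of per-host connect-error blocking (not MariaDB's real code)."""

    def __init__(self, max_connect_errors):
        self.max_connect_errors = max_connect_errors
        # Consecutive interrupted handshakes seen per client host.
        self.errors = defaultdict(int)

    def is_blocked(self, host):
        # Assumption: the host is refused once the count exceeds the limit.
        return self.errors[host] > self.max_connect_errors

    def interrupted_handshake(self, host):
        # E.g. a probe that opens the socket and closes it without logging in.
        self.errors[host] += 1

    def successful_connect(self, host):
        if self.is_blocked(host):
            return False
        # A completed connection resets the counter for that host.
        self.errors[host] = 0
        return True

    def flush_hosts(self):
        # Stands in for FLUSH HOSTS: forget all per-host counters.
        self.errors.clear()


def seconds_until_blocked(max_connect_errors, bad_connects_per_second):
    """Time for a steady stream of failed probes to exceed the limit."""
    return (max_connect_errors + 1) / bad_connects_per_second


tracker = HostErrorTracker(max_connect_errors=2)
for _ in range(3):  # the three telnet probes from step 3
    tracker.interrupted_handshake("lb-node")
print(tracker.is_blocked("lb-node"))          # True
tracker.flush_hosts()
print(tracker.successful_connect("lb-node"))  # True

days = seconds_until_blocked(4294967295, 100) / 86400
print(round(days))  # 497
```

Under these assumptions, 100 failed probes per second against the 32-bit maximum give a window of roughly 497 days, which matches the "over a year" estimate, while one probe per second against a limit of 2 blocks the host within seconds.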
Upstream contacted:
https://lists.launchpad.net/maria-developers/msg07355.html

There was quite an interesting comment on the upstream mailing list: "using '--skip-name-resolve' should bypass the max_connect_errors mechanism altogether."

Fabio, could you please provide your feedback on whether this would be usable in your use case?

(In reply to Honza Horak from comment #9)
> There was quite an interesting comment on the upstream mailing list: "using '--skip-name-resolve' should bypass the max_connect_errors mechanism altogether."
>
> Fabio, could you please provide your feedback on whether this would be usable in your use case?

Hi Honza,

as long as it's a configuration option I see no problem with it. Do you know if it can be added to my.cfg or does it need to be on the command line? Either should work, but I generally prefer to keep it all in .cfg if possible (for consistency, and to avoid having to remember what comes from where).

Thanks a lot for all your help btw. It's very much appreciated.

Fabio

(In reply to Fabio Massimo Di Nitto from comment #10)
> as long as it's a configuration option I see no problem with it. Do you know if it can be added to my.cfg or does it need to be on the command line? Either should work, but I generally prefer to keep it all in .cfg if possible (for consistency, and to avoid having to remember what comes from where).

It can be used both ways, so as a configuration option just add `skip-name-resolve=1` to your my.cnf and you should be set.

Doc for this option is here:
http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_skip-name-resolve

(In reply to Fabio Massimo Di Nitto from comment #0)
> The connection check will simply establish that the network socket is listening but will not do a full / complete mysql login/check.

Well, I'd like to go back to the original issue, because the way you check if a server is up does not seem correct.
Actually, we do something similar in the SysV init script/systemd unit file after the daemon is started, so the script returns no sooner than the daemon is really able to accept connections.

The way we do it is to run `mysqladmin ping` and then take either success or a failure with an 'Access denied for user' error as a sign that the server *is* ready:
http://pkgs.fedoraproject.org/cgit/mariadb.git/tree/mariadb-wait-ready#n39

You may consider using something similar to check the vitality of a server, instead of the current approach.

Also, in case you find any of the solutions above good enough for you, please close this request.

(In reply to Honza Horak from comment #11)
> It can be used both ways, so as a configuration option just add `skip-name-resolve=1` to your my.cnf and you should be set.
>
> Doc for this option is here:
> http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_skip-name-resolve
>
> Well, I'd like to go back to the original issue, because the way you check if a server is up does not seem correct.

This is not something I decided or implemented :) I am a mortal user of the whole thing ;)

> Actually, we do something similar in the SysV init script/systemd unit file after the daemon is started, so the script returns no sooner than the daemon is really able to accept connections.
>
> The way we do it is to run `mysqladmin ping` and then take either success or a failure with an 'Access denied for user' error as a sign that the server *is* ready:
> http://pkgs.fedoraproject.org/cgit/mariadb.git/tree/mariadb-wait-ready#n39
>
> You may consider using something similar to check the vitality of a server, instead of the current approach.
>
> Also, in case you find any of the solutions above good enough for you, please close this request.

It's not an option to change the way we check, but it's not a problem either, because clients would happily reconnect to the server when it dies.

I am closing this bug, but it might be worth making it a documentation note for the future.

(In reply to Fabio Massimo Di Nitto from comment #12)
> It's not an option to change the way we check, but it's not a problem either, because clients would happily reconnect to the server when it dies.
>
> I am closing this bug, but it might be worth making it a documentation note for the future.

So, which way have you chosen to go with in the end? Setting skip-name-resolve?

(In reply to Honza Horak from comment #13)
> So, which way have you chosen to go with in the end? Setting skip-name-resolve?

    [mysqld]
    skip-name-resolve=1

Yes, even though the user table needs to be adjusted for the changes it involves, it's the fastest solution we have at the moment.

I would still like to see a "max_connect_errors = 0" and retain the user ACL via host entries.
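
For reference, the `mysqladmin ping` check suggested earlier in this thread could be sketched in Python roughly as follows. This is only an illustration of the idea (treat success, or a failure containing 'Access denied for user', as "server is up"), not the packaged mariadb-wait-ready script; the function names `looks_ready` and `server_is_ready` and the timeout value are hypothetical.

```python
import subprocess


def looks_ready(returncode, output):
    """Classify a `mysqladmin ping` result.

    Success, or a failure mentioning 'Access denied for user', both mean
    the server answered the connection attempt.
    """
    return returncode == 0 or "Access denied for user" in output


def server_is_ready(host, timeout=5.0):
    """Run `mysqladmin ping` against host and classify the outcome."""
    try:
        proc = subprocess.run(
            ["mysqladmin", f"--host={host}", "ping"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # mysqladmin is missing or the server did not answer in time.
        return False
    return looks_ready(proc.returncode, proc.stdout + proc.stderr)
```

A monitoring script could call `server_is_ready("mariadbserver")` once per polling interval instead of opening and dropping a bare TCP connection.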