Description of problem:
When deploying 50 MySQL instances concurrently, some pods fail to reach the Running state. Their status is CrashLoopBackOff.

Version-Release number of selected component (if applicable):
[root@openshift-146 deploy_test]# openshift version
openshift v3.1.0.4-3-ga6353c7
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

1 master, 2 nodes; every VM has 8G memory and 30G disk.

How reproducible:
Sometimes

Steps to Reproduce:
1. Create 50 projects and deploy a MySQL pod in every project.
2. Check all pods.

Actual results:
2. Some pods fail with the following error.
[root@openshift-137 ~]# docker logs 158a16505dde
Starting local mysqld server ...
Waiting for MySQL to start ...
151113 5:06:56 [Note] /opt/rh/mysql55/root/usr/libexec/mysqld (mysqld 5.5.45) starting as process 9 ...
151113 5:06:56 [Warning] One can only use the --user switch if running as root
151113 5:06:56 [Note] Plugin 'FEDERATED' is disabled.
151113 5:06:56 InnoDB: The InnoDB memory heap is disabled
151113 5:06:56 InnoDB: Mutexes and rw_locks use GCC atomic builtins
151113 5:06:56 InnoDB: Compressed tables use zlib 1.2.7
151113 5:06:56 InnoDB: Using Linux native AIO
151113 5:06:56 InnoDB: Warning: io_setup() failed with EAGAIN. Will make 5 attempts before giving up.
InnoDB: Warning: io_setup() attempt 1 failed.
InnoDB: Warning: io_setup() attempt 2 failed.
Waiting for MySQL to start ...
InnoDB: Warning: io_setup() attempt 3 failed.
InnoDB: Warning: io_setup() attempt 4 failed.
Waiting for MySQL to start ...
InnoDB: Warning: io_setup() attempt 5 failed.
151113 5:06:59 InnoDB: Error: io_setup() failed with EAGAIN after 5 attempts.
InnoDB: You can disable Linux Native AIO by setting innodb_use_native_aio = 0 in my.cnf
151113 5:06:59 InnoDB: Fatal error: cannot initialize AIO sub-system
151113 5:06:59 [ERROR] Plugin 'InnoDB' init function returned error.
151113 5:06:59 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
151113 5:06:59 [ERROR] Unknown/unsupported storage engine: InnoDB
151113 5:06:59 [ERROR] Aborting
151113 5:06:59 [Note] /opt/rh/mysql55/root/usr/libexec/mysqld: Shutdown complete

Expected results:
2. All pods are running.

Additional info:
Disk usage was about 62%.
Are these 50 independent pods, or is this a master/slave scenario? Since you say 50 projects, I assume they are 50 independent ones.

Which volume type are you using for the pods?
DeShuai, there appear to be two possible solutions. Can you please try both of them, independently?

1) Increase the fs.aio-max-nr kernel limit on the nodes. See the answer here: https://unix.stackexchange.com/questions/116520/mysql-server-wont-install-to-a-new-os-debian-ubuntu

2) Set the MYSQL_AIO environment variable to 0 for all MySQL pods that you create. This should disable native AIO, so the limit should not be hit.
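Before trying option 1, it may help to see how close a node is to the limit. The following is a minimal Python sketch, not part of any OpenShift tooling. It assumes the standard Linux procfs entries /proc/sys/fs/aio-nr (AIO events currently reserved system-wide) and /proc/sys/fs/aio-max-nr (the limit). The function names are illustrative, and the result is only a rough indicator of remaining capacity.

```python
from pathlib import Path

# Standard Linux procfs entries for system-wide AIO accounting.
AIO_NR = Path("/proc/sys/fs/aio-nr")          # events currently reserved
AIO_MAX_NR = Path("/proc/sys/fs/aio-max-nr")  # system-wide limit


def aio_headroom(aio_nr_text: str, aio_max_nr_text: str) -> int:
    """Return the difference between the AIO limit and the current usage.

    Both arguments are the raw text contents of the procfs files.
    """
    return int(aio_max_nr_text.strip()) - int(aio_nr_text.strip())


def node_aio_headroom():
    """Read the live values on this node.

    Returns None when the procfs entries are absent (for example, on a
    non-Linux host).
    """
    if not (AIO_NR.exists() and AIO_MAX_NR.exists()):
        return None
    return aio_headroom(AIO_NR.read_text(), AIO_MAX_NR.read_text())


if __name__ == "__main__":
    print("AIO headroom on this node:", node_aio_headroom())
```

A headroom near zero on a node that hosts many MySQL pods would be consistent with the io_setup() EAGAIN failures in the reported log.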
(In reply to Ben Parees from comment #1)
> Is this 50 independent pods, or is this a master/slave scenario? Since you
> say 50 projects I assume it's 50 independent ones.
>
> which volume type are you using for the pods?

The 50 pods are independent. All pods are created from this template: https://github.com/openshift/origin/blob/master/examples/db-templates/mysql-ephemeral-template.json
(In reply to Martin Nagy from comment #2)
> DeShuai, there appear to be two possible solutions, can you please try both
> of them, independently?
>
> 1) Increase fs.aio-max-nr kernel limit on the nodes. See the answer here:
> https://unix.stackexchange.com/questions/116520/mysql-server-wont-install-to-a-new-os-debian-ubuntu
>
> 2) Set the MYSQL_AIO environment variable to 0 to all MySQL pods that you
> create. This should disable native AIO and the limit should not be hit.

After trying each of the two methods, the pods run fine. Thanks.
Now, what to do with this... I doubt raising the limit is something we'd want to do globally on the kernel side for systems running OpenShift. My thought is that we should convert this to a documentation bugzilla and put a note in the OpenShift docs.
I agree. Should we move this bug to the Documentation component?
Martin, please document this in the OpenShift MySQL image docs. Thanks.
Martin, did you want the docs team to add this note? If yes, I will assign it to one of the writers.
(In reply to Vikram Goyal from comment #8)
> Martin, did you want the docs team to add this note?
> If yes, I will assign it to one of the writers.

Martin and I discussed this today. The plan is to add to the MySQL images section an explanation of why MySQL fails with the above-mentioned error messages, as well as documentation on the two resolution paths:

- increase the kernel resource limit on the node
- set the env var

The latter will mention that editing /etc/my.cnf directly is not the OpenShift way, and xref the "Managing Environment Variables" section of the Dev Guide.

Martin, additions/corrections welcome. Assigning this to myself.
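The env var path can also be applied to a template before the objects are created, so that no file inside the container is edited. The following is a minimal Python sketch under stated assumptions: the template is assumed to follow the usual OpenShift layout (objects, then DeploymentConfig, then spec.template.spec.containers[].env), the helper name with_env is hypothetical, and MYSQL_AIO is the variable named in comment #2.

```python
import copy
import json


def with_env(template: dict, name: str, value: str) -> dict:
    """Return a copy of a template with name=value set on every container
    of every DeploymentConfig object. The input is left unmodified."""
    result = copy.deepcopy(template)
    for obj in result.get("objects", []):
        if obj.get("kind") != "DeploymentConfig":
            continue
        containers = obj["spec"]["template"]["spec"]["containers"]
        for container in containers:
            env = container.setdefault("env", [])
            # Replace an existing entry rather than appending a duplicate.
            env[:] = [e for e in env if e.get("name") != name]
            env.append({"name": name, "value": value})
    return result


if __name__ == "__main__":
    # Tiny stand-in for a real template such as mysql-ephemeral-template.json.
    demo = {
        "kind": "Template",
        "objects": [{
            "kind": "DeploymentConfig",
            "spec": {"template": {"spec": {"containers": [
                {"name": "mysql",
                 "env": [{"name": "MYSQL_USER", "value": "${MYSQL_USER}"}]}
            ]}}},
        }],
    }
    print(json.dumps(with_env(demo, "MYSQL_AIO", "0"), indent=2))
```

Returning a modified copy keeps the original template reusable for deployments that should keep native AIO enabled.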
PR: https://github.com/openshift/openshift-docs/pull/1701
Commented in the pull, looks great!
Hi DeShuai, WDYT?
The doc PR LGTM. Once the PR is merged, I will verify this bug.
Hi DeShuai, PR merged: https://github.com/openshift/openshift-docs/pull/1701#event-593982568 WDYT?
Thanks DeShuai. Moving status to RELEASE_PENDING.
This is now live: https://access.redhat.com/documentation/en/openshift-enterprise/version-3.1/using-images/#troubleshooting Closing.