Bug 1281733 - Not all pods get started when deploying 50 mysql concurrently
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Assigned To: Thien-Thi Nguyen
Vikram Goyal
Vikram Goyal
Depends On:
Blocks:
Reported: 2015-11-13 05:41 EST by DeShuai Ma
Modified: 2016-08-04 21:27 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-11 09:39:47 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description DeShuai Ma 2015-11-13 05:41:59 EST
Description of problem:
When 50 MySQL instances are deployed concurrently, some pods never reach the Running state. Their status is CrashLoopBackOff.

Version-Release number of selected component (if applicable):
[root@openshift-146 deploy_test]# openshift version
openshift v3.1.0.4-3-ga6353c7
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

Cluster: 1 master, 2 nodes.
Every VM has 8 GB of memory and a 30 GB disk.

How reproducible:
sometimes

Steps to Reproduce:
1. Create 50 projects and deploy one MySQL instance in each project.

2. Check all pods.
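
Step 1 can be scripted. The following is only a minimal sketch, not the script actually used for this report: the project names (`mysql-test-N`) and the local template path are hypothetical, and it assumes `oc` is already logged in with permission to create projects. The commands are issued one after another, but the resulting deployments roll out concurrently. By default it only prints the commands; pass `run=True` to execute them.

```python
import subprocess

# Assumed local copy of the origin example template (hypothetical path).
TEMPLATE = "mysql-ephemeral-template.json"


def build_commands(count=50, prefix="mysql-test", template=TEMPLATE):
    """Return the oc command lines that create `count` projects with one MySQL each."""
    commands = []
    for i in range(1, count + 1):
        project = f"{prefix}-{i}"  # hypothetical naming scheme
        commands.append(["oc", "new-project", project])
        # new-app instantiates the template inside the named project
        commands.append(["oc", "new-app", "-f", template, "-n", project])
    return commands


def deploy(count=50, run=False):
    """Print the commands, or execute them when run=True."""
    for cmd in build_commands(count):
        if run:
            subprocess.run(cmd, check=True)
        else:
            print(" ".join(cmd))


if __name__ == "__main__":
    deploy(count=50, run=False)
```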

Actual results:
2. Some pods fail with the following error:
[root@openshift-137 ~]# docker logs 158a16505dde
Starting local mysqld server ...
Waiting for MySQL to start ...
151113  5:06:56 [Note] /opt/rh/mysql55/root/usr/libexec/mysqld (mysqld 5.5.45) starting as process 9 ...
151113  5:06:56 [Warning] One can only use the --user switch if running as root

151113  5:06:56 [Note] Plugin 'FEDERATED' is disabled.
151113  5:06:56 InnoDB: The InnoDB memory heap is disabled
151113  5:06:56 InnoDB: Mutexes and rw_locks use GCC atomic builtins
151113  5:06:56 InnoDB: Compressed tables use zlib 1.2.7
151113  5:06:56 InnoDB: Using Linux native AIO
151113  5:06:56  InnoDB: Warning: io_setup() failed with EAGAIN. Will make 5 attempts before giving up.
InnoDB: Warning: io_setup() attempt 1 failed.
InnoDB: Warning: io_setup() attempt 2 failed.
Waiting for MySQL to start ...
InnoDB: Warning: io_setup() attempt 3 failed.
InnoDB: Warning: io_setup() attempt 4 failed.
Waiting for MySQL to start ...
InnoDB: Warning: io_setup() attempt 5 failed.
151113  5:06:59  InnoDB: Error: io_setup() failed with EAGAIN after 5 attempts.
InnoDB: You can disable Linux Native AIO by setting innodb_use_native_aio = 0 in my.cnf
151113  5:06:59 InnoDB: Fatal error: cannot initialize AIO sub-system
151113  5:06:59 [ERROR] Plugin 'InnoDB' init function returned error.
151113  5:06:59 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
151113  5:06:59 [ERROR] Unknown/unsupported storage engine: InnoDB
151113  5:06:59 [ERROR] Aborting

151113  5:06:59 [Note] /opt/rh/mysql55/root/usr/libexec/mysqld: Shutdown complete

Expected results:
2. All pods are running.

Additional info:
disk usage was about 62%
Comment 1 Ben Parees 2015-11-13 07:55:24 EST
Is this 50 independent pods, or is this a master/slave scenario?  Since you say 50 projects I assume it's 50 independent ones.

Which volume type are you using for the pods?
Comment 2 Martin Nagy 2015-11-13 10:07:48 EST
DeShuai, there appear to be two possible solutions, can you please try both of them, independently?

1) Increase the fs.aio-max-nr kernel limit on the nodes. See the answer here: https://unix.stackexchange.com/questions/116520/mysql-server-wont-install-to-a-new-os-debian-ubuntu

2) Set the MYSQL_AIO environment variable to 0 on all MySQL pods that you create. This should disable native AIO, so the limit should not be hit.
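
Before changing anything, it may help to see how close a node is to the limit. The kernel exposes the current and maximum number of AIO events in `/proc/sys/fs/aio-nr` and `/proc/sys/fs/aio-max-nr`. The sketch below reads both and estimates the remaining headroom. The per-instance figure of 2661 events is a commonly quoted value for MySQL 5.5 with default I/O thread settings; it is an assumption here, not something measured in this bug, and the helper names are illustrative.

```python
from pathlib import Path

# Assumption: approximate number of AIO events one mysqld 5.5 reserves with defaults.
EVENTS_PER_MYSQLD = 2661


def read_int(path):
    """Read a single integer from a procfs-style file."""
    return int(Path(path).read_text().strip())


def aio_headroom(proc_fs="/proc/sys/fs", per_instance=EVENTS_PER_MYSQLD):
    """Return (used, limit, number of additional mysqld instances that would fit)."""
    used = read_int(Path(proc_fs) / "aio-nr")
    limit = read_int(Path(proc_fs) / "aio-max-nr")
    return used, limit, max(limit - used, 0) // per_instance


def instances_that_fit(limit, per_instance=EVENTS_PER_MYSQLD):
    """How many instances fit under `limit` if nothing else uses AIO."""
    return limit // per_instance
```

Under that assumption, `instances_that_fit(65536)` gives 24 for the usual kernel default of 65536, which would be consistent with 50 pods on two nodes only sometimes exceeding the limit. The limit itself can be raised on each node with `sysctl -w fs.aio-max-nr=<value>`; a suitable value depends on how many instances a node is expected to host.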
Comment 3 DeShuai Ma 2015-11-13 20:16:19 EST
(In reply to Ben Parees from comment #1)
> Is this 50 independent pods, or is this a master/slave scenario?  Since you
> say 50 projects I assume it's 50 independent ones.
> 
> which volume type are you using for the pods?

The 50 pods are independent. All pods were created from this template: https://github.com/openshift/origin/blob/master/examples/db-templates/mysql-ephemeral-template.json
Comment 4 DeShuai Ma 2015-11-15 21:29:22 EST
(In reply to Martin Nagy from comment #2)
> DeShuai, there appear to be two possible solutions, can you please try both
> of them, independently?
> 
> 1) Increase fs.aio-max-nr kernel limit on the nodes. See the answer here:
> https://unix.stackexchange.com/questions/116520/mysql-server-wont-install-to-a-new-os-debian-ubuntu
> 
> 2) Set the MYSQL_AIO environment variable to 0 to all MySQL pods that you
> create. This should disable native AIO and the limit should not be hit.

I tried both methods, and the pods now run fine. Thanks.
Comment 5 Jeremy Eder 2015-11-17 07:57:11 EST
Now, what to do with this? I doubt this is something we'd want to do globally on the kernel side for systems running OpenShift.

My thought is that we should convert this to a documentation bugzilla and put a note in the OpenShift docs.
Comment 6 Martin Nagy 2015-11-18 08:02:29 EST
I agree. Should we move this bug to the Documentation component?
Comment 7 Ben Parees 2016-01-04 10:16:56 EST
Martin, please document this in the OpenShift MySQL image docs. Thanks.
Comment 8 Vikram Goyal 2016-02-04 20:20:07 EST
Martin, did you want the docs team to add this note? If yes, I will assign it to one of the writers.
Comment 9 Thien-Thi Nguyen 2016-03-01 13:16:04 EST
(In reply to Vikram Goyal from comment #8)
> Martin, did you want the docs team to add this note?
> If yes, I will assign it to one of the writers.

Martin and I discussed this today. The plan is to add to the MySQL images section an explanation of why MySQL fails with the above-mentioned error messages, as well as documentation of the two resolution paths:
- increase the kernel resource limit on the node
- set the env var
The latter will mention that editing /etc/my.cnf directly is not the OpenShift way, and will cross-reference the "Managing Environment Variables" section of the Dev Guide.
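
To illustrate the env var path: it amounts to adding `MYSQL_AIO=0` to every container in the MySQL deployment configuration rather than touching the config file inside the image. Below is a minimal sketch of that edit on a deployment config loaded as a dict (for example from `oc get dc/mysql -o json`). The object name `mysql`, the trimmed example object, and the helper name `set_env` are all hypothetical; on a live cluster, `oc env dc/mysql MYSQL_AIO=0` should achieve the same result.

```python
import copy


def set_env(deployment_config, name, value):
    """Return a copy of the config with env var `name` set on every container."""
    dc = copy.deepcopy(deployment_config)
    for container in dc["spec"]["template"]["spec"]["containers"]:
        env = container.setdefault("env", [])
        for entry in env:
            if entry.get("name") == name:
                entry["value"] = value  # overwrite an existing setting
                break
        else:
            # No existing entry with this name: append a new one.
            env.append({"name": name, "value": value})
    return dc


# Trimmed, hypothetical deployment config used for illustration only.
example = {
    "kind": "DeploymentConfig",
    "metadata": {"name": "mysql"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "mysql", "env": [{"name": "MYSQL_USER", "value": "user"}]}
    ]}}},
}

patched = set_env(example, "MYSQL_AIO", "0")
print(patched["spec"]["template"]["spec"]["containers"][0]["env"])
```

The function works on a copy so the original object stays untouched, and calling it twice with the same name does not add a duplicate entry.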

Martin, additions/corrections welcome.

Assigning this to myself.
Comment 10 Thien-Thi Nguyen 2016-03-08 05:58:57 EST
PR: https://github.com/openshift/openshift-docs/pull/1701
Comment 11 Martin Nagy 2016-03-08 09:08:05 EST
Commented in the pull, looks great!
Comment 13 Thien-Thi Nguyen 2016-03-15 17:02:32 EDT
Hi DeShuai, WDYT?
Comment 15 DeShuai Ma 2016-03-15 21:45:46 EDT
The doc PR LGTM. I will verify this bug once the PR is merged.
Comment 16 Thien-Thi Nguyen 2016-03-17 16:04:51 EDT
Hi DeShuai,

PR merged: https://github.com/openshift/openshift-docs/pull/1701#event-593982568

WDYT?
Comment 17 Thien-Thi Nguyen 2016-03-18 03:38:36 EDT
Thanks DeShuai.

Moving status to RELEASE_PENDING.
