Bug 1106450 - mysqld doesn't start after installation of mysql cartridge on systems with high IO wait
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 2.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: John W. Lamb
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks: 1147060
Reported: 2014-06-09 11:43 UTC by Miheer Salunke
Modified: 2019-07-11 08:00 UTC

Fixed In Version: openshift-origin-cartridge-mysql-1.23.4.5-1.el6op
Doc Type: Bug Fix
Doc Text:
The control script "start" routine for the MySQL cartridge was configured to check the mysqld service 10 times, wait 1 second between each check, and time out if the service did not become available within that time. During periods of high I/O load, and under certain other conditions, the mysqld service might fail to start before all 10 checks have been performed. This caused the deployment or scale-up operation to fail. This bug fix increases the number of checks to 45 for a minimum timeout duration of 45 seconds. It also introduces the OPENSHIFT_MYSQL_START_TIMEOUT and OPENSHIFT_MYSQL_STOP_TIMEOUT environment variables which users can set using the client tools to specify the number of retries for the control "start" and "stop" routines, respectively. The MySQL cartridge is now more tolerant to high latency system conditions by default, and can be manipulated by the user to successfully deploy under a variety of system load and latency scenarios. After applying this fix, a cartridge upgrade is required.
Clone Of:
Clones: 1147060
Environment:
Last Closed: 2014-10-02 13:59:26 UTC
Target Upstream Version:
Embargoed:


Links
System ID: Red Hat Product Errata RHBA-2014:1353
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Enterprise 2.1.7 bug fix and enhancement update
Last Updated: 2014-10-02 17:59:00 UTC

Description Miheer Salunke 2014-06-09 11:43:19 UTC
Description of problem:
We see a significant number of failures starting mysql cartridge on systems with high I/O wait.

After I increased the upper bound of the for loop from 30 to 60, as shown below, the mysql cartridge started.

/var/lib/openshift/.cartridge_repository/redhat-mysql/0.2.6/bin/control
=======================
# Start mysqld and block until it comes up.
function start {
  if ! is_running; then
    echo "Starting MySQL cartridge"
    oo-erb conf/my.cnf.erb.hidden > conf/my.cnf
    /usr/bin/mysqld_safe --defaults-file=$OPENSHIFT_MYSQL_DIR/conf/my.cnf > /dev/null 2>&1 &
    wait_for_mysqld_availability
  else
    echo "MySQL already running" 1>&2
  fi
}

function wait_for_mysqld_availability {
  test_select=${1:-false}

  for i in {1..30}; do        # <-- 30 one-second checks (about 30 seconds) defined here
    if is_running; then
      if $test_select; then
        run_sql 'select 1' 2> /dev/null && return 0
      else
        return 0
      fi
    fi

    sleep 1
  done

  return 1
}
=======================

The following change was made to the above code

=======================
  for i in {1..60}; do        # <-- wait up to about 1 minute
    if is_running; then
      if $test_select; then
=======================

We have a support case 01058320
(https://access.redhat.com/support/cases/01058320/)



Version-Release number of selected component (if applicable):
2.0

How reproducible:
Reproducible on machines with high IO wait.

Steps to Reproduce:
1.  Create a scalable application on a system with high I/O wait.
2.  Add the mysql cartridge to your application.


Actual results:
Installation of cartridge fails.
MySQL gets installed, but starting the mysql cartridge fails.


Expected results:
The mysql cartridge should start after it gets installed.

Additional info: 

I have attached the logs to this ticket in the mysqlcart_logs file.

Comment 2 John W. Lamb 2014-08-21 14:53:23 UTC
Allow setting of timeout with an env var at app create time?

Comment 3 John W. Lamb 2014-09-04 19:41:48 UTC
Upstream fix: https://github.com/openshift/origin-server/pull/5785
Sets the start timeout to 45 seconds and allows setting the timeout at app-create time using the user env var OPENSHIFT_MYSQL_START_TIMEOUT. The stop timeout can also be set with OPENSHIFT_MYSQL_STOP_TIMEOUT.
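
For illustration, a minimal sketch of a wait loop whose retry count comes from OPENSHIFT_MYSQL_START_TIMEOUT and falls back to 45. This is not the code from the pull request. The is_running stub and the READY_FILE and POLL_INTERVAL variables are hypothetical stand-ins so the snippet runs on its own.

```shell
# Illustrative sketch only, not the code from the upstream pull request.

# Stand-in for the cartridge's is_running helper (assumption): reports
# success once the file named by READY_FILE exists.
is_running() {
  [ -e "${READY_FILE:-/nonexistent}" ]
}

# Poll is_running up to OPENSHIFT_MYSQL_START_TIMEOUT times (default 45),
# sleeping POLL_INTERVAL seconds between checks. POLL_INTERVAL is a
# hypothetical knob; its default of 1 matches the "sleep 1" in the
# original loop.
wait_for_mysqld_availability() {
  local retries="${OPENSHIFT_MYSQL_START_TIMEOUT:-45}"
  local delay="${POLL_INTERVAL:-1}"
  local i=0

  while [ "$i" -lt "$retries" ]; do
    is_running && return 0
    sleep "$delay"
    i=$((i + 1))
  done

  return 1
}
```

With the variable unset the loop makes up to 45 checks. Setting it to a larger value lengthens the wait on a slow system without editing the control script.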

Comment 4 openshift-github-bot 2014-09-05 19:11:33 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/afb1d7117522eaf9cb88bd31858031ff28438652
mysql cart: user-settable start timeout

Increase startup timeout for MySQL cart to 45 seconds, allow setting
`start` timeout via `OPENSHIFT_MYSQL_START_TIMEOUT` user env var and
`stop` timeout via `OPENSHIFT_MYSQL_STOP_TIMEOUT`.

Enterprise bug 1106450
https://bugzilla.redhat.com/show_bug.cgi?id=1106450

Comment 8 John W. Lamb 2014-09-18 20:39:03 UTC
This is marked as ON_QA, but there's no puddle available for it yet. One should be available tomorrow, or if you want to test against the current 2.1.z puddle you can pull the updated openshift-origin-cartridge-mysql from brew: https://brewweb.devel.redhat.com/buildinfo?buildID=386459

Comment 9 Gaoyun Pei 2014-09-19 06:17:55 UTC
Verified this bug with package openshift-origin-cartridge-mysql-1.23.4.5-1.el6op

Checked the code: the default start_timeout in /usr/libexec/openshift/cartridges/mysql/bin/control is now set to 45s.
The OPENSHIFT_MYSQL_START_TIMEOUT and OPENSHIFT_MYSQL_STOP_TIMEOUT env variables work.

Steps:
1. Create a scalable app and set the env variables for it
rhc env-set OPENSHIFT_MYSQL_START_TIMEOUT=5 OPENSHIFT_MYSQL_STOP_TIMEOUT=5 -a app1

2. Log into the app and check the env variables
[app1-11.ose21z-manual.com.cn 541b9c99db26c8ae500002ea]\> env|grep TIMEOUT
OPENSHIFT_MYSQL_START_TIMEOUT=5
OPENSHIFT_MYSQL_STOP_TIMEOUT=5

3. Try to add the mysql cartridge to this app
[root@broker ~]# rhc cartridge-add -a app1 -c mysql-5.5 
Adding mysql-5.5 to application 'app1' ... 
Starting MySQL 5.5 cartridge
MySQL server failed to start:
140918 23:03:37 mysqld_safe Logging to '/var/lib/openshift/541bc72cdb26c8ae5000041d/mysql//stdout.err'.
...
140918 23:03:41  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 153 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100
Failed to execute: 'control start' for /var/lib/openshift/541bc72cdb26c8ae5000041d/mysql

4. Set OPENSHIFT_MYSQL_START_TIMEOUT to a larger value, then add the mysql cartridge again
[root@broker ~]# rhc env-set OPENSHIFT_MYSQL_START_TIMEOUT=40 -a app1
Setting environment variable(s) ... done
[root@broker ~]# rhc cartridge-add -a app1 -c mysql-5.5 -g medium
Adding mysql-5.5 to application 'app1' ... done

mysql-5.5 (MySQL 5.5)
...
Connection URL: mysql://$OPENSHIFT_MYSQL_DB_HOST:$OPENSHIFT_MYSQL_DB_PORT/

Comment 11 John W. Lamb 2014-09-26 18:21:03 UTC
Setting version to 2.1.0 to reflect the product the fix was shipped for, and to distinguish this bug from the soon-to-be-created clone for 2.0.

Comment 13 errata-xmlrpc 2014-10-02 13:59:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1353.html

