Bug 1544606 - Mysql APB downgrade from 5.7 to 5.6 failed
Summary: Mysql APB downgrade from 5.7 to 5.6 failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 3.9.z
Assignee: Jason Montleon
QA Contact: Zihan Tang
URL:
Whiteboard:
Depends On: 1549259 1550644
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-02-13 03:25 UTC by Zihan Tang
Modified: 2018-09-10 16:16 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
undefined
Clone Of:
: 1550644 (view as bug list)
Environment:
Last Closed: 2018-09-10 16:16:51 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Zihan Tang 2018-02-13 03:25:52 UTC
Description of problem:
Mysql APB downgrade from 5.7 to 5.6 failed wit error:
TASK [rhscl-mysql-apb-openshift : Restore database] ****************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -n mysql-3 mysql-5.6-prod-1-bkxfc -- /bin/bash -c \"mysql -u root  < /tmp/db.dump\"", "delta": "0:00:00.536735", "end": "2018-02-13 02:44:52.064980", "msg": "non-zero return code", "rc": 1, "start": "2018-02-13 02:44:51.528245", "stderr": "ERROR 1726 (HY000) at line 210: Storage engine 'InnoDB' does not support system tables. [mysql.help_category]\ncommand terminated with exit code 1", "stderr_lines": ["ERROR 1726 (HY000) at line 210: Storage engine 'InnoDB' does not support system tables. [mysql.help_category]", "command terminated with exit code 1"], "stdout": "", "stdout_lines": []}


But upgrade from 5.6 to 5.7 will succeed

Version-Release number of selected component (if applicable):
ASB: 1.1.0
Mysql apb:  downstream : v3.9


How reproducible:
Always.

Steps to Reproduce:
1. Provision a mysql apb in version 5.7
2. Edit the serviceinstance in UI , change version to 5.6

Actual results:
The update failed ,the sand box stopped with error.
[root@host-172-16-120-66 ~]# oc project rh-mysql-apb-upda-cqf62 
Now using project "rh-mysql-apb-upda-cqf62" on server "https://172.16.120.66:8443".
[root@host-172-16-120-66 ~]# oc get pod 
NAME                                       READY     STATUS    RESTARTS   AGE
apb-3c97e158-1f70-4653-8e88-feb114fd1082   0/1       Error     0          14m
[root@host-172-16-120-66 ~]# oc logs -f apb-3c97e158-1f70-4653-8e88-feb114fd1082
+ [[ update --extra-vars {"_apb_plan_id":"prod","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"9df50356-52f3-4ab5-88f3-777548cbcbf4","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-3","service_name":"mysql"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000390000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"prod","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"9df50356-52f3-4ab5-88f3-777548cbcbf4","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-3","service_name":"mysql"}'

PLAY [mysql-apb playbook to provision the application] *************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to update] ******************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Find dc we will clean up] ********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Backup source database] **********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Set mysql service state to present] **********
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
included: /opt/ansible/roles/rhscl-mysql-apb-openshift/tasks/prod.yml for localhost

TASK [rhscl-mysql-apb-openshift : set volume claim state to present] ***********
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : set MySQL deployment using persistent storage to present] ***
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Wait for mysql to come up] *******************
ok: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to restore] *****************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Restore database] ****************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -n mysql-3 mysql-5.6-prod-1-bkxfc -- /bin/bash -c \"mysql -u root  < /tmp/db.dump\"", "delta": "0:00:00.536735", "end": "2018-02-13 02:44:52.064980", "msg": "non-zero return code", "rc": 1, "start": "2018-02-13 02:44:51.528245", "stderr": "ERROR 1726 (HY000) at line 210: Storage engine 'InnoDB' does not support system tables. [mysql.help_category]\ncommand terminated with exit code 1", "stderr_lines": ["ERROR 1726 (HY000) at line 210: Storage engine 'InnoDB' does not support system tables. [mysql.help_category]", "command terminated with exit code 1"], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
localhost                  : ok=11   changed=9    unreachable=0    failed=1   

 [WARNING]: Could not create retry file '/opt/apb/actions/update.retry'.
[Errno 13] Permission denied: u'/opt/apb/actions/update.retry'
+ EXIT_CODE=2
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 2
^C



Expected results:
Update scceed.

Additional info:

Comment 3 Jason Montleon 2018-02-13 17:45:04 UTC
Thanks. 

I saw this the other day as well. We may have to disable 5.7 to 5.6 as it always results in, "Storage engine 'InnoDB' does not support system tables"

I'd be interested to see if 5.6 -> 5.7 -> 5.6 works. I think this has worked for me in the past, but I'd have to try it to be certain.

I'm not sure if the default for one of the created databases was switched from InnoDB to MyISAM or vice versa, and if we're able to effect what is used with environment variables to provide a consistent experience.

Comment 4 Jason Montleon 2018-02-13 22:23:12 UTC
Using the steps at: https://dev.mysql.com/doc/refman/5.7/en/downgrading-to-previous-series.html I was able to update the APB and successfully downgrade from 5.7 to 5.6

I added a step to run all the necessary alternations before dumping the DB:

+  - name: Prepare tables for possible downgrade
+    shell: |
+           oc exec -n {{ namespace }} {{ oldpod.stdout }} -- /bin/bash -c "mysql -u root -e \"
+           ALTER TABLE mysql.procs_priv MODIFY Grantor char(77) COLLATE utf8_bin NOT NULL DEFAULT '';
+           ALTER TABLE mysql.columns_priv MODIFY User char(16) NOT NULL default '';
+           ALTER TABLE mysql.user MODIFY User char(16) NOT NULL default '';
+           ALTER TABLE mysql.db MODIFY User char(16) NOT NULL default '';
+           ALTER TABLE mysql.procs_priv MODIFY User char(16) binary DEFAULT '' NOT NULL;
+           ALTER TABLE mysql.user ADD Password char(41) character set latin1 collate latin1_bin NOT NULL default '' AFTER user;
+           UPDATE mysql.user SET password = authentication_string WHERE LENGTH(authentication_string) = 41 AND plugin = 'mysql_native_password';
+           UPDATE mysql.user SET authentication_string = '' WHERE LENGTH(authentication_string) = 41 AND plugin = 'mysql_native_password';
+           ALTER TABLE mysql.help_category ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.help_keyword ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.help_relation ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.help_topic ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.time_zone ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.time_zone_leap_second ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.time_zone_name ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.time_zone_transition  ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.time_zone_transition_type ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.plugin ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.servers ENGINE='MyISAM' STATS_PERSISTENT=DEFAULT;
+           ALTER TABLE mysql.user MODIFY plugin CHAR(64) COLLATE utf8_bin DEFAULT 'mysql_native_password';
+           DROP DATABASE sys;\""
+    when: mysql_version == "5.6"
+


I also added a task to run mysql_upgrade which will do any fixing up needed as well after we import databases.

+  - name: Run mysql_upgrade
+    shell: oc exec -n {{ namespace }} {{ newpod.stdout }} -- /bin/bash -c "mysql_upgrade -u root"


I need to do more testing to make sure upgrades from 5.6 work and 5.6 prod/dev switches and 5.7 prod/dev switches work, but hope to post a PR soon.

Comment 6 Zihan Tang 2018-02-22 04:16:56 UTC
Wait for the mysql image ready in stage registry to test .

Comment 7 Zihan Tang 2018-02-23 07:06:50 UTC
Stage image is ready , change to ON_QA

Comment 8 Zihan Tang 2018-02-23 07:12:24 UTC
Use image mysql-apb:v3.9.0-0.47.0.0 (stage registry) to test.
ASB: 1.1.13

Downgrade from 5.7 to 5.6 without new data will succeed

But if create DB with user root or create table with user devel , then the downgrade from 5.7 to 5.6 failed. 
And pending on the task:
TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************

Upgrade from 5.6 to 5.7 with new data succeed.

Comment 11 Zihan Tang 2018-02-24 01:44:44 UTC
Jason,
In comment8, I used the mysql-apb:v3.9.0-0.47.0.0 to test. 
And verified that mysql can downgrade from 5.7 to 5.6 successfully without new data.
But , after I create DB or table , then perform downgrade, it will pending on the status of "
TASK [rhscl-mysql-apb-openshift : Copy over db backup]
“

Comment 12 Zihan Tang 2018-02-26 07:15:18 UTC
According to comment 8, change the status to ASSIGNED.

Comment 13 Jason Montleon 2018-02-26 13:56:18 UTC
Can you test this with the latest images.

The task your showing is copying the file between pods and shouldn't be impacted by the existence of additional tables or databases. More likely the file is failing to be written on one end or the other as was the case with postgres running on crio, due to selinux denials.

If you're using crio can you test without selinux and if it's just failing against crio with selinux file a bug against crio/selinux?

If it's still failing can you provide an environment that reproduces the issue.

Here is a container with mediawiki bound and a second db created by hand:
=======================================================
$ oc exec -it -n mysql mysql-5.7-dev-2-5k85r /bin/bash
bash-4.2$ mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 86
Server version: 5.7.20 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE database managerie;
Query OK, 1 row affected (0.01 sec)

mysql> USE managerie;
Database changed
mysql> CREATE TABLE pet (name VARCHAR(20), owner VARCHAR(20), species VARCHAR(20), sex CHAR(1), birth DATE, death DATE);
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO pet VALUES ('Puffball','Diane','hamster','f','1999-03-30',NULL);
Query OK, 1 row affected (0.01 sec)

mysql> select * FROM pet
    -> ;
+----------+-------+---------+------+------------+-------+
| name     | owner | species | sex  | birth      | death |
+----------+-------+---------+------+------------+-------+
| Puffball | Diane | hamster | f    | 1999-03-30 | NULL  |
+----------+-------+---------+------+------------+-------+
1 row in set (0.00 sec)

mysql> use devel;
Database changed
mysql> show tables;
+--------------------+
| Tables_in_devel    |
+--------------------+
| archive            |
| category           |
| categorylinks      |
...
50 rows in set (0.00 sec)

Logs for downgrade:
===================
$ oc logs -f -n dh-mysql-apb-upda-cgq7f             apb-885e494d-cca9-4dd8-b530-28b4f67c79df
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"ddd528762894b277001df310a126d5ad","_apb_service_instance_id":"c0d1c416-c570-42a4-99c6-bc018ff9f18a","cluster":"openshift","mysql_database":"devel","mysql_password":"changeme","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql","service_name":"mysql"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000140000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"ddd528762894b277001df310a126d5ad","_apb_service_instance_id":"c0d1c416-c570-42a4-99c6-bc018ff9f18a","cluster":"openshift","mysql_database":"devel","mysql_password":"changeme","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql","service_name":"mysql"}'
PLAY [mysql-apb playbook to provision the application] *************************
TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]
TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]
TASK [rhscl-mysql-apb-openshift : Find pod we need to update] ******************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Find dc we will clean up] ********************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Prepare tables for downgrade] ****************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Backup source database] **********************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Set mysql service state to present] **********
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
included: /opt/ansible/roles/rhscl-mysql-apb-openshift/tasks/dev.yml for localhost
TASK [rhscl-mysql-apb-openshift : set MySQL deployment with ephemeral storage to present] ***
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
skipping: [localhost]
TASK [rhscl-mysql-apb-openshift : Wait for mysql to come up] *******************
ok: [localhost]
TASK [rhscl-mysql-apb-openshift : Find pod we need to restore] *****************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Restore database] ****************************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Run mysql_upgrade] ***************************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : Remove old dc] *******************************
changed: [localhost]
TASK [rhscl-mysql-apb-openshift : ensure production volume is absent] **********
ok: [localhost] => (item=5.6)
ok: [localhost] => (item=5.7)
TASK [rhscl-mysql-apb-openshift : encode bind credentials] *********************
changed: [localhost]
PLAY RECAP *********************************************************************
localhost                  : ok=16   changed=13   unreachable=0    failed=0   
+ EXIT_CODE=0
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 0

Data on downgraded pod:
=======================
$ oc exec -it -n mysql mysql-5.6-dev-1-xk74x /bin/bash
bash-4.2$ mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 28
Server version: 5.6.38 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use managerie;
Database changed
mysql> select * from pet;
+----------+-------+---------+------+------------+-------+
| name     | owner | species | sex  | birth      | death |
+----------+-------+---------+------+------------+-------+
| Puffball | Diane | hamster | f    | 1999-03-30 | NULL  |
+----------+-------+---------+------+------------+-------+
1 row in set (0.00 sec)

mysql> use devel;
Database changed
mysql> show tables.
    -> ;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.' at line 1
mysql> show tables;
+--------------------+
| Tables_in_devel    |
+--------------------+
| archive            |
| category           |
| categorylinks      |
...
+--------------------+
50 rows in set (0.00 sec)

Data being present should not cause the downgrade to fail.

Comment 14 Zihan Tang 2018-02-27 08:21:11 UTC
I didn't use crio-env to test the bug.
When use the latest image to verify.
1.'mysql-root' project 
 provision mysql5.7 ->create data use 'root' user in mysql as your step in comment 13, -> downgrade to 5.6 ->succeed, data perserved.
  
2. 'mysql-1' project 
provision mysql5.7 ->create data use 'devel' user in mysql -> downgrade failed, pending on 'Copy over db backup'
[root@host-172-16-120-56 tmp]# oc rsh mysql-5.7-dev-1-xht7t
sh-4.2$ mysql -udevel -p -h127.0.0.1
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
Server version: 5.7.20 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| devel              |
+--------------------+
2 rows in set (0.00 sec)

mysql> CREATE database managerie;
ERROR 1044 (42000): Access denied for user 'devel'@'%' to database 'managerie'
mysql> use devel
Database changed
mysql> show tables;
Empty set (0.00 sec)

mysql> CREATE TABLE pet (name VARCHAR(20), owner VARCHAR(20), species VARCHAR(20), sex CHAR(1), birth DATE, death DATE);
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO pet VALUES ('Puffball','Diane','hamster','f','1999-03-30',NULL);
Query OK, 1 row affected (0.01 sec)

mysql> select * FROM pet
    -> ;
+----------+-------+---------+------+------------+-------+
| name     | owner | species | sex  | birth      | death |
+----------+-------+---------+------+------------+-------+
| Puffball | Diane | hamster | f    | 1999-03-30 | NULL  |
+----------+-------+---------+------+------------+-------+
1 row in set (0.00 sec)


Downgrade log:
[root@host-172-16-120-56 tmp]# oc logs -f apb-ce7aa35a-ff5b-45b9-8a47-9ecdcc5a6e67
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"abea4d2b-345b-4ef3-991f-7c87fd93acc3","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-devel","service_name":"mysql"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000340000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"abea4d2b-345b-4ef3-991f-7c87fd93acc3","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-devel","service_name":"mysql"}'

PLAY [mysql-apb playbook to provision the application] *************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to update] ******************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Find dc we will clean up] ********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Prepare for downgrade] ***********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Backup source database] **********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Set mysql service state to present] **********
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
included: /opt/ansible/roles/rhscl-mysql-apb-openshift/tasks/dev.yml for localhost

TASK [rhscl-mysql-apb-openshift : set MySQL deployment with ephemeral storage to present] ***
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Wait for mysql to come up] *******************
ok: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to restore] *****************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************

Comment 16 Zihan Tang 2018-02-27 10:01:26 UTC
In comment 14 scenario 2:
it's in 'mysql-devel' project.
[root@host-172-16-120-56 ~]# oc get pod 
NAME                    READY     STATUS             RESTARTS   AGE
mysql-5.6-dev-1-jr628   1/1       Running            0          1h
mysql-5.7-dev-1-xht7t   0/1       CrashLoopBackOff   19         1h

[root@host-172-16-120-56 ~]# oc logs -f mysql-5.7-dev-1-xht7t
=> sourcing 20-validate-variables.sh ...
=> sourcing 25-validate-replication-variables.sh ...
=> sourcing 30-base-config.sh ...
---> 09:54:18     Processing basic MySQL configuration files ...
=> sourcing 60-replication-config.sh ...
=> sourcing 70-s2i-config.sh ...
---> 09:54:18     Processing additional arbitrary  MySQL configuration provided by s2i ...
=> sourcing 40-paas.cnf ...
=> sourcing 50-my-tuning.cnf ...
---> 09:54:18     Starting MySQL server with disabled networking ...
---> 09:54:18     Waiting for MySQL to start ...
2018-02-27T09:54:19.093520Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-02-27T09:54:19.102889Z 0 [Note] /opt/rh/rh-mysql57/root/usr/libexec/mysqld (mysqld 5.7.20) starting as process 26 ...
2018-02-27T09:54:19.112949Z 0 [Note] InnoDB: PUNCH HOLE support available
2018-02-27T09:54:19.113017Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-02-27T09:54:19.113046Z 0 [Note] InnoDB: Uses event mutexes
2018-02-27T09:54:19.113071Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2018-02-27T09:54:19.113087Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2018-02-27T09:54:19.113103Z 0 [Note] InnoDB: Using Linux native AIO
2018-02-27T09:54:19.113370Z 0 [Note] InnoDB: Number of pools: 1
2018-02-27T09:54:19.113522Z 0 [Note] InnoDB: Using CPU crc32 instructions
2018-02-27T09:54:19.115608Z 0 [Note] InnoDB: Initializing buffer pool, total size = 32M, instances = 1, chunk size = 32M
2018-02-27T09:54:19.117829Z 0 [Note] InnoDB: Completed initialization of buffer pool
2018-02-27T09:54:19.118809Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2018-02-27T09:54:19.138765Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2018-02-27T09:54:19.140271Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 2615150
2018-02-27T09:54:19.140298Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 2615178
2018-02-27T09:54:19.140319Z 0 [Note] InnoDB: Database was not shutdown normally!
2018-02-27T09:54:19.140332Z 0 [Note] InnoDB: Starting crash recovery.
2018-02-27T09:54:19.284950Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2018-02-27T09:54:19.284970Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2018-02-27T09:54:19.285061Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2018-02-27T09:54:19.309798Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2018-02-27T09:54:19.311740Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2018-02-27T09:54:19.311753Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2018-02-27T09:54:19.311973Z 0 [Note] InnoDB: Waiting for purge to start
2018-02-27T09:54:19.363174Z 0 [Note] InnoDB: 5.7.20 started; log sequence number 2615178
2018-02-27T09:54:19.363320Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/data/ib_buffer_pool
2018-02-27T09:54:19.363485Z 0 [Note] Plugin 'FEDERATED' is disabled.
2018-02-27T09:54:19.370409Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180227  9:54:19
2018-02-27T09:54:19.371596Z 0 [Warning] System table 'plugin' is expected to be transactional.
2018-02-27T09:54:19.379478Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2018-02-27T09:54:19.379492Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2018-02-27T09:54:19.380592Z 0 [Warning] CA certificate ca.pem is self signed.
2018-02-27T09:54:19.380658Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2018-02-27T09:54:19.388025Z 0 [Warning] 'user' entry 'root@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.388069Z 0 [Warning] 'user' entry 'mysql.session@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.388080Z 0 [Warning] 'user' entry 'mysql.sys@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.388241Z 0 [Warning] 'db' entry 'performance_schema mysql.session@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.388251Z 0 [Warning] 'db' entry 'sys mysql.sys@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.388384Z 0 [Warning] 'proxies_priv' entry '@ root@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.391412Z 0 [Warning] System table 'time_zone_leap_second' is expected to be transactional.
2018-02-27T09:54:19.391424Z 0 [Warning] System table 'time_zone_name' is expected to be transactional.
2018-02-27T09:54:19.391426Z 0 [Warning] System table 'time_zone' is expected to be transactional.
2018-02-27T09:54:19.391428Z 0 [Warning] System table 'time_zone_transition_type' is expected to be transactional.
2018-02-27T09:54:19.391430Z 0 [Warning] System table 'time_zone_transition' is expected to be transactional.
2018-02-27T09:54:19.394250Z 0 [Warning] 'tables_priv' entry 'user mysql.session@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.394268Z 0 [Warning] 'tables_priv' entry 'sys_config mysql.sys@localhost' ignored in --skip-name-resolve mode.
2018-02-27T09:54:19.394846Z 0 [Warning] System table 'servers' is expected to be transactional.
2018-02-27T09:54:19.438689Z 0 [ERROR] Incorrect definition of table mysql.db: expected column 'User' at position 2 to have type char(32), found type char(16).
2018-02-27T09:54:19.438784Z 0 [ERROR] mysql.user has no `Event_priv` column at position 28
2018-02-27T09:54:19.439768Z 0 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
2018-02-27T09:54:19.440589Z 0 [Note] /opt/rh/rh-mysql57/root/usr/libexec/mysqld: ready for connections.
Version: '5.7.20'  socket: '/tmp/mysql.sock'  port: 0  MySQL Community Server (GPL)
2018-02-27T09:54:19.440600Z 0 [Note] Executing 'SELECT * FROM INFORMATION_SCHEMA.TABLES;' to get a list of tables using the deprecated partition engine. You may use the startup option '--disable-partition-engine-check' to skip this check. 
2018-02-27T09:54:19.440602Z 0 [Note] Beginning of list of non-natively partitioned tables
2018-02-27T09:54:19.561464Z 0 [Note] End of list of non-natively partitioned tables
2018-02-27T09:54:19.741842Z 3 [Note] Access denied for user 'UNKNOWN_USER'@'localhost' (using password: NO)
---> 09:54:19     MySQL started successfully
=> sourcing 50-passwd-change.sh ...
---> 09:54:19     Setting passwords ...
2018-02-27T09:54:20.009318Z 4 [ERROR] /opt/rh/rh-mysql57/root/usr/libexec/mysqld: Column count of mysql.user is wrong. Expected 45, found 46. The table is probably corrupted
ERROR 1805 (HY000) at line 1: Column count of mysql.user is wrong. Expected 45, found 46. The table is probably corrupted

Comment 18 Jason Montleon 2018-02-27 14:07:43 UTC
With regard to the APB that has already errored out The db dump file is truncated on the new pod and unsurprisigingly failed to load. 

The one that's still running is in the same state. It's stuck trying to copy the db dump over and is stuck. Eventually it will likely error out the same way.

I'm not sure if this is a dupe of 1549259 and exhibiting some slightly different behavior in a docker env or if something else is going on.

Comment 20 Zihan Tang 2018-02-28 10:10:07 UTC
Downgrade failed using mysql-apb-v3.9.0-0.53.0.1 when 'Copy over db backup'
with error:
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n mysql-down mysql-5.6-dev-1-qqfml -- bash -c \"echo \\\"$(cat /tmp/db.dump)\\\" > /tmp/db.dump\"", "delta": "0:00:00.298582", "end": "2018-02-28 10:03:40.972465", "msg": "non-zero return code", "rc": 126, "start": "2018-02-28 10:03:40.673883", "stderr": "/bin/sh: /usr/bin/oc: Argument list too long", "stderr_lines": ["/bin/sh: /usr/bin/oc: Argument list too long"], "stdout": "", "stdout_lines": []}

1. provision mysql 5.7 dev;
2. create data use 'devel' user
3. downgrade to 5.6 dev;


[root@host-172-16-120-75 ~]# oc get pod 
NAME                    READY     STATUS    RESTARTS   AGE
mysql-5.6-dev-1-qqfml   1/1       Running   0          2m
mysql-5.7-dev-1-ctz2n   1/1       Running   0          5m
[root@host-172-16-120-75 ~]# oc exec -it mysql-5.7-dev-1-ctz2n -- /bin/bash -c "ls /tmp"
db.dump  ks-script-8eoXwX  yum.log
[root@host-172-16-120-75 ~]# oc exec -it mysql-5.6-dev-1-qqfml  -- /bin/bash -c "ls /tmp"
ks-script-8eoXwX  yum.log

[root@host-172-16-120-75 ~]# oc get pod 
NAME                                       READY     STATUS    RESTARTS   AGE
apb-ca578787-9d04-48fb-add2-ee466d59e57d   0/1       Error     0          3m
[root@host-172-16-120-75 ~]# oc logs -f apb-ca578787-9d04-48fb-add2-ee466d59e57d
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"7a67a56e-d126-446d-ba8e-7be62aaccdc8","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-down","service_name":"mysql"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000180000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"7a67a56e-d126-446d-ba8e-7be62aaccdc8","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.6","namespace":"mysql-down","service_name":"mysql"}'

PLAY [mysql-apb playbook to provision the application] *************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to update] ******************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Find dc we will clean up] ********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Prepare for downgrade] ***********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Backup source database] **********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Set mysql service state to present] **********
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
included: /opt/ansible/roles/rhscl-mysql-apb-openshift/tasks/dev.yml for localhost

TASK [rhscl-mysql-apb-openshift : set MySQL deployment with ephemeral storage to present] ***
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Wait for mysql to come up] *******************
ok: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to restore] *****************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n mysql-down mysql-5.6-dev-1-qqfml -- bash -c \"echo \\\"$(cat /tmp/db.dump)\\\" > /tmp/db.dump\"", "delta": "0:00:00.298582", "end": "2018-02-28 10:03:40.972465", "msg": "non-zero return code", "rc": 126, "start": "2018-02-28 10:03:40.673883", "stderr": "/bin/sh: /usr/bin/oc: Argument list too long", "stderr_lines": ["/bin/sh: /usr/bin/oc: Argument list too long"], "stdout": "", "stdout_lines": []}
 [WARNING]: Could not create retry file '/opt/apb/actions/update.retry'.
[Errno 13] Permission denied: u'/opt/apb/actions/update.retry'

PLAY RECAP *********************************************************************
localhost                  : ok=10   changed=8    unreachable=0    failed=1   

+ EXIT_CODE=2
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 2

Comment 22 Jason Montleon 2018-02-28 15:12:51 UTC
I'll need to revert the change away from using cp. This won't work. The command line being generated is enormous and oc has a limit it appears.

Comment 24 Zihan Tang 2018-03-01 09:53:56 UTC
Jason,
I retry use v3.9.1-1.1, but this issue is NOT ALWAYS failed.
I tried 8 times and 3 failed. and the failed scenario can not be reproduced.
Here's my test scenarios:
mysql  5.7 dev ->create data -> 5.6 dev
          5.7 dev ->create data -> 5.6 prod
          5.7 prod ->create data -> 5.6 dev
          5.7 prod ->create data -> 5.6 prod

When it failed, the pod is pending on 'Copy over db backup'
Actually the db.tmp has been copied to the other pod, 
[root@host-172-16-120-56 ~]# oc get pod 
NAME                     READY     STATUS    RESTARTS   AGE
mysql-5.6-prod-1-s2dh7   1/1       Running   0          1h
mysql-5.7-dev-1-n26qf    1/1       Running   0          1h
[root@host-172-16-120-56 ~]# oc exec mysql-5.7-dev-1-n26qf -- /bin/bash -c 'ls /tmp'
db.dump
ks-script-8eoXwX
yum.log
[root@host-172-16-120-56 ~]# oc exec mysql-5.6-prod-1-s2dh7 -- /bin/bash -c 'ls /tmp'
db.dump
ks-script-8eoXwX
yum.log

Comment 25 Jason Montleon 2018-03-01 14:13:09 UTC
The problem is that oc isn't finishing the copy. If you look on the 5.6 node the database is cut off early. 

There is an (adjustable) idle timeout that defaults to four hours you will eventually hit on the stuck task (I hit this on the restore phase when testing with very large databases and asked for documentation on how to adjust it to be added) and the apb will eventually continue to run. At that point the Restore task will fail because the DB is malformed.

I was hoping this was related to the separate crio env bug in some way, but that looks like it may actually be in crio itself so we have exposed another bug in oc here.

It's not really a bug with the apb or apb code as far as I can tell, but we're definitely being impacted by it. We should probably move this back to assigned. 

I'll see if I can reproduce it outside of an APB and clone the bug.

# oc exec mysql-5.7-dev-1-n26qf -- /bin/bash -c 'cat /tmp/db.dump'
...
-- Dump completed on 2018-03-01  8:24:02


oc exec mysql-5.6-prod-1-s2dh7 -- /bin/bash -c 'cat /tmp/db.dump'
...
  `plugin` char(64) COLLATE utf8_bin DEFAULT 'mysql_native_password',
  `authentication_string` text COLLATE utf8_bin,
  `passw


Eventually:
TASK [rhscl-mysql-apb-openshift : Restore database] ****************************
 [WARNING]: Could not create retry file '/opt/apb/actions/update.retry'.
[Errno 13] Permission denied: u'/opt/apb/actions/update.retry'
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n mysql-down mysql-5.6-prod-1-s2dh7 -- /bin/bash -c \"mysql -u root < /tmp/db.dump\"", "delta": "0:00:01.017861", "end": "2018-03-01 12:25:26.569404", "msg": "non-zero return code", "rc": 1, "start": "2018-03-01 12:25:25.551543", "stderr": "Unable to use a TTY - input is not a terminal or the right kind of file\nERROR 1064 (42000) at line 861: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '`passw' at line 44\ncommand terminated with exit code 1", "stderr_lines": ["Unable to use a TTY - input is not a terminal or the right kind of file", "ERROR 1064 (42000) at line 861: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '`passw' at line 44", "command terminated with exit code 1"], "stdout": "", "stdout_lines": []}

Comment 26 Jason Montleon 2018-03-01 16:44:01 UTC
Zhang, If you are willing to try with an upstream test apb from a docker.io organization I replaced cp with rsync in docker.io/jmontleon/mysql-apb:latest

I'd like to work around the problem in the short term if possible, even if we don't get a fix for cp.

I'm not convinced it will make much difference as the only real change is whether it executes tar vs. rsync and I'm more concerned something lower level is the issue, but it may be worth a shot.

Comment 27 Jason Montleon 2018-03-01 16:47:53 UTC
I can reproduce this outside of the pod:
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
^C  
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
# oc cp -n msyql-down4 /tmp/db/db.dump mysql-5.6-prod-1-qqmt4:/tmp/db.dump
^C

rsync may actually suffice as a workaround. I ran it 30 times without a hang.

Comment 29 Zihan Tang 2018-03-02 09:28:04 UTC
They really more reliable.
But the instance can not update for the second time.
Failed at.
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n mysql-57 mysql-5.6-prod-1-2698w -- /bin/bash -c \"mkdir /tmp/db\"", "delta": "0:00:00.697088", "end": "2018-03-02 07:14:59.704604", "msg": "non-zero return code", "rc": 1, "start": "2018-03-02 07:14:59.007516", "stderr": "Unable to use a TTY - input is not a terminal or the right kind of file\nmkdir: cannot create directory '/tmp/db': File exists\ncommand terminated with exit code 1", "stderr_lines": ["Unable to use a TTY - input is not a terminal or the right kind of file", "mkdir: cannot create directory '/tmp/db': File exists", "command terminated with exit code 1"], "stdout": "", "stdout_lines": []}

the '/tmp/db' is exist because it is copied from the 'old' pod.

Comment 31 Zihan Tang 2018-03-05 08:08:54 UTC
Verified 
ASB: 1.1.15
apb: v3.9.1-1.3
I've updated about 20 times, no failure,


Note You need to log in before you can comment on or make changes to this bug.