Bug 1549019 - In crio env, update *sql.apb version failed with new db data created
Summary: In crio env, update *sql.apb version failed with new db data created
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assignee: Jason Montleon
QA Contact: Zihan Tang
URL:
Whiteboard:
Depends On: 1549259
Blocks:
Reported: 2018-02-26 09:11 UTC by Zihan Tang
Modified: 2018-12-13 19:27 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1549259 (view as bug list)
Environment:
Last Closed: 2018-12-13 19:26:59 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:3748 0 None None None 2018-12-13 19:27:10 UTC

Description Zihan Tang 2018-02-26 09:11:06 UTC
Description of problem:
In a cri-o environment, after provisioning a SQL APB (such as MariaDB) and then creating new data in the database, updating the APB version fails and stays pending on 'TASK [... Backup source database]'.
If the version or plan is updated without creating any data, the update succeeds.
With SELinux disabled, the update still stays pending on 'Backup source database'.

Version-Release number of selected component (if applicable):
ASB: 1.1.14
APB: stage registry: v3.9

How reproducible:
Always.

Steps to Reproduce:

1. Disable SELinux enforcement (setenforce 0):
[root@host-172-16-120-96 tmp]# setenforce 0
[root@host-172-16-120-96 tmp]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28
2. Provision MariaDB 10.0 with the dev plan.
3. Create data in MariaDB:
  [root@host-172-16-120-96 tmp]# oc rsh rhscl-mariadb-10.0-dev-1-746v8
sh-4.2$  mysql -uadmin -pdddd -h127.0.0.1  
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 6
Server version: 10.0.33-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| admin              |
| information_schema |
| test               |
+--------------------+
3 rows in set (0.00 sec)

MariaDB [(none)]> use admin
Database changed
MariaDB [admin]> show tables;
Empty set (0.00 sec)

MariaDB [admin]> CREATE TABLE COMPANY(
    ->    ID INT PRIMARY KEY     NOT NULL,
    ->    NAME           TEXT    NOT NULL,
    ->    AGE            INT     NOT NULL,
    ->    ADDRESS        CHAR(50),
    ->    SALARY         REAL,   JOIN_DATE   DATE
    -> );
Query OK, 0 rows affected (0.01 sec)

MariaDB [admin]> INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY,JOIN_DATE) VALUES (1, 'Paul', 32, 'California', 20000.00 ,'2001-07-13');
Query OK, 1 row affected (0.00 sec)

4. Update MariaDB to 10.2 with the prod plan in the UI.

Actual results:
Step 4: the update failed.
The update sandbox stays pending on 'Backup source database':
[root@host-172-16-120-96 tmp]# oc get pod 
NAME                                       READY     STATUS    RESTARTS   AGE
apb-936fdf7f-8c48-4718-9cfe-2642a7fc9fb5   1/1       Running   0          14m
[root@host-172-16-120-96 tmp]# oc logs -f apb-936fdf7f-8c48-4718-9cfe-2642a7fc9fb5
+ [[ update --extra-vars {"_apb_plan_id":"prod","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"bec8e8df-e96f-4cf9-bd5e-281b35bd6dd9","cluster":"openshift","mariadb_database":"admin","mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-1"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000240000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"prod","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"bec8e8df-e96f-4cf9-bd5e-281b35bd6dd9","cluster":"openshift","mariadb_database":"admin","mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-1"}'

PLAY [Deploy rhscl-mariadb-apb to openshift] ***********************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find pod we need to update] ****************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find dc we will clean up] ******************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Backup source database] ********************

Expected results:
The update succeeds.

Additional info:
Refer to bug 1535931#c18  and comment 23

Comment 2 Jason Montleon 2018-02-26 16:57:18 UTC
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"prod","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"bec8e8df-e96f-4cf9-bd5e-281b35bd6dd9","cluster":"openshift","mariadb_database":"admin","mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-1"}'

looks like this should be the host in the maria-1 namespace

[root@host-172-16-120-96 audit]# oc exec -it maria-1                             rhscl-mariadb-10.0-dev-1-746v8 /bin/bash

No DB dump exists:
$ ls /tmp
ks-script-8eoXwX  yum.log

Can you tell me whether SELinux was disabled prior to creating any APBs? Are you using the UI or some other means to create the APBs? What user are you operating as?

In your environment I had success as admin/admin:
# oc project maria-test
Now using project "maria-test" on server "https:

Before update:
# oc get pods
mediawiki123-2-r9f4l             1/1       Running   0          1m
rhscl-mariadb-10.1-dev-1-rh8v4   1/1       Running   0          3m

After update:
# oc get pods
NAME                             READY     STATUS    RESTARTS   AGE
mediawiki123-2-r9f4l             1/1       Running   0          6m
rhscl-mariadb-10.2-dev-1-8z5vm   1/1       Running   0          4m

Logs from update:
# oc logs -f -n rh-mariadb-apb-upda-6dt9l           apb-6af958b1-bb92-4fc6-b4e1-8154ec502957
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"44c8fa45-23e0-4de0-88b0-a41a6edd1a0d","cluster":"openshift","mariadb_database":"admin","mariadb_password":"changeme","mariadb_root_password":"changeme","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-test"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000280000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"44c8fa45-23e0-4de0-88b0-a41a6edd1a0d","cluster":"openshift","mariadb_database":"admin","mariadb_password":"changeme","mariadb_root_password":"changeme","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-test"}'

PLAY [Deploy rhscl-mariadb-apb to openshift] ***********************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find pod we need to update] ****************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find dc we will clean up] ******************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Backup source database] ********************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Copy over db backup] ***********************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : set rhscl-mariadb service state to present] ***
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : include_tasks] *****************************
included: /opt/ansible/roles/rhscl-mariadb-apb-openshift/tasks/dev.yml for localhost

TASK [rhscl-mariadb-apb-openshift : set mariadb deployment using persistent storage to present] ***
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : include_tasks] *****************************
skipping: [localhost]

TASK [rhscl-mariadb-apb-openshift : Wait for mariadb to come up] ***************
ok: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find pod we need to restore] ***************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Copy over db backup] ***********************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Restore database] **************************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Remove old dc] *****************************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : ensure production volume is absent] ********
ok: [localhost] => (item=10.0)
ok: [localhost] => (item=10.1)
ok: [localhost] => (item=10.2)

TASK [rhscl-mariadb-apb-openshift : encode bind credentials] *******************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=14   changed=11   unreachable=0    failed=0   

+ EXIT_CODE=0
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 0

Comment 3 Jason Montleon 2018-02-26 17:48:30 UTC
Also in your env:

# oc project pg-test
Now using project "pg-test" on server "https://

Before update:
# oc get pods
NAME                         READY     STATUS    RESTARTS   AGE
mediawiki123-2-dnxv8         1/1       Running   0          23s
postgresql-9.5-dev-1-5cxqq   1/1       Running   0          44m

After Update:
# oc get pods
NAME                         READY     STATUS    RESTARTS   AGE
mediawiki123-2-dnxv8         1/1       Running   0          2m
postgresql-9.6-dev-1-d785k   1/1       Running   0          1m

APB Log:
# oc logs -f -n rh-postgresql-apb-upda-jf7lr        apb-f5ab05a8-6425-425c-bd34-af15375cb8e3
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"d5915e05b253df421efe6e41fb6a66ba","_apb_service_instance_id":"1359848d-9736-4e5d-868b-5edd2e382549","cluster":"openshift","namespace":"pg-test","postgresql_database":"admin","postgresql_password":"changeme","postgresql_user":"admin","postgresql_version":"9.6"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000320000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yaml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"d5915e05b253df421efe6e41fb6a66ba","_apb_service_instance_id":"1359848d-9736-4e5d-868b-5edd2e382549","cluster":"openshift","namespace":"pg-test","postgresql_database":"admin","postgresql_password":"changeme","postgresql_user":"admin","postgresql_version":"9.6"}'

PLAY [Deploy rhscl-postgresql-apb to "openshift"] ******************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Find pod we need to update] ***********************
changed: [localhost]

TASK [rhscl-postgresql-apb : Find dc we will clean up] *************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Find deployment we will clean up] *****************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Backup source database] ***************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Copy over db backup] ******************************
changed: [localhost]

TASK [rhscl-postgresql-apb : set service state to present] *********************
changed: [localhost]

TASK [rhscl-postgresql-apb : include_tasks] ************************************
included: /opt/ansible/roles/rhscl-postgresql-apb/tasks/dev.yml for localhost

TASK [rhscl-postgresql-apb : set development deployment config state to present] ***
skipping: [localhost]

TASK [rhscl-postgresql-apb : set development deployment config state to present] ***
changed: [localhost]

TASK [rhscl-postgresql-apb : include_tasks] ************************************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Wait for postgres to come up] *********************
ok: [localhost]

TASK [rhscl-postgresql-apb : Find pod we need to restore] **********************
changed: [localhost]

TASK [rhscl-postgresql-apb : Copy over db backup] ******************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Restore database] *********************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Remove deployment config] *************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Remove deployment] ********************************
skipping: [localhost]

TASK [rhscl-postgresql-apb : ensure production volume is absent] ***************
ok: [localhost] => (item=9.4)
ok: [localhost] => (item=9.5)
ok: [localhost] => (item=9.6)

TASK [rhscl-postgresql-apb : encode bind credentials] **************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=14   changed=11   unreachable=0    failed=0   

+ EXIT_CODE=0
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 0

Comment 4 Jason Montleon 2018-02-26 19:07:18 UTC
Running the db backup command outside of an APB, there seems to be an issue with the environment that has nothing to do with APBs.

This works:
# oc exec -n pg-test postgresql-9.6-dev-1-d785k  -- /bin/bash -c "pg_dumpall -f /tmp/db.dump"

This does not:
# oc exec -n post-96-dev postgresql-9.6-dev-1-n8br8 -- /bin/bash -c "pg_dumpall -f /tmp/db.dump"

Those are the same command against different pods.

This works though:
# oc exec -it -n post-96-dev postgresql-9.6-dev-1-n8br8 -- /bin/bash -c "pg_dumpall -f /tmp/db.dump"

That's with -it added. I tried -i and -t individually and they did not appear to make a difference.

I can probably work around this by adding -it to the commands, but I would not think that I should have to. There seems to be a bug with the client or server here. Unfortunately setting a loglevel on the client isn't getting me any output.
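
One way to tell a hang apart from an ordinary failure when comparing these commands is to run each under a time limit. The sketch below is illustrative only: `classify_run` is a hypothetical helper name, the 30-second limit in the example is an arbitrary choice, and it assumes GNU coreutils `timeout` is available on the host. It is not part of the APBs or of any fix discussed in this bug.

```shell
# Hypothetical helper (not part of any APB): run a command under a time limit
# and report whether it completed, failed, or had to be killed.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the limit expires.
classify_run() {
  limit="$1"
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
  case $? in
    0)   echo "completed" ;;
    124) echo "hung" ;;
    *)   echo "failed" ;;
  esac
}

# Example against the pod from this comment (needs a logged-in oc client):
#   classify_run 30 oc exec -n post-96-dev postgresql-9.6-dev-1-n8br8 -- \
#     /bin/bash -c "pg_dumpall -f /tmp/db.dump"
```

A result of `hung` for the plain `oc exec` would match the behavior described above. When the wrapped command expects a TTY (as with `-it`), `timeout --foreground` may be needed.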

Comment 5 Jason Montleon 2018-02-26 19:16:49 UTC
oc cp commands hang as well, so we will need another workaround for that, or someone with more knowledge of oc to look into this and fix it. Otherwise there's not much point in working around the exec issue; it will just bump the failure down one task.

Comment 6 Zihan Tang 2018-02-27 06:52:09 UTC
(In reply to Jason Montleon from comment #2)
> + ansible-playbook /opt/apb/actions/update.yml --extra-vars
> '{"_apb_plan_id":"prod","_apb_service_class_id":
> "2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"bec8e8df-e96f-
> 4cf9-bd5e-281b35bd6dd9","cluster":"openshift","mariadb_database":"admin",
> "mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":
> "admin","mariadb_version":"10.2","namespace":"maria-1"}'
> 
> looks like this should be the host in the maria-1 namespace
> 
> [root@host-172-16-120-96 audit]# oc exec -it maria-1                        
> rhscl-mariadb-10.0-dev-1-746v8 /bin/bash
> 
> No DB dump exists:
> $ ls /tmp
> ks-script-8eoXwX  yum.log
> 
> Can you tell me was selinux disabled prior to creating any APB's? Are you
> using the UI or some other means to create the APB's? What user are you
> operating as?
> 
> In you environment I had success as admin/admin:
When I tested this issue, the test scenarios were:
1. About SELinux:
    a. SELinux enabled, provision postgresql-apb, update plan or version -> succeeded.
    b. SELinux enabled, provision mediawiki and postgresql, create an account in the mediawiki web UI, downgrade postgresql to 9.5 -> succeeded.
    c. SELinux enabled, provision postgresql or mariadb, create data (as system:admin), then update version -> failed.
    d. SELinux disabled, provision postgresql or mariadb, create data (as system:admin), then update version -> failed.
2. I performed the provision and update in the UI; the result is the same with a normal user (test) or a cluster-admin user (zitang).
    I created the data in the database as the 'system:admin' user.

Comment 7 Zihan Tang 2018-02-27 07:19:52 UTC
Jason,
If I use 'test' (a normal user without cluster-admin) to create the data, the update still fails.

[root@host-172-16-120-58 ~]# oc get pod 
NAME                             READY     STATUS    RESTARTS   AGE
rhscl-mariadb-10.2-dev-1-vtg8g   1/1       Running   0          14m
[root@host-172-16-120-58 ~]# oc rsh rhscl-mariadb-10.2-dev-1-vtg8g
sh-4.2$ mysql -uadmin -pddd -h127.0.0.1  
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 10
Server version: 10.2.8-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use admin
Database changed
MariaDB [admin]> CREATE TABLE COMPANY(
    ->    ID INT PRIMARY KEY     NOT NULL,
    ->    NAME           TEXT    NOT NULL,
    ->    AGE            INT     NOT NULL,
    ->    ADDRESS        CHAR(50),
    ->    SALARY         REAL,   JOIN_DATE   DATE
    -> );
Query OK, 0 rows affected (0.01 sec)

MariaDB [admin]> 
MariaDB [admin]> INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY,JOIN_DATE) VALUES (1, 'Paul', 32, 'California', 20000.00 ,'2001-07-13');
Query OK, 1 row affected (0.03 sec)


What is the role of the user 'admin:admin' you mentioned in comment 2?

Comment 9 Zihan Tang 2018-02-27 07:26:19 UTC
Jason, 
We have the following scenarios. Is there any difference in the process among them when updating the SQL APB?
1. Update plan after provision.
2. Update version after provision.
3. Bind to mediawiki, create a mediawiki user in the UI, then update version or plan.
4. Create data (table or db), then update version or plan.

Especially between scenario 3 and scenario 4.

Thanks.

Comment 10 Jason Montleon 2018-02-27 13:59:48 UTC
It looks like the same issue is present:


rh-mariadb-apb-upda-4zpxk           apb-7c7c0a3a-e3de-4990-9dd7-6ee54677fcf3   1/1       Running     0          5h
rh-postgresql-apb-upda-5nn2x        apb-9acd5bed-2b4c-454c-a685-f9c9f3a8bcf1   1/1       Running     0          6h
rh-postgresql-apb-upda-hv5pf        apb-c4ac544f-23bf-45ae-b021-752575630ab0   1/1       Running     0          6h
rh-postgresql-apb-upda-lwqvb        apb-ada71d6b-722b-4f7e-b042-1d6bea18662e   1/1       Running     0          6h



# oc exec -n post-selinux postgresql-9.6-dev-1-ttffd -- /bin/bash -c "ls /tmp"
^C

# oc exec -it -n post-selinux postgresql-9.6-dev-1-ttffd -- /bin/bash -c "ls /tmp"
ks-script-8eoXwX  yum.log

Each is stuck on an oc cp after the oc exec failed to produce the db dump file.

Comment 11 Jason Montleon 2018-02-27 17:17:27 UTC
These may alleviate some of the issues. If things break down cp will still fail, but execs should still run. I'll keep looking for a reasonable workaround for copy.

https://github.com/ansibleplaybookbundle/mariadb-apb/pull/24
https://github.com/ansibleplaybookbundle/mysql-apb/pull/24
https://github.com/ansibleplaybookbundle/postgresql-apb/pull/35

Comment 12 Jason Montleon 2018-02-27 19:50:20 UTC
These might work around the oc cp issues until we get a fix and can get back to using it.

https://github.com/ansibleplaybookbundle/mariadb-apb/pull/25
https://github.com/ansibleplaybookbundle/mysql-apb/pull/25
https://github.com/ansibleplaybookbundle/postgresql-apb/pull/36

Comment 14 Zihan Tang 2018-02-28 06:03:15 UTC
I used mariadb-apb-v3.9.0-0.53.0.1 and postgresql-apb-v3.9.0-0.53.0.1 with SELinux disabled.
The update still failed, pending on 'Backup source database'.

Steps:
1. Provision mariadb/postgresql.
2. Create data as root or another SQL user.
3. Update version.

[root@host-172-16-120-23 tmp]# oc exec -it -n post-down  postgresql-9.6-dev-1-x9h5s -- /bin/bash -c "ls /tmp"
ks-script-8eoXwX  yum.log

[root@host-172-16-120-23 tmp]# oc exec -it -n maria-up rhscl-mariadb-10.0-dev-1-dcjf4 -- /bin/bash -c "ls /tmp"
ks-script-8eoXwX  yum.log

The DB file is not dumped.

+ [[ update --extra-vars {"_apb_plan_id":"prod","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"77feba75-4f65-4ef5-9854-98c612dd2561","cluster":"openshift","mariadb_database":"admin","mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-up"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000180000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"prod","_apb_service_class_id":"2c259ddd8059b9bc65081e07bf20058f","_apb_service_instance_id":"77feba75-4f65-4ef5-9854-98c612dd2561","cluster":"openshift","mariadb_database":"admin","mariadb_password":"dddd","mariadb_root_password":"dddd","mariadb_user":"admin","mariadb_version":"10.2","namespace":"maria-up"}'

PLAY [Deploy rhscl-mariadb-apb to openshift] ***********************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find pod we need to update] ****************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Find dc we will clean up] ******************
changed: [localhost]

TASK [rhscl-mariadb-apb-openshift : Backup source database] ********************

Comment 17 Zihan Tang 2018-03-02 09:05:39 UTC
Jason,
I tried this in a cri-o environment.
The update is pending on 'Create db backup directory'.
Checking the old pod, the /tmp/db directory was not actually created.
I updated the APB 5 times and it succeeded only once.
However, the 'mkdir' command seems to run fine outside of the APB:
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "ls /tmp"
ks-script-8eoXwX  yum.log
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "ls /tmp"
db  ks-script-8eoXwX  yum.log
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db1"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db2"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db3"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db4"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "mkdir /tmp/db5"
[root@host-172-16-120-75 ~]# oc exec -it rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash  -c "ls /tmp"
db  db1  db2  db3  db4	db5  ks-script-8eoXwX  yum.log

Disabling SELinux gives the same result.
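
Because the hang is intermittent (one success in five updates in this comment), a single manual `oc exec` may not reproduce it. A rough sketch for repeating a command several times under a time limit and counting how many attempts finish is shown here. `count_completed` is a hypothetical name, the attempt count and limit are arbitrary, and GNU coreutils `timeout` is assumed. It only measures whether the command returns in time, not whether an APB update itself would succeed.

```shell
# Hypothetical helper: run the same command N times, each under a time limit,
# and print how many attempts exited with status 0 before the limit.
# Assumes GNU coreutils `timeout`.
count_completed() {
  attempts="$1"
  limit="$2"
  shift 2
  ok=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if timeout "$limit" "$@" >/dev/null 2>&1; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok"
}

# Example with the pod from this comment (needs a logged-in oc client):
#   count_completed 5 30 oc exec rhscl-mariadb-10.1-prod-1-9zkhh -- /bin/bash -c "ls /tmp"
```

A count lower than the number of attempts would suggest that the exec call itself stalls some of the time.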

Comment 19 Jason Montleon 2018-03-05 14:37:35 UTC
Could the oc cp issues be caused by this overlay/tar problem? It's stated in there that this most likely affects overlay and overlay2, and the QE hosts are configured with 'DOCKER_STORAGE_OPTIONS="--storage-driver overlay2 "'.

https://github.com/moby/moby/issues/19647

I'm wondering if after a cp fails that's when the terminal issues start.

Comment 20 Zihan Tang 2018-03-06 07:24:31 UTC
Jason, 
I used the APB 'v3.9.0-0.53.0.1' to test.
In cri-o, the update hangs and never begins the 'cp'; the new pod has not started.
[root@host-172-16-120-49 ~]# oc get pod -n maira-1
NAME                              READY     STATUS    RESTARTS   AGE
rhscl-mariadb-10.2-prod-1-z8hlp   1/1       Running   0          12m

[root@host-172-16-120-49 ~]# oc exec -it rhscl-mariadb-10.2-prod-1-z8hlp -- /bin/bash -c "ls /tmp "
db.dump  ks-script-8eoXwX  yum.log

The sandbox hangs on:
TASK [rhscl-mariadb-apb-openshift : Backup source database] ********************
If I don't create any database (it's very small), the update succeeds.

If I use APB 'v3.9.1-1.2' with the 'rsync' command, the task hangs on:
TASK [rhscl-mariadb-apb-openshift : Create db backup directory] ****************
The directory is not created.
[root@host-172-16-120-49 ~]# oc get pod 
NAME                              READY     STATUS    RESTARTS   AGE
rhscl-mariadb-10.2-prod-1-c92dj   1/1       Running   0          48m

[root@host-172-16-120-49 ~]# oc exec -it rhscl-mariadb-10.2-prod-1-c92dj  -- /bin/bash -c "ls /tmp"
ks-script-8eoXwX  yum.log

Comment 21 Jason Montleon 2018-03-09 13:41:13 UTC
Please try again with the latest puddle (3.9.4 images I believe) for OpenShift. A fix went in that affects stdin closing.

There's also a new build of cri-o (cri-o-1.9.8-1.git7d9d2aa) that addresses the execsync issue that was suggested as a possible cause.

Comment 22 Zihan Tang 2018-03-12 07:45:44 UTC
Verification failed.
openshift: v3.9.7
cri-o: crio version 1.9.9

The update sandbox still hangs on:
TASK [rhscl-postgresql-apb : Create db backup directory] ***********************

Comment 24 Jason Montleon 2018-03-13 13:25:03 UTC
Thanks for trying.

In the bug we depend on, a PR was posted that states it fixes this and will be available in cri-o-1.9.10.

https://github.com/kubernetes-incubator/cri-o/pull/1443

Once we have 1.9.10 we can try again.

Comment 27 Zhang Cheng 2018-03-16 12:34:56 UTC
Jason,

Could you help confirm which "Target Release" is suitable?

Zihan verified this and it passed with:
ASB: 1.1.16
service-catalog: v0.1.9
cri-o: crio version 1.9.10

I think 3.9.0 is more suitable. Is that correct?

Comment 28 Jason Montleon 2018-03-16 12:36:52 UTC
I switched it to 3.9.0. I think you're correct. Thanks!

Comment 31 errata-xmlrpc 2018-12-13 19:26:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3748

