1535931 – Newly created database is not preserved when updating the plan or parameters of a database APB


Status:	CLOSED ERRATA
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Service Broker
Version:	3.9.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Release:	3.9.0
Assignee:	Jason Montleon
QA Contact:	Zihan Tang

Reported:	2018-01-18 09:40 UTC by Qixuan Wang
Modified:	2018-03-28 14:20 UTC
CC List:	6 users
Doc Type:	No Doc Update
Last Closed:	2018-03-28 14:20:40 UTC

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:0489	0	None	None	None	2018-03-28 14:20:59 UTC

Description Qixuan Wang 2018-01-18 09:40:02 UTC

Description of problem:
I provisioned a PostgreSQL APB with the dev plan, then created a database and a table and wrote data to it.
1) After updating the plan from dev to prod, the data was lost.
2) After updating the database version, the data was lost.


Version-Release number of selected component (if applicable):
openshift v3.9.0-0.20.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8
brew.....ose-ansible-service-broker v3.9
brew.....ose-service-catalog v3.9


How reproducible:
Always


Steps to Reproduce:
1. Provision a PostgreSQL APB with development plan

2. Check ServiceInstance and pod
# oc edit serviceinstance rh-postgresql-apb-ck24x
# oc get pod

3. Write data into the database
CREATE DATABASE testdb;

\c testdb;

CREATE TABLE COMPANY(
   ID INT PRIMARY KEY     NOT NULL,
   NAME           TEXT    NOT NULL,
   AGE            INT     NOT NULL,
   ADDRESS        CHAR(50),
   SALARY         REAL,
   JOIN_DATE      DATE
);

INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY,JOIN_DATE) VALUES (1, 'Paul', 32, 'California', 20000.00 ,'2001-07-13');


4. Update the plan from dev to prod
# oc edit serviceinstance rh-postgresql-apb-ck24x

5. Check ServiceInstance and pod again

6. Check data in the database


Actual results:
2. [root@host-xxx ~]# oc get pod
NAME                            READY     STATUS    RESTARTS   AGE
po/postgresql-9.4-dev-1-fz2jc   1/1       Running   0          14m

3. postgres=# \l
                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
-----------+----------+----------+------------+------------+-----------------------
 admin     | admin    | UTF8     | en_US.utf8 | en_US.utf8 | 
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 testdb    | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
(5 rows)

testdb=# \dt
          List of relations
 Schema |  Name   | Type  |  Owner   
--------+---------+-------+----------
 public | company | table | postgres
(1 row)

testdb=# SELECT * FROM COMPANY;
 id | name | age |                      address                       | salary | join_date  
----+------+-----+----------------------------------------------------+--------+------------
  1 | Paul |  32 | California                                         |  20000 | 2001-07-13
(1 row)

5. [root@host-xxx ~]# oc get pod
NAME                          READY     STATUS    RESTARTS   AGE
postgresql-9.4-prod-1-qjpk7   1/1       Running   0          2m

6. [root@host-xxx ~]# oc rsh postgresql-9.4-prod-1-qjpk7
sh-4.2$ psql
psql (9.4.14)
Type "help" for help.

postgres=# \l
                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
-----------+----------+----------+------------+------------+-----------------------
 admin     | admin    | UTF8     | en_US.utf8 | en_US.utf8 | 
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
(4 rows)


Expected results:
6. The output should be the same as in step 3: testdb and its data should still be present.


Additional info:

Comment 1 Qixuan Wang 2018-01-18 10:46:58 UTC

I set the following parameters when provisioning the APB:
{"postgresql_database":"admin","postgresql_user":"admin","postgresql_version":"9.5","postgresql_password":"admin"}
I wrote data to the database/table admin/admin, then updated the plan and parameters, and the data was retained.
A new database created by the superuser was not retained. I'm not sure whether that is outside the scope of the feature.


After updating:
sh-4.2$ psql -h 127.0.0.1 admin admin
psql (9.5.9)
Type "help" for help.

admin=> \l
                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
-----------+----------+----------+------------+------------+-----------------------
 admin     | admin    | UTF8     | en_US.utf8 | en_US.utf8 | 
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
(4 rows)

admin=> \dt
        List of relations
 Schema |  Name   | Type  | Owner 
--------+---------+-------+-------
 public | company | table | admin
(1 row)

admin=> SELECT * FROM COMPANY;
 id | name | age |                      address                       | salary | join_date  
----+------+-----+----------------------------------------------------+--------+------------
  1 | Paul |  32 | California                                         |  20000 | 2001-07-13
(1 row)

admin=> INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY,JOIN_DATE) VALUES (2, 'Tom', 42, 'California', 10000.00 ,'2001-09-11');
INSERT 0 1
admin=> SELECT * FROM COMPANY;
 id | name | age |                      address                       | salary | join_date  
----+------+-----+----------------------------------------------------+--------+------------
  1 | Paul |  32 | California                                         |  20000 | 2001-07-13
  2 | Tom  |  42 | California                                         |  10000 | 2001-09-11
(2 rows)

Comment 2 Jason Montleon 2018-01-19 13:34:29 UTC

Why not use the database created at the time you ran the APB? We might be able to dump all databases instead of a specific one. I'd have to investigate.

Comment 3 Qixuan Wang 2018-01-22 02:27:42 UTC

Comment 1 covers the database created when I ran the APB. That worked as expected.

Comment 4 Jason Montleon 2018-01-26 14:57:41 UTC

This should save everything for postgresql. Still investigating for MariaDB and MySQL.
https://github.com/ansibleplaybookbundle/postgresql-apb/pull/30
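
For readers following along, a minimal sketch of the difference between dumping one database and dumping everything. The `pg_dumpall -f /tmp/db.dump` form matches the "Backup source database" task output quoted later in this report; the single-database `pg_dump` form is only an assumption about what a per-database backup could look like, and `backup_cmd` is a hypothetical helper that prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: prints a backup command instead of executing it.
# backup_cmd is a hypothetical helper; namespace, pod and database are placeholders.

backup_cmd() {
  local namespace="$1" pod="$2" scope="$3" database="${4:-}"
  case "$scope" in
    all)
      # Dumps every database in the cluster, including ones created after provisioning.
      printf 'kubectl exec -n %s %s -- /bin/bash -c "pg_dumpall -f /tmp/db.dump"\n' \
        "$namespace" "$pod"
      ;;
    single)
      # Assumed shape of a single-database dump: anything outside $database is not saved.
      printf 'kubectl exec -n %s %s -- /bin/bash -c "pg_dump -f /tmp/db.dump %s"\n' \
        "$namespace" "$pod" "$database"
      ;;
    *)
      echo "unknown scope: $scope" >&2
      return 1
      ;;
  esac
}
```

With a single-database dump of `admin`, a database such as `testdb` from the reproduction steps would not be carried over, which is consistent with the reported data loss.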

Comment 5 Jason Montleon 2018-01-26 15:28:11 UTC

https://github.com/ansibleplaybookbundle/mysql-apb/pull/21
https://github.com/ansibleplaybookbundle/mariadb-apb/pull/21

Comment 11 Jason Montleon 2018-02-02 13:53:19 UTC

Is the pod actually running? Are you able to determine whether it is stuck trying to run some process when it stops at "TASK [rhscl-postgresql-apb : Find deployment we will clean up] *****************"?

As I mentioned, your APB logs are cut off early, apparently without any errors, which doesn't make much sense to me. I used openshift v3.9.0-0.34.0 yesterday and was unable to reproduce this.

If you have an environment where this is happening that I can log into and look at, I'd be happy to try to figure out what's going on.

Comment 12 Ryan Hallisey 2018-02-02 14:14:06 UTC

The task: "TASK [rhscl-postgresql-apb : Find deployment we will clean up] *****************" will only run when the task: TASK [rhscl-postgresql-apb : Find dc we will clean up] ************************* is skipped. Are you able to consistently reproduce the playbook hanging?

Comment 13 Jason Montleon 2018-02-02 14:56:17 UTC

The next task would exec to the postgres pod and dump the database. Is the postgres pod running?

Comment 14 Jason Montleon 2018-02-02 18:41:24 UTC

I set up a 3.9 multinode environment with one master and four nodes and multi-tenant SDN networking to see if a more complex environment would tease the issue out, but I am still unable to reproduce it after running several updates in multiple projects, switching plans and versions in different directions.

Comment 15 Jason Montleon 2018-02-04 19:28:36 UTC

Can you please provide oc describe output for the broker pod, the postgres pod and the APB that is getting stuck, full logs for the stuck APB and the broker, and the inventory file used to set up the cluster?

And if I can access the environment I can try to diagnose the issue directly.

Comment 18 Jason Montleon 2018-02-05 16:20:24 UTC

I wonder if what you are seeing is an selinux issue with cri-o.

After deploying postgres I see constant errors in audit.log:
type=AVC msg=audit(1517846970.413:1596): avc:  denied  { write } for  pid=25075 comm="pg_ctl" name=".s.PGSQL.5432" dev="dm-0" ino=103097197 scontext=system_u:system_r:svirt_lxc_net_t:s0:c5,c11 tcontext=system_u:object_r:container_share_t:s0 tclass=sock_file
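
As a reading aid, a small sketch for pulling individual fields out of an AVC denial line like the one above. `avc_field` is a hypothetical helper, not part of any audit tooling, and it assumes field values contain no spaces or `=` characters, which holds for this line.

```shell
#!/usr/bin/env bash
# Sketch only: extract one key=value field from a single audit log line.
# avc_field is a hypothetical helper name.

avc_field() {
  local key="$1" line="$2"
  # Put each space-separated token on its own line, then print the value
  # of the token whose key matches, with surrounding quotes removed.
  printf '%s\n' "$line" \
    | tr ' ' '\n' \
    | awk -F= -v k="$key" '$1 == k { gsub(/"/, "", $2); print $2 }'
}
```

For the line above, `avc_field comm "$line"` gives the process (`pg_ctl`), `avc_field name "$line"` gives the socket file, and `scontext` / `tcontext` give the source and target SELinux contexts involved in the denied write.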

My update pod did not hang, but I did get an error at the same point:
...
TASK [rhscl-postgresql-apb : Find deployment we will clean up] *****************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Backup source database] ***************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "kubectl exec -n test postgresql-9.4-dev-1-dkv89 -- /bin/bash -c \"pg_dumpall -f /tmp/db.dump\"", "delta": "0:00:00.547038", "end": "2018-02-05 16:10:21.036604", "msg": "non-zero return code", "rc": 1, "start": "2018-02-05 16:10:20.489566", "stderr": "pg_dumpall: could not connect to database \"template1\": could not connect to server: Permission denied\n\tIs the server running locally and accepting\n\tconnections on Unix domain socket \"/var/run/postgresql/.s.PGSQL.5432\"?\n\ncommand terminated with exit code 1", "stderr_lines": ["pg_dumpall: could not connect to database \"template1\": could not connect to server: Permission denied", "\tIs the server running locally and accepting", "\tconnections on Unix domain socket \"/var/run/postgresql/.s.PGSQL.5432\"?", "", "command terminated with exit code 1"], "stdout": "", "stdout_lines": []}
	to retry, use: --limit @/opt/apb/actions/update.retry
...

Can you try disabling SELinux and see if it works? If it does, I'd be inclined to think this is a cri-o SELinux bug, as it's preventing the rhscl postgresql pod from performing necessary tasks.

Comment 19 Daniel Walsh 2018-02-05 19:04:12 UTC

This looks like a socket stored on a COW file system?

Is this a socket listening on /run/ directory?

Comment 20 Jason Montleon 2018-02-06 13:37:14 UTC

Correct, in this case there was no persistent storage of any kind attached.

With selinux disabled I have:
$ ls /var/run/postgresql/.s.PGSQL.5432 /run/postgresql/.s.PGSQL.5432 -l
srwxrwxrwx. 1 1000120000 root 0 Feb  6 13:28 /run/postgresql/.s.PGSQL.5432
srwxrwxrwx. 1 1000120000 root 0 Feb  6 13:28 /var/run/postgresql/.s.PGSQL.5432

Comment 22 Jason Montleon 2018-02-08 13:45:34 UTC

Can you confirm that this works without cri-o, and with cri-o when SELinux is disabled, and then verify it? If it's a cri-o / SELinux issue preventing the application container from writing the .sock file, dump files, etc., I'd recommend opening a new bug against the correct component, as there's not much I'll be able to do about that.

Comment 23 Zihan Tang 2018-02-09 07:28:23 UTC

Jason,
I tried to disable SELinux with "setenforce 0"; the resulting SELinux status is:
[root@host-172-16-120-8 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      31
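
Note that in the output above SELinux is still enabled: "setenforce 0" switches the current mode to permissive, where denials are logged but not enforced. A minimal sketch for checking the current mode from sestatus-style output follows; `selinux_current_mode` is a hypothetical helper that reads the text on stdin.

```shell
#!/usr/bin/env bash
# Sketch only: print the "Current mode" value from sestatus-style output on stdin.
# selinux_current_mode is a hypothetical helper name.

selinux_current_mode() {
  # Split each line at the colon, match the exact label, strip leading padding.
  awk -F: '$1 == "Current mode" { gsub(/^[ \t]+/, "", $2); print $2 }'
}
```

Typical use would be `sestatus | selinux_current_mode`, which prints `permissive` for the status shown above.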

Using the latest downstream image in the cri-o env, updating the plan from dev to prod after creating a DB still failed with the same log.
The sandbox pod is stuck in the Running status.
Now using project "rh-postgresql-apb-upda-v74l4" on server "https://172.16.120.8:8443".
[root@host-172-16-120-8 ~]# oc get pod
 NAME                                       READY     STATUS    RESTARTS   AGE
apb-8e1c6679-6db3-4f83-ab7e-de6d420c55fd   1/1       Running   0          15m
[root@host-172-16-120-8 ~]#  oc logs -f apb-8e1c6679-6db3-4f83-ab7e-de6d420c55fd 
+ [[ update --extra-vars {"_apb_plan_id":"prod","_apb_service_class_id":"d5915e05b253df421efe6e41fb6a66ba","_apb_service_instance_id":"df138cda-d3b6-4465-8c93-d3b96bbc9996","cluster":"openshift","namespace":"post-5","postgresql_database":"admin","postgresql_password":"dddd","postgresql_user":"admin","postgresql_version":"9.6"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000260000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yaml --extra-vars '{"_apb_plan_id":"prod","_apb_service_class_id":"d5915e05b253df421efe6e41fb6a66ba","_apb_service_instance_id":"df138cda-d3b6-4465-8c93-d3b96bbc9996","cluster":"openshift","namespace":"post-5","postgresql_database":"admin","postgresql_password":"dddd","postgresql_user":"admin","postgresql_version":"9.6"}'
 [WARNING]: While constructing a mapping from /opt/apb/actions/update.yaml,
line 1, column 3, found a duplicate dict key (vars). Using last defined value
only.

PLAY [Deploy rhscl-postgresql-apb to "openshift"] ******************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Find pod we need to update] ***********************
changed: [localhost]

TASK [rhscl-postgresql-apb : Find dc we will clean up] *************************
changed: [localhost]

TASK [rhscl-postgresql-apb : Find deployment we will clean up] *****************
skipping: [localhost]

TASK [rhscl-postgresql-apb : Backup source database] ***************************

The asb log:
[2018-02-09T06:59:36.381Z] [DEBUG] - ServiceInstance Parameters: [map[_apb_service_class_id:d5915e05b253df421efe6e41fb6a66ba _apb_service_instance_id:df138cda-d3b6-4465-8c93-d3b96bbc9996 postgresql_database:admin postgresql_password:dddd postgresql_user:admin postgresql_version:9.6 _apb_plan_id:prod]]
[2018-02-09T06:59:36.381Z] [INFO] - ASYNC update in progress
[2018-02-09T06:59:36.381Z] [NOTICE] - ============================================================
[2018-02-09T06:59:36.381Z] [NOTICE] -                        UPDATING                             
[2018-02-09T06:59:36.381Z] [NOTICE] - ============================================================
[2018-02-09T06:59:36.381Z] [NOTICE] - Spec.ID: d5915e05b253df421efe6e41fb6a66ba
[2018-02-09T06:59:36.381Z] [NOTICE] - Spec.Name: rh-postgresql-apb
[2018-02-09T06:59:36.381Z] [NOTICE] - Spec.Image: registry.access.stage.redhat.com/openshift3/postgresql-apb:v3.9
[2018-02-09T06:59:36.381Z] [NOTICE] - Spec.Description: SCL PostgreSQL apb implementation
[2018-02-09T06:59:36.381Z] [NOTICE] - ============================================================
[2018-02-09T06:59:36.381Z] [INFO] - Checking if namespace post-5 exists.
[2018-02-09T06:59:36.383Z] [DEBUG] - ExecutingApb:
[2018-02-09T06:59:36.383Z] [DEBUG] - name:[ rh-postgresql-apb ]
[2018-02-09T06:59:36.383Z] [DEBUG] - image:[ registry.access.stage.redhat.com/openshift3/postgresql-apb:v3.9 ]
[2018-02-09T06:59:36.383Z] [DEBUG] - action:[ update ]
[2018-02-09T06:59:36.383Z] [DEBUG] - pullPolicy:[ IfNotPresent ]
[2018-02-09T06:59:36.383Z] [DEBUG] - role:[ edit ]
[2018-02-09T06:59:36.383Z] [DEBUG] - No proxy env vars found to be configured.
10.129.0.4 - - [09/Feb/2018:06:59:36 +0000] "PATCH /ansible-service-broker/v2/service_instances/df138cda-d3b6-4465-8c93-d3b96bbc9996?accepts_incomplete=true HTTP/1.1" 202 58
[2018-02-09T06:59:36.466Z] [DEBUG] - Trying to create apb sandbox: [ apb-8e1c6679-6db3-4f83-ab7e-de6d420c55fd ], with edit permissions in namespace rh-postgresql-apb-upda-v74l4
[2018-02-09T06:59:36.466Z] [NOTICE] - Creating RoleBinding apb-8e1c6679-6db3-4f83-ab7e-de6d420c55fd
[2018-02-09T06:59:36.585Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-02-09T06:59:36.585Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-02-09T06:59:36.585Z] [DEBUG] - operation:  f8953e65-6969-4eca-84ea-54d775be4813
[2018-02-09T06:59:36.586Z] [DEBUG] - state: in progress
10.129.0.4 - - [09/Feb/2018:06:59:36 +0000] "GET /ansible-service-broker/v2/service_instances/df138cda-d3b6-4465-8c93-d3b96bbc9996/last_operation?operation=f8953e65-6969-4eca-84ea-54d775be4813&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 29
[2018-02-09T06:59:36.631Z] [NOTICE] - Creating RoleBinding apb-8e1c6679-6db3-4f83-ab7e-de6d420c55fd
[2018-02-09T06:59:36.651Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-02-09T06:59:36.651Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-02-09T06:59:36.651Z] [DEBUG] - operation:  f8953e65-6969-4eca-84ea-54d775be4813
[2018-02-09T06:59:36.652Z] [DEBUG] - state: in progress

In the other env, the update also failed with the latest downstream image, but the status is different.
Steps:
1. Provision PostgreSQL with the dev plan.
2. Create a DB.
3. Update the plan to prod.
The status is the same as in bug 1542410, comment 7.

Comment 25 Jason Montleon 2018-02-09 16:54:44 UTC

I have pushed a new image to upstream postgresql-apb:latest and pushed openshift-enterprise-postgresql-apb-v3.9.0-0.41.0.2 to QE on the errata.

Can you please re-test with these and see if you can reproduce the issue?

Comment 26 Zihan Tang 2018-02-11 09:36:16 UTC

Jason,
I re-tested with asb v1.1.10, downstream image v3.9 and postgresql-apb-v3.9.0-0.41.0.2.
In the cri-o env, after creating a DB and performing an update, postgresql and mariaDB hang at 'TASK [rhscl-postgresql-apb : Backup source database]'.

In the other env, I tested mysql and postgresql; the update also failed if I created a DB or table.
Scenarios:
1. mysql 5.7 prod -> create table -> update to dev: failed
    The update sandbox failed with the error shown in the apb log below.

2. postgresql -> create DB -> update: failed, sandbox is deleted.
 [root@host-172-16-120-76 ~]# oc get pod 
NAME                          READY     STATUS    RESTARTS   AGE
postgresql-9.6-dev-1-4t2gq    1/1       Running   0          1h
postgresql-9.6-prod-1-qwhfz   1/1       Running   0          2h

 [root@host-172-16-120-76 ~]# oc get pod 
NAME                                       READY     STATUS    RESTARTS   AGE
apb-2b61ad77-a129-4018-8720-beb0790c1574   0/1       Error     0          1h
[root@host-172-16-120-76 ~]# oc logs -f apb-2b61ad77-a129-4018-8720-beb0790c1574 
+ [[ update --extra-vars {"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"aac1936a-7f4b-4a81-a880-c07ea016a382","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.7","namespace":"mysql-t","service_name":"mysql"} == *\s\2\i\/\a\s\s\e\m\b\l\e* ]]
+ ACTION=update
+ shift
+ playbooks=/opt/apb/actions
+ CREDS=/var/tmp/bind-creds
+ TEST_RESULT=/var/tmp/test-result
+ whoami
+ '[' -w /etc/passwd ']'
++ id -u
+ echo 'apb:x:1000390000:0:apb user:/opt/apb:/sbin/nologin'
+ set +x
+ [[ -e /opt/apb/actions/update.yaml ]]
+ [[ -e /opt/apb/actions/update.yml ]]
+ ANSIBLE_ROLES_PATH=/etc/ansible/roles:/opt/ansible/roles
+ ansible-playbook /opt/apb/actions/update.yml --extra-vars '{"_apb_plan_id":"dev","_apb_service_class_id":"73ead67495322cc462794387fa9884f5","_apb_service_instance_id":"aac1936a-7f4b-4a81-a880-c07ea016a382","cluster":"openshift","mysql_database":"devel","mysql_password":"dddd","mysql_user":"devel","mysql_version":"5.7","namespace":"mysql-t","service_name":"mysql"}'

PLAY [mysql-apb playbook to provision the application] *************************

TASK [ansible.kubernetes-modules : Install latest openshift client] ************
skipping: [localhost]

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Find pod we need to update] ******************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Find dc we will clean up] ********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Backup source database] **********************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Copy over db backup] *************************
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : Set mysql service state to present] **********
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
included: /opt/ansible/roles/rhscl-mysql-apb-openshift/tasks/dev.yml for localhost

TASK [rhscl-mysql-apb-openshift : set MySQL deployment with ephemeral storage to present] ***
changed: [localhost]

TASK [rhscl-mysql-apb-openshift : include_tasks] *******************************
skipping: [localhost]

TASK [rhscl-mysql-apb-openshift : Wait for mysql to come up] *******************
fatal: [localhost]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 172.30.15.21:3306"}

PLAY RECAP *********************************************************************
localhost                  : ok=7    changed=6    unreachable=0    failed=1   

 [WARNING]: Could not create retry file '/opt/apb/actions/update.retry'.
[Errno 13] Permission denied: u'/opt/apb/actions/update.retry'
+ EXIT_CODE=2
+ set +ex
+ '[' -f /var/tmp/test-result ']'
+ exit 2

Comment 28 Jason Montleon 2018-02-12 20:47:05 UTC

Has host-8-245-30.host.centralci.eng.rdu2.redhat.com been reprovisioned? I went to log in to look and the libra.pem key does not get me in and the web interface isn't running on 8443.

Comment 30 Jason Montleon 2018-02-13 15:09:01 UTC

I created a test2 project in your environment.

There I was able to successfully update a mariadb apb from 10.1 dev to 10.2 dev and then again to 10.2 prod in your new environment.

I was also able to successfully go from postgresql 9.4 dev to 9.6 prod.

I also did mysql 5.6 prod to 5.7 dev successfully.

I think this is VERIFIED if you want to try for yourself and look at the ansible-service-broker log to confirm you see I ran the updates.

[2018-02-13T08:30:00.747Z] [INFO] - Listening for update messages
[2018-02-13T14:42:25.773Z] [INFO] - ASYNC update in progress
[2018-02-13T14:42:26.121Z] [INFO] - Update requested for instance 2967f467-7f78-4d17-81cc-031577a81467, but job is already in progress
[2018-02-13T14:48:10.174Z] [INFO] - ASYNC update in progress
[2018-02-13T14:52:42.823Z] [INFO] - ASYNC update in progress
[2018-02-13T15:00:56.809Z] [INFO] - ASYNC update in progress

Finally, I spotted a separate issue, which I think is a regression and for which a separate bug should be filed. Your storage is RWO. If you try to do a rolling update of mediawiki, the new pod can't start because the old pod continues to use the storage. We should switch to a Recreate strategy.
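
A minimal sketch of what switching to a Recreate strategy could look like, assuming mediawiki is deployed as a DeploymentConfig. The DeploymentConfig name is a placeholder and `recreate_patch_cmd` is a hypothetical helper that only prints the `oc patch` command so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Sketch only: print an oc patch command that sets the deployment strategy to Recreate.
# recreate_patch_cmd is a hypothetical helper; the dc name is supplied by the caller.

recreate_patch_cmd() {
  local dc="$1"
  # With Recreate, the old pod is stopped before the new one starts,
  # so an RWO volume is released before the new pod tries to mount it.
  local patch='{"spec":{"strategy":{"type":"Recreate"}}}'
  printf "oc patch dc/%s -p '%s'\n" "$dc" "$patch"
}
```

For example, `recreate_patch_cmd mediawiki` prints the command for a DeploymentConfig named `mediawiki` (an assumed name).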

Comment 31 Zihan Tang 2018-02-14 05:32:08 UTC

Jason,
I have verified in both the cri-o and non-cri-o envs.
ASB version: 1.1.10
In the non-cri-o env, I think the bug is fixed.
Test scenarios:
1. postgresql apb:
    9.6 dev -> table created with 'admin' user ->9.5 prod : data preserved;
    9.5 dev-> db created with 'root' user         ->9.6 prod : data preserved;
2. mariaDB apb:
   10.2 prod -> table created with 'admin' user -> 10.1 dev : data preserved;
   10.0 dev -> db created with 'root' user  -> 10.2 prod : data preserved;
3. mysql apb
   5.6 prod -> db created with 'root' user ->  5.7, dev : data preserved;

However, in the cri-o env with SELinux disabled, if I create a DB in postgresql, the update still waits at "TASK [rhscl-postgresql-apb : Backup source database]".
It's better to open another bug to track that.
So please change the status to ON_QA and I'll mark this as verified.

Comment 32 Zihan Tang 2018-02-22 02:10:18 UTC

Verified based on comment 31

Comment 35 errata-xmlrpc 2018-03-28 14:20:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
