Bug 1436324 - [preview][prod]deploying Postgresql Database
Summary: [preview][prod]deploying Postgresql Database
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Storage
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Hemant Kumar
QA Contact: Wenqi He
URL:
Whiteboard:
Duplicates: 1435424
Depends On:
Blocks:
Reported: 2017-03-27 16:31 UTC by dbb2000
Modified: 2017-04-20 19:52 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-20 19:52:48 UTC
Target Upstream Version:
Embargoed:


Attachments
print screen showing too long deployment duration (42.77 KB, image/png)
2017-03-27 16:31 UTC, dbb2000
trying to create a pod for more than a hour (43.70 KB, image/png)
2017-03-29 15:29 UTC, dbb2000
no pod log available information (61.54 KB, image/png)
2017-03-29 15:30 UTC, dbb2000
kibana screen (46.38 KB, image/png)
2017-03-30 15:38 UTC, dbb2000

Description dbb2000 2017-03-27 16:31:58 UTC
Created attachment 1266713
print screen showing too long deployment duration

It's been quite difficult to deploy a PostgreSQL database instance in my project. OpenShift has been trying to deploy it for more than 40 minutes with no success. I can't even retrieve any logs from the current process because none are shown to me. I'm sending a screenshot as evidence. Could anybody help me, please? Thanks.

Comment 1 dbb2000 2017-03-27 21:47:44 UTC
Here is some additional log info from the pod:

--> Scaling postgresql-2 to 1
--> Waiting up to 10m0s for pods in deployment postgresql-2 to become ready
W0327 19:58:46.367143       1 reflector.go:330] github.com/openshift/origin/pkg/deploy/strategy/support/lifecycle.go:468: watch of *api.Pod ended with: too old resource version: 1047728707 (1047752717)
error: update acceptor rejected postgresql-2: pods for deployment "postgresql-2" took longer than 600 seconds to become ready

Comment 2 Michal Fojtik 2017-03-29 14:31:29 UTC
Can you check whether the postgresql-2 deployment created a pod, and can you check the logs from that pod? This is to make sure postgresql is not stuck initializing.

Comment 3 dbb2000 2017-03-29 15:29:29 UTC
Created attachment 1267320
trying to create a pod for more than a hour

Comment 4 dbb2000 2017-03-29 15:30:35 UTC
Created attachment 1267322
no pod log available information

Comment 5 dbb2000 2017-03-29 15:33:34 UTC
I'm currently on my 6th postgresql deployment with no success. The pod fails to scale up, and because of this there is no log available. I'm sending some screenshots as evidence because I haven't got much info from the Console.

Comment 6 Michal Fojtik 2017-03-29 15:36:20 UTC
Thanks! I'm going to ask the Online team to investigate if there are any capacity issues. Also, do you have any other deployments that are failing in a similar way?

I can see the pod is in "pending" state, can you please check the "events" for that pod and see why it is pending?

Comment 10 dbb2000 2017-03-29 17:21:27 UTC
Here it goes:


1:56:29 PM	Warning	Failed mount 	Failed to attach volume "pvc-0b930f1d-1303-11e7-b5d1-0ebeb1070c7f" on node "ip-172-31-2-63.ec2.internal" with: Timeout waiting for volume state: actual=detached, desired=attached



1:24:24 PM	Warning	Failed mount 	Failed to attach volume "pvc-0b930f1d-1303-11e7-b5d1-0ebeb1070c7f" on node "ip-172-31-2-63.ec2.internal" with: Error attaching EBS volume: VolumeInUse: vol-0399f530d8c354c39 is already attached to an instance status code: 400, request id:
63 times in the last 2 hours
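
For triage, the two kinds of "Failed mount" events above can be told apart mechanically. What follows is only a minimal Python sketch and is not part of the original report: the names classify_event and AttachFailure and the reason labels are hypothetical, and the pattern is modelled solely on the two messages quoted in this comment, so other message formats are not covered.

```python
import re
from typing import NamedTuple, Optional


class AttachFailure(NamedTuple):
    volume: str
    node: str
    reason: str  # "in_use", "timeout", or "other" (hypothetical labels)


# Pattern modelled only on the two event messages quoted in this comment.
_ATTACH_RE = re.compile(
    r'Failed to attach volume "(?P<volume>[^"]+)" '
    r'on node "(?P<node>[^"]+)" with: (?P<detail>.*)'
)


def classify_event(message: str) -> Optional[AttachFailure]:
    """Return the volume, node and a coarse reason, or None if no match."""
    match = _ATTACH_RE.search(message)
    if match is None:
        return None
    detail = match.group("detail")
    if "VolumeInUse" in detail:
        reason = "in_use"    # the volume is still attached somewhere
    elif "Timeout waiting for volume state" in detail:
        reason = "timeout"   # the attach request never completed
    else:
        reason = "other"
    return AttachFailure(match.group("volume"), match.group("node"), reason)
```

Feeding it the event text (without the timestamp and type columns) gives "timeout" for the 1:56:29 PM event and "in_use" for the 1:24:24 PM one.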

Comment 23 dbb2000 2017-03-29 23:59:59 UTC
I removed all the postgresql configuration and installed it once again, but the problem persists. Here is the latest deployment log:


--> Scaling postgresql-1 to 1
--> Waiting up to 10m0s for pods in deployment postgresql-1 to become ready
W0329 23:56:48.276495       1 reflector.go:330] github.com/openshift/origin/pkg/deploy/strategy/support/lifecycle.go:468: watch of *api.Pod ended with: too old resource version: 1056886168 (1056911807)
error: update acceptor rejected postgresql-1: pods for deployment "postgresql-1" took longer than 600 seconds to become ready

Comment 24 Hemant Kumar 2017-03-30 01:10:52 UTC
Thank you - I am working with our team to get relevant logs and see what happened.

Comment 27 dbb2000 2017-03-30 15:37:55 UTC
I'm sending another screenshot, this time from Kibana. I don't know exactly what it is, but the elasticsearch plugin is unavailable. There is a link on the Deployment log screen that points to the Kibana status screen. I believe it's worth a look.

Comment 28 dbb2000 2017-03-30 15:38:29 UTC
Created attachment 1267611
kibana screen

Comment 29 dbb2000 2017-03-31 11:27:20 UTC
I found some more log info:

--> Scaling postgresql-5 to 1
--> Waiting up to 10m0s for pods in deployment postgresql-5 to become ready
W0331 11:07:34.314016       1 reflector.go:330] github.com/openshift/origin/pkg/deploy/strategy/support/lifecycle.go:468: watch of *api.Pod ended with: too old resource version: 1063095846 (1063112715)
error: update acceptor rejected postgresql-5: pods for deployment "postgresql-5" took longer than 600 seconds to become ready

Comment 30 Hemant Kumar 2017-03-31 18:50:47 UTC
We have identified the problem and we are working on deploying a fix. 

The problem stems from the fact that some volumes in AWS were force detached. This was done as a last resort to fix some volumes that were stuck. But the problem is that a node must be rebooted to update the kernel's view of devices after force detaching a volume.

Comment 31 dbb2000 2017-03-31 19:43:26 UTC
Thanks for your effort. Do you know roughly how long it will take to fix?

Comment 32 Hemant Kumar 2017-04-03 17:57:16 UTC
*** Bug 1435424 has been marked as a duplicate of this bug. ***

Comment 34 Bradley Childs 2017-04-20 14:55:07 UTC
We have identified some problems similar to the one reported. Do you still have an account db2000? We went looking for logs but couldn't find your username.

Comment 35 dbb2000 2017-04-20 19:44:32 UTC
After my last post, the problem was solved and the deployment service was working perfectly. But now my registration has been cancelled and I'm on a waiting list for new access.

Cheers,

Davi

