Bug 1381378

Summary: false error when creating a router.
Product: OKD Reporter: Peter Ruan <pruan>
Component: Routing Assignee: Phil Cameron <pcameron>
Status: CLOSED DUPLICATE QA Contact: zhaozhanqi <zzhao>
Severity: low Docs Contact:
Priority: low    
Version: 3.x CC: akostadi, aloughla, aos-bugs, bbennett
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-20 15:06:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Peter Ruan 2016-10-03 22:42:29 UTC
Description of problem:
oadm reports an error when creating a new router even though the router is created successfully.



Version-Release number of selected component (if applicable):


How reproducible:
always.

Steps to Reproduce:
[root@openshift-153 ~]# oadm router blahblah --images=registry.access.redhat.com/openshift3/ose-haproxy-router:v3.3.0.34 --replicas=2
info: password for stats user admin has been set to zSIsifa6YT
--> Creating router blahblah ...
    error: serviceaccounts "router" already exists
    clusterrolebinding "router-blahblah-role" created
    deploymentconfig "blahblah" created
    service "blahblah" created
--> Failed
[root@openshift-153 ~]# echo $?
1

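Actual results:

The command prints "--> Failed" and exits with return code 1, even though the router is created.

Expected results:

The return code should be 0, with no failure message.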

Additional info:

Comment 1 Aleksandar Kostadinov 2016-10-04 07:32:32 UTC
Tested with

> oc v3.3.0.34
> kubernetes v1.3.0+52492b4

and

> oc v3.3.0.32
> kubernetes v1.3.0+52492b4

Comment 2 Aleksandar Kostadinov 2016-10-04 07:50:18 UTC
Another glitch I see with both environments is that creating a router:

> oadm router tc-518936 --images=openshift3/ose-haproxy-router:v3.3.0.34 --replicas=2

often ends up with 

> NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
> tc-518936         1          0         0         config

It is most likely related to the fact that the `oadm router` command exited with a failure, but I thought I would share it anyway.

Comment 3 Aleksandar Kostadinov 2016-10-04 10:22:45 UTC
I noticed that when this happens (0 replicas configured after creation), I can manually increase the replicas and things start working. If I then remove the new router DC, SVC and clusterrolebinding, subsequent creations of the router with the reported command do start more replicas. Is it possible that the replica count is cached somewhere, so that it affects future deployments with the same name?

Comment 4 Ben Bennett 2016-10-04 13:53:07 UTC
Please open a second bug for the replicas problem.  The oadm router issue is known and occurs only because one of the steps failed.  We need to make it smarter there, and I'd like to keep this bug focused on that.

Comment 5 Aleksandar Kostadinov 2016-10-05 21:10:30 UTC
Thank you, I filed bug 1382142.

With regard to the failure, another concern is the case where the cluster role binding already exists. See:

> error: rolebinding "router-tc-518936-role" already exists

I think that an existing service account and an existing cluster role binding should be handled gracefully.
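
To illustrate what "handled gracefully" could mean here, below is a minimal sketch, not the actual oadm code. It assumes a hypothetical in-memory `store` standing in for the cluster; `AlreadyExists`, `create` and `create_all` are made-up names for illustration. An object that already exists is reported as a warning and the remaining objects are still created; any other error would still propagate.

```python
class AlreadyExists(Exception):
    """Raised by the hypothetical store when an object is already present."""


def create(store, kind, name):
    # `store` is a plain set of (kind, name) pairs standing in for the cluster.
    key = (kind, name)
    if key in store:
        raise AlreadyExists(f'{kind} "{name}" already exists')
    store.add(key)


def create_all(store, objects):
    """Create every object, downgrading "already exists" to a warning.

    Returns the list of messages that would be printed to the user.
    """
    lines = []
    for kind, name in objects:
        try:
            create(store, kind, name)
            lines.append(f'{kind} "{name}" created')
        except AlreadyExists as exc:
            # Only this specific condition is tolerated; other errors propagate.
            lines.append(f"warning: {exc}")
    return lines


if __name__ == "__main__":
    cluster = {("serviceaccounts", "router")}  # the service account pre-exists
    wanted = [
        ("serviceaccounts", "router"),
        ("clusterrolebinding", "router-blahblah-role"),
        ("deploymentconfig", "blahblah"),
        ("service", "blahblah"),
    ]
    for line in create_all(cluster, wanted):
        print(line)
```

With this approach the command can finish with a success status as long as the only problems encountered were pre-existing objects.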

Comment 6 Ben Bennett 2016-11-17 21:29:10 UTC
See @smarterclayton's comment on https://github.com/openshift/origin/pull/11756#issuecomment-261337473:

---
So here's my take.

All the infra create commands are generators (they create a config set to apply). If any of the things already exist, the whole command should exit (instead of creating) and require you to decide what to do (pipe to apply, delete, etc).

If we fix this so that the BulkCreater can preflight check existence, we can simply fail on that check and get the right behavior.
---
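
The preflight idea in the quoted comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the origin implementation: `existing` is a hypothetical in-memory set of (kind, name) pairs standing in for the cluster, and `preflight` and `bulk_create` are invented names. The point is that existence is checked for the whole config set first, and nothing is created if any object is already there.

```python
def preflight(existing, objects):
    """Return the objects from `objects` that are already present."""
    return [obj for obj in objects if obj in existing]


def bulk_create(existing, objects):
    """Create all objects or none.

    Returns (ok, messages). If any object already exists, nothing is
    created and the caller has to decide what to do (apply, delete, etc).
    """
    conflicts = preflight(existing, objects)
    if conflicts:
        messages = [f'error: {kind} "{name}" already exists' for kind, name in conflicts]
        return False, messages
    existing.update(objects)
    return True, [f'{kind} "{name}" created' for kind, name in objects]


if __name__ == "__main__":
    cluster = {("serviceaccounts", "router")}
    wanted = [
        ("serviceaccounts", "router"),
        ("deploymentconfig", "rrr"),
        ("service", "rrr"),
    ]
    ok, messages = bulk_create(cluster, wanted)
    print("\n".join(messages))
    print("--> Success" if ok else "--> Failed")
```

Unlike the behavior in the original report, a failure here leaves the cluster untouched, so a non-zero exit status would match what actually happened.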

Comment 7 Phil Cameron 2016-12-19 20:33:38 UTC
Tested against current checked in origin (3.4 based).

# oadm router rrr
info: password for stats user admin has been set to 8oXIs1DgDw
--> Creating router rrr ...
    warning: serviceaccounts "router" already exists
    warning: clusterrolebinding "router-rrr-role" already exists
    deploymentconfig "rrr" created
    service "rrr" created
--> Success
root@wsfd-netdev22: ~/git/src/github.com/openshift/origin # oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
docker-registry   12         1         1         config
ipf-ha-router     9          2         2         config
phil11            5          0         0         config
phil12            3          0         0         config
rrr               1          1         1         config



This appears to be fixed. The "already exists" condition is now reported as a warning, and the command ends with Success.

Comment 8 Ben Bennett 2016-12-20 15:06:21 UTC

*** This bug has been marked as a duplicate of bug 1332432 ***