
Bug 1595934

Summary: Healthcheck URI for controller/nb/v2/neutron no longer works and requires auth
Product: Red Hat OpenStack
Component: opendaylight
Version: 15.0 (Stein)
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Tim Rozet <trozet>
Assignee: lpeer <lpeer>
QA Contact: Itzik Brown <itbrown>
CC: mkolesni, nyechiel, trozet, vorburger
Type: Bug
Last Closed: 2018-06-28 14:38:45 UTC
Attachments: Karaf log (flags: none)

Description Tim Rozet 2018-06-27 20:32:55 UTC
Description of problem:
The URI used to check ODL health in the tripleo-common docker healthcheck and in the haproxy backend check no longer works. First, it reports that authentication is required:

[root@overcloud-controller-0 ~]#  curl --fail 192.0.2.5:8081/controller/nb/v2/neutron
curl: (22) The requested URL returned error: 401 Unauthorized

Then after providing auth, it gives 404 not found:
[root@overcloud-controller-0 ~]#  curl --fail -u admin:admin http://192.0.2.5:8081/controller/nb/v2/neutron
curl: (22) The requested URL returned error: 404 Not Found

However, restconf still seems to work:
[root@overcloud-controller-0 ~]#  curl --fail -u admin:admin http://192.0.2.5:8081/restconf/operational/network-topology:network-topology/topology/netvirt:1
{"topology":[{"topology-id":"netvirt:1"}]}
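
The three requests above can be compared more easily by printing the HTTP status code instead of relying on curl's exit status. A minimal sketch, assuming a POSIX shell; `probe` and `describe_status` are hypothetical helper names and are not part of tripleo-common or ODL:

```shell
# Hypothetical helper: print the HTTP status code curl received for a URL.
# -s silences progress output, -o /dev/null discards the body, and
# -w '%{http_code}' prints only the status code (000 if no response arrived).
probe() {
    curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Hypothetical helper: turn a status code into a short diagnosis.
describe_status() {
    case "$1" in
        000) echo "no response" ;;
        2??) echo "ok" ;;
        3??) echo "redirect" ;;
        401) echo "auth required" ;;
        404) echo "not found" ;;
        *)   echo "other ($1)" ;;
    esac
}

# Example (address and credentials taken from the report above):
#   describe_status "$(probe -u admin:admin http://192.0.2.5:8081/controller/nb/v2/neutron)"
```

Any extra arguments to `probe`, such as `-u admin:admin`, are passed straight through to curl, so the same helper covers both the unauthenticated and the authenticated request.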

Version-Release number of selected component (if applicable):
Fluorine

Comment 1 Tim Rozet 2018-06-27 20:34:02 UTC
The impact of this bug is that the deployment is unusable.

Comment 2 Tim Rozet 2018-06-27 20:51:04 UTC
Created attachment 1455129 [details]
Karaf log

Comment 3 Michael Vorburger 2018-06-28 10:11:52 UTC
After local testing, I can confirm that on an upstream ODL from today's master (future Fluorine release) this is indeed how it currently behaves - I can "reproduce" this.

But, but... it also seems to me that this is "normal" - neutron REST API is protected (and, surely, we want it to be?), so the 401 actually seems right?  Also /controller/nb/v2/neutron is just the root of a REST API, but there is "nothing there"... I just checked the neutron northbound code related to this - there is nothing registered at the root.  What response were you hoping to get on that URL?  

It does, however, work if you query any of that API's real endpoints, for example as below. So perhaps the simplest short-term fix is to GET whatever is your favourite Neutron pet object:

curl --fail -u admin:admin localhost:8181/controller/nb/v2/neutron/qos/policies/
{
   "policies" : [ ]
}

So while again the current behaviour actually does seem correct to me, we (me) did make changes related to the Web API stuff recently, so perhaps this used to behave differently earlier, and this is a subtle regression... I'll try out the same with Oxygen later and then update this re. the situation on that branch.

BTW: As previously discussed, the /neutron/ URLs don't really prove all that much in terms of ODL's readiness anyway; a lot of other things can be completely broken and those URLs will still respond OK.  Could I re-recommend that the URI used to check ODL health in tripleo-common (docker healthcheck) and the haproxy backend check be changed to the new, much more interesting and reliable /diagstatus instead of neutron - see our Bug 1577853 ?

Comment 4 Michael Vorburger 2018-06-28 10:36:08 UTC
> try out the same with Oxygen later

$ http --auth admin:admin GET localhost:8181/controller/nb/v2/neutron
HTTP/1.1 302 Found
Content-Length: 0
Location: http://localhost:8181/controller/nb/v2/neutron/


So....

1. we did break this on master (only).  Sorry!  ;-)

2. the fact that it behaves differently on upstream master than on stable/oxygen technically means that this is invalid as a downstream ODL RH BZ? :) It does work fine in our current product... ;-)  Do you want to close this bug here and open an upstream JIRA issue (against the ODL Neutron project), instead?

3. it's really more of a "side effect" (coincidence) that this ever worked before.  There is nothing registered at that root URL in the code.  It looks like some recent upgrade or whatever made this more strict, and now we return a 404 instead of a 302 as it used to.  But it's "more correct" to have a 404 than a 302 on that URL, IMHO.

4. the simplest short-term fix is to do an HTTP GET on whatever is your favourite Neutron pet object (e.g. /controller/nb/v2/neutron/qos/policies/, or any other), instead of the root URL (/controller/nb/v2/neutron).  But again, a "success" on those URLs proves relatively little in terms of how "ready" ODL really is - Bug 1577853 with /diagstatus really is the right way forward.

5. just in case you would have to insist that this is a major regression with huge impacts, then it may be possible to do some development to add a "fake" new API endpoint on the root URL to explicitly return 302.  I would only do this reluctantly.
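
Points 3 and 4 can be sketched as a healthcheck script. curl --fail exits non-zero only for HTTP status 400 and above, which is why the 302 on Oxygen passed the old check and the 404 on master does not. A minimal sketch, assuming a POSIX shell; the ODL_* variable names, their defaults, and the functions `health_url` and `check_odl` are hypothetical, and this is not the actual tripleo-common healthcheck:

```shell
# Assumed defaults for illustration; override them through the environment.
: "${ODL_HOST:=localhost}"
: "${ODL_PORT:=8181}"
: "${ODL_USER:=admin}"
: "${ODL_PASS:=admin}"
# A real Neutron northbound endpoint rather than the API root.
: "${ODL_HEALTH_PATH:=/controller/nb/v2/neutron/qos/policies/}"

# Build the URL to probe from the settings above.
health_url() {
    printf 'http://%s:%s%s\n' "$ODL_HOST" "$ODL_PORT" "$ODL_HEALTH_PATH"
}

# Return 0 if the endpoint answers with a status below 400, non-zero otherwise
# (including connection errors and timeouts).
check_odl() {
    curl --fail --silent --show-error --max-time 10 \
        -u "$ODL_USER:$ODL_PASS" -o /dev/null "$(health_url)"
}

# Example: check_odl || exit 1
```

Setting ODL_HEALTH_PATH=/diagstatus would point the same check at the endpoint recommended in Bug 1577853, assuming that endpoint is served on the same host and port.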

Comment 5 Tim Rozet 2018-06-28 14:38:45 UTC
Thanks Michael. Yes, I believe that since Rocky will use Oxygen and this bug is only present in Fluorine, this would have to be targeted at version 15. It doesn't seem to warrant a downstream BZ, and the real fix should be to move to diagstatus. Just please make sure that no patches get into Oxygen that remove that healthcheck until we have migrated the healthcheck in the relevant OOO projects. I'll go work on 1577853 now :)