Bug 875088

Summary: OVIRT35 - [RFE] ovirt-node-registration - a generic node registration
Product: [Retired] oVirt Reporter: Alon Bar-Lev <alonbl>
Component: ovirt-node Assignee: Fabian Deutsch <fdeutsch>
Status: CLOSED DUPLICATE QA Contact: yeylon <yeylon>
Severity: low Docs Contact:
Priority: medium    
Version: 3.5 CC: acathrow, alonbl, asegurap, bazulay, bugs, dougsland, fdeutsch, gklein, hadong, iheim, jboggs, leiwang, lpeer, mgoldboi, mkalinin, ovirt-bugs, ovirt-maint, sbonazzo, srevivo, ycui, yeylon
Target Milestone: --- Keywords: FutureFeature, Improvement
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
URL: http://www.ovirt.org/Features/Node/GenericNodeRegistration
Whiteboard: node
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-09-01 07:51:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 515681, 629392, 753315, 1097685, 1113622    

Description Alon Bar-Lev 2012-11-09 13:53:24 UTC
vdsm-reg is doing two separate tasks:

1. node related tasks - registration.

2. ovirt related tasks - bridge and pki management.

functionality of (2) is removed in favour of bootstrap script.
functionality of (1) is generic and actually belongs to the ovirt-node project.

vdsm-reg is to be retired soon in favour of better bootstrap process and generic registration solution.

As there is some objection from ovirt-node maintainers to implementing a generic registration component, ovirt-node-registration will be implemented as a standalone component to be provided in ovirt-node images instead of legacy vdsm-reg, in the hope that this will be usable by other projects.

PARAMETERS

url
fingerprint of SSL certificate's root ca

PROTOCOL

HTTP based protocol, simple GET.

url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>
  register within master.

url?cmd=get-sshkey&type=manager
  establish ssh trust.

url?cmd=get-ca-roots
  establish pki trust.

url?cmd=get-bootscript
  acquire initial command (NAT bypass).
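
A minimal sketch of how a client might build the four request URLs listed above. The base URL, the helper name command_url and the sample values are hypothetical; only the cmd values and parameter names come from the description.

```python
from urllib.parse import urlencode


def command_url(base_url, cmd, **params):
    """Return base_url with the command and its parameters as a query string."""
    # 'cmd' always comes first, followed by the command-specific parameters.
    query = urlencode({"cmd": cmd, **params})
    return base_url + "?" + query


if __name__ == "__main__":
    base = "https://manager.example.com/register"  # hypothetical endpoint
    print(command_url(base, "get-ca-roots"))
    print(command_url(base, "get-sshkey", type="manager"))
    print(command_url(base, "get-bootscript"))
    print(command_url(base, "register", name="node1", ip="192.0.2.10",
                      sshkeyfingerprint="aa:bb:cc"))
```

Note that urlencode percent-encodes the colons in the fingerprint, so the server side would need to decode the query string before comparing values.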

TUI

a TUI to be used by ovirt-node TUI management.
branding customization will be available.

FUTURE

Allow different methods of registration (DNS, multicast).

Comment 1 Fabian Deutsch 2013-04-12 14:04:56 UTC
Hey,

I think that you are right that Node should offer a generic way to register with a management system.
But I don't agree on all the assumptions you make.
E.g. not all nodes (gluster or openstack variants) might require a pki or ssh access.

Could you also clarify the url param? I don't understand this part:
url
fingerprint of SSL certificate's root ca

Comment 2 Fabian Deutsch 2013-04-12 14:08:37 UTC
(In reply to comment #1)
> url
> fingerprint of SSL certificate's root ca

okay - fingerprint is a separate param - clear now ;)

Comment 3 Alon Bar-Lev 2013-04-13 19:00:22 UTC
we must establish some kind of trust, best is to provide multiple options. PKI and SSH trusts are the most commonly used so I suggest supporting these. Everything else can be done securely once one (or more) of these trusts is established either by using ssh or TLS.

Comment 4 Fabian Deutsch 2013-08-28 15:13:18 UTC
Alon,

you specified a protocol in the description, now let me clear a couple of points:

A service speaking that protocol is offered by some management component (Engine, Openstack thingy, ...).

Node will have a registration-client which understands this protocol.
The registration-client
- triggers the registration
- establishes the pki and ssh trust and 
- finally acquires an initial boot-script
IIUIC the boot-script is in Engine's case ovirt-host-deploy (or otopi?) and is specific to that management component.

The client-registration can be triggered manually or - maybe - the management component could be advertising itself via mdns or something similar and node could use this information to trigger the client-registration automatically.

Does this fit what you think of?

Comment 5 Alon Bar-Lev 2013-09-02 11:37:02 UTC
(In reply to Fabian Deutsch from comment #4)
> Alon,
> 
> you specified a protocol in the description, now let me clear a couple of
> points:
> 
> A service speaking that protocol is offered by some management component
> (Engine, Openstack thingy, ...).

The protocol is offered by the ovirt-node project; it is implemented by any project that wishes to consume ovirt-node.
 
> Node will have a registration-client which understands this protocol.

Well, I am thinking of a set of tools...

1. what you call 'client' which is initiated by the TUI or boot process.

2. multicast listener, then execute 'client'.

3. multicast sonar, and socket listener, then execute 'client'.

4. mdns/dyndns

But let's start with a simple client that initiates the registration with a specific management server.

> The registration-client
> - triggers the registration
> - establish the pki and ssh trust and 
> - finally acquires an initial boot-script
> IIUIC the boot-script is in Engine's case ovirt-host-deploy (or otopi?) and
> is specific to that management component.

No...

Once ssh trust is established, the engine uses ssh to execute the ovirt-host-deploy.

The boot script is needed if ssh cannot be used... for example:

    manager <<NAT node

This mode is not currently supported, but in future it can be used in order to initiate something similar to ovirt-host-deploy without using ssh.

> The client-registration can be triggered manually or - maybe - the
> management component could be advertising itself via mdns or something
> similar and node could use this information to trigger the
> client-registration automatically.
> 
> Does this fit what you think of?

Yes, see above :)

Thanks!

Comment 6 Fabian Deutsch 2013-09-02 12:49:46 UTC
Nice.

I think that ruled out more unclear points and is a nice starting point, Thanks for your time Alon.

Comment 7 Fabian Deutsch 2014-04-09 13:08:22 UTC
Alon,

is there a plan when vdsm-reg is retired?

Comment 8 Alon Bar-Lev 2014-04-09 13:50:18 UTC
(In reply to Fabian Deutsch from comment #7)
> Alon,
> 
> is there a plan when vdsm-reg is retired?

it is always the next task in my queue, but for some reason it always gets pushed back.

Comment 9 Fabian Deutsch 2014-04-09 15:42:56 UTC
(In reply to Alon Bar-Lev from comment #8)
> (In reply to Fabian Deutsch from comment #7)
> > Alon,
> > 
> > is there a plan when vdsm-reg is retired?
> 
> it is always the next task in my queue, but for some reason it always pushed
> back.

I see.

Alon or Sandro, we targeted this as a feature for 3.5 and to make this happen someone who has been previously working with vdsm_reg should drive this imo.

What do you think?

Comment 10 Alon Bar-Lev 2014-04-09 15:53:58 UTC
This is a different bug.

It is to provide a generic protocol for node registration that can be used by any manager (ovirt, openstack).

If this becomes available we will use it instead of vdsm-reg.

Comment 11 Ryan Barry 2014-04-22 18:38:09 UTC
(In reply to Alon Bar-Lev from comment #5)
> The protocol is offered by the ovirt-node project, it is implemented by any
> project who wish to consume ovirt-node.
>  
> > Node will have a registration-client which understands this protocol.
> 

Alon:

Just to clarify, is the intention for the node to provide the HTTP API to be consumed by the engine (or other potential managers)? If so, please read below.

Or is it an ovirt-node-registration script to consume this API, which should be provided by any potential managers which want to use node? If this is the case, I'd basically end up rewriting a modularized vdsm-reg, which is ok, but I want to know if that's the intention.

I have a few questions.

> PARAMETERS
> 
> url
> fingerprint of SSL certificate's root ca
I don't actually see the fingerprint used in the hypothetical protocol

> PROTOCOL
> 
> HTTP based protocol, simple GET.
> 
> url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>
>   register within master.
> 
> url?cmd=get-sshkey&type=manager
>  establish ssh trust.
> 
> url?cmd=get-ca-roots
>   establish pki trust.
> 
> url?cmd=get-bootscript
>   acquire initial command (NAT bypass).

How do we register? What's the expected workflow? It's not clear to me from this skeleton.

Management server hits the registration URL.

Then...? We prompt the user through the TUI to approve it? Or should this tell the node to go out and register itself with an external service somewhere (like oVirt engine)? Whose fingerprint, node or management? Whose IP? Whose name?

I'm reading this like the manager should send its name, its IP, and its fingerprint for approval, but I want to clarify. It also seems like we should have a way to bypass any potential approval, but I can't see a secure way to do hands-off node registration, though I admittedly haven't spent a lot of time thinking about it.

get-sshkey and get-ca-roots both have the same question. Are we getting CA roots from the node, or the manager? Are we putting the manager's SSH key into the node's .authorized_keys, or sending an authorized key back to the manager to do something with?

It reads like we should send a pubkey back, presumably after we've approved the manager, that the manager can use to log in. 

The same for SSL. But is this then a separate cert? Should we generate a CSR once approved and create a client cert signed by the node, then send that back?

The rest makes sense, basically:

Then, if it's SSH, the manager can arbitrarily manage the node.
If it's SSL, the manager can tell the node to grab a script and execute it over HTTPS.

Comment 12 Alon Bar-Lev 2014-04-22 18:51:22 UTC
(In reply to Ryan Barry from comment #11)
> (In reply to Alon Bar-Lev from comment #5)
> > The protocol is offered by the ovirt-node project, it is implemented by any
> > project who wish to consume ovirt-node.
> >  
> > > Node will have a registration-client which understands this protocol.
> > 
> 
> Alon:
> 
> Just to clarify, is then intention for the node to provide the HTTP API to
> be consumed by the engine (or other potential managers)? If so, please read
> below.

Correct.

> Or is it a ovirt-node-registration script to consume this API, which should
> be provided by any potential managers which want to use node? If this is the
> case, I'd basically end up rewriting a modularized vdsm-reg, which is ok,
> but I want to know if that's the intention.

I do not suggest using any of the legacy code; it is not of a quality that can be used.

I strongly suggest a clean implementation.

> 
> I have a few questions.
> 
> > PARAMETERS
> > 
> > url
> > fingerprint of SSL certificate's root ca
> I don't actually see the fingerpring used in the hypothetical protocl
> 

the fingerprint is local validation, it cannot be part of the protocol.
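
A minimal sketch of what such a local check could look like: the node hashes the downloaded CA certificate itself and compares the result with the fingerprint supplied via the TUI or boot parameter. The hash algorithm (SHA-256 here) and the function names are assumptions; this bug does not specify them.

```python
import hashlib


def fingerprint(der_bytes, algorithm="sha256"):
    """Return the colon-separated hex digest of a DER-encoded certificate."""
    digest = hashlib.new(algorithm, der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def normalize(value):
    """Drop separators, whitespace and case so user-typed values compare equal."""
    return value.replace(":", "").strip().lower()


def fingerprint_matches(der_bytes, expected, algorithm="sha256"):
    """Compare the locally computed fingerprint with the one the admin supplied."""
    return normalize(fingerprint(der_bytes, algorithm)) == normalize(expected)


if __name__ == "__main__":
    cert = b"placeholder bytes standing in for a DER certificate"
    expected = fingerprint(cert)  # in practice typed by the admin or passed at boot
    print(fingerprint_matches(cert, expected))
```

Nothing here is sent to the manager: the comparison happens entirely on the node, which is why the fingerprint is a parameter of the client rather than part of the protocol.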

> > PROTOCOL
> > 
> > HTTP based protocol, simple GET.
> > 
> > url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>
> >   register within master.
> > 
> > url?cmd=get-sshkey&type=manager
> >  establish ssh trust.
> > 
> > url?cmd=get-ca-roots
> >   establish pki trust.
> > 
> > url?cmd=get-bootscript
> >   acquire initial command (NAT bypass).
> 
> How do we register? What's the expected workflow? It's not clear to me from
> this skeleton.
> 
> Management server hits the registration URL.

hmmm.... it is the opposite...

the manager is the server while the node is the client.

sequence:

1. url?cmd=get-ca-roots using http or https insecure
2. validate fingerprint via UI or parameter (boot)
3. persist the ca certificate
4. url?cmd=get-sshkey&type=manager
5. persist the key for the admin user (currently it is root)
6. url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx

notice I added user=
maybe you can look at [1] for an example of the new engine registration protocol, which is better than what I originally suggested here. it is basically what you need + the ssh server fingerprint to send.

[1] http://gerrit.ovirt.org/#/c/20815/
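
The six-step sequence above can be sketched as follows, with the node as the client issuing every request. This is only an illustration under stated assumptions: register_node, fetch, confirm_fingerprint and persist are hypothetical names, and the transport, storage and UI are injected as callables rather than implemented.

```python
from urllib.parse import urlencode


def register_node(base_url, fetch, confirm_fingerprint, persist,
                  name, ip, ssh_fingerprint, user="root"):
    """Run the node-initiated sequence; every request goes node -> manager."""

    def call(cmd, **params):
        query = urlencode({"cmd": cmd, **params})
        return fetch(base_url + "?" + query)

    ca_roots = call("get-ca-roots")            # 1. fetch CA roots (not yet trusted)
    if not confirm_fingerprint(ca_roots):      # 2. validate via UI or boot parameter
        raise RuntimeError("CA fingerprint rejected")
    persist("ca", ca_roots)                    # 3. persist the CA certificate
    ssh_key = call("get-sshkey", type="manager")  # 4. fetch the manager's SSH key
    persist("sshkey", ssh_key)                 # 5. persist it for the admin user
    return call("register", name=name, ip=ip,  # 6. announce the node to the manager
                sshkeyfingerprint=ssh_fingerprint, user=user)


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a network or a real manager.
    stored = {}
    result = register_node(
        "https://manager.example.com/register",   # hypothetical URL
        fetch=lambda url: "response for " + url,
        confirm_fingerprint=lambda ca: True,
        persist=stored.__setitem__,
        name="node1", ip="192.0.2.10", ssh_fingerprint="aa:bb:cc",
    )
    print(result)
    print(sorted(stored))
```

Injecting fetch and confirm_fingerprint keeps the sequence itself independent of whether it is driven from the TUI or from the boot process.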

> 
> Then...? We prompt the user through the TUI to approve it? Or should this
> tell the node to go out and register itself with an external service
> somewhere (like oVirt engine)? Whose fingerprint, node or management? Whose
> IP? Whose name?
> 
> I'm reading this like the manager should send its name, its IP, and its
> fingerprint for approval, but I want to clarify. It also seems like we
> should have a way to bypass any potential approval, but I can't see a secure
> way to do hands-off node registration, though I admittedly haven't spent a
> lot of time thinking about it.
> 
> get-sshkey and get-ca-roots both have the same question. Are we getting CA
> roots from the node, or the manager? Are we putting the manager's SSH key
> into the node's .authorized_keys, or sending an authorized key back to the
> manager to do something with?

I hope now it is clear.

> 
> It reads like we should send a pubkey back, presumably after we've approved
> the manager, that the manager can use to log in. 
> 
> The same for SSL. But is this then a separate cert? Should we generate a CSR
> once approved and create a client cert signed by the node, then send that
> back?
> 
> The rest makes sense, basically:
> 
> Then, if it's SSH, the manager can arbitrarily manager the node.
> If it's SSL, the manager can tell the node to grab a script and execute it
> over HTTPS.

Comment 13 Ryan Barry 2014-04-22 19:10:46 UTC
(In reply to Alon Bar-Lev from comment #12)
> (In reply to Ryan Barry from comment #11)
> > (In reply to Alon Bar-Lev from comment #5)
> > Just to clarify, is then intention for the node to provide the HTTP API to
> > be consumed by the engine (or other potential managers)? If so, please read
> > below.
> 
> Correct.

> > Management server hits the registration URL.
> 
> hmmm.... it is the opposite...
> 
> the manager is the server while the node is the client.

I'm still unclear on this, which appears to read both ways.

Do you want the node to expose HTTP? If so, what are we exposing?

> 
> sequence:
> 
> 1. url?cmd=get-ca-roots using http or https insecure
> 2. validate fingerprint via UI or parameter (boot)
> 3. persist the ca certificate
> 4. url?cmd=get-sshkey&type=manager
> 5. persist the key for the admin user (currently it is root)
> 6.
> url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
> 
> notice I added user=

I think what I'm missing for context is what's issuing the GETs in this sequence.

> 1. url?cmd=get-ca-roots using http or https insecure
HTTP_GET->NODE
tells the node to retrieve the ca-roots from the engine?
NODE->ENGINE
Actually get the certs

> 2. validate fingerprint via UI or parameter (boot)
Validated in the TUI or with a cmdline arg to the node?

> 3. persist the ca certificate
Happens on node.

> 4. url?cmd=get-sshkey&type=manager
HTTP_GET->NODE
tells the node to retrieve the SSH key from the engine?
NODE->ENGINE
retrieve

> 5. persist the key for the admin user (currently it is root)
Happens on node

> 6.
> url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
HTTP_GET->NODE
tells the node to execute the registration process as $user, where the registration process can be defined in a module (potentially with another parameter to tell it whether it's gluster, openstack, ovirt, or whatever)
NODE->ENGINE
(registration URL is called, script is run, whatever)


----------------------------------------------


Or is it?
> 1. url?cmd=get-ca-roots using http or https insecure
> 2. validate fingerprint via UI or parameter (boot)
> 3. persist the ca certificate
> 4. url?cmd=get-sshkey&type=manager
> 5. persist the key for the admin user (currently it is root)
> 6.
> url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
These steps replace vdsm-reg, but all GETs are issued from the node to the engine.

> maybe you can look at [1] for example of new engine registration protocol
> which is better than what I originate suggested here. it basically what you
> need + ssh server fingerprint to send.
> 
> [1] http://gerrit.ovirt.org/#/c/20815/
>

Comment 14 Alon Bar-Lev 2014-04-22 19:16:11 UTC
(In reply to Ryan Barry from comment #13)
> (In reply to Alon Bar-Lev from comment #12)
> > (In reply to Ryan Barry from comment #11)
> > > (In reply to Alon Bar-Lev from comment #5)
> > > Just to clarify, is then intention for the node to provide the HTTP API to
> > > be consumed by the engine (or other potential managers)? If so, please read
> > > below.
> > 
> > Correct.
> 
> > > Management server hits the registration URL.
> > 
> > hmmm.... it is the opposite...
> > 
> > the manager is the server while the node is the client.
> 
> I'm still unclear on this, which appears to read both ways.
> 
> Do you want the node to expose HTTP? If so, what are we exposing?

No. The node is communicating with the manager.
The manager is the HTTP server, just like in our current situation.

> 
> > 
> > sequence:
> > 
> > 1. url?cmd=get-ca-roots using http or https insecure
> > 2. validate fingerprint via UI or parameter (boot)
> > 3. persist the ca certificate
> > 4. url?cmd=get-sshkey&type=manager
> > 5. persist the key for the admin user (currently it is root)
> > 6.
> > url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
> > 
> > notice I added user=
> 
> I think what I'm missing for context is what's issuing the GETs in this
> sequence.

The node.

> 
> > 1. url?cmd=get-ca-roots using http or https insecure
> HTTP_GET->NODE
> tells the node to retrive the ca-roots from the engine?
> NODE->ENGINE
> Actually get the certs
> 
> > 2. validate fingerprint via UI or parameter (boot)
> Validated in the TUI or with a cmdline arg to the node?
> 
> > 3. persist the ca certificate
> Happens on node.
> 
> > 4. url?cmd=get-sshkey&type=manager
> HTTP_GET->NODE
> tells the node to retrieve the SSH key from the engine?
> NODE->ENGINE
> retrieve
> 
> > 5. persist the key for the admin user (currently it is root)
> Happens on node
> 
> > 6.
> > url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
> HTTP_GET->NODE
> tells the node to execute the registration process as $user, where the
> registration process can be defined in a module (potentially with another
> parameter to tell it whether it's gluster, openstack, ovirt, or whatever)
> NODE->ENGINE
> (registration URL is called, script is run, whatever)

No. The node initiates that.

> 
> ----------------------------------------------
> 
> 
> Or is it?
> > 1. url?cmd=get-ca-roots using http or https insecure
> > 2. validate fingerprint via UI or parameter (boot)
> > 3. persist the ca certificate
> > 4. url?cmd=get-sshkey&type=manager
> > 5. persist the key for the admin user (currently it is root)
> > 6.
> > url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>&user=xxx
> These steps replace vdsm-reg, but all GETs are issued from the node to the
> engine.

Correct. In future we will also be able to add multicast or other alternatives.

> 
> > maybe you can look at [1] for example of new engine registration protocol
> > which is better than what I originate suggested here. it basically what you
> > need + ssh server fingerprint to send.
> > 
> > [1] http://gerrit.ovirt.org/#/c/20815/
> >

Comment 15 Ryan Barry 2014-04-22 19:19:22 UTC
Thanks, Alon. All clear now.

Comment 16 Alon Bar-Lev 2014-04-22 20:01:50 UTC
Engine new protocol is here[1], it can be a base for this protocol.

The one issue I would like to solve is nat bypass using a script that can be downloaded (get-bootscript), but we can discuss this in future.

[1] http://www.ovirt.org/Features/HostDeployProtocol

Comment 17 Alon Bar-Lev 2014-06-15 18:33:22 UTC
I had a chance to look at this implementation:

1. Generic == establish a protocol which is independent of ovirt-engine

 -> This implementation does not establish any protocol, it is just a sequence of commands as far as I can see.

2. Registration == integration with node's TUI, node kernel command-line for PXE boot and registration during boot.

 -> This implementation does not have either.

3. While establishing PKI trust, a sequence of fingerprint verification of the SSL server side CA should be done, see bug#994451 for the sequence.

 -> This implementation does not support it.

I am unsure that the change that moved this bug to MODIFIED brings us closer to a generic ovirt-node-registration.

I am reopening it.

Thank you,

Comment 18 Ryan Barry 2014-06-16 01:19:45 UTC
Alon -

I didn't see any reference whatsoever to bug#994451 prior to this. I'll look at it tomorrow and see how it can be implemented.

You're correct that the registration is *not* integrated into PXE, registration during boot, or the TUI. Intentionally. The new workflow is not a re-implementation of vdsm-reg, and is contingent on features from newer engines to work. While we could theoretically check for the presence of the right methods on the engine and fall back to vdsm-reg if not, this adds complexity which is not necessarily desirable.

Moreover, no, there is no protocol. As we discussed in the comments above, the node does not currently provide anything which might be used to host such a protocol (with jsonrpc or xmlrpc being the obvious candidates). It also requires that management engines code to the node.

So instead, it's a service definition which can be mapped to various generic actions which can fulfill the node->engine flow in the above comments, as well as adding a node to other management engines (nova/openstack, archipel, and anything which relies on REST calls, which is virtually everything these days).

What would such a hypothetical protocol look like to you, and how would it be implemented on the node?

Comment 19 Alon Bar-Lev 2014-06-16 01:28:29 UTC
This feature request was to be able to use the node in a variety of setups and allow registration with a manager, regardless of what the manager is.

It must implement a protocol, it must be integrated with boot, it must be integrated with TUI. The entire registration process should be part of the basic node features.

All the registration process should do is notify the master "I'm alive", and enable the manager to SSH into the node. From this point the manager, whatever software it may be, takes over.

If the manager has to implement code at the node side to perform registration, and implement the integration with the boot process and the TUI dialog, I am unsure what the benefit of the posted change is.

Comment 20 Ryan Barry 2014-06-18 23:52:29 UTC
Ok -

The json for the registration script now matches what's available in the service.

What I'm missing is this:

What do you mean by a protocol? If you mean a structured series of steps to follow, we discussed one in the earlier comments to this bug. The registration script does not make any decisions about SSL trusts or anything else, because it's designed to be a generic registration script with the actual logic (in a JSON service definition) completed as part of a different task, probably by someone who's more familiar with vdsm than anyone on the node team.


That said, I appreciate that the current generic registration script is, well, just a script, which is not integrated into the TUI or boot process, because it's a base for the replacement for vdsm-reg to be built around, and I'm not sure what the fallback logic is or when we get to break compatibility with older versions of the engine before 4.0.

bug#994451 looks like it should be filed against ovirt-node-plugin-vdsm, but I'm able to implement the logic (and the UI callbacks) in order to finish this RFE.

What I need to know is how compatible the deployment service is, and what our decision tree for when to use it vs legacy vdsm-reg is. When was it first implemented? v0 is "obsoleted and should not be used", but is v0 present in 3.2 and 3.3? When should vdsm-reg be used over the registration service? Should we try http://engine/services/host-register?version=1&command=get-version and fall back to vdsm-reg if it's a 404 or anything other than "1"?
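
To make the question concrete, the fallback being asked about could be sketched as below. This only restates the proposal in the paragraph above; it is not an agreed design. The function name is hypothetical, and it is an assumption that get-version answers with a bare version string in the response body.

```python
def choose_registration_method(status_code, body, supported_version="1"):
    """Pick the new registration service only when the engine reports the
    expected version; otherwise fall back to legacy vdsm-reg."""
    # Assumption: the engine replies to get-version with the bare version string.
    if status_code == 200 and body.strip() == supported_version:
        return "host-register"
    # A 404 (older engine) or any other answer means the service is unusable.
    return "vdsm-reg"


if __name__ == "__main__":
    print(choose_registration_method(200, "1\n"))
    print(choose_registration_method(404, ""))
    print(choose_registration_method(200, "0"))
```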

Comment 21 Alon Bar-Lev 2014-06-19 10:56:57 UTC
this bug was opened when the ovirt-node had a roadmap to serve not just ovirt environment. if that's changed and ovirt-node is ovirt specific, then we do not need anything generic.

however, if this project roadmap remains, then there should be a complete separation between the way the node, as a slave, reports its existence within the network and how the node is set up.

this bug addresses the first, it should include all steps required to let manager eventually ssh to the node. this includes:

1. protocol.
2. ability to securely download manager ssh public key.
3. installation of manager ssh public key into user.
4. report manager of existence (address, name, id) and what user to use.
5. integration within boot process.
6. integration within TUI.

this regardless of ovirt.

the example that was provided is the registration protocol of ovirt-engine, as a prototype of what is expected in a generic sequence based on the https protocol. you can choose whatever protocol you like, the example of the ovirt-engine registration protocol is only one option, which has proven to be sufficient.

also noted in previous comments that in future there may be other methods, such as multicast and dyndns, so that there may be other alternatives, in which the node does not even know what managers are out there, but can still reach them.

Comment 22 Ryan Barry 2014-06-19 14:10:37 UTC
Yes, this bug was opened when ovirt-node was intended to be more generic.

It currently already supports 2, 3, and 4.

What do you mean by protocol? We discussed the node *not* being an HTTP server (or server of any other type) above. Is the Node presenting an API which can be consumed by others something that's desirable? I thought we had agreed to implement the steps necessary to act as a client to the ovirt-engine registration protocol. Now I'm not sure what's being asked for.

Secondly, in order to replace vdsm-reg against all engine versions, I need this answered:

>What I need to know is how compatible the deployment service is, and what our decision tree for when to use it vs legacy vdsm-reg is. When was it first implemented? v0 is "obsoleted and should not be used", but is v0 present in 3.2 and 3.3? When should vdsm-reg be used over the registration service? Should we try http://engine/services/host-register?version=1&command=get-version and fall back to vdsm-reg if it's a 404 or anything other than "1"?

Comment 23 Alon Bar-Lev 2014-06-19 14:17:42 UTC
(In reply to Ryan Barry from comment #22)
> Yes, this bug was opened when ovirt-node was intended to be more generic.
> 
> It currently already supports 2, 3, and 4.
> 
> What do you mean by protocol? We discussed the node *not* being an HTTP
> server (or server of any other type) above. Is the Node presenting an API
> which can be consumed by others something that's desirable? I thought we had
> agreed to implement the steps necessary to act as a client to the
> ovirt-engine registration protocol. Now I'm not sure what's being asked for.

Once again... what is expected is to have a fully functioning registration process for ovirt-node, this cannot be done without a protocol. I do not understand what is missing in my responses.

> Secondly, In order to replace vdsm-reg against all engine versions, I need
> this answered:
> 
> >What I need to know is how compatible the deployment service is, and what our decision tree for when to use it vs legacy vdsm-reg is. When was it first implemented? v0 is "obsoleted and should not be used", but is v0 present in 3.2 and 3.3? When should vdsm-reg be used over the registration service? Should we try http://engine/services/host-register?version=1&command=get-version and fall back to vdsm-reg if it's a 404 or anything other than "1"?

forget vdsm-reg. this bug, once solved, should provide a replacement by supporting the new protocol at engine side. vdsm-reg will be kept to support older versions of engine, or we can provide some alternative. ovirt-engine considerations are not related to this bug.

Comment 24 Ryan Barry 2014-06-19 14:48:15 UTC
>Once again... what is expected is to have fully functioning registration process of ovirt-node, this cannot be done without a protocol. I do not understand what is missing in my responses.

I know of two interpretations of protocol:

One as HTTP/SSH/FTP -- network protocols. Since creation of a new protocol is obviously far beyond the scope of this, I would interpret this as using an extant protocol to communicate, by hosting a server on the node or otherwise.

The second is a structured series of steps, which is exactly what the script is doing (the JSON file is an in-order sequence which is run in-order).

Since neither of these is what we're looking for judging by the above comments, where a "series of commands" is insufficient and we discussed not needing a server of any kind on the node, I am at a loss for what protocol means in this context. Can you elaborate what you mean by protocol?

>forget vdsm-reg. this bug once will be solved should provide replacement by supporting the new protocol at engine side. vdsm-reg will be kept to support older versions of engine, or we can provide some alternative. ovirt-engine considerations are not related to this bug.

The existing patch plus:

http://gerrit.ovirt.org/#/c/28920/2/registration/example.json

Completely implements the protocol at engine side.

However, integration of it into the TUI and autoinstall is difficult, if not outright impossible, without knowing when we should fall back to vdsm-reg, protocol v0, etc. The current registration logic depends on vdsm-reg. The registration logic in the submitted patches will not work with engines before 3.4 (at a guess on the date the new engine registration was merged), yet we have not broken compatibility with earlier versions. 

Since the new patches cannot work with earlier versions and we are not breaking compatibility with those versions, I need some guidance on an appropriate decision tree in order to modify the registration logic into some kind of combined "new registration if the engine supports it, vdsm-reg if it doesn't" amalgam in order to add support into the TUI and autoinstall.

Comment 25 Alon Bar-Lev 2014-06-19 15:06:34 UTC
Can we have someone more senior on this bug?

Comment 26 Ryan Barry 2014-06-19 16:23:50 UTC
Why don't we start over.

It would help if you explained what you want.

What should ovirt-node-registration do and how should it do it? It's clear that you want more than "a series of commands" (even if this is exactly what a consumer of the engine API will do).

There are two questions, with two possible answers to each:

Should there be a server presenting an API from the node side?
Or should it just implement the engine API as a client?

Should it totally break compatibility?
Or should it exist alongside vdsm-reg to maintain compatibility?

Anything beyond this goes into a design discussion.

If we're presenting an API, how should it be consumed?

If we're maintaining compatibility, when do we use the new registration script? Presenting options to users is meaningless. They don't care how the node registers to the engine, only that it happens, so this question is not appropriately answered by a "is the engine new enough (>3.2)?" checkbox.

If the answer to these questions is "use a protocol", you'll have to explain what you mean, perhaps in different terms. 

The engine host-registration is an HTTP/REST API (HTTP rather than REST because it functions purely with GETs and doesn't appear to do anything differently with POST or other HTTP verbs).

If you had to use phrasing other than "protocol" to describe what you're looking for out of ovirt-node-registration, what would it be?

Comment 27 Alon Bar-Lev 2014-06-19 16:44:43 UTC
(In reply to Ryan Barry from comment #26)
> Why don't we start over.
> 
> It would help if you explained what you want.
> 
> What should ovirt-node-registration do and how should it do it? It's clear
> that you want more than "a series of commands" (even if this is exactly what
> a consumer of the engine API will do).

Please read my previous responses.

1. notify manager about node existence.
2. establish ssh trust.
3. support doing (1), (2) via kernel command-line for pxe sequence.
4. support doing (1), (2) via TUI for manual sequence.
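For illustration only, items (3) and (4) can share one code path if the kernel command line is reduced to the same parameters the TUI would collect. A minimal sketch, assuming a hypothetical `node_registration_` argument prefix (this bug does not define any argument names; only the `url` and `fingerprint` parameters come from the description):

```python
import shlex

# Hypothetical prefix: the bug does not specify kernel argument names.
PREFIX = "node_registration_"


def parse_registration_args(cmdline):
    """Collect registration parameters from a kernel command line string."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        # Keep only key=value tokens that carry the assumed prefix.
        if sep and key.startswith(PREFIX):
            params[key[len(PREFIX):]] = value
    return params
```

On a running node the input string would typically be the contents of /proc/cmdline; the TUI could fill the same dictionary from user input.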

> There are two questions, with two possible answers to each:
> 
> Should there be a server presenting an API from the node side?

the node specifies a protocol, the server should implement the protocol.

> Or should it just implement the engine API as a client?

has nothing to do with API.

> Should it totally break compatibility?

this is new, how can it be compatible?

> Or should it exist alongside vdsm-reg to maintain compatibility?

parallel, unrelated. vdsm-reg is part of vdsm/ovirt, totally different project.

> Anything beyond this goes into a design discussion.
> 
> If we're presenting an API, how should it be consumed?

you do not, you are exposing a protocol.

> If we're maintaining compatibility, when do we use the new registration
> script? Presenting options to users is meaningless. They don't care how the
> node registers to the engine, only that it happens, so this question is not
> appropriately answered by a "is the engine new enough (>3.2)?" checkbox.

you do not.
 
> If the answer to these questions is "use a protocol", you'll have to explain
> what you mean, perhaps in different terms. 

a protocol is a series of data that is transferred between components, in this case it is between the node and the manager.
 
> The engine host-registration is an HTTP/REST API (HTTP rather than REST
> because it functions purely with GETs and doesn't appear to do anything
> differently with POST or other HTTP verbs).

again, please detach from current vdsm-reg/ovirt-engine implementation, these are irrelevant to the discussion.

> If you had to use phrasing other than "protocol" to describe what you're
> looking for out of ovirt-node-registration, what would it be?

...

Comment 28 Ryan Barry 2014-06-19 17:04:12 UTC
>Please read my previous responses.

In comments 12-14, we went back and forth while I tried to figure out what you meant, and came to the conclusion that the node should merely be consuming the API presented by the engine, with the node issuing nothing but GETs. 

>has nothing to do with API.

Ok, maybe I misunderstood the direction of the conversation until this point, and misconstrued all the discussion about vdsm-reg and host-deploy in the first few comments to mean that this was intended to serve as a replacement using the host-deploy API.

Is the purpose of this RFE to *expose* an API which can be consumed by others?

So:

http://node/?command=register&manager={engine}...

Which then establishes PKI trust, and presents something like:

https://node/?command=add-ssh-key&user={user}&key=deadbeef...

To add auth.

>3. support doing (1), (2) via kernel command-line for pxe sequence.
>4. support doing (1), (2) via TUI for manual sequence.

As options to trigger the same from the node without a client consuming the API and invoking command=register?

Comment 29 Alon Bar-Lev 2014-06-19 17:40:17 UTC
again, I am unsure why you keep mentioning API while the interface is a protocol. we require a protocol to be initiated by the node when its manager is known, over https, to let the manager know the node is alive and allow the node to establish ssh trust.

the first implementation of the protocol is initiated by the node using direct communication to the manager, as the entire point is that the manager needs to be aware of the node. so statements like https://node/ are incorrect by definition... as the node is unknown to the manager at this point.
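As a rough illustration of a node-initiated request, the sketch below builds the simple GET from the bug description (url?cmd=register&name=<name>&ip=<ip>&sshkeyfingerprint=<fingerprint>). The function name and the example values are assumptions, and the sketch only constructs the URL; sending it over https and verifying the root CA fingerprint are left out.

```python
from urllib.parse import urlencode


def build_register_url(manager_url, name, ip, ssh_key_fingerprint):
    """Build the node-initiated registration GET URL (hypothetical helper)."""
    query = urlencode({
        "cmd": "register",
        "name": name,
        "ip": ip,
        "sshkeyfingerprint": ssh_key_fingerprint,
    })
    # Append to an existing query string if the manager URL already has one.
    separator = "&" if "?" in manager_url else "?"
    return manager_url + separator + query
```

The node is always the caller here, which matches the point above: the manager URL is the only address known at this stage.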

future implementations may be based on multicast or other means by which the node can be detected without knowing what its manager is.

Comment 30 Ryan Barry 2014-06-19 18:00:07 UTC
Generally, I'd call "a series of steps carried out between components" a transaction. I keep calling it an API because:

The engine host-deploy presents an API. It consists of various methods which can be invoked/consumed independently by a client, and which don't carry any state that needs to be passed on to the next.

Invoking these individual methods in a specific order as a series of steps could be a protocol. 

As could stateful communication of services over JSON-RPC or XML-RPC.

The terms we use don't matter, and it's not an attempt at correction, I'm just trying to clarify the language so I understand what you mean.

Are we or are we not talking about consuming the engine host-deploy API from the node?

If the "first implementation of the protocol is initiated by the node", this seems as if we're discussing calling the series of steps in the registration script:

http://engine/services/host-register?version=1&cmd=get-pki-trust
http://engine/services/host-register?version=1&cmd=get-ssh-trust
http://engine/services/host-register?version=1&cmd=register...
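A minimal sketch of that ordered sequence, for illustration. The helper name is hypothetical, and the extra parameters of the final register step are elided above, so the sketch does not guess them:

```python
from urllib.parse import urlencode

# Command order as listed in this comment.
STEPS = ("get-pki-trust", "get-ssh-trust", "register")


def host_register_urls(engine_base, version=1):
    """Return the host-register URLs in the order the node would call them."""
    endpoint = engine_base.rstrip("/") + "/services/host-register"
    return [
        endpoint + "?" + urlencode({"version": version, "cmd": cmd})
        for cmd in STEPS
    ]
```

Each call is an independent GET; only the order in which the node issues them makes it a sequence.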

So now the manager is aware of the node, yes?

But from your comments earlier today, this is not the expected resolution of this RFE. 

Does the administrator approve the node, and otopi runs from the engine on the node through the SSH trust to finish it?

Does the engine then consume an API presented by the node?

What is the intended workflow?

Comment 31 Alon Bar-Lev 2014-06-19 18:04:09 UTC
We are going in circles, and you are trying to teach me a different language, which is something you should avoid.

You keep returning to ovirt-specific sequences; this is not the intention of this discussion.

I would like someone else to be assigned to this bug.

Comment 32 Ryan Barry 2014-06-19 18:14:13 UTC
Alon -

I am trying to clarify. I am not attempting to teach you different language, merely clarifying what the language you are using means to me and why I am confused. The language used does not matter as long as I understand what you are trying to say, and I currently do not.

We are going in circles because a number of comments (#5, #10, a few between you and me) make reference to vdsm-reg and replacing it.

But when I mention vdsm-reg, you tell me to forget it.

When I talk about consuming the engine's API, you tell me we are not consuming it, then refer to notifying the manager of the node's existence (which is done through the host-deploy API).

When we talked about whether or not the node should provide an API (in comments #11-14), it was determined that it should not. But then when I write a script to consume the engine's API and follow its protocol, I am told that was not the intention and the submitted code is useless.

Please explain, step by step, what you expect the node to provide and what a successful completion of this RFE looks like.

Comment 33 Ryan Barry 2014-06-19 19:18:59 UTC
Giving this one more shot:

I'm not sure if you're talking about a server on the node which the manager can talk to.

Or adding another page to the TUI and additional kernel arguments to support the submitted generic registration while also keeping plugin-vdsm around.

Comment 34 Alon Bar-Lev 2014-09-01 07:51:14 UTC
Refresh of this bug: bug#1135921

*** This bug has been marked as a duplicate of bug 1135921 ***