Tested on CFME-18.104.22.168 & RHV-4.0.5.
The RHV provider is indeed discovered now,
but after CFME discovers and adds the provider,
the provider refresh does not run.
"This is because access with IP addresses doesn't
work in 4.0, it is a side effect of changes in the SSO service.
This needs careful investigation, and we may need to do reverse lookups of
the addresses in order to find the host name. "
How would you like to handle this please?
After discovery, is the provider added via its IP address?
It seems odd to me to have a provider defined via IP address rather than via FQDN.
For discovery, a range of IP addresses is provided.
After RHV is discovered, it is added with the name
"RHEV-M(<The ip address of the RHV>)",
and the hostname is <The ip address of the RHV>.
Independently of discovery, for RHV-3.6 or below it is possible in CFME to add an RHV provider using the IP address as the hostname instead of the FQDN.
Juan - any thoughts on the complexity to get the FQDN?
Any issues there?
I guess there may be different DNS settings that might have issues with it.
Getting the FQDN isn't complex, if we assume that the user has a working DNS setup. We know that this tends to be false. My suggestion is to try the reverse DNS lookup, but use the IP address anyhow if that fails. That is what the proposed patch does:
Resolve oVirt IP addresses
As there may be cases where the user really wants to use the IP address, the proposed patch also adds a configuration parameter to disable this reverse resolving:
The pull request is this one:
Resolve oVirt IP addresses
It is merged upstream and marked to be backported with the 'euwe/yes' label.
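To make the approach described above concrete, here is a minimal sketch of "try the reverse lookup, fall back to the IP address, and allow the lookup to be disabled". It is not the code from the pull request: the method name `resolve_ip_address`, the `resolve:` flag, and the injectable `resolver:` argument are hypothetical names chosen for illustration. Only Ruby's standard `Resolv` and `IPAddr` libraries are assumed.

```ruby
require 'resolv'
require 'ipaddr'

# Hypothetical helper, not the actual ManageIQ implementation.
# Returns the name found by reverse DNS lookup, or the original
# address when resolving is disabled, the input is not an IP
# address, or the lookup fails.
def resolve_ip_address(address, resolve: true, resolver: Resolv)
  # The (assumed) configuration switch: keep the IP address as given.
  return address unless resolve

  begin
    IPAddr.new(address)
  rescue IPAddr::InvalidAddressError
    # Not an IP address (probably already a host name): nothing to do.
    return address
  end

  resolver.getname(address)
rescue Resolv::ResolvError
  # No working reverse DNS for this address: use the IP address anyhow.
  address
end

puts resolve_ip_address("engine.example.com")            # returned unchanged
puts resolve_ip_address("192.168.122.18", resolve: false) # returned unchanged
```

The `resolver:` argument exists only so the fallback path can be exercised without a real DNS setup; any object that responds to `getname` and raises `Resolv::ResolvError` on failure will do.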
Ilanit, I believe that the message that you see in the log:
[ActiveRecord::RecordInvalid]: Validation failed: Host Name has already been taken Method:[rescue in block in refresh]
Is caused because ManageIQ validates that the 'name' attribute of the 'Host' entity is unique:
If I understand that correctly then this may happen if you have the same oVirt environment added as provider twice, maybe once with the IP address of the engine and another time with the fully qualified host name of the engine. Do you have that?
It may also happen if you have different oVirt environments added as providers, and they happen to have different hosts that have the same name. For example, I can have a host named 'myhost' in one oVirt engine, and another host also named 'myhost' in a different oVirt engine. If you add those two oVirt environments as providers to ManageIQ, then you will see this problem when trying to save the inventory. Do you have such a configuration? If this is the root cause of the problem, then I'd say it is a different bug, one which will happen with or without discovery.
Marcel, can you confirm/reject the above hypothesis?
Thanks Juan for the explanation.
On a CFME that is already connected to an RHV env, adding the same RHV env again using its IP address fails refresh in the same way,
and thus the problem mentioned in comment #13 is indeed unrelated to Provider Discovery.
Thus moving bug to Verified.
Opened this bug to get rid of the error in the case described in comment #13:
Juan and Ilanit, that is right. A host.name has to be unique across the CFME db.
I'm investigating whether this is still a valid assessment.
Juan, actually this exception with a hostname already taken should not be raised.
I think the reason it has to be unique across EMSs is that we use it to "steal" archived hosts from old EMSs.
Could you revisit the backtrace in that light?
Maybe it's still related to connecting to the same env twice.
Marcel, according to the backtrace the exception happens here, when calling ems.save!:
And that happens after calling the 'save_hosts_inventory' method, which is the place where the exception is caught and handled.
So, if I understand correctly, this happens when ActiveRecord automatically persists the EMS to host relationship:
The exception isn't handled in this case.
Yes, but it should not happen, because https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_infra.rb#L149 assigns it to the previous ems. All in all the code tries very hard to find the host with the name in question.
So there might be a hidden bug, but if it's not easy to reproduce (e.g. with a db dump that exhibits this) then I would not investigate further either.
I just reproduced it with the latest master. Just added the same oVirt system twice, first with a host name and then with an IP address. This is what I see in 'evm.log':
[----] E, [2017-03-28T12:00:39.875827 #12552:2acb8866311c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api3#refresh) EMS: [192.168.122.18], id:  Refresh failed
[----] E, [2017-03-28T12:00:39.875994 #12552:2acb8866311c] ERROR -- : [ActiveRecord::RecordInvalid]: Validation failed: Host Name has to be unique per provider type Method:[rescue in block in refresh]
[----] E, [2017-03-28T12:00:39.876077 #12552:2acb8866311c] ERROR -- : /files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/validations.rb:78:in `raise_validation_error'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:324:in `block in save!'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:395:in `block in with_transaction_returning_status'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:232:in `block in transaction'
/files/projects/ManageIQ/manageiq/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:91:in `block in refresh_targets_for_ems'
The message is slightly different; it was changed in this commit:
From that I understand that the message we saw before was not really about a host name already taken, but about the *endpoint* host name already taken. So this is happening just because there are two providers with the same host name. I guess there is a point where this is validated, before actually adding the provider. What is most likely happening is that the validation is performed *before* we do the reverse lookup to convert the IP address to a name, so it passes, because it compares the host name used by the previously existing provider with the IP address of the new provider. Later, when doing the refresh, it fails, because we try to update the database with the resolved name.
Marcel, if that is the case, we can either do the name resolving before that initial validation, or else avoid updating the database after resolving. What do you suggest?
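The sequence hypothesized above can be illustrated with a small stand-alone sketch. It only models the hypothesis and is not ManageIQ code: `EndpointRegistry`, `save`, the provider labels, and `engine.example.com` are all hypothetical, and an in-memory hash stands in for the database and its uniqueness validation. The error text is copied from the log above.

```ruby
# Hypothetical in-memory stand-in for the endpoint table and its
# uniqueness validation; illustrates the hypothesis only.
class EndpointRegistry
  DuplicateHostname = Class.new(StandardError)

  def initialize
    @hostnames = {} # provider label => endpoint host name
  end

  # Stores the host name for a provider, rejecting a name that
  # another provider already uses.
  def save(provider, hostname)
    taken = @hostnames.any? { |other, name| other != provider && name == hostname }
    raise DuplicateHostname, "Host Name has to be unique per provider type" if taken

    @hostnames[provider] = hostname
  end
end

registry = EndpointRegistry.new

# First provider, added with the fully qualified host name.
registry.save("rhv-by-name", "engine.example.com")

# Second provider, added with the IP address. The check passes,
# because the IP address is compared with the existing host name.
registry.save("rhv-by-ip", "192.168.122.18")

# Later, during refresh, the IP address has been resolved to the
# same name, and saving it now collides with the first provider.
refresh_error = begin
  registry.save("rhv-by-ip", "engine.example.com")
  nil
rescue EndpointRegistry::DuplicateHostname => e
  e.message
end

puts refresh_error
```

In this model, the two options mentioned above correspond to either resolving the address before the second `save` call (so the conflict is reported when the provider is added) or never issuing the third `save` call with the resolved name.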
Good find - and luckily we changed the error message.
Where in the refresh are we changing the hostname of an endpoint? I would not expect this...
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.