Bug 1417757
Summary: | CF fails to provider discover RHV4.0 | ||
---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Satoe Imaishi <simaishi> |
Component: | Providers | Assignee: | Juan Hernández <juan.hernandez> |
Status: | CLOSED ERRATA | QA Contact: | Ilanit Stein <istein> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 5.6.0 | CC: | cbudzilo, cpelland, istein, jfrey, jhardy, juan.hernandez, masayag, mbetak, mhild, obarenbo, oourfali, simaishi |
Target Milestone: | GA | Keywords: | ZStream |
Target Release: | 5.7.2 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | rhev:discovery | ||
Fixed In Version: | 5.7.2.0 | Doc Type: | Release Note |
Doc Text: |
This release corrects an issue with RHV server refusing to authenticate requests that use the IP address instead of the fully qualified host name.
The RHV provider has been modified so that when it receives an IP address instead of a fully qualified host name, it will try to find the corresponding fully qualified host name, doing a reverse DNS lookup if required.
If a user does not want to use DNS, the RHV server can be explicitly configured to accept IP addresses.
|
Story Points: | --- |
Clone Of: | 1382732 | Environment: | |
Last Closed: | 2017-04-12 14:36:34 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | RHEVM | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1382732 | ||
Bug Blocks: |
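The behavior described in the Doc Text above (try a reverse DNS lookup when given an IP address, keep the IP address if the lookup fails, and allow the lookup to be switched off) can be sketched roughly as follows. This is only an illustration, not the code from the ManageIQ pull request: the method name `resolve_ip_address` and its `resolve:` and `resolver:` parameters are hypothetical, and the only library calls used are `IPAddr.new` and `Resolv.getname` from the Ruby standard library.

```ruby
require 'ipaddr'
require 'resolv'

# Hypothetical helper (not the ManageIQ implementation).
# Returns a host name for an IP address when reverse DNS succeeds,
# otherwise returns the input unchanged.
def resolve_ip_address(address, resolve: true, resolver: Resolv)
  # The lookup can be switched off, mirroring the idea of a
  # configuration parameter such as :resolve_ip_addresses.
  return address unless resolve

  # Only attempt a reverse lookup if the input parses as an IP address;
  # anything else is assumed to be a host name already.
  begin
    IPAddr.new(address)
  rescue IPAddr::Error
    return address
  end

  begin
    resolver.getname(address).to_s
  rescue Resolv::ResolvError
    # No PTR record or no working DNS: fall back to the IP address.
    address
  end
end
```

With a working DNS setup, `resolve_ip_address('192.0.2.10')` would return whatever name the PTR record holds; without one, it returns `'192.0.2.10'` unchanged. The `resolver:` parameter exists only so the sketch can be exercised without real DNS.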
Comment 2
Ilanit Stein
2017-02-02 10:04:09 UTC
After discovery the provider is added via IP address? It seems weird to me to have a provider defined via IP address rather than via FQDN.

For the discovery, an IP address range is provided. After RHV is discovered, it is added with the name "RHEV-M(<The ip address of the RHV>)", and the hostname is <The ip address of the RHV>. Regardless of discovery, it is possible on CFME to add a RHV provider using the IP address as the hostname, instead of the FQDN, for RHV-3.6 or below.

Juan - any thoughts on the complexity to get the FQDN? Any issues there? I guess there may be different DNS settings that might have issues with it.

Getting the FQDN isn't complex, if we assume that the user has a working DNS setup. We know that this tends to be false. My suggestion is to try the reverse DNS lookup, but use the IP address anyhow if that fails. That is what the proposed patch does:

  Resolve oVirt IP addresses
  https://github.com/ManageIQ/manageiq/pull/13767

As there may be cases where the user really wants to use the IP address, the proposed patch also adds a configuration parameter to disable this reverse resolving:

  :ems:
    :ems_redhat:
      :resolve_ip_addresses: true

Makes sense.

The pull request is this one:

  Resolve oVirt IP addresses
  https://github.com/ManageIQ/manageiq/pull/13767

It is merged upstream and marked to be backported with the 'euwe/yes' label.

Ilanit, I believe that the message that you see in the log:

  [ActiveRecord::RecordInvalid]: Validation failed: Host Name has already been taken Method:[rescue in block in refresh]

is caused because ManageIQ validates that the 'name' attribute of the 'Host' entity is unique:

  https://github.com/ManageIQ/manageiq/blob/euwe-2/app/models/host.rb#L42

If I understand that correctly, then this may happen if you have the same oVirt environment added as a provider twice, maybe once with the IP address of the engine and another time with the fully qualified host name of the engine. Do you have that?
It may also happen if you have different oVirt environments added as providers, and they happen to have different hosts that have the same name. For example, I can have a host named 'myhost' in one oVirt engine, and another host also named 'myhost' in a different oVirt engine. If you add those two oVirt environments as providers to ManageIQ, then you will see this problem when trying to save the inventory. Do you have such a configuration? If this is the root cause of the problem, then I'd say it is a different bug, one which will happen with or without discovery. Marcel, can you confirm/reject the above hypothesis?

Thanks Juan for the explanation. On a CFME that is already connected to a RHV env, adding the same RHV env again using the IP address fails refresh in the same way, and thus the problem mentioned in comment #13 is indeed unrelated to Provider Discovery. Thus moving the bug to Verified. Opened this bug, for having no error in the case described in comment #13: bug 1436199

Juan and Ilanit, that is right. A host.name has to be unique across the CFME db. I'm investigating whether this is still a valid assessment.

Juan, actually this exception with a hostname already taken should not be raised.

  https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_infra.rb#L179-L184

I think the reason it has to be unique across EMSs is that we use it to "steal" archived hosts from old EMSs

  https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_infra.rb#L149

and

  https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_infra.rb#L376-L384

Could you revisit the backtrace in that light?
Maybe it's still related to connecting to the same env twice.

Marcel, according to the backtrace the exception happens here, when calling ems.save!:

  https://github.com/ManageIQ/manageiq/blob/euwe-2/app/models/ems_refresh/save_inventory_infra.rb#L74

And that happens after calling the 'save_hosts_inventory' method, which is the place where the exception is caught and handled. So, if I understand correctly, this happens when ActiveRecord automatically persists the EMS to host relationship:

  https://github.com/ManageIQ/manageiq/blob/euwe-2/app/models/ext_management_system.rb#L34

The exception isn't handled in this case.

Yes, but it should not happen, because

  https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_infra.rb#L149

assigns it to the previous EMS. All in all the code tries very hard to find the host with the name in question. So there might be a hidden bug, but if it's not easy to reproduce (e.g. with a db dump that exhibits this) then I would not investigate further either.

I just reproduced it with the latest master. Just added the same oVirt system twice, first with a host name and then with an IP address. This is what I see in 'evm.log':

---8<---
[----] E, [2017-03-28T12:00:39.875827 #12552:2acb8866311c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api3#refresh) EMS: [192.168.122.18], id: [2] Refresh failed
[----] E, [2017-03-28T12:00:39.875994 #12552:2acb8866311c] ERROR -- : [ActiveRecord::RecordInvalid]: Validation failed: Host Name has to be unique per provider type Method:[rescue in block in refresh]
[----] E, [2017-03-28T12:00:39.876077 #12552:2acb8866311c] ERROR -- : /files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/validations.rb:78:in `raise_validation_error'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/validations.rb:50:in `save!'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/attribute_methods/dirty.rb:30:in `save!'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:324:in `block in save!'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:395:in `block in with_transaction_returning_status'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:232:in `block in transaction'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/connection_adapters/abstract/transaction.rb:189:in `within_new_transaction'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:232:in `transaction'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:211:in `transaction'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:392:in `with_transaction_returning_status'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/transactions.rb:324:in `save!'
/files/rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/activerecord-5.0.2/lib/active_record/suppressor.rb:45:in `save!'
/files/projects/ManageIQ/manageiq/app/models/ems_refresh/save_inventory_infra.rb:74:in `save_ems_infra_inventory'
/files/projects/ManageIQ/manageiq/app/models/ems_refresh/save_inventory.rb:14:in `save_ems_inventory'
/files/projects/ManageIQ/manageiq/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:156:in `save_inventory'
/files/projects/ManageIQ/manageiq/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:91:in `block in refresh_targets_for_ems'
--->8---

The message is slightly different; it was changed in this pull request:

  https://github.com/ManageIQ/manageiq/pull/12912

From that I understand that the message we saw before was not really about a host name already taken, but about the *endpoint* host name already taken. So this is happening just because there are two providers with the same host name. I guess there is a point where this is validated, before actually adding the provider. What is most likely happening is that the validation is performed *before* we do the reverse lookup to convert the IP address to a name, so it passes, because it compares the host name used by the previously existing provider with the IP address of the new provider. Later, when doing the refresh, it fails, because we try to update the database with the resolved name. Marcel, if that is the case, we can either do the name resolving before that initial validation, or else avoid updating the database after resolving. What do you suggest?

Good find - and luckily we changed the error message. Where in the refresh are we changing the hostname of an endpoint? I would not expect this...

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:0898 |
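The hypothesis raised in the discussion above, that the endpoint uniqueness check runs before the reverse lookup and so passes for an IP address whose resolved name is already in use, can be illustrated with a small stand-alone simulation. Nothing here is ManageIQ code and the thread does not confirm the hypothesis: the method names, the `REVERSE_DNS` table, and the address and host name in it are all invented for the sketch, and a plain `Set` stands in for the stored endpoint host names.

```ruby
require 'set'

# Hypothetical stand-in for reverse DNS; address and name are invented.
REVERSE_DNS = { '192.0.2.10' => 'engine.example.com' }.freeze

def resolve(address)
  REVERSE_DNS.fetch(address, address)
end

# Order suspected in the thread: uniqueness is checked on the raw input,
# and the resolved name is only written later, during refresh.
def add_validate_then_resolve(endpoints, address)
  return :rejected_at_add if endpoints.include?(address)

  resolved = resolve(address)
  return :fails_at_refresh if endpoints.include?(resolved)

  endpoints << resolved
  :added
end

# Alternative order: resolve first, so the duplicate is caught up front.
def add_resolve_then_validate(endpoints, address)
  resolved = resolve(address)
  return :rejected_at_add if endpoints.include?(resolved)

  endpoints << resolved
  :added
end
```

Starting from `Set['engine.example.com']` and adding `'192.0.2.10'`, the first ordering accepts the provider and only collides later, while the second rejects it immediately. This corresponds to the first of the two options proposed in the thread (resolving before the initial validation); the other option, not updating the database after resolving, is not modeled here.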