Bug 1255779 - RFE: Allow reusing installed hosts in other deployments
Summary: RFE: Allow reusing installed hosts in other deployments
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - RHCI
Version: 1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 1.0
Assignee: John Matthews
QA Contact: Dave Johnson
Docs Contact: Dan Macpherson
URL:
Whiteboard:
Duplicates: 1254712
Depends On:
Blocks: 1331555
Reported: 2015-08-21 14:30 UTC by Antonin Pagac
Modified: 2016-04-29 16:17 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Clones: 1331555
Environment:
Last Closed: 2016-04-28 19:21:54 UTC
Target Upstream Version:
Embargoed:




Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla 1255889 | 0 | unspecified | VERIFIED | second deployment lists hosts as available which were successfully used in a first deployment | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1255893 | 0 | unspecified | NEW | [RFE]: allow for amending deployments; RHEV is already installed, now deploy CF | 2021-02-22 00:41:40 UTC

Internal Links: 1255889 1255893

Description Antonin Pagac 2015-08-21 14:30:01 UTC
Description of problem: Currently, trying to reuse already-installed hosts in another deployment produces an error and the installation cannot continue. If the user decides to redeploy, he should be able to do so successfully. A simple example use case:

1. User does a deploy of RHEV called Dev1.
2. Deployment finishes successfully.
3. User then decides he suddenly wants CFME also - how to do this?
4. So he makes a second deployment, of RHEV and CFME, called Dev2.
5. Previously used hosts appear in the installation options.
6. Installation starts and completes successfully.
7. User now has deployment Dev2 with RHEV and CFME.

Comment 1 John Matthews 2015-08-21 14:54:06 UTC
*** Bug 1254712 has been marked as a duplicate of this bug. ***

Comment 2 Dave Johnson 2015-08-21 19:43:18 UTC
So the proper procedure would include deleting the hosts from Satellite first, allowing them to re-PXE into foreman discovery, and then trying again.

We also opened bug 1255893 and bug 1255889, which are related.

Comment 3 Dave Johnson 2015-08-21 19:44:15 UTC
Grr, I meant to mention: QE is going to test this out and see how it goes. This will probably need to move to documentation, but I want to see it work before doing that.

Comment 4 Antonin Pagac 2015-08-24 11:39:45 UTC
After deleting and rediscovering the hosts and starting a second deployment, the Actions::Fusor::Deployment::Rhev::Deploy task hangs until it times out:

Foreman::Exception: ERF42-7017 [Foreman::Exception]: You've reached the timeout set for this action. If the action is still ongoing, you can click on the "Resume Deployment" button to continue.

This is because one of the machines is trying to download a kickstart file, which fails. foreman-proxy/proxy.log says:

ERROR -- : Failed to retrieve provision template for e4d36a7f-44b1-4f01-a314-0df432e9954f: Error retrieving provision for e4d36a7f-44b1-4f01-a314-0df432e9954f from sat.rhci.com: Net::HTTPNotFound
    "GET /unattended/provision?token=e4d36a7f-44b1-4f01-a314-0df432e9954f HTTP/1.1" 500 184 0.0567

with corresponding production.log message:

   Parameters: {"token"=>"e4d36a7f-44b1-4f01-a314-0df432e9954f", "unattended"=>{}}
unattended: unable to find a host that matches the request from 192.168.77.1
Filter chain halted as :get_host_details rendered or redirected
Completed 404 Not Found in 3ms (ActiveRecord: 0.9ms)

I tried downloading the kickstart multiple times and also restarted the host; nothing helped.
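
To line up the two log excerpts above, a small script can pull the provisioning token out of the proxy.log failure line and the client address out of the production.log "unable to find a host" line. This is only a sketch: the sample lines are copied from this comment, while the function names and regular expressions are hypothetical helpers, not part of Foreman or Fusor.

```python
import re

# Sample lines copied from the excerpts in this comment.
PROXY_LINE = (
    "ERROR -- : Failed to retrieve provision template for "
    "e4d36a7f-44b1-4f01-a314-0df432e9954f: Error retrieving provision for "
    "e4d36a7f-44b1-4f01-a314-0df432e9954f from sat.rhci.com: Net::HTTPNotFound"
)
PRODUCTION_LINE = (
    "unattended: unable to find a host that matches the request from 192.168.77.1"
)

# Hypothetical patterns, written against the two sample lines only.
TOKEN_RE = re.compile(
    r"Failed to retrieve provision template for ([0-9a-f-]{36}):"
)
NO_HOST_RE = re.compile(
    r"unable to find a host that matches the request from (\S+)"
)


def failed_tokens(lines):
    """Return provisioning tokens named in proxy.log failure lines."""
    tokens = []
    for line in lines:
        match = TOKEN_RE.search(line)
        if match:
            tokens.append(match.group(1))
    return tokens


def unmatched_clients(lines):
    """Return client addresses for which no host record was found."""
    clients = []
    for line in lines:
        match = NO_HOST_RE.search(line)
        if match:
            clients.append(match.group(1))
    return clients


if __name__ == "__main__":
    print(failed_tokens([PROXY_LINE]))
    print(unmatched_clients([PRODUCTION_LINE]))
```

A token that shows up in proxy.log but belongs to no host in Satellite would fit the "unable to find a host" message; the script only extracts the values and does not query Satellite.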

Comment 5 John Matthews 2015-08-24 12:05:50 UTC
(In reply to Antonin Pagac from comment #4)
> After delete and re-discovery of the hosts and start of a second deployment,
> the Actions::Fusor::Deployment::Rhev::Deploy tasks hangs on timeout:
> 
> Foreman::Exception: ERF42-7017 [Foreman::Exception]: You've reached the
> timeout set for this action. If the action is still ongoing, you can click
> on the "Resume Deployment" button to continue.
> 
> This is because one of the machines is trying to download a kickstart file,
> which fails. foreman-proxy/proxy.log says:
> 
> ERROR -- : Failed to retrieve provision template for
> e4d36a7f-44b1-4f01-a314-0df432e9954f: Error retrieving provision for
> e4d36a7f-44b1-4f01-a314-0df432e9954f from sat.rhci.com: Net::HTTPNotFound
>     "GET /unattended/provision?token=e4d36a7f-44b1-4f01-a314-0df432e9954f
> HTTP/1.1" 500 184 0.0567
> 
> with corresponding production.log message:
> 
>    Parameters: {"token"=>"e4d36a7f-44b1-4f01-a314-0df432e9954f",
> "unattended"=>{}}
> unattended: unable to find a host that matches the request from 192.168.77.1
> Filter chain halted as :get_host_details rendered or redirected
> Completed 404 Not Found in 3ms (ActiveRecord: 0.9ms)
> 
> I tried to download the kickstart multiple times, also restarting the host,
> nothing helped.


This sounds similar to what happens when a provisioning token times out. I wonder if the deletion was incomplete and an older token was used to fetch the kickstart.

I'd recommend chatting with Sat6 QE to learn how they test discovery and how they delete and rediscover a host.

Comment 6 Antonin Pagac 2015-08-25 18:14:11 UTC
After a second try, I think we made a mistake when deleting hosts from Satellite. All content hosts should be deleted, as well as all hosts listed under the Hosts -> All hosts menu in Satellite. Then the host machines can be rebooted, rediscovered and successfully used in a second deployment.

However, this holds only for host provisioning and the RHEV deployment. When the deployment continues and the CFME installation begins, we hit an error:

Net::SSH::HostKeyMismatch: fingerprint 31:68:48:1e:e4:af:65:34:fb:1c:4c:3c:8c:4c:4a:4b does not match for "192.168.77.102"

while running the Actions::Fusor::Deployment::Rhev::TransferImage task.
This makes me think that some state still persists somewhere, so the second deployment cannot complete successfully.
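
A Net::SSH::HostKeyMismatch error typically means a known_hosts file still holds the key of the previous installation at that address. With OpenSSH the usual way to clear such an entry is `ssh-keygen -R 192.168.77.102`. The sketch below shows the same idea on the file content; it is an illustration under assumptions, since this thread does not say which known_hosts file the deployment task reads. The function name is hypothetical, and hashed entries or `[host]:port` forms are not handled.

```python
def drop_host_entries(known_hosts_text, host):
    """Return known_hosts content without plain-text entries for `host`.

    Assumption: entries are unhashed, i.e. each line starts with a
    comma-separated list of host names or addresses (optionally preceded
    by a marker such as @cert-authority). A line that lists `host`
    among several names is dropped as a whole.
    """
    kept = []
    for line in known_hosts_text.splitlines():
        stripped = line.strip()
        # Keep blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        fields = stripped.split()
        # Skip an optional leading marker to reach the host-names field.
        if fields[0].startswith("@") and len(fields) > 1:
            names_field = fields[1]
        else:
            names_field = fields[0]
        if host in names_field.split(","):
            continue  # stale entry for the redeployed host
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    sample = (
        "192.168.77.102 ssh-rsa AAAAexampleOldKey\n"
        "192.168.77.101 ssh-rsa AAAAexampleOtherKey\n"
    )
    print(drop_host_entries(sample, "192.168.77.102"), end="")
```

The key material in the sample is a placeholder, not data from this deployment.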

Comment 7 Todd Sanders 2016-04-28 19:21:54 UTC
Cloned to bug 1331555 for the SSH fingerprint left in known hosts. Closing this RFE, as the host was not properly deleted.

