Bug 1421003 - [Docs][RFE][SHE] Cleanup script for complete cleaning the hosted engine VM installation after failed installation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ovirt-4.1.1-1
Target Release: ---
Assignee: Emma Heftman
QA Contact: Megan Lewis
URL:
Whiteboard:
Depends On: 1001181
Blocks:
Reported: 2017-02-10 05:36 UTC by Tahlia Richardson
Modified: 2019-05-07 13:13 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-13 10:39:15 UTC
oVirt Team: Docs
Target Upstream Version:
Embargoed:



Description Tahlia Richardson 2017-02-10 05:36:04 UTC
RHV now provides a script to clean up a failed SHE deployment. This likely supersedes the current info in Cleaning Up a Failed Self-hosted Engine Deployment in the SHE Guide. 

A new procedure is required (cleaning the shared storage appears not to be part of the script).

Comment 1 Lucy Bopf 2017-03-06 05:08:46 UTC
Assigning to Emma for review.

Comment 2 Emma Heftman 2017-03-20 15:21:16 UTC
Hi Simone,
With regard to the script to clean up a failed HE installation, could you please explain step 4? The command seems to show only how to clean up the storage, not how to choose a different one.
Also, did you mean to choose a path other than /mnt/nsednev_he_4/* but use the same rm -rf command?

Relevant script:

Interrupt hosted-engine --deploy:
^C[ ERROR ] Failed to execute stage 'Closing up': SIG2
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20170206145745.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue,fix and redeploy
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20170206144452-sj4lh8.log

3)Run /usr/sbin/ovirt-hosted-engine-cleanup to remove any leftover from the host:
# /usr/sbin/ovirt-hosted-engine-cleanup
 This will de-configure the host to run ovirt-hosted-engine-setup from scratch. 
Caution, this operation should be used with care.

Are you sure you want to proceed? [y/n]
y
  -=== Destroy hosted-engine VM ===- 
  -=== Stop HA services ===- 
  -=== Shutdown sanlock ===- 
shutdown force 1 wait 0
shutdown done 0
  -=== Disconnecting the hosted-engine storage domain ===- 
  -=== De-configure VDSM networks ===- 
  -=== Stop other services ===- 
  -=== De-configure external daemons ===- 
  -=== Removing configuration files ===- 
? /etc/init/libvirtd.conf already missing
- removing /etc/libvirt/nwfilter/vdsm-no-mac-spoofing.xml
? /etc/ovirt-hosted-engine/answers.conf already missing
- removing /etc/ovirt-hosted-engine/hosted-engine.conf
- removing /etc/vdsm/vdsm.conf
- removing /etc/pki/vdsm/certs/cacert.pem
- removing /etc/pki/vdsm/certs/vdsmcert.pem
- removing /etc/pki/vdsm/keys/vdsmkey.pem
- removing /etc/pki/vdsm/libvirt-spice/ca-cert.pem
- removing /etc/pki/vdsm/libvirt-spice/ca-key.pem
- removing /etc/pki/vdsm/libvirt-spice/server-cert.pem
- removing /etc/pki/vdsm/libvirt-spice/server-key.pem
? /etc/pki/CA/cacert.pem already missing
? /etc/pki/libvirt/*.pem already missing
? /etc/pki/libvirt/private/*.pem already missing
? /etc/pki/ovirt-vmconsole/*.pem already missing
- removing /var/cache/libvirt/qemu
- removing /var/run/ovirt-hosted-engine-ha/vm.conf
- removing /var/run/ovirt-hosted-engine-ha/vm.conf.20170206145727

4)Clean the shared storage or choose a different one:
rm -rf /mnt/nsednev_he_4/*

Comment 3 Simone Tiraboschi 2017-03-27 12:39:40 UTC
(In reply to Emma Heftman from comment #2)
> Hi Simone,
> With regard to the script to clean up a failed HE installation, could you
> please explain step 4? The command seems to show only how to clean up the
> storage, not how to choose a different one.

/usr/sbin/ovirt-hosted-engine-cleanup simply cleans up the host where it's run, to ensure we don't have any leftovers.
Cleaning up the shared storage device is instead a manual action left to the user: how to clean it up depends on the specific storage technology (NFS vs iSCSI vs FC vs GlusterFS), but in general it's better done on the storage server side.

The user must clean up the shared storage if they want to try again on it, or they can deploy on a different device (a different iSCSI or FC LUN, a different NFS share, a different Gluster volume). In both cases, cleaning up the host with /usr/sbin/ovirt-hosted-engine-cleanup is recommended.
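
As a rough illustration of the host-side step described above, the following shell sketch runs the cleanup script only if it is present and executable. The function name run_host_cleanup and its optional path argument are hypothetical and exist only for this example; the default path is the one named in this bug.

```shell
# Sketch only: run the host-side cleanup script if it is present.
# run_host_cleanup and its optional path argument are hypothetical,
# not part of RHV.
run_host_cleanup() {
    cleanup_bin="${1:-/usr/sbin/ovirt-hosted-engine-cleanup}"
    if [ ! -x "$cleanup_bin" ]; then
        echo "cleanup script not found or not executable: $cleanup_bin" >&2
        return 1
    fi
    # The script prompts for confirmation ([y/n]) before de-configuring the host.
    "$cleanup_bin"
}
```

Calling run_host_cleanup with no argument would use the default path. The shared storage still has to be cleaned separately, or a different device chosen.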

Comment 5 Emma Heftman 2017-03-28 13:05:15 UTC
Hi Simone
Please review and comment on the updated cleanup section in the SHE Installation guide.
Note that I left the original error message that appeared in the documentation, but it appears to be different from the messages in your bug. Please confirm what should appear.

http://file.tlv.redhat.com/~eheftman/bz1421003/html-single/#Cleaning_Up_a_Failed_Self-hosted_Engine_Deployment

Comment 6 Simone Tiraboschi 2017-03-28 15:23:54 UTC
(In reply to Emma Heftman from comment #5)
> Please confirm what should appear.


rm -rf /mnt/nsednev_he_4/*
was specific to Nikolai Sednev's host; how to clean it up depends on how and where you tried to set it up.

The error message on partial deployment could vary depending on where it got interrupted.

Comment 7 Emma Heftman 2017-03-29 13:14:20 UTC
(In reply to Simone Tiraboschi from comment #6)
> (In reply to Emma Heftman from comment #5)
> > Please confirm what should appear.
> 
> 
> rm -rf /mnt/nsednev_he_4/*
> was specific to Nikolai Sednev's host; how to clean it up depends on how and
> where you tried to set it up.
> 
> The error message on partial deployment could vary depending on where it got
> interrupted.

Thanks, Simone. For the storage cleanup, should I just write the following?
rm -rf <storage directory>/*
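
For file-based storage such as an NFS export, a generic storage step could be wrapped in a small guard so that an empty or wrong path does not delete unintended data. This is a minimal sketch under that assumption: clean_storage_dir is a hypothetical helper, not something shipped with RHV, and it does not cover iSCSI, FC, or GlusterFS, where cleanup is done differently and, as noted in comment 3, is generally better handled on the storage server side.

```shell
# Sketch only: empty a file-based (for example, NFS) storage directory.
# clean_storage_dir is a hypothetical helper, not part of RHV.
clean_storage_dir() {
    dir="$1"
    # Refuse an empty argument or the filesystem root.
    case "$dir" in
        ""|"/") echo "refusing to clean '$dir'" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Delete everything below $dir, including hidden files, but keep $dir itself.
    find "$dir" -mindepth 1 -delete
}
```

Unlike a plain rm -rf with a trailing /*, this form also removes hidden files and fails early when the path argument is missing.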

Comment 8 Emma Heftman 2017-03-29 14:55:07 UTC
Hi Nikolai
Could you please review the new cleanup documentation and especially step 3 which discusses cleaning up storage.

http://file.tlv.redhat.com/~eheftman/bz1421003/html-single/#Cleaning_Up_a_Failed_Self-hosted_Engine_Deployment

Comment 10 Nikolai Sednev 2017-04-03 09:49:43 UTC
(In reply to Emma Heftman from comment #8)
> Hi Nikolai
> Could you please review the new cleanup documentation and especially step 3
> which discusses cleaning up storage.
> 
> http://file.tlv.redhat.com/~eheftman/bz1421003/html-single/#Cleaning_Up_a_Failed_Self-hosted_Engine_Deployment

Looks fine, I have no objections.

Comment 11 Emma Heftman 2017-04-03 10:04:32 UTC
Hi Megan

Please review this merge request:
https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/merge_requests/385

Link to updated documentation:
http://file.tlv.redhat.com/~eheftman/bz1421003/html-single/#Cleaning_Up_a_Failed_Self-hosted_Engine_Deployment

