Description of problem:
Deploying HE over block storage that already contains some data (partitions, to be specific) fails due to BZ 1215427 (NOTABUG). Once BZ 1343043 is released, we will have notes regarding it. However, during Hosted Engine deployment the setup specifically asks the user to answer "Force", so the user assumes the existing data will not be a problem (but it will be).

[ INFO ] Connecting to the storage server
The following luns have been found on the requested target:
[1] 36001405227484ee340640e38e286ca39 70GiB LIO-ORG hostedengine2
status: used, paths: 1 active
Please select the destination LUN (1) [1]:
The selected device is already used.
To create a vg on this device, you must use Force.
WARNING: This will destroy existing data on the device.
(Force, Abort)[Abort]? Force

Then, after answering all those questions and confirming everything, it fails with:

[ ERROR ] Error creating Volume Group: Failed to initialize physical device: ("[u'/dev/mapper/36001405227484ee340640e38e286ca39']",)
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to initialize physical device: ("[u'/dev/mapper/36001405227484ee340640e38e286ca39']",)

There are two problems:
1) The error is unclear and also incorrect. It says the error occurred while creating the VG, but it is the PV that failed to be created. This can lead the user down the wrong investigation path.
2) There is no indication that the "Force" above did not work (it did not). This leaves the user wondering what went wrong.

So I assume the user starts investigating what could be wrong, retrying the setup each time. And every time it fails, all those questions may need to be answered again (unless an answers file is used). It can be quite frustrating.

Version-Release number of selected component (if applicable):
rhvh-4.0-0.20160919
ovirt-hosted-engine-setup-2.0.2.2-2.el7ev.noarch
vdsm-4.18.13-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a new LUN
2. Create partitions in the new LUN
3. Try to deploy HE in this LUN
Actual results:
Fails with no clear indication of what is wrong.

Expected results:
More user friendly: fails with clear indications, possibly pointing to documentation.
AND/OR
Offers to dd zero over the LUN*

*Just in this special case for HE setup, not general storage domain creation after RHV is deployed.
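To make the "dd zero over the LUN" suggestion concrete, here is a minimal Python sketch of what such a step could look like. It is only an illustration: `zero_head` is a hypothetical helper, not part of ovirt-hosted-engine-setup or VDSM, and the 1 MiB default is an assumed amount. It overwrites only the start of the device, so a backup GPT header stored at the end of the disk would survive it. The operation is destructive and should only ever run after explicit user confirmation.

```python
import os


def zero_head(path: str, length: int = 1024 * 1024, chunk: int = 64 * 1024) -> int:
    """Overwrite the first `length` bytes of `path` with zeros.

    Hypothetical helper for illustration only. Returns the number of
    bytes written. DESTRUCTIVE: any partition table or signature at the
    start of the device is lost.
    """
    zeros = bytes(chunk)
    written = 0
    # "r+b" overwrites in place without truncating the file or device.
    with open(path, "r+b") as dev:
        while written < length:
            n = min(chunk, length - written)
            dev.write(zeros[:n])
            written += n
        # Push the data out of Python's and the kernel's buffers.
        dev.flush()
        os.fsync(dev.fileno())
    return written
```

In this sketch the path would be the multipath device of the selected LUN (a `/dev/mapper/...` entry as in the log above), and the call would need root privileges.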
Can't we detect that there are partitions when the LUN is selected (and found not to be clean)? And then fail right there (and/or suggest the 'dd' or a Docs link), to avoid having to answer all the questions and then fail at a much later stage? It would be more user friendly.
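As a rough illustration of such an early check, the sketch below looks for a partition table signature in the first two sectors of a device. All names (`detect_partition_table`, `read_head`) are hypothetical and not part of hosted-engine-setup or VDSM, and a 512-byte logical sector size is assumed. The MBR test is only a heuristic, because some filesystem boot sectors carry the same 0x55AA signature.

```python
from typing import Optional

SECTOR_SIZE = 512              # assumed logical sector size
MBR_SIGNATURE = b"\x55\xaa"    # boot signature at bytes 510-511 of sector 0
GPT_SIGNATURE = b"EFI PART"    # start of the GPT header in LBA 1


def detect_partition_table(head: bytes) -> Optional[str]:
    """Return 'gpt', 'mbr', or None given the first bytes of a device."""
    # A GPT disk also has a protective MBR, so check for GPT first.
    if head[SECTOR_SIZE:SECTOR_SIZE + 8] == GPT_SIGNATURE:
        return "gpt"
    # Heuristic: the same signature also ends some filesystem boot sectors.
    if head[510:512] == MBR_SIGNATURE:
        return "mbr"
    return None


def read_head(path: str, size: int = 2 * SECTOR_SIZE) -> bytes:
    """Read the first `size` bytes of a block device or image file."""
    with open(path, "rb") as dev:
        return dev.read(size)
```

A setup tool could call `detect_partition_table(read_head(device_path))` right after the LUN is selected and stop with a clear message if the result is not `None`, before any further questions are asked.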
In my opinion, the point is that even the force option of the createVG VDSM command is not forceful enough. In hosted-engine-setup we do not manage PVs directly: we use a single VDSM command (createVG) to do it all, so we cannot really trace where it fails internally. If, as for bug 1343043, we don't want to fix that behavior but just handle it as a documentation issue (suggesting to manually remove the partition table), we could simply add an initial warning stating that even the force option may not be enough and that manually cleaning the selected LUN may be required.
KBase should do the trick. Closing.