Bug 875527

Summary: PRD32 - bootstrap: do not get unique id at canDoAction
Product: Red Hat Enterprise Virtualization Manager
Reporter: Alon Bar-Lev <alonbl>
Component: ovirt-engine
Assignee: Alon Bar-Lev <alonbl>
Status: CLOSED ERRATA
QA Contact: Pavel Stehlik <pstehlik>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: bazulay, dougsland, dyasny, iheim, lpeer, Rhev-m-bugs, sgrinber, yeylon, ykaul, yzaslavs
Target Milestone: ---
Keywords: Improvement
Target Release: 3.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-10 21:19:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 866889, 875920, 915537

Description Alon Bar-Lev 2012-11-11 22:27:50 UTC
commit bd3054dc68a2bc5817e47fc894321f0a4a714c7e
Author: Alon Bar-Lev <alonbl>
Date:   Tue Nov 6 00:48:58 2012 +0200

    bootstrap: do not get unique id at canDoAction
    
    CURRENT IMPLEMENTATION
    
    The engine duplicates vdsm's complex logic for generating the vdsm id.
    
        @Reloadable
        @TypeConverterAttribute(String.class)
        @DefaultValueAttribute(
            "IDFILE=/etc/vdsm/vdsm.id; " +
            "if [ -r \"${IDFILE}\" ]; then " +
                "cat \"${IDFILE}\"; " +
            "else " +
                "UUID=\"$(" +
                    "dmidecode -s system-uuid 2> /dev/null | " +
                    "sed -e 's/.*Not.*//' " +
                ")\"; " +
                "if [ -z \"${UUID}\" ]; then " +
                    "UUID=\"$(uuidgen 2> /dev/null)\" && " +
                    "mkdir -p \"$(dirname \"${IDFILE}\")\" && " +
                    "echo \"${UUID}\" > \"${IDFILE}\" && " +
                    "chmod 0644 \"${IDFILE}\"; " +
                "fi; " +
                "[ -n \"${UUID}\" ] && echo \"${UUID}\"; " +
            "fi"
        )
        BootstrapNodeIDCommand(372),
    
    The command is executed synchronously (from the UI's point of view) when
    a host is added.
    
    ASSUMPTIONS OF CURRENT IMPLEMENTATION
    
     1. dmidecode exists out of the box in any distribution.
    
     ---> WRONG: Fedora 17 does not have it, and other distributions may
          also lack it.
    
     2. host id is only dmidecode output.
    
     ---> WRONG: over time we saw that we need extra logic to keep the id
          sane, especially when the hardware id does not exist or is
          malformed.
    
     3. dmidecode utility is used for host id
    
     ---> WRONG: there are plans to make it more secure/robust using TPM,
          which requires software on the host to generate the id.
    
    PROBLEMS IN CURRENT IMPLEMENTATION
    
     1. If the dmidecode utility is missing, we cannot acquire the host id
        before performing bootstrap. The whole idea of the bootstrap process
        is to take a vanilla distribution and install vdsm. dmidecode is
        missing from vanilla, hence it cannot be executed before bootstrap.
    
     2. The logic for generating the host id exists in both engine and vdsm.
        Both implementations need to be synced, and kept synced between
        versions, which in practice cannot be achieved because there is too
        much noise (components are distributed through different channels,
        not all IT components can be updated, etc.).
    
     3. If the host id generation method is changed, the engine
        implementation has to be changed as well, while the engine should
        not really care how vdsm maintains its identity.
    
    NEW IMPLEMENTATION
    
    Acquire the vdsm id during the bootstrap process, at the earliest point
    at which all dependencies are available.
    
    If vdsm id is duplicate:
    
     1. post an error to host event log.
    
     2. set the state of the host to "install fail".
    
    USER VISIBLE CHANGES
    
    The following existing scenario:
    
     1. The user has host xxx.com with ip 1.1.1.1, added using xxx.com as
        the host name.
     2. The user adds a new host and *BY MISTAKE* writes 1.1.1.1 in the
        host field.
     3. The user is blocked from proceeding because of a duplicate uuid
        between xxx.com (existing) and 1.1.1.1 (new).
     4. The confused user changes the host field to 1.1.1.2 and proceeds.
    
    Behaves as:
    
     1. The user has host xxx.com with ip 1.1.1.1, added using xxx.com as
        the host name.
     2. The user adds a new host and *BY MISTAKE* writes 1.1.1.1 in the
        host field.
     3. Installation starts, and ends with an installation failed status on
        the new host.
     4. The user sees the error message and removes the host added by
        mistake.
    
    REASONING
    
    The above sequence of adding the same host under a different name/address
    is not common or frequent. It does not justify duplicating vdsm logic in
    the engine, finding a solution for distributions that lack the dmidecode
    utility in their base system layout, or locking ourselves to the
    dmidecode utility.
    
    Change-Id: I0263dbae34aaa02c126c5ed1dc52a84f4f5e77f8
    Signed-off-by: Alon Bar-Lev <alonbl>
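
The flow described under NEW IMPLEMENTATION in the commit message above can be illustrated with a minimal sketch. It is not the code from the patch: every class, method, and field name below is hypothetical, the event log is a plain list, and a successful install is simplified to an immediate UP status. It only shows the ordering the commit message describes: the host is accepted without any id check, the id is obtained from the host during bootstrap, and a duplicate id results in an event log entry plus an "install failed" state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from ovirt-engine.
class BootstrapIdSketch {

    enum HostStatus { INSTALLING, INSTALL_FAILED, UP }

    static final class Host {
        final String address;
        String vdsmId; // stays null until bootstrap reports it
        HostStatus status = HostStatus.INSTALLING;

        Host(String address) {
            this.address = address;
        }
    }

    // Stand-in for the bootstrap step in which the host itself reports its id.
    interface IdSource {
        String fetchVdsmId(String address);
    }

    private final Map<String, Host> installedByVdsmId = new HashMap<>();
    final List<String> eventLog = new ArrayList<>();

    // No id check happens before this point: adding the host is never blocked.
    Host addHost(String address, IdSource bootstrap) {
        Host host = new Host(address);
        String id = bootstrap.fetchVdsmId(address);
        Host existing = installedByVdsmId.get(id);
        if (existing != null) {
            // Duplicate id: post an error and mark the installation as failed.
            eventLog.add("Host " + address + " reports the same vdsm id as "
                    + existing.address);
            host.status = HostStatus.INSTALL_FAILED;
            return host;
        }
        host.vdsmId = id;
        installedByVdsmId.put(id, host);
        host.status = HostStatus.UP; // simplification of a successful install
        return host;
    }

    public static void main(String[] args) {
        BootstrapIdSketch engine = new BootstrapIdSketch();
        // The same physical machine is reached under two different addresses.
        IdSource sameMachine = address -> "id-of-the-one-physical-host";
        Host first = engine.addHost("xxx.com", sameMachine);
        Host second = engine.addHost("1.1.1.1", sameMachine);
        System.out.println(first.address + ": " + first.status);
        System.out.println(second.address + ": " + second.status);
        System.out.println(engine.eventLog);
    }
}
```

The `main` method replays the scenario from USER VISIBLE CHANGES: the second add is not rejected up front, and the mistake only surfaces as a failed installation that the user then removes.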

Comment 1 Alon Bar-Lev 2012-11-12 00:16:03 UTC
http://gerrit.ovirt.org/#/c/9159/

Comment 2 Alon Bar-Lev 2012-11-12 16:49:32 UTC
*** Bug 875526 has been marked as a duplicate of this bug. ***

Comment 3 Alon Bar-Lev 2012-11-12 16:49:34 UTC
*** Bug 875524 has been marked as a duplicate of this bug. ***

Comment 4 Alon Bar-Lev 2012-11-12 16:49:38 UTC
*** Bug 875522 has been marked as a duplicate of this bug. ***

Comment 7 Cheryn Tan 2013-04-03 06:51:03 UTC
This bug is currently attached to errata RHEA-2013:14491. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.

* Consequence: What happens when the bug presents.

* Fix: What was done to fix the bug.

* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.

Comment 8 Alon Bar-Lev 2013-04-03 07:18:05 UTC
No doc required.

Comment 9 errata-xmlrpc 2013-06-10 21:19:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0888.html