Bug 1551603
Summary: | virt-customize on rhel-7.5 re-introduces a static /etc/machine-id in overcloud-full.qcow2 | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Omri Hochman <ohochman> | |
Component: | rhosp-director-images | Assignee: | Alex Schultz <aschultz> | |
Status: | CLOSED ERRATA | QA Contact: | Artem Hrechanychenko <ahrechan> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 13.0 (Queens) | CC: | agurenko, ahrechan, aschultz, bcafarel, dhill, dsariel, ekuris, evelu, hbrock, jjoyce, jschluet, jslagle, lmarsh, mburns, pablo.iranzo, radoslaw.smigielski, ragiman, rhel-osp-director-maint, sasha, wznoinsk | |
Target Milestone: | beta | Keywords: | Regression, Triaged | |
Target Release: | 13.0 (Queens) | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | rhosp-director-images-13.0-20180315.1.el7ost | Doc Type: | Bug Fix | |
Doc Text: |
Recent versions of libguestfs generate a machine-id when virt-customize, or the customize action of virt-sysprep, runs. As a result, a static /etc/machine-id is included in the image, which can cause problems for services that rely on this value being unique across hosts.
To fix the issue, the build process now cleans the overcloud image so that it ships with a blank /etc/machine-id, which ensures that a machine-id is generated correctly when each system boots for the first time. However, if you use virt-customize to update the overcloud image before deployment, run "virt-sysprep --operation machine-id -a <image>" again before uploading the image.
|
Story Points: | --- | |
Clone Of: | 1476612 | |||
: | 1555474 1557046 | Environment: | |
Last Closed: | 2018-06-27 13:24:35 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1554546, 1270860, 1476612, 1481443 | |||
Bug Blocks: | 1555474, 1557046 |
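
The workaround in the Doc Text field can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name reset_machine_id and the DRY_RUN variable are hypothetical and not part of any Red Hat tooling, and the sketch uses the --operations spelling from the virt-sysprep manual page rather than the --operation form quoted in the Doc Text. The guestfish check is the verification command given in this report.

```shell
# Sketch: blank /etc/machine-id again after an image has been customized.
# reset_machine_id and DRY_RUN are hypothetical names used for illustration.
reset_machine_id() {
    local image="$1"
    local run=""

    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Reset the machine-id, as advised after any virt-customize run.
    $run virt-sysprep --operations machine-id -a "$image"

    # Verification command from this report; empty output means the file is blank.
    $run guestfish -a "$image" run : mount /dev/sda / : cat /etc/machine-id
}
```

Calling `DRY_RUN=1 reset_machine_id overcloud-full.qcow2` prints the two commands without touching the image, which may be useful before running them for real.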
Description
Omri Hochman
2018-03-05 13:32:08 UTC
Jon, can you verify that the image build process has the remove-machine-id element in its configuration?
https://github.com/openstack/tripleo-common/blob/master/image-yaml/overcloud-images.yaml#L22

This is caused by a newer version of virt-customize that we use in the image building process.
https://github.com/libguestfs/libguestfs/commit/d5ce659e2c136fbcf0a0b9058711765cfae6c210

A quick way to verify this has been done is to run:

guestfish -a overcloud-full.qcow2 run : mount /dev/sda / : cat /etc/machine-id

It should return a blank line. The file should exist but be empty.

10:10 AM tmp ➜ guestfish -a overcloud-full.qcow2 run : mount /dev/sda / : cat /etc/machine-id

10:10 AM tmp ➜

*** Bug 1545085 has been marked as a duplicate of this bug. ***

VERIFIED

(undercloud) [stack@undercloud-0 images]$ tar -xvf /usr/share/rhosp-director-images/overcloud-full-latest-13.0.tar
overcloud-full.qcow2
overcloud-full.initrd
overcloud-full.vmlinuz
overcloud-full-rpm.manifest
overcloud-full-signature.manifest
(undercloud) [stack@undercloud-0 images]$ guestfish -a overcloud-full.qcow2 run : mount /dev/sda / : cat /etc/machine-id

(undercloud) [stack@undercloud-0 images]$

(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.14
Last login: Fri Mar 30 13:51:42 2018 from 192.168.24.1
[heat-admin@compute-1 ~]$ cat /etc/machine-id
e420ff129b2243b89c2f9536a6e66d03
[heat-admin@compute-1 ~]$ exit
logout
Connection to 192.168.24.14 closed.
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.11
The authenticity of host '192.168.24.11 (<no hostip for proxy command>)' can't be established.
ECDSA key fingerprint is SHA256:fmnNktU4xBACzFTYS0daYlaYlTQTXaOo/6F9yUc+m2s.
ECDSA key fingerprint is MD5:ed:4d:53:25:d8:97:94:f0:79:e3:2d:12:07:42:0c:2c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.24.11' (ECDSA) to the list of known hosts.
Last login: Fri Mar 30 13:38:25 2018 from 192.168.24.254
[heat-admin@controller-1 ~]$ cat /etc/machine-id
3a34f59f127b435cadbc727c32412b05

(undercloud) [stack@undercloud-0 ~]$ sudo rpm -qa "rhosp-director-image*"
rhosp-director-images-13.0-20180328.1.el7ost.noarch

Hey.
Re-opening due to issues popping up recently.
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/3514/
Take a look at this Customized Job, which fails in the overcloud step for the same reasons that should have been fixed here.

Console Output:
13:42:58 TASK [get ironic info for the node] ********************************************
13:42:58 task path: /home/rhos-ci/jenkins/workspace/OSPD-Customized-Deployment-virt@2/infrared/plugins/tripleo-overcloud/tasks/add_overcloud_host.yml:2
13:42:58 fatal: [undercloud-0]: FAILED! => {
13:42:58     "changed": true,
13:42:58     "cmd": "source ~/stackrc\n openstack baremetal node show -c name -f value",
13:42:58     "delta": "0:00:02.120858",
13:42:58     "end": "2018-04-09 09:43:11.794658",
13:42:58     "failed": true,
13:42:58     "rc": 2,
13:42:58     "start": "2018-04-09 09:43:09.673800"
13:42:58 }
13:42:58
13:42:58 STDERR:
13:42:58
13:42:58 usage: openstack baremetal node show [-h] [-f {json,shell,table,value,yaml}]
13:42:58                                      [-c COLUMN] [--max-width <integer>]
13:42:58                                      [--fit-width] [--print-empty]
13:42:58                                      [--noindent] [--prefix PREFIX]
13:42:58                                      [--instance]
13:42:58                                      [--fields <field> [<field> ...]]
13:42:58                                      <node>
13:42:58 openstack baremetal node show: error: too few arguments
13:42:58
13:42:58
13:42:58 MSG:
13:42:58
13:42:58 non-zero return code

Please contact me if any further information is needed.

The failed job did not fail for the same reason. It was a timeout of the deployment. Also, this bug was not a deployment-failure problem.

2018-04-09 13:41:27Z [overcloud.Compute]: CREATE_FAILED CREATE aborted (Task create from ResourceGroup "Compute" Stack "overcloud" [fbd5b33b-6892-40b6-8bde-1e0f48d39c35] Timed out)
2018-04-09 13:41:27Z [overcloud.Compute]: UPDATE_FAILED Stack UPDATE cancelled
2018-04-09 13:41:27Z [overcloud]: CREATE_FAILED Timed out
2018-04-09 13:41:28Z [overcloud.Compute.0]: CREATE_FAILED Stack CREATE cancelled
2018-04-09 13:41:28Z [overcloud.Compute.0]: CREATE_FAILED resources[0]: Stack CREATE cancelled
2018-04-09 13:41:28Z [overcloud.Compute]: UPDATE_FAILED Resource CREATE failed: resources[0]: Stack CREATE cancelled

(In reply to Roee Agiman from comment #21)
> Hey.
> Re-opening due to issues popping out recently.
>
> Please contact me if any further information is needed.

Hi Roee,
The original bug body mentions an issue with a specific value found on the overcloud image nodes in /etc/machine-id. It was always the same value, whereas this value should be unique.
I'm re-verifying this bug. If you encounter this specific issue in the future, please re-open.

*** Bug 1545085 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:2083
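
The verification in this thread compared /etc/machine-id on two deployed nodes by hand. A minimal sketch of the same check for more nodes follows. The function name find_duplicate_machine_ids and the "<host> <machine-id>" input format are assumptions made for this example, not part of any existing tool.

```shell
# Sketch: print every machine-id that is shared by more than one host.
# Input on stdin: one "<host> <machine-id>" pair per line (assumed format).
find_duplicate_machine_ids() {
    awk '{ count[$2]++ }
         END { for (id in count) if (count[id] > 1) print id }' | sort
}

# Possible usage from the undercloud, with the node addresses seen in this report:
#   for h in 192.168.24.11 192.168.24.14; do
#       echo "$h $(ssh heat-admin@"$h" cat /etc/machine-id)"
#   done | find_duplicate_machine_ids
```

Empty output means every node reported a distinct machine-id, which is the expected result once the image ships with a blank /etc/machine-id.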