Bug 1270860

Summary:	All nodes are sharing the same machine-id
Product:	Red Hat OpenStack	Reporter:	Erwan Velu <evelu>
Component:	openstack-tripleo-puppet-elements	Assignee:	Alex Schultz <aschultz>
Status:	CLOSED ERRATA	QA Contact:	Gurenko Alex <agurenko>
Severity:	urgent	Docs Contact:
Priority:	urgent
Version:	7.0 (Kilo)	CC:	ahrechan, alan_bishop, arkady_kanevsky, aschultz, bnemec, cdevine, christopher_dearborn, dcain, derekh, evelu, gmeno, hbrock, icolle, jcoufal, jfenal, jjoyce, John_walsh, jschluet, jslagle, kurt_hey, lkocman, mariel, mburns, morazi, nbarcet, nthomas, randy_perryman, rghatvis, rhel-osp-director-maint, sclewis, smerrow, sreichar, tvignaud
Target Milestone:	beta	Keywords:	Triaged
Target Release:	12.0 (Pike)
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	openstack-tripleo-puppet-elements-7.0.0-0.20170819032135.23884d3.el7ost openstack-tripleo-common-7.4.1-0.20170818153039.7d74e83.el7ost	Doc Type:	Bug Fix
Doc Text:	Using hardcoded machine IDs in templates creates multiple nodes with identical machine IDs. This prevents the Red Hat Storage Console from identifying multiple nodes. Workaround: Generate unique machine IDs on each node and then update the /etc/machine-id file. This will ensure that the Red Hat Storage Console can identify the nodes as unique.	Story Points:	---
Clone Of:
Clones:	1481443		Environment:
Last Closed:	2017-12-13 20:33:46 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1401639, 1476612, 1481443, 1551603, 1555474, 1557046

Description Erwan Velu 2015-10-12 14:29:01 UTC

Description of problem:
All nodes deployed using director share the same /etc/machine-id, which is supposed to be unique.


Version-Release number of selected component (if applicable):


How reproducible:
Deploy a cloud and check /etc/machine-id on all nodes.
It is exactly the same everywhere.

Steps to Reproduce:
1.
2.
3.

Actual results:
All nodes are sharing the same /etc/machine-id

Expected results:
It should be different on each node.

Additional info:
It looks like the machine-id is baked into the golden image, so all nodes share the same one.

Comment 2 chris alfonso 2015-10-12 16:07:04 UTC

What is the net effect of having the systems using the same machine-id?

Comment 9 Hugh Brock 2016-02-05 11:20:27 UTC

Derek, is this a dup of  https://bugzilla.redhat.com/show_bug.cgi?id=1244328 or related to the fix for that bug?

Comment 10 Derek Higgins 2016-02-05 13:05:19 UTC

Both bugs have similar causes: a unique identifier is generated during the image build and then used on every machine the image is deployed to. But they need to be tracked separately, as the fixes for them won't be the same.

Comment 11 James Slagle 2016-02-17 17:09:29 UTC

clearing needinfo

Comment 15 Dmitry Tantsur 2016-02-19 12:21:34 UTC

Please confirm that just removing this file from the image would actually work. Or would it be better to regenerate it?

Comment 16 Erwan Velu 2016-02-19 13:57:15 UTC

For new systems, yes: deleting the file from the image is enough, as systemd will generate it at boot time.

The question is about the systems already set up with this ID. Shall we keep it? Shall we delete it and regenerate it? I don't know the impact on products that use this ID, such as Satellite.
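
The image-side removal described above could be sketched as follows. This is a minimal illustration, not the fix that was eventually merged; the helper name clear_image_machine_id and the idea of passing the mounted image root as an argument are assumptions made for the example.

```shell
#!/bin/sh
# Remove the machine ID from an unpacked image so that it is not shared
# by every node deployed from that image.
# $1: path to the image's mounted root filesystem (hypothetical mount point).
clear_image_machine_id() {
    image_root="${1:?usage: clear_image_machine_id IMAGE_ROOT}"
    # -f: do not fail if the image never contained the file.
    rm -f "${image_root%/}/etc/machine-id"
}
```

Whether a missing file is regenerated cleanly at first boot depends on the image and the systemd version, so this should be checked on the target image before relying on it.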

Comment 17 Erwan Velu 2016-03-29 09:25:28 UTC

Any update on this? Will this issue be fixed in release 8?

Comment 18 Erwan Velu 2016-04-05 07:58:38 UTC

Still valid on OSP8...

Comment 19 Dmitry Tantsur 2016-04-05 10:29:41 UTC

Thanks for the reminder, I think I have some time to look into it.

I'm making this bug public, as I don't see anything private about it, and it's good to be able to reference our bugs upstream.

Comment 20 Dmitry Tantsur 2016-04-05 10:46:26 UTC

First patch posted

Comment 21 Dmitry Tantsur 2016-04-05 10:55:21 UTC

Second patch posted. Waiting for reviews now (it can also take substantial time).

Comment 22 Erwan Velu 2016-04-05 13:16:56 UTC

Can you link your patch here ?

Comment 23 Dmitry Tantsur 2016-04-05 13:31:49 UTC

Please see External Trackers section on this bug (before the comments).

Comment 24 Erwan Velu 2016-04-05 14:17:23 UTC

(In reply to Dmitry Tantsur from comment #23)
> Please see External Trackers section on this bug (before the comments).

Thanks, I didn't notice it before.
It would have been nice to have the bug mentioned in the commit message. Anyway, good to see it's on the way.

Comment 25 Dmitry Tantsur 2016-04-05 14:27:28 UTC

It seems like systemd does not even bother starting IPA when machine-id is missing. I'm not sure why, but for now I'll get back to only removing machine-id for overcloud-full.

Comment 26 Erwan Velu 2016-04-05 15:33:55 UTC

(In reply to Dmitry Tantsur from comment #25)
> It seems like systemd does not even bother starting IPA when machine-id is
> missing. I'm not sure why, but for now I'll get back to only removing
> machine-id for overcloud-full.

You may need to call systemd-machine-id-setup at some point.
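
A minimal sketch of calling systemd-machine-id-setup by hand on a system that inherited its ID from the image. The function name regenerate_machine_id and the MACHINE_ID_SETUP override variable are hypothetical, introduced only for this example; --root= is the tool's option for operating on an alternate root directory.

```shell
#!/bin/sh
# Regenerate the machine ID of the system rooted at $1 (default: /).
# Needs root privileges when run against a live system.
# MACHINE_ID_SETUP is a hypothetical override hook, handy for dry runs.
regenerate_machine_id() {
    root="${1:-/}"
    setup_cmd="${MACHINE_ID_SETUP:-systemd-machine-id-setup}"
    # Drop the ID inherited from the image first: the tool is expected to
    # keep an existing valid ID rather than replace it.
    rm -f "${root%/}/etc/machine-id"
    "$setup_cmd" --root="$root"
}
```

Run as root, regenerate_machine_id with no argument would act on the running system. The tool may also reuse a D-Bus machine ID if one is stored separately, so the result is worth comparing across nodes afterwards.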

Comment 27 Dmitry Tantsur 2016-04-05 15:42:05 UTC

So it doesn't get called automatically, does it? Then I misunderstood this issue a bit..

Comment 28 Erwan Velu 2016-04-05 16:11:40 UTC

It seems that an installer is expected to handle some tasks around it, so maybe the OSP installer should too.

Note that I'm not a deep systemd expert.

Comment 29 Mike Burns 2016-04-07 20:54:03 UTC

This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 33 Dmitry Tantsur 2016-08-09 16:44:05 UTC

Sorry, I have to unassign myself. The upstream patch has caused a long discussion about support for various distros and how to recreate this file properly on boot. I don't have any time to continue working on it. Anyone is free to take over the abandoned patches.

Comment 34 Erwan Velu 2016-10-18 18:55:43 UTC

Oh I'm sorry little bug ... I forgot you ...

I missed your birthday .. Happy Birthday bug !

More seriously, is anyone going to take care of it?

Comment 35 Sean Merrow 2017-01-19 18:12:20 UTC

Updating BZ to reflect partner desire for this fix.

Comment 36 Alan Bishop 2017-01-19 21:19:42 UTC

What are the downsides (if there are any) to generating new machine IDs on the overcloud nodes? Apparently the Storage Console requires they be unique, and so I need to know if anything else will break or complain if the IDs change. So far I haven't noticed any breakage or SELinux issues, but I'd like a definitive answer.

Comment 37 Erwan Velu 2017-01-20 09:01:19 UTC

(In reply to Alan Bishop from comment #36)
> What are the downsides (if there are any) to generating new machine IDs on
> the overcloud nodes? Apparently the Storage Console requires they be unique,
> and so I need to know if anything else will break or complain if the IDs
> change. So far I haven't noticed any breakage or SELinux issues, but I'd
> like a definitive answer.

I don't see any downside. systemd will regenerate it.

Comment 40 Ian Colle 2017-01-30 16:29:08 UTC

Ben - What's the next step for this? How do we get some movement towards resolving it?

Comment 43 Nishanth Thomas 2017-01-31 10:17:32 UTC

Ack, LGTM

Comment 47 Red Hat Bugzilla Rules Engine 2017-02-13 14:20:38 UTC

This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 55 Alex Schultz 2017-07-03 19:48:04 UTC

Those changes were superseded by the subsequent changes https://review.openstack.org/445173 and https://review.openstack.org/445174, which were targeted at tripleo only. The previous ones were pushed against DIB, and upstream did not seem to want to do that. I didn't want to drop them from the bug, so as not to lose some of the history.

Comment 58 Randy Perryman 2017-09-18 16:15:08 UTC

Will this be backported to OSP 10?

Comment 59 Alex Schultz 2017-09-19 19:51:34 UTC

Bug 1476612 is for OSP10

Comment 61 Artem Hrechanychenko 2017-11-07 12:40:43 UTC

Verified:


openstack-tripleo-puppet-elements-7.0.1-0.20171020122223.82d7e6c.el7ost.noarch
openstack-tripleo-common-7.6.3-0.20171028055750.el7ost.noarch


(undercloud) [stack@undercloud-0 ~]$ for i in `openstack server list -f value -c Networks |awk -F'=' '{print $2}'`; do echo "############################################"; ssh -o StrictHostKeyChecking=no heat-admin@$i "hostname; sudo cat /etc/machine-id"; done
############################################
controller-3
e3f151e9c436443b844d1346c213e959
############################################
controller-2
94d12be698ab4441b900677170aa1aa3
############################################
controller-0
aeeedae034684f61af5857a16a77b185
############################################
controller-1
f74fe0c617b9479fa8fbad427a16c97c
############################################
compute-0
e8a89b2eb3c043cdba08b36d7d51629a
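
The visual comparison above could also be automated. A minimal sketch, assuming the collected data has been reduced to one "hostname machine-id" pair per line (an assumed input format, not what the loop above prints directly); find_duplicate_machine_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Report machine IDs that appear on more than one host.
# stdin: one "hostname machine-id" pair per line (assumed format).
# Prints "machine-id: host1,host2,..." for each shared ID and returns 1
# if any duplicate was found, 0 otherwise.
find_duplicate_machine_ids() {
    awk '
        NF >= 2 {
            count[$2]++
            if ($2 in hosts) {
                hosts[$2] = hosts[$2] "," $1
            } else {
                hosts[$2] = $1
            }
        }
        END {
            dup = 0
            for (id in count) {
                if (count[id] > 1) {
                    print id ": " hosts[id]
                    dup = 1
                }
            }
            exit dup
        }
    '
}
```

The non-zero exit status on duplicates lets the function be used directly as a pass/fail check in a verification script.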

Comment 65 errata-xmlrpc 2017-12-13 20:33:46 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462