Bug 1259550 - RHEL hypervisor host registered twice to the satellite when virt-who used
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Candlepin
x86_64 Linux
high Severity high (vote)
: GA
: 6.1
Assigned To: Barnaby Court
Katello QA List
: Triaged
Depends On:
Blocks: GSS_Sat6Beta_Tracker/GSS_Sat6_Tracker
Reported: 2015-09-02 22:10 EDT by Anand Vaddarapu
Modified: 2017-04-25 12:35 EDT (History)
13 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2017-03-27 12:56:28 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description Anand Vaddarapu 2015-09-02 22:10:36 EDT
Description of problem:

1. I want to deploy RHEV's thick hypervisor ("RHEL-H") via Satellite 6 so that Satellite controls its content, among other reasons.
As you can tell, at the end of provisioning the machine registers to Satellite 6.1.
(It registers as a "HOST")

2. The host will be registered to RHEV-M 

3. virt-who interrogates RHEV-M and discovers that the new host is in RHEV-M.

4. virt-who registers the new host to Satellite 6.1 as a new "HYPERVISOR"

As I understand it, if you want the true host-VM relationship you need to attach the Smart Virtualization subscription to the entry that says "HYPERVISOR", because the hypervisor-VM relationship report is based on UUID.
This means doubling the subscription: one for the host registered in step 1, and another for the same host registered again in step 4.
I have found that the UUIDs of the two entries are different.

In my view, Satellite 6 should be smart enough to tell that both entries are the same machine, so the host shouldn't show up twice.

Also, I have tried registering the host with --type=hypervisor using RHEL 6.7's subscription-manager, which did not work after provisioning:
it was still registered as a "host", even though I specified "hypervisor" as the type.

Actual results:
Duplicate entries for the same host on the Satellite.

Expected results:
I think that Satellite should somehow work around this problem and merge the
two systems together.
Comment 2 Sean Mullen 2015-10-16 11:21:06 EDT
I have similar issues with the registration of VMware hosts using virt-who.  My findings / assumptions, based on a decent amount of digging, are as follows:

1.  You run virt-who against your virtual environment - it pulls a list of VMs ... good so far.
2.  virt-who reports this big list to SAM on the Satellite 6.1 master.  The list is so large that the SSL connection from virt-who to Satellite times out.  That's OK in itself, because a bunch of data is already in Katello's processing queue for creating the new VM hosts and updating the guest-to-host mappings; it will just keep processing.  Not ideal, but still OK.
3.  Here's the problem ... even in oneshot mode, virt-who in its infinite wisdom reconnects to Satellite and RESENDS the list of changes.
4.  It seems as though Katello receives this as "OK, here's more to process" and starts a PARALLEL thread processing the "new" data.
5.  Here's the stupid bit ... in the Foreman database there is a katello_systems table, which holds an ID, the UUID, and the hostname.  Two major design issues here: first, there is no unique constraint enforcing that the hostname/UUID combinations are unique (so the database allows duplicates), and second, Katello apparently doesn't do much error checking in this case, because it goes ahead and creates duplicate records.

From here, in the GUI, you can see two entries for some of the content hosts created after virt-who ran.  You select them for deletion in a clean-up effort, and Katello deletes all references to them EXCEPT THE SECOND RECORD IN THE katello_systems TABLE.  This orphan record then wreaks havoc: it breaks your ability to get content-host lists, among other things, because you keep getting "409 Gone" errors.

I found a workaround, but this is NOT something I got from Red Hat, so it's probably not a supported action (though it appears to work on my 6.1.1 system with no ill effects):
1. identify the duplicate records in the katello_systems table
2. delete the second record by ID, NOT BY UUID OR HOSTNAME, since these are no longer unique
3. after cleaning the duplicates out of the table, run "foreman-rake katello:reindex"
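For step 1, a query along these lines should list the duplicates. This is a sketch against the foreman PostgreSQL database (run e.g. via "sudo -u postgres psql foreman"); like the rest of this workaround, it is not a Red Hat-supported procedure:

```sql
-- List (uuid, name) pairs that occur more than once in katello_systems,
-- together with all of their row IDs, lowest first.
select uuid, name, count(*) as copies, array_agg(id order by id) as ids
from katello_systems
group by uuid, name
having count(*) > 1;
```

Per step 2, only the lowest ID in each "ids" array should be kept.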
Comment 8 Jonathan Gibert 2016-06-23 09:15:30 EDT
For convenience here are the queries you'll need to delete duplicate records (I had 446 duplicates...)

delete the collections FK rows first, if needed:

delete from katello_system_host_collections
where system_id in (
    select id from (
        select id,
               row_number() over (partition by uuid, name order by id) as rnum
        from katello_systems) t
    where t.rnum > 1);

delete the duplicate hypervisors from katello_systems (this keeps the row with the lowest ID):

delete from katello_systems
where id in (
    select id from (
        select id,
               row_number() over (partition by uuid, name order by id) as rnum
        from katello_systems) t
    where t.rnum > 1);

As stated by Sean, this is not supported by Red Hat.
Comment 9 Barnaby Court 2016-12-20 14:29:28 EST
We have often seen issues where different UUIDs are returned for a single host depending on how it is queried. For this reason we recommend using one, and only one, virt-who backend type and connection type for querying a given hypervisor. With libvirt, for example, a different UUID may be returned depending on the libvirt version and on whether you query via the local interface or the remote interface.

Have you tried this interaction with RHEV using only libvirt-remote mode for all the virt-who connections? Do you still end up with duplicate records if remote mode is used for all virt-who connections to libvirt? 
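For reference, a single-backend virt-who configuration for RHEV-M looks roughly like the fragment below. The file name, section name, server address, and credentials are placeholders for illustration, not values taken from this bug:

```ini
# /etc/virt-who.d/rhevm.conf -- one backend, one connection type per hypervisor
[rhevm-example]
type=rhevm
server=https://rhevm.example.com:443
username=admin@internal
password=changeme
owner=Example_Org
env=Library
```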

If you are using the same backend for querying from virt-who and virt-who is not reporting different UUIDs then would you consider this issue a duplicate of BZ 1365248?
