Bug 826677

Summary: IPA cannot remove disconnected replica data to reconnect
Product: Red Hat Enterprise Linux 6 Reporter: Scott Poore <spoore>
Component: ipa Assignee: Rob Crittenden <rcritten>
Status: CLOSED ERRATA QA Contact: Namita Soman <nsoman>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 6.3 CC: dpal, fedora, jgalipea, mkosek, ssorce
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ipa-3.0.0-1.el6 Doc Type: Bug Fix
Doc Text:
Cause: When an Identity Management replica is deleted via ipa-replica-manage, the script does not check whether the deletion would orphan another Identity Management replica. Consequence: An administrator unaware of the Identity Management replication graph structure may accidentally delete a replica and orphan other replicas, which would then have to be re-installed. Fix: When deleting a master, ipa-replica-manage tries to prevent orphaning other servers. Result: ipa-replica-manage will not allow an administrator to delete a remote replica if that action would orphan a replica that has a replication agreement with the deleted replica.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 09:14:34 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 782183    

Description Scott Poore 2012-05-30 18:24:40 UTC
Description of problem:

Cannot remove disconnected host data in order to uninstall/reinstall a replica for re-connecting it to a domain.

Initial topology:  (simple triangle)
2 - 3
 \ /
  1

# on host1:
ipa-replica-manage disconnect host1 host2
ipa-replica-manage del host3  # not sure if this one is relevant here

# on host2:
ipa-server-install --uninstall -U

# on host1:
ipa-replica-prepare -p $ADMINPW --ip-address=$HOST2_IP $HOST2

# on host2:
sftp root@$HOST1:/var/lib/ipa/replica-info-$HOST2.gpg
ipa-replica-install -U --setup-dns --forwarder=$DNSFORWARD -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-$HOST2.gpg
...
The host qe-blade-04.testrelm.com already exists on the master server. Depending on your configuration, you may perform the following:

Remove the replication agreement, if any:
    % ipa-replica-manage del qe-blade-04.testrelm.com
Remove the host entry:
    % ipa host-del qe-blade-04.testrelm.com

# on host1:
ipa-replica-manage del $HOST2
'$HOST1' has no replication agreement for '$HOST2'

ipa host-del $HOST2
ipa: ERROR: invalid 'hostname': An IPA master host cannot be deleted or disabled

Version-Release number of selected component (if applicable):
ipa-server-2.2.0-16.el6.x86_64

How reproducible:
very if not always

Steps to Reproduce:
1.  <setup rhel6.3 IPA master and 2 replicas>
# on host1:
2.  ipa-replica-manage disconnect $HOST1 $HOST2
3.  ipa-replica-manage del $HOST3
# on host2:
4.  ipa-server-install --uninstall -U
# on host1:
5.  ipa-replica-prepare -p $ADMINPW --ip-address=$HOST2_IP $HOST2
# on host2:
6.  cd /dev/shm; sftp root@$HOST1:/var/lib/ipa/replica-info-$HOST2.gpg
7.  ipa-replica-install -U --setup-dns --forwarder=$DNSFORWARD -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-$HOST2.gpg
  
Actual results:

# on host2 ipa-replica-install fails:
...
The host qe-blade-04.testrelm.com already exists on the master server. Depending on your configuration, you may perform the following:

Remove the replication agreement, if any:
    % ipa-replica-manage del qe-blade-04.testrelm.com
Remove the host entry:
    % ipa host-del qe-blade-04.testrelm.com

# on host1:
ipa-replica-manage del $HOST2
'$HOST1' has no replication agreement for '$HOST2'

ipa host-del $HOST2
ipa: ERROR: invalid 'hostname': An IPA master host cannot be deleted or disabled


Expected results:

ipa-replica-manage or ipa host-del should be able to forcibly remove data to allow a reconnect or ipa-replica-install should provide a way to reconnect if possible.

Additional info:

Comment 3 Dmitri Pal 2012-05-31 16:47:05 UTC
Upstream ticket:
https://fedorahosted.org/freeipa/ticket/2797

Comment 4 RHEL Program Management 2012-07-10 06:21:59 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 RHEL Program Management 2012-07-10 23:28:03 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 6 Rob Crittenden 2012-07-19 22:18:51 UTC
I think this is a procedural problem.

You did an ipa-replica-manage del for host3, but never for host2, yet you uninstalled on host2. As far as the remaining IPA server (host1) knows, host2 is still there just perhaps unreachable. This is why trying to re-install fails.

If you do a: ipa-replica-manage del host2 --force on host1 then you should be able to re-install it.

With the connection between 1 and 2 gone and host3 deleted there would be no way to communicate anything happening on host2 back to host1, so I don't think there is much we can do.

Comment 7 Scott Poore 2012-07-20 01:35:57 UTC
deleting with --force didn't seem to help.  I got the same error message:

### From MASTER:

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW list
vm2.testrelm.com: master
vm1.testrelm.com: master
vm3.testrelm.com: master

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW list vm1.testrelm.com
vm2.testrelm.com: replica
vm3.testrelm.com: replica

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW list vm2.testrelm.com
vm1.testrelm.com: replica
vm3.testrelm.com: replica

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW list vm3.testrelm.com
vm1.testrelm.com: replica
vm2.testrelm.com: replica

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW disconnect vm1.testrelm.com vm2.testrelm.com
Deleted replication agreement from 'vm1.testrelm.com' to 'vm2.testrelm.com'

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW del vm3.testrelm.com
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file
and re-install.
Continue to delete? [no]: yes
Deleted replication agreement from 'vm1.testrelm.com' to 'vm3.testrelm.com'
Deleted replication agreement from 'vm2.testrelm.com' to 'vm3.testrelm.com'

### Then From Slave 1 (host2 from example):

[root@vm2 quickinstall]# ipa-server-install --uninstall -U
Shutting down all IPA services
Removing IPA client configuration
Unconfiguring ntpd
Unconfiguring named
Unconfiguring web server
Unconfiguring krb5kdc
Unconfiguring kadmin
Unconfiguring directory server
Unconfiguring ipa_memcached

### From Master:

[root@vm1 quickinstall]# ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.102 vm2.testrelm.com
Preparing replica for vm2.testrelm.com from vm1.testrelm.com
Creating SSL certificate for the Directory Server
Creating SSL certificate for the dogtag Directory Server
Creating SSL certificate for the Web Server
Exporting RA certificate
Copying additional files
Finalizing configuration
Packaging replica information into /var/lib/ipa/replica-info-vm2.testrelm.com.gpg
Adding DNS records for vm2.testrelm.com
Using reverse zone 122.168.192.in-addr.arpa.

### From Slave 1:

[root@vm2 quickinstall]# sftp root@vm1.testrelm.com:/var/lib/ipa/replica-info-vm2.testrelm.com.gpg
Connecting to vm1.testrelm.com...
Fetching /var/lib/ipa/replica-info-vm2.testrelm.com.gpg to replica-info-vm2.testrelm.com.gpg
/var/lib/ipa/replica-info-vm2.testrelm.com.gpg                                   100%   28KB  28.3KB/s   00:00    

[root@vm2 quickinstall]# ipa-replica-install -U --setup-dns --forwarder=$DNSFORWARD -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm2.testrelm.com.gpg 
Run connection check to master
Check connection from replica to remote master 'vm1.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

The following list of ports use UDP protocol and would need to be
checked manually:
   Kerberos KDC: UDP (88): SKIPPED
   Kerberos Kpasswd: UDP (464): SKIPPED

Connection from replica to master is OK.
Start listening on required ports for remote master check
Get credentials to log in to remote master
Execute check on remote master
Check connection from master to remote replica 'vm2.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

Connection from master to replica is OK.

Connection check OK
The host vm2.testrelm.com already exists on the master server. Depending on your configuration, you may perform the following:

Remove the replication agreement, if any:
    % ipa-replica-manage del vm2.testrelm.com
Remove the host entry:
    % ipa host-del vm2.testrelm.com

### On Master:
[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW del vm2.testrelm.com
'vm1.testrelm.com' has no replication agreement for 'vm2.testrelm.com'

[root@vm1 quickinstall]# ipa-replica-manage -p $ADMINPW del vm2.testrelm.com --force
'vm1.testrelm.com' has no replication agreement for 'vm2.testrelm.com'

Comment 8 Rob Crittenden 2012-07-20 13:35:50 UTC
You've found a weakness in the topology management, and I'm not entirely sure how to address it.

We require that the last server not be disconnected, but deleted.

So what you've done is:

Connect 1 -> 2 -> 3 ->1, a nice little circle

If you disconnect 1 and 2 you have:

1 -> 3 and 2 -> 3

Delete 3. 2 still exists but is disconnected and orphaned from 1. This is why you can't re-create it.

You may have to create a connection to it from 1 -> 2, then you can delete it.
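
The sequence described in this comment can be replayed on a small model. The following is a minimal sketch in Python that treats replication agreements as an undirected graph; it is an illustration only, not code from ipa-replica-manage, and the helper names (`reachable`, `disconnect`, `delete_node`) are hypothetical.

```python
from collections import deque


def reachable(agreements, start):
    """Return the set of hosts reachable from `start` over replication agreements."""
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for peer in agreements.get(host, set()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


def disconnect(agreements, a, b):
    """Remove the agreement between a and b in both directions."""
    agreements[a].discard(b)
    agreements[b].discard(a)


def delete_node(agreements, host):
    """Remove a host and every agreement that points at it."""
    for peer in agreements.pop(host, set()):
        agreements[peer].discard(host)


# Triangle from the report: every host has an agreement with the other two.
agreements = {
    "host1": {"host2", "host3"},
    "host2": {"host1", "host3"},
    "host3": {"host1", "host2"},
}

disconnect(agreements, "host1", "host2")  # host2 is still reachable through host3
via_host3 = reachable(agreements, "host1")

delete_node(agreements, "host3")          # the last path to host2 is gone
after_delete = reachable(agreements, "host1")

print(sorted(via_host3))
print(sorted(after_delete))
```

After the delete, host1 can reach only itself, while host2 is left with no agreements at all, which is the orphaned state discussed here.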

Comment 9 Scott Poore 2012-07-21 15:19:25 UTC
I can't seem to reconnect even if I try before the delete.

[root@vm1 ipa-test-install]# ipa-replica-manage -p $ADMINPW disconnect vm1.testrelm.com vm2.testrelm.com
Deleted replication agreement from 'vm1.testrelm.com' to 'vm2.testrelm.com'

[root@vm1 ipa-test-install]# ipa-replica-manage -p $ADMINPW connect vm1.testrelm.com vm2.testrelm.com
You cannot connect to a previously deleted master

[root@vm1 ipa-test-install]# ipa-replica-manage -p $ADMINPW list vm2.testrelm.com
vm3.testrelm.com: replica

I also tried to delete it before the final delete to try a re-install:

[root@vm1 ipa-test-install]# ipa-replica-manage -p $ADMINPW del vm2.testrelm.com
'vm1.testrelm.com' has no replication agreement for 'vm2.testrelm.com'

So, how could I reconnect there?

Comment 10 Scott Poore 2012-07-24 20:00:11 UTC
ok, so I recloned the env to the point where all the connections are in place...

[root@vm1 ~]# ipa-replica-manage list vm1.testrelm.com
vm2.testrelm.com: replica
vm3.testrelm.com: replica
[root@vm1 ~]# ipa-replica-manage list vm2.testrelm.com
vm1.testrelm.com: replica
vm3.testrelm.com: replica
[root@vm1 ~]# ipa-replica-manage list vm3.testrelm.com
vm1.testrelm.com: replica
vm2.testrelm.com: replica

##### On Master (vm1):
[root@vm1 shared]# ipa-replica-manage -p $ADMINPW disconnect vm1.testrelm.com vm2.testrelm.com
Deleted replication agreement from 'vm1.testrelm.com' to 'vm2.testrelm.com'

[root@vm1 shared]# ipa-replica-manage -p $ADMINPW del vm3.testrelm.com
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file
and re-install.
Continue to delete? [no]: yes
Deleted replication agreement from 'vm1.testrelm.com' to 'vm3.testrelm.com'
Deleted replication agreement from 'vm2.testrelm.com' to 'vm3.testrelm.com'

...copied ipa-replica-prepare from a fedora machine to /tmp...

[root@vm1 tmp]# rm /var/lib/ipa/replica-info-vm2.testrelm.com.gpg 
rm: remove regular file `/var/lib/ipa/replica-info-vm2.testrelm.com.gpg'? y

[root@vm1 tmp]# /tmp/ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.102 vm2.testrelm.com
Preparing replica for vm2.testrelm.com from vm1.testrelm.com
Creating SSL certificate for the Directory Server
Creating SSL certificate for the dogtag Directory Server
Creating SSL certificate for the Web Server
Exporting RA certificate
Copying additional files
Finalizing configuration
Packaging replica information into /var/lib/ipa/replica-info-vm2.testrelm.com.gpg
Adding DNS records for vm2.testrelm.com
Using reverse zone 122.168.192.in-addr.arpa.


##### Replica 1 (vm2):

[root@vm2 shared]# rm /dev/shm/replica-info-vm2.testrelm.com.gpg 
rm: remove regular file `/dev/shm/replica-info-vm2.testrelm.com.gpg'? y

[root@vm2 shared]# sftp root@vm1.testrelm.com:/var/lib/ipa/replica-info-vm2.testrelm.com.gpg /dev/shm
Connecting to vm1.testrelm.com...
Fetching /var/lib/ipa/replica-info-vm2.testrelm.com.gpg to /dev/shm/replica-info-vm2.testrelm.com.gpg
/var/lib/ipa/replica-info-vm2.testrelm.com.gpg                                   100%   28KB  28.3KB/s   00:00    

[root@vm2 shared]# /usr/sbin/ipa-server-install --uninstall  -U
Shutting down all IPA services
Removing IPA client configuration

[root@vm2 shared]# ipa-replica-install -U --setup-dns --forwarder=$DNSFORWARD -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm2.testrelm.com.gpg
Run connection check to master
Check connection from replica to remote master 'vm1.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

The following list of ports use UDP protocol and would need to be
checked manually:
   Kerberos KDC: UDP (88): SKIPPED
   Kerberos Kpasswd: UDP (464): SKIPPED

Connection from replica to master is OK.
Start listening on required ports for remote master check
Get credentials to log in to remote master
Execute check on remote master
Check connection from master to remote replica 'vm2.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

Connection from master to replica is OK.

Connection check OK
The host vm2.testrelm.com already exists on the master server. Depending on your configuration, you may perform the following:

Remove the replication agreement, if any:
    % ipa-replica-manage del vm2.testrelm.com
Remove the host entry:
    % ipa host-del vm2.testrelm.com

##### On Master:

[root@vm1 tmp]# ipa-replica-manage del vm2.testrelm.com
'vm1.testrelm.com' has no replication agreement for 'vm2.testrelm.com'

[root@vm1 tmp]# ipa host-del vm2.testrelm.com
ipa: ERROR: invalid 'hostname': An IPA master host cannot be deleted or disabled

[root@vm1 sbin]# ipa host-show vm2.testrelm.com
  Host name: vm2.testrelm.com
  Principal name: host/vm2.testrelm.com
  SSH public key fingerprint: 59:93:30:60:01:C5:74:7D:98:B6:99:F3:48:98:CF:C6 (ssh-dss),
                              4A:77:D0:3E:A8:2B:33:93:5C:3A:B7:4B:77:48:D1:0B (ssh-rsa)
  Password: False
  Keytab: True
  Managed by: vm2.testrelm.com


###################

So, I still can't install using the gpg file from the newer devel ipa-replica-prepare file from F17.  I can't delete because the host still exists in IPA and I can't delete it because it's flagged as an IPA master.

I still need to test with a full F17 env to see if that works.  Will post back with results when that's done.

Comment 11 Scott Poore 2012-07-25 19:45:20 UTC
And, I see pretty much the same thing on F17.

##### On Master (vm4):

[root@vm4 log]# ipa-replica-manage -p $ADMINPW list vm4.testrelm.com
vm5.testrelm.com: replica
vm6.testrelm.com: replica

[root@vm4 log]# ipa-replica-manage -p $ADMINPW list vm5.testrelm.com
vm4.testrelm.com: replica
vm6.testrelm.com: replica

[root@vm4 log]# ipa-replica-manage -p $ADMINPW list vm6.testrelm.com
vm4.testrelm.com: replica
vm5.testrelm.com: replica

[root@vm4 log]# ipa-replica-manage -p $ADMINPW disconnect vm4.testrelm.com vm5.testrelm.com
Deleted replication agreement from 'vm4.testrelm.com' to 'vm5.testrelm.com'

[root@vm4 log]# ipa-replica-manage -p $ADMINPW del vm6.testrelm.com
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file
and re-install.
Continue to delete? [no]: yes
Deleted replication agreement from 'vm4.testrelm.com' to 'vm6.testrelm.com'
Deleted replication agreement from 'vm5.testrelm.com' to 'vm6.testrelm.com'


##### On Replica 1 (vm5):

[root@vm5 ipa-test-install]# ipa-server-install --uninstall -U
Shutting down all IPA services
Removing IPA client configuration
Unconfiguring ntpd
Unconfiguring CA directory server
Unconfiguring CA
Unconfiguring named
Unconfiguring web server
Unconfiguring krb5kdc
Unconfiguring kadmin
Unconfiguring directory server
Unconfiguring ipa_memcached

##### On Master (vm4):

[root@vm4 log]# ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.105 vm5.testrelm.com
Preparing replica for vm5.testrelm.com from vm4.testrelm.com
Creating SSL certificate for the Directory Server
Creating SSL certificate for the dogtag Directory Server
Creating SSL certificate for the Web Server
Exporting RA certificate
Copying additional files
Finalizing configuration
Packaging replica information into /var/lib/ipa/replica-info-vm5.testrelm.com.gpg
Adding DNS records for vm5.testrelm.com
Using reverse zone 122.168.192.in-addr.arpa.

##### On Replica 1 (vm5):

[root@vm5 ipa-test-install]# sftp root@vm4.testrelm.com:/var/lib/ipa/replica-info-vm5.testrelm.com.gpg
The authenticity of host 'vm4.testrelm.com (192.168.122.104)' can't be established.
RSA key fingerprint is c2:93:f2:ba:42:c9:7c:11:7b:6b:35:2b:93:01:bd:3e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'vm4.testrelm.com' (RSA) to the list of known hosts.
Connected to vm4.testrelm.com.
Fetching /var/lib/ipa/replica-info-vm5.testrelm.com.gpg to replica-info-vm5.testrelm.com.gpg
/var/lib/ipa/replica-info-vm5.testrelm.com.gpg                                     100%   28KB  28.4KB/s   00:00    

[root@vm5 ipa-test-install]# mv replica-info-vm5.testrelm.com.gpg /dev/shm/
mv: overwrite `/dev/shm/replica-info-vm5.testrelm.com.gpg'? y

[root@vm5 ipa-test-install]# ipa-replica-install -U --setup-dns --forwarder=$DNSFORWARD -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm5.testrelm.com.gpg 
Run connection check to master
Check connection from replica to remote master 'vm4.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

The following list of ports use UDP protocol and would need to be
checked manually:
   Kerberos KDC: UDP (88): SKIPPED
   Kerberos Kpasswd: UDP (464): SKIPPED

Connection from replica to master is OK.
Start listening on required ports for remote master check
Get credentials to log in to remote master
Execute check on remote master
Check connection from master to remote replica 'vm5.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

Connection from master to replica is OK.

Connection check OK
The host vm5.testrelm.com already exists on the master server.
You should remove it before proceeding:
    % ipa host-del vm5.testrelm.com

##### On Master (vm4):
[root@vm4 log]# ipa host-del vm5.testrelm.com
ipa: ERROR: invalid 'hostname': An IPA master host cannot be deleted or disabled

######################

So pretty much the same thing there.

Note versions:
[root@vm4 log]# rpm -q freeipa-server
freeipa-server-2.99.0-0.20120725T0906Zgit1235739.fc17.x86_64

[root@vm5 ipa-test-install]# rpm -q freeipa-server
freeipa-server-2.99.0-0.20120724T1825Zgitdedb180.fc17.x86_64

[root@vm6 ipa-test-install]# rpm -q freeipa-server
freeipa-server-2.99.0-0.20120724T1825Zgitdedb180.fc17.x86_64

Comment 12 Rob Crittenden 2012-08-15 17:55:43 UTC
I can't duplicate this with current F-17 master (it may be that I'm running the commands wrong, this is somewhat complex).

Would it be possible for you to re-test with a current daily build and, if you can reproduce, post only the steps used to reproduce (on vm1 run: ...)? Normally I'm all for verbosity but I'll admit I had a problem sifting through parts of this :-)

Comment 13 Scott Poore 2012-08-16 03:46:57 UTC
Ok, I retested and I'm still seeing the same thing.

I tested with these versions on F17:

freeipa-server-2.99.0-0.20120815T0635Zgite1d3463.fc17.x86_64
389-ds-base-1.2.11.5-1.fc17.x86_64
sssd-1.8.97-0.20120815T1507Zgitbdbf4f1.fc17.x86_64

Sorry about all the clutter (especially with different hostnames).  

Steps I'm using to setup/reproduce here:

# Setup VM1 Master
on vm1:  ipa-server-install --setup-dns --forwarder=192.168.122.1 --hostname=vm1.testrelm.com -r TESTRELM.COM -n testrelm.com -p $ADMINPW -P $ADMINPW -a $ADMINPW -U

# Setup VM2 Replica1
on vm1:  ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.102 vm2.testrelm.com
on vm2:  sftp vm1:/var/lib/ipa/replica-info-vm2.testrelm.com.gpg /dev/shm
on vm2:  ipa-replica-install -U --setup-ca --setup-dns --forwarder=192.168.122.1 -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm2.testrelm.com.gpg

# Setup VM3 Replica2
on vm1:  ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.103 vm3.testrelm.com
on vm3:  sftp vm1:/var/lib/ipa/replica-info-vm3.testrelm.com.gpg /dev/shm
on vm3:  ipa-replica-install -U --setup-ca --setup-dns --forwarder=192.168.122.1 -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm3.testrelm.com.gpg

# Connect Replica1 and Replica2
on vm1:  ipa-replica-manage connect vm2.testrelm.com vm3.testrelm.com

# Disconnect Master and Replica1
on vm1:  ipa-replica-manage disconnect vm1.testrelm.com vm2.testrelm.com

# Delete Replica2
on vm1:  ipa-replica-manage del vm3.testrelm.com

# attempt Re-install on Replica1
on vm2:  ipa-server-install --uninstall -U
on vm1:  rm /var/lib/ipa/replica-info-vm2.testrelm.com.gpg
on vm1:  ipa-replica-prepare -p $ADMINPW --ip-address=192.168.122.102 vm2.testrelm.com
on vm2:  sftp vm1:/var/lib/ipa/replica-info-vm2.testrelm.com.gpg /dev/shm
on vm2:  ipa-replica-install -U --setup-ca --setup-dns --forwarder=192.168.122.1 -w $ADMINPW -p $ADMINPW /dev/shm/replica-info-vm2.testrelm.com.gpg

# at this point, I'm normally seeing this error:
Connection check OK
The host vm5.testrelm.com already exists on the master server.
You should remove it before proceeding:
    % ipa host-del vm5.testrelm.com

# attempt to delete Replica1 host entry
on vm1: ipa host-del vm2.testrelm.com

# at this point, I've been seeing this error:
ipa: ERROR: invalid 'hostname': An IPA master host cannot be deleted or disabled

Comment 14 Rob Crittenden 2012-08-31 13:53:44 UTC
Ok, I see what is happening.

What this does is delete all the replication agreements for vm2, then vm2 is uninstalled.

This doesn't remove the cn=masters entries from vm1 and vm3 so reinstalling will fail.

I think we want to avoid orphaning any nodes in general, so I'll start by trying to prevent this in the disconnect code.

We may also want to provide docs for how to get out of this situation using ldapdelete.
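
One way to picture "avoid orphaning any nodes" is a connectivity check that runs before a server is removed. The sketch below is an assumption about how such a check could look, not the change that was actually made to the tool; `components_without` and `safe_to_delete` are hypothetical names, and agreements are modelled as a plain dict of sets.

```python
def components_without(agreements, removed):
    """Group the remaining hosts into connected components once `removed` is gone."""
    remaining = set(agreements) - {removed}
    groups = []
    while remaining:
        start = min(remaining)  # deterministic order for the demo
        group = {start}
        stack = [start]
        while stack:
            host = stack.pop()
            for peer in agreements[host]:
                if peer != removed and peer not in group:
                    group.add(peer)
                    stack.append(peer)
        groups.append(sorted(group))
        remaining -= group
    return groups


def safe_to_delete(agreements, host):
    """Treat a deletion as safe only if the remaining hosts stay in one piece."""
    return len(components_without(agreements, host)) <= 1


# State from the report after "disconnect host1 host2": only host3 links the others.
agreements = {
    "host1": {"host3"},
    "host2": {"host3"},
    "host3": {"host1", "host2"},
}

print(components_without(agreements, "host3"))
print(safe_to_delete(agreements, "host3"))
print(safe_to_delete(agreements, "host2"))
```

In this model, deleting host3 splits the remaining hosts into two isolated groups, so the check refuses it, while deleting host2 leaves host1 and host3 connected.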

Comment 15 Chris Smart 2012-09-06 01:59:51 UTC
I have run into this same problem, where we have a ghost replica (no RUV records, no replication agreements, but replica cannot be re-added).

I can confirm that removing the entry for the replica from the database solved my issue.

Back up the entry first:
  ldapsearch -x -LLL -D "cn=Directory Manager" -W -b 'cn=replica-to-remove.domain.com,cn=masters,cn=ipa,cn=etc,dc=domain,dc=com' '(objectclass=*)'

Here is the delete command I used (removes sub records):
  ldapdelete -r -x -D "cn=Directory Manager" -W 'cn=replica-to-remove.domain.com,cn=masters,cn=ipa,cn=etc,dc=domain,dc=com'

Tested this on a test network and it was successful (assumes all replication agreements are gone and there are no RUV records, i.e. the definition of a ghost):
  * delete ghost with ldapdelete command above
  * remove host record, using ipa host-del
  * prepare replica, using ipa-replica-prepare
  * install replica, using ipa-replica-install
  * add any remaining replica agreements, using ipa-replica-manage connect
  * test replication by adding, removing records
  * check log, no replication errors
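
The backup and delete commands above differ only in the host name and the domain suffix, so they can be assembled from those two values. This is a minimal sketch that builds the command lines without executing anything; the function names are hypothetical, the flags are exactly those quoted in this comment, and it assumes the host name needs no DN escaping.

```python
import shlex


def masters_entry_dn(hostname, suffix):
    """DN of the cn=masters entry for a server, following the pattern in this comment."""
    return "cn={},cn=masters,cn=ipa,cn=etc,{}".format(hostname, suffix)


def ghost_cleanup_commands(hostname, suffix):
    """Return backup, delete and host-del commands as argument lists (nothing is run)."""
    dn = masters_entry_dn(hostname, suffix)
    backup = ["ldapsearch", "-x", "-LLL", "-D", "cn=Directory Manager", "-W",
              "-b", dn, "(objectclass=*)"]
    delete = ["ldapdelete", "-r", "-x", "-D", "cn=Directory Manager", "-W", dn]
    host_del = ["ipa", "host-del", hostname]
    return [backup, delete, host_del]


# Placeholder host and suffix taken from the commands quoted above.
commands = ghost_cleanup_commands("replica-to-remove.domain.com", "dc=domain,dc=com")
for argv in commands:
    # Print a copy-and-paste form for review before running anything by hand.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing rather than executing keeps the backup step visible and lets the DN be checked before anything is deleted.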

Comment 16 Chris Smart 2012-09-07 04:33:31 UTC
We are in the process of cleaning up all ghosts, but find them still listed in the config of the one remaining server. Should we be removing these too?

(Note, we have no more replicas, no more RUV records, no more entries in cn=masters.)

dn: cn=replica,cn=dc\3Ddomain\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: replica
nsDS5Flags: 1
objectClass: top
objectClass: nsds5replica
objectClass: extensibleobject
nsDS5ReplicaType: 3
nsDS5ReplicaRoot: dc=domain,dc=com
nsds5ReplicaLegacyConsumer: off
nsDS5ReplicaId: 3
nsDS5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaBindDN: krbprincipalname=ldap/site1-sv03.domain.com,cn=se
 rvices,cn=accounts,dc=domain,dc=com
nsDS5ReplicaBindDN: krbprincipalname=ldap/site3-sv01.domain.com,cn=se
 rvices,cn=accounts,dc=domain,dc=com
nsDS5ReplicaBindDN: krbprincipalname=ldap/site2-sv01.domain.com,cn=se
 rvices,cn=accounts,dc=domain,dc=com
nsDS5ReplicaBindDN: krbprincipalname=ldap/site1-sv02.domain.com,cn=se
 rvices,cn=accounts,dc=domain,dc=com
nsDS5ReplicaBindDN: krbprincipalname=ldap/site2-sv02.domain.com,cn=se
 rvices,cn=accounts,dc=domain,dc=com
nsState:: AwAAAAAAAABld0lQAAAAAAAAAAAAAAAAAwAAAAAAAAABAAAAAAAAAA==
nsDS5ReplicaName: 9a6cd27f-8c2f11e1-ad89fcb6-2dba7b2e
nsds5ReplicaChangeCount: 33334
nsds5replicareapactive: 0

Comment 17 Chris Smart 2012-09-07 05:44:32 UTC
We also have a ghost (site3-sv3.domain.com) in the following entries in the domain db:

dn: cn=ipa-http-delegation,cn=s4u2proxy,cn=etc,dc=domain,dc=com
objectClass: ipaKrb5DelegationACL
objectClass: groupOfPrincipals
objectClass: top
ipaAllowedTarget: cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=domain,dc=com
memberPrincipal: HTTP/site1-sv1.domain.com
memberPrincipal: HTTP/site3-sv3.domain.com
cn: ipa-http-delegation

dn: cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=domain,dc=com
objectClass: groupOfPrincipals
objectClass: top
memberPrincipal: ldap/site1-sv1.domain.com
memberPrincipal: ldap/site3-sv3.domain.com
cn: ipa-ldap-delegation-targets

Comment 18 Chris Smart 2012-09-07 05:54:54 UTC
And this one, where we have a list of servers for Solaris:

dn: cn=default,ou=profile,dc=domain,dc=com
defaultServerList: site1-server1.domain.com site3-server1.domain.com site3-server2.domain.com

Comment 19 Rob Crittenden 2012-09-07 12:22:44 UTC
The cn=mapping tree,cn=config lists those users allowed to bind for replication. You can safely remove this, or leave it if you intend to re-create the agreement.

The s4u2proxy members list those principals allowed to use S4U2Proxy to obtain a ticket on behalf of a user. You should definitely remove any dangling principals.

The defaultServerList should be cleaned up if you have any systems enrolling using our DUA Profile (such as using ldapclient in Solaris). This is the list of available servers for cases of failover. It would be good practice to clean this up, and tidier, but if you aren't using DUA profiles it isn't very important.
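
Removing a dangling principal from the s4u2proxy entries amounts to deleting one memberPrincipal value per entry, which can be written as LDIF modify records. The sketch below only generates that LDIF text; `remove_principal_ldif` is a hypothetical helper, the DNs and values are copied from comment 17 as displayed there, and the value stored in a real directory may differ (for example by carrying a realm suffix), so it should be checked with ldapsearch first.

```python
def remove_principal_ldif(group_dn, principal):
    """Build an LDIF modify record that deletes one memberPrincipal value."""
    return "\n".join([
        "dn: " + group_dn,
        "changetype: modify",
        "delete: memberPrincipal",
        "memberPrincipal: " + principal,
        "-",
        "",  # produces the trailing newline after the "-" line
    ])


# Entries and ghost principals as shown in comment 17.
records = [
    remove_principal_ldif(
        "cn=ipa-http-delegation,cn=s4u2proxy,cn=etc,dc=domain,dc=com",
        "HTTP/site3-sv3.domain.com",
    ),
    remove_principal_ldif(
        "cn=ipa-ldap-delegation-targets,cn=s4u2proxy,cn=etc,dc=domain,dc=com",
        "ldap/site3-sv3.domain.com",
    ),
]

# Records in an LDIF file are separated by a blank line.
ldif = "\n".join(records)
print(ldif)
```

The resulting text is in the form that ldapmodify reads; reviewing it before applying it keeps the cleanup limited to the ghost's principals.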

Comment 21 Chris Smart 2012-09-17 21:36:44 UTC
Great news! Any chance that this will be backported for 2.x?

Comment 22 Dmitri Pal 2012-09-18 15:28:42 UTC
(In reply to comment #21)
> Great news! Any chance that this will be backported for 2.x?

So far we do not plan to but 3.0 will see the light of day quite soon.
Hope this is sufficient.

Comment 23 Chris Smart 2012-09-18 22:00:10 UTC
(In reply to comment #22)
> So far we do not plan to but 3.0 will see the light of day quite soon.
> Hope this is sufficient.

Yes, no problem. Comment 15 shows how to clean up the master entry, so that should be sufficient until 3.0 is out. Thanks to everyone.

Comment 25 Scott Poore 2012-10-03 18:13:04 UTC
Verified.

Version ::

ipa-server-3.0.0-2.el6.x86_64

Manual Test Results ::

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW list vm4.testrelm.com
vm5.testrelm.com: replica
vm6.testrelm.com: replica

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW list vm5.testrelm.com
vm4.testrelm.com: replica

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW list vm6.testrelm.com
vm4.testrelm.com: replica

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW connect vm5.testrelm.com vm6.testrelm.com
ipa: INFO: Getting ldap service principals for conversion: (krbprincipalname=ldap/vm5.testrelm.com) and (krbprincipalname=ldap/vm6.testrelm.com)
Connected 'vm5.testrelm.com' to 'vm6.testrelm.com'

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW disconnect vm4.testrelm.com vm5.testrelm.com
ipa: INFO: Setting agreement cn=meTovm4.testrelm.com,cn=replica,cn=dc\=testrelm\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTovm4.testrelm.com,cn=replica,cn=dc\=testrelm\,dc\=com,cn=mapping tree,cn=config
ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
Deleted replication agreement from 'vm4.testrelm.com' to 'vm5.testrelm.com'

[root@vm4 ipa]# ipa-replica-manage -p $ADMINPW del vm6.testrelm.com 
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file
and re-install.
Continue to delete? [no]: yes
Deleting this server will orphan 'vm4.testrelm.com, vm5.testrelm.com'. 
You will need to reconfigure your replication topology to delete this server.
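
The refusal in this session can be reproduced on paper from the listings alone. The sketch below parses `ipa-replica-manage list <host>` output in the format shown above, replays the connect and disconnect steps, and reports which peers of vm6 would be left without any agreement. It is an illustrative model only; `parse_list_output` and `orphaned_by_delete` are hypothetical names and do not reflect how the tool implements its check.

```python
def parse_list_output(text):
    """Parse 'name: role' lines from 'ipa-replica-manage list <host>' into peer names."""
    peers = set()
    for line in text.splitlines():
        name, sep, _role = line.partition(":")
        if sep and name.strip():
            peers.add(name.strip())
    return peers


def orphaned_by_delete(agreements, host):
    """Peers of `host` whose only remaining agreement is with `host` itself."""
    return sorted(p for p in agreements[host] if agreements[p] == {host})


# Listings copied from the start of the session above.
agreements = {
    "vm4.testrelm.com": parse_list_output(
        "vm5.testrelm.com: replica\nvm6.testrelm.com: replica"
    ),
    "vm5.testrelm.com": parse_list_output("vm4.testrelm.com: replica"),
    "vm6.testrelm.com": parse_list_output("vm4.testrelm.com: replica"),
}

# connect vm5 vm6
agreements["vm5.testrelm.com"].add("vm6.testrelm.com")
agreements["vm6.testrelm.com"].add("vm5.testrelm.com")

# disconnect vm4 vm5
agreements["vm4.testrelm.com"].discard("vm5.testrelm.com")
agreements["vm5.testrelm.com"].discard("vm4.testrelm.com")

print(orphaned_by_delete(agreements, "vm6.testrelm.com"))
```

In this model both vm4 and vm5 depend on vm6 for their only agreement, which is consistent with the two hosts named in the message above.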

Comment 27 errata-xmlrpc 2013-02-21 09:14:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0528.html