Created attachment 985934 [details]
steps with console output
Description of problem:
The replication agreement with the replica is not disabled when ipa-restore is run on a host without IPA installed. As a result, the replica is able to push changes to the restored master without a re-init from the master, which causes data corruption.
Version-Release number of selected component (if applicable):
[root@master ~]# rpm -q ipa-server
Steps to Reproduce:
1. See the attached file, which contains the steps along with the console output.
Actual results:
Replication agreement with the existing replica is not disabled when doing an IPA restore without IPA installed.
Expected results:
Replication agreement with the replica should be disabled when doing an IPA restore without IPA installed.
Looking at the logs was a bad idea. Backup seems to change them, so I think we cannot trust them.
I am trying to reproduce, but I am having difficulties after step 11 completes.
The ipa-restore fails with a problem with hostname:
Directory Manager (existing master) password:
Preparing restore from /var/lib/ipa/backup/ipa-full-2015-01-30-09-09-23 on localhost.localdomain
Performing FULL restore from FULL backup
Host name localhost.localdomain does not match backup name vm-028.idm.lab.bos.redhat.com
Note that I installed master/replica without a reverse zone (--forwarder=10.65.201.89) and without specifying --ip-address=10.65.207.58. In the end, replication was working fine.
Also, looking at the userRoot.ldif file in the backup, I noticed it contains the RUV. This is not problematic if the RUV is cleared at restore time.
Using Kaleem's environment I was able to reproduce the problem systematically.
I modified the backup:
- remove the RUV from the userRoot.ldif
- remove the RUV from ./etc/dirsrv/slapd-TESTRELM-TEST/dse.ldif (in files.tar)
These were the only 2 ldif files containing the replicageneration.
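For reference, the manual RUV removal from a backend ldif such as userRoot.ldif could be sketched as below. This is only a sketch under assumptions, not what ipa-restore does: it assumes the database RUV lives in the tombstone entry whose DN starts with nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff, and that records are separated by blank lines. The function name strip_ruv_entries is hypothetical. It does not cover dse.ldif, where the RUV is kept in attributes of the replication configuration entries rather than in a separate entry.

```python
# Assumption: the database RUV is stored in the tombstone entry whose DN
# starts with this nsuniqueid. The real fix in ipa-restore may differ.
RUV_DN_PREFIX = "dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff"


def strip_ruv_entries(src, dst):
    """Copy LDIF records from src to dst, skipping RUV tombstone records.

    src and dst are text file objects. Records are assumed to be
    separated by blank lines. Returns the number of records dropped.
    """
    dropped = 0
    record = []

    def flush():
        nonlocal dropped
        if not record:
            return
        # Find the DN line of the record (it may follow comment lines).
        dn_line = next((l for l in record if l.lower().startswith("dn:")), "")
        if dn_line.lower().startswith(RUV_DN_PREFIX):
            dropped += 1
        else:
            dst.writelines(record)
            dst.write("\n")
        record.clear()

    for line in src:
        if line.strip() == "":
            flush()  # a blank line ends the current record
        else:
            record.append(line if line.endswith("\n") else line + "\n")
    flush()  # last record may not be followed by a blank line
    return dropped
```

It would be called with two open text files, for example the extracted userRoot.ldif as src and a new file as dst, and the new file would then be the one handed to ldif2db.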
So either there is another ldif file in the backup that contains the RUV and I missed it, or the ldif files were not used at the end of the restore.
A possibility is that backup database files ('*.db') were applied after the ldif import.
I wanted to check that, but the errors logs are also modified by ipa-restore.
Waiting for Thierry to finish the investigation and find the root cause.
During the restore, the database files are restored and then the backends are reimported from ldif files (ipaca.ldif and userroot.ldif).
The ldif files are imported only if they exist.
For a FULL restore, the ldif files are temporary files (e.g. /tmp/tmpEoVEO5ipa/ipa/EXAMPLE-COM-userRoot.ldif). The test (os.path.exists) on those temporary files fails, which is why they are not imported (ldif2db).
I will test a fix to allow the import. I assume that if the import (without RUV) is successful, then the other replicas will not be able to replicate to the restored instance.
Thierry, should we document this as Known Issue in 7.1? Do we know the exact root cause?
Still fighting with fix.
A first problem is that, when a temporary directory is used to extract the backup, the os.path.exists test on the extracted ldif files returns False.
I had to use tarinfo to check for the existence of each file and then add it to the set of files to import.
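A minimal sketch of that kind of check, using the standard tarfile module to ask the tarball itself rather than the file system. The helper names find_ldif_members and extract_member_text are hypothetical and are not taken from ipa-restore.py; the layout of the real backup tarball may differ.

```python
import tarfile


def find_ldif_members(tar_path, suffix=".ldif"):
    """Return names of regular files in the tarball that end with suffix."""
    with tarfile.open(tar_path) as tar:
        return [m.name for m in tar.getmembers()
                if m.isfile() and m.name.endswith(suffix)]


def extract_member_text(tar_path, member_name):
    """Read one member directly from the tarball, without extracting to disk."""
    with tarfile.open(tar_path) as tar:
        fileobj = tar.extractfile(member_name)  # KeyError if the name is absent
        if fileobj is None:
            # extractfile() returns None for members that are not regular files
            raise ValueError("%r is not a regular file" % (member_name,))
        with fileobj:
            return fileobj.read().decode("utf-8")
```

Reading the member straight from the tarball sidesteps the question of where (or whether) the file was extracted, at the cost of holding its content in memory.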
It works, but then we want to remove the RUV from the files, and opening some of the files fails. It is not always the same file that fails, so I believe syncing the file system should fix it, but I do not know how to do that.
Currently FULL restore is not working as expected; we can document that until we get a true fix.
That's weird. Did you check the audit log for SELinux AVCs?
I don't think syncing the FS will help, as it merely flushes FS buffers. If the file isn't there in the first place, there is nothing to flush (?)
You can try adding "print repr(filename)" above the os.path.exists call to see if there are any unusual characters in the filename.
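A slightly extended version of that debugging aid might look like the sketch below (written for Python 3; describe_path is a hypothetical helper, not part of ipa-restore.py). Besides the repr of the filename, it reports which component of the path is the first one that does not exist, which helps tell a wrong filename from a missing temporary directory.

```python
import os


def describe_path(filename):
    """Return diagnostic lines showing where a path stops existing."""
    lines = ["filename: %r" % (filename,)]
    path = os.path.abspath(filename)
    parts = []
    # Collect the path and all of its parents, up to the root.
    while True:
        parts.append(path)
        parent = os.path.dirname(path)
        if parent == path:
            break
        path = parent
    # Report from the root down, so the first MISSING line is the culprit.
    for p in reversed(parts):
        state = "exists" if os.path.exists(p) else "MISSING"
        lines.append("%-7s %r" % (state, p))
    return lines
```

Printing "\n".join(describe_path(filename)) just above the failing os.path.exists call would show the result in the restore log.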
RUV removal works and should address the problem of restored server receiving old updates.
The problem is understood, but due to my lack of Python knowledge I have not been able to find a fix.
The restore is done using a tarball. The tarball contains the backend ldif files.
Files (including the backend ldif files) are extracted from the tarball, but are not accessible: os.path.exists or open fails on those files.
I do not know why those calls fail (no such file), although I can detect the files in the tarball.
I think removing the RUV from the ldif is the best approach, this is why I prefer to make it work.
I am attaching the trace of the debug.
Jan, can you please assist Thierry with the Python parts?
Created attachment 993684 [details]
modified ipa-restore.py+log files (traces added prefixed by 'XXX')
[root@dhcp207-229 ~]# rpm -q ipa-server
Please find the attached file for console output of verification steps.
Created attachment 1080969 [details]
console output with verification steps
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.