Bug 570107

Summary: The import of LDIFs with base-64 encoded DNs fails, modrdn with non-ASCII new rdn incorrect
Product: [Retired] 389
Reporter: Andrey Ivanov <andrey.ivanov>
Component: Database - Import/Export
Assignee: Noriko Hosoi <nhosoi>
Status: CLOSED CURRENTRELEASE
QA Contact: Viktor Ashirov <vashirov>
Severity: high
Priority: high
Version: 1.2.6
CC: amsharma, rmeggins
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2015-12-07 16:46:39 UTC
Bug Blocks: 543590, 639035    
Attachments:
git patch file (flags: nhosoi: review?, rmeggins: review+)
LDIF file with non-ASCII DNs (flags: none)

Description Andrey Ivanov 2010-03-03 10:24:33 UTC
Importing an LDIF file previously exported by 389 v1.2.5 fails for entries whose DN is base64-encoded. In addition, a simple modrdn to a non-ASCII RDN produces an incorrect entry.

Reproducible: Always

Steps to Reproduce:
1. I compiled 389 from the following source packages:
389-admin-1.1.11.a2.tar.bz2
389-adminutil-1.1.10.tar.bz2
389-ds-base-1.2.6.a2.tar.bz2

2. Try to import the following LDIF (with ldif2db -n userRoot -i /tmp/prod_base_current.ldif):
dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5p
 cXVlLGRjPWVkdQ==
objectClass: top
objectClass: inetOrgPerson
cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF
uid: toto
sn: toto


3. Errors are reported during the import and the entry is not imported.

4. Suppose an entry exists with a different, ASCII-only cn. Now rename this entry so that the new RDN contains a non-ASCII character. Example test_mod.ldif file:
---------------------------------
dn: cn=ATELIER DE MECANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu
changetype: modrdn
newrdn: cn=ATELIER DE M\C3\89CANIQUE
deleteoldrdn: 1
----------------------------------
[root@ldap-model Admin]# ldapmodify -Y GSSAPI -f /tmp/test_mod.ldif 
SASL/GSSAPI authentication started
SASL username: pj.EDU
SASL SSF: 56
SASL installing layers
modifying rdn of entry "cn=ATELIER DE MECANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu"
rename completed



Actual Results:  
Concerning the import problem:

[03/Mar/2010:10:47:12 +0100] - import userRoot: WARNING: skipping bad LDIF entry (not starting with "dn: ") ending line 239762 of file "/tmp/prod_base_current.ldif"


As for the rename, the resulting entry is:
[root@ldap-model Admin]# ldapsearch -Y GSSAPI -b "dc=id,dc=polytechnique,dc=edu" cn=atelier*de*m*canique   
SASL/GSSAPI authentication started
SASL username: pj.EDU
SASL SSF: 56
SASL installing layers
# extended LDIF
#
# LDAPv3
# base <dc=id,dc=polytechnique,dc=edu> with scope subtree
# filter: cn=atelier*de*m*canique
# requesting: ALL
#

# ATELIER DE M\C3\89CANIQUE, objets, id.polytechnique.edu
dn: cn=ATELIER DE M\C3\89CANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
uid: toto
sn: toto
cn: ATELIER DE MC389CANIQUE

# search result
search: 4
result: 0 Success

# numResponses: 2
# numEntries: 1



Expected Results:  
The import should succeed, and the modrdn should produce a consistent entry (the cn value should match the new RDN).

Running ldapadd on a file test.ldif containing the same data works without problems:

test.ldif 
-------------------
dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5p
 cXVlLGRjPWVkdQ==
objectClass: top
objectClass: inetOrgPerson
cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF
uid: toto
sn: toto
-------------------



[root@ldap-model Admin]# ldapadd -Y GSSAPI -f /tmp/test.ldif 
SASL/GSSAPI authentication started
SASL username: pj.EDU
SASL SSF: 56
SASL installing layers
adding new entry "cn=ATELIER DE MÉCANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu"

[root@ldap-model Admin]# ldapsearch -Y GSSAPI -b "dc=id,dc=polytechnique,dc=edu" cn=atelier*de*m*canique
SASL/GSSAPI authentication started
SASL username: pj.EDU
SASL SSF: 56
SASL installing layers
# extended LDIF
#
# LDAPv3
# base <dc=id,dc=polytechnique,dc=edu> with scope subtree
# filter: cn=atelier*de*m*canique
# requesting: ALL
#

# ATELIER DE M\C3\89CANIQUE, objets, id.polytechnique.edu
dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pc
 XVlLGRjPWVkdQ==
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF
uid: toto
sn: toto

# search result
search: 4
result: 0 Success

# numResponses: 2
# numEntries: 1

Comment 1 Noriko Hosoi 2010-03-11 00:36:50 UTC
Created attachment 399210 [details]
git patch file

File: ldap/servers/slapd/back-ldbm/import-threads.c

Description: When reading the DN value from the raw LDIF file,
the import code only checked for "dn: ", which was incomplete.  It
should also have checked for "dn:: ", which marks a base64-encoded DN.
This patch adds that case.
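
For illustration only, the check described above can be sketched in Python. This is not the C code from import-threads.c; the function name read_dn is hypothetical, and the sketch assumes the line has already been unfolded (continuation lines joined).

```python
import base64

def read_dn(line):
    """Return the DN carried by one unfolded LDIF line, or None if it is not a DN line."""
    if line.startswith("dn:: "):
        # "dn:: " marks a base64-encoded DN: decode it and read the bytes as UTF-8.
        return base64.b64decode(line[len("dn:: "):].strip()).decode("utf-8")
    if line.startswith("dn: "):
        # "dn: " marks a plain DN: the value follows the separator as-is.
        return line[len("dn: "):].strip()
    return None

plain = read_dn("dn: cn=ATELIER DE MECANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu")
encoded = read_dn(
    "dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pcXVlLGRjPWVkdQ=="
)
print(plain)
print(encoded)
```

A reader that accepts only the "dn: " prefix would return None for the second line, which matches the "skipping bad LDIF entry" warning in the report.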

Comment 2 Noriko Hosoi 2010-03-11 00:55:02 UTC
Hi Andrey,

Would it be possible to attach the corresponding LDIF file without base64 encoding?  I tried to decode the dn and cn with a base64 decoder, but it complains that they are not UTF-8 strings...

test.ldif 
-------------------
dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5p
 cXVlLGRjPWVkdQ==
objectClass: top
objectClass: inetOrgPerson
cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF
uid: toto
sn: toto
-------------------

Thanks,
--noriko
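
When decoding such a value by hand, note that the dn:: value in test.ldif is folded: in LDIF, a line that starts with a single space continues the previous line, so both pieces have to be joined to obtain the full DN. A minimal Python sketch of that unfolding step follows; the helper name unfold is an assumption made for illustration.

```python
import base64

def unfold(lines):
    """Join LDIF continuation lines (lines starting with one space) onto the previous line."""
    logical = []
    for line in lines:
        if line.startswith(" ") and logical:
            # Drop the single leading space and append to the previous logical line.
            logical[-1] += line[1:]
        else:
            logical.append(line)
    return logical

ldif = [
    "dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5p",
    " cXVlLGRjPWVkdQ==",
    "objectClass: top",
    "cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF",
]
logical = unfold(ldif)
dn = base64.b64decode(logical[0].split(":: ", 1)[1]).decode("utf-8")
print(len(logical))
print(dn)
```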

Comment 3 Rich Megginson 2010-03-11 02:31:58 UTC
Comment on attachment 399210 [details]
git patch file

patch looks ok, but would like to verify that it works with the data in question

Comment 5 Andrey Ivanov 2010-03-11 08:02:03 UTC
Created attachment 399276 [details]
LDIF file with non-ASCII DNs

The test case

Comment 6 Andrey Ivanov 2010-03-11 08:09:12 UTC
Hi Noriko,

Here is the complete test.ldif (including the necessary "ou" hierarchy). The import does work correctly with your patch.

As for the second part of the bug (incorrect renames with non-ASCII RDNs), this patch of course has no effect on it; that bug is still present...

Comment 7 Andrey Ivanov 2010-03-11 08:14:26 UTC
To decode, I use:

perl -e 'use MIME::Base64; while(<STDIN>){print decode_base64($_);}'


For this file it gives me the following DN and CN:

[root@ldap-model Admin]# perl -e 'use MIME::Base64; while(<STDIN>){print decode_base64($_);}'
Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PW9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pcXVlLGRjPWVkdQ==
cn=ATELIER DE MÉCANIQUE,ou=objets,dc=id,dc=polytechnique,dc=edu

QVRFTElFUiBERSBNw4lDQU5JUVVF
ATELIER DE MÉCANIQUE

Comment 8 Noriko Hosoi 2010-03-12 00:33:13 UTC
Thank you, Andrey, for providing me the test data and Perl script.  It's really handy!

I created a suffix, dc=id,dc=polytechnique,dc=edu, and followed your steps:
1) import the sample test data: test.ldif
./ldif2db -n testroot -i /export/tests/570107/test.ldif 
No error was reported; 3 entries were successfully imported:
[..]  - import testroot: Import complete.  Processed 3 entries in 2 seconds. (1.50 entries/sec)

2) search the server
ldapsearch -e -b "dc=id,dc=polytechnique,dc=edu" "(objectclass=*)" dn
dn: dc=id,dc=polytechnique,dc=edu
dn: ou=Objets,dc=id,dc=polytechnique,dc=edu
dn: cn=ATELIER DE MÉCANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu

3) delete the third entry:
ldapdelete -D 'cn=directory manager' -w <pw>
cn=ATELIER DE MÉCANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu

4) add an entry without accent:
ldapmodify -D 'cn=directory manager' -w <pw> -a << EOF
dn: cn=ATELIER DE MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
cn: ATELIER DE MECANIQUE
uid: toto
sn: toto
EOF

5) modrdn the DN:
ldapmodify -D 'cn=directory manager' -w <pw>
dn: cn=ATELIER DE MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
changetype: modrdn
newrdn: cn=ATELIER DE MÉCANIQUE
deleteoldrdn: 1

6) check if it was updated:
ldapsearch -e -b "dc=id,dc=polytechnique,dc=edu" "(cn=*)"
dn: cn=ATELIER DE MÉCANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
uid: toto
sn: toto
cn: ATELIER DE MÉCANIQUE

Comment 9 Noriko Hosoi 2010-03-12 00:40:01 UTC
Reviewed by Rich (Thank you!!)

Pushed to master.

$ git merge work
Updating f11afee..dc2f7d0
Fast-forward
 ldap/servers/slapd/back-ldbm/import-threads.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

$ git push
Counting objects: 13, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 908 bytes, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   f11afee..dc2f7d0  master -> master

Comment 10 Andrey Ivanov 2010-03-12 16:08:44 UTC
I've compiled the sources from a git checkout that I just made.
Up to step 4 everything is as you describe (the import of base64-encoded DNs is fixed). From step 5 on you use the ldapmodify from mozldap, whereas I use the one from OpenLDAP, and the representation of the DN is not the same. UTF-8 characters can be escaped as hex pairs (I think this is part of RFC 4514), so for example ldapvi (based on ldapsearch) generates the following for the DN:

newrdn: cn=ATELIER M\C3\89CANIQUE

In other words, if I use the OpenLDAP ldapmodify starting from step 5:
[root@ldap-model Admin]# /usr/bin/ldapmodify -x -D 'cn=Directory Manager' -w '<mdp>'
dn: cn=ATELIER MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
changetype: modrdn
newrdn: cn=ATELIER M\C3\89CANIQUE
deleteoldrdn: 1
newsuperior: ou=Objets,dc=id,dc=polytechnique,dc=edu

modifying rdn of entry "cn=ATELIER MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu"
rename completed


The resulting entry is exactly as I stated above. There are actually two problems:
* incorrect resulting entry:
/usr/bin/ldapsearch -x -D 'cn=Directory manager' -w '<mdp>' -b "dc=id,dc=polytechnique,dc=edu" cn=* 

# ATELIER M\C3\89CANIQUE, Objets, id.polytechnique.edu
dn: cn=ATELIER M\C3\89CANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
uid: toto
sn: toto
cn: ATELIER MC389CANIQUE

* absence of the MODRDN line in the access log for this modification (while the RESULT line for the operation is present, cf. op=1; tag=109 is the result of a moddn operation):
==> /Logs/Ldap/access <==
[12/Mar/2010:16:55:29 +0100] conn=4 fd=128 slot=128 connection from 127.0.0.1 to 127.0.0.1
[12/Mar/2010:16:55:29 +0100] conn=4 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[12/Mar/2010:16:55:29 +0100] conn=4 op=0 RESULT err=0 tag=97 nentries=0 etime=0.010000 dn="cn=Directory Manager"
[12/Mar/2010:16:55:34 +0100] conn=4 op=1 RESULT err=0 tag=109 nentries=0 etime=0.001000
[12/Mar/2010:16:55:36 +0100] conn=4 op=2 UNBIND
[12/Mar/2010:16:55:36 +0100] conn=4 op=2 fd=128 closed - U1
==> /Logs/Ldap/audit <==
time: 20100312165534
dn: cn=atelier de mecanique,ou=objets,dc=id,dc=polytechnique,dc=edu
changetype: modrdn
newrdn: cn=ATELIER DE M\C3\89CANIQUE
deleteoldrdn: 1
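
The second symptom (a RESULT line with no matching request line) can be spotted mechanically. The following Python sketch is only an illustration: the function name results_without_request and the regular expression are assumptions based on the log excerpt above, not a description of the complete 389 access log format.

```python
import re

# Matches lines carrying a connection number, an operation number, and an
# upper-case keyword such as BIND, RESULT, or UNBIND.
OP_LINE = re.compile(r"conn=(\d+) op=(\d+) ([A-Z]+)\b")

def results_without_request(lines):
    """Return (conn, op) pairs that have a RESULT line but no request line."""
    requests = set()
    results = []
    for line in lines:
        match = OP_LINE.search(line)
        if match is None:
            continue  # connection open/close lines carry no operation keyword
        key = (int(match.group(1)), int(match.group(2)))
        if match.group(3) == "RESULT":
            results.append(key)
        else:
            requests.add(key)
    return [key for key in results if key not in requests]

access_log = [
    '[12/Mar/2010:16:55:29 +0100] conn=4 fd=128 slot=128 connection from 127.0.0.1 to 127.0.0.1',
    '[12/Mar/2010:16:55:29 +0100] conn=4 op=0 BIND dn="cn=Directory Manager" method=128 version=3',
    '[12/Mar/2010:16:55:29 +0100] conn=4 op=0 RESULT err=0 tag=97 nentries=0 etime=0.010000 dn="cn=Directory Manager"',
    '[12/Mar/2010:16:55:34 +0100] conn=4 op=1 RESULT err=0 tag=109 nentries=0 etime=0.001000',
    '[12/Mar/2010:16:55:36 +0100] conn=4 op=2 UNBIND',
    '[12/Mar/2010:16:55:36 +0100] conn=4 op=2 fd=128 closed - U1',
]
orphans = results_without_request(access_log)
print(orphans)
```

For the excerpt above, only conn=4 op=1 is reported, which is the rename whose MODRDN line is missing.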

Comment 11 Noriko Hosoi 2010-03-12 18:31:53 UTC
Thank you, Andrey!
I've opened another bug for the problem:
Bug 573060 - DN normalizer: ESC HEX HEX is not normalized
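
To illustrate what "ESC HEX HEX" means here: in the string form of a DN, a backslash followed by two hex digits stands for one byte, so \C3\89 is the UTF-8 encoding of É. The Python sketch below contrasts decoding the hex pairs with merely dropping the backslashes. It is an assumption that the latter resembles what the server did; it is shown only because its output matches the cn value observed above. Both helper names are hypothetical, and the sketch is not a DN normalizer (for example, it would also turn an escaped comma, \2C, into a bare comma).

```python
import re

# A backslash followed by exactly two hex digits, matched on bytes.
HEX_PAIR = re.compile(rb"\\([0-9A-Fa-f]{2})")

def decode_hex_escapes(value):
    """Replace each \\XX pair with the byte it denotes, then read the result as UTF-8."""
    raw = value.encode("utf-8")
    raw = HEX_PAIR.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode("utf-8")

def strip_backslashes(value):
    """Drop the backslashes without interpreting the hex digits."""
    return value.replace("\\", "")

escaped = r"cn=ATELIER DE M\C3\89CANIQUE"
decoded = decode_hex_escapes(escaped)
stripped = strip_backslashes(escaped)
print(decoded)
print(stripped)
```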

Comment 12 Andrey Ivanov 2010-03-12 18:57:26 UTC
Yes, I think it's a good idea to separate these two bugs. I reported them together initially because I thought they had the same origin.

Comment 13 Amita Sharma 2011-05-19 13:52:17 UTC
Hi Noriko,

I am following the steps given in comment#8..
But after executing ..
[root@testvm slapd-testvm]# ldapmodify -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 -a << EOF
> dn: cn=ATELIER DE MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
> changetype: modrdn
> newrdn: cn=ATELIER DE MÉCANIQUE
> deleteoldrdn: 1
> EOF
modifying rdn of entry "cn=ATELIER DE MECANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu"

I am getting ..
ldapsearch -h localhost -p 389 -D "cn=Directory Manager" -w Secret123  -b "dc=id,dc=polytechnique,dc=edu" "(cn=*)"
# extended LDIF
#
# LDAPv3
# base <dc=id,dc=polytechnique,dc=edu> with scope subtree
# filter: (cn=*)
# requesting: ALL
#

# ATELIER DE M\C3\89CANIQUE, Objets, id.polytechnique.edu
dn:: Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PU9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pc
 XVlLGRjPWVkdQ==
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
uid: toto
sn: toto
cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

How should I get the simple ASCII values with openldap search?

Comment 14 Rich Megginson 2011-05-19 14:08:22 UTC
OpenLDAP's ldapsearch always prints values that contain non-printable or non-ASCII characters as base64.  I use python:

python
>>> import base64
>>> val = "Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PU9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pcXVlLGRjPWVkdQ=="
>>> base64.b64decode(val)
'cn=ATELIER DE M\xc3\x89CANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu'
>>> val2 = "QVRFTElFUiBERSBNw4lDQU5JUVVF"
>>> base64.b64decode(val2)
'ATELIER DE M\xc3\x89CANIQUE'
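
The session above shows the raw bytes. To see the accented character itself, the decoded bytes can additionally be interpreted as UTF-8. A minimal Python 3 sketch follows; the helper name readable is hypothetical, and it assumes unfolded ldapsearch output lines whose base64 values are UTF-8 text (binary attributes would raise an error).

```python
import base64

def readable(line):
    """Rewrite 'attr:: <base64>' as 'attr: <text>'; return other lines unchanged."""
    attr, sep, value = line.partition(":: ")
    if not sep or ":" in attr:
        return line  # not a base64-encoded attribute line
    return attr + ": " + base64.b64decode(value).decode("utf-8")

lines = [
    "uid: toto",
    "sn: toto",
    "cn:: QVRFTElFUiBERSBNw4lDQU5JUVVF",
]
converted = [readable(line) for line in lines]
for line in converted:
    print(line)
```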

Comment 15 Noriko Hosoi 2011-05-19 16:19:06 UTC
Another way is to use the Perl script provided by Andrey, the reporter of this bug. ;)
perl -e 'use MIME::Base64; while(<STDIN>){print decode_base64($_);}'
Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PU9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pcXVlLGRjPWVkdQ==
cn=ATELIER DE MÉCANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
QVRFTElFUiBERSBNw4lDQU5JUVVF
ATELIER DE MÉCANIQUE

Comment 16 Amita Sharma 2011-05-20 08:51:40 UTC
ok, thanks both of you :)
Marking the bug as VERIFIED.

Comment 17 Amita Sharma 2011-05-20 08:53:22 UTC
[root@testvm ~]# perl -e 'use MIME::Base64; while(<STDIN>){print decode_base64($_);}'
Y249QVRFTElFUiBERSBNw4lDQU5JUVVFLG91PU9iamV0cyxkYz1pZCxkYz1wb2x5dGVjaG5pcXVlLGRjPWVkdQ==
cn=ATELIER DE MÉCANIQUE,ou=Objets,dc=id,dc=polytechnique,dc=edu
QVRFTElFUiBERSBNw4lDQU5JUVVF
ATELIER DE MÉCANIQUE