Bug 628300 - DN is not normalized in dn/entry cache when an entry is added, entrydn is not present in search results
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: 389
Classification: Retired
Component: Database - Indexes/Searches
Version: 1.2.6
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks: 389_1.2.6 639035
Reported: 2010-08-29 10:11 UTC by Andrey Ivanov
Modified: 2015-12-07 16:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-07 16:31:31 UTC


Attachments
git patch file (master) (1.35 KB, patch)
2010-08-30 17:05 UTC, Noriko Hosoi
rmeggins: review+
git patch file (master) (4.23 KB, patch)
2010-08-31 00:41 UTC, Noriko Hosoi
no flags
git patch file (master) (4.30 KB, patch)
2010-08-31 17:12 UTC, Noriko Hosoi
nhosoi: review?
rmeggins: review+

Description Andrey Ivanov 2010-08-29 10:11:48 UTC
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 (.NET CLR 3.5.30729)

When an entry is added with a non-normalized DN, the non-normalized form of the DN is stored in the DN/entry cache. All searches return this non-normalized value until the server is restarted. The entrydn operational attribute does not appear until the restart either.

Reproducible: Always

Steps to Reproduce:
1. dbgen.pl -n 1 -v -o /tmp/example.ldif
2. ldif2db -n userRoot -i /tmp/example.ldif
3. Add the following entry to the directory (note the non-normalised components in ou and dc parts):
------------
dn: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM
changetype: add
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
cn: Bug Normalizer Test
sn: Bug Normalizer
uid: bug-normalizer


Actual Results:  
I'm using openldap ldapsearch.

ldapsearch -x  -b "dc=example,dc=com" entrydn

...
# Payroll, example.com
dn: ou=Payroll,dc=example,dc=com
entrydn: ou=payroll,dc=example,dc=com

# TVradmin0, Accounting, example.com
dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
entrydn: uid=tvradmin0,ou=accounting,dc=example,dc=com

# bug-normalizer, paYROll, eXAmple.COM
dn: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM

...

So, the dn value is not normalized and the entrydn attribute does not appear for this entry. 

Expected Results:  
According to the design document http://directory.fedoraproject.org/wiki/Upgrade_to_New_DN_Format#Rules_to_handle_DN the value should be normalized when the server receives it and stores it in the backend. The DN and entry caches should correspond to the backend contents.

After I restart the server, it shows the correct results:

service dirsrv restart
ldapsearch -x  -b "dc=example,dc=com" entrydn

...
# Payroll, example.com
dn: ou=Payroll,dc=example,dc=com
entrydn: ou=payroll,dc=example,dc=com

# TVradmin0, Accounting, example.com
dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
entrydn: uid=tvradmin0,ou=accounting,dc=example,dc=com

# bug-normalizer, Payroll, example.com
dn: uid=bug-normalizer,ou=Payroll,dc=example,dc=com
entrydn: uid=bug-normalizer,ou=payroll,dc=example,dc=com
...

The problem is clearly in the way the DN and entry caches are managed. The non-normalized value is written directly to these caches, so it is what subsequent searches return. When the server restarts, it regenerates the caches from entryrdn.db4 and the returned values are correctly normalized.

So the fix would be to add the entrydn attribute and to put the correctly normalized DN into the DN and entry caches when a new entry is added.

Comment 1 Noriko Hosoi 2010-08-30 17:04:36 UTC
Thanks for the bug report, Andrey!

Actually, it's a bug in the entrydn support. (Bug 578296 - Attribute type entrydn needs to be added when subtree rename switch is on. )

This normalized entry also did not show the entrydn.  *shame shame* :(

dn: uid=bug2-normalizer,ou=payroll,dc=eXAmple,dc=COM
changetype: add
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
cn: Bug2 Normalizer Test
sn: Bug2 Normalizer
uid: bug2-normalizer

The fix is very simple.  I'm attaching it next.

Comment 2 Noriko Hosoi 2010-08-30 17:05:54 UTC
Created attachment 441989 [details]
git patch file (master)

Description: Code for supporting entrydn (added for Bug 578296)
contained a bug.  If an entry was found in the entry cache,
id2entry_ext returned it without adding the entrydn attribute
value.  This patch fixes the problem.
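
For readers who want the gist of this fix, here is a minimal Python sketch. It is a toy model, not the actual C code of id2entry_ext: the names id2entry, cache and load_from_db are hypothetical, and lowercasing the DN is only a stand-in for the normalized entrydn value seen in the search output above.

```python
def id2entry(entry_id, cache, load_from_db):
    """Toy lookup: return the entry for entry_id, always carrying entrydn."""
    entry = cache.get(entry_id)
    if entry is None:
        # Cache miss: read the entry from the (hypothetical) backend loader.
        entry = load_from_db(entry_id)
        cache[entry_id] = entry
    # The essence of the fix: this step runs on the cache-hit path too.
    # The buggy version returned cached entries before reaching it.
    if "entrydn" not in entry:
        entry["entrydn"] = entry["dn"].lower()  # stand-in for normalization
    return entry
```

With this shape, an entry that was put into the cache by the add operation gets its entrydn value on the first search instead of only after a restart.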

Comment 3 Andrey Ivanov 2010-08-30 17:48:10 UTC
Hi, Noriko,

thanks for your patch! Yes, I've seen Bug 578296, but I opened a new one since that is only part of the problem. Your patch does what is necessary to make the entrydn attribute appear; however, the second part of the bug is still here: the DN returned by ldapsearch is still non-normalized and becomes normalized only after a server restart:

ldapsearch -x  -b "dc=example,dc=com" entrydn
dn: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM
entrydn: uid=bug-normalizer,ou=payroll,dc=example,dc=com


And after the server restart the returned dn is normalised (service dirsrv restart):

dn: uid=bug-normalizer,ou=Payroll,dc=example,dc=com
entrydn: uid=bug-normalizer,ou=payroll,dc=example,dc=com

Comment 4 Andrey Ivanov 2010-08-30 17:52:25 UTC
What I mean is that the server's response should not depend on whether it has been restarted; it should be invariant.

Comment 5 Noriko Hosoi 2010-08-30 18:47:56 UTC
Actually, it's a bit of a tricky issue...  The DN normalizer (slapi_dn_normalize_ext) does not change the case.

For instance, if you add this test entry,
  dn: uid=bug3-normalizer, ou=p\61yroll, dc=eXAmple, dc=COM
  [..]
  cn: Bug3 Normalizer Test
  sn: Bug3 Normalizer
  uid: bug3-normalizer
You get this search result (before restarting the server).
  # bug3-normalizer, payroll, eXAmple.COM
  dn: uid=bug3-normalizer,ou=payroll,dc=eXAmple,dc=COM
  entrydn: uid=bug3-normalizer,ou=payroll,dc=example,dc=com
Once the server is restarted, the result becomes like this:
  # bug3-normalizer, Payroll, example.com
  dn: uid=bug3-normalizer,ou=Payroll,dc=example,dc=com
  entrydn: uid=bug3-normalizer,ou=payroll,dc=example,dc=com

The lower case payroll and mixed case eXAmple.COM come from the test entry's p\61yroll and dc=eXAmple, dc=COM, respectively.  Both are "normalized": '\61' is converted to 'a', and the space between "dc=eXAmple," and "dc=COM" is removed.
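
The behaviour described here can be mimicked with a small Python sketch. This is only a rough model of what the comment describes, not the real slapi_dn_normalize_ext: the function names are made up, it assumes the DN contains no escaped commas, and it decodes only hex escapes that stand for ASCII letters or digits.

```python
import re

_HEX_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{2})")


def _decode(match):
    # Decode a backslash-hex pair only when it is a plain ASCII letter or
    # digit; anything else (e.g. an escaped comma) is left untouched.
    ch = chr(int(match.group(1), 16))
    return ch if ch.isascii() and ch.isalnum() else match.group(0)


def normalize_dn(dn):
    """Trim spaces around RDNs and decode simple escapes; keep the case."""
    rdns = []
    for rdn in dn.split(","):  # assumption: no escaped commas in values
        attr, _, value = rdn.partition("=")
        rdns.append(attr.strip() + "=" + _HEX_ESCAPE.sub(_decode, value.strip()))
    return ",".join(rdns)


def entrydn_value(dn):
    """Model of the entrydn value: normalized and lowercased."""
    return normalize_dn(dn).lower()
```

Applied to the bug3 test DN, normalize_dn keeps eXAmple and COM as sent, while entrydn_value gives the all-lowercase form shown in the entrydn line.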

Once the server is restarted, the parent entries' RDNs are respected.  The DN of payroll happens to be "ou=Payroll" and that of dc=example "dc=example,dc=com"...

Perhaps we should return the fully lowercased (and normalized) DN as the search result?

Comment 6 Andrey Ivanov 2010-08-30 19:18:06 UTC
I don't think everything should be lowered - the values of the naming attributes should be preserved as is. And the server returns the DN absolutely correctly after a restart. So we just need to re-calculate the returned DN (when the entry is added to the cache) in the same way the server does it during a restart.
This sort of normalization is done, for example, by Active Directory (I've just tested). I think openldap does the same thing.


Another interesting observation is that moving the entry to another subtree works correctly, even if you write the name of the new subtree like "ou=paYROll,dc=eXAmple,dc=COM" - the final returned DN will be correct; that is, the new DN is recalculated during the subtree move.

Comment 7 Andrey Ivanov 2010-08-30 19:21:05 UTC
The DN calculation, of course, should be done only once, when the added entry goes into the DN or entry cache.

Comment 8 Noriko Hosoi 2010-08-30 20:16:11 UTC
(In reply to comment #6)
> I don't think everything should be lowered - the values of the naming attributes
> should be conserved as is. And the server gives the DN absolutely correctly
> after restart. So we need just to re-calculate the returned DN (when the entry
> is added to cache) in the same way the server does it during restart.

Well, that's going to be a big change... :)  Currently, I'm trying to reduce entryrdn index access as much as possible, since it involves disk I/O to traverse the DIT.  When an entry is added, the server receives the DN info (possibly with random cases like dc=eXAmple,dc=COM).  That is, we don't have to assemble the DN string from the entryrdn index.  And we don't.  We use the given DN after normalizing it.  That's the reason why you see the cases the user passed.  Please note that the DN is only temporarily stored in the entry cache in memory.  So, once the entry is evicted and retrieved from the db, you see the updated result (no more eXAmple.COM, e.g.).

Once you restart the server, the entry cache is reset.  The next search has to go to the entryrdn index to assemble the DN string, which shows the original RDNs stored in the index.
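
To make the before/after-restart difference concrete, here is a toy Python model of the two code paths described above. It is a sketch under simplifying assumptions, not the server's data structures: ToyBackend and its methods are hypothetical names, the "index" is a plain dict of (parent id, RDN) pairs, and a restart is modelled as clearing the cache.

```python
class ToyBackend:
    """Toy model: DN served from the cache, or rebuilt from an RDN index."""

    def __init__(self):
        self.rdn_index = {}    # entry id -> (parent id or None, RDN as stored)
        self.entry_cache = {}  # entry id -> DN string as cached

    def add(self, entry_id, parent_id, rdn, dn_as_sent):
        # Only the leaf RDN goes into the index; the cache keeps the whole
        # DN exactly as the client sent it (after normalization).
        self.rdn_index[entry_id] = (parent_id, rdn)
        self.entry_cache[entry_id] = dn_as_sent

    def dn_from_index(self, entry_id):
        # Walk up the tree, collecting each ancestor's stored RDN.
        parts = []
        while entry_id is not None:
            parent_id, rdn = self.rdn_index[entry_id]
            parts.append(rdn)
            entry_id = parent_id
        return ",".join(parts)

    def get_dn(self, entry_id):
        # Cache hit: cheap, but returns whatever case was cached.
        if entry_id not in self.entry_cache:
            self.entry_cache[entry_id] = self.dn_from_index(entry_id)
        return self.entry_cache[entry_id]

    def restart(self):
        self.entry_cache.clear()
```

In this model, get_dn on a freshly added entry returns the client's spelling of the parent components, and the same call after restart() returns the spelling stored with the parent entries.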

Compared with the entryrdn index access, lowering case is much less expensive.  I can alter the behaviour that way with minimum effort.

BTW, I tested your original entry with my openldap 2.4.23.  It behaves in the same way as 389 does.
1) add an entry 
2) search the entry
# bug-normalizer, paYROll, eXAmple.COM
dn: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM
entryDN: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM
3) restart the server, then search the entry
dn: uid=bug-normalizer,ou=Payroll,dc=example,dc=com
entryDN: uid=bug-normalizer,ou=Payroll,dc=example,dc=com

> This sort of normalization is done, for example, by Active Directory(i've just
> tested). I think openldap does the same thing.

> Another interesting notice is that moving the entry to another subtree works
> correctly, even if you put the name of the new subtree in a way like
> "ou=paYROll,dc=eXAmple,dc=COM" - the final returned DN will be correct, that is
> the new DN is recalculated during the entry subtree move.

Internally, the server also keeps the case-lowered, normalized RDN and uses it for its processing.  So, regardless of the representation, it works just fine.

Comment 9 Andrey Ivanov 2010-08-30 21:02:05 UTC
Yes, the reason why it happens this way is quite clear - to avoid the expensive (in disk I/O or entryrdn access) reconstruction of the DN.

If openldap behaves in the same way, I guess we should leave things as they are now.

In any case, lowering is not a good idea - right after adding the entry, the client will expect the retrieved entry's DN to be either exactly as it was in the added entry (the way it works now) or a "reconstructed" DN (which is what happens after the server restarts).

Comment 10 Noriko Hosoi 2010-08-31 00:41:53 UTC
Created attachment 442069 [details]
git patch file (master)

Description: Code for supporting entrydn (added for Bug 578296)
contained a bug.  If an entry was found in the entry cache,
id2entry_ext returned it without adding the entrydn attribute
value.  This patch fixes the problem.
In addition, if the parent DN in the to-be-added entry is not
identical to the real parent DN (e.g., dc=eXAmple vs. dc=example),
replace the string with the real parent DN.  This check & replace
is done only when the parent entry is in the entry cache, so as
not to sacrifice performance.
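
The check & replace step could look roughly like the following Python sketch. It is an illustration of the idea only, not the patch itself: fix_parent_dn is a hypothetical name, the cache is assumed to be a dict keyed by the lowercased parent DN, and the leaf RDN is assumed to contain no escaped comma.

```python
def fix_parent_dn(dn_as_sent, entry_cache):
    """Swap in the parent's real DN, but only if the parent is cached."""
    rdn, sep, parent = dn_as_sent.partition(",")
    if not sep:
        return dn_as_sent  # a suffix entry has no parent part to replace
    # Assumption: the cache maps a lowercased DN to the DN as really stored.
    real_parent = entry_cache.get(parent.lower())
    if real_parent is None:
        # Parent not cached: keep the client's spelling rather than pay
        # for an index lookup.
        return dn_as_sent
    return rdn + "," + real_parent
```

The cache-only lookup is the tradeoff: the DN is corrected for free when the parent is already in memory, and left as sent otherwise.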

Comment 11 Andrey Ivanov 2010-08-31 08:40:10 UTC
Thank you, Noriko! I think it's a very bright idea and a good tradeoff - to do the DN "path normalization" only if the parent is already in the cache. The client will probably make one or several searches at the parent level before adding a new entry, so the parent will almost certainly be in the cache.

Moreover, I've just tested your patch. It works in all cases, even if the entry add is the first operation after the server restart (that is, when the caches are still empty)! Not sure why that happens. Maybe during an entry addition the server code puts the entry's parent (or its DN) into the cache automatically?

There also seems to be a strange side effect of this patch: when using db2ldif (offline) I get an additional informational (or error?) line:

entrycache_clear_int: there are still 17 entries in the entry cache. :/

This message does not appear without your patch.


Here is the complete log:

[31/Aug/2010:10:18:45 +0200] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[31/Aug/2010:10:18:45 +0200] - check_and_set_import_cache: pagesize: 4096, pages: 240234, procpages: 47617
[31/Aug/2010:10:18:45 +0200] - WARNING: After allocating import cache 384372KB, the available memory is 576564KB, which is less than the soft limit 1048576KB. You may want to decrease the import cache size and rerun import.
[31/Aug/2010:10:18:45 +0200] - Import allocates 384372KB import cache.
[31/Aug/2010:10:18:45 +0200] - import userRoot: Beginning import job...
[31/Aug/2010:10:18:45 +0200] - import userRoot: Index buffering enabled with bucket size 100
[31/Aug/2010:10:18:45 +0200] - import userRoot: Processing file "/tmp/prod_base_current.ldif"
[31/Aug/2010:10:18:55 +0200] - import userRoot: Finished scanning file "/tmp/prod_base_current.ldif" (9527 entries)
[31/Aug/2010:10:18:56 +0200] - import userRoot: Workers finished; cleaning up...
[31/Aug/2010:10:18:56 +0200] - import userRoot: Workers cleaned up.
[31/Aug/2010:10:18:56 +0200] - import userRoot: Cleaning up producer thread...
[31/Aug/2010:10:18:56 +0200] - import userRoot: Indexing complete.  Post-processing...
[31/Aug/2010:10:18:57 +0200] - import userRoot: Flushing caches...
[31/Aug/2010:10:18:57 +0200] - import userRoot: Closing files...
[31/Aug/2010:10:18:57 +0200] - entrycache_clear_int: there are still 17 entries in the entry cache. :/
[31/Aug/2010:10:18:57 +0200] - All database threads now stopped
[31/Aug/2010:10:18:57 +0200] - import userRoot: Import complete.  Processed 9527 entries in 12 seconds. (793.92 entries/sec)

Comment 12 Noriko Hosoi 2010-08-31 17:12:55 UTC
Created attachment 442238 [details]
git patch file (master)

Good catch!  Thanks a lot, Andrey.  I forgot to return the found entries to the cache. :(

I'm attaching the revised code.
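
As an illustration of this kind of mistake, here is a toy Python model of a reference-counted cache. It is a sketch only and assumes the "still N entries in the entry cache" message comes from entries that were looked up but never handed back; ToyEntryCache, give_back and borrowed are hypothetical names, not the server's API.

```python
from contextlib import contextmanager


class ToyEntryCache:
    """Toy cache that counts how many callers still hold each entry."""

    def __init__(self):
        self._entries = {}
        self._refcnt = {}

    def add(self, key, entry):
        self._entries[key] = entry
        self._refcnt[key] = 0

    def find(self, key):
        # Every successful find must be paired with one give_back.
        if key not in self._entries:
            return None
        self._refcnt[key] += 1
        return self._entries[key]

    def give_back(self, key):
        self._refcnt[key] -= 1

    def clear(self):
        # Drop unreferenced entries; report how many are still held.
        busy = [k for k, n in self._refcnt.items() if n > 0]
        for key in list(self._entries):
            if self._refcnt[key] == 0:
                del self._entries[key]
                del self._refcnt[key]
        return len(busy)


@contextmanager
def borrowed(cache, key):
    """Look up an entry and always return it to the cache afterwards."""
    entry = cache.find(key)
    try:
        yield entry
    finally:
        if entry is not None:
            cache.give_back(key)
```

A find without the matching give_back leaves clear() reporting a leftover entry; wrapping the lookup in borrowed makes the return automatic.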

Comment 13 Noriko Hosoi 2010-08-31 17:41:05 UTC
Reviewed by Rich (Thanks!!!)

Pushed to master.
$ git push
Counting objects: 67, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (41/41), done.
Writing objects: 100% (41/41), 14.98 KiB, done.
Total 41 (delta 33), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   a2bcd81..cc36301  master -> master

Comment 14 Amita Sharma 2011-07-20 12:21:38 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=578296 is VERIFIED (related bug)
and some more testing :
=========================

[root@testvm slapd-testvm1]# ldapadd -x -h localhost -p 1389 -D "cn=Directory Manager" -w Secret123  << EOF
> dn: uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM
> changetype: add
> objectClass: top
> objectClass: person
> objectClass: organizationalPerson
> objectClass: inetOrgPerson
> cn: Bug Normalizer Test
> sn: Bug Normalizer
> uid: bug-normalizer
> EOF
adding new entry "uid=bug-normalizer,ou=paYROll,dc=eXAmple,dc=COM"


[root@testvm slapd-testvm1]# ldapsearch -x -h localhost -p 1389 -D "cn=Directory Manager" -w Secret123 -b "ou=Payroll,dc=example,dc=com" entrydn | grep uid=bug-normalizer
dn: uid=bug-normalizer,ou=Payroll,dc=example,dc=com
entrydn: uid=bug-normalizer,ou=payroll,dc=example,dc=com

