Description of problem:
If an index key has more than 255 IDs in a secondary index b-tree, db_verify reports that the b-tree is corrupted on a little-endian machine.

Sample output from db_verify (called from verify-db.pl) on RHEL4U4:

    Verify db/ebiRoot/cn.db4 ...
    DB ERROR: db_verify: Page 10: out-of-order key at entry 249
    DB ERROR: db_verify: DB->verify: db/ebiRoot/cn.db4: DB_VERIFY_BAD: Database verification failed
    Secondary index file cn.db4 in db/ebiRoot is corrupted.

This is the location where db_verify reports "out-of-order". (The number in [] is the internal entry id.) At [249], the data 000x010000 (bytes 00 01 00 00) looks smaller than the previous 0xff000000 (bytes ff 00 00 00), and db_verify complains about it.

    page 10: duplicate: LSN [0][1]: level 1
        prev: 0 next: 0 entries: 255 offset: 6152
        [000] 8184 len: 4 data: 0x07000000
        [001] 8176 len: 4 data: 0x08000000
        [002] 8168 len: 4 data: 0x09000000
        [...]
        [247] 6208 len: 4 data: 0xfe000000
        [248] 6200 len: 4 data: 0xff000000
        [249] 6192 len: 4 data: 000x010000
        [250] 6184 len: 4 data: 0x010x010000
        [251] 6176 len: 4 data: 0x020x010000

Following is the same key-value pair from cn.db4 dumped by dbscan. DS and dbscan internally prepare the byte-order converter and the comparison function, so they handle the IDs correctly. The db_verify utility has no knowledge of the ID format and uses the default comparison method, which fails to verify healthy index files...
+ 255 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261

For comparison, this is the same page dumped on a big-endian machine.

    page 10: duplicate level: 1 (lsn.file: 0 lsn.offset: 1)
        prev: 0 next: 0 entries: 255 offset: 6152
        [000] 8184 len: 4 data: 0x000x000x000x07
        [001] 8176 len: 4 data: 0x000x000x000x08
        [002] 8168 len: 4 data: 0x000x000x000x09
        [...]
        [246] 6216 len: 4 data: 0x000x000x000xfd
        [247] 6208 len: 4 data: 0x000x000x000xfe
        [248] 6200 len: 4 data: 0x000x000x000xff
        [249] 6192 len: 4 data: 0x000x000x010x00
        [250] 6184 len: 4 data: 0x000x000x010x01
        [251] 6176 len: 4 data: 0x000x000x010x02

The data are in ascending order, and db_verify does not complain about this index.

ToDo: Instead of using the system db_verify, we have to implement a verify utility that knows the ID format.
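To make the failure mode concrete, here is a minimal sketch of why 255 followed by 256 looks out of order. It assumes the default comparison is a plain byte-wise (lexical) one and that IDs are 4-byte values stored in host order on a little-endian machine, as the dumps above suggest. The function names are hypothetical and are not taken from the DS or Berkeley DB sources.

```c
#include <stdint.h>
#include <string.h>

/* Lay out a 32-bit entry ID the way a little-endian host stores it. */
static void id_to_le_bytes(uint32_t id, unsigned char out[4])
{
    out[0] = (unsigned char)(id & 0xff);
    out[1] = (unsigned char)((id >> 8) & 0xff);
    out[2] = (unsigned char)((id >> 16) & 0xff);
    out[3] = (unsigned char)((id >> 24) & 0xff);
}

/* Decode the little-endian bytes back into a number. */
static uint32_t le_bytes_to_id(const unsigned char in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Byte-wise comparison: what a verifier without ID knowledge would do
 * (assumption: the default comparison is lexical). */
static int cmp_lexical(const unsigned char a[4], const unsigned char b[4])
{
    return memcmp(a, b, 4);
}

/* ID-aware comparison: convert the byte order first, then compare as numbers. */
static int cmp_as_id(const unsigned char a[4], const unsigned char b[4])
{
    uint32_t x = le_bytes_to_id(a);
    uint32_t y = le_bytes_to_id(b);
    return (x > y) - (x < y);
}
```

With ID 255 stored as ff 00 00 00 and ID 256 stored as 00 01 00 00, cmp_lexical() reports that 255 sorts after 256, which matches the "out-of-order key at entry 249" complaint, while cmp_as_id() orders them correctly.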
Created attachment 189281 [details]
New File: ldapserver/ldap/servers/slapd/back-ldbm/dbverify.c

Implemented a dbverify function which calls Berkeley DB's DB->verify on each database file (.db# file). It works in command-line mode via the dbverify script (being attached next).

I also implemented and tested the task mode, but the DB->verify function does not like the way the DS opens the DB environment:

    [05/Sep/2007:18:41:16 -0700] - libdb: DB->verify may not be used with transactions, logging, or locking

If we cannot share the DB environment, there is not much advantage in implementing it as a task, I think. Thus, I removed the task mode for now.
Created attachment 189291 [details]
New File: ldapserver/ldap/admin/src/scripts/template-dbverify.in

This is a template file to be instantiated when a server instance is created.

Usage: dbverify [-n backend_instance] [-V]
Note : if "-n backend_instance" is not passed, verify all DBs.
  -V : verbose

Sample usages:

    [nhosoi@laputa slapd-laputa2]$ dbverify
    DB verify: Passed

    $ dbverify -n userRoot
    DB verify: Passed

    $ dbverify -n bogus
    DB verify: Failed

    $ dbverify -V
    [06/Sep/2007:15:06:52 -0700] DB verify - /export/servers/ds72/var/lib/dirsrv/slapd-laputa2/db/userRoot/sn.db4: ok
    [06/Sep/2007:15:06:52 -0700] DB verify - /export/servers/ds72/var/lib/dirsrv/slapd-laputa2/db/userRoot/parentid.db4: ok
    [...]
    [06/Sep/2007:15:06:52 -0700] DB verify - /export/servers/ds72/var/lib/dirsrv/slapd-laputa2/db/userRoot/id2entry.db4: ok
    DB verify: Passed

    $ dbverify -V
    [06/Sep/2007:15:08:05 -0700] DB verify - /export/servers/ds72/var/lib/dirsrv/slapd-laputa3/db/userRoot/parentid.db4: ok
    [...]
    [06/Sep/2007:15:08:05 -0700] - libdb: Page 0: nonsensical bt_minkey value 1 on metadata page
    [06/Sep/2007:15:08:05 -0700] - libdb: Page 0: nonsensical root page 0 on metadata page
    [06/Sep/2007:15:08:05 -0700] DB verify - verify failed(-30976): /export/servers/ds72/var/lib/dirsrv/slapd-laputa3/db/userRoot/cn.db4
    [...]
    [06/Sep/2007:15:08:05 -0700] DB verify - /export/servers/ds72/var/lib/dirsrv/slapd-laputa3/db/userRoot/id2entry.db4: ok
    DB verify: Failed
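For illustration only, the option handling described by the usage line could be sketched in C with POSIX getopt as below. This is not the attached template (which is a script); the struct and function names are hypothetical. The one behavior it is meant to show is that a missing -n leaves the instance unset, meaning "verify all DBs".

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Hypothetical container for the two documented options. */
struct dbverify_opts {
    const char *instance; /* NULL means: verify every backend instance */
    int verbose;          /* set by -V */
};

/* Parse "[-n backend_instance] [-V]". Returns 0 on success, -1 on a bad option. */
static int parse_dbverify_args(int argc, char *const argv[],
                               struct dbverify_opts *opts)
{
    int c;

    opts->instance = NULL;
    opts->verbose = 0;
    optind = 1; /* restart scanning so the function can be called more than once */
    opterr = 0; /* let the caller decide how to report errors */

    while ((c = getopt(argc, argv, "n:V")) != -1) {
        switch (c) {
        case 'n':
            opts->instance = optarg;
            break;
        case 'V':
            opts->verbose = 1;
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

A caller would then loop over all backend instances when opts.instance is NULL, and over the single named instance otherwise.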
Created attachment 189301 [details]
cvs diffs

Modified Files:
    Makefile.am
    ldap/admin/src/scripts/template-verify-db.pl.in
    ldap/servers/slapd/main.c
    ldap/servers/slapd/pblock.c
    ldap/servers/slapd/slap.h
    ldap/servers/slapd/slapi-private.h
    ldap/servers/slapd/task.c
    ldap/servers/slapd/back-ldbm/dblayer.c
    ldap/servers/slapd/back-ldbm/init.c
    ldap/servers/slapd/back-ldbm/proto-back-ldbm.h

Descriptions:
1) adding dbverify (template-dbverify.in)
2) adding dbverify.c
3) updating verify-db.pl to call dbverify instead of db_verify from Berkeley DB
4) updating main.c/pblock.c/slap.h/slapi-private.h to make dbverify mode available
5) fixing a minor memory leak (task.c) and a mode confusion (dblayer.c)

Sample usages:

    $ verify-db.pl
    *****************************************************************
    verify-db: This tool should only be run if recovery start fails
    and the server is down. If you run this tool while the server is
    running, you may get false reports of corrupted files or other
    false errors.
    *****************************************************************
    Verify log files in /export/servers/ds72/var/lib/dirsrv/slapd-laputa2/db ... Good

    $ verify-db.pl
    *****************************************************************
    verify-db: This tool should only be run if recovery start fails
    and the server is down. If you run this tool while the server is
    running, you may get false reports of corrupted files or other
    false errors.
    *****************************************************************
    Verify log files in /export/servers/ds72/var/lib/dirsrv/slapd-laputa3/db ... Good
    [06/Sep/2007:15:11:18 -0700] - libdb: Page 0: nonsensical bt_minkey value 1 on metadata page
    [06/Sep/2007:15:11:18 -0700] - libdb: Page 0: nonsensical root page 0 on metadata page
    [06/Sep/2007:15:11:18 -0700] DB verify - verify failed(-30976): /export/servers/ds72/var/lib/dirsrv/slapd-laputa3/db/userRoot/cn.db4
    Found the index file(s) was corrupted
    Please run db2index on the corrupted index
+ /* check for slapi v2 support */
+ if (! SLAPI_PLUGIN_IS_V2(backend_plugin)) {
+     LDAPDebug(LDAP_DEBUG_ANY, "ERROR: %s is too old to do imports.\n",
+               backend_plugin->plg_name, 0, 0);
+     exit(1);
+ }

Should this be "to do dbverify"?

+ IFP plg_un_db_verify; /* verify db files */

If you can, put this new struct member at the end of the structure to minimize potential binary compatibility problems.
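A small sketch of the binary compatibility concern, using made-up structs rather than the real plugin structure: the IFP typedef here is a stand-in, and the member names are hypothetical. Code compiled against the old layout keeps using the old offsets, so a member inserted in the middle shifts everything after it, while a member appended at the end leaves the existing offsets alone.

```c
#include <stddef.h>

/* Stand-in for the plugin function-pointer type (assumption for this sketch). */
typedef int (*IFP)(void);

/* Layout that already-built binaries were compiled against. */
struct plugin_old {
    IFP ldif2db;
    IFP db2ldif;
    IFP db2index;
};

/* New member inserted in the middle: db2ldif and db2index move. */
struct plugin_inserted {
    IFP ldif2db;
    IFP db_verify;
    IFP db2ldif;
    IFP db2index;
};

/* New member appended at the end: existing members keep their offsets. */
struct plugin_appended {
    IFP ldif2db;
    IFP db2ldif;
    IFP db2index;
    IFP db_verify;
};

/* Returns 1 if old code reading db2index would still find it in the appended layout. */
static int appended_layout_is_compatible(void)
{
    return offsetof(struct plugin_appended, db2index) ==
           offsetof(struct plugin_old, db2index);
}

/* Returns 1 if old code reading db2index would still find it in the inserted layout. */
static int inserted_layout_is_compatible(void)
{
    return offsetof(struct plugin_inserted, db2index) ==
           offsetof(struct plugin_old, db2index);
}
```

Here appended_layout_is_compatible() returns 1 and inserted_layout_is_compatible() returns 0. Appending still changes the struct size, so this only helps code that accesses existing members through a pointer.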
Created attachment 190451 [details]
cvs commit message (comment #1-#4)

(In reply to comment #4)
> + /* check for slapi v2 support */
> + if (! SLAPI_PLUGIN_IS_V2(backend_plugin)) {
> +     LDAPDebug(LDAP_DEBUG_ANY, "ERROR: %s is too old to do imports.\n",
> +               backend_plugin->plg_name, 0, 0);
> +     exit(1);
> + }
>
> Should this be "to do dbverify"?
>
> + IFP plg_un_db_verify; /* verify db files */
>
> If you can, put this new struct member at the end of the structure to minimize
> potential binary compatibility problems.

Thank you, Rich, for your reviews and comments. I applied your changes and checked them in to HEAD.
It turned out there are 2 issues behind this problem:

1. These 2 packages need to be installed on Solaris:

    HATdb4x-4.2.52-7.1.sparcv9.pkg
    RHATdb4x-utils-4.2.52-7.1.sparcv9.pkg

Proudfoot did not have utils:

    bash-2.05# pkginfo | egrep RHATdb4x
    application RHATdb4x The Berkeley DB database library (version 4) for C. [64-bit]

I installed the utils package:

    $ pkgadd -d ./RHATdb4x-utils-4.2.52-7.1.sparcv9.pkg
    ## Installing part 1 of 1.
    /usr/bin/sparcv9/berkeley_db_svc
    /usr/bin/sparcv9/db_archive
    /usr/bin/sparcv9/db_checkpoint
    /usr/bin/sparcv9/db_deadlock
    /usr/bin/sparcv9/db_dump
    /usr/bin/sparcv9/db_dump185
    /usr/bin/sparcv9/db_load
    /usr/bin/sparcv9/db_printlog
    /usr/bin/sparcv9/db_recover
    /usr/bin/sparcv9/db_stat
    /usr/bin/sparcv9/db_upgrade
    /usr/bin/sparcv9/db_verify
    [ verifying class <none> ]

2. verify-db.pl sets PATH and LD_LIBRARY_PATH as follows to pick up db_printlog and dbverify.

    my $prefix = "";
    $ENV{'PATH'} = "$prefix/usr/bin/sparcv9:$prefix/usr/bin:/usr/bin/sparcv9:/usr/bin";
    $ENV{'LD_LIBRARY_PATH'} = "/usr/lib/sparcv9:/usr/lib/sparcv9";
    $ENV{'SHLIB_PATH'} = "/usr/lib/sparcv9:/usr/lib/sparcv9";

dbverify is in the same directory as verify-db.pl, but it was not picked up because '.' was not in PATH. We need to change the PATH as follows:

    diff verify-db.pl.orig verify-db.pl
    172c172
    < $ENV{'PATH'} = "$prefix/usr/bin/sparcv9:$prefix/usr/bin:/usr/bin/sparcv9:/usr/bin";
    ---
    > $ENV{'PATH'} = ".:$prefix/usr/bin/sparcv9:$prefix/usr/bin:/usr/bin/sparcv9:/usr/bin";

With the above 2 changes, it works fine on Solaris:

    ./verify-db.pl
    *****************************************************************
    verify-db: This tool should only be run if recovery start fails
    and the server is down. If you run this tool while the server is
    running, you may get false reports of corrupted files or other
    false errors.
    *****************************************************************
    Verify log files in /export/ds80/db ... Good
    Verify db files ... Good
verify-db.pl is fixed. The bug is 367671 "verify-db.pl : can't find dbverify". Regarding the package dependency on RHATdb4x-utils, could you open another bug, if necessary? Modifying the status to "MODIFIED" for the bug verification.