Bug 431086

Summary: Fix database inconsistencies as reported by sanitycheck.cgi before final release
Product: [Community] Bugzilla Reporter: David Lawrence <dkl>
Component: DatabaseAssignee: Simon Green <sgreen>
Status: CLOSED WONTFIX QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: 3.6CC: chandankumar2014, ebaak, kbaker, sgreen
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.18 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-19 04:06:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 406071, 427052    
Attachments:
Description Flags
possible solutions of fixing database inconsistencies, reported by sanitcheck
none
possible solutions of fixing database inconsistencies, reported by sanitcheck, commented by dkl
none
clean up inconsistent data, reported by sanitycheck report
none
inconsistent data cleanup
none
delete invalid logins from profiles table
dkl: review+
clean up inconsistent data, reported by general_bug_check.orig file
dkl: review+
clean up inconsistent data, reported by general_bug_check.orig file
none
File containing recent sanity check errors (2008-11-25) none

Description David Lawrence 2008-01-31 17:45:41 UTC
Sanitycheck.cgi shows numerous inconsistencies in the Bugzilla database after
migration from 2.18 to 3.0 schema. We should make best effort to clear as much 
of these as possible before the final release.

http://bugdev.devel.redhat.com/bugzilla/sanitycheck.cgi?rebuildkeywordcache=1&remove_invalid_flags=1

We could break this bug up into smaller pieces as needed and use this as a top
level tracker.

Comment 1 Kevin Baker 2008-04-15 02:46:30 UTC
Tony,

look into the sanity check report and see what can be fixed. 

Comment 2 Tony Fu 2008-04-18 05:52:19 UTC
Dave,

I have examined our current sanitycheck report, regarding to foreign key
checking.  I put my solutions of fixing the inconsistent data in attached file.
 Please take a look at it (looking for "XXX tfu" for my comment).  If you are
happy with them, I will write a script to update bugzilla DB.


Thanks,
Tony

Comment 3 Tony Fu 2008-04-18 05:53:39 UTC
Created attachment 302838 [details]
possible solutions of fixing database inconsistencies, reported by sanitcheck

Comment 4 David Lawrence 2008-04-22 16:15:37 UTC
Created attachment 303334 [details]
possible solutions of fixing database inconsistencies, reported by sanitcheck, commented by dkl

Tony, I am attaching another copy of your report with my comments inside. One
thing of note is that alot of the changes that you mention I have already
included in the redhat/db_fixup.sql script that I use when converting the
database from 2.18 to 3.2. Please take the bits that are useful from that
script and create another script that will do the cleanups on the live
database. If you do use the SQL from db_fixup.sql in your cleanup script,
please remove the section from db_fixup.sql and commit the changes back to CVS.


Thanks
Dave

Comment 5 Tony Fu 2008-04-23 05:57:59 UTC
Dave,

Thanks for your comments.  I will write a scrip to clean up the data, according
to your comments.


Thanks,
Tony

Comment 6 Tony Fu 2008-04-24 07:15:18 UTC
Dave,

Please review attached data cleanup script.  I have tested it against my test
database that only includes a subset of live data.  If it is possible, I would
like to test it again bugs on bz-db1-test.devel.

Also I commented out the sql queries that have been used in my script from
db-fixup.sql file.

Thanks,
Tony

Comment 7 Tony Fu 2008-04-24 07:17:26 UTC
Created attachment 303599 [details]
clean up inconsistent data, reported by sanitycheck report

Comment 8 David Lawrence 2008-04-24 19:16:54 UTC
Comment on attachment 303599 [details]
clean up inconsistent data, reported by sanitycheck report


>#########################################################
>#   data clean up activity #14:
>#       change resolution value '' to NULL if status is
>#       not "CLOSED";
>#       change it to 'CURRENTLEASE' if the status is 'CLOSED'
>##########################################################
>$dbh->do(
>    "UPDATE bugs SET resolution=NULL
>           WHERE resolution=''
>           AND bug_status!='CLOSED'"
>);
>$dbh->do(
>    "UPDATE bugs SET resolution='CURRENTRELEASE'
>           WHERE resolution=''
>           AND bug_status='CLOSED'"
>);
>print "records whith invalid bugs.resolution value('') have been deleted\n";

I think we may need to add a bugs_activity entry for the second one. When we
are changing nothing to something or something to something then we should
probably add an activity entry. So for '' --> 'CURRENTRELEASE', we add an
activity entry, but for '' --> NULL (essentially same thing) we don't need an
activity entry. 

Also do the same for the bug.assigned_to fix in data clean up activity #1. For
that just do 'UNKNOWN' --> 'bugzilla'. Similar to how you did in
similar previous scripts such as redhat/cleanup_reporter.pl.

>#########################################################
>#   data clean up activity #19:
>#       change bugs.rep_platform to "i386" if its current
>#       value is "PC", which is not a valid value any more
>##########################################################
>$dbh->do(" UPDATE bugs SET rep_platform='i386' WHERE rep_platform='PC' ");
>print "changed bugs.rep_platform value PC to i386\n";

I think we should also have a bugs_activity entry for this one as well for any
bugs fixed.

Everything looks good Tony. Nice work. Add the above and then test on
bz-db1-test.devel.redhat.com. Then re-run sanitycheck.cgi after to make sure
the errors disappear as expected. If all goes well, we will run this on the
live database next week.

Dave

Comment 9 Tony Fu 2008-05-01 06:14:56 UTC
Dave,

I have run data cleanup script against bugs2 db on bz-db1-test and corrected
some minor errors of the script.  The new sanitycheck report looks good.

I attached the new data cleanup script here and committed it to
rh_bugzilla_3/redhat as well.


Tony

Comment 10 Tony Fu 2008-05-01 06:17:11 UTC
Created attachment 304295 [details]
inconsistent data cleanup

Comment 11 David Lawrence 2008-05-01 15:25:52 UTC
Tony looks good. I am having this execute on live server today so we can close
this now.

Dave

Comment 12 David Lawrence 2008-05-06 04:00:09 UTC
Reopening this bug. 

Lets go ahead and use this bug to track the final cleanup tasks left from
sanitycheck.cgi. The following errors still appear in the report from
bz-db1-test.devel.redhat.com

Checking profile logins.
Bad profile email address, id=126466, ...

There are many lines reporting bad profiles.login_name values that do not match
proper email format. We need to take each one of these and delete it from
profiles if it is not linked to any bugs, components, cc, bugs_activity, etc.

Checking resolution/duplicates
Bugs found on duplicates table that are not marked duplicate: ...

These should be simply removed from the duplicates table.

Bugs found marked resolved duplicate and not on duplicates table: ...

These should be added to the duplicates table.

Bugs with open status and a resolution: ...

For these we should just clear the resolution.

Checking for bugs with groups violating their product's group controls
Have groups not permitted for their products: 123596, 123772, 123775, 206136,
432077, 437874 (as buglist).

For these we will just click the link "Permit the missing groups for the
affected products (set member control to <code>SHOWN</code>)." and have Bugzilla
fix it so the groups are added to the products from 6 bugs.

Lastly, there are numerous bugs that have email that has not been sent yet.
Leave these alone for now.

Thanks
Dave

Comment 13 Tony Fu 2008-05-14 06:13:54 UTC
Dave,

Please see attached script, which checks invalid login name (using emailregexp
param in Bugzilla data/params file), then if the invalid login is not used by
any other tables (using reference tables used by sanitycheck) delete it from
profiles.

After running it against a set of data copied from bz-db1-test, there were 29
invalid logins left.  Here is the list:

User 0 (userid:22991) is used in bugs.assigned_to.  Can't delete 0 from profiles
table
User stgyuri (userid:33082) is used in bugs.reporter.  Can't delete stgyuri from
profiles table
User fossy (userid:79220) is used in namedqueries.userid.  Can't delete fossy
from profiles table
User ctolley (userid:27211) is used in bugs.reporter.  Can't delete ctolley from
profiles table
User monnier+lists/redhat/bugs.yale.edu (userid:2740) is used in
bugs.reporter.  Can't delete monnier+lists/redhat/bugs.yale.edu from
profiles table
User timothy.covell. (userid:107768) is used in bugs.reporter. 
Can't delete timothy.covell. from profiles table
User iac2 (userid:2144) is used in bugs.reporter.  Can't delete iac2 from
profiles table
User gigi%c-12-232.fx.ro (userid:152198) is used in bugs.reporter.  Can't
delete gigi%c-12-232.fx.ro from profiles table
User lkillo (userid:75874) is used in cc.who.  Can't delete lkillo from profiles
table
User mhaag (userid:130068) is used in bugs.reporter.  Can't delete mhaag from
profiles table
User mb/redhat.ac.uk (userid:46092) is used in bugs.reporter.  Can't
delete mb/redhat.ac.uk from profiles table
User brentrbrian (userid:3158) is used in bugs.reporter.  Can't delete
brentrbrian from profiles table
User matt#bugzill.redhat (userid:141726) is used in
bugs_activity.who.  Can't delete matt#bugzill.redhat from
profiles table
User aegean (userid:134827) is used in longdescs.who.  Can't delete aegean from
profiles table
User callihn (userid:125548) is used in attachments.submitter_id.  Can't delete
callihn from profiles table
User blackrat (userid:43784) is used in namedqueries.userid.  Can't delete
blackrat from profiles table
User dbolcioni (userid:27150) is used in bugs.reporter.  Can't delete dbolcioni
from profiles table
User http://beta.redhat.com/beta@redhat.com (userid:96506) is used in
user_group_map.user_id.  Can't delete http://beta.redhat.com/beta@redhat.com
from profiles table
User ccntmnky (userid:27008) is used in namedqueries.userid.  Can't delete
ccntmnky from profiles table
User disabled (userid:2735) is used in bugs.reporter.  Can't delete disabled
from profiles table
User egalanos.edu.au. (userid:9004) is used in bugs.reporter.  Can't
delete egalanos.edu.au. from profiles table
User bagepants (userid:32314) is used in bugs.reporter.  Can't delete bagepants
from profiles table
User jmss (userid:39331) is used in cc.who.  Can't delete jmss from profiles table
User matt#bugzilla.redhat (userid:11484) is used in
bugs.reporter.  Can't delete matt#bugzilla.redhat from profiles
table
User jrlflah (userid:29490) is used in bugs.reporter.  Can't delete jrlflah from
profiles table
User seppanen. (userid:26830) is used in bugs.reporter.  Can't
delete seppanen. from profiles table
User anonymous (userid:4424) is used in longdescs.who.  Can't delete anonymous
from profiles table
User albumen (userid:3869) is used in longdescs.who.  Can't delete albumen from
profiles table
User dewet (userid:21607) is used in bugs.reporter.  Can't delete dewet from
profiles table

Thanks,
Tony

Comment 14 Tony Fu 2008-05-14 06:15:24 UTC
Created attachment 305333 [details]
delete invalid logins from profiles table

Comment 15 David Lawrence 2008-05-15 04:01:18 UTC
Comment on attachment 305333 [details]
delete invalid logins from profiles table

Script looks good. 

Couple of suggestons.
1. entries that are in the namedqueries table only can go ahead and delete from
namedqueries and profiles.
2. entries only in cc table can go ahead and delete as well. Should we add
activity entry?
3. Entries in longdescs.who can be changed bugzilla.
4. Entries in bugs.reporter can also be changed to bugzilla. Maybe
we also should do a activity entry for this as well?
5. Just delete the one attachment with the invalid submitter id.

Please list the bug id in the output lines as well. Invalid login names that
are bugs.assigned_to can be reassigned to bugzilla or other using
the UI.

Thanks
Dave

Comment 16 Tony Fu 2008-05-16 06:40:20 UTC
Dave,

I attached a script to clean up inconsistent data reported in
bug_general_check.orig.  Basically, it does three things

1. Some bugs have open status, but also have a resolution.  change these bugs'
resolution to ''(or we can change them to NULL)
2. Some bugs have closed status, but don't have resolution value.  Change these
bugs' resolution to 'CURRENTRELEASE' (I am not sure if the 'CURRENTRELEASE' is
the right resolution value for these bugs)
3. Some bugs are not marked as duplicate in bugs table, but there are in
duplicates table.  Remove these bugs from duplicates table.

Also bug_general_check.orig includes another issue: some bugs are marked as
duplicate, but no entries in duplicates table.  It is a little bit hard to fix,
because we need to know the value of duplicates.dupe_of if we want to insert
entries into duplicates table.  We can manually look into all this kind of bugs
(around 450 bugs), and find dupe_of information from their comments, but it
seems pretty hard to automate this process.  

Tony


Tony

Comment 17 Tony Fu 2008-05-16 06:42:52 UTC
Created attachment 305640 [details]
clean up inconsistent data, reported by general_bug_check.orig file

Comment 18 Tony Fu 2008-05-16 07:06:34 UTC
To sum up the database inconsistent data cleaning up actions:

Sanitycheck generates 8 files to report inconsistent data
1.  foreign_keys.orig: all inconsistent data reported in this file have been
fixed, except some bugs.resolution values are '' (empty string).  It is caused
by the bugzilla code to insert '' as default resolution when reopening a closed
bug.  Because there is the known issue about possible errors when using NULL as
default value for text field, we may need to change resolution value from NULL
to '' in the future.  Now, we may just ignore this error reported in
foreign_keys.orig file.

2. profile_loginnames.orig: some login_names are not invalid email addresses.  I
 have developed a script to fix them, but there are a few issues Dave pointed
out in earlier comments

3. general_bug_check.orig: four kinds of inconsistent data are reported there.
a. Some bugs have open status, but also have a resolution.
b. Some bugs have closed status, but don't have resolution value
c. Some bugs are not marked as duplicate in bugs table, but there are in
duplicates table.
d. some bugs are marked as duplicate, but no entries in duplicates table
I wrote a script to fix the first three problems, but the last one needs further
decision.
4. control_values.orig: can be fixed using web UI.
5. unsent_emails.orig: haven't reached a solution for this issue yet

6. bugs_components.orig
7. date_check.orig
8. keyword_cache.orig
no inconsistent data reported by file 6, 7 and 8


Tony

Comment 19 David Lawrence 2008-05-19 19:02:46 UTC
Comment on attachment 305640 [details]
clean up inconsistent data, reported by general_bug_check.orig file

>my $open_status
>    = "'NEW', 'REOPENED', 'ASSIGNED', 'NEEDINFO', 'NEEDINFO_REPORTER',
>            'MODIFIED', 'ON_DEV', 'VERIFIED',
>            'ON_QA', 'FAILS_QA', 'RELEASE_PENDING',
>            'POST'";
>my $check_sql_query = "SELECT bug_id, resolution FROM bugs"
>    . " WHERE bug_status IN ($open_status)"
>    . " AND (resolution !='' AND resolution IS NOT NULL)";
>

Nitpick:
May be easier to just look for bugs.bug_status != 'CLOSED' since
our current bugzilla considers anything not closed to be open state.
But for other Bugzillas this could be different.


>#########################################################
>#   TODO: data clean up activity 4:
>#    some bugs found on duplicates table that are not
>#    marked duplicate in bugs table
>#    delete these bugs from duplicates table
>#########################################################

Nitpick:
Description same as activity 3.

Will discuss implementation of activity 4 in the bug comments and create a new
script specifically for that task.

Otherwise looks good Tony. Please run this against bz-db1-test/bugs and if all
goes well please send email to schick/meethune asking for the script to be ran
on the test database on partner-bugzilla.redhat.com as well.

Thanks
Dave

Comment 20 David Lawrence 2008-07-31 16:46:05 UTC
Added additional line to redhat/db_fixup.sql script that fixes the resolutions
table foreign key errors reported by sanitycheck.cgi.

--- Add the empty string resolution to the resolutions table to allow the
foreign keys sanitycheck to work properly.
 INSERT INTO resolution (value, isactive, sortkey) VALUES ('', 1, 0);


Comment 21 Tony Fu 2008-08-01 03:27:18 UTC
Dave,

I ran sanitycheck.pl on bz-web2-test.devel.redhat.com and found it still
reported a lot of inconsistent data between bugs.resolution and bugs.status,
like bug has closed status but no resolution, as well as invalid login names. 
It seems that we haven't run attached cleanup_invalid_login.pl and
cleanup_general_bug_check_data.pl against production db.

Do you think we should run them before the upgrade, or we can run them after the
upgrade?


Tony

Comment 22 David Lawrence 2008-08-01 04:25:52 UTC
(In reply to comment #21)
> Dave,
> 
> I ran sanitycheck.pl on bz-web2-test.devel.redhat.com and found it still
> reported a lot of inconsistent data between bugs.resolution and bugs.status,
> like bug has closed status but no resolution, as well as invalid login names. 
> It seems that we haven't run attached cleanup_invalid_login.pl and
> cleanup_general_bug_check_data.pl against production db.
> 
> Do you think we should run them before the upgrade, or we can run them after the
> upgrade?
> 
> 
> Tony

Cool Tony. I say we should run these during the upgrade on Saturday while we are
down.

Please add the scripts to the DatabaseMigration Wiki page so that we have a
reminder that they need to be ran before we bring the system back up.

https://engineering.redhat.com/trac/bugzilla-3.0-rh/wiki/DbMigrationProcess

Thanks 
Dave

Comment 23 Tony Fu 2008-08-01 07:25:03 UTC
Created attachment 313167 [details]
clean up inconsistent data, reported by general_bug_check.orig file

Comment 24 David Lawrence 2008-08-01 16:55:16 UTC
Tony, I went ahead and had eng-sysadmin run these today (friday) before the
migration so one less thing for us to worry about. I used the one you attached
previously since it had 'fieldid' for fielddefs instead.

Dave

Comment 25 David Lawrence 2008-08-01 16:56:12 UTC
We will leave this open until after the migration so we can rerun the
sanitycheck.cgi to see what is left.

Dave

Comment 26 David Lawrence 2008-11-25 22:31:06 UTC
Created attachment 324678 [details]
File containing recent sanity check errors (2008-11-25)

We need to fix the attached errors when possible to finish cleaning up the sanitycheck output.

Comment 27 David Lawrence 2010-01-15 16:55:36 UTC
Red Hat Bugzilla is now using version 3.4 of the Bugzilla codebase and
therefore this bug will need to be re-verified against the new release. With
the updated code this bug may no longer be relevant or may have been fixed in
the new code. Updating bug version to 3.4.

Comment 28 David Lawrence 2010-08-25 21:43:46 UTC
Red Hat has now upgraded to Bugzilla 3.6 and this bug will now be reassigned to that version. It would be helpful to the Bugzilla Development Team if this bug is verified to still be an issue with the latest version. If it is no longer an issue, then feel free to close, otherwise please comment that it is still a problem and we will try to address the issue as soon as we can.

Thanks
Bugzilla Development Team