Bug 923448 - pulp-manage-db fails for 0005_rpm_changelog_files
Summary: pulp-manage-db fails for 0005_rpm_changelog_files
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Pulp
Classification: Retired
Component: rpm-support
Version: 2.1 Beta
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 2.1.0
Assignee: Michael Hrivnak
QA Contact: Preethi Thomas
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-03-19 21:10 UTC by Steven Roberts
Modified: 2013-04-08 16:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-08 16:03:06 UTC
Embargoed:


Attachments
pulp-manage-db output (7.92 KB, text/plain), 2013-03-19 21:23 UTC, Steven Roberts
script that reproduces bug (6.24 KB, text/x-python), 2013-03-19 21:32 UTC, Michael Hrivnak

Description Steven Roberts 2013-03-19 21:10:58 UTC
Description of problem:
Updated the RPMs to 2.1.0-0.23.beta and ran pulp-manage-db. The 0005_rpm_changelog_files migration fails.

Output of the command:
===============================================

    Applying pulp_rpm.migrations version 1
    Migration to pulp_rpm.migrations version 1 complete.
    Applying pulp_rpm.migrations version 2
    Migration to pulp_rpm.migrations version 2 complete.
    Applying pulp_rpm.migrations version 3
    Migration to pulp_rpm.migrations version 3 complete.
    Applying pulp_rpm.migrations version 4
    Migration to pulp_rpm.migrations version 4 complete.
    Applying pulp_rpm.migrations version 5
    Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed. See log for details.
    2013-03-19 11:26:51,340 db:CRITICAL: Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed.
    2013-03-19 11:26:51,340 db:CRITICAL: strings in documents must be valid UTF-8
    2013-03-19 11:26:51,460 db:CRITICAL: Traceback (most recent call last):
    File "/usr/lib/python2.6/site-packages/pulp/server/db/manage.py", line 79, in migrate_database
    update_current_version=not options.test)
    File "/usr/lib/python2.6/site-packages/pulp/server/db/migrate/models.py", line 161, in apply_migration
    migration.migrate()
    File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 58, in migrate
    _migrate_rpm_unit_changelog_files()
    File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 48, in _migrate_rpm_unit_changelog_files
    collection.save(rpm_unit, safe=True)
    File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
    File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 237, in save
    manipulate, safe, _check_keys=True, **kwargs)
    File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
    File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 411, in update
    _check_keys, self.__uuid_subtype), safe)
    InvalidStringData: strings in documents must be valid UTF-8
     
    Database migrations complete.
    Loading content types.
===============================================

Comment 1 Steven Roberts 2013-03-19 21:14:06 UTC
After some discussion I ran "strace -e trace=stat" to see which RPM was being processed, and it looks to be man-pages-da-0.1.1-12.1.1.noarch.rpm.

I also added an exception-handling snippet from IRC. Here is the object dump:
Beginning database migrations.
Migration package pulp.server.db.migrations is up to date at version 4
Migration package pulp_puppet.plugins.migrations is up to date at version 0
Applying pulp_rpm.migrations version 5
SON([(u'requires', []), (u'_storage_path', u'/var/lib/pulp/content/rpm/.//man-pages-da/0.1.1/12.1.1/noarch/63f1445df3e2680413e410b0409268574df47157/man-pages-da-0.1.1-12.1.1.noarch.rpm'), (u'repodata', SON([(u'filelists', u'\n<package pkgid="63f1445df3e2680413e410b0409268574df47157" name="man-pages-da" arch="noarch">\n    <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n\n    <file>/usr/share/doc/man-pages-da-0.1.1/AUTHORS</file>\n    <file>/usr/share/doc/man-pages-da-0.1.1/ChangeLog</file>\n    <file>/usr/share/doc/man-pages-da-0.1.1/l\xe6smig</file>\n    <file>/usr/share/man/da/man1/chgrp.1.gz</file>\n    <file>/usr/share/man/da/man1/chmod.1.gz</file>\n    <file>/usr/share/man/da/man1/chown.1.gz</file>\n    <file>/usr/share/man/da/man1/dd.1.gz</file>\n    <file>/usr/share/man/da/man1/df.1.gz</file>\n    <file>/usr/share/man/da/man1/gnome-wm.1.gz</file>\n    <file>/usr/share/man/da/man1/make.1.gz</file>\n    <file type="dir">/usr/share/doc/man-pages-da-0.1.1</file>\n    <file type="dir">/usr/share/man/da</file>\n    <file type="dir">/usr/share/man/da/man1</file>\n</package>\n'), (u'other', u'\n<package pkgid="63f1445df3e2680413e410b0409268574df47157" name="man-pages-da" arch="noarch">\n    <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n\n<changelog author="Tim Powers &lt;timp&gt;" date="1010613600">- automated rebuild</changelog>\n<changelog author="Tim Powers &lt;timp&gt;" date="1022191200">- automated rebuild</changelog>\n<changelog author="Tim Powers &lt;timp&gt;" date="1024696800">- automated rebuild</changelog>\n<changelog author="Tim Powers &lt;timp&gt; 0.1.1-8" date="1039644000">- rebuild</changelog>\n<changelog author="Tim Powers &lt;timp&gt;" date="1043272800">- rebuilt</changelog>\n<changelog author="Phil Knirsch &lt;pknirsch&gt; 0.1.1-10" date="1045000800">- Convert all manpages to utf-8.</changelog>\n<changelog author="Elliot Lee &lt;sopwith&gt;" date="1076709600">- rebuilt</changelog>\n<changelog author="Elliot Lee &lt;sopwith&gt; 0.1.1-12" 
date="1096495200">- Rebuilt</changelog>\n<changelog author="Jesse Keating &lt;jkeating&gt;" date="1134165600">- rebuilt</changelog>\n<changelog author="Jesse Keating &lt;jkeating&gt; - 0.1.1-12.1.1" date="1152741600">- rebuild</changelog>\n\n</package>\n'), (u'primary', u'\n<package type="rpm">\n  <name>man-pages-da</name>\n  <arch>noarch</arch>\n  <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n  <checksum type="sha" pkgid="YES">63f1445df3e2680413e410b0409268574df47157</checksum>\n  <summary>Danish man pages from the Linux Documentation Project.</summary>\n  <description>Manual pages from the Linux Documentation Project, translated into\nDanish.</description>\n  <packager>Red Hat, Inc. &lt;http://bugzilla.redhat.com/bugzilla&gt;</packager>\n  <url>http://www.sslug.dk/locale/man-sider/</url>\n  <time file="1169142850" build="1152768694"/>\n  <size package="15259" installed="11231" archive="13280"/>\n<location href="man-pages-da-0.1.1-12.1.1.noarch.rpm"/>\n  <format>\n    <rpm:license>Distributable</rpm:license>\n    <rpm:vendor>Red Hat, Inc.</rpm:vendor>\n    <rpm:group>Documentation</rpm:group>\n    <rpm:buildhost>ia64-1.build.redhat.com</rpm:buildhost>\n    <rpm:sourcerpm>man-pages-da-0.1.1-12.1.1.src.rpm</rpm:sourcerpm>\n    <rpm:header-range start="440" end="4216"/>\n    <rpm:provides>\n      <rpm:entry name="man-pages-da" flags="EQ" epoch="0" ver="0.1.1" rel="12.1.1"/>\n    </rpm:provides>\n  </format>\n</package>')])), (u'checksumtype', u'sha'), (u'license', u'Distributable'), (u'_ns', u'units_rpm'), (u'checksum', u'63f1445df3e2680413e410b0409268574df47157'), (u'filename', u'man-pages-da-0.1.1-12.1.1.noarch.rpm'), (u'buildhost', u'ia64-1.build.redhat.com'), (u'epoch', u'0'), (u'version', u'0.1.1'), (u'relativepath', u'Packages/man-pages-da--12.1.1.noarch.rpm'), (u'provides', [[u'man-pages-da', u'EQ', [u'0', u'0.1.1', u'12.1.1']]]), (u'_content_type_id', u'rpm'), (u'release', u'12.1.1'), (u'vendor', u'Red Hat, Inc.'), (u'_id', u'082e75f4-501c-44a7-a2ba-4d
Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed.  See log for details.
2013-03-19 14:12:55,968 db:CRITICAL: Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed.
2013-03-19 14:12:55,969 db:CRITICAL: strings in documents must be valid UTF-8
2013-03-19 14:12:55,970 db:CRITICAL: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/db/manage.py", line 79, in migrate_database
    update_current_version=not options.test)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/migrate/models.py", line 161, in apply_migration
    migration.migrate()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 62, in migrate
    _migrate_rpm_unit_changelog_files()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 49, in _migrate_rpm_unit_changelog_files
    collection.save(rpm_unit, safe=True)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 237, in save
    manipulate, safe, _check_keys=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 411, in update
    _check_keys, self.__uuid_subtype), safe)
InvalidStringData: strings in documents must be valid UTF-8

Database migrations complete.
Loading content types.
Content types loaded.
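
Note on the dump above: it contains the file name l\xe6smig (læsmig). The byte 0xE6 is "æ" in Latin-1, but on its own it is not valid UTF-8. The following is a minimal Python 3 sketch of that difference. It assumes the offending value reached the database layer as a Latin-1 byte string, which this report does not confirm; `raw` and `is_valid_utf8` are illustrative names, not Pulp code.

```python
# Hypothetical reconstruction of the offending file name as raw Latin-1 bytes.
raw = b"/usr/share/doc/man-pages-da-0.1.1/l\xe6smig"


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The raw Latin-1 bytes are rejected as UTF-8.
print(is_valid_utf8(raw))

# Decoding as Latin-1 recovers the intended text.
text = raw.decode("latin-1")
print(text)

# Re-encoding that text as UTF-8 yields bytes that pass the check.
print(is_valid_utf8(text.encode("utf-8")))
```

If that assumption holds, any document carrying the raw bytes would be rejected on save with an error like the one in the traceback.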

Comment 2 Steven Roberts 2013-03-19 21:23:20 UTC
Created attachment 712975
pulp-manage-db output

The SON is a REALLY long line, so it was getting clipped by my terminal. Here is the output redirected to a file.

Comment 3 Michael Hrivnak 2013-03-19 21:32:36 UTC
Created attachment 712977
script that reproduces bug

Comment 4 Michael Hrivnak 2013-03-20 22:43:41 UTC
https://github.com/pulp/pulp_rpm/pull/157

Comment 5 Jeff Ortel 2013-03-21 14:47:36 UTC
build: 2.1.0-0.25.beta

Comment 6 Steven Roberts 2013-03-21 17:36:57 UTC
I applied the patch yesterday evening straight from GitHub. The first run failed because it ran out of disk space. I increased my /var/lib/mongodb volume from 15G to 25G and re-ran.

pulp-manage-db did eventually finish, after 4 hr 45 min, and the mongodb volume is using 22GB now that it is complete.

Pulp does look to be running okay now. I am in the process of doing my daily (scheduled) syncs.

So mongodb space usage is definitely bigger with the latest update. RAM usage for syncs is maybe a little bigger too. I'm on an 8GB VM dedicated to pulp, and pulp is using about 4.5GB and mongo 3.5GB at the moment.

Comment 7 John Matthews 2013-03-21 20:55:34 UTC
Hi Steven,

As for the RAM usage during repo syncs, a patch was merged yesterday that should significantly decrease memory usage during syncs.

The patch is here:
  https://github.com/pulp/pulp_rpm/pull/152

This is incorporated in the pulp 2.1.0-0.26-beta build RPMs

Would you apply the patch and perform a resync and let us know what you see?  I suspect memory usage will decrease noticeably.  

Below is what I see when syncing a typical RHEL 6.2 repo with ~7k RPMs:
VSZ: ~2.5GB
RSS: ~1.2GB

Prior to the patch I was closer to what you saw, with VSZ around 4.8GB and RSS at 3.5GB.

Comment 8 Steven Roberts 2013-03-22 21:02:10 UTC
Good to hear about the performance work. I'm on bug 873313, so I'll put my results there. I have updated to the .26 beta and will do some testing.

Comment 9 Michael Hrivnak 2013-03-28 17:58:10 UTC
To verify, grab this RPM (which includes non-UTF8 text in its metadata) and upload it to a pulp repository:

http://pkgs.org/centos-5-rhel-5/centos-rhel-i386/man-pages-da-0.1.1-12.1.1.noarch.rpm.html

Now we need to use mongo directly to remove some data from the database that the migration will re-generate.

$ mongo
> use pulp_database
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
> // verify that the output is a list of files, as this is what the migration will re-generate
> db.units_rpm.update({name:"man-pages-da"}, {"$unset": {filelist:1}})
> // verify that the output does not show a file list

Keep the mongo shell open, and in another terminal, run the migration manually with:

$ python /path/to/pulp_rpm/migrations/0005_rpm_changelog_files.py

It is probably in something like /usr/lib/python2.7/site-packages/pulp_rpm/migrations/

Back in your mongo shell, run the "find" command again to verify that the filelist is back. If the migration didn't cause any errors, the bug is verified.
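
For context on why this RPM makes a good test case: a migration that rebuilds the file list has to turn every path into text that can be stored as valid UTF-8. The sketch below shows one common way to do that in Python 3, trying UTF-8 first and falling back to Latin-1. It is not taken from the pull request above and may differ from the actual fix; `to_text` and its `encodings` parameter are hypothetical names.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Decode a byte string by trying each encoding in order.

    Values that are already text are returned unchanged.
    """
    if isinstance(value, str):
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Reached only if no listed encoding worked (Latin-1 itself never fails):
    # keep what can be decoded and mark the rest with U+FFFD.
    return value.decode("utf-8", errors="replace")


# Hypothetical raw file names: one plain ASCII, one with a Latin-1 byte.
raw_files = [
    b"/usr/share/doc/man-pages-da-0.1.1/AUTHORS",
    b"/usr/share/doc/man-pages-da-0.1.1/l\xe6smig",
]
clean_files = [to_text(name) for name in raw_files]
print(clean_files)
```

Trying UTF-8 first matters: valid UTF-8 input would otherwise be mis-decoded by the Latin-1 fallback, which accepts any byte sequence.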

Comment 10 Preethi Thomas 2013-03-28 18:22:09 UTC
verified


switched to db pulp_database
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4", "filelist" : [ 	"/usr/share/man/da/man1/chgrp.1.gz", 	"/usr/share/man/da/man1/chmod.1.gz", 	"/usr/share/man/da/man1/chown.1.gz", 	"/usr/share/man/da/man1/dd.1.gz", 	"/usr/share/man/da/man1/df.1.gz", 	"/usr/share/man/da/man1/gnome-wm.1.gz", 	"/usr/share/man/da/man1/make.1.gz", 	"/usr/share/doc/man-pages-da-0.1.1/AUTHORS", 	"/usr/share/doc/man-pages-da-0.1.1/ChangeLog", 	"/usr/share/doc/man-pages-da-0.1.1/læsmig" ] }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
> 
> 
> db.units_rpm.update({name:"man-pages-da"}, {"$unset": {filelist:1}})
> 
> 
> 
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4" }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
> 

[root@mgmt8 ~]# python /usr/lib/python2.7/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py


[root@mgmt8 ~]# 
> 
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4", "filelist" : [ 	"/usr/share/doc/man-pages-da-0.1.1/AUTHORS", 	"/usr/share/doc/man-pages-da-0.1.1/ChangeLog", 	"/usr/share/doc/man-pages-da-0.1.1/læsmig", 	"/usr/share/man/da/man1/chgrp.1.gz", 	"/usr/share/man/da/man1/chmod.1.gz", 	"/usr/share/man/da/man1/chown.1.gz", 	"/usr/share/man/da/man1/dd.1.gz", 	"/usr/share/man/da/man1/df.1.gz", 	"/usr/share/man/da/man1/gnome-wm.1.gz", 	"/usr/share/man/da/man1/make.1.gz" ] }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
>
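
Note that the regenerated filelist in the last query contains the same ten paths as the first query, but in a different order. Below is a small Python sketch for checking that two such lists match regardless of order. `same_files` is a hypothetical helper, and the two lists are copied from the query output above.

```python
# File list as returned before the $unset (first query above).
before = [
    "/usr/share/man/da/man1/chgrp.1.gz",
    "/usr/share/man/da/man1/chmod.1.gz",
    "/usr/share/man/da/man1/chown.1.gz",
    "/usr/share/man/da/man1/dd.1.gz",
    "/usr/share/man/da/man1/df.1.gz",
    "/usr/share/man/da/man1/gnome-wm.1.gz",
    "/usr/share/man/da/man1/make.1.gz",
    "/usr/share/doc/man-pages-da-0.1.1/AUTHORS",
    "/usr/share/doc/man-pages-da-0.1.1/ChangeLog",
    "/usr/share/doc/man-pages-da-0.1.1/l\u00e6smig",
]

# File list as returned after re-running the migration (last query above).
after = [
    "/usr/share/doc/man-pages-da-0.1.1/AUTHORS",
    "/usr/share/doc/man-pages-da-0.1.1/ChangeLog",
    "/usr/share/doc/man-pages-da-0.1.1/l\u00e6smig",
    "/usr/share/man/da/man1/chgrp.1.gz",
    "/usr/share/man/da/man1/chmod.1.gz",
    "/usr/share/man/da/man1/chown.1.gz",
    "/usr/share/man/da/man1/dd.1.gz",
    "/usr/share/man/da/man1/df.1.gz",
    "/usr/share/man/da/man1/gnome-wm.1.gz",
    "/usr/share/man/da/man1/make.1.gz",
]


def same_files(first, second):
    """Return True if both lists hold the same paths, ignoring order."""
    return sorted(first) == sorted(second)


print(same_files(before, after))
```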

Comment 11 Preethi Thomas 2013-04-08 16:03:06 UTC
Pulp 2.1 released 


http://www.pulpproject.org/2013/04/05/pulp-2-1-0-released/

