Description of problem:
I did an rpm update to 2.1.0-0.23.beta and ran pulp-manage-db. The 0005_rpm_changelog_files migration fails.

Output of the command:
===============================================
Applying pulp_rpm.migrations version 1
Migration to pulp_rpm.migrations version 1 complete.
Applying pulp_rpm.migrations version 2
Migration to pulp_rpm.migrations version 2 complete.
Applying pulp_rpm.migrations version 3
Migration to pulp_rpm.migrations version 3 complete.
Applying pulp_rpm.migrations version 4
Migration to pulp_rpm.migrations version 4 complete.
Applying pulp_rpm.migrations version 5
Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed. See log for details.
2013-03-19 11:26:51,340 db:CRITICAL: Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed.
2013-03-19 11:26:51,340 db:CRITICAL: strings in documents must be valid UTF-8
2013-03-19 11:26:51,460 db:CRITICAL: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/db/manage.py", line 79, in migrate_database
    update_current_version=not options.test)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/migrate/models.py", line 161, in apply_migration
    migration.migrate()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 58, in migrate
    _migrate_rpm_unit_changelog_files()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 48, in _migrate_rpm_unit_changelog_files
    collection.save(rpm_unit, safe=True)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 237, in save
    manipulate, safe, _check_keys=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 411, in update
    _check_keys, self.__uuid_subtype), safe)
InvalidStringData: strings in documents must be valid UTF-8
Database migrations complete.
Loading content types.
===============================================
Following some discussion, I ran "strace -e trace=stat" to find the rpm involved, and it looks to be man-pages-da-0.1.1-12.1.1.noarch.rpm. I also added an exception snippet from IRC, and here is the object dump:

Beginning database migrations.
Migration package pulp.server.db.migrations is up to date at version 4
Migration package pulp_puppet.plugins.migrations is up to date at version 0
Applying pulp_rpm.migrations version 5
SON([(u'requires', []), (u'_storage_path', u'/var/lib/pulp/content/rpm/.//man-pages-da/0.1.1/12.1.1/noarch/63f1445df3e2680413e410b0409268574df47157/man-pages-da-0.1.1-12.1.1.noarch.rpm'), (u'repodata', SON([(u'filelists', u'\n<package pkgid="63f1445df3e2680413e410b0409268574df47157" name="man-pages-da" arch="noarch">\n <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n\n <file>/usr/share/doc/man-pages-da-0.1.1/AUTHORS</file>\n <file>/usr/share/doc/man-pages-da-0.1.1/ChangeLog</file>\n <file>/usr/share/doc/man-pages-da-0.1.1/l\xe6smig</file>\n <file>/usr/share/man/da/man1/chgrp.1.gz</file>\n <file>/usr/share/man/da/man1/chmod.1.gz</file>\n <file>/usr/share/man/da/man1/chown.1.gz</file>\n <file>/usr/share/man/da/man1/dd.1.gz</file>\n <file>/usr/share/man/da/man1/df.1.gz</file>\n <file>/usr/share/man/da/man1/gnome-wm.1.gz</file>\n <file>/usr/share/man/da/man1/make.1.gz</file>\n <file type="dir">/usr/share/doc/man-pages-da-0.1.1</file>\n <file type="dir">/usr/share/man/da</file>\n <file type="dir">/usr/share/man/da/man1</file>\n</package>\n'), (u'other', u'\n<package pkgid="63f1445df3e2680413e410b0409268574df47157" name="man-pages-da" arch="noarch">\n <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n\n<changelog author="Tim Powers <timp>" date="1010613600">- automated rebuild</changelog>\n<changelog author="Tim Powers <timp>" date="1022191200">- automated rebuild</changelog>\n<changelog author="Tim Powers <timp>" date="1024696800">- automated rebuild</changelog>\n<changelog author="Tim Powers <timp> 0.1.1-8" date="1039644000">- rebuild</changelog>\n<changelog author="Tim Powers <timp>" date="1043272800">- rebuilt</changelog>\n<changelog author="Phil Knirsch <pknirsch> 0.1.1-10" date="1045000800">- Convert all manpages to utf-8.</changelog>\n<changelog author="Elliot Lee <sopwith>" date="1076709600">- rebuilt</changelog>\n<changelog author="Elliot Lee <sopwith> 0.1.1-12" date="1096495200">- Rebuilt</changelog>\n<changelog author="Jesse Keating <jkeating>" date="1134165600">- rebuilt</changelog>\n<changelog author="Jesse Keating <jkeating> - 0.1.1-12.1.1" date="1152741600">- rebuild</changelog>\n\n</package>\n'), (u'primary', u'\n<package type="rpm">\n <name>man-pages-da</name>\n <arch>noarch</arch>\n <version epoch="0" ver="0.1.1" rel="12.1.1"/>\n <checksum type="sha" pkgid="YES">63f1445df3e2680413e410b0409268574df47157</checksum>\n <summary>Danish man pages from the Linux Documentation Project.</summary>\n <description>Manual pages from the Linux Documentation Project, translated into\nDanish.</description>\n <packager>Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla></packager>\n <url>http://www.sslug.dk/locale/man-sider/</url>\n <time file="1169142850" build="1152768694"/>\n <size package="15259" installed="11231" archive="13280"/>\n<location href="man-pages-da-0.1.1-12.1.1.noarch.rpm"/>\n <format>\n <rpm:license>Distributable</rpm:license>\n <rpm:vendor>Red Hat, Inc.</rpm:vendor>\n <rpm:group>Documentation</rpm:group>\n <rpm:buildhost>ia64-1.build.redhat.com</rpm:buildhost>\n <rpm:sourcerpm>man-pages-da-0.1.1-12.1.1.src.rpm</rpm:sourcerpm>\n <rpm:header-range start="440" end="4216"/>\n <rpm:provides>\n <rpm:entry name="man-pages-da" flags="EQ" epoch="0" ver="0.1.1" rel="12.1.1"/>\n </rpm:provides>\n </format>\n</package>')])), (u'checksumtype', u'sha'), (u'license', u'Distributable'), (u'_ns', u'units_rpm'), (u'checksum', u'63f1445df3e2680413e410b0409268574df47157'), (u'filename', u'man-pages-da-0.1.1-12.1.1.noarch.rpm'), (u'buildhost', u'ia64-1.build.redhat.com'), (u'epoch', u'0'), (u'version', u'0.1.1'), (u'relativepath', u'Packages/man-pages-da--12.1.1.noarch.rpm'), (u'provides', [[u'man-pages-da', u'EQ', [u'0', u'0.1.1', u'12.1.1']]]), (u'_content_type_id', u'rpm'), (u'release', u'12.1.1'), (u'vendor', u'Red Hat, Inc.'), (u'_id', u'082e75f4-501c-44a7-a2ba-4d
Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed. See log for details.
2013-03-19 14:12:55,968 db:CRITICAL: Applying migration pulp_rpm.migrations.0005_rpm_changelog_files failed.
2013-03-19 14:12:55,969 db:CRITICAL: strings in documents must be valid UTF-8
2013-03-19 14:12:55,970 db:CRITICAL: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/db/manage.py", line 79, in migrate_database
    update_current_version=not options.test)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/migrate/models.py", line 161, in apply_migration
    migration.migrate()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 62, in migrate
    _migrate_rpm_unit_changelog_files()
  File "/usr/lib/python2.6/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py", line 49, in _migrate_rpm_unit_changelog_files
    collection.save(rpm_unit, safe=True)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 237, in save
    manipulate, safe, _check_keys=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 411, in update
    _check_keys, self.__uuid_subtype), safe)
InvalidStringData: strings in documents must be valid UTF-8
Database migrations complete.
Loading content types.
Content types loaded.
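For anyone following along, here is a minimal sketch of one way to guard a document against this kind of error before saving it. It is not the fix from the pull request, and the names to_text and sanitize are hypothetical. It assumes the offending values are byte strings in a single-byte encoding such as Latin-1 (the \xe6 in "l\xe6smig" is "æ" in Latin-1), and it is written for Python 3, where bytes and str are distinct types, rather than the Python 2.6 shown in the traceback.

```python
def to_text(value, fallback="latin-1"):
    """Return text for a byte string, trying UTF-8 first, then a fallback."""
    if not isinstance(value, bytes):
        return value  # already text, or not a string at all
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every possible byte to a code point, so this cannot fail
        return value.decode(fallback)


def sanitize(doc):
    """Recursively convert every byte string in a nested document to text."""
    if isinstance(doc, dict):
        return {to_text(key): sanitize(val) for key, val in doc.items()}
    if isinstance(doc, (list, tuple)):
        return [sanitize(item) for item in doc]
    return to_text(doc)


# Hypothetical unit with a Latin-1 encoded file name, as in the dump above
unit = {"filelist": [b"/usr/share/doc/man-pages-da-0.1.1/l\xe6smig"]}
clean = sanitize(unit)
print(clean["filelist"][0])
```

The fallback encoding is a guess; if the real metadata uses a different encoding, the decoded characters would be wrong even though the save would succeed.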
Created attachment 712975
pulp-manage-db output

The SON is a REALLY long line, so it was getting clipped by my terminal. Here is the output redirected to a file.

Created attachment 712977
script that reproduces bug
https://github.com/pulp/pulp_rpm/pull/157
build: 2.1.0-0.25.beta
I applied the patch yesterday evening straight from GitHub. The first run failed because it ran out of disk space. I increased my /var/lib/mongodb volume from 15G to 25G and re-ran. pulp-manage-db did eventually finish after 4 hr 45 min, and the mongodb volume is using 22GB now that the migration is complete. Pulp does look to be running okay now; I'm in the process of doing my daily (scheduled) syncs. So mongodb disk usage is definitely bigger with the latest update. RAM usage for syncs is maybe a little bigger too. I'm on an 8GB VM dedicated to pulp, and pulp is using about 4.5GB and mongo 3.5GB at the moment.
Hi Steven,

As for the RAM usage during repo syncs, a patch was merged yesterday which will significantly decrease memory usage during syncs. The patch is here: https://github.com/pulp/pulp_rpm/pull/152

This is incorporated in the pulp 2.1.0-0.26-beta build RPMs. Would you apply the patch, perform a resync, and let us know what you see? I suspect memory usage will decrease noticeably.

Below is what I see when syncing a typical RHEL 6.2 repo with ~7k RPMs:
VSZ: ~2.5GB
RSS: ~1.2GB

Prior to the patch I was closer to what you saw, with VSZ around 4.8GB and RSS 3.5GB.
Good to hear about the performance work. I'm on bug 873313, so I'll put results in there. I have updated to .26 beta and will do some testing.
To verify, grab this RPM (which includes non-UTF8 text in its metadata) and upload it to a pulp repository:
http://pkgs.org/centos-5-rhel-5/centos-rhel-i386/man-pages-da-0.1.1-12.1.1.noarch.rpm.html

Now we need to use mongo directly to remove some data from the database that the migration will re-generate.

$ mongo
> use pulp_database
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
> // verify that the output is a list of files, as this is what the migration will re-generate
> db.units_rpm.update({name:"man-pages-da"}, {"$unset": {filelist:1}})
> // verify that the output does not show a file list

Keep the mongo shell open, and in another terminal, run the migration manually with:

$ python /path/to/pulp_rpm/migrations/0005_rpm_changelog_files.py

It's probably in something like /usr/lib/python2.7/site-packages/pulp_rpm/migrations/

Back in your mongo shell, run the "find" command again to verify that the filelist is back. If the migration didn't cause any errors, the bug is verified.
verified

switched to db pulp_database
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4", "filelist" : [ "/usr/share/man/da/man1/chgrp.1.gz", "/usr/share/man/da/man1/chmod.1.gz", "/usr/share/man/da/man1/chown.1.gz", "/usr/share/man/da/man1/dd.1.gz", "/usr/share/man/da/man1/df.1.gz", "/usr/share/man/da/man1/gnome-wm.1.gz", "/usr/share/man/da/man1/make.1.gz", "/usr/share/doc/man-pages-da-0.1.1/AUTHORS", "/usr/share/doc/man-pages-da-0.1.1/ChangeLog", "/usr/share/doc/man-pages-da-0.1.1/læsmig" ] }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
>
>
> db.units_rpm.update({name:"man-pages-da"}, {"$unset": {filelist:1}})
>
>
>
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4" }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
>

[root@mgmt8 ~]# python /usr/lib/python2.7/site-packages/pulp_rpm/migrations/0005_rpm_changelog_files.py
[root@mgmt8 ~]#

>
> db.units_rpm.find({name:"man-pages-da"}, {filelist:1})
{ "_id" : "a0ac1c96-e2b8-403a-a808-4c17e2e8f8e4", "filelist" : [ "/usr/share/doc/man-pages-da-0.1.1/AUTHORS", "/usr/share/doc/man-pages-da-0.1.1/ChangeLog", "/usr/share/doc/man-pages-da-0.1.1/læsmig", "/usr/share/man/da/man1/chgrp.1.gz", "/usr/share/man/da/man1/chmod.1.gz", "/usr/share/man/da/man1/chown.1.gz", "/usr/share/man/da/man1/dd.1.gz", "/usr/share/man/da/man1/df.1.gz", "/usr/share/man/da/man1/gnome-wm.1.gz", "/usr/share/man/da/man1/make.1.gz" ] }
{ "_id" : "a4132ed3-d1c4-4d24-8bfc-ff2d08ca5a0e" }
>
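Note that the regenerated filelist comes back in a different order than the original one. A small sketch of how the two outputs could be compared without caring about order; same_files is a hypothetical helper name, and the two lists are copied from the find output in this comment.

```python
def same_files(before, after):
    """Return True when both lists hold the same paths, ignoring order."""
    return sorted(before) == sorted(after)


# filelist before the $unset, in the order mongo returned it
before = [
    "/usr/share/man/da/man1/chgrp.1.gz",
    "/usr/share/man/da/man1/chmod.1.gz",
    "/usr/share/man/da/man1/chown.1.gz",
    "/usr/share/man/da/man1/dd.1.gz",
    "/usr/share/man/da/man1/df.1.gz",
    "/usr/share/man/da/man1/gnome-wm.1.gz",
    "/usr/share/man/da/man1/make.1.gz",
    "/usr/share/doc/man-pages-da-0.1.1/AUTHORS",
    "/usr/share/doc/man-pages-da-0.1.1/ChangeLog",
    "/usr/share/doc/man-pages-da-0.1.1/læsmig",
]

# filelist after re-running the migration
after = [
    "/usr/share/doc/man-pages-da-0.1.1/AUTHORS",
    "/usr/share/doc/man-pages-da-0.1.1/ChangeLog",
    "/usr/share/doc/man-pages-da-0.1.1/læsmig",
    "/usr/share/man/da/man1/chgrp.1.gz",
    "/usr/share/man/da/man1/chmod.1.gz",
    "/usr/share/man/da/man1/chown.1.gz",
    "/usr/share/man/da/man1/dd.1.gz",
    "/usr/share/man/da/man1/df.1.gz",
    "/usr/share/man/da/man1/gnome-wm.1.gz",
    "/usr/share/man/da/man1/make.1.gz",
]

print(same_files(before, after))
```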
Pulp 2.1 released http://www.pulpproject.org/2013/04/05/pulp-2-1-0-released/