Description of problem: This bz represents a problem we have with a repo sync reporting the BSON object was too large. My guess is this is triggered because we are capturing every error that occurs and for this sync, 2768 errors occurred. lzap saw an error from a sync. He based the sync off of local content from an ISO. Looking at his logs we see: 2011-10-25 12:15:31,412 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:616 YumSynchronizer reported 5 successes, 0 downloads, 2768 errors lzap's logs: 2011-10-25 12:13:43,399 15460:-1278223504: pulp.server.webservices.controllers.repositories:INFO: repositories:615 sync timeout passed : None 2011-10-25 12:13:44,116 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:244 Sync invoked for repo <ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso> 2011-10-25 12:13:44,746 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:258 Sync of ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso starting, skip_dict = {} 2011-10-25 12:13:45,544 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:540 cacert = <None>, cert = <None> 2011-10-25 12:13:45,546 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:588 Fetching repo to </var/lib/pulp//repos/ACME_Corporation/Locker/isos/isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso> with verify_options <{'checksum': True, 'size': True}> 2011-10-25 12:15:26,957 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:608 Running createrepo, this may take a few minutes to complete. 2011-10-25 12:15:26,957 15460:-1373635728: pulp.server.util:INFO: util:462 metadata found; taking backup. 2011-10-25 12:15:26,988 15460:-1373635728: pulp.server.util:INFO: util:445 started repo metadata update: ['createrepo', '--database', '--checksum', 'sha256', '-g', '/var/lib/pulp//repos/ACME_Corporation/Locker/isos/isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso/repodata/1d46a2073097bdb6b7b6f6fba60e4525ca0bf2d1165026a3e1c87b0b2abaf07f-Fedora-14-comps.xml', '--update', '/var/lib/pulp//repos/ACME_Corporation/Locker/isos/isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso'] 2011-10-25 12:15:31,297 15460:-1373635728: pulp.server.util:INFO: util:495 createrepo on /var/lib/pulp//repos/ACME_Corporation/Locker/isos/isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso finished 2011-10-25 12:15:31,411 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:614 Createrepo finished in 4.45665597916 seconds 2011-10-25 12:15:31,412 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:616 YumSynchronizer reported 5 successes, 0 downloads, 2768 errors 2011-10-25 12:15:32,822 15460:-1373635728: pulp.server.api.distribution:INFO: distribution:39 Distribution with id ks-ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso-noarch already exists 2011-10-25 12:15:33,809 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:275 No metadata for 'File Sync' present; no files to import to repo.. 2011-10-25 12:15:34,792 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:246 Loading comps group info from: /var/lib/pulp//repos/ACME_Corporation/Locker/isos/isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso/repodata/1d46a2073097bdb6b7b6f6fba60e4525ca0bf2d1165026a3e1c87b0b2abaf07f-1d46a2073097bdb6b7b6f6fba60e4525ca0bf2d1165026a3e1c87b0b2abaf07f-Fedora-14-comps.xml 2011-10-25 12:15:35,015 15460:-1373635728: pulp.server.api.synchronizers:INFO: synchronizers:267 Skipping errata imports from sync process 2011-10-25 12:15:35,505 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:265 Sync returned 0 packages, 0 errata in 110.758918047 seconds 2011-10-25 12:15:35,875 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:276 0 old packages to process, 0 new packages to process 2011-10-25 12:15:37,533 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:291 Examining 0 errata from repo ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso 2011-10-25 12:15:37,853 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:295 Removing 0 old errata from repo ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso 2011-10-25 12:15:40,433 15460:-1373635728: pulp.server.api.repo_sync:INFO: repo_sync:304 Adding 0 new errata to repo ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso 2011-10-25 12:15:40,891 15460:-1373635728: pulp.server.tasking.task:INFO: task:376 Task succeeded: Task 04f7d6ae-fef2-11e0-83c0-f0def1032187: _sync(ACME_Corporation-isos-isos_pub_isos_fuse_Fedora-14-x86_64-DVD_iso, synchronizer=<pulp.server.api.synchronizers.YumSynchronizer object at 0xaf636f2c>, skip_dict={}, max_speed=None, threads=None, progress_callback=<bound method RepoSyncTask.progress_callback of <pulp.server.api.repo_sync_task.RepoSyncTask object at 0xa330c4ac>>) 2011-10-25 12:15:41,782 15460:-1373635728: pulp.server.tasking.task:ERROR: task:405 document too large - BSON documents are limited to 4 MB Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/pulp/server/tasking/task.py", line 403, in _complete self.complete_callback(self) File "/usr/lib/python2.7/site-packages/pulp/server/tasking/taskqueue/queue.py", line 301, in complete self.__storage.store_complete(task) File "/usr/lib/python2.7/site-packages/pulp/server/tasking/taskqueue/storage.py", line 244, in store_complete self.history_collection.insert(history) File "/usr/lib/python2.7/site-packages/pulp/server/db/connection.py", line 80, in retry return method(*args, **kwargs) File "/usr/lib/python2.7/site-packages/pymongo/collection.py", line 270, in insert check_keys, safe, kwargs), safe) File "/usr/lib/python2.7/site-packages/pymongo/message.py", line 67, in insert bson_data = "".join([bson.BSON.encode(doc, check_keys) for doc in docs]) File "/usr/lib/python2.7/site-packages/bson/__init__.py", line 462, in encode return cls(_dict_to_bson(document, check_keys)) InvalidDocument: document too large - BSON documents are limited to 4 MB
I created a custom repo with 6k packages. chmod 700 *.rpm To force errors for all rpm fetches. Repo Feed can be seen below. $ sudo pulp-admin repo create --id large_repo_with_errors --feed http://jmatthews.fedorapeople.org/large_repo_no_read_perms $ sudo pulp-admin repo sync --id large_repo_with_errors -F Sync for repository large_repo_with_errors started Step: Running Createrepo Waiting / Task not found: 4650403a-ff15-11e0-9041-0022157359f6 Traceback (most recent call last): File "/usr/bin/pulp-admin", line 7, in <module> execfile(__file__) File "/shared/repo/pulp/bin/pulp-admin", line 23, in <module> admin.main() File "/shared/repo/pulp/src/pulp/client/lib/cli.py", line 152, in main command.main(self.args[1:]) File "/shared/repo/pulp/src/pulp/client/pluginlib/command.py", line 129, in main action.main(args[1:]) File "/shared/repo/pulp/src/pulp/client/pluginlib/command.py", line 211, in main self.run() File "/shared/repo/pulp/src/pulp/client/admin/plugins/repo.py", line 887, in run self.sync_foreground(task) File "/shared/repo/pulp/src/pulp/client/admin/plugins/repo.py", line 839, in sync_foreground while not task_end(task): File "/shared/repo/pulp/src/pulp/client/api/task.py", line 19, in task_end return task['state'] in ('finished', 'error', 'canceled', 'timed_out') TypeError: 'NoneType' object is not subscriptable Looking at /var/log/pulp/pulp.log 2011-10-25 10:38:56,866 21874:140718479759104: pulp.server.api.synchronizers:INFO: synchronizers:640 YumSynchronizer reported 0 successes, 0 downloads, 6000 errors 2011-10-25 10:38:56,880 21874:140718479759104: pulp.server.api.synchronizers:INFO: synchronizers:298 No image files to import to repo.. 2011-10-25 10:38:56,883 21874:140718479759104: pulp.server.api.synchronizers:INFO: synchronizers:278 No metadata for 'File Sync' present; no files to import to repo.. 2011-10-25 10:38:56,886 21874:140718479759104: pulp.server.api.synchronizers:INFO: synchronizers:270 Skipping errata imports from sync process 2011-10-25 10:38:56,888 21874:140718479759104: pulp.server.api.repo_sync:INFO: repo_sync:269 Sync returned 0 packages, 0 errata in 770.836686134 seconds 2011-10-25 10:38:56,889 21874:140718479759104: pulp.server.api.repo_sync:INFO: repo_sync:280 0 old packages to process, 0 new packages to process 2011-10-25 10:38:56,893 21874:140718479759104: pulp.server.api.repo_sync:INFO: repo_sync:295 Examining 0 errata from repo large_repo_with_errors 2011-10-25 10:38:56,894 21874:140718479759104: pulp.server.api.repo_sync:INFO: repo_sync:299 Removing 0 old errata from repo large_repo_with_errors 2011-10-25 10:38:56,899 21874:140718479759104: pulp.server.api.repo_sync:INFO: repo_sync:308 Adding 0 new errata to repo large_repo_with_errors 2011-10-25 10:38:56,917 21874:140718479759104: pulp.server.tasking.task:INFO: task:440 Task succeeded: Task 4650403a-ff15-11e0-9041-0022157359f6: _sync(large_repo_with_errors, synchronizer=<pulp.server.api.synchronizers.YumSynchronizer object at 0x7ffbcbf2dc90>, skip_dict={}, max_speed=None, threads=None, progress_callback=<bound method RepoSyncTask.progress_callback of <pulp.server.api.repo_sync_task.RepoSyncTask object at 0x7ffbcbf2d290>>) 2011-10-25 10:38:58,042 21874:140718479759104: pulp.server.tasking.task:ERROR: task:469 document too large - BSON documents are limited to 4 MB Traceback (most recent call last): File "/shared/repo/pulp/src/pulp/server/tasking/task.py", line 467, in _complete self.complete_callback(self) File "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/queue.py", line 333, in complete self.__storage.store_complete(task) File "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/storage.py", line 244, in store_complete self.history_collection.insert(history) File "/shared/repo/pulp/src/pulp/server/db/connection.py", line 80, in retry return method(*args, **kwargs) File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 270, in insert check_keys, safe, kwargs), safe) File "/usr/lib64/python2.7/site-packages/pymongo/message.py", line 67, in insert bson_data = "".join([bson.BSON.encode(doc, check_keys) for doc in docs]) File "/usr/lib64/python2.7/site-packages/bson/__init__.py", line 462, in encode return cls(_dict_to_bson(document, check_keys)) InvalidDocument: document too large - BSON documents are limited to 4 MB
We are storing every error in the progress dictionary. For 6000 packages that each are receiving a 403, we have 6000 dicts with almost identical tracebacks. We could store only one unique error type, and link each item name which saw that error. Desire is to maintain the fine grained error reporting if possible, while shrinking the size of the returned data structure.
FYI I have used Fedora 14 ISO: Fedora-14-x86_64-DVD.iso Unpacked it (or use fuseiso or mount loop), and synchronized it via localhost. Not sure if all the errors above were caused by the fact it is so fast or syncing to ISO content. Tried regular F15 repo now and I only have one error, works better.
Attaching full debug output of grinder yum --label test -U http://localhost/pub/isos/fuse/Fedora-14-x86_64-DVD.iso/ Sample: grinder.ParallelFetch: ERROR Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/grinder/activeobject.py", line 447, in process retval = method(*args, **kwargs) File "/usr/lib/python2.7/site-packages/grinder/RepoFetch.py", line 117, in fetchItem verify_options=self.verify_options) File "/usr/lib/python2.7/site-packages/grinder/BaseFetch.py", line 301, in fetch checksum, headers, retryTimes, packages_location) File "/usr/lib/python2.7/site-packages/grinder/BaseFetch.py", line 301, in fetch checksum, headers, retryTimes, packages_location) File "/usr/lib/python2.7/site-packages/grinder/BaseFetch.py", line 244, in fetch curl.perform() error: (6, 'Could not resolve host: Packages; Cannot allocate memory')
Created attachment 530117 [details] log file 1
Created attachment 530118 [details] log file 2
Here's the grinder bug that relates to the problem seen here from the ISO https://bugzilla.redhat.com/show_bug.cgi?id=749211
I don't believe this issue exists anymore, since we rewrote the Yum Importer, and no longer use Grinder.