When uploading a package, sconn.search_packages() is called to get the checksums of packages already uploaded to the server. The call queries only one package at a time. We need to query multiple packages at once, because we sometimes deal with more than 10k files at the same time. Example: get_package_checksums([package1, package2, ...]) -> {package1: [sha256sums], package2: [], package3: ...}
Should Pulp implement a generic multicall method?
A single dedicated call would be better in this case. It would mean less DB load (the DB can be queried just once). But generally, a multicall is definitely a good idea.
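To illustrate the "queried just once" point, here is a minimal sketch of a batched lookup. It uses Python's sqlite3 module purely as a stand-in; it is not Pulp's actual storage layer, and the table name, column names, and the function get_checksums_batched are hypothetical. The names are split into chunks only because SQLite limits the number of bound parameters per statement.

```python
import sqlite3


def get_checksums_batched(conn, filenames, chunk_size=500):
    """Return {filename: [checksums]} using one query per chunk of names,
    instead of one query per file. Unknown names map to an empty list."""
    # Drop duplicate names while keeping the caller's order.
    names = list(dict.fromkeys(filenames))
    result = {name: [] for name in names}
    for start in range(0, len(names), chunk_size):
        chunk = names[start:start + chunk_size]
        placeholders = ",".join("?" for _ in chunk)
        # Hypothetical schema: packages(filename TEXT, checksum TEXT).
        rows = conn.execute(
            "SELECT filename, checksum FROM packages "
            "WHERE filename IN (%s) ORDER BY checksum" % placeholders,
            chunk,
        )
        for filename, checksum in rows:
            result[filename].append(checksum)
    return result


# Small in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (filename TEXT, checksum TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("a.rpm", "aaa"), ("a.rpm", "bbb"), ("b.rpm", "ccc")],
)
print(get_checksums_batched(conn, ["a.rpm", "b.rpm", "missing.rpm"]))
```

With 10k files and the default chunk size this issues 20 queries rather than 10,000 round trips.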
Fixed! The new calls are in the services API:
    ServiceAPI.get_package_checksums(filenames=[])
    ServiceAPI.get_file_checksums(filenames=[])
The return format is: {"file_name1": [<checksum1>,<checksum2>..],...}
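For reference, a minimal sketch of how a client might consume that return format when deciding what to upload. The helper files_needing_upload and the sample data are hypothetical; server_checksums simply stands for a dict shaped like the return value described above, and the real upload logic may differ.

```python
def files_needing_upload(local_checksums, server_checksums):
    """Return the sorted names whose local checksum is not already on the server.

    local_checksums:  {filename: checksum} computed on the client.
    server_checksums: {filename: [checksum, ...]} in the format returned by
                      the batch call (assumed shape).
    """
    return sorted(
        name
        for name, checksum in local_checksums.items()
        if checksum not in server_checksums.get(name, [])
    )


# Made-up data: a.rpm is already on the server, b.rpm differs, c.rpm is new.
local = {"a.rpm": "111", "b.rpm": "222", "c.rpm": "333"}
server = {"a.rpm": ["111"], "b.rpm": ["999"], "c.rpm": []}
print(files_needing_upload(local, server))  # ['b.rpm', 'c.rpm']
```

An empty list for a name means the server has no copy of that file, so it is treated the same as a checksum mismatch.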
Build: 0.144
Created attachment 482064: performance patch. Can you check this patch? It should boost performance by 50%. I removed the file_checksum() methods; it looks like they are no longer needed, but I may be wrong.
Fixed! I left the package_checksum and file_checksum calls in there as they are used by other modules. I just updated the batch query with your patch. Should make it into the next build. Thanks.
Build: 0.145
verified
[root@preethi ~]# rpm -q pulp
pulp-0.0.159-1.fc14.noarch
[root@preethi ~]# ./bz677695.py
{u'fedora-bookmarks-14-1.noarch.rpm': [u'82a93aa6787c3223e8880dfa57d056627e73b8ea55788ba23d5c9dc094e48523'], u'silkscreen-fonts-common-1.0-4.fc12.noarch.rpm': [u'a67ac991a7a4288df11079f5b529c356c8104bc38f366833ff169ef23ec42758']}
{u'test2.txt': [u'8ff2165791201edf1f3ed70518d8b30115a1272eabe231a50fe50445a9ff19ae']}
Closing with Community Release 15 pulp-0.0.223-4.