Bug 677695

Summary: get_package_checksums() call for package uploads
Product: [Retired] Pulp
Component: z_other
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Version: unspecified
Target Milestone: ---
Target Release: Sprint 21
Hardware: Unspecified
OS: Unspecified
Reporter: Daniel Mach <dmach>
Assignee: Pradeep Kilambi <pkilambi>
QA Contact: Preethi Thomas <pthomas>
CC: dgregor, pkilambi, skarmark
Keywords: Triaged
Doc Type: Bug Fix
Last Closed: 2011-08-16 12:10:42 UTC
Bug Blocks: 563609, 647488
Attachments: performance patch (flags: none)

Description Daniel Mach 2011-02-15 15:54:13 UTC
When uploading a package, sconn.search_packages() is called to get the checksums of packages already uploaded to the server. The call queries only one package at a time.

We need to query multiple packages at once, because we sometimes deal with more than 10k files at the same time.

Example:

get_package_checksums([package1, package2, ...])
->
{package1: [sha256sums], package2: [], package3: ...}
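The requested call shape can be sketched as follows. This is only an illustration of the input and output described above, not Pulp's implementation: PACKAGE_RECORDS is a hypothetical in-memory stand-in for the server's package data, and the filenames and checksum strings are made up.

```python
# Hypothetical stand-in for the server's package records:
# (filename, checksum) pairs. Real data would come from the database.
PACKAGE_RECORDS = [
    ("package1.rpm", "aaa111"),
    ("package1.rpm", "bbb222"),
    ("package3.rpm", "ccc333"),
]


def get_package_checksums(filenames):
    """Return {filename: [checksums]} for all requested names in one pass."""
    wanted = set(filenames)
    # Every requested name gets an entry, so unknown packages map to [].
    result = {name: [] for name in filenames}
    for name, checksum in PACKAGE_RECORDS:
        if name in wanted:
            result[name].append(checksum)
    return result


checksums = get_package_checksums(["package1.rpm", "package2.rpm"])
print(checksums)
```

A package that the server does not know yet comes back with an empty list, which matches the `package2: []` entry in the example above.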

Comment 1 Dennis Gregorovic 2011-02-15 16:17:31 UTC
Should Pulp implement a generic multicall method?

Comment 2 Daniel Mach 2011-02-15 16:25:30 UTC
A single call would be better in this case.
It would mean less db load (db can be queried just once).

But generally, a multicall is definitely a good idea.
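The difference between the two approaches can be sketched like this. The report does not say how Pulp stores packages, so sqlite3 is used purely as a stand-in, and search_one, multicall and batch_checksums are hypothetical names. A generic multicall saves round trips but still runs one query per call, while a dedicated batch call can run a single query.

```python
import sqlite3

# In-memory stand-in database; table layout is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (filename TEXT, checksum TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("a.rpm", "c1"), ("b.rpm", "c2"), ("b.rpm", "c3")],
)

query_count = 0  # counts queries issued, to compare both approaches


def search_one(filename):
    """Look up the checksums of a single package (one query per call)."""
    global query_count
    query_count += 1
    rows = conn.execute(
        "SELECT checksum FROM packages WHERE filename = ? ORDER BY checksum",
        (filename,),
    )
    return [row[0] for row in rows]


def multicall(calls):
    """Generic multicall: one request, but each call still runs its own query."""
    return [func(*args) for func, args in calls]


def batch_checksums(filenames):
    """Dedicated batch call: one query for all requested filenames."""
    global query_count
    result = {name: [] for name in filenames}
    if not filenames:
        return result
    query_count += 1
    placeholders = ",".join("?" for _ in filenames)
    rows = conn.execute(
        "SELECT filename, checksum FROM packages "
        "WHERE filename IN (%s) ORDER BY checksum" % placeholders,
        list(filenames),
    )
    for filename, checksum in rows:
        result[filename].append(checksum)
    return result


names = ["a.rpm", "b.rpm", "c.rpm"]

multicall_result = multicall([(search_one, (name,)) for name in names])
multicall_queries = query_count

query_count = 0
batch_result = batch_checksums(names)
batch_queries = query_count

print(multicall_queries, batch_queries)
```

Databases often cap the number of bound parameters per statement, so a batch of 10k names might need to be split into chunks. That is still far fewer queries than one per package.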

Comment 3 Pradeep Kilambi 2011-02-28 20:35:48 UTC
fixed!

The new calls are in the services API.

ServiceAPI.get_package_checksums(filenames=[])

ServiceAPI.get_file_checksums(filenames=[])

The return format: {"file_name1": [<checksum1>, <checksum2>, ...], ...}
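A client could use that return format to decide which files still need uploading. The sketch below assumes SHA-256 checksums (as in the example in the description) and uses a hand-built dictionary in place of a real ServiceAPI response; sha256_of and files_needing_upload are hypothetical helper names, not part of Pulp.

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def files_needing_upload(local_files, server_checksums):
    """Return the names of local files whose checksum the server lacks.

    local_files: {filename: file content as bytes}
    server_checksums: {filename: [checksum, ...]} in the format shown above
    """
    to_upload = []
    for name in sorted(local_files):
        known = server_checksums.get(name, [])
        if sha256_of(local_files[name]) not in known:
            to_upload.append(name)
    return to_upload


# Made-up data: test1.txt is already on the server, test2.txt is not.
local_files = {"test1.txt": b"hello", "test2.txt": b"world"}
server_checksums = {"test1.txt": [sha256_of(b"hello")], "test2.txt": []}

pending = files_needing_upload(local_files, server_checksums)
print(pending)
```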

Comment 4 Jeff Ortel 2011-03-02 21:34:36 UTC
Build: 0.144

Comment 5 Daniel Mach 2011-03-03 12:43:11 UTC
Created attachment 482064 [details]
performance patch

Can you check this patch?
It should boost performance by 50%.

I removed the file_checksum() methods; they look like they are no longer needed, but I may be wrong.

Comment 6 Pradeep Kilambi 2011-03-03 15:38:47 UTC
fixed!

I left the package_checksum and file_checksum calls in there as they are used by other modules. I just updated the batch query with your patch.

Should make it into the next build. Thanks.

Comment 7 Jeff Ortel 2011-03-07 20:43:16 UTC
Build: 0.145

Comment 8 Preethi Thomas 2011-03-31 21:38:03 UTC
verified
[root@preethi ~]# rpm -q pulp
pulp-0.0.159-1.fc14.noarch


[root@preethi ~]# ./bz677695.py 
{u'fedora-bookmarks-14-1.noarch.rpm': [u'82a93aa6787c3223e8880dfa57d056627e73b8ea55788ba23d5c9dc094e48523'], u'silkscreen-fonts-common-1.0-4.fc12.noarch.rpm': [u'a67ac991a7a4288df11079f5b529c356c8104bc38f366833ff169ef23ec42758']}
{u'test2.txt': [u'8ff2165791201edf1f3ed70518d8b30115a1272eabe231a50fe50445a9ff19ae']}

Comment 9 Preethi Thomas 2011-08-16 12:10:42 UTC
Closing with Community Release 15

pulp-0.0.223-4.

Comment 10 Preethi Thomas 2011-08-16 12:22:33 UTC
Closing with Community Release 15

pulp-0.0.223-4.