Description of problem:
A Satellite 6 setup that was synced in disconnected mode ends up resyncing the whole content again when it is reconnected to the CDN.

Syncing content in disconnected mode:
Space occupied: 31 GB
Repos enabled: RHEL7_optional, RHEL7_x86_64, RHEL7_Sat6_Tools
Disconnected URL: http://abc.redhat.com/pub/sat-import

After syncing content with the CDN URL:
Space occupied: 56 GB
Repos enabled: RHEL7_optional_x86_64, RHEL7_OS_x86_64, RHEL7_Sat6_Tools

Version-Release number of selected component (if applicable):
Sat6.2-SNAP3-compose1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Moving from a disconnected to a connected setup resyncs all the RPMs/content.

Expected results:
Only the RPMs/content that do not already exist should be synced.

Additional info:
Created attachment 1136212 [details] disconnected mode sync status
Screenshot of the sync page with repos: RHEL7_OS_X86_64, RHEL7_OPTIONAL_X86_64 and RHEL7_SATTOOLS
Created attachment 1136226 [details] connected mode sync status
Screenshot of the sync page with repos: RHEL7_OS_X86_64, RHEL7_OPTIONAL_X86_64 and RHEL7_SATTOOLS
In the comment 1 screenshot, please note the "new package" count for sattools: 73. In the comment 2 screenshot, please note the "new package" count for sattools: 10. This is not the case for the other repos that were synced with it, RHEL7_{optional and OS}; they seem to be syncing all the content once again.
Were all repos set as "immediate"?
The RHEL7 repo on the CDN appears to have metadata generated with sha1, whereas the repo metadata on the ISO uses sha256. Unfortunately, pulp sees that the rpms from the CDN have different checksums than those on the ISO and cannot determine that they are the same files, so it re-downloads each rpm. This has previously resulted in double download and storage of rpms in cases where one rhel7 repo used sha1 but another similar repo (with mostly the same RPMs) used sha256. The only quick solution is to ensure that the metadata in each repo on the CDN uses the same checksum type as the corresponding repo on the ISO. Long-term we'll make pulp more flexible regarding checksum algorithms, but that is a way off and will require substantial change to pulp's data model.
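To illustrate the mismatch described above, here is a minimal sketch, not pulp's actual code. It assumes a simplified unit identity made of the file name, the checksum type and the digest; the `unit_key` helper, the package name and the payload bytes are all hypothetical.

```python
import hashlib


def unit_key(name, payload, checksum_type):
    """Build a simplified, hypothetical unit identity for one RPM.

    The identity includes the checksum type and the digest, so the same
    bytes hashed with two algorithms yield two different identities.
    """
    digest = hashlib.new(checksum_type, payload).hexdigest()
    return (name, checksum_type, digest)


# Stand-in bytes for a single RPM file (hypothetical content).
payload = b"stand-in bytes for one RPM file"
name = "example-1.0-1.el7.x86_64.rpm"  # hypothetical package name

iso_key = unit_key(name, payload, "sha256")  # as described by the ISO metadata
cdn_key = unit_key(name, payload, "sha1")    # as described by the CDN metadata

# Identical file, but the identities differ, so a lookup keyed on the
# checksum finds no existing unit and the file is downloaded again.
already_present = cdn_key in {iso_key}
print(already_present)
```

If both sides used the same checksum type, the two keys would be equal and the existing file could be reused.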
Yes, all repos were set to "immediate" before syncing content in disconnected mode. This was done by changing the download policy via settings --> katello --> "default_download_policy" : "immediate"
Pulp's export distributor now has a "checksum_type" option, introduced upstream in Pulp 2.9.0. It is documented at the very bottom of this page: http://docs.pulpproject.org/plugins/pulp_rpm/tech-reference/export-distributor.html#configuration-parameters The documentation describes it as: "Checksum type to use for metadata generation. For any units where the checksum of this type is not already known, it will be computed on-the-fly and saved for future use." Let us know if there are any questions about its use.
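The "computed on-the-fly and saved for future use" behaviour quoted above can be pictured with the following sketch. It is an illustration under assumptions, not pulp's implementation: the `Unit` class and `checksum_for` function are hypothetical, and only the `checksum_type` key in `export_config` comes from the documentation mentioned in this comment.

```python
import hashlib


class Unit:
    """Hypothetical stand-in for a stored RPM with its known checksums."""

    def __init__(self, name, payload, checksums=None):
        self.name = name
        self.payload = payload
        # Maps checksum type (e.g. "sha1") to its hex digest.
        self.checksums = dict(checksums or {})


def checksum_for(unit, checksum_type):
    """Return the requested checksum, computing and caching it if unknown."""
    if checksum_type not in unit.checksums:
        digest = hashlib.new(checksum_type, unit.payload).hexdigest()
        unit.checksums[checksum_type] = digest  # saved for future use
    return unit.checksums[checksum_type]


# Only "checksum_type" is taken from the documentation quoted above.
export_config = {"checksum_type": "sha256"}

payload = b"stand-in bytes for one RPM file"  # hypothetical content
unit = Unit(
    "example-1.0-1.el7.x86_64.rpm",  # hypothetical package name
    payload,
    checksums={"sha1": hashlib.sha1(payload).hexdigest()},
)

# The unit was stored with sha1 only; the sha256 value is computed on demand.
exported_checksum = checksum_for(unit, export_config["checksum_type"])
print(sorted(unit.checksums))
```

After the first call the unit carries both digests, so later exports with the same checksum type do not need to hash the file again.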
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.