Bug 2173757

Summary: Can't rerun a failed content-import task if it was exported using chunks
Product: Red Hat Satellite
Reporter: Joniel Pasqualetto <jpasqual>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Vladimír Sedmík <vsedmik>
Severity: medium
Docs Contact:
Priority: high
Version: 6.12.1
CC: ahumbe, dalley, dkliban, ggainey, osousa, paji, rchan, sshewale, vsedmik
Target Milestone: 6.14.0
Keywords: PrioBumpGSS, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pulpcore-3.22.6, pulpcore-3.21.9, pulpcore-3.18.19, pulpcore-3.16.19
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2218659 2227903
Environment:
Last Closed: 2023-11-08 14:18:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Joniel Pasqualetto 2023-02-27 21:43:56 UTC
Description of problem:
When importing content that was exported in chunks, the importer process concatenates the chunks into a single file before running the import.

If that import task fails for some reason, after the chunks were already combined into a single file, the user can't simply re-run the same command to retry the import. Satellite will complain that the chunks are missing.

Checking the directory, we can see that the chunks are gone and only the master file is present (together with metadata.json and TOC file).
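
To make the failure mode concrete, here is a minimal sketch of a concatenate-then-delete step. This is not Pulp's actual code: the function name `combine_chunks` and its arguments are hypothetical, and it assumes each chunk is removed as soon as it has been merged, which is consistent with the directory state described above.

```python
import shutil
from pathlib import Path


def combine_chunks(chunk_paths, combined_path):
    """Concatenate chunks, in order, into combined_path.

    Hypothetical illustration: each chunk is deleted right after it is
    copied, so a failure later in the import leaves only the combined file.
    """
    with open(combined_path, "wb") as out:
        for chunk in chunk_paths:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, out)
            Path(chunk).unlink()  # the chunk no longer exists after merging
```

After this step succeeds, a second run that insists on finding every chunk named in the TOC has nothing to find, even though the data is intact in the combined file.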

Version-Release number of selected component (if applicable):

6.10, 6.11, 6.12

How reproducible:


Steps to Reproduce:
1. Export something using chunks
2. Start the import on another Satellite. Monitor the data directory until all the chunk files are gone and only the main file is present. At that point, kill the pulpcore-worker that is processing the import. Wait until the task returns with an error.
3. Repeat the same import command. Pulp will fail with an error like this:

~~~
Feb 27 16:38:56 reproducer-import pulpcore-worker-3[156191]: pulp [6c91f855-9959-43b1-864f-925393ae025a]: pulpcore.tasking.pulpcore_worker:INFO: Task ddf45eb1-adea-480e-9114-748baa7bd7bf failed ([ErrorDetail(string="Missing import-chunks named in table-of-contents: ['export-fc4273a4-6320-4fd5-98c9-6adfe9461781-20230227_2132.tar.gz.0000', 'export-fc4273a4-6320-4fd5-98c9-6adfe9461781-20230227_2132.tar.gz.0001', 'export-fc4273a4-6320-4fd5-98c9-6adfe9461781-20230227_2132.tar.gz.0002'].", code='invalid')])
~~~

Actual results:
Re-running the same import fails, complaining about missing chunks. The user needs to modify the TOC file manually OR split the master file into chunks again OR copy the files again in order to run the import.
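
As an illustration of the "split the master file again" workaround, here is a rough sketch. The helper `resplit` is hypothetical, not part of Pulp or hammer. It assumes the chunk names and the chunk size in bytes are known (for example, read by hand from the TOC file) and that every chunk except possibly the last has exactly that size.

```python
from pathlib import Path

BLOCK = 1024 * 1024  # copy in 1 MiB blocks to keep memory use small


def resplit(combined_path, chunk_names, chunk_size):
    """Recreate chunk files next to combined_path (hypothetical helper).

    chunk_names must be in the original order; chunk_size is in bytes.
    The combined file itself is left untouched.
    """
    combined = Path(combined_path)
    with open(combined, "rb") as src:
        for name in chunk_names:
            remaining = chunk_size
            with open(combined.parent / name, "wb") as dst:
                while remaining > 0:
                    block = src.read(min(remaining, BLOCK))
                    if not block:
                        break  # end of combined file: last chunk is short
                    dst.write(block)
                    remaining -= len(block)
        if src.read(1):
            raise ValueError("combined file is larger than the named chunks allow")
```

Since this keeps the combined file and writes the chunks alongside it, the import directory temporarily needs roughly twice the disk space.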

Expected results:
Pulp could be smart enough to detect that either all the chunks are present OR the combined (global) file is present. If the checksum matches, move forward with the import.
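
The requested behaviour could look roughly like the sketch below. This is only an illustration of the idea, not the fix that was shipped in pulpcore. The function names are hypothetical, SHA-256 is an assumed checksum algorithm, and the expected hashes are passed in directly instead of being parsed from the TOC file.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def pick_import_source(directory, combined_name, combined_hash, chunk_hashes):
    """Decide whether to import from the chunks or from the combined file.

    chunk_hashes maps chunk file name -> expected hex digest (assumed input).
    Returns "chunks" or "combined"; raises if neither is usable.
    """
    directory = Path(directory)
    missing = [n for n in chunk_hashes if not (directory / n).is_file()]
    if not missing:
        for name, expected in chunk_hashes.items():
            if sha256_of(directory / name) != expected:
                raise ValueError(f"checksum mismatch for chunk {name}")
        return "chunks"
    combined = directory / combined_name
    if combined.is_file() and sha256_of(combined) == combined_hash:
        return "combined"  # chunks already merged by an earlier attempt
    raise FileNotFoundError(f"missing chunks {missing} and no valid combined file")
```

With a check like this, a rerun after a failed import would return "combined" and could skip the concatenation step instead of reporting the chunks as missing.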

Additional info:
I've been seeing this for a while, since previous versions. Not a new thing.

Comment 2 Brad Buckingham 2023-03-02 15:24:54 UTC
Partha,

Based upon the description, should this be on the Pulp component rather than ISS? Thanks!

Comment 3 Partha Aji 2023-04-06 15:01:59 UTC
Yes, this is Pulp.

Comment 5 Robin Chan 2023-06-05 08:05:12 UTC
The Pulp upstream bug is now closed. Updating the external tracker on this bug.

Comment 6 Vladimír Sedmík 2023-07-13 20:23:31 UTC
Verified in 6.14.0 snap 7 (python39-pulpcore-3.22.7-1.el8pc.noarch)

Steps to verify:

On export SAT:
1) Set immediate download policy, uploaded manifest.
2) Synced RH repo big enough to be chunked (RHEL8 baseos).
3) Exported complete Library using chunks:
# hammer content-export complete library --organization-id 1 --chunk-size-gb 1

On import SAT:
1) Set immediate download policy, uploaded manifest, set "Export Sync" in CDN configuration.
2) Created import dir, copied the content from the export SAT, updated owner. 
# scp root@$EXP_SAT:/var/lib/pulp/exports/Default_Organization/Export-Library/1.0/$TIME/* /var/lib/pulp/imports/$TIME/
# chown -R pulp:pulp /var/lib/pulp/imports/$TIME/
3) Started the import and once the chunks were gone (merged into single tar.gz), restarted the services:
# hammer content-import library --organization="Default Organization" --path="/var/lib/pulp/imports/$TIME/"
# foreman-maintain service restart
4) Started the import again and let it finish -> the task succeeded.
5) Checked that the imported content matched the export.

Comment 7 Grant Gainey 2023-08-31 11:38:39 UTC
*** Bug 2092008 has been marked as a duplicate of this bug. ***

Comment 10 errata-xmlrpc 2023-11-08 14:18:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.14 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6818