Bug 1462956

Summary: RFE: Add support for Amazon S3 to download vmcores as an alternative to ftp
Product: [Fedora] Fedora EPEL
Reporter: Dave Wysochanski <dwysocha>
Component: retrace-server
Assignee: abrt <abrt-devel-list>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: high
Version: epel7
CC: brhatiga, daduval, jakub, jwest, michal.toman, mimoore, mmarusak, msuchy, npatil, stalexan, tbutt, yshao
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-08 14:06:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dave Wysochanski 2017-06-19 18:24:55 UTC
Description of problem:
We need support for downloading core files from Amazon S3 storage into retrace-server.  I am not sure, but I think these are the API docs:
https://aws.amazon.com/sdk-for-python/

points at
https://boto.readthedocs.io/en/latest/

and then for S3
https://boto.readthedocs.io/en/latest/ref/s3.html


Version-Release number of selected component (if applicable):
retrace-server-1.17.0-3.el6.noarch


Additional info:

Comment 1 Miroslav Suchý 2017-07-10 13:44:33 UTC
EL6 will definitely not get this feature. Switching to EL7.

Comment 2 Dave Wysochanski 2018-03-29 20:50:13 UTC
There is a RHEL7 bug for "Improved S3 filesystem interface":
https://bugzilla.redhat.com/show_bug.cgi?id=1477798

If anything further is needed in RHEL for retrace-server's S3 support, we probably need to show the use case in that bug.

Adding Shao to the CC list.

Comment 3 Dave Wysochanski 2018-03-29 21:05:08 UTC
If it turns out that S3 support should be treated as a third method of submitting to retrace-server (the first is a local file, the second is ftp), then we may need to redo the 'manager' page.  We could do that on this bug or split it off into another bug.  Options seem to be:
1. Split the front table into 3 sections rather than 2
ftp files | S3 files | finished tasks

2. Merge ftp and S3 tasks, handle them transparently regardless of whether ftp or S3 is used, and label the section "remote files" or something similar
remote files | finished tasks

If we split them (option #1), it would be easier to see at a glance which tasks were S3 and which were FTP.  Why would this matter?  It would let us see how the vmcores are coming in (assuming both input methods will remain active for some time).  This is probably not a strong reason to go that route, though, since we may be able to record the same information with a file in the 'task' directory.
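
A minimal sketch of the "file in the task directory" idea from this comment, in Python. The file name `source`, the set of allowed values, and both helper functions are hypothetical and are not part of retrace-server; the sketch only shows that a one-line marker file would be enough to tell S3 tasks from FTP tasks later.

```python
import os

# Hypothetical marker file name and values; retrace-server does not define these.
SOURCE_FILENAME = "source"
VALID_SOURCES = ("local", "ftp", "s3")


def record_task_source(task_dir, source):
    """Write the submission method of a task into its task directory."""
    if source not in VALID_SOURCES:
        raise ValueError("unknown source: %r" % (source,))
    with open(os.path.join(task_dir, SOURCE_FILENAME), "w") as f:
        f.write(source + "\n")


def read_task_source(task_dir):
    """Return the recorded submission method, or None if none was recorded."""
    try:
        with open(os.path.join(task_dir, SOURCE_FILENAME)) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None
```

With something like this in place, the manager page could keep a single "remote files" section (option #2) and still report how each vmcore arrived.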

Comment 4 Miroslav Suchý 2018-04-03 14:27:47 UTC
In the manager page you can pass any URL which wget can download. Based on
  https://stackoverflow.com/questions/18239567/how-can-i-download-a-file-from-an-s3-bucket-with-wget
it seems you can download S3 files using wget as well. Does this work for you?
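
To make the suggestion concrete, here is a rough Python sketch of what would be passed to wget. It assumes the object is publicly readable (or that a pre-signed URL is used instead); a private object cannot be fetched with a plain URL like this. The bucket name, key, destination path and both helper functions are made up for illustration.

```python
from urllib.parse import quote


def s3_object_url(bucket, key):
    """Build a virtual-hosted-style HTTPS URL for an S3 object.

    Only useful as-is for publicly readable objects.
    """
    # quote() keeps "/" intact by default, so key prefixes survive.
    return "https://%s.s3.amazonaws.com/%s" % (bucket, quote(key))


def wget_command(url, dest):
    """Return the argv list for downloading url to dest with wget."""
    return ["wget", "-O", dest, url]


# Hypothetical bucket and key, for illustration only.
url = s3_object_url("example-vmcores", "case 123/vmcore")
cmd = wget_command(url, "/tmp/vmcore")
```

The resulting URL is what would go into the manager page; the `wget_command` list could be passed to `subprocess.run` for a manual test.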

Comment 5 Dave Wysochanski 2018-04-03 14:39:34 UTC
(In reply to Miroslav Suchý from comment #4)
> In the manager page you can pass any URL which wget can download. Based on
> https://stackoverflow.com/questions/18239567/how-can-i-download-a-file-from-an-s3-bucket-with-wget
> it seems you can download S3 files using wget as well. Does this work for
> you?

It will probably depend on the security setup. Maybe Spenser or Shao knows how this will work or is intended to work.

Comment 6 Dave Wysochanski 2018-04-10 14:24:42 UTC
It would also be good if there was a way to reject a vmcore that has already been submitted.  If there is an md5sum or checksum in S3 we could possibly add that to the 'stats.db' file and then check it before downloading from S3.  That would eliminate our duplicate vmcore problem.

As a start, we have a patch today that deduplicates based on md5sum in retrace-server-cleanup (see https://bugzilla.redhat.com/show_bug.cgi?id=1558903).  We could also reject based on this, but that is not ideal since we would first have to download the vmcore and calculate its md5sum.
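
A rough Python sketch of the "check before downloading" idea, assuming 'stats.db' is a SQLite database and that we add a checksum table to it. The table name `vmcore_checksums`, its columns and all function names are hypothetical. One caveat: S3 reports an ETag for each object, but it is the MD5 of the content only for objects uploaded in a single part; multipart uploads get an ETag with a "-N" suffix that is not a plain MD5, so the sketch treats those as "unknown" and falls back to downloading.

```python
import sqlite3


def init_checksum_table(conn):
    # Hypothetical table; not part of the existing stats.db schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vmcore_checksums ("
        "md5sum TEXT PRIMARY KEY, taskid INTEGER)"
    )


def etag_as_md5(etag):
    """Return the MD5 hex digest carried by an S3 ETag, or None.

    ETags arrive wrapped in double quotes; multipart ETags contain "-"
    and are not usable as an MD5 of the whole object.
    """
    value = etag.strip('"').lower()
    if "-" in value or len(value) != 32:
        return None
    return value


def is_duplicate(conn, md5sum):
    """True if a vmcore with this md5sum was already recorded."""
    row = conn.execute(
        "SELECT 1 FROM vmcore_checksums WHERE md5sum = ?", (md5sum,)
    ).fetchone()
    return row is not None


def record_checksum(conn, md5sum, taskid):
    """Remember the md5sum of a vmcore accepted for the given task."""
    conn.execute(
        "INSERT OR IGNORE INTO vmcore_checksums (md5sum, taskid) VALUES (?, ?)",
        (md5sum, taskid),
    )
    conn.commit()
```

The intended flow would be: read the object's ETag from S3 metadata, convert it with `etag_as_md5`, and skip the download when `is_duplicate` returns True; otherwise download as usual and call `record_checksum` once the task is created.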

Comment 10 Dave Wysochanski 2023-08-08 14:06:02 UTC
Closing this WONTFIX due to lack of bandwidth and unclear upstream project status going forward.

Comment 11 Red Hat Bugzilla 2023-12-07 04:25:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days