Bug 2223292

Summary: build arrow with optional components
Product: [Fedora] Fedora
Reporter: steffen.stell
Component: libarrow
Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED WONTFIX
Severity: low
Priority: low
Version: 38
CC: benson_muite, code, kkeithle
Keywords: Improvement
Hardware: Unspecified
OS: Linux
URL: https://github.com/Enchufa2/cran2copr/issues/37#issuecomment-1633197821
Last Closed: 2023-07-17 14:02:38 UTC

Description steffen.stell 2023-07-17 09:28:46 UTC
Currently, libarrow is built without several optional components. In particular, I am looking for S3 support. Is it possible to build it with more of the optional features enabled? 

``` r
arrow_info()
#> Arrow package version: 11.0.0.3
#> 
#> Capabilities:
#>                
#> dataset    TRUE
#> substrait FALSE
#> parquet    TRUE
#> json       TRUE
#> s3        FALSE
#> gcs       FALSE
#> utf8proc   TRUE
#> re2        TRUE
#> snappy     TRUE
#> gzip       TRUE
#> brotli     TRUE
#> zstd       TRUE
#> lz4        TRUE
#> lz4_frame  TRUE
#> lzo       FALSE
#> bz2        TRUE
#> jemalloc  FALSE
#> mimalloc  FALSE
#> 
#> To reinstall with more optional capabilities enabled, see
#>    https://arrow.apache.org/docs/r/articles/install.html
#> 
#> Memory:
#>                  
#> Allocator  system
#> Current   0 bytes
#> Max       0 bytes
#> 
#> Runtime:
#>                         
#> SIMD Level          avx2
#> Detected SIMD Level avx2
#> 
#> Build:
#>                            
#> C++ Library Version  11.0.0
#> C++ Compiler            GNU
#> C++ Compiler Version 13.1.1
```
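
The capability table above is easier to act on when reduced to the list of disabled features. As a minimal sketch (Python, not part of the original report), the hypothetical helper `disabled_capabilities` parses the `#>`-prefixed text that `arrow_info()` prints. It assumes the layout shown above: a `Capabilities:` header followed by `name TRUE|FALSE` rows.

```python
import re


def disabled_capabilities(report: str) -> list[str]:
    """Return capability names marked FALSE in arrow_info() text output.

    Assumes the reprex-style layout from this report: a 'Capabilities:'
    header, then 'name TRUE|FALSE' rows, each optionally prefixed by '#>'.
    """
    disabled = []
    in_caps = False
    for raw in report.splitlines():
        # Drop the reprex comment prefix and surrounding whitespace.
        line = raw.strip().removeprefix("#>").strip()
        if line == "Capabilities:":
            in_caps = True
            continue
        if not in_caps:
            continue
        match = re.fullmatch(r"(\w+)\s+(TRUE|FALSE)", line)
        if match:
            if match.group(2) == "FALSE":
                disabled.append(match.group(1))
        elif line:
            # The first non-blank line that is not a row ends the table.
            break
    return disabled


# Shortened copy of the output above, for illustration only.
SAMPLE = """\
#> Capabilities:
#>
#> dataset    TRUE
#> substrait FALSE
#> s3        FALSE
#> gcs       FALSE
#>
#> To reinstall with more optional capabilities enabled, see
"""

print(disabled_capabilities(SAMPLE))  # prints ['substrait', 's3', 'gcs']
```

Run against the full output above, this would also list `lzo`, `jemalloc`, and `mimalloc`.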

Reproducible: Always

Comment 1 Kaleb KEITHLEY 2023-07-17 14:02:38 UTC
Enabling S3 requires Amazon aws-lc, and maybe more build-time dependencies. aws-lc doesn't exist in Fedora, at least not that I can find.

(Amazon likes to download sources at build time, which doesn't work in koji/mock builds which disable network access.)

When someone adds aws-lc to Fedora (or when I find time to add it), S3 can be enabled.

Setting CLOSED/WONTFIX. When aws-lc gets added you can reopen this bz or open a new bz.

Comment 2 Benson Muite 2023-07-18 13:05:31 UTC
It does build (https://copr.fedorainfracloud.org/coprs/fed500/aws-lc/build/6182074), though it will conflict with OpenSSL. Some licensing issues also need to be addressed.

Comment 3 Kaleb KEITHLEY 2023-07-19 12:31:02 UTC
(In reply to Benson Muite from comment #2)
> It does build
> https://copr.fedorainfracloud.org/coprs/fed500/aws-lc/build/6182074 though
> will conflict
> with OpenSSL.  Some licensing issues also need to be addressed.

That's good to know. But it doesn't help until aws-lc is actually added to Fedora.

Do you want to elaborate on the licensing issues? aws-lc appears to be licensed the same as OpenSSL and BoringSSL (which is not surprising, since it's derived from them.)

All things considered, it would be better, and easier, if Arrow's S3 just used OpenSSL.

Comment 4 steffen.stell 2023-07-20 12:39:53 UTC
I am somewhat confused by this requirement. The arrow documentation (https://arrow.apache.org/docs/r/articles/install.html#libraries) states that the only additional dependencies for S3 and GCS would be CURL and OpenSSL.

Comment 5 Kaleb KEITHLEY 2023-07-20 13:01:02 UTC
Simply flipping use_s3 or -DARROW_S3=ON results in:

...
-- stderr output is:
CMake Error at aws_lc_ep-stamp/download-aws_lc_ep.cmake:170 (message):
  Each download failed!
    error: downloading 'https://github.com/awslabs/aws-lc/archive/v1.3.0.tar.gz' failed
...

(full log at https://kojipkgs.fedoraproject.org//work/tasks/924/103480924/build.log)
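
Failures like the one above are easy to miss in a long koji log. As a minimal sketch (Python; the helper name `failed_downloads` is hypothetical), the following pulls the URLs out of `error: downloading '...' failed` lines. It assumes the message format shown in the excerpt; other CMake versions may word it differently.

```python
import re

# Matches the message format seen in the excerpt above (assumption:
# the wording is stable across the CMake versions you care about).
_FAILED = re.compile(r"error: downloading '([^']+)' failed")


def failed_downloads(log_text: str) -> list[str]:
    """Return each URL reported as a failed download, in order, de-duplicated."""
    seen = []
    for url in _FAILED.findall(log_text):
        if url not in seen:
            seen.append(url)
    return seen


# The excerpt quoted in this comment.
EXCERPT = """\
CMake Error at aws_lc_ep-stamp/download-aws_lc_ep.cmake:170 (message):
  Each download failed!
    error: downloading 'https://github.com/awslabs/aws-lc/archive/v1.3.0.tar.gz' failed
"""

print(failed_downloads(EXCERPT))
# prints ['https://github.com/awslabs/aws-lc/archive/v1.3.0.tar.gz']
```

Any URL this reports is a source the build tried to fetch at build time, which cannot succeed in koji/mock builds because network access is disabled there.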

And that's setting aside the fact that Fedora koji builds literally cannot download third-party sources. Then there's the work to patch up the CMake files to use the system's SSL lib, which looks, to me, like it's really aws-lc and not, as the documentation claims, OpenSSL. (There are lies, damned lies, and documentation. :-))

If you know something I don't know then send a PR to https://src.fedoraproject.org/rpms/libarrow

Comment 6 Kaleb KEITHLEY 2023-07-24 17:10:52 UTC
s/use the system's SSL lib/use a theoretical unbundled aws-lc SSL lib/