Bug 2284394

Summary: [RGW QAT]: Decompression fails on QAT when downloading regular uploaded object
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tejas <tchandra>
Component: RGW
Assignee: Mark Kogan <mkogan>
Status: CLOSED ERRATA
QA Contact: Tejas <tchandra>
Severity: urgent
Docs Contact: Akash Raj <akraj>
Priority: unspecified
Version: 7.1
CC: ceph-eng-bugs, cephqe-warriors, jcaratza, mbenjamin, mkasturi, mkogan, rpollack, sostapov, tserlin
Target Milestone: ---
Keywords: Reopened
Target Release: 7.1z2
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: ceph-18.2.1-235.el9cp
Doc Type: Known Issue
Doc Text:
.Intel QAT Acceleration for Object Compression & Encryption
Intel QuickAssist Technology (QAT) is implemented to help reduce node CPU usage and improve the performance of the Ceph Object Gateway when compression and encryption are enabled. It is a known issue that QAT can only be configured on new setups (greenfield only). QAT Ceph Object Gateway daemons cannot be configured in the same cluster as non-QAT (regular) Ceph Object Gateway daemons.
Story Points: ---
Clone Of:
: 2307218 (view as bug list)
Environment:
Last Closed: 2024-11-07 14:38:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2267614, 2298578, 2298579, 2307218

Description Tejas 2024-06-03 08:09:10 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 Mark Kogan 2024-06-04 13:10:56 UTC
Interim update: the issue does NOT reproduce in a SW QAT env,
with either single-part or multi-part uploads.


Multi-part (MP) example. Port 8000 has qat_compressor_enabled=false; port 8001 has qat_compressor_enabled=true:
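
A minimal sketch of a ceph.conf-style fragment for such a side-by-side test (the instance names client.rgw.noqat / client.rgw.qat are hypothetical; only qat_compressor_enabled and the ports are taken from this comment):

```ini
# Hypothetical two-instance test config; instance names are illustrative.
# Baseline instance on port 8000: QAT compression disabled.
[client.rgw.noqat]
rgw_frontends = beast port=8000
qat_compressor_enabled = false

# Instance on port 8001: QAT compression enabled.
[client.rgw.qat]
rgw_frontends = beast port=8001
qat_compressor_enabled = true
```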


❯ s3cmd put --host=0:8000 ./Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2 s3://bkt/objMP02
                     ^^^^ PUT to non QAT (multi-part)
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 1 of 26, 15MB] [1 of 1]                                                                              
 15728640 of 15728640   100% in    0s    27.77 MB/s  done                                                                                                                                    
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 2 of 26, 15MB] [1 of 1]                                                                              
 15728640 of 15728640   100% in    0s    26.27 MB/s  done                                                                                                                                    
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 3 of 26, 15MB] [1 of 1]                                                                              
 15728640 of 15728640   100% in    0s    29.11 MB/s  done                                                                                                                                    
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 4 of 26, 15MB] [1 of 1]
...
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 25 of 26, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    29.17 MB/s  done
upload: './Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2' -> 's3://bkt/objMP02'  [part 26 of 26, 4MB] [1 of 1]
 4259840 of 4259840   100% in    0s    28.24 MB/s  done


❯ s3cmd get --host=0:8001 s3://bkt/objMP02 --force
                     ^^^^ QAT
download: 's3://bkt/objMP02' -> './objMP02'  [1 of 1]
 397475840 of 397475840   100% in    0s   407.11 MB/s  done
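
Beyond the transfer completing, the downloaded object can be checked byte-for-byte against the uploaded source. A minimal sketch (the file names src.bin/dst.bin are stand-ins for the original qcow2 and the `s3cmd get` output):

```shell
# Stand-ins for the uploaded source and the object fetched with `s3cmd get`.
src=./src.bin
dst=./dst.bin

# Create a 1 MiB test file and simulate the upload/download round trip with a copy.
head -c 1048576 /dev/urandom > "$src"
cp "$src" "$dst"

# cmp -s exits 0 only if the two files are byte-identical.
if cmp -s "$src" "$dst"; then
  echo "match"
else
  echo "MISMATCH"
fi
```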

tail -F ./out/radosgw.8001.log | tspin
2024-06-04T09:58:25.884+0000 7fffc6d84640  1 beast: 0x7fffc0576510: 127.0.0.1 - cosbench [04/Jun/2024:09:58:16.749 +0000] " PUT  /bkt/obj01 HTTP/1.1" 200 397475840 - "aws-cli/1.23.10 Python/3.9.18 Linux/5.14.0-427.18.1.el9_4.x86_64 botocore/1.25.10" - latency=9.135948181s
2024-06-04T09:59:22.851+0000 7fffc357d640  1 ====== starting new request req=0x7fffc0576510 =====
2024-06-04T09:59:22.852+0000 7fffc357d640  1 ====== req done req=0x7fffc0576510 op status=0 http_status=200 latency=0.000999995s ======
2024-06-04T09:59:22.852+0000 7fffc357d640  1 beast: 0x7fffc0576510: 127.0.0.1 - cosbench [04/Jun/2024:09:59:22.851 +0000] " GET  /bkt/?location HTTP/1.1" 200 134 - - - latency=0.000999995s
2024-06-04T09:59:22.853+0000 7fffc5d82640  1 ====== starting new request req=0x7fffc0576510 =====
2024-06-04T09:59:22.854+0000 7fffc3d7e640  1 ====== req done req=0x7fffc0576510 op status=0 http_status=200 latency=0.000999994s ======
2024-06-04T09:59:22.854+0000 7fffc3d7e640  1 beast: 0x7fffc0576510: 127.0.0.1 - cosbench [04/Jun/2024:09:59:22.853 +0000] " HEAD  /bkt/objMP02 HTTP/1.1" 200 0 - - - latency=0.000999994s
2024-06-04T09:59:22.855+0000 7fffc5581640  1 ====== starting new request req=0x7fffc0576510 =====
2024-06-04T09:59:23.779+0000 7fffc5581640  1 ====== req done req=0x7fffc0576510 op status=0 http_status=200 latency=0.923994780s ======
                                                                                                        ^^^
2024-06-04T09:59:23.779+0000 7fffc5581640  1 beast: 0x7fffc0576510: 127.0.0.1 - cosbench [04/Jun/2024:09:59:22.855 +0000] " GET  /bkt/objMP02 HTTP/1.1" 200 397475840 - - - latency=0.923994780s
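
When comparing QAT vs non-QAT GET times, the latency= field in the beast access-log lines above can be extracted with standard tools; a small sketch using sed on the GET line above:

```shell
# One of the beast access-log lines from above.
line='2024-06-04T09:59:23.779+0000 7fffc5581640  1 beast: 0x7fffc0576510: 127.0.0.1 - cosbench [04/Jun/2024:09:59:22.855 +0000] " GET  /bkt/objMP02 HTTP/1.1" 200 397475840 - - - latency=0.923994780s'

# Keep only the numeric latency value (drop everything else and the trailing "s").
lat=$(printf '%s\n' "$line" | sed -n 's/.*latency=\([0-9.]*\)s.*/\1/p')
echo "$lat"   # 0.923994780
```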


waiting for a HW QAT env to become available

Comment 22 errata-xmlrpc 2024-11-07 14:38:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9010

Comment 23 Red Hat Bugzilla 2025-03-08 04:25:05 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 120 days.