Bug 2303403 - [MCG-4.17] Guaranteed bucket logging fails due to "unexpected error in consuming log file"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.17.0
Assignee: Jacky Albo
QA Contact: Sagi Hirshfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-08-07 09:37 UTC by Sagi Hirshfeld
Modified: 2024-10-30 14:30 UTC (History)
5 users

Fixed In Version: 4.17.0-103
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-10-30 14:30:15 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github noobaa noobaa-core pull 8256 0 None Merged Adding node support for bucket logging 2024-09-16 12:59:52 UTC
Red Hat Issue Tracker OCSBZM-9249 0 None None None 2024-09-18 15:34:37 UTC
Red Hat Issue Tracker OCSBZM-9260 0 None None None 2024-09-16 12:55:25 UTC
Red Hat Product Errata RHSA-2024:8676 0 None None None 2024-10-30 14:30:18 UTC

Description Sagi Hirshfeld 2024-08-07 09:37:47 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
----------------------------------------------------------------------
While working on OCS-CI tests to cover the feature, I've noticed that occasionally some operations are never logged.

Inspection of the noobaa-core pod logs has shown the following error whenever that happened:


```
Aug-7 8:46:46.032 [BGWorkers/32] [ERROR] core.util.persistent_logger:: unexpected error in consuming log file: /var/logs/bucket-logs/bucket_logging.1723018876137.log
Aug-7 8:46:46.035 [BGWorkers/32] [ERROR] core.util.bucket_logs_utils:: unexpected error in upload to bucket: SyntaxError: Unexpected token '}', "}" is not valid JSON
    at JSON.parse (<anonymous>)
    at /root/node_modules/noobaa-core/src/util/bucket_logs_utils.js:59:36
    at /root/node_modules/noobaa-core/src/util/persistent_logger.js:293:23
    at NewlineReader.forEach (/root/node_modules/noobaa-core/src/util/file_reader.js:86:24)
    at async LogFile.collect_and_process (/root/node_modules/noobaa-core/src/util/persistent_logger.js:292:13)
    at async _upload_to_targets (/root/node_modules/noobaa-core/src/util/bucket_logs_utils.js:58:9)
    at async PersistentLogger._process (/root/node_modules/noobaa-core/src/util/persistent_logger.js:158:38)
    at async PersistentLogger.process (/root/node_modules/noobaa-core/src/util/persistent_logger.js:187:26)
    at async BucketLogUploader.scan (/root/node_modules/noobaa-core/src/server/bg_services/bucket_logs_upload.js:43:29)
    at async BucketLogUploader.run_batch (/root/node_modules/noobaa-core/src/server/bg_services/bucket_logs_upload.js:61:13) for: /var/logs/bucket-logs/bucket_logging.1723018876137.log
```
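The failure mode is easy to see in isolation: the uploader parses the intermediate log file line by line with JSON.parse, so a single corrupted line aborts the whole consume. A minimal sketch of that behavior (a simplification for illustration, not the actual noobaa-core code):

```javascript
// Hypothetical simplification of per-line log consumption:
// one malformed line (a lone '}') throws the SyntaxError seen above.
const lines = [
  '{"op":"DELETE","http_status":102,"request_id":"lzjkukii-33qjp7-ueq"}',
  '}', // a corrupted entry, like row 14 in the dump below
];

const results = [];
for (const line of lines) {
  try {
    results.push(JSON.parse(line).op);
  } catch (err) {
    // surfaces as "unexpected error in consuming log file"
    results.push('parse failed');
  }
}
console.log(results); // [ 'DELETE', 'parse failed' ]
```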

So I tried to inspect the intermediate logs under /var/logs/bucket-logs/ via jq, only to confirm there's a parsing issue:
```
$ oc rsh noobaa-core-0 cat /var/logs/bucket-logs/bucket_logging.log | jq 
...
...
{
  "noobaa_bucket_logging": "true",
  "op": "DELETE",
  ...
  ...
  "http_status": 102,
  "request_id": "lzjkukii-33qjp7-ueq"
}
parse error: Unmatched '}' at line 15, column 1
```

And indeed, I found a row that contained only a closing curly brace ('}' at row 14 below):
```
  1 {"noobaa_bucket_logging":"true","op":"GET","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0?list-type=2&prefix=&encoding-type=url","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0?list-type=2&prefix=&encoding-type=url","http_status":102,"request_id":"lzjkuk4b-an30c7-hfo"}
  1 {"noobaa_bucket_logging":"true","op":"GET","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0?list-type=2&prefix=&encoding-type=url","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0?list-type=2&prefix=&encoding-type=url","http_status":200,"request_id":"lzjkuk4b-an30c7-hfo"}
  2 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-1","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-1","http_status":102,"request_id":"lzjkukho-224e9f-7tt"}
  3 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-3","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-3","http_status":102,"request_id":"lzjkuki0-297v13-yus"}
  4 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-1","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-1","http_status":204,"request_id":"lzjkukho-224e9f-7tt"}
  5 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-4","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-4","http_status":102,"request_id":"lzjkukii-2jtihm-2fs"}
  6 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/dog.jpeg","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/dog.jpeg","http_status":102,"request_id":"lzjkukil-2lmgbo-pdw"}
  7 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-3","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-3","http_status":204,"request_id":"lzjkuki0-297v13-yus"}
  8 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-4","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-4","http_status":204,"request_id":"lzjkukii-2jtihm-2fs"}
  9 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/bird.jpeg","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/bird.jpeg","http_status":102,"request_id":"lzjkukj2-2vlqqi-q41"}
 10 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/big_cat_pic.avif","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/big_cat_pic.avif","http_status":102,"request_id":"lzjkuki1-2tvjuo-7b0"}
 11 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-2","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-2","http_status":102,"request_id":"lzjkuki7-2xa9vr-m9d"}
 12 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/seal.jpeg","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/seal.jpeg","http_status":102,"request_id":"lzjkukik-34uwzf-sw7"}
 13 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-0","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-0","http_status":102,"request_id":"lzjkukii-33qjp7-ueq"}
 14 }
 15 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/big_cat_pic.avif","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/big_cat_pic.avif","http_status":204,"request_id":"lzjkuki1-2tvjuo-7b0"}
 16 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-2","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-2","http_status":204,"request_id":"lzjkuki7-2xa9vr-m9d"}
 17 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/seal.jpeg","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/seal.jpeg","http_status":204,"request_id":"lzjkukik-34uwzf-sw7"}
 18 {"noobaa_bucket_logging":"true","op":"DELETE","bucket_owner":"admin","source_bucket":"s3-bucket-8ff829763dc1463192e568aaa8ddd0","object_key":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-0","log_bucket":"s3-bucket-5b748ef69daa4094b3ec1702e4205b","log_prefix":"","remote_ip":"<CENSORED>","request_uri":"/s3-bucket-8ff829763dc1463192e568aaa8ddd0/ObjKey-0","http_status":204,"request_id":"lzjkukii-33qjp7-ueq"}
```

While this is not 100% reproducible, the likelihood of the issue seems to rise with the volume and frequency of operations.
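A stray lone '}' line like the one above is consistent with a torn or interleaved write to the shared intermediate log file, which would also explain the correlation with operation volume. One defensive mitigation on the consumer side (my own sketch, not necessarily what the actual fix does) is to skip and count malformed lines instead of failing the whole upload:

```javascript
// Sketch: consume a log file line by line, skipping (and counting)
// malformed entries instead of aborting the whole upload batch.
function consumeLogLines(lines) {
  const entries = [];
  let skipped = 0;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed) continue; // ignore blank lines
    try {
      entries.push(JSON.parse(trimmed));
    } catch (err) {
      skipped += 1; // e.g. a torn write that left a lone '}'
    }
  }
  return { entries, skipped };
}

const { entries, skipped } = consumeLogLines([
  '{"op":"DELETE","http_status":102}',
  '}',
  '{"op":"DELETE","http_status":204}',
]);
console.log(entries.length, skipped); // 2 parsed, 1 skipped
```

For guaranteed logging the skipped entries would still need to be accounted for (e.g. quarantined for later repair) rather than silently dropped, since dropping them trades the parse error for lost log records.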


Version of all relevant components (if applicable):
----------------------------------------------------------------------
OCP: 4.17.0-0.nightly-2024-08-01-213905
ODF: 4.17.0-63
ceph: 19.1.0-4.el9cp (b2c7ded5f7885ce1d488a241a30cba80f58d28bc) squid (rc)
rook: v4.17.0-0.b5d18bf0dd4ee4aacfe881ca3030ce10a156008f
noobaa core: 5.17.0-a7069fc
noobaa operator: 5.17.0-d3d80339937ee79af64a302e7870ae3c3c08127c
oc cli: 4.14.0-0.nightly-2023-09-02-132842
noobaa cli: 5.17.0


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
----------------------------------------------------------------------
It makes the feature unreliable even under very mild workloads: repeatedly syncing 10 files and deleting them via the AWS CLI about 3 times in a row seems to reliably reproduce the issue.


Is there any workaround available, to the best of your knowledge?
----------------------------------------------------------------------
Nope


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------
2


Is this issue reproducible?
----------------------------------------------------------------------
Yes


Can this issue be reproduced from the UI?
----------------------------------------------------------------------
N/A


If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------
No, since this is a new feature


Steps to Reproduce:
----------------------------------------------------------------------
1. Set guaranteed bucket logging on a source-bucket using a logs-bucket
2. Repeat the following AWS s3 CLI ops 5 times, using 10 1MB test objects:
   a. s3_alias sync test-objs-dir/ s3://source-bucket
   b. s3_alias rm s3://source-bucket --recursive
3. Search the noobaa-core logs for the "unexpected error in consuming log file" error:
```
$ oc logs noobaa-core-0 | grep "unexpected error in consuming log file" -A10
```
4. Inspect the logs that were written to the logs-bucket:
```
$ OBJECT_KEYS=$(my_s3 api list-objects --bucket logs-bucket --query 'Contents[].Key')

$ for KEY in $(echo $OBJECT_KEYS | jq -r '.[]'); do
    my_s3 cp s3://logs-bucket/$KEY - | jq >> all_logs.json
done
```

Actual results:
----------------------------------------------------------------------
The "unexpected error in consuming log file" error above appears in the noobaa-core logs, and many of the expected log entries are missing.


Expected results:
----------------------------------------------------------------------
The error should not appear in the noobaa-core logs, and for every PUT and DELETE op made in the second step we should see two log entries (one with the 102 code for the intent, and one with the final status code of the op).
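This expectation can be checked mechanically by grouping the collected entries by request_id and requiring both the 102 intent record and a terminal 2xx record per group. A sketch for such a check (the entry shape mirrors the dump in the description; the helper name is illustrative, not part of OCS-CI):

```javascript
// Sketch: verify every request_id has both a 102 "intent" record and a
// terminal 2xx status record, as guaranteed bucket logging promises.
function verifyLogPairs(entries) {
  const byRequest = new Map();
  for (const e of entries) {
    if (!byRequest.has(e.request_id)) byRequest.set(e.request_id, []);
    byRequest.get(e.request_id).push(e.http_status);
  }
  const incomplete = [];
  for (const [id, statuses] of byRequest) {
    const hasIntent = statuses.includes(102);
    const hasFinal = statuses.some(s => s >= 200 && s < 300);
    if (!hasIntent || !hasFinal) incomplete.push(id);
  }
  return incomplete; // request_ids missing one of the two expected records
}

const incomplete = verifyLogPairs([
  { request_id: 'a', http_status: 102 },
  { request_id: 'a', http_status: 204 },
  { request_id: 'b', http_status: 102 }, // never got its terminal record
]);
console.log(incomplete); // [ 'b' ]
```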


Additional info:
----------------------------------------------------------------------
Providing links for debugging in a following comment

Comment 8 Sunil Kumar Acharya 2024-09-18 12:06:54 UTC
Please update the RDT flag/text appropriately.

Comment 11 errata-xmlrpc 2024-10-30 14:30:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676

