Bug 2036211

Summary: [GSS] noobaa-endpoint becomes CrashLoopBackOff when uploading metrics data to bucket
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: cqu
Component: Multi-Cloud Object Gateway
Assignee: Romy Ayalon <rayalon>
Status: CLOSED ERRATA
QA Contact: Ben Eli <belimele>
Severity: high
Docs Contact:
Priority: high
Version: 4.9
CC: assingh, belimele, etamir, jarrpa, muagarwa, nbecker, nberry, ocs-bugs, odf-bz-bot, rayalon, swilson, tdesala
Target Milestone: ---
Target Release: ODF 4.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 4.10.0-113
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 2051829
Environment:
Last Closed: 2022-04-13 18:51:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2051829

Description cqu 2021-12-30 13:15:30 UTC
Description of problem (please be as detailed as possible and provide log
snippets): I deployed ODF and created one bucket to use as S3 object storage. When metrics data is uploaded into the bucket, noobaa-endpoint goes into CrashLoopBackOff and errors appear in the log.


Version of all relevant components (if applicable): 4.9


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? Yes


Is there any workaround available to the best of your knowledge? No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)? 4


Is this issue reproducible? Yes


Can this issue reproduce from the UI? 


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy ODF and create one bucket, first.bucket.
2. Configure this bucket as the Thanos S3 object storage.
3. When Thanos uploads metrics data into the bucket, noobaa-endpoint goes into CrashLoopBackOff and the following errors appear in the log:
```
Dec-30 12:26:45.794 [Endpoint/14] [ERROR] CONSOLE:: RPC._on_request: ERROR srv object_api.read_object_md reqid 4@fcall://fcall(3ubm4vl) connid fcall://fcall(3ubm4vl) Error: No such object: obj_id undefined bucket SENSITIVE-329af250378a10bb key 01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json
    at check_object_mode (/root/node_modules/noobaa-core/src/server/object_services/object_server.js:1496:15)
    at find_object_md (/root/node_modules/noobaa-core/src/server/object_services/object_server.js:1417:5)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async Object.read_object_md (/root/node_modules/noobaa-core/src/server/object_services/object_server.js:775:17)
    at async /root/node_modules/noobaa-core/src/rpc/rpc.js:340:32
    at async RPC._on_request (/root/node_modules/noobaa-core/src/rpc/rpc.js:348:25)
Dec-30 12:26:45.794 [Endpoint/14] [ERROR] core.rpc.rpc:: RPC._request: response ERROR srv object_api.read_object_md reqid 4@fcall://fcall(3ubm4vl) connid fcall://fcall(3ubm4vl) params { bucket: SENSITIVE-329af250378a10bb, key: '01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json', version_id: undefined, md_conditions: undefined, encryption: undefined } took [2.3+0.5=2.8] [RpcError: No such object: obj_id undefined bucket SENSITIVE-329af250378a10bb key 01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json] { rpc_code: 'NO_SUCH_OBJECT' }
Dec-30 12:26:45.794 [Endpoint/14] [ERROR] core.endpoint.s3.s3_rest:: S3 ERROR <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Resource>/first.bucket/01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json</Resource><RequestId>kxsxyv1a-2vvt8r-3cb</RequestId></Error> HEAD /first.bucket/01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json {"host":"a702e5a7764d84aee947e3b4e94fdb40-1448416135.us-east-2.elb.amazonaws.com","user-agent":"MinIO (linux; amd64) minio-go/v7.0.10 thanos-receive/0.22.0 (go1.16.6)","authorization":"AWS4-HMAC-SHA256 Credential=hStml498dG19sUcd2xvf/20211230/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=68cdd42def719043e702c3d2a4d977e956d4e82a53f8d16196f324f76ec1bf61","x-amz-content-sha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855","x-amz-date":"20211230T122645Z"} [RpcError: No such object: obj_id undefined bucket SENSITIVE-329af250378a10bb key 01FR3B94Q3CJQ3RFWR32FRYJKM/meta.json] { rpc_code: 'NO_SUCH_OBJECT' }
Dec-30 12:26:45.801 [Endpoint/14]    [L0] core.sdk.object_sdk:: validate_non_nsfs_bucket:  { name: SENSITIVE-b8e720282050fed7, email: SENSITIVE-9cf0fd89409efef8, access_keys: [ { access_key: SENSITIVE-042590181bd2facf, secret_key: SENSITIVE-ae3374115b5e0b8c } ], has_login: true, has_s3_access: true, allowed_buckets: { full_permission: true }, default_resource: 'noobaa-default-backing-store', can_create_buckets: true, systems: [ { name: 'noobaa', roles: [ 'admin' ] } ], external_connections: { count: 0, connections: [] }, preferences: { ui_theme: 'LIGHT' } } undefined
Dec-30 12:26:45.802 [Endpoint/14]    [L0] core.endpoint.s3.ops.s3_put_object:: PUT OBJECT first.bucket debug/metas/01FR3B94Q3CJQ3RFWR32FRYJKM.json
Dec-30 12:26:45.805 [Endpoint/14]    [L0] core.sdk.object_io:: upload_object: start upload { bucket: 'first.bucket', key: 'debug/metas/01FR3B94Q3CJQ3RFWR32FRYJKM.json', content_type: 'application/octet-stream', size: 739, md5_b64: undefined, sha256_b64: undefined, xattr: {}, tagging: undefined, encryption: undefined, lock_settings: undefined }
Dec-30 12:26:45.819 [Endpoint/14]    [L0] core.server.node_services.node_allocator:: refresh_pool_alloc: updated pool noobaa-default-backing-store nodes [ 'noobaa-internal-agent-61cb0f294d93fa0029b46afa' ]
Dec-30 12:26:45.838 [Endpoint/14]    [L0] core.sdk.object_io:: UPLOAD: { obj_id: '61cda585026af6000ea2466f', bucket: 'first.bucket', key: 'debug/metas/01FR3B94Q3CJQ3RFWR32FRYJKM.json' } streaming to first.bucket debug/metas/01FR3B94Q3CJQ3RFWR32FRYJKM.json
Dec-30 12:26:45.840 [Endpoint/14] [ERROR] core.sdk.object_io:: _upload_stream error Error: Pipeline called on destroyed stream
    at Object.pipeline (/root/node_modules/noobaa-core/src/util/stream_utils.js:36:21)
    at ObjectIO._upload_stream_internal (/root/node_modules/noobaa-core/src/sdk/object_io.js:435:28)
    at /root/node_modules/noobaa-core/src/sdk/object_io.js:365:28
    at Semaphore.surround_count (/root/node_modules/noobaa-core/src/util/semaphore.js:90:90)
    at async ObjectIO._upload_stream (/root/node_modules/noobaa-core/src/sdk/object_io.js:363:25)
    at async ObjectIO.upload_object (/root/node_modules/noobaa-core/src/sdk/object_io.js:233:17)
    at async NamespaceNB.upload_object (/root/node_modules/noobaa-core/src/sdk/namespace_nb.js:126:23)
    at async ObjectSDK.upload_object (/root/node_modules/noobaa-core/src/sdk/object_sdk.js:504:23)
    at async Object.put_object [as handler] (/root/node_modules/noobaa-core/src/endpoint/s3/ops/s3_put_object.js:30:19)
    at async handle_request (/root/node_modules/noobaa-core/src/endpoint/s3/s3_rest.js:149:19) Error: Pipeline called on destroyed stream
    at Object.pipeline (/root/node_modules/noobaa-core/src/util/stream_utils.js:36:21)
    at ObjectIO._upload_stream_internal (/root/node_modules/noobaa-core/src/sdk/object_io.js:435:28)
    at /root/node_modules/noobaa-core/src/sdk/object_io.js:365:28
    at Semaphore.surround_count (/root/node_modules/noobaa-core/src/util/semaphore.js:90:90)
    at async ObjectIO._upload_stream (/root/node_modules/noobaa-core/src/sdk/object_io.js:363:25)
    at async ObjectIO.upload_object (/root/node_modules/noobaa-core/src/sdk/object_io.js:233:17)
    at async NamespaceNB.upload_object (/root/node_modules/noobaa-core/src/sdk/namespace_nb.js:126:23)
    at async ObjectSDK.upload_object (/root/node_modules/noobaa-core/src/sdk/object_sdk.js:504:23)
    at async Object.put_object [as handler] (/root/node_modules/noobaa-core/src/endpoint/s3/ops/s3_put_object.js:30:19)
    at async handle_request (/root/node_modules/noobaa-core/src/endpoint/s3/s3_rest.js:149:19)
Dec-30 12:26:45.840 [Endpoint/14]  [WARN] core.sdk.object_io:: upload_object: failed upload { bucket: 'first.bucket', key: 'debug/metas/01FR3B94Q3CJQ3RFWR32FRYJKM.json', md_conditions: undefined, obj_id: '61cda585026af6000ea2466f', size: 0, num_parts: 0 } Error: Pipeline called on destroyed stream
    at Object.pipeline (/root/node_modules/noobaa-core/src/util/stream_utils.js:36:21)
    at ObjectIO._upload_stream_internal (/root/node_modules/noobaa-core/src/sdk/object_io.js:435:28)
    at /root/node_modules/noobaa-core/src/sdk/object_io.js:365:28
    at Semaphore.surround_count (/root/node_modules/noobaa-core/src/util/semaphore.js:90:90)
    at async ObjectIO._upload_stream (/root/node_modules/noobaa-core/src/sdk/object_io.js:363:25)
    at async ObjectIO.upload_object (/root/node_modules/noobaa-core/src/sdk/object_io.js:233:17)
    at async NamespaceNB.upload_object (/root/node_modules/noobaa-core/src/sdk/namespace_nb.js:126:23)
    at async ObjectSDK.upload_object (/root/node_modules/noobaa-core/src/sdk/object_sdk.js:504:23)
    at async Object.put_object [as handler] (/root/node_modules/noobaa-core/src/endpoint/s3/ops/s3_put_object.js:30:19)
    at async handle_request (/root/node_modules/noobaa-core/src/endpoint/s3/s3_rest.js:149:19)
Dec-30 12:26:45.841 [Endpoint/14] [ERROR] CONSOLE:: PANIC: process uncaughtException Error: Pipeline called on destroyed stream
    at Object.pipeline (/root/node_modules/noobaa-core/src/util/stream_utils.js:36:21)
    at ObjectIO._upload_stream_internal (/root/node_modules/noobaa-core/src/sdk/object_io.js:435:28)
    at /root/node_modules/noobaa-core/src/sdk/object_io.js:365:28
    at Semaphore.surround_count (/root/node_modules/noobaa-core/src/util/semaphore.js:90:90)
    at async ObjectIO._upload_stream (/root/node_modules/noobaa-core/src/sdk/object_io.js:363:25)
    at async ObjectIO.upload_object (/root/node_modules/noobaa-core/src/sdk/object_io.js:233:17)
    at async NamespaceNB.upload_object (/root/node_modules/noobaa-core/src/sdk/namespace_nb.js:126:23)
    at async ObjectSDK.upload_object (/root/node_modules/noobaa-core/src/sdk/object_sdk.js:504:23)
    at async Object.put_object [as handler] (/root/node_modules/noobaa-core/src/endpoint/s3/ops/s3_put_object.js:30:19)
    at async handle_request (/root/node_modules/noobaa-core/src/endpoint/s3/s3_rest.js:149:19)
```
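
Illustrative note: the log above shows the error "Pipeline called on destroyed stream" first being logged by the upload path and then reaching the process as an uncaughtException, which ends the endpoint process. The sketch below only illustrates that general failure mode in Node.js. It is not noobaa-core code and not the fix shipped in the erratum. The names assert_streams_usable and upload are hypothetical, and the guard is an assumption inferred from the error message alone. It uses stream/promises, which requires a newer Node.js release than the one in the stack traces.

```javascript
'use strict';
const { PassThrough, Writable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical guard: refuse to build a pipeline from a stream that has
// already been destroyed (for example, a request the client closed early).
function assert_streams_usable(streams) {
    if (streams.some(strm => strm.destroyed)) {
        throw new Error('Pipeline called on destroyed stream');
    }
}

// Hypothetical upload helper: every failure is caught and returned to the
// caller, so nothing is left to surface as an uncaughtException.
async function upload(source) {
    const chunks = [];
    const sink = new Writable({
        write(chunk, _encoding, callback) {
            chunks.push(chunk);
            callback();
        },
    });
    try {
        assert_streams_usable([source, sink]);
        await pipeline(source, sink);
        return { ok: true, size: Buffer.concat(chunks).length };
    } catch (err) {
        return { ok: false, error: err.message };
    }
}

async function main() {
    // A healthy request body is consumed normally.
    const healthy = new PassThrough();
    healthy.end('meta');
    console.log(await upload(healthy));

    // A body destroyed before the upload starts is reported as a failed
    // upload instead of crashing the process.
    const aborted = new PassThrough();
    aborted.destroy();
    console.log(await upload(aborted));
}

main();
```

In this sketch the destroyed-stream case ends as a failed upload result for that one request. Whether the actual fix takes this shape is not stated in this report.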


Actual results:


Expected results:


Additional info:

Comment 3 cqu 2021-12-30 13:24:36 UTC
@chuyang @lcao FYI. I encountered this issue when I used ODF as the object storage to verify Observability.

Comment 4 Nitin Goyal 2021-12-30 14:35:35 UTC
Looks like a noobaa issue; moving it to the noobaa component.

Comment 6 cqu 2022-01-04 02:16:33 UTC
@rayalon I uploaded the must-gather here: https://drive.google.com/file/d/1f_HzClSGe92RpLQGlJ96e5TssW0QyM52/view?usp=sharing. Thanks.

Comment 8 cqu 2022-01-05 00:57:50 UTC
I tried uploading a single file to first.bucket, and it worked without any issue. I only see this issue when Thanos is uploading data to the bucket. Thanks.

Comment 12 cqu 2022-01-10 06:25:22 UTC
I previously tried OCS 4.8 and did not hit this issue. This is my first time trying ODF 4.9.

Comment 16 Nimrod Becker 2022-02-08 06:57:53 UTC
Creating a clone

Comment 24 cqu 2022-03-09 07:29:03 UTC
Hello, may I know whether this bug has been fixed in a later ODF 4.9 version? If yes, may I have the detailed version info? Thanks.

Comment 28 swilson 2022-03-23 19:48:53 UTC
Saw this same issue. The endpoint was entering CrashLoopBackOff every 30s. Stopped the Thanos uploads and the endpoint was able to run.

- Deleted the Thanos OBC
- Restarted the noobaa endpoint
- Waited for the endpoint to run without crashing for 1 minute
- Recreated the Thanos OBC
- Monitored the noobaa endpoint
- The endpoint has not crashed.

Comment 30 errata-xmlrpc 2022-04-13 18:51:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372