Bug 1905301 - volume backup to ceph backend gets stuck in creating status when debug and use_json are True
Summary: volume backup to ceph backend gets stuck in creating status when debug and us...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-oslo-serialization
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: z7
: 16.1 (Train on RHEL 8.2)
Assignee: Sofia Enriquez
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-12-08 01:49 UTC by Takashi Kajinami
Modified: 2024-10-01 17:10 UTC (History)
6 users (show)

Fixed In Version: python-oslo-serialization-2.29.2-1.20201116091656.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-26 13:49:37 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1908607 0 None None None 2021-01-08 15:28:39 UTC
OpenStack gerrit 769808 0 None MERGED Fix json to_primitive when using IO OBjects 2021-02-09 15:09:47 UTC
OpenStack gerrit 774210 0 None MERGED Fix json to_primitive when using IO OBjects 2021-02-10 16:42:44 UTC
Red Hat Issue Tracker OSP-1965 0 None None None 2022-08-26 15:30:07 UTC
Red Hat Product Errata RHBA-2021:2097 0 None None None 2021-05-26 13:50:16 UTC

Description Takashi Kajinami 2020-12-08 01:49:55 UTC
Description of problem:

When cinder-volume uses ceph backend and has debug log and json format log enabled, any backup gets stuck in creating status

~~~
(overcloud) [stack@undercloud-0 ~]$ cinder backup-create b9aeba71-0e10-449a-a755-b1b541002488 --name backup2
+-----------+--------------------------------------+
| Property  | Value                                |
+-----------+--------------------------------------+
| id        | 7419ef7e-f18a-4ab4-95ad-a8632d1647a6 |
| name      | backup2                              |
| volume_id | b9aeba71-0e10-449a-a755-b1b541002488 |
+-----------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ cinder backup-show 7419ef7e-f18a-4ab4-95ad-a8632d1647a6
+-----------------------+--------------------------------------+
| Property              | Value                                |
+-----------------------+--------------------------------------+
| availability_zone     | nova                                 |
| container             | None                                 |
| created_at            | 2020-12-08T01:48:15.000000           |
| data_timestamp        | 2020-12-08T01:48:15.000000           |
| description           | None                                 |
| fail_reason           | None                                 |
| has_dependent_backups | False                                |
| id                    | 7419ef7e-f18a-4ab4-95ad-a8632d1647a6 |
| is_incremental        | False                                |
| name                  | backup2                              |
| object_count          | 0                                    |
| size                  | 1                                    |
| snapshot_id           | None                                 |
| status                | creating                             |
| updated_at            | 2020-12-08T01:48:15.000000           |
| volume_id             | b9aeba71-0e10-449a-a755-b1b541002488 |
+-----------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ 
~~~

I don't see any errors in cinder-backup.log or stdout from the cinder-backup container.
I observe the following 2 lines in the bottom of cinder-backup.log
~~~
{"message": "==> connect_volume: call \"{'self': <os_brick.initiator.connectors.rbd.RBDConnector object at 0x7fcd90061550>, 'connection_properties': {'name': 'volumes/volume-b9aeba71-0e10-449a-a755-b1b541002488', 'hosts': ['172.17.3.61', '172.17.3.70', '172.17.3.125'], 'ports': ['6789', '6789', '6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'openstack', 'secret_type': 'ceph', 'secret_uuid': '***', 'volume_id': 'b9aeba71-0e10-449a-a755-b1b541002488', 'discard': True, 'keyring': None, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False}}\"", "asctime": "2020-12-08 01:48:17", "name": "os_brick.initiator.connectors.rbd", "msg": "==> %(func)s: call %(all_args)r", "args": {"func": "connect_volume", "all_args": "{'self': <os_brick.initiator.connectors.rbd.RBDConnector object at 0x7fcd90061550>, 'connection_properties': {'name': 'volumes/volume-b9aeba71-0e10-449a-a755-b1b541002488', 'hosts': ['172.17.3.61', '172.17.3.70', '172.17.3.125'], 'ports': ['6789', '6789', '6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'openstack', 'secret_type': 'ceph', 'secret_uuid': '***', 'volume_id': 'b9aeba71-0e10-449a-a755-b1b541002488', 'discard': True, 'keyring': None, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False}}"}, "levelname": "DEBUG", "levelno": 10, "pathname": "/usr/lib/python3.6/site-packages/os_brick/utils.py", "filename": "utils.py", "module": "utils", "lineno": 146, "funcname": "trace_logging_wrapper", "created": 1607392097.3840482, "msecs": 384.0482234954834, "relative_created": 63425.718784332275, "thread": 140520887643328, "thread_name": "MainThread", "process_name": "MainProcess", "process": 43, "traceback": null, "hostname": "controller-1", "error_summary": "", "context": {"user_name": null, "project_name": "admin", "domain_name": null, "user_domain_name": null, "project_domain_name": null, "user": "b2cfabfa6bc74a9ba0ed91687d280cb3", "tenant": "15b7011587484d6e802741cdbc11dba2", "system_scope": null, "project": "15b7011587484d6e802741cdbc11dba2", "domain": null, "user_domain": "default", "project_domain": "default", "is_admin": true, "read_only": false, "show_deleted": false, "auth_token": "***", "request_id": "req-b525298b-20a6-46c4-863a-94889ccd6034", "global_request_id": "req-681ddf06-8068-4193-8dff-e16b65ed8cd3", "resource_uuid": null, "roles": ["reader", "admin", "member"], "user_identity": "b2cfabfa6bc74a9ba0ed91687d280cb3 15b7011587484d6e802741cdbc11dba2 - default default", "is_admin_project": true, "user_id": "b2cfabfa6bc74a9ba0ed91687d280cb3", "project_id": "15b7011587484d6e802741cdbc11dba2", "domain_id": null, "read_deleted": "no", "remote_address": "172.17.1.20", "timestamp": "2020-12-08T01:48:15.396657+00:00", "quota_class": null, "service_catalog": [{"type": "identity", "name": "keystone", "endpoints": [{"region": "regionOne", "publicURL": "https://10.0.0.101:13000", "adminURL": "http://192.168.24.30:35357", "internalURL": "http://172.17.1.41:5000"}]}, {"type": "image", "name": "glance", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:9292", "publicURL": "https://10.0.0.101:13292", "internalURL": "http://172.17.1.41:9292"}]}, {"type": "compute", "name": "nova", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:8774/v2.1", "internalURL": "http://172.17.1.41:8774/v2.1", "publicURL": "https://10.0.0.101:13774/v2.1"}]}, {"type": "object-store", "name": "swift", "endpoints": [{"region": "regionOne", "internalURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "publicURL": "https://10.0.0.101:13808/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "adminURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2"}]}]}, "extra": {"project": "15b7011587484d6e802741cdbc11dba2", "version": "unknown"}}
{"message": "opening connection to ceph cluster (timeout=-1).", "asctime": "2020-12-08 01:48:17", "name": "os_brick.initiator.linuxrbd", "msg": "opening connection to ceph cluster (timeout=%s).", "args": [-1], "levelname": "DEBUG", "levelno": 10, "pathname": "/usr/lib/python3.6/site-packages/os_brick/initiator/linuxrbd.py", "filename": "linuxrbd.py", "module": "linuxrbd", "lineno": 70, "funcname": "connect", "created": 1607392097.385284, "msecs": 385.2839469909668, "relative_created": 63426.95450782776, "thread": 140520887643328, "thread_name": "MainThread", "process_name": "MainProcess", "process": 43, "traceback": null, "hostname": "controller-1", "error_summary": "", "context": {"user_name": null, "project_name": "admin", "domain_name": null, "user_domain_name": null, "project_domain_name": null, "user": "b2cfabfa6bc74a9ba0ed91687d280cb3", "tenant": "15b7011587484d6e802741cdbc11dba2", "system_scope": null, "project": "15b7011587484d6e802741cdbc11dba2", "domain": null, "user_domain": "default", "project_domain": "default", "is_admin": true, "read_only": false, "show_deleted": false, "auth_token": "***", "request_id": "req-b525298b-20a6-46c4-863a-94889ccd6034", "global_request_id": "req-681ddf06-8068-4193-8dff-e16b65ed8cd3", "resource_uuid": null, "roles": ["reader", "admin", "member"], "user_identity": "b2cfabfa6bc74a9ba0ed91687d280cb3 15b7011587484d6e802741cdbc11dba2 - default default", "is_admin_project": true, "user_id": "b2cfabfa6bc74a9ba0ed91687d280cb3", "project_id": "15b7011587484d6e802741cdbc11dba2", "domain_id": null, "read_deleted": "no", "remote_address": "172.17.1.20", "timestamp": "2020-12-08T01:48:15.396657+00:00", "quota_class": null, "service_catalog": [{"type": "identity", "name": "keystone", "endpoints": [{"region": "regionOne", "publicURL": "https://10.0.0.101:13000", "adminURL": "http://192.168.24.30:35357", "internalURL": "http://172.17.1.41:5000"}]}, {"type": "image", "name": "glance", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:9292", "publicURL": "https://10.0.0.101:13292", "internalURL": "http://172.17.1.41:9292"}]}, {"type": "compute", "name": "nova", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:8774/v2.1", "internalURL": "http://172.17.1.41:8774/v2.1", "publicURL": "https://10.0.0.101:13774/v2.1"}]}, {"type": "object-store", "name": "swift", "endpoints": [{"region": "regionOne", "internalURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "publicURL": "https://10.0.0.101:13808/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "adminURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2"}]}]}, "extra": {"project": "15b7011587484d6e802741cdbc11dba2", "version": "unknown"}}
~~~


Version-Release number of selected component (if applicable):
I reproduced the issue with RHOSP16.1.2 .

How reproducible:
Always

Steps to Reproduce:
1. Deploy overcloud with ceph-backup enabled

2. Enable debug and json format log of cinder-backup

/var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf
~~~
[DEFAULT]
...
debug=True
...
use_json = true
~~~

3. Create volume backup
$ cinder backup-create <voume id> --name <backup name>

Actual results:
The created backup gets stuck in creating status

Expected results:
The backup is created and become active

Additional info:

Comment 4 Sofia Enriquez 2021-01-08 15:28:42 UTC
As Eric wrote in the upstream bug report: 

The oslo.log JSONFormatter gets stuck when passing an RBDVolumeIOWrapper from os-brick.
This happens via os-brick's utils.trace() method which passes a connector containing
{'path': RBDVolumeIOWrapper}. The oslo.log JSONFormatter format() method calls
oslo_serialization's jsonutils.to_primitive and passes in this RBDVolumeIOWrapper object.
 
Therefore the to_primitive method eventually calls RBDVolumeIOWrapper.read() which it's not intended.

Comment 6 Sofia Enriquez 2021-02-10 16:44:01 UTC
Fix in upstream stable/train

Comment 21 errata-xmlrpc 2021-05-26 13:49:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097


Note You need to log in before you can comment on or make changes to this bug.