Description of problem: When cinder-volume uses ceph backend and has debug log and json format log enabled, any backup gets stuck in creating status ~~~ (overcloud) [stack@undercloud-0 ~]$ cinder backup-create b9aeba71-0e10-449a-a755-b1b541002488 --name backup2 +-----------+--------------------------------------+ | Property | Value | +-----------+--------------------------------------+ | id | 7419ef7e-f18a-4ab4-95ad-a8632d1647a6 | | name | backup2 | | volume_id | b9aeba71-0e10-449a-a755-b1b541002488 | +-----------+--------------------------------------+ (overcloud) [stack@undercloud-0 ~]$ cinder backup-show 7419ef7e-f18a-4ab4-95ad-a8632d1647a6 +-----------------------+--------------------------------------+ | Property | Value | +-----------------------+--------------------------------------+ | availability_zone | nova | | container | None | | created_at | 2020-12-08T01:48:15.000000 | | data_timestamp | 2020-12-08T01:48:15.000000 | | description | None | | fail_reason | None | | has_dependent_backups | False | | id | 7419ef7e-f18a-4ab4-95ad-a8632d1647a6 | | is_incremental | False | | name | backup2 | | object_count | 0 | | size | 1 | | snapshot_id | None | | status | creating | | updated_at | 2020-12-08T01:48:15.000000 | | volume_id | b9aeba71-0e10-449a-a755-b1b541002488 | +-----------------------+--------------------------------------+ (overcloud) [stack@undercloud-0 ~]$ ~~~ I don't see any errors in cinder-backup.log or stdout from the cinder-backup container. I observe the following 2 lines in the bottom of cinder-backup.log ~~~ {"message": "==> connect_volume: call \"{'self': <os_brick.initiator.connectors.rbd.RBDConnector object at 0x7fcd90061550>, 'connection_properties': {'name': 'volumes/volume-b9aeba71-0e10-449a-a755-b1b541002488', 'hosts': ['172.17.3.61', '172.17.3.70', '172.17.3.125'], 'ports': ['6789', '6789', '6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'openstack', 'secret_type': 'ceph', 'secret_uuid': '***', 'volume_id': 'b9aeba71-0e10-449a-a755-b1b541002488', 'discard': True, 'keyring': None, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False}}\"", "asctime": "2020-12-08 01:48:17", "name": "os_brick.initiator.connectors.rbd", "msg": "==> %(func)s: call %(all_args)r", "args": {"func": "connect_volume", "all_args": "{'self': <os_brick.initiator.connectors.rbd.RBDConnector object at 0x7fcd90061550>, 'connection_properties': {'name': 'volumes/volume-b9aeba71-0e10-449a-a755-b1b541002488', 'hosts': ['172.17.3.61', '172.17.3.70', '172.17.3.125'], 'ports': ['6789', '6789', '6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'openstack', 'secret_type': 'ceph', 'secret_uuid': '***', 'volume_id': 'b9aeba71-0e10-449a-a755-b1b541002488', 'discard': True, 'keyring': None, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False}}"}, "levelname": "DEBUG", "levelno": 10, "pathname": "/usr/lib/python3.6/site-packages/os_brick/utils.py", "filename": "utils.py", "module": "utils", "lineno": 146, "funcname": "trace_logging_wrapper", "created": 1607392097.3840482, "msecs": 384.0482234954834, "relative_created": 63425.718784332275, "thread": 140520887643328, "thread_name": "MainThread", "process_name": "MainProcess", "process": 43, "traceback": null, "hostname": "controller-1", "error_summary": "", "context": {"user_name": null, "project_name": "admin", "domain_name": null, "user_domain_name": null, "project_domain_name": null, "user": "b2cfabfa6bc74a9ba0ed91687d280cb3", "tenant": "15b7011587484d6e802741cdbc11dba2", "system_scope": null, "project": "15b7011587484d6e802741cdbc11dba2", "domain": null, "user_domain": "default", "project_domain": "default", "is_admin": true, "read_only": false, "show_deleted": false, "auth_token": "***", "request_id": "req-b525298b-20a6-46c4-863a-94889ccd6034", "global_request_id": "req-681ddf06-8068-4193-8dff-e16b65ed8cd3", "resource_uuid": null, "roles": ["reader", "admin", "member"], "user_identity": "b2cfabfa6bc74a9ba0ed91687d280cb3 15b7011587484d6e802741cdbc11dba2 - default default", "is_admin_project": true, "user_id": "b2cfabfa6bc74a9ba0ed91687d280cb3", "project_id": "15b7011587484d6e802741cdbc11dba2", "domain_id": null, "read_deleted": "no", "remote_address": "172.17.1.20", "timestamp": "2020-12-08T01:48:15.396657+00:00", "quota_class": null, "service_catalog": [{"type": "identity", "name": "keystone", "endpoints": [{"region": "regionOne", "publicURL": "https://10.0.0.101:13000", "adminURL": "http://192.168.24.30:35357", "internalURL": "http://172.17.1.41:5000"}]}, {"type": "image", "name": "glance", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:9292", "publicURL": "https://10.0.0.101:13292", "internalURL": "http://172.17.1.41:9292"}]}, {"type": "compute", "name": "nova", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:8774/v2.1", "internalURL": "http://172.17.1.41:8774/v2.1", "publicURL": "https://10.0.0.101:13774/v2.1"}]}, {"type": "object-store", "name": "swift", "endpoints": [{"region": "regionOne", "internalURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "publicURL": "https://10.0.0.101:13808/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "adminURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2"}]}]}, "extra": {"project": "15b7011587484d6e802741cdbc11dba2", "version": "unknown"}} {"message": "opening connection to ceph cluster (timeout=-1).", "asctime": "2020-12-08 01:48:17", "name": "os_brick.initiator.linuxrbd", "msg": "opening connection to ceph cluster (timeout=%s).", "args": [-1], "levelname": "DEBUG", "levelno": 10, "pathname": "/usr/lib/python3.6/site-packages/os_brick/initiator/linuxrbd.py", "filename": "linuxrbd.py", "module": "linuxrbd", "lineno": 70, "funcname": "connect", "created": 1607392097.385284, "msecs": 385.2839469909668, "relative_created": 63426.95450782776, "thread": 140520887643328, "thread_name": "MainThread", "process_name": "MainProcess", "process": 43, "traceback": null, "hostname": "controller-1", "error_summary": "", "context": {"user_name": null, "project_name": "admin", "domain_name": null, "user_domain_name": null, "project_domain_name": null, "user": "b2cfabfa6bc74a9ba0ed91687d280cb3", "tenant": "15b7011587484d6e802741cdbc11dba2", "system_scope": null, "project": "15b7011587484d6e802741cdbc11dba2", "domain": null, "user_domain": "default", "project_domain": "default", "is_admin": true, "read_only": false, "show_deleted": false, "auth_token": "***", "request_id": "req-b525298b-20a6-46c4-863a-94889ccd6034", "global_request_id": "req-681ddf06-8068-4193-8dff-e16b65ed8cd3", "resource_uuid": null, "roles": ["reader", "admin", "member"], "user_identity": "b2cfabfa6bc74a9ba0ed91687d280cb3 15b7011587484d6e802741cdbc11dba2 - default default", "is_admin_project": true, "user_id": "b2cfabfa6bc74a9ba0ed91687d280cb3", "project_id": "15b7011587484d6e802741cdbc11dba2", "domain_id": null, "read_deleted": "no", "remote_address": "172.17.1.20", "timestamp": "2020-12-08T01:48:15.396657+00:00", "quota_class": null, "service_catalog": [{"type": "identity", "name": "keystone", "endpoints": [{"region": "regionOne", "publicURL": "https://10.0.0.101:13000", "adminURL": "http://192.168.24.30:35357", "internalURL": "http://172.17.1.41:5000"}]}, {"type": "image", "name": "glance", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:9292", "publicURL": "https://10.0.0.101:13292", "internalURL": "http://172.17.1.41:9292"}]}, {"type": "compute", "name": "nova", "endpoints": [{"region": "regionOne", "adminURL": "http://172.17.1.41:8774/v2.1", "internalURL": "http://172.17.1.41:8774/v2.1", "publicURL": "https://10.0.0.101:13774/v2.1"}]}, {"type": "object-store", "name": "swift", "endpoints": [{"region": "regionOne", "internalURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "publicURL": "https://10.0.0.101:13808/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2", "adminURL": "http://172.17.3.41:8080/swift/v1/AUTH_15b7011587484d6e802741cdbc11dba2"}]}]}, "extra": {"project": "15b7011587484d6e802741cdbc11dba2", "version": "unknown"}} ~~~ Version-Release number of selected component (if applicable): I reproduced the issue with RHOSP16.1.2 . How reproducible: Always Steps to Reproduce: 1. Deploy overcloud with ceph-backup enabled 2. Enable debug and json format log of cinder-backup /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf ~~~ [DEFAULT] ... debug=True ... use_json = true ~~~ 3. Create volume backup $ cinder backup-create <voume id> --name <backup name> Actual results: The created backup gets stuck in creating status Expected results: The backup is created and become active Additional info:
As Eric wrote in the upstream bug report: The oslo.log JSONFormatter gets stuck when passing an RBDVolumeIOWrapper from os-brick. This happens via os-brick's utils.trace() method which passes a connector containing {'path': RBDVolumeIOWrapper}. The oslo.log JSONFormatter format() method calls oslo_serialization's jsonutils.to_primitive and passes in this RBDVolumeIOWrapper object. Therefore the to_primitive method eventually calls RBDVolumeIOWrapper.read() which it's not intended.
Fix in upstream stable/train
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2097