Bug 1924007

Summary: Concurrent volume operation causes deadlock in cinder-volume process when json log format is used and debug is enabled
Product: Red Hat OpenStack
Reporter: Takashi Kajinami <tkajinam>
Component: openstack-cinder
Assignee: Cinder Bugs List <cinder-bugs>
Status: ASSIGNED ---
QA Contact: Evelina Shames <eshames>
Severity: medium
Docs Contact: Andy Stillman <astillma>
Priority: medium
Version: 16.1 (Train)
CC: eharney, geguileo, hberaud, knoha, ltoscano, pcaruana
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Flags: knoha: needinfo? (geguileo)
knoha: needinfo? (geguileo)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2222869

Description Takashi Kajinami 2021-02-02 12:19:09 UTC
Description of problem:

In a deployment with the JSON log format and debug logging enabled, the cinder-volume process deadlocks when a user executes concurrent volume creation/attachment operations.

The process keeps running, but subsequent volume operations get stuck in an *ing status such as creating or attaching.
Later, the service status in "openstack volume service list" becomes down.


Version-Release number of selected component (if applicable):


How reproducible:
The issue was reproduced several times.

Steps to Reproduce:
1. Deploy the overcloud with the JSON log format and debug enabled
2. Execute multiple volume creations and attachments
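
Step 2 could be driven by something like the sketch below. This is only an illustration under assumptions: the report does not say how the operations were issued, so the volume name prefix, the server name, the volume size and the worker count are all hypothetical, and the script assumes the "openstack" CLI is installed and credentials are already loaded in the environment.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(count, server, size_gb=1, prefix="repro-vol"):
    """Build CLI argument lists for `count` volume creations and attachments.

    `server`, `size_gb` and `prefix` are illustrative placeholders,
    not values taken from the bug report.
    """
    creates = [
        ["openstack", "volume", "create", "--size", str(size_gb), f"{prefix}-{i}"]
        for i in range(count)
    ]
    attaches = [
        ["openstack", "server", "add", "volume", server, f"{prefix}-{i}"]
        for i in range(count)
    ]
    return creates, attaches


def run_concurrently(commands, runner=subprocess.run, workers=8):
    """Run every command in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))


if __name__ == "__main__":
    creates, attaches = build_commands(count=10, server="test-server")
    run_concurrently(creates)
    # Attachments only succeed once the volumes are "available"; a real
    # script would poll the volume status between the two batches.
    run_concurrently(attaches)
```

The runner is a parameter so the command lists can be inspected or dry-run without touching a real cloud.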

Actual results:
some volume operations get stuck and the status of the cinder-volume service becomes down

Expected results:
volume operations complete and cinder-volume service keeps its up status


Additional info:

Comment 2 Luigi Toscano 2021-02-02 12:32:19 UTC
Doesn't it look like a duplicate of bug 1905301 ?

Comment 3 Takashi Kajinami 2021-02-02 12:37:22 UTC
(In reply to Luigi Toscano from comment #2)
> Doesn't it look like a duplicate of bug 1905301 ?
Thank you for pointing that out.

I initially suspected the same mechanism (because I reported that bug, in fact...),
but my investigation of the GURU report indicates that the cause is a circular dependency
between the locks for logging and the MySQL connection, which I believe is different.
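
To illustrate the kind of circular lock dependency meant here, below is a minimal generic sketch. It is not cinder, oslo.log or MySQL driver code: the two locks and the thread names are hypothetical stand-ins. Each thread takes one lock and then tries to take the other in the opposite order. A timeout is used so the demo reports the failure instead of hanging forever as a real deadlock would.

```python
import threading

# Hypothetical stand-ins for the two locks involved in the report.
log_lock = threading.Lock()  # e.g. a lock guarding a log handler
db_lock = threading.Lock()   # e.g. a lock guarding a DB connection


def hold_then_try(first, second, barrier, timeout, results, name):
    """Hold `first`, then try to take `second` while still holding `first`."""
    with first:
        barrier.wait()  # both threads now hold their first lock
        got_second = second.acquire(timeout=timeout)
        if got_second:
            second.release()
        results[name] = got_second
        barrier.wait()  # keep holding `first` until the peer has finished trying


def demo(timeout=0.5):
    """Return whether each thread managed to take its second lock."""
    results = {}
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(
            target=hold_then_try,
            args=(log_lock, db_lock, barrier, timeout, results, "log_then_db"),
        ),
        threading.Thread(
            target=hold_then_try,
            args=(db_lock, log_lock, barrier, timeout, results, "db_then_log"),
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Neither thread can make progress: both entries are False.
    print(demo())
```

Without the timeout, both threads would block indefinitely while the process itself stays alive, which matches the symptom described above.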