Bug 1336478
Summary: REST API Delete instances fails under load
Product: Red Hat OpenStack
Component: openstack-nova
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Reporter: Yuri Obshansky <yobshans>
Assignee: Eoghan Glynn <eglynn>
QA Contact: Prasanth Anbalagan <panbalag>
CC: berrange, dasmith, eglynn, kchamart, mbooth, sbauza, sferdjao, sgordon, srevivo, vromanso, yobshans
Keywords: ZStream
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2018-01-08 13:37:59 UTC
Description
Yuri Obshansky
2016-05-16 15:09:42 UTC
Created attachment 1157998 [details]
Horizon screenshot
Created attachment 1157999 [details]
nova-api log
Another error appears in nova-scheduler.log:

2016-05-15 07:03:24.397 14465 ERROR oslo_db.api [req-15370471-51ab-4641-a424-d82a0e4b7bb6 - - - - -] DB error.
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api Traceback (most recent call last):
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 136, in wrapper
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api     return f(*args, **kwargs)
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 3782, in reservation_expire
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api     reservation_query.soft_delete(synchronize_session=False)
2016-05-15 07:03:24.397 14465 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/orm.py", line 32, in soft_delete

Created attachment 1158000 [details]
nova-scheduler log
Look at date 2016-05-15 only in the log files.

At first glance this looks like a problem in keystone. It may simply be that keystone isn't keeping up with this request rate, in which case the immediate recommendation will be to go slower. Could you please provide keystone logs covering the same period, along with the keystone configuration?

After discussion within the team, could you please also provide some additional info:

* Number of API hosts
* Number of compute hosts
* Number of keystone hosts
* sosreport from a keystone host

I suspect a scaling problem, so I need to get an idea of the scale of the deployment. If you can think of any other information relevant to deployment scale, please include that too.

That is my fault; I should have provided the environment configuration in the bug. It is an HA, baremetal deployment: 3 controller nodes and 6 compute nodes, each with 24 CPUs and 64 GB RAM. The volume is 5000 instances:

- image: cirros (12.6 MB)
- flavor: m1.nano (1 VCPU, 64 MB RAM, 1 GB disk)

I was not able to remove the instances in any way to clean up the environment. Only rebooting all nodes (controllers and computes) released the stuck instances. As a result, I cannot provide the sosreport right now. The keystone conf and log files are attached.

Created attachment 1160273 [details]
Keystone conf file
Created attachment 1160274 [details]
Keystone log file
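The "go slower" recommendation above amounts to client-side throttling of the delete calls. The sketch below is hypothetical (the helper name, endpoint URL, and rate are assumptions, not part of this bug's reproduction scripts); it shows one way to pace `DELETE /servers/{id}` requests against the Nova API so that keystone-backed token validation is not flooded:

```python
import time


def delete_instances(session, nova_url, server_ids, max_per_second=2.0):
    """Delete servers one at a time, sleeping between requests so the
    sustained rate stays at or below max_per_second.

    `session` is any object with a requests-style .delete(url) method
    (e.g. a requests.Session carrying an X-Auth-Token header).
    Returns a list of (server_id, status_code) pairs that failed.
    """
    interval = 1.0 / max_per_second
    failed = []
    for sid in server_ids:
        resp = session.delete("%s/servers/%s" % (nova_url, sid))
        # Nova returns 204 for an accepted delete; treat 404 as
        # "already gone" rather than a failure.
        if resp.status_code not in (204, 404):
            failed.append((sid, resp.status_code))
        time.sleep(interval)
    return failed
```

With 5000 instances and a budget of 2 requests per second, a full cleanup would take roughly 42 minutes, which is the trade-off the triager is suggesting: slower, but within what keystone can validate.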
Clearing out old bugs. From looking at the keystone logs, it seems that the original supposition was correct. I think a RequestTimeout is an appropriate return code under these circumstances.