Description of problem:
Every node, even the simple ones like a nova compute, generates a DescribeStackResource request every 60 seconds in a director setup. To serve this request heat does a lot of work, including sending additional requests to the nova API.

Version-Release number of selected component (if applicable):
openstack-heat-common-2015.1.0-4.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-4.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-2015.1.0-4.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-4.el7ost.noarch
openstack-heat-engine-2015.1.0-4.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
You just need a minimal system for generating some load.
1. Log in to one node: ssh heat-admin@$MY_CPU_NODE_1
2. strace the os-collect-config process and capture one request (alternatively, generate a signed URL on your own)
3. Run ab with the captured request

Actual results:
The pattern is similar on smaller and bigger systems. Here 4-core VMs at 2.1 GHz were used.

top output on the undercloud controller:

Tasks: 258 total,   8 running, 247 sleeping,   2 stopped,   1 zombie
%Cpu(s): 94.6 us,  4.1 sy,  0.0 ni,  0.5 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
KiB Mem :  5946792 total,   134224 free,  5247288 used,   565280 buff/cache
KiB Swap:  4194300 total,  3567980 free,   626320 used.   273656 avail Mem

  PID USER      PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
29385 nova      20  0  564732 149304  2556 R 46.2  2.5  9:20.93 nova-api
29388 nova      20  0  566104 150588  2556 S 46.2  2.5  9:33.06 nova-api
29386 nova      20  0  566568 150972  2556 R 45.8  2.5  9:37.20 nova-api
29387 nova      20  0  567916 152200  2572 R 45.8  2.6  9:19.96 nova-api
29982 heat      20  0  608124 211484  2724 S 30.9  3.6 12:26.29 heat-engine
29984 heat      20  0  627736 230988  2720 S 29.2  3.9  9:46.06 heat-engine
29039 keystone  20  0  471216  76284  2528 S 21.3  1.3  3:07.72 keystone-all
29981 heat      20  0  625944 229064  2724 S 15.9  3.9  6:08.98 heat-engine
26119 mysql     20  0 2626928 237476  4632 S 13.3  4.0  6:18.60 mysqld
29041 keystone  20  0  472396  76360  2708 S 12.6  1.3  4:41.03 keystone-all
29983 heat      20  0  546944 150304  2724 S 12.0  2.5  6:19.43 heat-engine
29038 keystone  20  0  470896  75204  2728 S 11.0  1.3  2:17.08 keystone-all
29043 keystone  20  0  469236  74880  2720 S  9.0  1.3  1:36.43 keystone-all
29802 heat      20  0  388816  76108  3776 S  8.3  1.3  2:05.86 heat-api-cfn
27420 rabbitmq  20  0 2301472  73956  2036 S  6.6  1.2  6:54.92 beam.smp
29480 nova      20  0  493544  88096  2504 S  6.3  1.5  2:31.38 nova-conductor
29477 nova      20  0  493504  88024  2460 S  5.3  1.5  2:36.85 nova-conductor
 9336 nova      20  0  403016  83724  3456 S  5.0  1.4  2:35.10 nova-compute
29405 nova      20  0  552788 136960  2572 S  5.0  2.3  2:49.21 nova-api
29478 nova      20  0  496668  90988  2280 S  4.0  1.5  2:37.01 nova-conductor
29479 nova      20  0  493828  88264  2496 S  4.0  1.5  2:28.52 nova-conductor
29040 keystone  20  0  471152  75616  2712 S  2.7  1.3  3:37.72 keystone-all
29042 keystone  20  0  469100  74632  2720 S  1.7  1.3  1:38.19 keystone-all
29044 keystone  20  0  478352  83716  2720 S  1.3  1.4  1:35.92 keystone-all
29265 glance    20  0  362936  42948  3292 S  1.0  0.7  2:45.20 glance-api
29868 heat      20  0  459120  63012  3476 S  1.0  1.1  2:56.74 heat-engine
30907 ironic    20  0  501832  63428  3980 S  1.0  1.1 38:33.43 ironic-conducto
27384 ironic    20  0  475748  64244  3768 S  0.7  1.1  3:01.73 ironic-api
29031 keystone  20  0  353628  23144  3260 S  0.7  0.4  2:47.65 keystone-all
    1 root      20  0  205124   5376  2548 S  0.3  0.1  0:28.81 systemd
   13 root      20  0       0      0     0 R  0.3  0.0  0:42.16 rcu_sched

The best ab response on the overcloud-compute node:

[heat-admin@overcloud-compute-0 ~]$ ab -c 6 -n 600 'http://192.0.2.1:8000/v1/?SignatureVersion=2&AWSAccessKeyId=9e9a95dc311d4e4fa92bb956003c39c8&StackName=overcloud-Compute-auw3iajgylkc-0-yifpr5vfowdn&SignatureMethod=HmacSHA256&Signature=ssZn9aCLSEJWVCUMV98bwaa72s2QEJEoS%2Buxq7nVBZ0%3D&Action=DescribeStackResource&LogicalResourceId=NovaCompute'
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.0.2.1 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Finished 600 requests

Server Software:
Server Hostname:        192.0.2.1
Server Port:            8000

Document Path:          /v1/?SignatureVersion=2&AWSAccessKeyId=9e9a95dc311d4e4fa92bb956003c39c8&StackName=overcloud-Compute-auw3iajgylkc-0-yifpr5vfowdn&SignatureMethod=HmacSHA256&Signature=ssZn9aCLSEJWVCUMV98bwaa72s2QEJEoS%2Buxq7nVBZ0%3D&Action=DescribeStackResource&LogicalResourceId=NovaCompute
Document Length:        29128 bytes

Concurrency Level:      6
Time taken for tests:   200.131 seconds
Complete requests:      600
Failed requests:        0
Write errors:           0
Total transferred:      17563200 bytes
HTML transferred:       17476800 bytes
Requests per second:    3.00 [#/sec] (mean)
Time per request:       2001.315 [ms] (mean)
Time per request:       333.552 [ms] (mean, across all concurrent requests)
Transfer rate:          85.70 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:  1211 1998 350.4   1964    3431
Waiting:     1210 1995 350.1   1963    3430
Total:       1211 1998 350.4   1964    3432

Percentage of the requests served within a certain time (ms)
  50%   1964
  66%   2128
  75%   2238
  80%   2284
  90%   2460
  95%   2618
  98%   2806
  99%   2914
 100%   3432 (longest request)

Expected results:
'Requests per second' * 60 sec / undercloud_cores

Just from the above load, the minimum required undercloud CPU is 1 core per 45 managed nodes, when you do NOTHING. A solution is needed to decrease the undercloud CPU cost even when there is no configuration change in progress. Doing nothing should not cost more than 1 CPU per 500 managed nodes. Ideally it should have zero cost.
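For step 2, instead of capturing a request with strace you can construct the signed URL yourself. Below is a minimal sketch of AWS Signature Version 2 signing (HmacSHA256), the scheme used by the heat-api-cfn URL above. The access key, secret, and stack/resource names here are placeholders (the real ones come from os-collect-config's [cfn] credentials), and whether the Host component includes the port is an assumption that must match what the server validates against:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def cfn_signed_url(host, port, access_key, secret_key, stack, resource):
    """Build a DescribeStackResource URL signed with AWS Signature
    Version 2 (HmacSHA256), the scheme os-collect-config uses when
    polling heat-api-cfn."""
    path = "/v1/"
    params = {
        "SignatureVersion": "2",
        "SignatureMethod": "HmacSHA256",
        "AWSAccessKeyId": access_key,
        "Action": "DescribeStackResource",
        "StackName": stack,
        "LogicalResourceId": resource,
    }
    # Canonical query string: keys sorted, RFC 3986 percent-encoding
    canonical = "&".join(
        "%s=%s" % (quote(k, safe="-_.~"), quote(v, safe="-_.~"))
        for k, v in sorted(params.items()))
    # SigV2 string-to-sign: verb, lowercased host header, path, query
    string_to_sign = "\n".join(
        ["GET", ("%s:%d" % (host, port)).lower(), path, canonical])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return "http://%s:%d%s?%s&Signature=%s" % (
        host, port, path, canonical, quote(signature, safe="-_.~"))

# Placeholder credentials and names, purely illustrative
url = cfn_signed_url("192.0.2.1", 8000, "AKIDEXAMPLE", "not-a-real-secret",
                     "overcloud-Compute-example-0", "NovaCompute")
print(url)
```

The resulting URL can then be fed straight to ab as in the transcript above.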
The long-term plan for this is to use Zaqar for sending messages to the servers rather than having them poll Nova.
os-collect-config is configured with 2 collectors:
- ec2, which hits the nova metadata API and is causing the load reported in this bug
- cfn, which hits the heat cfn metadata API and is how software deployment changes get propagated from heat to the servers

The ec2 collector doesn't need to run every 60 seconds since its contents are essentially immutable, so I propose that os-collect-config be enhanced to allow the ec2 poll frequency to be reduced, maybe with a per-collector polling-interval config option.

Also, I may be misremembering, but I thought there was a memcached-backed caching solution as an option in either nova or neutron for exactly this situation (nova metadata calls causing load on nova-api).
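As a sketch only, the proposed (hypothetical, not yet existing) per-collector option in /etc/os-collect-config.conf could look something like this; today only the global polling_interval exists:

```ini
[DEFAULT]
# existing global interval, applies to all collectors today
polling_interval = 60

[ec2]
# hypothetical per-collector override: the EC2 metadata is
# essentially immutable, so poll it rarely
polling_interval = 3600
```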
Regarding my last comment on memcached, are there values set for memcached_servers in nova.conf? http://www.gossamer-threads.com/lists/openstack/operators/44767
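If not set, a sketch of what enabling it would look like in /etc/nova/nova.conf (assuming the Kilo-era memcached_servers option in [DEFAULT] and a memcached instance listening on the undercloud address; adjust to the actual deployment):

```ini
[DEFAULT]
# share the metadata cache across all nova-api workers instead of
# each worker keeping its own short-lived in-process cache
memcached_servers = 192.0.2.1:11211
```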
The URL in the ApacheBench test is the one from the [cfn] section of os-collect-config; the test just used that one. You are right that os-collect-config also rereads the regular metadata every minute (many small GETs); its load is not included in this bug.
memcached_servers is not configured in nova.conf. Without a memcached server, nova maintains a per-process in-memory cache which expires in 15 sec by default. Because of this behaviour, simple ab reads on the metadata API can be a little bit misleading:
- It does approximately 24 metadata API requests per minute.
- It does all 24 on the same connection ('Connection: keep-alive').
- The first metadata request causes the nova-api worker to cache the full metadata.
Against a simple ab test, not using memcached is the faster way: 1600 req/sec is reachable, while with memcached it is just 51.84 req/sec. That is too low a number, so something is not good around the memcached usage as well.
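A minimal sketch (not nova's actual code) of why a single keep-alive ab client sees such good numbers without memcached: each nova-api worker caches the built metadata in-process for 15 s, so at ~24 requests/min against the same warm worker only a handful of requests actually rebuild anything. The class and names below are illustrative:

```python
import time

class PerProcessCache:
    """Toy model of a per-worker metadata cache with a 15 s TTL,
    as nova-api behaves without memcached_servers configured."""

    def __init__(self, ttl=15, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}   # key -> (expiry, value)
        self.misses = 0    # times the expensive build ran

    def get(self, key, build):
        now = self.clock()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]          # fresh: served from worker memory
        self.misses += 1
        value = build()            # expensive full metadata build
        self._store[key] = (now + self.ttl, value)
        return value

# Simulate one request every 2.5 s (~24/min) hitting one worker
fake_now = [0.0]
cache = PerProcessCache(ttl=15, clock=lambda: fake_now[0])
for i in range(24):
    fake_now[0] = i * 2.5
    cache.get("instance-1", lambda: {"meta": "big blob"})
print(cache.misses)  # -> 4 rebuilds over the ~60 s window
```

With concurrent clients spread over many instances (the real workload), each worker's private cache is far less effective than a shared one.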
OK, so the load is coming from heat-engine. Zane mentioned zaqar for propagating deployment data, but there is a nearer-term solution, which is switching to a swift TempURL instead of heat cfn polling. All the code has landed upstream, and this can be tried in director once the following has happened:
- release and package os-collect-config 0.1.36
- build overcloud images containing os-collect-config 0.1.36
- on the undercloud, set default_software_config_transport=POLL_TEMP_URL in /etc/heat/heat.conf
- restart openstack-heat-engine
- deploy the overcloud
Doing the above will completely eliminate the load caused by nodes polling for metadata.
To simulate high load you can also ssh into an overcloud node and edit /etc/os-collect-config.conf to add:

[DEFAULT]
polling_interval = 1

then restart os-collect-config.

Using software_config_transport=POLL_TEMP_URL I get this in undercloud top:
load average: 0.39, 0.31, 0.37

I'll try now with cfn polling and post the difference in load.
With the current cfn polling, making one node poll every 1 second:
load average: 1.07, 0.87, 0.89
I think this is actually resolved by bug 1304856 - heat-engine no longer calls nova when the server fetches its metadata. Should we close this as a duplicate, or is there more work to be delivered here in OSP 8?
The switch to swift in Liberty seems to have the votes to land, so this bug could be used to track getting it downstream: https://review.openstack.org/#/c/257657/
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Upstream stable/mitaka uses swift temp URLs by default, so I assume 9.0 will inherit that.
This is ready to test in any version of 10.0. Deployment polling is now done against a swift temp URL, and fixes to heat from zaneb stopped the unnecessary nova API calls during heat operations.
Using the current build, openstack-tripleo-heat-templates-5.0.0-0.20160622164917.203e925.el7ost, as the fixed-in version.
From top: load average: 0.13, 0.08, 0.07, and I am happy with that. nova and heat also take ~2% CPU on a 3-core VM.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html