| Summary: | Template imported to RHEV-M reported COMPLETE to Conductor but the template was not successfully imported. | ||
|---|---|---|---|
| Product: | [Retired] CloudForms Cloud Engine | Reporter: | Ronelle Landy <rlandy> |
| Component: | aeolus-conductor | Assignee: | Angus Thomas <athomas> |
| Status: | CLOSED WORKSFORME | QA Contact: | wes hayutin <whayutin> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 1.0.0 | CC: | akarol, calfonso, dajohnso, deltacloud-maint, hbrock, jlabocki, rananda, ssachdev, whayutin |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-02-28 13:52:35 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Error from training class:
* Server message
  Deltacloud::ExceptionHandler::BackendError - Unhandled exception or status code (AuthFailure)
* Original request URI
  /api/instances
* Error details
  No details
* Backtrace
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/mock/mock_driver.rb:449:in `check_credentials'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/mock/mock_driver.rb:447:in `check_credentials'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/mock/mock_driver.rb:161:in `instances'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:80:in `send'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:80:in `filter_all'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:79:in `filter_all'
/usr/share/deltacloud-core/bin/../lib/deltacloud/server.rb:461
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `instance_eval'
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `GET /api/instances'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_syslog.rb:48:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_date.rb:31:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_accept.rb:149:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/head.rb:9:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_driver_select.rb:45:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_matrix_params.rb:106:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_runtime.rb:36:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_etag.rb:41:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/head.rb:9:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:84:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:82:in `catch'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:82:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `call'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `initialize'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `new'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1049:in `defer'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:54:in `process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:42:in `receive_data'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/backends/base.rb:61:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/server.rb:159:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/controllers/controller.rb:86:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:185:in `send'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:185:in `run_command'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:151:in `run!'
/usr/share/deltacloud-core/bin/deltacloudd:235
/usr/bin/deltacloudd:5:in `load'
/usr/bin/deltacloudd:5
Hi Ronelle,

your reproducer does not actually reproduce the bug people saw in Conductor, because it uses a DC running on port 3002, which is reserved for Conductor's Mock Deltacloud instance. So you did not actually start 8 RHEV-M API instances, only 7. The one on port 3002 was running MOCK instead of RHEV-M. This led to the AuthFailure (you tried to use RHEV-M credentials with the Mock driver).

==============================================================

I just created my own reproducer (attached) that runs 8 threads, each starting 10 VMs in RHEV-M. I'm using *the same approach as Conductor* (dynamic driver switching). I started 8 instances of DC (8x deltacloudd -i mock) running the *mock* driver. Then I used HTTP headers (as Conductor does) to change the driver to RHEV-M and set the appropriate API_PROVIDER.

The result is that all 8 of my threads were able to create 10 virtual machines in RHEV-M (using the same image), so I think the issue here is not in DC.

    firefly ~/code/core/tests $ ruby perf/load_test_instance.rb
    Run options:

    # Running tests:

    i-1-0 i-1-1 i-1-2 i-1-3 i-1-4 i-1-5 i-1-6 i-1-7 i-1-8 i-1-9 .
    i-2-0 i-2-1 i-2-2 i-2-3 i-2-4 i-2-5 i-2-6 i-2-7 i-2-8 i-2-9 .
    i-3-0 i-3-1 i-3-2 i-3-3 i-3-4 i-3-5 i-3-6 i-3-7 i-3-8 i-3-9 .
    i-4-0 .
    i-5-0 i-5-1 i-5-2 i-5-3 i-5-4 i-5-5 i-5-6 i-5-7 i-5-8 i-5-9 .
    i-6-0 i-6-1 i-6-2 i-6-3 i-6-4 i-6-5 i-6-6 i-6-7 i-6-8 i-6-9 .
    i-7-0 i-7-1 i-7-2 i-7-3 i-7-4 i-7-5 i-7-6 i-7-7 i-7-8 i-7-9 .
    i-8-0 i-8-1 i-8-2 i-8-3 i-8-4 i-8-5 i-8-6 i-8-7 i-8-8 i-8-9 .

    Finished tests in 87.427783s, 0.0915 tests/s, 0.0000 assertions/s.

My suggestions:

1. Check the Conductor configuration.
2. Check the Conductor code for launching instances, to see whether it could accidentally switch to Mock under heavy load.
3. Close this bug as NOTABUG.

Your opinions?

Created attachment 564656
Deltacloud heavy load test utility
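
The dynamic driver switching described in the comment above can be illustrated with a minimal Ruby sketch. It is not taken from the attached utility. It assumes the per-request override headers are named X-Deltacloud-Driver and X-Deltacloud-Provider, and the host name, datacenter ID and credentials are placeholders.

```ruby
require 'net/http'
require 'uri'

# Build a GET request for the instance list that asks a Deltacloud server
# (started with any default driver, e.g. "deltacloudd -i mock") to handle
# this single request with another driver and provider.
# Assumption: the override headers are X-Deltacloud-Driver and
# X-Deltacloud-Provider.
def build_instances_request(base_url, driver, provider, user, password)
  uri = URI.parse("#{base_url}/instances")
  request = Net::HTTP::Get.new(uri.request_uri)
  request['X-Deltacloud-Driver'] = driver
  request['X-Deltacloud-Provider'] = provider
  request['Accept'] = 'application/xml'
  request.basic_auth(user, password)
  [uri, request]
end

# Hypothetical values; nothing is sent until the request is passed to Net::HTTP.
uri, request = build_instances_request(
  'http://localhost:3001/api', 'rhevm',
  'https://rhevm.example.com:8443/api;DATACENTER-ID', 'user@domain', 'secret'
)
# To send it: Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

If the driver header is missing, the server falls back to the driver it was started with, which is how RHEV-M credentials can end up in front of the Mock driver.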
When looking in the deltacloud log, I observed HTTP 404 response codes for GET requests to the rhev api instance and template URLs. I added breakpoints to see what UUIDs were being passed to the deltacloud driver. The UUIDs provided were then cross-referenced with the templates and instances pushed to rhevm, by looking at them in a web browser. I was able to see all the templates and instances, but the specific UUIDs provided to deltacloud for the GET requests to rhevm did not exist. This means that for some reason conductor thought the image and template pushes were completed, but in reality they were not actually pushed to rhevm. After deleting the existing rhevm images and trying the entire build/push/launch process again, everything worked.

I would suggest that the Conductor guys perform a check before launching an instance via DC, to verify that the image they want to use is ready and correctly reported by Deltacloud:

GET /api/images/:image_id

Also, the image resource in DC reports 'status', which should say whether the image is *ready* for launch (<status>OK</status>). So instead of just trying to launch an instance and reporting to the user that the instance launch failed, we should report to the user that the image they want to launch is not available due to whatever error, and whether log inspection is needed to figure out the reason.

I spoke with James Labocki wrt the credentials error (AuthFailure). James agrees that this error could have been copied from one training class participant's computer (possibly one with an incorrect configuration). This error may be unrelated to the 404 error that all 6 participants saw when they were unable to launch instances into rhevm. The Deltacloud 404 error is expected when dc cannot find the specified image UUID to launch an instance.

I've asked Mark Wagner to attempt to reproduce this in the perf lab, using a single conductor building and launching many images on rhev-m. (NB the actual case here, many conductors accessing the same rhev-m, is unsupported for production.) If he can reproduce it then we'll have something to go on for a fix. Also, we should go ahead and implement comment 6. See BZ 796725 for that. Meanwhile Dave Johnson is trying to reproduce.

Putting ON_QA; we have not been able to reproduce this.

Opened https://bugzilla.redhat.com/show_bug.cgi?id=797955 to get comment #6 in.

My attempts at reproducing this were unsuccessful using the QE rhevm environment. Comment 4 is somewhat interesting, however: the error closely resembles that of rhev vdsm bug 786942, which was encountered while debugging imagefactory bug 768013. The underlying issue was a qcow2 compression issue within vdsm during import into an iSCSI storage domain. I did check the training rhevm and there does seem to be an iSCSI target, so perhaps further testing is needed using the training rhevm server.

So we are going to leave this as it is, unreproducible. Feel free to re-open if this is encountered again, but please note that pointing 10+ conductors at a single rhevm server is an unsupported scenario.
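
The pre-launch check suggested in the comments above (GET /api/images/:image_id, then inspect the reported status) could look roughly like the following Ruby sketch. It is an illustration, not Conductor code. The method name check_image is hypothetical, and the 'status' element name and the OK value are taken from the comment rather than verified against a Deltacloud response.

```ruby
require 'rexml/document'

# Decide whether an image can be launched, given the HTTP status code and
# XML body returned by GET /api/images/:image_id.
# Returns [ready, reason]. Assumption: the image XML carries a <status>
# child element whose text is "OK" when the image is usable.
def check_image(http_code, xml_body)
  return [false, 'image not found (HTTP 404)'] if http_code == 404
  return [false, "unexpected HTTP status #{http_code}"] unless http_code == 200

  root = REXML::Document.new(xml_body).root
  element = root && root.elements['status']
  status = element ? element.text.to_s.strip : nil

  if status == 'OK'
    [true, 'image is ready']
  else
    [false, "image status is #{status.inspect}"]
  end
end

# Example with a hand-written response body (not captured from a server).
ready, reason = check_image(200, '<image id="abc"><status>OK</status></image>')
puts "ready=#{ready} (#{reason})"
```

Reporting the reason string to the user, instead of a generic launch failure, is the behaviour the comment asks for.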
Description of problem:

These steps reproduce an error found during a training class. The error occurred when 8 people were all using conductor to launch instances into the same rhevm server at the same time.

Steps from a Load Test that reproduced the error (3 different machines were used in the test):

1. Start 8 Deltacloud servers targeting the same rhevm server:

>> API_PROVIDER="https://qeblade26.rhq.lab.eng.bos.redhat.com:8443/api;4051216f-a778-488f-8ae4-5a2541b009ff" deltacloudd -i rhevm -r localhost -p 3001

2. Run the following code, to create 10 instances from the same rhevm image, against all 8 servers concurrently (vary the instance names to avoid name clashes):

******************
it "launching multiple instances from the same image" do
  DeltaCloud.new( API_NAME, API_PASSWORD, API_URL ) do |client|
    # Launch 10 instances from the same image, pausing briefly between launches
    for i in (1..10)
      puts "Launching instance rlandyLT" + i.to_s()
      instance = client.create_instance('f68a49fc-de26-4197-a39c-e557507cd922',
        :hardware_profile=>'SERVER', :name=>"rlandyLT" + i.to_s())
      instance.should_not be_nil
      instance.uri.should match( %r{#{API_URL}/instances/[a-z0-9]*} )
      instance.image.id.should eql('f68a49fc-de26-4197-a39c-e557507cd922' )
      sleep(0.5)
    end
  end
end
*************

7 of the 8 tests pass:

*************
[rlandy@localhost client]$ rake spec API_PROVIDER=rhevm API_PORT=3004
/usr/bin/ruby -S rspec specs/load_spec.rb --format nested --color

instance launch load test
Launching instance rlandyLT20
Launching instance rlandyLT40
Launching instance rlandyLT60
Launching instance rlandyLT80
Launching instance rlandyLT100
Launching instance rlandyLT120
Launching instance rlandyLT140
Launching instance rlandyLT160
Launching instance rlandyLT180
Launching instance rlandyLT200
  launching multiple instances from the same image

Finished in 49.76 seconds
1 example, 0 failures
*************

One of the eight tests fails with the following error:

[root@hp-dl360g5-02 client]# rake spec API_PROVIDER=rhevm API_PORT=3002
/usr/bin/ruby -S rspec specs/load_spec.rb --format nested --color

instance launch load test
Launching instance rlandyLThpC1
  launching multiple instances from the same image (FAILED - 1)

Failures:

  1) instance launch load test launching multiple instances from the same image
     Failure/Error: Unable to find matching line from backtrace
     DeltaCloud::API::BackendError:
       500 : Unhandled exception or status code (AuthFailure)
     #
     # /usr/share/deltacloud-core/lib/deltacloud/drivers/mock/mock_driver.rb:459:in `check_credentials'
     # /usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `call'
     # /usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
     # /usr/share/deltacloud-core/lib/deltacloud/drivers/mock/mock_driver.rb:457:in `check_credentials'
     # /usr/share/deltacloud-core/lib/deltacloud/drivers/mock/mock_driver.rb:170:in `create_instance'
     # /usr/share/deltacloud-core/lib/deltacloud/server.rb:480
     # /usr/share/deltacloud-core/lib/sinatra/rabbit.rb:125:in `instance_eval'
     # /usr/share/deltacloud-core/lib/sinatra/rabbit.rb:125:in `POST /api/instances'

Finished in 0.16429 seconds
1 example, 1 failure

Failed examples:

rspec ./specs/load_spec.rb:29 # instance launch load test launching multiple instances from the same image
rake aborted!
/usr/bin/ruby -S rspec specs/load_spec.rb --format nested --color failed

Tasks: TOP => spec

rpm versions from the test machine where the error was found:

[root@hp-dl360g5-02 specs]# rpm -qa |grep deltacloud
deltacloud-core-vsphere-0.5.0-5.el6.noarch
deltacloud-core-ec2-0.5.0-5.el6.noarch
deltacloud-core-rhevm-0.5.0-5.el6.noarch
deltacloud-core-0.5.0-5.el6.noarch
rubygem-deltacloud-client-0.5.0-2.el6.noarch
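
The failing run above went through mock_driver.rb even though the test targeted rhevm, which suggests checking which driver each server is actually running before starting a load test. The Ruby sketch below is one possible sanity check, not part of the original test suite. It assumes the API entry point (GET /api) returns an <api> root element with a driver attribute; the method names and sample responses are hypothetical.

```ruby
require 'rexml/document'

# Extract the default driver name from the XML returned by GET /api.
# Assumption: the root element looks like <api driver="rhevm" version="...">.
def default_driver(api_xml)
  root = REXML::Document.new(api_xml).root
  root && root.attributes['driver']
end

# Given a hash of port => entry point XML, return the sorted list of ports
# whose server is not running the expected driver.
def wrong_driver_ports(entry_points, expected)
  entry_points.reject { |_port, xml| default_driver(xml) == expected }.keys.sort
end

# Hand-written sample responses standing in for two of the eight servers.
responses = {
  3001 => "<api driver='rhevm' version='0.5.0'/>",
  3002 => "<api driver='mock' version='0.5.0'/>"
}
puts wrong_driver_ports(responses, 'rhevm').inspect  # prints [3002]
```

In a real run the XML strings would come from an HTTP GET against each port. Any port this check reports should be excluded or restarted before the results are interpreted.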