While launching an instance with a 10Gb for the first time, I'm seeing these kind of errors from the api, compute, scheduler, network and volume services:
2011-09-01 07:24:05,947 ERROR nova [-] model server went away
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/usr/lib/python2.7/site-packages/nova/service.py", line 268, in report_state
(nova): TRACE: service_ref = db.service_get(ctxt, self.service_id)
(nova): TRACE: File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 83, in service_get
(nova): TRACE: return IMPL.service_get(context, service_id)
(nova): TRACE: File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 99, in wrapper
(nova): TRACE: return f(*args, **kwargs)
(nova): TRACE: File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 159, in service_get
(nova): TRACE: filter_by(deleted=can_read_deleted(context)).\
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1770, in first
(nova): TRACE: ret = list(self[0:1])
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1667, in __getitem__
(nova): TRACE: return list(res)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1839, in __iter__
(nova): TRACE: return self._execute_and_instances(context)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1854, in _execute_and_instances
(nova): TRACE: result = conn.execute(querycontext.statement, self._params)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1399, in execute
(nova): TRACE: params)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1532, in _execute_clauseelement
(nova): TRACE: compiled_sql, distilled_params
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_context
(nova): TRACE: context)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1633, in _execute_context
(nova): TRACE: context)
(nova): TRACE: File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 325, in do_execute
(nova): TRACE: cursor.execute(statement, parameters)
(nova): TRACE: OperationalError: (OperationalError) database is locked u'SELECT anon_1.services_created_at AS anon_1_services_created_at, anon_1.services_upd
ated_at AS anon_1_services_updated_at, anon_1.services_deleted_at AS anon_1_services_deleted_at, anon_1.services_deleted AS anon_1_services_deleted, anon_1.s
ervices_id AS anon_1_services_id, anon_1.services_host AS anon_1_services_host, anon_1.services_binary AS anon_1_services_binary, anon_1.services_topic AS an
on_1_services_topic, anon_1.services_report_count AS anon_1_services_report_count, anon_1.services_disabled AS anon_1_services_disabled, anon_1.services_avai
lability_zone AS anon_1_services_availability_zone, compute_nodes_1.created_at AS compute_nodes_1_created_at, compute_nodes_1.updated_at AS compute_nodes_1_u
pdated_at, compute_nodes_1.deleted_at AS compute_nodes_1_deleted_at, compute_nodes_1.deleted AS compute_nodes_1_deleted, compute_nodes_1.id AS compute_nodes_
1_id, compute_nodes_1.service_id AS compute_nodes_1_service_id, compute_nodes_1.vcpus AS compute_nodes_1_vcpus, compute_nodes_1.memory_mb AS compute_nodes_1_
memory_mb, compute_nodes_1.local_gb AS compute_nodes_1_local_gb, compute_nodes_1.vcpus_used AS compute_nodes_1_vcpus_used, compute_nodes_1.memory_mb_used AS
compute_nodes_1_memory_mb_used, compute_nodes_1.local_gb_used AS compute_nodes_1_local_gb_used, compute_nodes_1.hypervisor_type AS compute_nodes_1_hypervisor
_type, compute_nodes_1.hypervisor_version AS compute_nodes_1_hypervisor_version, compute_nodes_1.cpu_info AS compute_nodes_1_cpu_info \nFROM (SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.host AS services_host, services.binary AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.availability_zone AS services_availability_zone \nFROM services \nWHERE services.id = ? AND services.deleted = ?\n LIMIT ? OFFSET ?) AS anon_1 LEFT OUTER JOIN compute_nodes AS compute_nodes_1 ON compute_nodes_1.service_id = anon_1.services_id AND compute_nodes_1.deleted = ?' (3, 0, 1, 0, 0)
I'm also seeing this from the network and scheduler services when running 'nova-manage network create'
I'm able to reproduce this, as well. For example, I see it when creating a network:
$ sudo nova-manage network create admin 10.0.0.0/24 1 256 --bridge=br0
The error comes from Service.report_state() in nova/service.py. This function gets called in a loop in a background thread. By default, it runs every 10 seconds. This error occurs when this check happens in the middle of some operation being processed. In my last test of creating a network, it took almost 20 seconds to complete and I got the error once in the middle.
I'd consider this a fairly low priority issue to deal with, because:
a) After reviewing the code it's coming from, it appears to be more of an annoyance than a real problem.
b) It's probably specific to sqlite. If you switch to a different db backend, I bet it doesn't happen.
We could fix it, but there are probably some other things we would want to focus on first. The report_state() code could be made a bit smarter on what it considers an error. Or perhaps it makes more sense to do something to synchronize access to the sqlite db.
If it's bothersome enough, we could consider switching to a different backend DB by default.
As a follow-up to this, we really should switch to something else by default. In this particular case in the report, the error is harmless. However, due to how OpenStack works, it looks like similar errors are going to occur elsewhere at random times causing real problems as long as sqlite is the backend database.
More on switching the default here: http://lists.fedoraproject.org/pipermail/cloud/2011-November/000996.html
Created attachment 535682 [details]
use mysql by default
I have attached a first pass at updating openstack-nova to be configured to use MySQL by default. It also includes a helper script for setting up MySQL for use with openstack-nova.
This looks perfect Russell. I just gave the script a go and it works great. The packaging changes look fine too.
openstack-nova-2011.3-9.el6 has been submitted as an update for Fedora EPEL 6.
openstack-nova-2011.3-8.fc16 has been submitted as an update for Fedora 16.
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing openstack-nova-2011.3-8.fc16'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
openstack-nova-2011.3-10.el6 has been submitted as an update for Fedora EPEL 6.
openstack-nova-2011.3-11.fc16 has been submitted as an update for Fedora 16.
openstack-nova-2011.3-11.el6 has been submitted as an update for Fedora EPEL 6.
openstack-nova-2011.3-13.fc16 has been submitted as an update for Fedora 16.
openstack-nova-2011.3-13.el6 has been submitted as an update for Fedora EPEL 6.
openstack-nova-2011.3-13.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
openstack-nova-2011.3-14.el6 has been submitted as an update for Fedora EPEL 6.
openstack-nova-2011.3-18.el6 has been submitted as an update for Fedora EPEL 6.