Currently, during engine-upgrade, there is a loop that waits 3 minutes for System tasks to finish. Tasks could finish well within that window, which leaves the customer waiting for the extra time. The utility should sample the current state and continue the upgrade as soon as the tasks are cleared.
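The sample-and-continue behaviour requested above could look roughly like the sketch below. It is not the code from the merged patch: `wait_for_tasks`, `count_running_tasks` and the 5-second sampling interval are hypothetical names and values chosen for illustration; only the 3-minute limit comes from this report.

```python
import time
from typing import Callable


def wait_for_tasks(
    count_running_tasks: Callable[[], int],  # hypothetical callback, e.g. a DB query
    timeout: float = 180.0,                  # the 3-minute limit from this report
    interval: float = 5.0,                   # assumed sampling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as no tasks remain, False once the timeout expires."""
    deadline = clock() + timeout
    while True:
        # Sample the current state first, so an already-clean system
        # continues without any waiting at all.
        if count_running_tasks() == 0:
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

The clock and sleep functions are parameters only so that the loop can be exercised without actually waiting; a caller would normally pass just the task-counting callback.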
patch 12177 merged upstream master: http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=88142da3d12d2af79434582e47e10458716636fe
How should I test this BZ? Please advise what steps I should take to verify it. Should I test the upgrade to SF9, or from SF9?
Should I upgrade to SF9 or from SF9? Alex: You need to start an upgrade with async tasks in the DB. After the upgrade has started, you need to clear the async tasks in less than 3 minutes; you should see that the upgrade continues automatically right away.
I started the upgrade while an async task was running. In less than 30 seconds the upgrade failed, and only the postgres service was up. Is that correct?
-----------------
rhevm-upgrade
Checking for updates... (This may take several minutes)...[ DONE ]
10 Updates available:
 * rhevm-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-backend-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-config-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-dbscripts-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-genericapi-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-notification-service-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-restapi-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-tools-common-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-userportal-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-webadmin-portal-3.2.0-10.14.beta1.el6ev.noarch
Stopping ovirt-engine service... [ DONE ]
Stopping DB related services... [ DONE ]
Cleaning async tasks... [ DONE ]
Pre-upgrade validations... [ DONE ]
Backing Up Database... [ DONE ]
Rename Database... [ ERROR ]
Error: Database rename failed. Check that there are no active connections to the DB and try again.
Error: Upgrade failed. please check log at /var/log/ovirt-engine/ovirt-engine-upgrade_2013_03_12_15_38_28.log
[root@dbotzer-upgrade yum.repos.d]# Srvs
postmaster (pid 16090) is running...
The engine is not running.
/etc/init.d/ovirt-engine-dwhd is stopped
The second time I ran the upgrade, it failed saying that connections to the DB exist. Only after I manually restarted the postgresql and ovirt-engine-dwhd services could I run the upgrade.
======================================================
rhevm-upgrade
Checking for updates... (This may take several minutes)...[ DONE ]
10 Updates available:
 * rhevm-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-backend-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-config-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-dbscripts-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-genericapi-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-notification-service-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-restapi-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-tools-common-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-userportal-3.2.0-10.14.beta1.el6ev.noarch
 * rhevm-webadmin-portal-3.2.0-10.14.beta1.el6ev.noarch
Stopping ovirt-engine service... [ DONE ]
Stopping DB related services... [ DONE ]
Cleaning async tasks... [ DONE ]
Pre-upgrade validations... [ DONE ]
Backing Up Database... [ DONE ]
Rename Database... [ ERROR ]
Error: Database rename failed. Check that there are no active connections to the DB and try again.
Error: Upgrade failed. please check log at /var/log/ovirt-engine/ovirt-engine-upgrade_2013_03_12_15_41_14.log
Created attachment 708979 [details] first-upgrade-fail
Created attachment 708980 [details] Second-upgrade-fail-db-connections
Created attachment 708982 [details] third-upgrade-fail
Given the above results, is this the correct behaviour for the upgrade?
Eli: Can you please advise why something was connected to the DB while dwhd was stopped, when nothing should still be connected to it?
When I upgrade while dwh is running, should the upgrade process clear all DB connections so that rhevm-upgrade can succeed? I don't think it is quite OK to close Bug 877749: the upgrade identified the async tasks and stopped, but the error was not about async tasks. It was, as seen in the first comments in this BZ:
Error: Database rename failed. Check that there are no active connections to the DB and try again.
Error: Upgrade failed.
-----------------
So a user cannot understand that it failed because async tasks were running.
(In reply to comment #13)
> Eli:
> Can u please advise,
> why something was connected to the db while dwhd is stopped and nothing
> should be still connected to it

The fact is that the database was in use by another tool or instance.
The easiest way to prevent that is to
1) restart the postgresql server, or
2) use ps -ef to find who is using the database
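For option 2, a small filter over the `ps -ef` output can narrow things down to PostgreSQL backends attached to one database. This is only a sketch: it assumes the usual backend process title of the form `postgres: <user> <database> <client> <state>`, the function name `postgres_backends` is hypothetical, and the sample lines are made up for illustration (apart from the postmaster pid, they are not taken from the attached logs).

```python
from typing import List


def postgres_backends(ps_output: str, database: str) -> List[str]:
    """Return the `ps -ef` lines of PostgreSQL backends connected to `database`."""
    matches = []
    for line in ps_output.splitlines():
        # Assumed backend process title:
        #   postgres: <user> <database> <client address> <state>
        _, marker, title = line.partition("postgres: ")
        if not marker:
            continue  # not a PostgreSQL child process
        fields = title.split()
        # Helper processes ("writer process", ...) have no database field
        # that matches, so they are skipped here.
        if len(fields) >= 2 and fields[1] == database:
            matches.append(line)
    return matches


# Hypothetical sample of `ps -ef` output, for illustration only.
SAMPLE = """\
postgres 16090     1  0 15:30 ?  00:00:00 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 16101 16090  0 15:30 ?  00:00:00 postgres: writer process
postgres 16210 16090  0 15:31 ?  00:00:00 postgres: engine engine 127.0.0.1(51234) idle
"""

for backend in postgres_backends(SAMPLE, "engine"):
    print(backend)
```

On a real host the input would be the actual output of `ps -ef`; the client address in a matching line then points at whichever service still holds the connection.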
It seems obvious that dwh is using the DB. I had this discussion with Alex about whether rhevm-upgrade should do what you suggested, and integration decided that the user should do it manually before the upgrade. But what happens when both dwh and async tasks are running?
Fixed, 3.2/sf10
rhevm-upgrade stops right away when async tasks are running. There is room for a proper message explaining why the upgrade was stopped, because one of the stages shows:
Cleaning async tasks... [ DONE ]
Once those tasks are cleared, the upgrade continues.
Please answer this last note so I can close this BZ.
If you need to see why it is stopped at the "Cleaning async tasks" step: while the script is waiting, it writes a debug line to the log, "Still waiting for system tasks to be cleared."
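A quick way to check an upgrade log for that debug line is sketched below. The helper names are hypothetical, and the log path has to be supplied by the caller (for example one of the ovirt-engine-upgrade_*.log files under /var/log/ovirt-engine mentioned earlier in this BZ); only the message text itself comes from the comment above.

```python
WAITING_MARKER = "Still waiting for system tasks to be cleared."


def count_waiting_lines(log_text: str) -> int:
    """Count how many times the upgrade logged that it was still waiting."""
    return sum(1 for line in log_text.splitlines() if WAITING_MARKER in line)


def count_waiting_lines_in_file(path: str) -> int:
    """Same check, reading the log from `path` (supplied by the caller)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_waiting_lines(handle.read())
```

A non-zero count means the upgrade really was paused on running tasks at that step, even though the console only shows "Cleaning async tasks... [ DONE ]".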
I installed a clean rhevm 3.2/SF9 without Reports and without DWH.
I imported a Template from an NFS Export Domain and could see it in psql.
I started rhevm-upgrade (SF10) and the async task information was deleted from psql. See pastebin http://pastebin.test.redhat.com/132514
To examine whether the rhevm-upgrade failure had left "leftovers", I started the engine service to check the template, but it is in Lock !!!

./unlock_entity.sh -t disk -q -s localhost -p 5432 -d engine -u postgres
/usr/share/ovirt-engine/dbscripts /usr/share/ovirt-engine/dbscripts
              entity_id               |               disk_id
--------------------------------------+--------------------------------------
 3f8d3031-8423-4b4c-8320-63b491328981 | cc0be0f7-649c-4c19-bcb6-5a1fb7e739c3
(1 row)

See Log: ovirt-engine-upgrade_2013_03_14_13_08_25
Created attachment 709979 [details] No-Reports-Upgrade-Lock-Template
Hi, where does this bug stand? What about the last note?
3.2/sf13 -> 3.2/sf14

1. I upgraded rhevm with an async task running (creating a Template from a VM). After the upgrade the VM is "Image locked" and the template I started to create is "locked" as well. Is this correct behaviour?

./unlock_entity.sh -t disk -q -s localhost -p 5432 -d engine -u postgres
/usr/share/ovirt-engine/dbscripts /usr/share/ovirt-engine/dbscripts
              entity_id               |               disk_id
--------------------------------------+--------------------------------------
 485da855-3faf-4bdf-a35f-0efcb329759a | 830d3180-4785-4ec2-beb6-bae47920b1f3
 d9b089a7-6066-434f-8754-3b0ecabc7d81 | e607fb2d-ad82-436c-8970-0046d2c22de0
(2 rows)
---------------------------
2. I upgraded from SF13 to SF14 (no reports, only rhevm). See below: is this the message I should get when upgrading while an async task is running? Should there be a System Task name, or is that the name itself?
------------------------------------------------------------
Would you like to proceed? (yes|no): y
Stopping ovirt-engine service... [ DONE ]
Stopping DB related services... [ DONE ]
Cleaning async tasks... [ DONE ]
Info: The following tasks have been found running in the system:
System Tasks:
[ May 01 16:52:40 ] Would you like to proceed and try to stop tasks automatically? (Answering 'no' will stop the upgrade)? (yes|no): y
[ May 01 16:53:13 ] System will try to clear tasks during the next 3 minutes.
-----------------------------------------------------------
Attached logs
Created attachment 742219 [details] rhevm-logs-NEW_1-5-2013
Created attachment 742220 [details] VDSM_Logs
Fixed, 3.2/sf14
Tasks are examined correctly. The message shown during upgrade and the stuck template will get new BZs.
3.2 has been released