Created attachment 1343303
Ovirt-setup log, ovirt web portal screenshot.

Description of problem:
The engine-setup script fails with an error:
    Failed to execute stage 'Setup validation': 'list' object has no attribute 'splitlines'

Version-Release number of selected component (if applicable):
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.1.6-1.el7.centos.noarch
ovirt-release41-4.1.6-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-host-deploy-1.6.6-1.el7.centos.noarch
ovirt-hosted-engine-ha-2.1.5-1.el7.centos.noarch
ovirt-release-host-node-4.1.6-1.el7.centos.noarch
ovirt-imageio-common-1.0.0-1.el7.noarch
ovirt-node-ng-nodectl-4.1.4-0.20170919.0.el7.noarch
ovirt-imageio-daemon-1.0.0-1.el7.noarch
ovirt-hosted-engine-setup-2.1.3.8-1.el7.centos.noarch
ovirt-node-ng-image-update-4.1.6-1.el7.centos.noarch
vdsm-4.19.31-1.el7.centos.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1. Install the OS (4.1.5.2-1-el7.centos) on 3 nodes using the ovirt-node-ng-installer ISO image.
2. Deploy a HostedEngine cluster on the 3 nodes, based on Gluster storage, via the web portal of any of these nodes.
3. Create some VMs and start them.
4. Upgrade all nodes to version 4.1.6-1 via the oVirt web portal.
5. Put the cluster into GlobalMaintenance from any of these nodes.
6. Run yum update for all packages in the HostedEngine VM.
7. Run engine-setup in the HostedEngine VM and accept all queries.

Actual results:
engine-setup stops with an error:
    'Setup validation': 'list' object has no attribute 'splitlines'

Expected results:
engine-setup completes normally and the oVirt Engine version is updated to 4.1.6-1.

Additional info:
The oVirt engine-setup log is in the attachment.
Seems like a bug in the fix for bug 1261335. That bug was meant to show a nicer error message on some condition.

The bug is in:

    stderrLines = stderr.splitlines()

stderr in this case is already a list and should not be split further.

To see the actual problem causing setup to fail, you can check the setup log. You can also see there:

    Constraint violation found in job_subject_entity (job_id) |1

So you can check the contents of the table 'job_subject_entity' to try and find the problem, or attach it here. You can see it with:

    su - postgres -c "psql engine -c 'select * from job_subject_entity;'"
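A minimal sketch of the kind of guard that would avoid the AttributeError described above, assuming stderr may arrive either as one string or as a list of lines. The helper name to_lines is hypothetical and is not taken from the engine-setup sources; the sketch is written for Python 3.

```python
def to_lines(stderr):
    """Return stderr as a list of lines, whether it is a str or already a list."""
    if stderr is None:
        return []
    if isinstance(stderr, str):
        # A single string still needs to be split into lines.
        return stderr.splitlines()
    # Already a sequence of lines: copy it instead of splitting again.
    return list(stderr)


print(to_lines(["Constraint violation found in job_subject_entity (job_id) |1"]))
print(to_lines("first line\nsecond line"))
```

With a guard like this, the caller can format the lines for the error message without caring which form the command runner returned.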
engine=# select * from job_subject_entity;
                job_id                |              entity_id               | entity_type
--------------------------------------+--------------------------------------+-------------
 5001a696-ef62-48ec-987b-c21ca951b7c7 | 59a9d4f5-06ad-4873-be89-4f9c1119fb52 | VM
 9e6a184f-103d-45cf-9ac1-3b66994abdab | 6e6ee144-5f9e-46aa-a135-56e7eadba743 | VM
(2 rows)
Can you please also check:

    select * from job;

You should see a matching line for each line in job_subject_entity. That's the only constraint I see in the sources.

If they all match, perhaps it was a temporary state and you can try engine-setup again.

If not, it might be interesting/useful to know how this happened, but I guess it should be safe to remove the offending line from job_subject_entity. If you do want to investigate, you can start by searching for the IDs in the engine logs.
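The matching check described above can also be done in one query instead of comparing the two tables by eye. The sketch below demonstrates the idea on an in-memory SQLite database with a heavily simplified schema; the table and job_id column names come from this report, while the sample IDs, the reduced column set, and the use of SQLite instead of the engine's PostgreSQL database are assumptions made only to keep the example self-contained.

```python
import sqlite3

# Simplified stand-ins for the two engine tables (assumed schema, not the real one).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (job_id TEXT PRIMARY KEY);
    CREATE TABLE job_subject_entity (job_id TEXT, entity_id TEXT, entity_type TEXT);
    INSERT INTO job VALUES ('job-with-parent');
    INSERT INTO job_subject_entity VALUES ('job-with-parent', 'vm-1', 'VM');
    INSERT INTO job_subject_entity VALUES ('job-without-parent', 'vm-2', 'VM');
""")

# Rows in job_subject_entity whose job_id has no matching row in job.
ORPHAN_QUERY = """
    SELECT e.job_id, e.entity_id
    FROM job_subject_entity AS e
    LEFT JOIN job AS j ON j.job_id = e.job_id
    WHERE j.job_id IS NULL
"""

orphans = conn.execute(ORPHAN_QUERY).fetchall()
print(orphans)  # [('job-without-parent', 'vm-2')]
```

The same LEFT JOIN should be usable from psql against the engine database, where it would list only the lines that violate the constraint.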
The job table is empty.

engine=# select * from job;
 job_id | action_type | description | status | owner_id | visible | start_time | end_time | last_update_time | correlation_id | is_external | is_auto_cleared | engine_session_seq_id
--------+-------------+-------------+--------+----------+---------+------------+----------+------------------+----------------+-------------+-----------------+-----------------------
(0 rows)

The above VMs (ID 59a9d4f5-06ad-4873-be89-4f9c1119fb52, ID 6e6ee144-5f9e-46aa-a135-56e7eadba743) are mostly in a down state. The last operation on them was removing the additional HDDs via the oVirt web portal.

Do I need to export them to backup storage and remove them from the cluster before clearing job_subject_entity?
The offending lines have been removed from the job_subject_entity table, and engine-setup completed successfully.
Most probably it was caused by a VM filesystem crash due to a hardware failure during some task operation.
Many thanks for your help!
Thanks for the report :-)

Keeping the bug open for fixing the error message.

Also noticed a related bug while looking at the setup log:

    /usr/share/ovirt-engine/setup/dbutils/fkvalidator.sh: line 89: exit: Constraint: numeric argument required

Relevant code is:

    if [ "${exit_code}" = "0" -a -z "${fix_it}" ]; then
        exit_code="$(echo "${res}" | sed -n '2p')"
    fi
    exit ${exit_code}

So exit_code here is probably "Constraint violation found in..." instead of a number.
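The failure mode can be illustrated outside the shell. The sketch below, written in Python purely for illustration, mimics taking the second line of the validator output as the exit code and falls back to a generic failure code when that line is not a plain number. The function name, the fallback value of 1, and the sample inputs are assumptions; this is not the fix applied to fkvalidator.sh.

```python
import re


def exit_code_from_output(res, fallback=1):
    """Pick an exit code from the second line of res, as `sed -n '2p'` would select it."""
    lines = res.splitlines()
    second = lines[1].strip() if len(lines) > 1 else ""
    # Only a plain non-negative integer is usable as an exit status.
    if re.fullmatch(r"[0-9]+", second):
        return int(second)
    # Anything else (for example a second "Constraint violation ..." line) is
    # treated as a generic failure instead of being passed on as the status.
    return fallback


two_violations = (
    "Constraint violation found in async_tasks_entities (async_task_id) |1\n"
    "Constraint violation found in job_subject_entity (job_id) |1\n"
)
print(exit_code_from_output(two_violations))
```

Validating the value before using it as an exit status is what keeps a second violation message from reaching the exit call.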
Reusing current bug also to fix the other issue.

To reproduce, you have to have at least 2 violations in different tables. Did this:

    alter table job_subject_entity DISABLE TRIGGER ALL;
    insert into job_subject_entity values ('5001a696-ef62-48ec-987b-c21ca951b7c7', '59a9d4f5-06ad-4873-be89-4f9c1119fb52', 'VM');
    insert into job_subject_entity values ('5001a696-ef62-48ec-987b-c21ca951b7c8', '59a9d4f5-06ad-4873-be89-4f9c1119fb53', 'VM');
    alter table job_subject_entity ENABLE TRIGGER ALL;
    alter table async_tasks_entities DISABLE TRIGGER ALL;
    insert into async_tasks_entities values ('5001a696-ef62-48ec-987b-c21ca951b7c9', '59a9d4f5-06ad-4873-be89-4f9c1119fb54', 'VM');
    alter table async_tasks_entities ENABLE TRIGGER ALL;

With that, engine-setup fails as in the attached log, see comment 6.
Note to QE:

Reproduction/Verification flow:

1. Set up an engine.
2. Cause at least two different tables to have entries that have invalid foreign keys. No idea how to cause this to happen using "normal" means, and almost certainly there aren't any. If there are, it's most likely a bug in the engine or in postgresql (or both). I personally did this using comment 7.
3. Update the setup packages to the version you want to verify.
4. Run engine-setup.

With a broken version, it will fail with the message in comment 0, and the setup log will also have a line like the one in comment 6.

With a fixed version, it will fail with a nicer message:

    [ERROR] Failed to execute stage 'Setup validation': Failed checking Engine database: an exception occurred while validating the Engine database, please check the logs for getting more info: Constraint violation found in async_tasks_entities (async_task_id) |1

And in the setup log you should see something like this:

    2017-11-09 14:14:59,641+0200 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.upgrade.dbvalidations plugin.execute:926 execute-output: ['/usr/share/ovirt-engine/setup/dbutils/validatedb.sh', '--user=engine', '--host=localhost', '--port=5432', '--database=engine', '--log=/var/log/ovirt-engine/setup/ovirt-engine-setup-20171109141455-0j0cn5.log'] stderr:
    Constraint violation found in async_tasks_entities (async_task_id) |1
    Constraint violation found in job_subject_entity (job_id) |1
engine-setup failed with the nicer message and the log contains the details, following the steps in Comment 8.
Verified in ovirt-engine-setup-4.2.0.2-0.1.el7.noarch
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.