Created attachment 1827348 [details]
host

Description of problem:
While trying to move the last host of a 10-host cluster to maintenance, the operation failed with an "Image transfer is in progress" error, although no image transfers are in progress and all disks are marked with status OK.

The error from the UI:
Error while executing action: Cannot switch Host f16-h33-000-r640.rdu2.scalelab.redhat.com to Maintenance mode. Image transfer is in progress for the following (13) disks: c32314ab-77d4-40ec-9962-eb53b9f90f58, d71c52c0-4154-4070-a0f8-b51cca2ce0c7, ea1a1782-dac9-4409-9a8d-0d737a213ffe, f4e6933e-f360-4c60-90be-51e42cf6c027, f4e6933e-f360-4c60-90be-51e42cf6c027, ... Please wait for the operations to complete and try again.

Version-Release number of selected component (if applicable):
rhv-release-4.4.8-5-001.noarch

From the engine log:
2021-09-29 08:54:40,267-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-947) [e0255ab2-e7b0-4ffb-8b03-8109e5cc3963] EVENT_ID: GENERIC_ERROR_MESSAGE(14,001), Cannot switch Host f16-h33-000-r640.rdu2.scalelab.redhat.com to Maintenance mode. Image transfer is in progress for the following (13) disks: ${disks} Please wait for the operations to complete and try again.,$disks c32314ab-77d4-40ec-9962-eb53b9f90f58, d71c52c0-4154-4070-a0f8-b51cca2ce0c7, ea1a1782-dac9-4409-9a8d-0d737a213ffe, f4e6933e-f360-4c60-90be-51e42cf6c027, f4e6933e-f360-4c60-90be-51e42cf6c027, ...
2021-09-29 08:54:40,267-04 WARN [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-947) [e0255ab2-e7b0-4ffb-8b03-8109e5cc3963] Validation of action 'MaintenanceNumberOfVdss' failed for user admin@internal-authz.
Reasons: VAR__TYPE__HOST,VAR__ACTION__MAINTENANCE,VDS_CANNOT_MAINTENANCE_HOST_WITH_RUNNING_IMAGE_TRANSFERS,$host f16-h33-000-r640.rdu2.scalelab.redhat.com,$disks c32314ab-77d4-40ec-9962-eb53b9f90f58, d71c52c0-4154-4070-a0f8-b51cca2ce0c7, ea1a1782-dac9-4409-9a8d-0d737a213ffe, f4e6933e-f360-4c60-90be-51e42cf6c027, f4e6933e-f360-4c60-90be-51e42cf6c027, ...,$disks_COUNTER 13
2021-09-29 08:54:40,267-04 INFO [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-947) [e0255ab2-e7b0-4ffb-8b03-8109e5cc3963] Lock freed to object 'EngineLock:{exclusiveLocks='', sharedLocks='[94b5e416-efef-41dc-949a-c4df7c78d63f=POOL]'}'
2021-09-29 08:54:42,918-04 INFO [org.ovirt.engine.core.sso.service.AuthenticationService] (default task-947) [] User admin@internal-authz with profile [internal] successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2021-09-29 08:54:42,930-04 INFO [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-946) [56f33456] Running command: CreateUserSessionCommand internal: false.
2021-09-29 08:54:42,937-04 INFO [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (default task-946) [4eb58174] Running command: LogoutSessionCommand internal: false.

P.S.
1. Nine hosts on the same cluster (L0_Group_4) were moved to maintenance and then removed from the UI without any issues.
2. The severity is set to high because there is no workaround to bypass the message and put the host into maintenance.

The vdsm logs and the engine log can be found here:
https://drive.google.com/drive/folders/1pOouooLC8kD9M8JGmLsDSyxZQ0dKsLwj?usp=sharing
This bug looks the same as bug 1987295, which should be fixed in 4.4.8-5. Can you confirm the engine version?
engine version : ovirt-engine-4.4.8.4-0.7.el8ev.noarch
Is there any workaround to bypass the message and put the host into maintenance?
(In reply to Tzahi Ashkenazi from comment #3)
> any workaround to by pass the message and put the host to maintenance ??

Yes, you just need to wait for some time (15 minutes tops) until the DB cleaner thread removes the image_transfer entity from the DB. Alternatively, you can remove it manually.

Closing as a duplicate of bug 1987295.

*** This bug has been marked as a duplicate of bug 1987295 ***
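For reference, the manual removal mentioned above could look roughly like the following, run against the engine DB. This is only a sketch: the table and column names (image_transfers, phase, last_updated) are the ones shown in the query output later in this report, and the one-hour cutoff is an illustrative assumption — verify that no transfer is genuinely active before deleting anything.

```sql
-- Inspect transfer entities that have not been updated recently:
SELECT command_id, phase, last_updated, disk_id
FROM image_transfers
WHERE last_updated < now() - interval '1 hour';

-- If they are confirmed stale, remove them manually:
DELETE FROM image_transfers
WHERE last_updated < now() - interval '1 hour';
```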
Manual intervention was required in order to move the host to maintenance; the recommendation to wait 15 minutes did not work in this case (per Eyal).

The workaround on the engine DB:

select vds_id from vds_static where vds_name='f16-h33-000-r640.rdu2.scalelab.redhat.com';
select status from vds_dynamic where vds_id='060f71f9-ebdf-4e04-8fe5-8ac538172b12';
update vds_dynamic set status=2 where vds_id='060f71f9-ebdf-4e04-8fe5-8ac538172b12';

P.S. status=2 is Maintenance.
(In reply to Tzahi Ashkenazi from comment #5)
> manual intervention was required in order to move the host to maintenance
> the recommendation to wait 15 min didn't work in this case ( by Eyal )
> the workaround on the engine DB :
>
> select vds_id from vds_static where
> vds_name='f16-h33-000-r640.rdu2.scalelab.redhat.com';
> select status from vds_dynamic where
> vds_id='060f71f9-ebdf-4e04-8fe5-8ac538172b12';
> update vds_dynamic set status=2 where
> vds_id='060f71f9-ebdf-4e04-8fe5-8ac538172b12';
>
> p.s
> status=2 is maintenance

That's strange; in this version we should have the DbEntityCleanupManager thread, which cleans up finished/failed image transfer sessions.

Can you check the DB for any existing image transfer entity?

select * from image_transfers;

And please add the engine logs.
engine=# select * from image_transfers;

All 13 rows share the same values in the following columns:
  command_type = 1024, phase = 7, vds_id = 060f71f9-ebdf-4e04-8fe5-8ac538172b12,
  proxy_uri = https://rhev-red-01.rdu2.scalelab.redhat.com:54323/images,
  daemon_uri = https://f16-h33-000-r640.rdu2.scalelab.redhat.com:54322/images,
  bytes_total = 53687091200, type = 1, client_inactivity_timeout = 60,
  image_format = 5, backend = 1, client_type = 2, shallow = f, timeout_policy = legacy
  (message, imaged_ticket_id and backup_id are empty in every row)

              command_id              |        last_updated        |               disk_id                | bytes_sent  | active
--------------------------------------+----------------------------+--------------------------------------+-------------+--------
 910ed2c3-44ca-4502-851f-6b065113f650 | 2021-08-24 07:27:25.582+00 | 6134bbd3-51c6-4b74-96a8-b091d4e5c5ea | 19537068032 | t
 65e8a5e9-266c-4533-ab9f-e176c2bf3041 | 2021-08-24 07:27:22.707+00 | d71c52c0-4154-4070-a0f8-b51cca2ce0c7 | 10687086592 | t
 bad05882-3795-4350-8922-acd54dcee6c8 | 2021-08-23 10:31:38.583+00 | 73b80669-c21f-403f-8496-ad2d99d11c27 | 19998441472 | t
 a8b7395d-87bb-4591-983c-17fa8f562b32 | 2021-08-23 10:31:38.979+00 | f4e6933e-f360-4c60-90be-51e42cf6c027 | 50843353088 | t
 f7b07d64-202d-49e8-a7c8-b980ce0b18bc | 2021-08-23 10:31:34.927+00 | 59850072-f493-440f-8724-8568583ba1e9 | 34191966208 | t
 aa020bcb-ac75-4c7b-b051-4534842044e1 | 2021-08-23 10:31:39.213+00 | 3dd79e70-bf55-4a39-b7b7-f4e439e54c80 | 23077060608 | t
 66736bc1-5406-4fd8-bbcd-99260a6453a2 | 2021-08-23 10:31:39.258+00 | 6134bbd3-51c6-4b74-96a8-b091d4e5c5ea | 21760049152 | t
 4e204816-eeac-464d-b44b-d36260b22dee | 2021-08-23 10:31:35.11+00  | ea1a1782-dac9-4409-9a8d-0d737a213ffe | 42840621056 | t
 ce36488f-5b12-4a8a-92ca-30cff34bfee3 | 2021-08-23 10:31:36.47+00  | c32314ab-77d4-40ec-9962-eb53b9f90f58 | 26508001280 | t
 b43856fe-2a98-40e4-a976-c8374ddbeb3e | 2021-08-24 07:27:21.666+00 | 73b80669-c21f-403f-8496-ad2d99d11c27 | 20887633920 | t
 b6771f89-b358-4c30-99bc-f3b129a04eb8 | 2021-08-24 07:27:23.1+00   | f4e6933e-f360-4c60-90be-51e42cf6c027 | 42991616000 | t
 d77894e6-7179-4ad7-8fd0-a364b8e7a624 | 2021-08-23 12:36:31.868+00 | 59850072-f493-440f-8724-8568583ba1e9 | 53687091200 | f
 7f2d7f41-5a16-4e2a-9b89-b3c0c9db5261 | 2021-08-24 07:36:43.474+00 | 348fcbb4-4dd0-4e62-9428-f935a9e6e037 | 26667384832 | t
(13 rows)
It seems your environment still has lingering image transfer sessions. All of the image transfers have phase '7', which is FINALIZING_SUCCESS. The host can be set to maintenance only when the phase is FINISHED_SUCCESS=9, FINISHED_FAILURE=10, PAUSED_SYSTEM=4, or PAUSED_USER=5. I am not sure what caused them to stay at that phase, but given these entries the reported block on maintenance is the expected behavior.
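Based on the phase values listed above, a query like the following could be used to list exactly which transfer entities are blocking maintenance for this host. This is a sketch against the image_transfers table shown earlier in this report; the phase constants (4, 5, 9, 10 are the non-blocking phases) are the ones stated in the comment above.

```sql
-- Transfers on this host whose phase still blocks maintenance
-- (anything other than PAUSED_SYSTEM=4, PAUSED_USER=5,
--  FINISHED_SUCCESS=9, FINISHED_FAILURE=10):
SELECT command_id, disk_id, phase, last_updated, active
FROM image_transfers
WHERE vds_id = '060f71f9-ebdf-4e04-8fe5-8ac538172b12'
  AND phase NOT IN (4, 5, 9, 10);
```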