Red Hat Bugzilla – Bug 1486375
Katello::Host::Update gets retriggered frequently when it fails due to locking
Last modified: 2017-11-02 08:20:12 EDT
Description of problem:
A Katello::Host::Update action gets triggered for a host. Another host update then gets triggered for the same host, but fails because the lock is still held by the original task. For an unknown reason, an additional host update task gets triggered within seconds of the second one failing. This leads to massive growth in the task count (most of the tasks fail) and also degrades performance.

For example: there are 86 host update tasks for a single host, all of which failed because one single host update task was already running (the original task ran for 1h15m).

Task start times:
2017-08-29 04:42:10  2017-08-29 04:42:11
2017-08-29 04:43:11  2017-08-29 04:43:12
2017-08-29 04:44:13  2017-08-29 04:44:14
2017-08-29 04:45:17  2017-08-29 04:45:19
2017-08-29 04:46:20  2017-08-29 04:46:24
2017-08-29 04:47:25  2017-08-29 04:47:27
2017-08-29 04:48:27  2017-08-29 04:48:28
2017-08-29 04:49:29  2017-08-29 04:49:31
2017-08-29 04:50:32  2017-08-29 04:50:32
2017-08-29 04:51:33  2017-08-29 04:51:34
2017-08-29 04:52:35  2017-08-29 04:52:37
2017-08-29 04:53:37  2017-08-29 04:53:38
2017-08-29 04:54:39  2017-08-29 04:54:40
2017-08-29 04:55:41  2017-08-29 04:55:42
2017-08-29 04:56:43  2017-08-29 04:56:44
2017-08-29 04:57:44  2017-08-29 04:57:45
2017-08-29 04:58:46  2017-08-29 04:58:46
2017-08-29 04:59:47  2017-08-29 04:59:48
2017-08-29 05:00:48  2017-08-29 05:00:49
2017-08-29 05:01:50  2017-08-29 05:01:51
2017-08-29 05:02:51  2017-08-29 05:02:52
2017-08-29 05:03:53  2017-08-29 05:03:53
2017-08-29 05:04:54  2017-08-29 05:04:55
2017-08-29 05:05:55  2017-08-29 05:05:56
2017-08-29 05:06:57  2017-08-29 05:06:57
2017-08-29 05:07:58  2017-08-29 05:07:59
2017-08-29 05:09:00  2017-08-29 05:09:00
2017-08-29 05:10:01  2017-08-29 05:10:02
2017-08-29 05:11:03  2017-08-29 05:11:03
2017-08-29 05:12:04  2017-08-29 05:12:04
2017-08-29 05:13:05  2017-08-29 05:13:06
2017-08-29 05:14:07  2017-08-29 05:14:08
2017-08-29 05:15:08  2017-08-29 05:15:09
2017-08-29 05:16:10  2017-08-29 05:16:11
2017-08-29 05:17:12  2017-08-29 05:17:13
2017-08-29 05:18:13  2017-08-29 05:18:14
2017-08-29 05:19:15  2017-08-29 05:19:15
2017-08-29 05:20:16  2017-08-29 05:20:17
2017-08-29 05:21:18  2017-08-29 05:21:18
2017-08-29 05:22:19  2017-08-29 05:22:20
2017-08-29 05:23:20  2017-08-29 05:23:21
2017-08-29 05:24:22  2017-08-29 05:24:22
2017-08-29 05:25:23  2017-08-29 05:25:24

As we can see, two host updates are triggered for the host roughly every 1 minute and 1 second.

Also (might not be related), their /var/log/messages is filled with entries like:
    qdrouterd: ROUTER_LS (info) Router Link Lost - link_id=0

Version-Release number of selected component (if applicable):

How reproducible:
No reproducer yet

Steps to Reproduce:
1.
2.
3.

Actual results:
Host updates are retriggered frequently (two roughly every minute) when they fail due to locking.

Expected results:
Host updates don't get triggered so frequently when they fail.

Additional info:
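The retrigger cadence noted in the description can be checked directly from the task start times. The following is a minimal sketch, not part of the original report: it assumes the timestamps are pasted in as whitespace-separated text (only the first six entries are used here), and the helper names `parse_times` and `pair_gaps` are hypothetical.

```python
from datetime import datetime

# Excerpt of the task start times from the description (first six entries).
RAW = """
2017-08-29 04:42:10 2017-08-29 04:42:11
2017-08-29 04:43:11 2017-08-29 04:43:12
2017-08-29 04:44:13 2017-08-29 04:44:14
"""


def parse_times(text):
    """Parse whitespace-separated 'YYYY-MM-DD HH:MM:SS' timestamps."""
    tokens = text.split()
    # Tokens alternate between a date and a time of day.
    return [
        datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S")
        for day, clock in zip(tokens[0::2], tokens[1::2])
    ]


def pair_gaps(times):
    """Seconds between the first task of each consecutive pair of tasks."""
    firsts = times[0::2]
    return [(b - a).total_seconds() for a, b in zip(firsts, firsts[1:])]


times = parse_times(RAW)
print(len(times), "tasks; gaps between pairs (s):", pair_gaps(times))
```

Running the same helpers over the full list would show whether the interval stays close to 61 seconds for the whole 04:42 to 05:25 window, which may help narrow down what is rescheduling the update.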