Bug 1603166
| Summary: | Fail upgrading Satellite 6.3 to 6.4 beta | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Juan Manuel Parrilla Madrid <jparrill> |
| Component: | Tasks Plugin | Assignee: | satellite6-bugs <satellite6-bugs> |
| Status: | CLOSED WONTFIX | QA Contact: | Peter Ondrejka <pondrejk> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.4 | CC: | aruzicka, inecas, jgiordan, jparrill, pep |
| Target Milestone: | Unspecified | Keywords: | Triaged, Upgrades |
| Target Release: | Unused | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-09-03 18:58:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Juan Manuel Parrilla Madrid
2018-07-19 10:59:58 UTC
Created attachment 1459993 [details]
Error upgrading Satellite to 6.4
More details:
```
[root@sat ~]# sudo su - postgres -s /bin/bash -c "psql -d foreman -c '\d foreman_tasks_tasks'"
Table "public.foreman_tasks_tasks"
Column | Type | Modifiers
----------------+-----------------------------+-----------
id | character varying(255) |
type | character varying(255) | not null
label | character varying(255) |
started_at | timestamp without time zone |
ended_at | timestamp without time zone |
state | character varying(255) | not null
result | character varying(255) | not null
external_id | character varying(255) |
parent_task_id | character varying(255) |
start_at | timestamp without time zone |
start_before | timestamp without time zone |
action | character varying |
Indexes:
"index_foreman_tasks_id_state" btree (id, state)
"index_foreman_tasks_tasks_on_ended_at" btree (ended_at)
"index_foreman_tasks_tasks_on_external_id" btree (external_id)
"index_foreman_tasks_tasks_on_id" btree (id)
"index_foreman_tasks_tasks_on_label" btree (label)
"index_foreman_tasks_tasks_on_parent_task_id" btree (parent_task_id)
"index_foreman_tasks_tasks_on_result" btree (result)
"index_foreman_tasks_tasks_on_start_at" btree (start_at)
"index_foreman_tasks_tasks_on_start_before" btree (start_before)
"index_foreman_tasks_tasks_on_started_at" btree (started_at)
"index_foreman_tasks_tasks_on_state" btree (state)
"index_foreman_tasks_tasks_on_type" btree (type)
"index_foreman_tasks_tasks_on_type_and_label" btree (type, label)
[root@sat ~]# sudo su - postgres -s /bin/bash -c "psql -d foreman -c '\d foreman_tasks_locks'"
Table "public.foreman_tasks_locks"
Column | Type | Modifiers
---------------+------------------------+------------------------------------------------------------------
id | integer | not null default nextval('foreman_tasks_locks_id_seq'::regclass)
task_id | character varying(255) | not null
name | character varying(255) | not null
resource_type | character varying(255) |
resource_id | integer |
exclusive | boolean |
Indexes:
"foreman_tasks_locks_pkey" PRIMARY KEY, btree (id)
"index_foreman_tasks_locks_name_resource_type_resource_id" btree (name, resource_type, resource_id)
"index_foreman_tasks_locks_on_exclusive" btree (exclusive)
"index_foreman_tasks_locks_on_name" btree (name)
"index_foreman_tasks_locks_on_resource_type_and_resource_id" btree (resource_type, resource_id)
"index_foreman_tasks_locks_on_task_id" btree (task_id)
```
OK, it seems some tasks were stuck in the database and the locks for those tasks stayed behind. After verifying that no tasks were running and that the services were stopped, I ran these two shell commands:

```
sudo su - postgres -s /bin/bash -c "psql -d foreman -c 'delete from foreman_tasks_locks where id = 5592;'"
sudo su - postgres -s /bin/bash -c "psql -d foreman -c 'delete from foreman_tasks_locks where id = 5593;'"
```

This is my case; the lock IDs could vary.

Ivan, does foreman-maintain check for stuck tasks? In other words, should the documented upgrade workflow using foreman-maintain help ensure users do not encounter this behavior? If so, this may be 'notabug'.

I don't think the unique-index issue and running tasks are actually related: the lock objects stay in the database even after the task finishes. I suspect this is quite a rare case, but we should still make the migration more resilient to handle this corner case. Juan: could you share the content of the `foreman_tasks_locks` table from before the upgrade, so we can analyze what data caused the issue and how frequent it might be?

Created attachment 1460858 [details]
pg_dump of foreman_tasks_locks
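The "verify that no tasks are running" pre-check mentioned above can be sketched as follows. This is an illustrative stand-in only: it uses Python's stdlib sqlite3 module instead of the real foreman PostgreSQL database, and the assumption that any `foreman_tasks_tasks` row with a state other than 'stopped' is still in flight is mine, not stated in the report.

```python
import sqlite3

# In-memory stand-in for the foreman database, reduced to the two
# columns the pre-check needs (the real table is shown in the \d
# output earlier in this report).
conn = sqlite3.connect(":memory:")
conn.execute("create table foreman_tasks_tasks (id text, state text)")
conn.executemany(
    "insert into foreman_tasks_tasks values (?, ?)",
    [("t1", "stopped"), ("t2", "stopped"), ("t3", "running")],
)

# Assumption: a task not in the 'stopped' state may still hold a live
# lock, so deleting its foreman_tasks_locks rows would be unsafe.
in_flight = conn.execute(
    "select id, state from foreman_tasks_tasks where state <> 'stopped'"
).fetchall()
print(in_flight)  # [('t3', 'running')]

# Only proceed with the manual lock cleanup when this list is empty.
safe_to_clean = not in_flight
print(safe_to_clean)  # False
```

Against the real database the same query would be issued through `psql -d foreman`, as in the commands above.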
I hit this too.
I was performing the upgrade from foreman-maintain, which indeed offered to clean up tasks; I accepted, and it reported deleting 702 tasks, but apparently the foreman_tasks_locks table wasn't cleaned.
I'm also getting the same type of PG error:
```
[DEBUG 2018-07-19T11:15:58 main] PG::UniqueViolation: ERROR: could not create unique index "foreman_tasks_locks_pkey"
[DEBUG 2018-07-19T11:15:58 main] DETAIL: Key (id)=(5595) is duplicated.
```
Attaching a pg_dump of the foreman_tasks_locks table
I ran into this same issue with the same two locks, deleted them, and re-kicked off the install. My logs are here: http://people.redhat.com/jgiordan/files/sat_migrate_logs.tar

Looking at the database, it looks quite odd: the table seems to have duplicate id records even though it had the primary key defined on that column. There was a bug in PostgreSQL that could lead to this (https://www.postgresql.org/about/news/1506/), but it has been fixed for some time. The generic workaround is:

```
su - postgres -c 'psql -d foreman -c "delete from foreman_tasks_locks where id in (select id from foreman_tasks_locks group by id having count(id) > 1);"'
```

Given we've not seen this so far with other large customer databases, I would suggest leaving this BZ open for now. If it turns out that this is more probable than it currently looks, we could release a KCS article on it and implement a check in foreman-maintain to perform the cleanup before the upgrade.

The Satellite Team is attempting to provide an accurate backlog of Bugzilla requests which we feel will be resolved in the next few releases. We do not believe this Bugzilla will meet that criteria and plan to close it out in one month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Red Hat Technical Support or your account team. If we do not hear from you, we will close this bug out. Thank you.

Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect it to be implemented in the product in the foreseeable future. This is due to other priorities for the product and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this, please do not reopen. Instead, feel free to contact Red Hat Technical Support. Thank you.
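The generic duplicate-id workaround quoted above can be illustrated with a self-contained sketch. This is a stand-in only: it uses Python's stdlib sqlite3 module in place of the real foreman PostgreSQL database, and the table is deliberately created without a primary key so that duplicate ids can exist, mirroring the corrupted state that prevented the `foreman_tasks_locks_pkey` index from being rebuilt.

```python
import sqlite3

# In-memory stand-in; the GROUP BY / HAVING logic is the same as in
# the psql workaround. No PRIMARY KEY here, so duplicates are possible.
conn = sqlite3.connect(":memory:")
conn.execute("create table foreman_tasks_locks (id integer, task_id text)")
conn.executemany(
    "insert into foreman_tasks_locks values (?, ?)",
    [(5592, "a"), (5593, "b"), (5595, "c"), (5595, "c"), (5600, "d")],
)

# Find ids that appear more than once (the rows blocking the unique index).
dupes = conn.execute(
    "select id from foreman_tasks_locks group by id having count(id) > 1"
).fetchall()
print(dupes)  # [(5595,)]

# The workaround deletes every row whose id is duplicated (it does not
# keep one survivor), exactly as in the psql command above.
conn.execute(
    "delete from foreman_tasks_locks where id in "
    "(select id from foreman_tasks_locks group by id having count(id) > 1)"
)
remaining = [r[0] for r in conn.execute(
    "select id from foreman_tasks_locks order by id")]
print(remaining)  # [5592, 5593, 5600]
```

After this cleanup the unique index on `id` can be created, which is what the failing migration step was attempting.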