Bug 1758115

Summary: when satellite is under load, delete on host can fail: Katello::Resources::Candlepin::Consumer: 500 Internal Server Error Runtime Error could not obtain pessimistic lock at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorR
Product: Red Hat Satellite Reporter: Jan Hutař <jhutar>
Component: Candlepin Assignee: candlepin-bugs
Status: CLOSED WONTFIX QA Contact: Lai <ltran>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.6.0 CC: csnyder, serheang.tan
Target Milestone: Unspecified Keywords: Performance, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-09 17:02:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Hutař 2019-10-03 10:24:39 UTC
Description of problem:
when satellite is under load, delete on host can fail: Katello::Resources::Candlepin::Consumer: 500 Internal Server Error Runtime Error could not obtain pessimistic lock at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,433


Version-Release number of selected component (if applicable):
satellite-6.6.0-7.el7sat.noarch
candlepin-2.6.9-1.el7sat.noarch


How reproducible:
Rarely, about 1 in 1000 in my setup


Steps to Reproduce:
1. I have 24 hosts which reregister to satellite every 15 minutes 3 times


Actual results:
Rarely I can see this error:

"+ curl -X DELETE -k -s -u admin:changeme https://satellite.longscale.local/api/v2/hosts/docker4container190.example.com", 

        "{", 
        "  \"error\": {\"message\":\"Katello::Resources::Candlepin::Consumer: 500 Internal Server Error {\\\"displayMessage\\\":\\\"Runtime Error could not obtain pessimistic lock at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,433\\\",\\\"requestUuid\\\":\\\"60d7e6dc-c4a7-4e38-83fc-00306f5b7b9b\\\"} (DELETE /candlepin/consumers/3341f542-3e78-4eb1-aac7-b46de6e1112d)\"}", 
        "}", 


Expected results:
It should work without error

Comment 3 Jan Hutař 2019-10-03 10:31:13 UTC
This code is used to run the registration loop:

            set -xe;
            sleep \$(( \$RANDOM % 60 ));
            rpm --quiet -q katello-host-tools || yum -y install katello-host-tools;
            rpm --quiet -q zsh && rpm -e zsh;
            curl -X DELETE -k -s -u '{{ sat_user }}:{{ sat_pass }}' https://{{ satellite }}/api/v2/hosts/\$( hostname ) || true;
            subscription-manager status || true;
            for i in \$( seq {{ contreg_iter }} ); do
                subscription-manager unregister || true;
                subscription-manager status || true;
                if ! subscription-manager register --activationkey {{ content_activationkey }} --org {{ sat_orglabel }}; then
                    rc=\$?;
                    [ \"\$rc\" -ne 2 ] && echo \"ERROR: Registration failed with \$rc\" >&2;
                fi
                subscription-manager status | grep 'Overall Status: \(Current\|Invalid\)';
                subscription-manager refresh;
                yum -y install zsh;
                yum -y remove zsh;
            done &> {{ contreg_log }};
            tail {{ contreg_log }}

What is important for this bug is the curl command.
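
For anyone hitting the same error, a possible mitigation (not a confirmed fix, and not something Candlepin or Satellite provides) is to retry the DELETE when it fails. The sketch below assumes the "could not obtain pessimistic lock" failure is transient, so a later attempt may succeed. The function names `retry` and `delete_host`, the variables `SAT_USER`, `SAT_PASS` and `SATELLITE`, and the choice to treat HTTP 404 as "already deleted" are all hypothetical and only for illustration.

```shell
# retry MAX_TRIES DELAY CMD [ARGS...]
# Run CMD until it succeeds or MAX_TRIES attempts have been made,
# sleeping DELAY seconds between attempts.
retry() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "ERROR: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
}

# delete_host HOSTNAME
# Issue the same DELETE as in the loop above, but look at the HTTP status
# code instead of ignoring the result. 2xx counts as success; 404 is
# treated as "host already gone" (an assumption); anything else, including
# the 500 from this bug, counts as a failure so that retry() tries again.
delete_host() {
    local code
    code=$(curl -X DELETE -k -s -o /dev/null -w '%{http_code}' \
        -u "$SAT_USER:$SAT_PASS" \
        "https://$SATELLITE/api/v2/hosts/$1") || return 1
    case "$code" in
        2??|404) return 0 ;;
        *) return 1 ;;
    esac
}
```

It could then replace the plain curl line with something like `retry 5 10 delete_host "$( hostname )" || true`, which makes up to 5 attempts 10 seconds apart. The attempt count and delay are guesses and would need tuning against the actual load.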

Comment 5 Jan Hutař 2019-10-03 11:22:00 UTC
Error happened sometime close to 2019-10-03T09:30:03.781Z

Comment 8 serheang.tan 2020-01-03 10:07:50 UTC
Hi,
I am seeing a similar error when kickstarting my systems (about 50 PCs). If a system fails to register to the Satellite (the first step in my kickstart's postscript), the rest of the postscripts fail and I end up with systems that are not properly provisioned.
Is there any workaround for it?

Comment 9 Mike McCune 2021-07-09 17:02:24 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this feel free to contact your Red Hat Account Team. Thank you.