Description of problem:
System registration fails (returns a 500 error) in Satellite if the /var/lib/pulp/sn.dat file is too big or corrupted.

Version-Release number of selected component (if applicable):
Red Hat Satellite 6.2.7

How reproducible:
100%

Steps to Reproduce:
1. Install a fresh Satellite
2. Register a client to test if it's working
3. Corrupt the file by doing:

# cp /var/lib/pulp/sn.dat{,.bkp}
# for i in $(seq 100000); do echo -n $i >> /var/lib/pulp/sn.dat; done

Note: the original corrupted file has been attached to the case. You can download and use it instead of creating one.

Actual results:
System registration gets stuck or becomes unresponsive when executing the Create:Pulp:Consumer plan.

Expected results:
The system should be registered as expected.

Additional info:
A useful test is to call the Pulp API directly to isolate the problem:

-- creating a fake Pulp consumer to test it
# curl -v -k -X POST -H "Content-Type: application/json" -d "{\"id\": \"$(uuidgen)\"}" https://$(hostname)/pulp/api/v2/consumers/

We still don't know what caused the file to grow to 11 MB. One possibility is a race condition when acquiring a lock on that file, since this particular customer runs a cloud environment with many simultaneous registrations.
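To tell quickly whether a given Satellite is affected, a small check on the file can help. The sketch below assumes that a healthy sn.dat holds a single short decimal integer (inferred only from the workaround later in this bug, which writes 3000 into it). The function name sn_dat_looks_sane and the 32-byte threshold are hypothetical choices made for illustration, not anything Pulp defines.

```shell
# Hypothetical helper: succeeds (exit 0) only if FILE is small and contains
# nothing but one decimal integer. The 32-byte limit is an assumed threshold.
sn_dat_looks_sane() {
    file=$1
    max_bytes=${2:-32}

    [ -f "$file" ] || return 1

    # Normalise wc output (some platforms pad it with spaces).
    size=$(( $(wc -c < "$file") ))
    [ "$size" -le "$max_bytes" ] || return 1

    # Command substitution strips the trailing newline; anything that is
    # empty or contains a non-digit (including an embedded newline) fails.
    content=$(cat "$file")
    case "$content" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}
```

Usage, as root on the Satellite: sn_dat_looks_sane /var/lib/pulp/sn.dat || echo "sn.dat looks oversized or corrupted"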
One important note: if the registration fails and you go into Dynflow and **SKIP** the Pulp:Create:Consumer step to cancel the process, the error below is displayed when you try to register the system again:

# hammer host info --name server1
Katello::Resources::Candlepin::CandlepinResource: 404 Resource Not Found {"displayMessage":"Runtime Error RESTEASY001185: Could not find resource for relative : /consumers//compliance of full path: https://localhost:8443/candlepin/consumers//compliance at org.jboss.resteasy.core.registry.PathParamSegment.matchPattern:209","requestUuid":"98688d6d-b2ec-419e-b195-5ada6f759f96"} (GET /candlepin/consumers//compliance)

This happens because Satellite looks up the system UUID in Foreman's database. Since the UUID was never created, the URL is incomplete when Satellite calls the Candlepin API:

--- error
https://localhost:8443/candlepin/consumers//compliance

Where it should be:
https://localhost:8443/candlepin/consumers/<SYSTEM_UUID_HERE>/compliance
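The double slash in the failing URL is what an empty UUID produces when it is interpolated into the path. The following is only an illustrative sketch of that mechanism and of a guard against it. It is not Katello's code; the function name compliance_url is hypothetical, and the base URL is simply the one from the error message above.

```shell
# Hypothetical illustration: build the Candlepin compliance URL and refuse
# to build it when the consumer UUID is empty (which would yield
# .../consumers//compliance, as seen in the 404 above).
compliance_url() {
    uuid=$1
    base=${2:-https://localhost:8443/candlepin}

    if [ -z "$uuid" ]; then
        echo "error: empty consumer UUID, refusing to build URL" >&2
        return 1
    fi
    printf '%s/consumers/%s/compliance\n' "$base" "$uuid"
}
```

Calling compliance_url "" fails with an error message instead of emitting the malformed path, which makes the missing UUID visible at the point where it first matters.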
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
*** WORKAROUND ***

mv /var/lib/pulp/sn.dat /var/lib/pulp/sn.dat.old
echo 3000 > /var/lib/pulp/sn.dat
katello-service restart
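The file-handling part of the workaround can be wrapped in a small function so that it can be tried against a scratch copy first. This is a sketch under assumptions: the function name reset_sn_dat is hypothetical, the seed value 3000 is taken from the workaround above, and the service restart is deliberately left to the operator. A file recreated this way may not have the same ownership or SELinux context as the original, so it is worth comparing them before restarting services.

```shell
# Hypothetical helper mirroring the workaround: keep the old file as
# FILE.old and write a fresh seed value into FILE.
reset_sn_dat() {
    file=${1:-/var/lib/pulp/sn.dat}
    seed=${2:-3000}   # value used in the workaround comment

    if [ -f "$file" ]; then
        mv "$file" "$file.old" || return 1
    fi
    echo "$seed" > "$file"
}
```

After running reset_sn_dat on the real path, katello-service restart completes the workaround as described above.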
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
The upstream fix will be included in Pulp 2.13, but it should apply cleanly and work just the same on 6.2.z.
Thanks Zach for the update... what's the rollback procedure?
** HOTFIX INSTRUCTIONS - SATELLITE 6.2.7 (RHEL 7) **

I've produced hotfix packages for a customer's installed version of Pulp and tested them locally.

Install instructions:

1) Download the attached file Hotfix-1417689.tar.bz2

2) Verify the md5sum

# md5sum Hotfix-1417689.tar.bz2
09cc8c59c1a11be82c4c568734f73e2d  Hotfix-1417689.tar.bz2

3) Stop services

katello-service stop

4) Extract the tarball

tar xvf Hotfix-1417689.tar.bz2

5) Update packages

cd Hotfix-1417689
yum update pulp-selinux-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  pulp-server-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-agent-lib-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-bindings-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-client-lib-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-common-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-oid_validation-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-repoauth-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
  python-pulp-streamer-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm

6) Start services

katello-service start

7) Resume normal operations

** ROLLBACK INSTRUCTIONS **

If you wish to roll this back for any reason, use yum history:

1) Find the transaction

# yum history
Loaded plugins: langpacks, package_upload, product-id, search-disabled-repos, subscription-manager
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    39 | root <root>              | 2017-02-03 17:14 | Update         |    7 EE
    38 | root <root>              | 2017-01-31 10:16 | Update         |    2
    37 | root <root>              | 2017-01-26 11:50 | I, U           |   98 EE

2) Undo

# yum history undo 39
...
Dependencies Resolved

==============================================================================================
 Package                     Arch    Version           Repository                         Size
==============================================================================================
Downgrading:
 pulp-selinux                noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms   81 k
 pulp-server                 noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms  732 k
 python-pulp-agent-lib       noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms   93 k
 python-pulp-bindings        noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms  113 k
 python-pulp-client-lib      noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms  199 k
 python-pulp-common          noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms  125 k
 python-pulp-oid_validation  noarch  2.8.7.5-1.el7sat  rhel-7-server-satellite-6.2-rpms   67 k

Transaction Summary
==============================================================================================
Downgrade  7 Packages

3) Restart

# katello-service restart
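Step 2 of the install instructions compares the checksum by eye. A scripted comparison is less error-prone; the sketch below assumes md5sum and awk are available (both are standard on RHEL 7), and the function name verify_md5 is hypothetical.

```shell
# Hypothetical helper: succeed only if FILE's MD5 digest equals EXPECTED.
verify_md5() {
    file=$1
    expected=$2

    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" 2>/dev/null | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

Usage with the checksum published in this bug: verify_md5 Hotfix-1417689.tar.bz2 09cc8c59c1a11be82c4c568734f73e2d && echo "checksum OK"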
Also Efraim, here is the MD5 checksum for the tar 09cc8c59c1a11be82c4c568734f73e2d Hotfix-1417689.tar.bz2
Please add verification steps for this bug to help QE verify.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
Created attachment 1267481
registration task

Verified in snap 6.2.9-2, using the steps from the problem description. The content host registers successfully even when sn.dat is oversized.
VERIFIED. (oops, it already is)

@satellite-6.2.9-2.0.el7sat.noarch
pulp-server-2.8.7.10-2.el7sat.noarch

by the manual reproducer described in comment#0

1. Simulate sn.dat corruption

# for i in $(seq 100000); do echo -n $i >> /var/lib/pulp/sn.dat ; done

2. Install CA certs and try to register a client

# subscription-manager register --org="Default_Organization" --name="vm1.example.com" --activationkey="AK"
Registering the System
Task 00fdbb1f-9c55-4cf0-a8d6-3d93a6b461b0: RestClient::InternalServerError: 500 Internal Server Error

3. Upgrade 6.2.8 > 6.2.9

4. Register a new client and install packages on it

# subscription-manager register --org="Default_Organization" --name="vm2.example.com" --activationkey="AK"
Registering the System
The system has been registered with ID: 91da829d-30b0-497f-84e9-b1da31629811

Installed Product Current Status:
Product Name: Red Hat Enterprise Linux Server
Status: Subscribed

# yum install katello-agent
...
Complete!

>>> client registration and package installation are successful regardless of the sn.dat file state
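For anyone automating this verification, the pass/fail signal is whether subscription-manager prints a registration ID. The sketch below extracts that ID from captured output. It assumes the message format shown in the transcript above; the function name registered_id is hypothetical.

```shell
# Hypothetical helper: read subscription-manager output on stdin and print
# the registered system UUID, or nothing if registration did not succeed.
registered_id() {
    sed -n 's/.*registered with ID: *\([0-9a-fA-F-]\{36\}\).*/\1/p'
}
```

For example, piping the output of the register command in step 4 through registered_id should print the system ID, while the failing output from step 2 prints nothing.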
The Pulp upstream bug status is at CLOSED - COMPLETE. Updating the external tracker on this bug.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1191