Bug 1417689 - System does not get registered if /var/lib/pulp/sn.dat is corrupted or too large
Summary: System does not get registered if /var/lib/pulp/sn.dat is corrupted or too large
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.2.7
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Peter Ondrejka
URL:
Whiteboard:
Depends On:
Blocks: 1426415
 
Reported: 2017-01-30 16:44 UTC by Marcelo Moreira de Mello
Modified: 2021-06-10 11:52 UTC (History)
CC List: 25 users

Fixed In Version: pulp-2.8.7.7-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1426415 (view as bug list)
Environment:
Last Closed: 2017-05-01 13:58:36 UTC
Target Upstream Version:
Embargoed:


Attachments
registration task (28.67 KB, image/png), 2017-03-30 10:23 UTC, Peter Ondrejka


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Pulp Redmine | 1839 | 0 | High | CLOSED - CURRENTRELEASE | sn.dat is succeptible to race conditions | 2017-04-27 14:07:03 UTC
Pulp Redmine | 2650 | 0 | Normal | CLOSED - COMPLETE | Backport #1839, sn.dat is succeptable to race conditions | 2017-03-30 23:07:37 UTC
Red Hat Knowledge Base (Solution) | 2893811 | 0 | None | None | None | 2017-02-06 16:42:45 UTC
Red Hat Product Errata | RHBA-2017:1191 | 0 | normal | SHIPPED_LIVE | Satellite 6.2.9 Async Bug Release | 2017-05-01 17:49:42 UTC

Description Marcelo Moreira de Mello 2017-01-30 16:44:22 UTC
Description of problem:

System registration fails (returns a 500 error) in Satellite if the /var/lib/pulp/sn.dat file is too large or corrupted.



Version-Release number of selected component (if applicable):
Red Hat Satellite 6.2.7


How reproducible:
100%

Steps to Reproduce:
1. Install a fresh Satellite
2. Register a client to test if it's working
3. Corrupt the file by doing:

# cp /var/lib/pulp/sn.dat{,.bkp}
# for i in $(seq 100000); 
  do
  echo -n $i >> /var/lib/pulp/sn.dat
  done

 
 Note: the original corrupted file has been attached to the case. You can download and use it instead of creating one.
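
Before and after running the reproducer it can help to check whether the file still looks sane. The helper below is a minimal sketch, not part of Pulp or Satellite: the name check_sn_dat and the 32-byte threshold are made up for illustration, and it assumes a healthy sn.dat holds a single small decimal integer (as the workaround value later in this bug suggests).

```shell
# Hypothetical helper: report whether a serial-number file looks healthy.
# Assumption: a healthy file holds one decimal integer and is only a few bytes.
check_sn_dat() {
    local file=$1
    local max_bytes=${2:-32}   # arbitrary illustrative threshold
    local size content

    if [ ! -f "$file" ]; then
        echo "missing"
        return 1
    fi

    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        echo "oversized ($size bytes)"
        return 1
    fi

    content=$(tr -d '\n' < "$file")
    case $content in
        ''|*[!0-9]*)
            echo "corrupted"
            return 1
            ;;
    esac
    echo "ok ($content)"
}
```

Usage on a Satellite would be: check_sn_dat /var/lib/pulp/sn.dat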


Actual results:

  System registration hangs or becomes unresponsive when executing the Create:Pulp:Consumer plan.


Expected results:

  The system should be registered as expected. 


Additional info:

  A useful test is to call the Pulp API directly to isolate the problem:


-- create a throwaway Pulp consumer to test it (the payload is double-quoted so the shell expands $(uuidgen))
# curl -v -k -X POST -H "Content-Type: application/json" -d "{\"id\": \"$(uuidgen)\"}" https://$(hostname)/pulp/api/v2/consumers/

We still don't know what caused the file to grow to 11 MB. One possibility is a race condition when acquiring a lock on that file, since this particular customer runs a cloud environment with a large number of simultaneous registrations.
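
To illustrate the kind of race suspected here, the sketch below serializes a read-increment-write cycle on a serial-number file with an exclusive lock. It is only an illustration under stated assumptions: it is not the upstream Pulp fix, the function names next_serial and is_serial are hypothetical, and it assumes flock(1) from util-linux is installed.

```shell
# Hypothetical sketch, not the upstream Pulp fix.
# Accept only one non-padded decimal integer of at most 18 digits.
is_serial() {
    case $1 in
        ''|*[!0-9]*|0?*) return 1 ;;
    esac
    [ "${#1}" -le 18 ]
}

# Print the next serial number and store it, holding an exclusive lock
# on a separate lock file for the whole read-increment-write cycle.
next_serial() {
    local file=$1
    (
        flock -x 9 || exit 1            # block until we hold the lock
        current=$(cat "$file" 2>/dev/null)
        if ! is_serial "$current"; then
            echo "refusing to increment: $file is missing or corrupt" >&2
            exit 1
        fi
        next=$((current + 1))
        # Write to a temporary file and rename it, so a reader never
        # sees a half-written value.
        printf '%s\n' "$next" > "$file.tmp" && mv "$file.tmp" "$file" || exit 1
        printf '%s\n' "$next"
    ) 9>"$file.lock"
}
```

Without the lock, two concurrent callers can read the same value or interleave their writes, which is one way a file like this could end up with duplicated or concatenated content.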

Comment 2 Marcelo Moreira de Mello 2017-01-30 16:52:45 UTC
 One important note: because the registration process fails, if you use Dynflow to **SKIP** the Pulp:Create:Consumer step to cancel the process and then try to register the system again, the error below is displayed:


# hammer host info --name server1
Katello::Resources::Candlepin::CandlepinResource: 404 Resource Not Found {"displayMessage":"Runtime Error RESTEASY001185: Could not find resource for relative : /consumers//compliance of full path: https://localhost:8443/candlepin/consumers//compliance at org.jboss.resteasy.core.registry.PathParamSegment.matchPattern:209","requestUuid":"98688d6d-b2ec-419e-b195-5ada6f759f96"} (GET /candlepin/consumers//compliance)


This happens because Satellite looks up the system UUID in Foreman's database; since the consumer was never created, the UUID is empty and the URL used to access the Candlepin API is incomplete:


--- error
https://localhost:8443/candlepin/consumers//compliance

 It should be:

https://localhost:8443/candlepin/consumers/<SYSTEM_UUID_HERE>/compliance
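
The failure mode can be shown with a small guard that refuses to build the compliance URL when the UUID is empty. This is only a sketch of the idea; compliance_url is a hypothetical name and this is not how Katello builds the request.

```shell
# Hypothetical illustration: fail early instead of producing
# ".../consumers//compliance" when the UUID is empty.
compliance_url() {
    local base=$1 uuid=$2
    if [ -z "$uuid" ]; then
        echo "error: host has no Candlepin consumer UUID" >&2
        return 1
    fi
    printf '%s/consumers/%s/compliance\n' "$base" "$uuid"
}
```

Usage: compliance_url https://localhost:8443/candlepin "$uuid"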

Comment 4 pulp-infra@redhat.com 2017-01-30 19:02:49 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 5 pulp-infra@redhat.com 2017-01-30 19:02:53 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 6 Mike McCune 2017-01-30 20:44:58 UTC
*** WORKAROUND ***

 mv /var/lib/pulp/sn.dat /var/lib/pulp/sn.dat.old
 echo 3000 > /var/lib/pulp/sn.dat 
 katello-service restart
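
The first two workaround commands can be wrapped in a small function that keeps a timestamped backup, so repeated runs do not overwrite an earlier copy. This is a sketch only: reset_sn_dat is a hypothetical name, the default of 3000 is simply the value used above, and the service restart is deliberately left to the operator.

```shell
# Hypothetical wrapper around the workaround: back up the file, then
# write a fresh value. Run "katello-service restart" afterwards.
reset_sn_dat() {
    local file=$1 value=${2:-3000}
    local backup
    backup="$file.old.$(date +%Y%m%d%H%M%S)"
    if [ -e "$file" ]; then
        mv "$file" "$backup" || return 1
        echo "backed up to $backup"
    fi
    printf '%s\n' "$value" > "$file"
}
```

Usage: reset_sn_dat /var/lib/pulp/sn.dat && katello-service restart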

Comment 8 pulp-infra@redhat.com 2017-02-01 20:32:45 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2017-02-01 21:03:05 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 10 Brian Bouterse 2017-02-02 14:08:23 UTC
The upstream fix will be included in Pulp 2.13, but it should apply cleanly and work just the same on 6.2.z.

Comment 13 Efraim Marquez-Arreaza 2017-02-03 21:51:48 UTC
Thanks Zach for the update... what's the rollback procedure?

Comment 18 Zach Huntington-Meath 2017-02-03 22:32:22 UTC
** HOTFIX INSTRUCTIONS - SATELLITE 6.2.7 (RHEL 7) **

I've produced hotfix packages for a customer's installed version
of Pulp and tested them locally.

Install instructions: 
1) Download the attached file Hotfix-1417689.tar.bz2

2) Verify the md5sum

# md5sum Hotfix-1417689.tar.bz2
09cc8c59c1a11be82c4c568734f73e2d  Hotfix-1417689.tar.bz2

3) Stop Services

katello-service stop

4) Extract the tarball

tar xvf Hotfix-1417689.tar.bz2

5) Update packages

cd Hotfix-1417689

yum update pulp-selinux-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
pulp-server-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-agent-lib-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-bindings-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-client-lib-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-common-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-oid_validation-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-repoauth-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm \
python-pulp-streamer-2.8.7.5-2.Hotfix_1417689.el7sat.noarch.rpm

6) Start services

katello-service start

7) Resume normal operations
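
The checksum comparison in step 2 can be automated rather than compared by eye. A minimal sketch, assuming GNU coreutils md5sum; verify_md5 is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file matches the expected MD5.
verify_md5() {
    local file=$1 expected=$2
    # md5sum -c reads "<checksum>  <filename>" lines (two spaces) from stdin;
    # --quiet suppresses the "OK" line on success.
    printf '%s  %s\n' "$expected" "$file" | md5sum -c --quiet -
}
```

Usage: verify_md5 Hotfix-1417689.tar.bz2 09cc8c59c1a11be82c4c568734f73e2d && katello-service stop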

** ROLLBACK INSTRUCTIONS **

If you wish to roll this back for any reason, utilize yum history:

1) Find the transaction ID of the hotfix update
# yum history
Loaded plugins: langpacks, package_upload, product-id, search-disabled-repos, subscription-manager
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    39 | root <root>              | 2017-02-03 17:14 | Update         |    7 EE
    38 | root <root>              | 2017-01-31 10:16 | Update         |    2   
    37 | root <root>              | 2017-01-26 11:50 | I, U           |   98 EE

2) Undo
# yum history undo 39
...
Dependencies Resolved

===========================================================================================================================
 Package                     Arch              Version                     Repository                                 Size
==========================================================================================================================
Downgrading:
 pulp-selinux                          noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms             81 k
 pulp-server                           noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms            732 k
 python-pulp-agent-lib                 noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms             93 k
 python-pulp-bindings                  noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms            113 k
 python-pulp-client-lib                noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms            199 k
 python-pulp-common                    noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms            125 k
 python-pulp-oid_validation            noarch            2.8.7.5-1.el7sat            rhel-7-server-satellite-6.2-rpms             67 k

Transaction Summary
==========================================================================================================================
Downgrade  7 Packages

3) Restart 

# katello-service restart

Comment 19 Zach Huntington-Meath 2017-02-03 22:33:31 UTC
Also Efraim, here is the MD5 checksum for the tar

09cc8c59c1a11be82c4c568734f73e2d  Hotfix-1417689.tar.bz2

Comment 20 Satellite Program 2017-02-23 21:11:24 UTC
Please add verification steps for this bug to help QE verify

Comment 23 pulp-infra@redhat.com 2017-03-30 02:07:25 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 24 pulp-infra@redhat.com 2017-03-30 02:07:31 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 25 Peter Ondrejka 2017-03-30 10:23:33 UTC
Created attachment 1267481
registration task

Verified in snap 6.2.9-2 using the steps from the problem description. The content host registers successfully even when sn.dat is oversized.

Comment 26 Lukas Pramuk 2017-03-30 14:08:56 UTC
VERIFIED. (oops it is already)

@satellite-6.2.9-2.0.el7sat.noarch
pulp-server-2.8.7.10-2.el7sat.noarch

by manual reproducer described in comment#0

1. Simulate sn.dat corruption
# for i in $(seq 100000); do echo -n $i >> /var/lib/pulp/sn.dat ; done

2. Install CA certs and try to register a client
# subscription-manager register --org="Default_Organization" --name="vm1.example.com" --activationkey="AK"
Registering the System
Task 00fdbb1f-9c55-4cf0-a8d6-3d93a6b461b0: RestClient::InternalServerError: 500 Internal Server Error

3. Upgrade 6.2.8 > 6.2.9

4. Register new client and install packages on it
# subscription-manager register --org="Default_Organization" --name="vm2.example.com" --activationkey="AK"
Registering the System
The system has been registered with ID: 91da829d-30b0-497f-84e9-b1da31629811 

Installed Product Current Status:
Product Name: Red Hat Enterprise Linux Server
Status:       Subscribed

# yum install katello-agent
...
Complete!

>>> client registration and packages installation are successful regardless of sn.dat file state

Comment 27 pulp-infra@redhat.com 2017-03-30 23:07:38 UTC
The Pulp upstream bug status is at CLOSED - COMPLETE. Updating the external tracker on this bug.

Comment 28 pulp-infra@redhat.com 2017-04-19 21:35:29 UTC
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

Comment 29 pulp-infra@redhat.com 2017-04-27 14:07:04 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 31 errata-xmlrpc 2017-05-01 13:58:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1191

