Bug 1262989 - If consumer is deleted server side, then 'unregister' runs on client, it fails and doesn't delete certs or cache
If consumer is deleted server side, then 'unregister' runs on client, it fail...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: subscription-manager (Show other bugs)
6.0
Unspecified Unspecified
unspecified Severity unspecified
: rc
: 6.8
Assigned To: Filip Nguyen
John Sefler
: Triaged
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-09-14 16:18 EDT by Adrian Likins
Modified: 2016-05-10 16:37 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-10 16:37:11 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Adrian Likins 2015-09-14 16:18:38 EDT
Description of problem:
# register a system
# go to access, unregister ('delete') it
# go to client before rhsmcertd runs
[subscription-manager (master u=)]$ sudo bin/subscription-manager unregister
Unit d157351a-97d6-4e66-8759-a9f090628df5 has been deleted
[subscription-manager (master u=)]$ ls -lart /etc/pki/consumer/
total 12
drwxr-xr-x. 20 root root 4096 Sep  2 14:25 ..
-rw-r-----.  1 root root 1679 Sep 14 15:19 key.pem
-rw-r-----.  1 root root 2049 Sep 14 15:19 cert.pem
drwxr-xr-x.  2 root root   35 Sep 14 15:19 .
[subscription-manager (master u=)]$ ls -alrt /var/lib/rhsm/cache/
total 32
drwxr-xr-x. 5 root root   83 Aug  7 12:33 ..
drwxr-xr-x. 2 root root 4096 Sep 10 19:44 .
-rw-r--r--. 1 root root  164 Sep 14 15:19 installed_products.json
-rw-r--r--. 1 root root    2 Sep 14 15:20 content_overrides.json
-rw-r--r--. 1 root root   20 Sep 14 15:20 releasever.json
-rw-r--r--. 1 root root    2 Sep 14 15:20 written_overrides.json
-rw-r--r--. 1 root root 5076 Sep 14 15:21 entitlement_status.json
-rw-r--r--. 1 root root  342 Sep 14 15:21 product_status.json

# last bit of rhsm.log
# Note that clean_all_data isn't run

2015-09-14 15:58:03,525 1932:subscription-manager:MainThread @connection.py:515 - Making request: GET /subscription/status
2015-09-14 15:58:04,002 1932:subscription-manager:MainThread @connection.py:546 - Response: status=200, requestUuid=0eb763ef-35d5-4fe9-ac2a-0503acf78258
2015-09-14 15:58:04,004 1932:subscription-manager:MainThread @managercli.py:363 - Server Versions: {'rules-version': '5.15', 'candlepin': '0.9.51.5-1', 'server-type': 'Red Hat Subscription Management'}
2015-09-14 15:58:04,004 1932:subscription-manager:MainThread @managercli.py:328 - Consumer Identity name=dhcp129-184.rdu.redhat.com uuid=d157351a-97d6-4e66-8759-a9f090628df5
2015-09-14 15:58:04,005 1932:subscription-manager:MainThread @connection.py:515 - Making request: DELETE /subscription/consumers/d157351a-97d6-4e66-8759-a9f090628df5
2015-09-14 15:58:04,349 1932:subscription-manager:MainThread @connection.py:546 - Response: status=410
2015-09-14 15:58:04,350 1932:subscription-manager:MainThread @managercli.py:160 - Unregister failed
2015-09-14 15:58:04,350 1932:subscription-manager:MainThread @managercli.py:161 - Unit d157351a-97d6-4e66-8759-a9f090628df5 has been deleted
Traceback (most recent call last):
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 1243, in _do_command
    managerlib.unregister(self.cp, self.identity.uuid)
  File "/usr/share/rhsm/subscription_manager/managerlib.py", line 789, in unregister
    uep.unregisterConsumer(consumer_uuid)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 1062, in unregisterConsumer
    return self.conn.request_delete(method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 639, in request_delete
    return self._request("DELETE", method, params)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 555, in _request
    self.validateResponse(result, request_type, handler)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 588, in validateResponse
    parsed['displayMessage'], parsed['deletedId'])
GoneException: Unit d157351a-97d6-4e66-8759-a9f090628df5 has been deleted


# And after running rhsmcertd-worker again, and running clean_all_data

[subscription-manager (master u=)]$ ls -alrt /var/lib/rhsm/cache/
total 4
drwxr-xr-x. 5 root root 83 Aug  7 12:33 ..
-rw-r--r--. 1 root root 20 Sep 14 15:20 releasever.json
drwxr-xr-x. 2 root root 28 Sep 14 16:00 .

So 'release' cache is definitely not getting cleaned up

Version-Release number of selected component (if applicable):


Expected results:

clean all to get invoked and old registration and caches removed.
The resulting state is kind of a mess.


Additional info:
This can probably be triggered by rhsmcertd running --autoheal as well.

I think this is the start of getting a client to the point where they start
sending /consumers/None requests (I suspect something like 'unregister' then 'release'
or 'repos' may end up attempting to make a request using the old consumer cert id, but
with a stale ref to the consumers uuid.
Comment 1 Adrian Likins 2015-09-15 10:04:01 EDT
This is also something of a particular instance of a more general problem:

Most sub man cli commands (and the code underneath them) don't handle getting a 410 response (GoneException) particularly well.

A couple of causes for that:

- There can be a lot of candlepin requests made early in cli startup, even
  before the CLiCommand subclasses ._do_command gets triggered.

- There is a lot of code (in the cli, and elsewhere) that doesn't really 
  expect to get exceptions from (candlepin,cp,uep,connection) requests.

  A lot of the cli just lets any exception bubble up to try/excepts in
  _do_command and the generic handlers.

- There is something of an assumption that if a consumer is unregistered/deleted
  server side, that it will be rhmscertd that picks it up and handles the  deletion. So the potential entry points for getting a 410 aren't covered well.

- Most of the cli tools do check if there is a consumer cert, and that it is_valid() before starting. But is_valid() just means it exists, it can be loaded, and it hasn't expired. So it is easier to get to a state where a command attempts to use consumer cert auth, but the server responds with 410s.

- The 410 GoneException handler in rhsmcertd-worker has a check in it to only
  delete the consumer cert if the active consumer uuid (ie, /etc/pki/consumer/cert.pem) matches the consumer uuid included in the GoneException.
Comment 2 Adrian Likins 2015-09-15 10:19:55 EDT
https://github.com/candlepin/python-rhsm/pull/155
https://github.com/candlepin/subscription-manager/pull/1295

Add some notes/docs about this.

(Not to imply that any of that can't change)
Comment 3 Filip Nguyen 2015-09-22 08:47:21 EDT
https://github.com/candlepin/subscription-manager/pull/1298

What I did for now is that I added error handling for unregister that will delete local data even in the presence of GoneException.

I also added more error handling for other actions. The actions will now in case of GoneException notify the user that clean may be used to remove local data. Even more aggresive route may be taken in the future, since the rhsm daemon will remove the data anyway.

I think the correct way to approach the general problem is to ensure server throws GoneException or similar exception in case local/server consistency is violated. I don't think premature checks e.g. whether a cert is valid on the server are a way to go.

The client should obviously not send None consumer-id, but I have not been able to reproduce that issue.
Comment 4 Filip Nguyen 2015-09-22 08:52:50 EDT
For the record, releasever.json is being deleted - fix for this is part of commit a3b43ce888488f69e4c8260503206bd5ef4092a9
Comment 5 John Sefler 2015-09-24 15:20:32 EDT
deferring to rhel-7.3.0
Comment 6 John Sefler 2015-11-16 14:07:51 EST
commit 5c48d059bb07b64b92722f249b38aaee7219ab47
    1262989: Fix unregister action when consumer is already 'Gone' on server

available in subscription-manager-1.16.3-1+
Comment 8 John Sefler 2015-11-16 14:37:55 EST
Testing with version...
[root@jsefler-6 ~]# rpm -q subscription-manager
subscription-manager-1.16.4-1.git.22.ca2fc3c.el6.x86_64


[root@jsefler-6 ~]# subscription-manager list --available 1>/tmp/stdout 2>/tmp/stderr
[root@jsefler-6 ~]# echo $?
0
[root@jsefler-6 ~]# cat /tmp/stdout
This consumer's profile has been deleted from the server. You can use command clean or unregister to remove local profile.
[root@jsefler-6 ~]# cat /tmp/stderr
[root@jsefler-6 ~]# 


Prior to commit 5c48d059bb07b64b92722f249b38aaee7219ab47, a call to list --avail resulted in an exitCode=70 EX_SOFTWARE with stderr="Unit {} has been deleted"

The new behavior is returning exitCode=0 which is masking the GoneException.  In my opinion, we should still be returning a non-zero exit code along with a slightly more informative message "Consumer profile {} has been deleted from the server. You can use command subscription-manager clean or unregister to remove the local profile."
Comment 9 Chris "Ceiu" Rog 2015-11-17 15:10:14 EST
commit 616ecda6db6ae8b054d7bbb8ba278bba242f4fd0
Author: Chris Rog <crog@redhat.com>
Date:   Mon Nov 16 14:45:01 2015 -0500

    1262989: Changed the error code returned when the consumer has been deleted on the server
    
    - When a x509 error occurs during a CLI command, or the active consumer was deleted on the server, subman now returns a non-zero error code and prints the message to stderr rather than stdout
Comment 10 John Sefler 2015-11-18 14:09:05 EST
The commit in comment 9 successfully changed the exit code to a non-zero 69 (EX_SOFTWARE(, but neglected to improve the stderr message.  Can we change it to...

"Consumer profile {ge.deleted_id} has been deleted from the server. You can use command subscription-manager clean or unregister to remove the local profile."
Comment 11 Adrian Likins 2015-11-18 17:57:22 EST
I'm not sure this scenario should be an error exit status. Or that it is an error.

(c&p from pr)


I think 616ecda6db6ae8b054d7bbb8ba278bba242f4fd0 also needs to be reverted.

Seeing a 410 GoneException is not an error. The correct behavior is for the client to _not_ show an error in that case. Thats what https://bugzilla.redhat.com/show_bug.cgi?id=1262989 is about.

The change here and in 616ecda6db6ae8b054d7bbb8ba278bba242f4fd0 apply to all cli invocations, even those that don't care about the consumer cert or registration status. 

The stdout message added in 5c48d059bb07b64b92722f249b38aaee7219ab47 is informational.
Comment 12 Chris "Ceiu" Rog 2015-11-19 11:14:16 EST
After further discussion, we've come to the conclusion that the non-zero return and output to stderr is acceptable given how subman handles 410s today.




commit 3ad13c20f6ab34cf2621bc48cdd7d15a82791d4f
Author: Chris Rog <crog@redhat.com>
Date:   Wed Nov 18 14:14:05 2015 -0500

    1262989: Updated error message to include consumer UUID
    
    - The error messages that can occur in subman when a consumer is
      deleted from the server before unregistering now include the
      consumer's UUID in the error messages.
    - Log messages related to this issue are no longer translated
Comment 15 Shwetha Kallesh 2015-12-09 09:33:16 EST
Marking verified as the cache gets cleared when you run unregister command, message for "subscription-manager list --avail after the consumer is deleted at server side" has been modified , and so is the exit code

[root@pogo-cn1100-01 ~]# subscription-manager version
server type: Red Hat Subscription Management
subscription management server: 0.9.51.11-1
subscription management rules: 5.15
subscription-manager: 1.16.6-1.el6
python-rhsm: 1.16.4-1.el6



[root@pogo-cn1100-01 ~]# subscription-manager register 
Registering to: subscription.rhn.stage.redhat.com:443/subscription
Username: qa@redhat.com
Password: 
The system has been registered with ID: fead1ea9-ea1f-4d00-85cc-bab239805e74 

[root@pogo-cn1100-01 ~]# curl -k -u *****:***** --request DELETE https://subscription.rhn.stage.redhat.com:443/subscription/consumers/fead1ea9-ea1f-4d00-85cc-bab239805e74

[root@pogo-cn1100-01 ~]# subscription-manager list --avail
Consumer profile "fead1ea9-ea1f-4d00-85cc-bab239805e74" has been deleted from the server. You can use command clean or unregister to remove local profile.

[root@pogo-cn1100-01 ~]# echo $?
69

[root@pogo-cn1100-01 ~]# subscription-manager unregister
System has been unregistered.

[root@pogo-cn1100-01 ~]# echo $?
127


[root@pogo-cn1100-01 ~]# ls -lart /etc/pki/consumer/
total 8
drwxr-xr-x. 13 root root 4096 Dec  8 09:25 ..
drwxr-xr-x.  2 root root 4096 Dec  9 09:13 .
[root@pogo-cn1100-01 ~]# ls -alrt /var/lib/rhsm/cache/
total 8
drwxr-xr-x. 5 root root 4096 Dec  9 03:56 ..
drwxr-xr-x. 2 root root 4096 Dec  9 09:13 .

[root@pogo-cn1100-01 ~]# tail -f /var/log/rhsm/rhsm.log
2015-12-09 09:12:07,434 [INFO] subscription-manager:25355 @cert_sorter.py:205 - Product status: valid_products= partial_products= expired_products= unentitled_producs=69 future_products= valid_until=None
2015-12-09 09:12:07,434 [DEBUG] subscription-manager:25355 @cert_sorter.py:207 - partial stacks: []
2015-12-09 09:12:07,435 [DEBUG] subscription-manager:25355 @cache.py:154 - Checking current system info against cache: /var/lib/rhsm/cache/installed_products.json
2015-12-09 09:12:07,436 [DEBUG] subscription-manager:25355 @cache.py:110 - Wrote cache: /var/lib/rhsm/cache/entitlement_status.json
2015-12-09 09:12:07,438 [DEBUG] subscription-manager:25355 @cache.py:171 - No changes.
2015-12-09 09:12:07,438 [DEBUG] subscription-manager:25355 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:12:07,864 [INFO] rhsmd:25385 @rhsmd:230 - rhsmd started
2015-12-09 09:12:07,881 [DEBUG] rhsmd:25385 @rhsmd:185 - D-Bus interface com.redhat.SubscriptionManager.EntitlementStatus.update_status called with status = 1
2015-12-09 09:12:07,972 [DEBUG] rhsmd:25385 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:12:07,983 [DEBUG] rhsmd:25385 @rhsmd:153 - D-Bus signal com.redhat.SubscriptionManager.EntitlementStatus.entitlement_status_changed emitted
2015-12-09 09:13:09,274 [DEBUG] subscription-manager:25388 @ga_loader.py:89 - ga_loader GaImporterGtk2
2015-12-09 09:13:09,285 [DEBUG] subscription-manager:25388 @plugins.py:569 - loaded plugin modules: [<module 'container_content' from '/usr/share/rhsm-plugins/container_content.pyc'>]
2015-12-09 09:13:09,285 [DEBUG] subscription-manager:25388 @plugins.py:570 - loaded plugins: {'container_content.ContainerContentPlugin': <container_content.ContainerContentPlugin object at 0x205de90>}
2015-12-09 09:13:09,285 [DEBUG] subscription-manager:25388 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:13:09,377 [INFO] subscription-manager:25388 @managercli.py:357 - Client Versions: {'python-rhsm': '1.16.4-1.el6', 'subscription-manager': '1.16.6-1.el6'}
2015-12-09 09:13:09,379 [INFO] subscription-manager:25388 @connection.py:788 - Connection built: host=subscription.rhn.stage.redhat.com port=443 handler=/subscription auth=identity_cert ca_dir=/etc/rhsm/ca/ verify=False
2015-12-09 09:13:09,379 [INFO] subscription-manager:25388 @connection.py:788 - Connection built: host=subscription.rhn.stage.redhat.com port=443 handler=/subscription auth=none
2015-12-09 09:13:09,380 [INFO] subscription-manager:25388 @managercli.py:332 - Consumer Identity name=pogo-cn1100-01.rhts.eng.bos.redhat.com uuid=fead1ea9-ea1f-4d00-85cc-bab239805e74
2015-12-09 09:13:09,382 [DEBUG] subscription-manager:25388 @connection.py:482 - Loaded CA certificates from /etc/rhsm/ca/: redhat-entitlement-authority.pem, redhat-uep.pem
2015-12-09 09:13:09,383 [DEBUG] subscription-manager:25388 @connection.py:530 - Making request: DELETE /subscription/consumers/fead1ea9-ea1f-4d00-85cc-bab239805e74
2015-12-09 09:13:09,970 [DEBUG] subscription-manager:25388 @connection.py:562 - Response: status=410
2015-12-09 09:13:09,971 [INFO] subscription-manager:25388 @managerlib.py:798 - This consumer's profile has been deleted from the server. Local certificates and cache will be cleaned now.
2015-12-09 09:13:09,972 [DEBUG] subscription-manager:25388 @managerlib.py:845 - Removing identity cert: /etc/pki/consumer/key.pem
2015-12-09 09:13:09,972 [DEBUG] subscription-manager:25388 @managerlib.py:845 - Removing identity cert: /etc/pki/consumer/cert.pem
2015-12-09 09:13:09,972 [DEBUG] subscription-manager:25388 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:13:09,972 [DEBUG] subscription-manager:25388 @identity.py:146 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 2] No such file or directory: '/etc/pki/consumer/key.pem'
2015-12-09 09:13:09,973 [DEBUG] subscription-manager:25388 @cache.py:86 - Deleting cache: /var/lib/rhsm/packages/packages.json
2015-12-09 09:13:09,974 [DEBUG] subscription-manager:25388 @cache.py:86 - Deleting cache: /var/lib/rhsm/cache/installed_products.json
2015-12-09 09:13:09,974 [DEBUG] subscription-manager:25388 @cache.py:86 - Deleting cache: /var/lib/rhsm/facts/facts.json
2015-12-09 09:13:09,974 [DEBUG] subscription-manager:25388 @cache.py:86 - Deleting cache: /var/lib/rhsm/cache/entitlement_status.json
2015-12-09 09:13:09,975 [INFO] subscription-manager:25388 @managerlib.py:879 - Cleaned local data
2015-12-09 09:13:09,977 [DEBUG] subscription-manager:25388 @base_action_client.py:85 - running lib: <subscription_manager.entcertlib.EntCertActionInvoker object at 0x2085950>
2015-12-09 09:13:09,978 [INFO] subscription-manager:25388 @entcertlib.py:131 - certs updated:
Total updates: 0
Found (local) serial# []
Expected (UEP) serial# []
Added (new)
  <NONE>
Deleted (rogue):
  <NONE>
2015-12-09 09:13:09,979 [DEBUG] subscription-manager:25388 @base_action_client.py:85 - running lib: <subscription_manager.content_action_client.ContentActionClient object at 0x2085b90>
2015-12-09 09:13:09,980 [DEBUG] subscription-manager:25388 @base_action_client.py:85 - running lib: <subscription_manager.repolib.RepoActionInvoker object at 0x2085d50>
2015-12-09 09:13:09,984 [DEBUG] subscription-manager:25388 @__init__.py:85 - Searching for content of type: yum
2015-12-09 09:13:09,986 [INFO] subscription-manager:25388 @repolib.py:296 - repos updated: Repo updates

Total repo updates: 0
Updated
    <NONE>
Added (new)
    <NONE>
Deleted
    <NONE>
2015-12-09 09:13:09,987 [DEBUG] subscription-manager:25388 @plugins.py:769 - Running update_content_hook in container_content.ContainerContentPlugin
2015-12-09 09:13:09,987 [DEBUG] subscription-manager:25388 @base_action_client.py:85 - running lib: <subscription_manager.content_action_client.ContentPluginActionInvoker object at 0x21c5490>
2015-12-09 09:13:09,988 [INFO] subscription-manager:25388 @container_content.py:43 - Updating container content.
2015-12-09 09:13:09,988 [INFO] subscription-manager:25388 @container_content.py:45 - registry hostnames = registry.access.redhat.com,cdn.redhat.com,access.redhat.com
2015-12-09 09:13:09,988 [DEBUG] subscription-manager:25388 @__init__.py:85 - Searching for content of type: containerimage
2015-12-09 09:13:09,989 [DEBUG] subscription-manager:25388 @container.py:53 - Got content_sets: []
2015-12-09 09:13:09,989 [DEBUG] subscription-manager:25388 @container.py:138 - Syncing container certificates to /etc/docker/certs.d/registry.access.redhat.com
2015-12-09 09:13:09,991 [DEBUG] subscription-manager:25388 @container.py:138 - Syncing container certificates to /etc/docker/certs.d/cdn.redhat.com
2015-12-09 09:13:09,992 [DEBUG] subscription-manager:25388 @container.py:138 - Syncing container certificates to /etc/docker/certs.d/access.redhat.com
2015-12-09 09:13:09,993 [DEBUG] subscription-manager:25388 @certdirectory.py:216 - Installed product IDs: ['69']
2015-12-09 09:13:09,994 [DEBUG] subscription-manager:25388 @cert_sorter.py:126 - Unregistered, skipping server compliance check.
2015-12-09 09:13:09,994 [DEBUG] subscription-manager:25388 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:13:09,994 [DEBUG] subscription-manager:25388 @identity.py:146 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 2] No such file or directory: '/etc/pki/consumer/key.pem'
2015-12-09 09:13:10,331 [INFO] rhsmd:25390 @rhsmd:230 - rhsmd started
2015-12-09 09:13:10,336 [DEBUG] rhsmd:25390 @rhsmd:185 - D-Bus interface com.redhat.SubscriptionManager.EntitlementStatus.update_status called with status = 0
2015-12-09 09:13:10,425 [DEBUG] rhsmd:25390 @identity.py:131 - Loading consumer info from identity certificates.
2015-12-09 09:13:10,425 [DEBUG] rhsmd:25390 @identity.py:146 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 2] No such file or directory: '/etc/pki/consumer/key.pem'
2015-12-09 09:13:10,426 [DEBUG] rhsmd:25390 @rhsmd:153 - D-Bus signal com.redhat.SubscriptionManager.EntitlementStatus.entitlement_status_changed emitted
2015-12-09 09:13:10,426 [DEBUG] subscription-manager:25388 @utils.py:287 - No virt-who pid file, not attempting to restart
Comment 17 errata-xmlrpc 2016-05-10 16:37:11 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0797.html

Note You need to log in before you can comment on or make changes to this bug.