Bug 1114117 - rhsm-debug takes forever
Summary: rhsm-debug takes forever
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: subscription-manager
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
: 7.1
Assignee: William Poteat
QA Contact: John Sefler
Depends On:
Blocks: rhsm-rhel71 1116349
TreeView+ depends on / blocked
Reported: 2014-06-27 19:05 UTC by Bryn M. Reeves
Modified: 2015-03-05 13:17 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1116349 (view as bug list)
Last Closed: 2015-03-05 13:17:30 UTC
Target Upstream Version:

Attachments (Terms of Use)

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:0502 normal SHIPPED_LIVE subscription-manager bug fix and enhancement update 2015-03-05 16:20:39 UTC

Description Bryn M. Reeves 2014-06-27 19:05:51 UTC
Description of problem:
Running rhsm-debug on a simple rhel7 VM installation takes around 90s:

# time rhsm-debug system --sos --no-archive --destination /var/tmp/out/
Wrote: /var/tmp/out//rhsm-debug-system-20140627-165705

real    1m26.624s
user    0m6.045s
sys     0m0.384s

This is run by default by sos and on a typical install now accounts for around 1/3rd of the total runtime:

# time sosreport -vv --batch
real    2m41.154s
user    0m57.406s
sys     0m11.706s

We really need to get this down or we'll have to make collection of the rhsm-debug data optional.

Version-Release number of selected component (if applicable):
# rpm -q subscription-manager

How reproducible:
100% on my systems

Steps to Reproduce:
1. time rhsm-debug system --sos --no-archive --destination /var/tmp/out/

Actual results:
Routinely over 90s to generate ~16M of json data.

Expected results:
Ideally less than 10s for a default run. The rhsm-debug script is by far the longest-running command we call today:

'selinuxconlist' 0.219773s
'sh' 0.221468s
'nmcli' 0.227090s
'systemctl' 0.232961s
'lsof' 0.306699s
'sadf' 0.337759s
'ls' 0.409806s
'lsusb' 0.484617s
'tuned-adm' 0.494981s
'parted' 1.025333s
'firewall-cmd' 1.067522s
'lsinitrd' 1.360956s
'grub2-mkconfig' 2.074552s
'semodule' 3.628310s
'yum' 4.272996s
'rpm' 4.824193s
'subscription-manager' 11.536240s
'journalctl' 17.303878s
'stap-report' 19.274455s
'rhsm-debug' 93.130227s

Additional info:

Comment 1 Bryn M. Reeves 2014-06-30 11:36:14 UTC
Seems like most of the time is going in getSubscriptionList:

138655/41032    2.669    0.000    5.625    0.000 connection.py:390(_decode_dict)
        8    0.000    0.000    6.194    0.774 decoder.py:360(decode)
        8    0.000    0.000    6.194    0.774 __init__.py:293(loads)
        8    0.569    0.071    6.194    0.774 decoder.py:371(raw_decode)
4515293/1222201    3.725    0.000    6.691    0.000 encoder.py:341(_iterencode_dict)
3603979/1221546    2.613    0.000    7.549    0.000 encoder.py:288(_iterencode_list)
  1222210    0.903    0.000    8.458    0.000 encoder.py:417(_iterencode)
        6    0.509    0.085    9.027    1.505 encoder.py:186(encode)
        6    0.025    0.004    9.052    1.509 __init__.py:193(dumps)
        6    0.002    0.000    9.071    1.512 debug_commands.py:191(_write_flat_file)
        1    0.000    0.000    9.569    9.569 connection.py:984(getPoolsList)
     4388    0.051    0.000   21.238    0.005 socket.py:336(read)
     4388    0.040    0.000   21.296    0.005 httplib.py:637(_safe_read)
        8    0.033    0.004   22.250    2.781 httplib.py:579(_read_chunked)
        8    0.002    0.000   22.251    2.781 httplib.py:534(read)
        8    0.000    0.000   32.910    4.114 httplib.py:363(_read_status)
        8    0.000    0.000   32.941    4.118 httplib.py:1009(getresponse)
        8    0.000    0.000   32.941    4.118 httplib.py:402(begin)
     2284    0.102    0.000   33.847    0.015 socket.py:406(readline)
        1    0.000    0.000   51.713   51.713 connection.py:1128(getSubscriptionList)
    18616   54.769    0.003   54.769    0.003 {M2Crypto.__m2crypto.ssl_read}
    18616    0.041    0.000   54.810    0.003 Connection.py:210(_read_bio)
    18616    0.042    0.000   54.852    0.003 Connection.py:226(read)
        8    0.001    0.000   66.293    8.287 connection.py:421(_request)
        8    0.001    0.000   66.297    8.287 connection.py:553(request_get)
        1    0.031    0.031   74.103   74.103 debug_commands.py:72(_do_command)
        1    0.000    0.000   75.566   75.566 cli.py:146(main)
        1    0.000    0.000   75.566   75.566 managercli.py:301(main)

Which'd fit with the size of the json archives:

# ll -h /var/tmp/out/rhsm-debug-system-20140630-896709/
total 18M
-rw-r--r--. 1 root root  19K Jun 30 13:09 compliance.json
-rw-r--r--. 1 root root  11K Jun 30 13:09 consumer.json
-rw-r--r--. 1 root root  13K Jun 30 13:09 entitlements.json
-rw-r--r--. 1 root root 480K Jun 30 13:09 pools.json
-rw-r--r--. 1 root root  17M Jun 30 13:09 subscriptions.json
-rw-r--r--. 1 root root  192 Jun 30 13:09 version.json

Comment 3 William Poteat 2014-06-30 20:07:46 UTC
This is a large amount of data. It is expected that it will take time to collect. As this is a 1 off type of tool that is run as part of manual troubleshooting, I'd  say that 90 seconds is not prohibitive.

Comment 4 Bryn M. Reeves 2014-07-01 11:05:03 UTC
> is run as part of manual troubleshooting

Not really. Running rhsm-debug by default in sos was requested in bug 1039036. This measn it runs every time a user runs sosreport (or when sos is run under automation e.g. via abrt, rhev-log-collector, rhs-log-collector etc.).

90s is an eternity in terms of an interactive support data collector tool and we routinely get complaints from customers and GSS associates about the time taken for sos to complete. As I said in comment #0 rhsm-debug is accounting for as much as 50% of our total runtime in a single command.

If this can't be improved in subscription-manager I'll file a bug against sos to have this command disabled from 7.1 onwards.

Comment 5 Bryn M. Reeves 2014-07-01 11:07:39 UTC
Incidentally the hosts I'm testing on collect ~160M of data in the remaining run time. That's around ten times the quantity of data generated by rhsm-debug in about the same period of time so I don't think the quantity argument in comment #3 really stands.

Comment 6 William Poteat 2014-07-01 13:09:49 UTC
Do those hosts collect it across the network from another machine that is concurrently handling a large number of entitlements? This data is not coming from the client, but rather out of the entitlement engine.

If the subscription data is not needed, then file an RFE to make it a conditional part of this process.

Comment 7 William Poteat 2014-07-01 13:54:19 UTC
It is also worth noting that the subscription data is compiled in the adapter layer at the entilement engine. This layer is controlled at stage and production by IT. These adapters pull the subscription data from additional systems. This would be the point of bottleneck.

Comment 8 Bryn M. Reeves 2014-07-01 14:33:12 UTC
> Do those hosts collect it across the network from another machine that is 
> concurrently handling a large number of entitlements?

I don't know; this is just using the default configuration with public RHN hosts. The only host I see the process connecting to is subcription.rhn.redhat.com.

> This data is not coming from the client, but rather out of the entitlement 
> engine.

In that case is there any need to generate it during collection? Can it be generated after-the-fact on an analysis host or does it require private key material from the client (that we deliberately don't collect)?

> If the subscription data is not needed

I'm not really the person to make that call; the rhsm team requested we collect 'rhsm-debug system --sos --no-archive' by default when sos is run. If that's not needed (or if there are additional options we should pass to remove unneeded data) then it's easy enough to make that change but it's hard for me to evaluate what is and isn't needed to support these components.

Comment 9 Adrian Likins 2014-07-01 18:16:26 UTC
I would say to remove getSubscriptionList call from rhsm_debug. Looks like it can be huge, or almost empty depending on server permission set up. The data is potentially useful for debugging subscription issues, but not 90s worth. It's
also not used by subscription-manager directly.

Tempting to say to remove all the calls that hit the network, since theoretically that data should be available to the folks doing the troubleshooting. I don't think that is actually always the case, and it loses some of the snapshot-in-time value, but it would be quicker and simpler.

Comment 10 Bryan Kearney 2014-07-02 13:00:51 UTC
will, go back on what said. Go ahead with removing it.

Comment 11 William Poteat 2014-07-03 20:08:29 UTC
Added --no-subscriptions option to the rhsm-debug command

commit 68a1a418c27172c4fb851d536813f8060f4d3d1f

Comment 13 John Sefler 2014-07-11 17:40:50 UTC
comment 11 has been committed to master which will show up in versions >= subscription-manager-1.12.7-1 which will be delivered with RHEL6.6 and subsequently RHEL7.1 and beyond.

Comment 14 Bryan Kearney 2014-07-30 19:24:13 UTC
Acking 7.1

Comment 16 John Sefler 2014-11-21 21:06:11 UTC
Verifying Version...
[root@jsefler-os7 ~]# rpm -q subscription-manager

[root@jsefler-os7 ~]# rhsm-debug system --help
Usage: rhsm-debug system [OPTIONS] 

Assemble system information as a tar file or directory

  -h, --help            show this help message and exit
  --proxy=PROXY_URL     proxy URL in the form of proxy_hostname:proxy_port
                        user for HTTP proxy with basic authentication
                        password for HTTP proxy with basic authentication
                        the destination location of the result; default is
  --no-archive          data will be in an uncompressed directory
  --sos                 only data not already included in sos report will be
  --no-subscriptions    exclude subscription data
VERIFIED" The newly implemented --no-subscriptions appears in the command help.

[root@jsefler-os7 ~]# subscription-manager register --serverur=https://jsefler-os-candlepin.usersys.redhat.com:8443/candlepin
Username: testuser1
Organization: admin
The system has been registered with ID: f63d7339-3e5b-41ed-8c75-de8f21f66fd9 

[root@jsefler-os7 ~]# time rhsm-debug system --sos --no-archive --destination /tmp/test
Wrote: /tmp/test/rhsm-debug-system-20141121-274525

real	0m2.578s
user	0m1.486s
sys	0m0.177s
[root@jsefler-os7 ~]# ls -l /tmp/test/rhsm-debug-system-20141121-274525
total 1940
-rw-r--r--. 1 root root     543 Nov 21 15:57 compliance.json
-rw-r--r--. 1 root root   10337 Nov 21 15:57 consumer.json
-rw-r--r--. 1 root root       2 Nov 21 15:57 entitlements.json
-rw-r--r--. 1 root root  275794 Nov 21 15:57 pools.json
-rw-r--r--. 1 root root 1680220 Nov 21 15:57 subscriptions.json
-rw-r--r--. 1 root root     185 Nov 21 15:57 version.json
[root@jsefler-os7 ~]# 
NOTICE: without specifying --no-subscriptions, the rhsm-debug took 2.578 seconds and wrote a large subscriptions.json file that was retrieved from the server.

[root@jsefler-os7 ~]# time rhsm-debug system --sos --no-archive --destination /tmp/test --no-subscriptions
Wrote: /tmp/test/rhsm-debug-system-20141121-959727

real	0m1.328s
user	0m0.688s
sys	0m0.141s
[root@jsefler-os7 ~]# ls -l /tmp/test/rhsm-debug-system-20141121-959727
total 296
-rw-r--r--. 1 root root    543 Nov 21 16:02 compliance.json
-rw-r--r--. 1 root root  10337 Nov 21 16:02 consumer.json
-rw-r--r--. 1 root root      2 Nov 21 16:02 entitlements.json
-rw-r--r--. 1 root root 275794 Nov 21 16:02 pools.json
-rw-r--r--. 1 root root    185 Nov 21 16:02 version.json
[root@jsefler-os7 ~]# 

VERIFIED: when specifying --no-subscriptions, the rhsm-debug took 1.328 seconds (50% faster) and did not retrieve a subscriptions.json file from the server.

Comment 18 errata-xmlrpc 2015-03-05 13:17:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.