Bug 1261916 - huge base channel "channel.software.clone" XMLRPC: 500 read timeout
huge base channel "channel.software.clone" XMLRPC: 500 read timeout
Status: CLOSED CURRENTRELEASE
Product: Red Hat Satellite 5
Classification: Red Hat
Component: API (Show other bugs)
570
Unspecified Unspecified
medium Severity high
: ---
: ---
Assigned To: Tomas Lestach
Red Hat Satellite QA List
:
Depends On:
Blocks: sat570-triage
  Show dependency treegraph
 
Reported: 2015-09-10 08:36 EDT by Daniel Peess
Modified: 2017-05-25 10:50 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-25 10:50:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
channel.software.clone (7.79 KB, text/plain)
2015-09-10 08:40 EDT, Daniel Peess
no flags Details
channel.software.delete (4.48 KB, text/plain)
2015-09-10 08:41 EDT, Daniel Peess
no flags Details

  None (edit)
Description Daniel Peess 2015-09-10 08:36:03 EDT
Description of problem:

The Satellite v5.7 XMLRPC API goes nuts if you clone huge software channels.

Version-Release number of selected component (if applicable):

RHEL 6.6 + Satellite v5.7 with all updates as of 2015-09-09.

$ rpm -qa | grep -i 'satellite\|spacewalk' | sort;
satellite-branding-5.7.0.26-1.el6sat.noarch
satellite-doc-indexes-5.7.0-1.el6sat.noarch
satellite-repo-5.6.0.3-1.el6sat.noarch
satellite-schema-5.7.0.19-1.el6sat.noarch
spacewalk-admin-2.3.0-2.el6sat.noarch
spacewalk-backend-2.3.3-28.el6sat.noarch
spacewalk-backend-app-2.3.3-28.el6sat.noarch
spacewalk-backend-applet-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-common-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-tool-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-export-2.3.3-28.el6sat.noarch
spacewalk-backend-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-package-push-server-2.3.3-28.el6sat.noarch
spacewalk-backend-server-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-postgresql-2.3.3-28.el6sat.noarch
spacewalk-backend-tools-2.3.3-28.el6sat.noarch
spacewalk-backend-xml-export-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-xmlrpc-2.3.3-28.el6sat.noarch
spacewalk-base-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-config-2.3.2-32.el6sat.noarch
spacewalk-certs-tools-2.3.0-4.el6sat.noarch
spacewalk-common-2.3.0-1.5.el6sat.noarch
spacewalk-config-2.3.0-4.el6sat.noarch
spacewalk-dobby-2.3.2-32.el6sat.noarch
spacewalk-grail-2.3.2-32.el6sat.noarch
spacewalk-html-2.3.2-32.el6sat.noarch
spacewalk-java-2.3.8-112.el6sat.noarch
spacewalk-java-config-2.3.8-112.el6sat.noarch
spacewalk-java-lib-2.3.8-112.el6sat.noarch
spacewalk-java-postgresql-2.3.8-112.el6sat.noarch
spacewalk-monitoring-2.2.1-1.el6sat.noarch
spacewalk-monitoring-selinux-2.2.1-1.el6sat.noarch
spacewalk-postgresql-2.3.0-1.5.el6sat.noarch
spacewalk-pxt-2.3.2-32.el6sat.noarch
spacewalk-remote-utils-2.3.0-5.el6sat.noarch
spacewalk-reports-2.3.0-8.el6sat.noarch
spacewalk-schema-2.3.2-21.el6sat.noarch
spacewalk-search-2.3.0-7.el6sat.noarch
spacewalk-selinux-2.2.1-1.el6sat.noarch
spacewalk-setup-2.3.0-17.el6sat.noarch
spacewalk-setup-jabberd-2.0.1-1.el6sat.noarch
spacewalk-setup-postgresql-2.3.0-21.el6sat.noarch
spacewalk-slf4j-1.6.1-6.el6sat.noarch
spacewalk-sniglets-2.3.2-32.el6sat.noarch
spacewalk-ssl-cert-check-2.3-1.el6sat.noarch
spacewalk-taskomatic-2.3.8-112.el6sat.noarch
spacewalk-utils-2.3.2-22.el6sat.noarch


How reproducible:

$ grep -- 'call(' `which sat-xmlrpc-clone_channel.pl`;
        $session = $client->call( 'auth.login', "$optaccount", "$optpassword" );
                $session = $client->call( 'auth.logout', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $spackages = $client->call( 'channel.software.listAllPackages', $session, "$channel" );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
                my $cid = $client->call( 'channel.software.clone', $session, "$source", \%details, $client->boolean( "0" ) );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
                $sharing = $client->call( 'channel.access.getOrgSharing', $session, "$source" );
                $client->call( 'channel.access.setOrgSharing', $session, "$target", "$sharing" );
                my $orgs = $client->call( 'org.trusts.listOrgs', $session );
                                $client->call( 'channel.org.enableAccess', $session, "$target", "$org->{org_id}" );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );

$ time sat-xmlrpc-clone_channel.pl --account 'satadmin-rbg' --password 'XXXXXXXXXXX' --source 'null-rhel-x86_64-server-6-20141204' --prefix 'tst1' --prestrip 4;
satellite xmlrpc authentication successful.
target channel does not exist yet, source channel found and is a parent channel.

cloning parent channel and all of its child channels, this takes some time...
cloned: null-rhel-x86_64-server-6-20141204 -> tst1-rhel-x86_64-server-6-20141204, packages: 14382/14382, id: 411
500 read timeout

real    3m42.517s
user    0m12.372s
sys     0m0.207s

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-";
satellite xmlrpc authentication successful.
ERROR: could not delete the satellite channel "tst1-rhel-x86_64-server-6-20141204": Fault returned from XML RPC Server, fault code -1: redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ERROR: update or delete on table "rhnchannel" violates foreign key constraint "rhn_channel_parent_ch_fk" on table "rhnchannel"
  Detail: Key (id)=(411) is still referenced from table "rhnchannel".
  Where: SQL statement "delete from rhnChannel where id = channel_id_in"
PL/pgSQL function delete_channel(numeric) line 13 at SQL statement

$ rhn-satellite restart;

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-rhel-x86_64-server-6-20141204";
satellite xmlrpc authentication successful.
deleted "tst1-rhel-x86_64-server-6-20141204" child: "tst1-rhn-tools-rhel-x86_64-server-6-20141204"
deleted parent: "tst1-rhel-x86_64-server-6-20141204"
done.


Steps to Reproduce:
1. Use the "channel.software.clone" API for several huge base and their child software channels.
2. It never completes with a "500 read timeout".
3. Deleting the same failed clone fails as well, until you restart all processes.

Actual results:

Half-cloned Software Channel without any child channels.
Sometimes it even completes the base channel clone and only goes into timeout while cloning the first or second of its child channels.

Expected results:

Fully-cloned Software Channel with all child channels.

Additional info:

This exact same XMLRPC API script worked fine with Satellite v5.4.1
and RHEL6 base channel cloning when it was still below 12.0000 packages.

We have also tested 'spacewalk-clone-by-date', and this does complete successfully for the same base channel and its child channels.
It does take 90 minutes instead of 4 minutes 'though,
so they might query the API for some "background action for this channel API call has completed, you can continue now without killing the server process".
Comment 1 Daniel Peess 2015-09-10 08:40:38 EDT
Created attachment 1072162 [details]
channel.software.clone

XMLRPC API channel.software.clone script.
Comment 2 Daniel Peess 2015-09-10 08:41:18 EDT
Created attachment 1072165 [details]
channel.software.delete
Comment 3 Daniel Peess 2015-09-10 08:44:51 EDT
high timeouts and big memory alloc already set:

# grep -- '^Timeout' /etc/httpd/conf/httpd.conf;
Timeout 1800
# grep -- '<session-timeout' /var/lib/tomcat6/webapps/rhn/WEB-INF/web.xml;
    <session-timeout>1800</session-timeout>
# grep -- 'web.session_cookie_lifetime' /etc/rhn/rhn.conf;
web.session_cookie_lifetime = 1800
# grep -- 'web.session_database_lifetime' /etc/rhn/rhn.conf;
web.session_database_lifetime = 3600
# grep -i -- '^ProxyTimeout' /etc/httpd/conf.d/zz-spacewalk-www.conf;
ProxyTimeout 1800

# grep -- '^wrapper.java.initmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.initmemory=8192
# grep -- '^wrapper.java.maxmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.maxmemory=16384
# grep -- '^JAVA_OPTS' /etc/sysconfig/tomcat6;
JAVA_OPTS="$JAVA_OPTS -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC"
# grep -- 'JAVA_OPTS' /etc/tomcat6/tomcat6.conf;
#JAVA_OPTS="-Xminf0.1 -Xmaxf0.3"
# Use JAVA_OPTS to set java.library.path for libtcnative.so
#JAVA_OPTS="-Djava.library.path=/usr/lib64"
JAVA_OPTS="-Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory"
# ps -ef | grep -i -- 'tomcat';
tomcat   42898     1 18 20:55 ?        00:00:11 /usr/lib/jvm/java/bin/java -Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC -classpath :/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar -Dcatalina.base=/usr/share/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat6/temp -Djava.util.logging.config.file=/usr/share/tomcat6/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
root     44434 58784  0 20:56 pts/1    00:00:00 grep -i -- tomcat
Comment 4 Daniel Peess 2015-09-10 09:07:14 EDT
embedded postgresql database not even that huge:
# du -msc /opt/rh/postgresql92/root/var/lib/pgsql/;
6456    /opt/rh/postgresql92/root/var/lib/pgsql/
6456    total

Note You need to log in before you can comment on or make changes to this bug.