Bug 1261916 - huge base channel "channel.software.clone" XMLRPC: 500 read timeout
Summary: huge base channel "channel.software.clone" XMLRPC: 500 read timeout
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite 5
Classification: Red Hat
Component: API
Version: 570
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: ---
Assignee: Tomas Lestach
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks: sat570-triage
TreeView+ depends on / blocked
 
Reported: 2015-09-10 12:36 UTC by Daniel Peess
Modified: 2019-08-15 05:23 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-25 14:50:06 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
channel.software.clone (7.79 KB, text/plain)
2015-09-10 12:40 UTC, Daniel Peess
no flags Details
channel.software.delete (4.48 KB, text/plain)
2015-09-10 12:41 UTC, Daniel Peess
no flags Details

Description Daniel Peess 2015-09-10 12:36:03 UTC
Description of problem:

The Satellite v5.7 XMLRPC API goes nuts if you clone huge software channels.

Version-Release number of selected component (if applicable):

RHEL 6.6 + Satellite v5.7 with all updates as of 2015-09-09.

$ rpm -qa | grep -i 'satellite\|spacewalk' | sort;
satellite-branding-5.7.0.26-1.el6sat.noarch
satellite-doc-indexes-5.7.0-1.el6sat.noarch
satellite-repo-5.6.0.3-1.el6sat.noarch
satellite-schema-5.7.0.19-1.el6sat.noarch
spacewalk-admin-2.3.0-2.el6sat.noarch
spacewalk-backend-2.3.3-28.el6sat.noarch
spacewalk-backend-app-2.3.3-28.el6sat.noarch
spacewalk-backend-applet-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-common-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-tool-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-export-2.3.3-28.el6sat.noarch
spacewalk-backend-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-package-push-server-2.3.3-28.el6sat.noarch
spacewalk-backend-server-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-postgresql-2.3.3-28.el6sat.noarch
spacewalk-backend-tools-2.3.3-28.el6sat.noarch
spacewalk-backend-xml-export-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-xmlrpc-2.3.3-28.el6sat.noarch
spacewalk-base-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-config-2.3.2-32.el6sat.noarch
spacewalk-certs-tools-2.3.0-4.el6sat.noarch
spacewalk-common-2.3.0-1.5.el6sat.noarch
spacewalk-config-2.3.0-4.el6sat.noarch
spacewalk-dobby-2.3.2-32.el6sat.noarch
spacewalk-grail-2.3.2-32.el6sat.noarch
spacewalk-html-2.3.2-32.el6sat.noarch
spacewalk-java-2.3.8-112.el6sat.noarch
spacewalk-java-config-2.3.8-112.el6sat.noarch
spacewalk-java-lib-2.3.8-112.el6sat.noarch
spacewalk-java-postgresql-2.3.8-112.el6sat.noarch
spacewalk-monitoring-2.2.1-1.el6sat.noarch
spacewalk-monitoring-selinux-2.2.1-1.el6sat.noarch
spacewalk-postgresql-2.3.0-1.5.el6sat.noarch
spacewalk-pxt-2.3.2-32.el6sat.noarch
spacewalk-remote-utils-2.3.0-5.el6sat.noarch
spacewalk-reports-2.3.0-8.el6sat.noarch
spacewalk-schema-2.3.2-21.el6sat.noarch
spacewalk-search-2.3.0-7.el6sat.noarch
spacewalk-selinux-2.2.1-1.el6sat.noarch
spacewalk-setup-2.3.0-17.el6sat.noarch
spacewalk-setup-jabberd-2.0.1-1.el6sat.noarch
spacewalk-setup-postgresql-2.3.0-21.el6sat.noarch
spacewalk-slf4j-1.6.1-6.el6sat.noarch
spacewalk-sniglets-2.3.2-32.el6sat.noarch
spacewalk-ssl-cert-check-2.3-1.el6sat.noarch
spacewalk-taskomatic-2.3.8-112.el6sat.noarch
spacewalk-utils-2.3.2-22.el6sat.noarch


How reproducible:

$ grep -- 'call(' `which sat-xmlrpc-clone_channel.pl`;
        $session = $client->call( 'auth.login', "$optaccount", "$optpassword" );
                $session = $client->call( 'auth.logout', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $spackages = $client->call( 'channel.software.listAllPackages', $session, "$channel" );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
                my $cid = $client->call( 'channel.software.clone', $session, "$source", \%details, $client->boolean( "0" ) );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
                $sharing = $client->call( 'channel.access.getOrgSharing', $session, "$source" );
                $client->call( 'channel.access.setOrgSharing', $session, "$target", "$sharing" );
                my $orgs = $client->call( 'org.trusts.listOrgs', $session );
                                $client->call( 'channel.org.enableAccess', $session, "$target", "$org->{org_id}" );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );

$ time sat-xmlrpc-clone_channel.pl --account 'satadmin-rbg' --password 'XXXXXXXXXXX' --source 'null-rhel-x86_64-server-6-20141204' --prefix 'tst1' --prestrip 4;
satellite xmlrpc authentication successful.
target channel does not exist yet, source channel found and is a parent channel.

cloning parent channel and all of its child channels, this takes some time...
cloned: null-rhel-x86_64-server-6-20141204 -> tst1-rhel-x86_64-server-6-20141204, packages: 14382/14382, id: 411
500 read timeout

real    3m42.517s
user    0m12.372s
sys     0m0.207s

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-";
satellite xmlrpc authentication successful.
ERROR: could not delete the satellite channel "tst1-rhel-x86_64-server-6-20141204": Fault returned from XML RPC Server, fault code -1: redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ERROR: update or delete on table "rhnchannel" violates foreign key constraint "rhn_channel_parent_ch_fk" on table "rhnchannel"
  Detail: Key (id)=(411) is still referenced from table "rhnchannel".
  Where: SQL statement "delete from rhnChannel where id = channel_id_in"
PL/pgSQL function delete_channel(numeric) line 13 at SQL statement

$ rhn-satellite restart;

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-rhel-x86_64-server-6-20141204";
satellite xmlrpc authentication successful.
deleted "tst1-rhel-x86_64-server-6-20141204" child: "tst1-rhn-tools-rhel-x86_64-server-6-20141204"
deleted parent: "tst1-rhel-x86_64-server-6-20141204"
done.


Steps to Reproduce:
1. Use the "channel.software.clone" API for several huge base and their child software channels.
2. It never completes with a "500 read timeout".
3. Deleting the same failed clone fails as well, until you restart all processes.

Actual results:

Half-cloned Software Channel without any child channels.
Sometimes it even completes the base channel clone and only goes into timeout while cloning the first or second of its child channels.

Expected results:

Fully-cloned Software Channel with all child channels.

Additional info:

This exact same XMLRPC API script worked fine with Satellite v5.4.1
and RHEL6 base channel cloning when it was still below 12.0000 packages.

We have also tested 'spacewalk-clone-by-date', and this does complete successfully for the same base channel and its child channels.
It does take 90 minutes instead of 4 minutes 'though,
so they might query the API for some "background action for this channel API call has completed, you can continue now without killing the server process".

Comment 1 Daniel Peess 2015-09-10 12:40:38 UTC
Created attachment 1072162 [details]
channel.software.clone

XMLRPC API channel.software.clone script.

Comment 2 Daniel Peess 2015-09-10 12:41:18 UTC
Created attachment 1072165 [details]
channel.software.delete

Comment 3 Daniel Peess 2015-09-10 12:44:51 UTC
high timeouts and big memory alloc already set:

# grep -- '^Timeout' /etc/httpd/conf/httpd.conf;
Timeout 1800
# grep -- '<session-timeout' /var/lib/tomcat6/webapps/rhn/WEB-INF/web.xml;
    <session-timeout>1800</session-timeout>
# grep -- 'web.session_cookie_lifetime' /etc/rhn/rhn.conf;
web.session_cookie_lifetime = 1800
# grep -- 'web.session_database_lifetime' /etc/rhn/rhn.conf;
web.session_database_lifetime = 3600
# grep -i -- '^ProxyTimeout' /etc/httpd/conf.d/zz-spacewalk-www.conf;
ProxyTimeout 1800

# grep -- '^wrapper.java.initmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.initmemory=8192
# grep -- '^wrapper.java.maxmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.maxmemory=16384
# grep -- '^JAVA_OPTS' /etc/sysconfig/tomcat6;
JAVA_OPTS="$JAVA_OPTS -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC"
# grep -- 'JAVA_OPTS' /etc/tomcat6/tomcat6.conf;
#JAVA_OPTS="-Xminf0.1 -Xmaxf0.3"
# Use JAVA_OPTS to set java.library.path for libtcnative.so
#JAVA_OPTS="-Djava.library.path=/usr/lib64"
JAVA_OPTS="-Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory"
# ps -ef | grep -i -- 'tomcat';
tomcat   42898     1 18 20:55 ?        00:00:11 /usr/lib/jvm/java/bin/java -Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC -classpath :/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar -Dcatalina.base=/usr/share/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat6/temp -Djava.util.logging.config.file=/usr/share/tomcat6/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
root     44434 58784  0 20:56 pts/1    00:00:00 grep -i -- tomcat

Comment 4 Daniel Peess 2015-09-10 13:07:14 UTC
embedded postgresql database not even that huge:
# du -msc /opt/rh/postgresql92/root/var/lib/pgsql/;
6456    /opt/rh/postgresql92/root/var/lib/pgsql/
6456    total


Note You need to log in before you can comment on or make changes to this bug.