Bug 1261916

Summary: huge base channel "channel.software.clone" XMLRPC: 500 read timeout
Product: Red Hat Satellite 5 Reporter: Daniel Peess <dpeess>
Component: APIAssignee: Tomas Lestach <tlestach>
Status: CLOSED CURRENTRELEASE QA Contact: Red Hat Satellite QA List <satqe-list>
Severity: high Docs Contact:
Priority: medium    
Version: 570CC: dyordano, pmutha, xdmoon
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-05-25 14:50:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1127217    
Attachments:
Description Flags
channel.software.clone
none
channel.software.delete none

Description Daniel Peess 2015-09-10 12:36:03 UTC
Description of problem:

The Satellite v5.7 XMLRPC API goes nuts if you clone huge software channels.

Version-Release number of selected component (if applicable):

RHEL 6.6 + Satellite v5.7 with all updates as of 2015-09-09.

$ rpm -qa | grep -i 'satellite\|spacewalk' | sort;
satellite-branding-5.7.0.26-1.el6sat.noarch
satellite-doc-indexes-5.7.0-1.el6sat.noarch
satellite-repo-5.6.0.3-1.el6sat.noarch
satellite-schema-5.7.0.19-1.el6sat.noarch
spacewalk-admin-2.3.0-2.el6sat.noarch
spacewalk-backend-2.3.3-28.el6sat.noarch
spacewalk-backend-app-2.3.3-28.el6sat.noarch
spacewalk-backend-applet-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-common-2.3.3-28.el6sat.noarch
spacewalk-backend-config-files-tool-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-2.3.3-28.el6sat.noarch
spacewalk-backend-iss-export-2.3.3-28.el6sat.noarch
spacewalk-backend-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-package-push-server-2.3.3-28.el6sat.noarch
spacewalk-backend-server-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-2.3.3-28.el6sat.noarch
spacewalk-backend-sql-postgresql-2.3.3-28.el6sat.noarch
spacewalk-backend-tools-2.3.3-28.el6sat.noarch
spacewalk-backend-xml-export-libs-2.3.3-28.el6sat.noarch
spacewalk-backend-xmlrpc-2.3.3-28.el6sat.noarch
spacewalk-base-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-2.3.2-32.el6sat.noarch
spacewalk-base-minimal-config-2.3.2-32.el6sat.noarch
spacewalk-certs-tools-2.3.0-4.el6sat.noarch
spacewalk-common-2.3.0-1.5.el6sat.noarch
spacewalk-config-2.3.0-4.el6sat.noarch
spacewalk-dobby-2.3.2-32.el6sat.noarch
spacewalk-grail-2.3.2-32.el6sat.noarch
spacewalk-html-2.3.2-32.el6sat.noarch
spacewalk-java-2.3.8-112.el6sat.noarch
spacewalk-java-config-2.3.8-112.el6sat.noarch
spacewalk-java-lib-2.3.8-112.el6sat.noarch
spacewalk-java-postgresql-2.3.8-112.el6sat.noarch
spacewalk-monitoring-2.2.1-1.el6sat.noarch
spacewalk-monitoring-selinux-2.2.1-1.el6sat.noarch
spacewalk-postgresql-2.3.0-1.5.el6sat.noarch
spacewalk-pxt-2.3.2-32.el6sat.noarch
spacewalk-remote-utils-2.3.0-5.el6sat.noarch
spacewalk-reports-2.3.0-8.el6sat.noarch
spacewalk-schema-2.3.2-21.el6sat.noarch
spacewalk-search-2.3.0-7.el6sat.noarch
spacewalk-selinux-2.2.1-1.el6sat.noarch
spacewalk-setup-2.3.0-17.el6sat.noarch
spacewalk-setup-jabberd-2.0.1-1.el6sat.noarch
spacewalk-setup-postgresql-2.3.0-21.el6sat.noarch
spacewalk-slf4j-1.6.1-6.el6sat.noarch
spacewalk-sniglets-2.3.2-32.el6sat.noarch
spacewalk-ssl-cert-check-2.3-1.el6sat.noarch
spacewalk-taskomatic-2.3.8-112.el6sat.noarch
spacewalk-utils-2.3.2-22.el6sat.noarch


How reproducible:

$ grep -- 'call(' `which sat-xmlrpc-clone_channel.pl`;
        $session = $client->call( 'auth.login', "$optaccount", "$optpassword" );
                $session = $client->call( 'auth.logout', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
        my $spackages = $client->call( 'channel.software.listAllPackages', $session, "$channel" );
        my $channels = $client->call( 'channel.listSoftwareChannels', $session );
                my $cid = $client->call( 'channel.software.clone', $session, "$source", \%details, $client->boolean( "0" ) );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );
                $sharing = $client->call( 'channel.access.getOrgSharing', $session, "$source" );
                $client->call( 'channel.access.setOrgSharing', $session, "$target", "$sharing" );
                my $orgs = $client->call( 'org.trusts.listOrgs', $session );
                                $client->call( 'channel.org.enableAccess', $session, "$target", "$org->{org_id}" );
        my $channels = $client->call( 'channel.software.listChildren', $session, $source );

$ time sat-xmlrpc-clone_channel.pl --account 'satadmin-rbg' --password 'XXXXXXXXXXX' --source 'null-rhel-x86_64-server-6-20141204' --prefix 'tst1' --prestrip 4;
satellite xmlrpc authentication successful.
target channel does not exist yet, source channel found and is a parent channel.

cloning parent channel and all of its child channels, this takes some time...
cloned: null-rhel-x86_64-server-6-20141204 -> tst1-rhel-x86_64-server-6-20141204, packages: 14382/14382, id: 411
500 read timeout

real    3m42.517s
user    0m12.372s
sys     0m0.207s

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-";
satellite xmlrpc authentication successful.
ERROR: could not delete the satellite channel "tst1-rhel-x86_64-server-6-20141204": Fault returned from XML RPC Server, fault code -1: redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ERROR: update or delete on table "rhnchannel" violates foreign key constraint "rhn_channel_parent_ch_fk" on table "rhnchannel"
  Detail: Key (id)=(411) is still referenced from table "rhnchannel".
  Where: SQL statement "delete from rhnChannel where id = channel_id_in"
PL/pgSQL function delete_channel(numeric) line 13 at SQL statement

$ rhn-satellite restart;

$ spacecmd -q -u 'satadmin-rbg' -p 'XXXXXXXXXXX' -- softwarechannel_listbasechannels;
null-rhel-x86_64-server-6-20121016
null-rhel-x86_64-server-6-20141001
null-rhel-x86_64-server-6-20141204
rbg-rhel-x86_64-server-6-20121016
rbg-rhel-x86_64-server-6-20141001
rbg-rhel-x86_64-server-6-20141204
rhel-x86_64-server-6
tst1-rhel-x86_64-server-6-20141204

$ sat-xmlrpc-delete_channel.pl --account "satadmin-rbg" --password 'XXXXXXXXXXX' --prefix "tst1-rhel-x86_64-server-6-20141204";
satellite xmlrpc authentication successful.
deleted "tst1-rhel-x86_64-server-6-20141204" child: "tst1-rhn-tools-rhel-x86_64-server-6-20141204"
deleted parent: "tst1-rhel-x86_64-server-6-20141204"
done.


Steps to Reproduce:
1. Use the "channel.software.clone" API for several huge base and their child software channels.
2. It never completes with a "500 read timeout".
3. Deleting the same failed clone fails as well, until you restart all processes.

Actual results:

Half-cloned Software Channel without any child channels.
Sometimes it even completes the base channel clone and only goes into timeout while cloning the first or second of its child channels.

Expected results:

Fully-cloned Software Channel with all child channels.

Additional info:

This exact same XMLRPC API script worked fine with Satellite v5.4.1
and RHEL6 base channel cloning when it was still below 12.0000 packages.

We have also tested 'spacewalk-clone-by-date', and this does complete successfully for the same base channel and its child channels.
It does take 90 minutes instead of 4 minutes 'though,
so they might query the API for some "background action for this channel API call has completed, you can continue now without killing the server process".

Comment 1 Daniel Peess 2015-09-10 12:40:38 UTC
Created attachment 1072162 [details]
channel.software.clone

XMLRPC API channel.software.clone script.

Comment 2 Daniel Peess 2015-09-10 12:41:18 UTC
Created attachment 1072165 [details]
channel.software.delete

Comment 3 Daniel Peess 2015-09-10 12:44:51 UTC
high timeouts and big memory alloc already set:

# grep -- '^Timeout' /etc/httpd/conf/httpd.conf;
Timeout 1800
# grep -- '<session-timeout' /var/lib/tomcat6/webapps/rhn/WEB-INF/web.xml;
    <session-timeout>1800</session-timeout>
# grep -- 'web.session_cookie_lifetime' /etc/rhn/rhn.conf;
web.session_cookie_lifetime = 1800
# grep -- 'web.session_database_lifetime' /etc/rhn/rhn.conf;
web.session_database_lifetime = 3600
# grep -i -- '^ProxyTimeout' /etc/httpd/conf.d/zz-spacewalk-www.conf;
ProxyTimeout 1800

# grep -- '^wrapper.java.initmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.initmemory=8192
# grep -- '^wrapper.java.maxmemory' /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf;
wrapper.java.maxmemory=16384
# grep -- '^JAVA_OPTS' /etc/sysconfig/tomcat6;
JAVA_OPTS="$JAVA_OPTS -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC"
# grep -- 'JAVA_OPTS' /etc/tomcat6/tomcat6.conf;
#JAVA_OPTS="-Xminf0.1 -Xmaxf0.3"
# Use JAVA_OPTS to set java.library.path for libtcnative.so
#JAVA_OPTS="-Djava.library.path=/usr/lib64"
JAVA_OPTS="-Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory"
# ps -ef | grep -i -- 'tomcat';
tomcat   42898     1 18 20:55 ?        00:00:11 /usr/lib/jvm/java/bin/java -Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory -ea -Xms8192m -Xmx16384m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=1024 -Dcom.ibm.tools.attach.enable=no -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC -classpath :/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar -Dcatalina.base=/usr/share/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat6/temp -Djava.util.logging.config.file=/usr/share/tomcat6/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
root     44434 58784  0 20:56 pts/1    00:00:00 grep -i -- tomcat

Comment 4 Daniel Peess 2015-09-10 13:07:14 UTC
embedded postgresql database not even that huge:
# du -msc /opt/rh/postgresql92/root/var/lib/pgsql/;
6456    /opt/rh/postgresql92/root/var/lib/pgsql/
6456    total