Bug 1388160 - Satellite 5.7: Deadlock with system.obtainReactivationKey API call
Summary: Satellite 5.7: Deadlock with system.obtainReactivationKey API call
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 5
Classification: Red Hat
Component: API
Version: 570
Hardware: Unspecified
OS: Unspecified
high
medium
Target Milestone: ---
Assignee: Grant Gainey
QA Contact: Radovan Drazny
URL:
Whiteboard:
Depends On: 1385099 1400108
Blocks: sat5-errata
TreeView+ depends on / blocked
 
Reported: 2016-10-24 15:40 UTC by Grant Gainey
Modified: 2017-02-01 15:55 UTC (History)
7 users (show)

Fixed In Version: spacewalk-java-2.3.8-158-sat
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1385099
Environment:
Last Closed: 2017-02-01 15:55:39 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Main deadlock script (517 bytes, application/x-shellscript)
2016-12-14 09:33 UTC, Radovan Drazny
no flags Details
Reactivation script (2.03 KB, text/x-python)
2016-12-14 09:33 UTC, Radovan Drazny
no flags Details
Updatechannel script (1.39 KB, text/x-python)
2016-12-14 09:34 UTC, Radovan Drazny
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0236 0 normal SHIPPED_LIVE Satellite 5.7 bug fix update 2017-02-01 20:55:02 UTC

Description Grant Gainey 2016-10-24 15:40:40 UTC
+++ This bug was initially created as a clone of Bug #1385099 +++

Description of problem:

Hitting deadlock with calls to the system.obtainReactivationKey API call and the rhnRegToken table:

2016-10-14 17:17:32,475] ERROR - REQUESTED FROM: 192.168.100.2 CALL: system.obtainReactivationKey(832x56d180cb3a6dbedc9e08b3e2e1b4a7f9, 1000010167) CALLER: (admin) TIME: 9.264 seconds
redstone.xmlrpc.XmlRpcFault: unhandled internal exception: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

        at com.redhat.rhn.frontend.xmlrpc.BaseHandler.invoke(BaseHandler.java:173)
        at redstone.xmlrpc.XmlRpcDispatcher.dispatch(XmlRpcDispatcher.java:123)
        at com.redhat.rhn.frontend.xmlrpc.RhnXmlRpcServer.execute(RhnXmlRpcServer.java:54)
        at com.redhat.rhn.frontend.xmlrpc.XmlRpcServlet.doPost(XmlRpcServlet.java:162)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:100)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)


Problematic code looks to be potentially in Hibernate where we need to add a lock on the factory/table from the SystemHandler API code: 

List<ActivationKey> existingKeys = ActivationKeyFactory.lookupByServer(server);
        for (ActivationKey key : existingKeys) {
            ActivationKeyFactory.removeKey(key);
        }

Took a while to figure out how to reproduce but will outline the steps below and attach the scripts used: 

1) reactivation.py is used to generate reactivation keys for clients calling in and is placed in the /var/www/cgi-bin directory of Satellite
2) updatechannels.py is used to update base channel of the client calling in and is also placed in the /var/www/cgi-bin directory of the Satellite
3) deadlock.sh is placed on the client and is the main driver to create the deadlock situation. 

Procedure is as follows: 

1) Kickstart 3+ clients to rhel 6u8. Place the deadlock.sh script on the clients and update the satellite url inside script. 
2) modify the two python cgi scripts to contain the correct user/pass information. 
3) create a activation key to assign a clone base channel to client for the main driver deadlock.sh script and name "foo" or rename the ACTIVATION_KEYS variable in deadlock.sh to your new name. 
4) run the deadlock.sh script on your clients and tail -f /var/log/rhn/rhn_web_api.log. 

I started seeing the deadlock about 20 times after the 2nd client script started looping.


--- Additional comment from Shannon Hughes on 2016-10-14 14:28:17 EDT ---

ggainey and I looked at this one some yesterday so he is familiar with the issue. I have not tried on embedded postgresql database but would think we have the same issue. Using external Oracle DB like the customer is using. 

[root@ ~]# rpm -q spacewalk-java
spacewalk-java-2.0.2-112.el5sat
[root@ ~]# rpm -q satellite-schema
satellite-schema-5.6.0.31-1.el5sat
[root@ ~]# rpm -q spacewalk-schema
spacewalk-schema-2.0.2-25.el5sat


--- Additional comment from Shannon Hughes on 2016-10-17 14:25:11 EDT ---

made a couple attempts at duplicating deadlock using same scripts on satellite 5.7 against postgresql and could not generate the deadlock using same procedure. plan on trying satellite 5.6 against postgresql.

--- Additional comment from Shannon Hughes on 2016-10-17 18:23:14 EDT ---

tried duplicating on sat 5.6 with embedded postgresql db but did not experience any deadlock. 

[root@ ~]# rpm -q spacewalk-java
spacewalk-java-2.0.2-112.el6sat.noarch
[root@ ~]# rpm -q satellite-schema
satellite-schema-5.6.0.31-1.el6sat.noarch
[root@ ~]# rpm -q spacewalk-schema
spacewalk-schema-2.0.2-25.el6sat.noarch

--- Additional comment from Shannon Hughes on 2016-10-19 13:55:07 EDT ---

attached the oracle trace file but looks like the table issue is rhnRegToken, 

----- Information for the OTHER waiting sessions -----
Session 172:
  sid: 172 ser: 23793 audsid: 211720 user: 87/SHUGHESDBA
    flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
    flags2: (0x40008) -/-
  pid: 45 O/S info: user: oracle, term: UNKNOWN, ospid: 22295

  client details:
    O/S info: user: tomcat, term: , ospid: 3101


  current SQL:
  delete from rhnRegToken where id=:1

----- End of information for the OTHER waiting sessions -----

Information for THIS session:

----- Current SQL Statement for this session (sql_id=8n5sg3ka24t05) -----
delete from rhnRegToken where id=:1

--- Additional comment from Shannon Hughes on 2016-10-20 09:53:49 EDT ---

ran almost 50k client registrations overnight using three clients in an infinite loop using the potential fix ggainey and I applied to my reproducer. We had no deadlocks. Going to run some regression on kickstarts and new clients since this area uses the same activation key calls.

--- Additional comment from Shannon Hughes on 2016-10-20 16:22:59 EDT ---

regression testing is showing we are hitting the deadlocks that were fixed recently in satellite 5.7, 

https://rhn.redhat.com/errata/RHBA-2016-1645.html
https://bugzilla.redhat.com/show_bug.cgi?id=1179770

ggainey and I are under discussion for next steps.

Comment 1 Grant Gainey 2016-10-24 15:42:03 UTC
Cloned so we can get this fix into 5.7 as well (although we have yet to encounter it there...)

Comment 4 Grant Gainey 2016-11-30 20:40:48 UTC
spacewalk.github:
d11976e6ad0f2a6206fc8667fbf6e40d9d578701

Comment 7 Radovan Drazny 2016-12-14 09:07:25 UTC
I have modified testing scripts from the bug #1385099 a bit, to run them all from a client, without a web server, but basically they are still the same scripts. Surprisingly enough, I haven't encouter any deadlocks during the system.obtainReactivationKey API calls, even pre-update. What I did encounter, unfortunately, is a lot of deadlocks in system.setChildChannels (preupdate) and system.setBaseChannel (postupdate) calls. 

Preupdate (spacewalk-java-2.3.8-157):

Deadlock starts to occur with 3 clients running in parallel. Deadlocks are not too frequents, there is one approximately every 20-30 loops, distributed randomly between all three clients. Adding more clients to the mix doesn't increase number of deadlocks significantly. There is following traceback in the rhn_web_api.log

[2016-12-13 06:06:48,578] ERROR - REQUESTED FROM: 10.19.41.84 CALL: system.setBaseChannel(admin, 1000010002, shell-rhel-x86_64-server-6) CALLER: (admin) TIME: 16.268 seconds
redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "RHNSAT.RHN_CHANNEL", line 672
ORA-06512: at "RHNSAT.RHN_CHANNEL", line 512
ORA-06512: at line 1

        at com.redhat.rhn.frontend.xmlrpc.BaseHandler.invoke(BaseHandler.java:195)
        at redstone.xmlrpc.XmlRpcDispatcher.dispatch(XmlRpcDispatcher.java:123)
        at com.redhat.rhn.frontend.xmlrpc.RhnXmlRpcServer.execute(RhnXmlRpcServer.java:54)
        at com.redhat.rhn.frontend.xmlrpc.XmlRpcServlet.doPost(XmlRpcServlet.java:162)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:100)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:299)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
        at java.lang.Thread.run(Thread.java:809)
	

Postupdate (spacewalk-java-2.3.8-158):

Situation is much worse here. Two parallel clients are OK, three clients generate deadlocks with similar frequency as at the pre-update state, with four and more clients all hell breaks loose, and there is a deadlock or an ISE almost on every second call. Rhn_web_api.log contains following traceback for every deadlock:

[2016-12-14 03:57:33,546] ERROR - REQUESTED FROM: 2620:52:0:102f:5054:1ff:fe2c:e131 CALL: system.setChildChannels(admin, 1000010003, [clone-1-rhn-tools-rhel-x86_64-server-6, clone-2-rhn-tools-rhel-x86_64-server-6]) CALLER: (admin) TIME:
 3.206 seconds
redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "RHNSAT.LOCK_COUNTS", line 14
ORA-06512: at "RHNSAT.RHN_CHANNEL", line 95
ORA-06512: at line 1

        at com.redhat.rhn.frontend.xmlrpc.BaseHandler.invoke(BaseHandler.java:195)
        at redstone.xmlrpc.XmlRpcDispatcher.dispatch(XmlRpcDispatcher.java:123)
        at com.redhat.rhn.frontend.xmlrpc.RhnXmlRpcServer.execute(RhnXmlRpcServer.java:54)
        at com.redhat.rhn.frontend.xmlrpc.XmlRpcServlet.doPost(XmlRpcServlet.java:162)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:100)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:299)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
        at java.lang.Thread.run(Thread.java:809)

Output on a client:

ACT KEY: 1-shell,re-1-e2e201ffe3e491daaa6b83eda180a116.
Traceback (most recent call last):
  File "updatechannels.py", line 51, in <module>
    main()
  File "updatechannels.py", line 47, in main
    client.system.setChildChannels(key, id, ['clone-1-rhn-tools-rhel-x86_64-server-6', 'clone-2-rhn-tools-rhel-x86_64-server-6'])
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource\nORA-06512: at "RHNSAT.LOCK_COUNTS", line 14\nORA-06512: at "RHNSAT.RHN_CHANNEL", line 95\nORA-06512: at line 1\n'>
ACT KEY: 1-shell,re-1-e95e137475ad1efc2e0dc201d70e15ca.
rhn-plugin: Error communicating with server. The message was:
Internal Server Error
Traceback (most recent call last):
  File "updatechannels.py", line 51, in <module>
    main()
  File "updatechannels.py", line 47, in main
    client.system.setChildChannels(key, id, ['clone-1-rhn-tools-rhel-x86_64-server-6', 'clone-2-rhn-tools-rhel-x86_64-server-6'])
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource\nORA-06512: at "RHNSAT.LOCK_COUNTS", line 14\nORA-06512: at "RHNSAT.RHN_CHANNEL", line 95\nORA-06512: at line 1\n'>
ACT KEY: 1-shell,re-1-7ae91d3e32f677d4bef5c71948acb5dd.
rhn-plugin: Error communicating with server. The message was:
Internal Server Error
ACT KEY: 1-shell,re-1-dd3b3292a114728eb467e02cfbd5b25d.
Traceback (most recent call last):
  File "updatechannels.py", line 51, in <module>
    main()
  File "updatechannels.py", line 46, in main
    client.system.setBaseChannel(key, id, newbase)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource\nORA-06512: at "RHNSAT.LOCK_COUNTS", line 14\nORA-06512: at "RHNSAT.RHN_CHANNEL", line 95\nORA-06512: at line 1\n'>
ACT KEY: 1-shell,re-1-44edf05e422b08b637a488f35db51ec1.


FailedQA

Comment 8 Radovan Drazny 2016-12-14 09:33:17 UTC
Created attachment 1231549 [details]
Main deadlock script

Comment 9 Radovan Drazny 2016-12-14 09:33:55 UTC
Created attachment 1231550 [details]
Reactivation script

Comment 10 Radovan Drazny 2016-12-14 09:34:29 UTC
Created attachment 1231551 [details]
Updatechannel script

Comment 11 Grant Gainey 2016-12-14 13:08:14 UTC
Hey Radovan - did your test system have the fix for

  https://bugzilla.redhat.com/show_bug.cgi?id=1402933

applied? There's a problem in the Oracle version of lock_counts, which may be the source of the deadlocks you're seeing here.

If that fix *is* applied, then we definitely have a problem.

Comment 12 Radovan Drazny 2016-12-14 19:23:17 UTC
# spacewalk-schema-upgrade 
Schema upgrade: [satellite-schema-5.7.0.26-1.el6sat] -> [satellite-schema-5.7.0.26-1.el6sat]
Your database schema already matches the schema package version [satellite-schema-5.7.0.26-1.el6sat].

The fix from BZ #1402933 is applied.

Comment 14 Grant Gainey 2016-12-16 20:37:33 UTC
The problems in this test-run are due to lock_count() not being called on the *unsubscribe_server* path. That fix is more properly an extension of 1402933; fix-commits can be found there. With '2933 and '8160 in place, ran the reproducer scripts on 4 clients for 3 hours, with zero deadlocks.

Comment 15 Radovan Drazny 2016-12-19 11:27:38 UTC
Well, I'm sorry to say this, but there is still a problem. 

Schema upgrade is applied to the DB:

# spacewalk-schema-upgrade 
Schema upgrade: [satellite-schema-5.7.0.27-1.el6sat] -> [satellite-schema-5.7.0.27-1.el6sat]
Your database schema already matches the schema package version [satellite-schema-5.7.0.27-1.el6sat].



When running three and more clients with the reproducer scripts from https://bugzilla.redhat.com/show_bug.cgi?id=1388160, I still see following deadlocks in rhn_web_api.log:

[2016-12-19 06:00:04,366] ERROR - REQUESTED FROM: 2620:52:0:102f:5054:1ff:fe2c:e12a CALL: system.setBaseChannel(admin, 1000010002, shell-rhel-x86_64-server-6) CALLER: (admin) TIME: 6.767 seconds
redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "RHNSAT.LOCK_COUNTS", line 13
ORA-06512: at "RHNSAT.RHN_CHANNEL", line 95
ORA-06512: at line 1

        at com.redhat.rhn.frontend.xmlrpc.BaseHandler.invoke(BaseHandler.java:195)
        at redstone.xmlrpc.XmlRpcDispatcher.dispatch(XmlRpcDispatcher.java:123)
        at com.redhat.rhn.frontend.xmlrpc.RhnXmlRpcServer.execute(RhnXmlRpcServer.java:54)
        at com.redhat.rhn.frontend.xmlrpc.XmlRpcServlet.doPost(XmlRpcServlet.java:162)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:100)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:299)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
        at java.lang.Thread.run(Thread.java:809)
<...>

[2016-12-19 05:59:51,826] ERROR - REQUESTED FROM: 2620:52:0:1329:216:3eff:fe4d:3fd0 CALL: system.setChildChannels(admin, 1000010004, [clone-1-rhn-tools-rhel-x86_64-server-6, clone-2-rhn-tools-rhel-x86_64-server-6]) CALLER: (admin) TIME: 11.533 seconds
redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "RHNSAT.LOCK_COUNTS", line 13
ORA-06512: at "RHNSAT.RHN_CHANNEL", line 95
ORA-06512: at line 1

        at com.redhat.rhn.frontend.xmlrpc.BaseHandler.invoke(BaseHandler.java:195)
        at redstone.xmlrpc.XmlRpcDispatcher.dispatch(XmlRpcDispatcher.java:123)
        at com.redhat.rhn.frontend.xmlrpc.RhnXmlRpcServer.execute(RhnXmlRpcServer.java:54)
        at com.redhat.rhn.frontend.xmlrpc.XmlRpcServlet.doPost(XmlRpcServlet.java:162)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
        at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:100)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:299)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
        at java.lang.Thread.run(Thread.java:809)

...and deadlocks are reported on client machines as well:

Traceback (most recent call last):
  File "updatechannels.py", line 51, in <module>
    main()
  File "updatechannels.py", line 47, in main
    client.system.setChildChannels(key, id, ['clone-1-rhn-tools-rhel-x86_64-server-6', 'clone-2-rhn-tools-rhel-x86_64-server-6'])
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource\nORA-06512: at "RHNSAT.LOCK_COUNTS", line 13\nORA-06512: at "RHNSAT.RHN_CHANNEL", line 95\nORA-06512: at line 1\n'>
ACT KEY: 1-shell,re-1-63b7f463f694959a8d6b3386d130c0d7.
Traceback (most recent call last):
  File "updatechannels.py", line 51, in <module>
    main()
  File "updatechannels.py", line 46, in main
    client.system.setBaseChannel(key, id, newbase)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: ORA-00060: deadlock detected while waiting for resource\nORA-06512: at "RHNSAT.LOCK_COUNTS", line 13\nORA-06512: at "RHNSAT.RHN_CHANNEL", line 95\nORA-06512: at line 1\n'>


Considering that Grant was using same reproducer script as I did, and he didn't encounter any errors, I'm inclined to believe that some part of the fix didn't make it into the final satellite-schema rpm.

Comment 17 Radovan Drazny 2017-01-04 10:40:59 UTC
Ok, after retest, it appears there was a problem with original test server, and fix works actually. After running the reproducer script with a new server and a new set of four concurrent clients, there were no deadlocks in more than 12 hours run. 

VERIFIED

Comment 19 errata-xmlrpc 2017-02-01 15:55:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0236.html


Note You need to log in before you can comment on or make changes to this bug.