Bug 837913 - Satellite 541 ISE when attempting to change the OS base channel of a kickstart profile, which seems to fail on channels with NULL checksum values.
Status: CLOSED CURRENTRELEASE
Product: Spacewalk
Classification: Community
Component: WebUI
Version: 1.8
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: Stephen Herr
QA Contact: Red Hat Satellite QA List
Keywords: Patch
Depends On: 836610 837919
Blocks: space18
Reported: 2012-07-05 15:44 EDT by Stephen Herr
Modified: 2012-11-01 12:17 EDT

See Also:
Fixed In Version: spacewalk-java-1.8.105-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 836610
Environment:
Last Closed: 2012-11-01 12:17:44 EDT
Type: Bug
Attachments: None
Description Stephen Herr 2012-07-05 15:44:27 EDT
+++ This bug was initially created as a clone of Bug #836610 +++

Created attachment 595334 [details]
tomcat stack traces

Description of problem:

An upgraded 541 Satellite server with legacy RHEL 3 and 4 channels encounters an ISE when attempting to change a kickstart profile's Base Channel. The Tomcat log has:

Caused by: 
org.hibernate.PropertyAccessException: Exception occurred inside getter of com.redhat.rhn.domain.channel.ClonedChannel.checksumType


Version-Release number of selected component (if applicable):

Satellite 541

How reproducible:

Always

Steps to Reproduce:

1. Go to FQDN/rhn/kickstart/KickstartSoftwareEdit.do?ksid=62
2. Change the Base Channel to any different channel and click Update Kickstart.
3. ISE
  
Actual results:

An ISE will result with the above 'Caused by' message in catalina.out. Also see the attached harris-tomcat.log file for the full stack trace.

Expected results:

Should see "Kickstart Operating System selection successfully updated."

Additional info:

We determined that a checksum update temporarily resolves the issue, but that any subsequent satellite-sync reverts the update and the issue resurfaces:


# sqlplus $(spacewalk-cfg-get default_db)

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 27 08:50:20 2012

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select id||','||parent_channel||','||label||','||checksum_type_id from rhnchannel where checksum_type_id is null;

ID||','||PARENT_CHANNEL||','||LABEL||','||CHECKSUM_TYPE_ID
--------------------------------------------------------------------------------
101,,rhel-x86_64-ws-4,
102,101,rhn-tools-rhel-4-ws-x86_64,
103,,rhel-x86_64-as-3,
104,,rhel-x86_64-as-4,
105,,rhel-x86_64-es-3,
106,,rhel-x86_64-ws-3,
110,,rhel-i386-as-4,
111,110,rhn-tools-rhel-4-as-i386,
113,,rhel-i386-as-3,
114,,rhel-i386-es-3,
115,,rhel-i386-es-4,

ID||','||PARENT_CHANNEL||','||LABEL||','||CHECKSUM_TYPE_ID
--------------------------------------------------------------------------------
116,,rhel-x86_64-es-4,
117,113,rhn-tools-rhel-3-as-i386,
118,103,rhn-tools-rhel-3-as-x86_64,
119,114,rhn-tools-rhel-3-es-i386,
120,105,rhn-tools-rhel-3-es-x86_64,
121,106,rhn-tools-rhel-3-ws-x86_64,
122,104,rhn-tools-rhel-4-as-x86_64,
123,115,rhn-tools-rhel-4-es-i386,
124,116,rhn-tools-rhel-4-es-x86_64,
129,,rhel-i386-ws-3,
130,,rhel-i386-ws-4,

ID||','||PARENT_CHANNEL||','||LABEL||','||CHECKSUM_TYPE_ID
--------------------------------------------------------------------------------
131,129,rhn-tools-rhel-3-ws-i386,
132,130,rhn-tools-rhel-4-ws-i386,

24 rows selected.

SQL> update rhnchannel set checksum_type_id = 1 where checksum_type_id is null;

24 rows updated.

SQL> commit;

Commit complete.

SQL>

--- Additional comment from xdmoon@redhat.com on 2012-07-03 15:16:24 EDT ---

4 cases so far with 3 strategic accts, probable regression, flagging for GSS.

--- Additional comment from xdmoon@redhat.com on 2012-07-03 15:49:17 EDT ---

From the Satellite schema upgrade scripts, it looks like checksum_type_id was introduced in 5.4.0 and initially populated with the ID for 'sha1':

/etc/sysconfig/rhn/schema-upgrade/spacewalk-schema-0.5-to-spacewalk-schema-0.6/191-rhnChannel.sql 

-- Add a checksum_type_id column, and fk to rhnChecksumType
ALTER TABLE rhnChannel
  ADD checksum_type_id number
CONSTRAINT rhn_channel_checksum_fk
    REFERENCES rhnChecksumType(id);

-- Update any existing channels that are not set
UPDATE rhnChannel SET 
  checksum_type_id = (select id 
                        from rhnChecksumType 
                       where LABEL = 'sha1')
WHERE checksum_type_id is null;

show errors

----------
so unless it was changed again after that, the column should never be null.

Xixi

--- Additional comment from shughes@redhat.com on 2012-07-03 15:58:30 EDT ---

satellite-sync will grab what is on hosted and update the database, so it's possible that something has changed on the hosted side with the RHEL 3/4 channels.

--- Additional comment from shughes@redhat.com on 2012-07-03 16:24:31 EDT ---

Looking at ClonedChannel.java, I think we are hitting a recursion issue:

    public ChecksumType getChecksumType() {
        if (super.getChecksumType() == null) {
            // if the checksum type is not set use the
            //checksum of original channel instead.
            setChecksumType(getOriginal().getChecksumType());
        }
        return super.getChecksumType();
    }


If a cloned channel's checksum type is null, the getter looks up the original channel's checksum type. If that is null as well, we might get into a recursion, which could happen if the original channel's checksum type was ever set to null.

--- Additional comment from sherr@redhat.com on 2012-07-03 16:39:26 EDT ---

Unless I'm missing something, the only way the above function should end up being infinitely recursive is if:

1) the ClonedChannel is somehow its own original
2) there is a loop of some kind, e.g. CC1.original == CC2 and CC2.original == CC1

I would hope that neither of those could ever be true because they don't make any sense to me, but that seems to be what the traceback is pointing to. In my opinion the questions we should answer are:

1) is that the case?
2) how did it get that way?
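
One way to answer question 1 on an in-memory object graph is to walk the 'original' links and see whether any channel is visited twice. The following is only a minimal sketch: SketchChannel and OriginalChainCheck are hypothetical stand-ins invented for illustration, not the real com.redhat.rhn.domain.channel classes, and the real check would more likely be a query against the database.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a channel; only the fields needed for the check.
class SketchChannel {
    final String label;
    SketchChannel original; // null for a channel that is not a clone

    SketchChannel(String label) {
        this.label = label;
    }
}

class OriginalChainCheck {
    // Returns true if following 'original' links from start ever revisits a channel.
    // Covers both cases above: a channel that is its own original, and longer loops.
    static boolean hasCycle(SketchChannel start) {
        // SketchChannel does not override equals/hashCode, so this set compares by identity.
        Set<SketchChannel> seen = new HashSet<>();
        for (SketchChannel c = start; c != null; c = c.original) {
            if (!seen.add(c)) {
                return true;
            }
        }
        return false;
    }
}
```

A channel whose original is itself, or a pair CC1/CC2 pointing at each other, would both be reported as a cycle by this sketch, while an ordinary clone of a non-cloned channel would not.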

--- Additional comment from nigjones@redhat.com on 2012-07-03 16:43:33 EDT ---

I just attached ~6 cases to this Bugzilla, I believe most were opened before the Bugzilla was.

The first report of this issue seems to be from 20th June, which appears to coincide with the 6.3 release; the second was reported on 21st June.

Based on the 2nd ticket, it certainly appears to be on the Hosted side.

After applying GSS-provided SQL to fix the Red Hat distributed channels, the customer responded with:

---
I have manually set the checksum values in oracle as you suggested and it appears to have worked. I'm now able to change the kickstart trees for RHEL5 and 6.
---

Followed by the next day:

---
As I somewhat expected the overnight satellite sync has reverted all the checksum values back to what they were originally (most rhel3/4 channels having no checksum) and the kickstart trees are showing the internal server error again when you attempt to change the tree.
---

--- Additional comment from xdmoon@redhat.com on 2012-07-03 16:49:21 EDT ---

Based on sat-devel discussion, it sounds like checksum_type_id should be null for channels without yum repos (e.g., RHEL 3 and 4). The bug is in the application code, which doesn't handle that well, resulting in the error.

Note that KCS solution https://access.redhat.com/knowledge/solutions/47660 will need to be changed once we confirm this, since updating the checksum_type_id is only a workaround until the next sat-sync, and not the solution.

Xixi

--- Additional comment from shughes@redhat.com on 2012-07-03 17:30:12 EDT ---

Another observation is that our customer db does not have any cloned channels with a null checksum type, so I am not sure how we are even hitting the code block in comment #4, unless the Satellite code is incorrectly casting a Channel object to a ClonedChannel object and the getChecksumType method of ClonedChannel is getting executed.

sherr and I are running queries to see if we can isolate the issue further.

--- Additional comment from nigjones@redhat.com on 2012-07-03 22:53:49 EDT ---

Created attachment 596125 [details]
Potential patch to avoid StackOverflow's

Using an instrumented ClonedChannel.class I managed to get the following stacktraces while running:

2012-07-03 23:41:18,223 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - channel label is: rhel-i386-es-4
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:57)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: sun.reflect.GeneratedMethodAccessor340.invoke(Unknown Source)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: java.lang.reflect.Method.invoke(Method.java:611)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.property.BasicPropertyAccessor$BasicGetter.get(BasicPropertyAccessor.java:145)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.tuple.entity.AbstractEntityTuplizer.getPropertyValues(AbstractEntityTuplizer.java:256)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.tuple.entity.PojoEntityTuplizer.getPropertyValues(PojoEntityTuplizer.java:209)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.persister.entity.AbstractEntityPersister.getPropertyValues(AbstractEntityPersister.java:3580)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.event.def.DefaultFlushEntityEventListener.getValues(DefaultFlushEntityEventListener.java:167)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity(DefaultFlushEntityEventListener.java:120)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.event.def.AbstractFlushingEventListener.flushEntities(AbstractFlushingEventListener.java:196)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:76)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(DefaultAutoFlushEventListener.java:35)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.impl.SessionImpl.autoFlushIfRequired(SessionImpl.java:969)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.impl.SessionImpl.list(SessionImpl.java:1114)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.hibernate.impl.QueryImpl.list(QueryImpl.java:79)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.kickstart.KickstartFactory.lookupDefaultKickstartSessionForKickstartData(KickstartFactory.java:694)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileCommand.updateCobblerFields(CobblerProfileCommand.java:82)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileEditCommand.store(CobblerProfileEditCommand.java:59)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.action.kickstart.KickstartSoftwareEditAction.processFormValues(KickstartSoftwareEditAction.java:235)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.action.kickstart.BaseKickstartEditAction.execute(BaseKickstartEditAction.java:80)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:237)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.struts.RhnRequestProcessor.process(RhnRequestProcessor.java:99)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:432)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.servlets.AuthFilter.doFilter(AuthFilter.java:119)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:142)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:58)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:108)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:55)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:200)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: java.lang.Thread.run(Thread.java:736)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - parent channel is: rhel-i386-es-4
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - channel label is: rhel-i386-es-4
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:57)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:68)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: sun.reflect.GeneratedMethodAccessor340.invoke(Unknown Source)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: java.lang.reflect.Method.invoke(Method.java:611)
[...]
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: java.lang.Thread.run(Thread.java:736)
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - parent channel is: rhel-i386-es-4
2012-07-03 23:41:18,224 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - channel label is: rhel-i386-es-4
2012-07-03 23:41:18,225 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:57)
2012-07-03 23:41:18,225 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:68)
2012-07-03 23:41:18,225 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: com.redhat.rhn.domain.channel.ClonedChannel.getChecksumType(ClonedChannel.java:68)
2012-07-03 23:41:18,225 [TP-Processor2] DEBUG com.redhat.rhn.domain.channel.ClonedChannel - ST: sun.reflect.GeneratedMethodAccessor340.invoke(Unknown Source)


"channel label is:" was printed at the start of getChecksumType(), the stack trace at the start of the if() code block, and "parent channel is:" directly afterwards.

From the above I noticed that, instead of calling getChecksumType() from the original channel's Channel class, the code was actually calling ClonedChannel's getChecksumType() method. After discussing this with Mark Huth, I realised the obvious answer: in setOriginal(Channel originalIn), originalIn does not appear to be a plain Channel object but is in fact a ClonedChannel object, so the call is dispatched to the overridden (problematic) getChecksumType() again.

I'm unsure exactly how we get to this point, but it appears based on the stack traces, that during the Hibernate query in KickstartFactory.lookupDefaultKickstartSessionForKickstartData(KickstartFactory.java:694) which is:

    <query name="KickstartSession.findDefaultKickstartSessionForKickstartData">
        <![CDATA[from com.redhat.rhn.domain.kickstart.KickstartSession as t where
            t.ksdata = :ksdata and t.kickstartMode = :mode order by created desc ]]>
    </query>

Hibernate somehow triggers the ClonedChannel code (potentially due to foreign keys) and then runs into what would be an infinite loop.

What my patch does to address this: if the ClonedChannel's 'original' channel has the same label as the channel that getChecksumType() is currently handling, it accepts that the value may very well be null and proceeds without complaint.

I'm not 100% sure this is the best way to handle it, and it may mask worse problems, but the above is what I've come up with today.
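
The idea of the guard can be sketched roughly as follows. This is not the attached patch itself: ChannelSketch and ClonedChannelSketch are hypothetical, simplified stand-ins (the real classes are Hibernate-mapped and use a ChecksumType entity rather than a String), and the sketch assumes that comparing labels is enough to recognise a channel that is its own original.

```java
// Hypothetical, simplified stand-in for Channel.
class ChannelSketch {
    private final String label;
    private String checksumType; // may legitimately be null, e.g. channels without yum repos

    ChannelSketch(String label, String checksumType) {
        this.label = label;
        this.checksumType = checksumType;
    }

    String getLabel() {
        return label;
    }

    String getChecksumType() {
        return checksumType;
    }

    void setChecksumType(String checksumType) {
        this.checksumType = checksumType;
    }
}

// Hypothetical, simplified stand-in for ClonedChannel.
class ClonedChannelSketch extends ChannelSketch {
    private ChannelSketch original;

    ClonedChannelSketch(String label, String checksumType) {
        super(label, checksumType);
    }

    void setOriginal(ChannelSketch original) {
        this.original = original;
    }

    @Override
    String getChecksumType() {
        // Only fall back to the original when it is a different channel.
        // If the labels match, accept that the checksum type may be null.
        if (super.getChecksumType() == null
                && original != null
                && !getLabel().equals(original.getLabel())) {
            setChecksumType(original.getChecksumType());
        }
        return super.getChecksumType();
    }
}
```

With this guard, a channel that has itself as its original returns null instead of recursing. Note that the sketch does not protect against a longer loop such as CC1.original == CC2 and CC2.original == CC1, where the labels differ.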

--- Additional comment from sherr@redhat.com on 2012-07-05 12:16:17 EDT ---

This is very strange. I can find no reason that Hibernate is hydrating rhel-i386-es-4 as a ClonedChannel. There is no entry in rhnChannelCloned that would suggest that it is one, and there is no place in the Hibernate mappings or elsewhere that I can find that would lead to Hibernate mistaking it for a ClonedChannel.

You're right though, the root problem is that since the code thinks that rhel-i386-es-4 is a cloned channel and it has itself set as its own original channel, we get caught in an infinite recursion.

I think your patch is as reasonable a fix as we can get without addressing the root Hibernate issues, and we'd have to be able to find them before they can be addressed. :(
Comment 1 Stephen Herr 2012-07-05 15:54:39 EDT
Patch committed to Spacewalk master: 1e1aaa778a2c9e463b5df8288742bf8907411ccf
Comment 2 Jan Pazdziora 2012-10-30 15:22:44 EDT
Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/
Comment 3 Jan Pazdziora 2012-11-01 12:17:44 EDT
Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18