Bug 115066
Summary: | Multi-JVM caching for CMS does not work | ||
---|---|---|---|
Product: | [Retired] Red Hat Enterprise CMS | Reporter: | Jon Orris <jorris> |
Component: | other | Assignee: | Vadim Nasardinov <vnasardinov> |
Status: | CLOSED WONTFIX | QA Contact: | Jon Orris <jorris> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | nightly | CC: | vnasardinov |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-09-05 17:33:42 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Jon Orris
2004-02-06 00:12:41 UTC
> This suggests a test for ... state ... between cached objects. Hash
> code is not, and _should never be_, a function of mutable
> state. Hash code should always be computed against immutable
> variables.
"hashCode" should be computed against the same variables that "equals"
is computed against, in order to ensure that the implementations of
"hashCode" and "equals" are consistent with each other, as required by
these methods' contracts.
So, the question becomes, should "equals" be computed against mutable
variables? More often than not, the answer seems to be a resounding
yes.
|$ cat Main.java
|import java.awt.Point;
|
|public class Main {
| public final static void main(String[] args) {
| Point p1 = new Point(0, 0);
| Point p2 = new Point(1, 1);
| log("p1", p1);
| log("p2", p2);
| log (p1.equals(p2) ? "p1 equals p2" : "p1 does not equal p2");
| log("Mutating p2");
| p2.setLocation(0, 0);
| log("p2", p2);
| log (p1.equals(p2) ? "p1 equals p2" : "p1 does not equal p2");
| }
|
| private static void log(String name, Point p) {
| log("Point " + name + ": (x,y)=(" + p.x + "," +
| p.y + "); hash=" + p.hashCode());
|
| }
|
| private static void log(String msg) {
| System.out.println(msg);
| }
|}
|$ javac Main.java
|$ java -cp . Main
|Point p1: (x,y)=(0,0); hash=0
|Point p2: (x,y)=(1,1); hash=-2116026368
|p1 does not equal p2
|Mutating p2
|Point p2: (x,y)=(0,0); hash=0
|p1 equals p2
So, using hash codes the way CacheTable does it is not entirely
brain-dead. If two objects are equal, their hash codes are equal
(within the same JVM within the overlap between lifetimes of the
respective objects). Restated another way, if two objects have
non-matching hash codes, then we know for a fact that they are not
equal (within the same JVM).
So far, so good. Now, if you drop the caveat about the JVM being the
same, you arrive at the rationale for CacheTable's current behavior.
The question is, _can_ you drop the caveat? If you run identical JVM
versions across nodes, then you can, for a carefully selected subset
of objects. I'm 99.9% sure the above reasoning holds true for Strings.
Xref to a related earlier ticket: bug 111560. I both agree and disagree with Vadim's comment :) My original statement was, in a sense, unrelated to the problem at hand. On reflection, using hashCode as the cache value makes some sense after all, given the relation between the equals() and hashCode() contracts. This assumes, of course, that both are computed against a meaningful set of state parameters. (more on this in comment 4 ). The part that I disagree on, in a sense, is: >So, the question becomes, should "equals" be computed against mutable > variables? More often than not, the answer seems to be a > resounding yes. On one hand, yes, if the state parameters change, then equals should obviously change, as two objects will no longer be equal. Due to the contract of hashCode, it should change as well. This causes a problem , however. If the object is a key in a HashMap, changing it's hashCode will likely result in evil. This is why, among other reasons, a key Java idiom is to have everything be immutable unless it is absolutely neccessary for it to mutate state. In the example above, a Point class should be an immutable object, and a new one constructed instead of changing the existing object. This is really more a problem with Java, wherein important contracts conflict, and the only way they are maintained is by programmer discipline, not the language. The question, then, is what hashCode should we be using? For SiteNode, what is the meaning of it's state? Why is it being cached at all? What attributes of SiteNode should change to cause it to be flused from the cache? >If you run identical JVM versions across nodes, then you can, for a
>carefully selected subset of objects. I'm 99.9% sure the above
>reasoning holds true for Strings.
This is, I believe, true. In sun 1.4.2, at least, String.hashCode() is
a deterministic function of the characters in the String. I tested a
'solution' to this problem by having SiteNode.hashCode() call
getOID().toString().hashCode(), and similarly for ApplicationSpec.
This eliminated the 'constant flush' problem.
This isn't a full solution, at least for SiteNode, as OID represents
identity, not state. ApplicationSpec.toString(), though, should
represent state.
> This isn't a full solution, at least for SiteNode, as OID represents > identity, not state. I think it's fine for this to be a partial solution. As I pointed out in bug 111560, the main reason this thing kinda works is because it implements an LRU expulsion policy, which is independent of this whole hash-code hokeyness issue. I don't see a quick way to fix the flakiness of CacheTable. To quote Graydon Hoare http://post-office.corp.redhat.com/archives/java-project/2004-February/msg00027.html Message-ID: <4028217D.4080701> > overly-distributed programs usually lead to application programmers > trying to solve, in amongst their application code, research-grade > "hard" networking problems My conclusion: 1. The general problem is too hard to solve in this timeframe, so we won't do it. 2. Changing hashcode in SiteNode seems risky given the fact that SiteNodes might be hashed elsewhere. Moreover, I don't think the performance gain will be significant (and certainly doesn't exist on single JVM systems). 3. Since ApplicationSpec is a new class and isn't used anywhere else, we can go ahead and change hashCode on ApplicationSpec only. So, we're only going to change ApplicationSpec, unless anyone has strenuous objections... Reading through the comments I gather that the problem with SiteNode.hashCode() is that it varies across machines. Since we don't want to change the hashCode method on SiteNode, the obvious solution is to simply not store SiteNodes in the cache table directly. Create a then wrapper class that generates a hashcode based on the URL of the site node rather than its OID. Since URL is a string this should be stable acrosss JVMs. class SiteNodeWrapper { int m_hashcode; SiteNode m_node; SiteNodeWrapper(SiteNode node) { m_node =node; m_hashcode = node.getURL().hashcode(); } int hashcode() { return m_hashcode; } SiteNode getNode() { return m_node; } } This should enable us to fix richard's point 2. fairly simply, no ? If you implement hashCode() for SiteNodeWrapper, you also want to implement equals(). Otherwise, you'll pull half of your hair out trying to debug why you can't find entries that you know for a fact you put into CacheTable. Other than that, it looks like it'll work. Do we think it's worth implementing to reduce the number of cache flushes in multi-JVM situations? I think so. Remember that as it stands, the cache flushes will occur for every page hit, propagating to every node. This not only defeats caching altogether, but increases network traffic. Fixing SiteNode.hashCode() itself shouldn't be a problem, unless values are stored in the database. In that case, we'd have a much bigger MJVM problem which would require us to fix it anyhow. I'm not sure I understand the part about cache flushes occurring "for every page hit." To clariify, this is 'every page hit on a given node'. What's happening in my original ticket: 1) Go to content section in Node 1 2) Node 1 in web.PathMapCache fetches a site node & puts it in its cache. 3) Node 1 sends 'invalidate' messages to every other node. 4) Node 2-N, if it has the SiteNode, has a different hash code, and flushes it. 5) Goto content section in Node P, and the same 2-4 sequence occurs again. Ah, ok. So, under heavy load, we defeat our own caching efforts like so (I removed the term "site node" from your description, because I have trouble telling it apart from the notion of "node in a cluster"): 1) Go to content section in Node 1 2) Node 1 in web.PathMapCache fetches the content section and caches it. 3) Node 1 sends an 'invalidate' message to Node 2. 4) Node 2 evicts the content section, if it's been cached already. 5) Go to content section in Node 2. Since the content section has been evicted in step 4, Node 2 refetches and caches it. 6) Node 2 sends "invalidate" to Node 1. Lather, rinse, repeat. Is that correct? yes, that's it. Fixed on the trunk in changes 40430 and 40442. QA_READY has been deprecated in favor of ON_QA. Please use ON_QA in the future. Moving to ON_QA. Closing old tickets |