Bug 828504
| Summary: | State transfer taking too long on node join | | |
|---|---|---|---|
| Product: | [JBoss] JBoss Data Grid 6 | Reporter: | Michal Linhard <mlinhard> |
| Component: | Infinispan | Assignee: | Tristan Tarrant <ttarrant> |
| Status: | CLOSED WORKSFORME | QA Contact: | Michal Linhard <mlinhard> |
| Severity: | medium | Priority: | medium |
| Version: | 6.0.0 | CC: | dberinde, jdg-bugs, myarboro, nobody |
| Hardware: | Unspecified | OS: | Unspecified |
| Doc Type: | Bug Fix | Type: | Bug |
| Last Closed: | 2012-06-06 13:12:02 UTC | | |
Description
Michal Linhard
2012-06-04 19:56:56 UTC
Remember the threading changes in the default ER11 configuration. What configuration are you using here?
Indeed, it looks very similar: the cache entries seem to take a very long time to get to the 32nd node.
Michal, could you schedule another run with stateTransfer.chunkSize="1000"? (The default is 10000.)
If possible, also check out branch `t_uuperf_30` from `git:danberindei/JGroups.git` and run the UUPerf test on 32 nodes with the jgroups-udp.xml configuration from our 5.1.x branch (which should be the same as the JDG default).
Assuming you have copied jgroups-udp.xml into the JGroups directory and have already set the IP_ADDR and NODE_NAME environment variables:
JG=. bin/jgroups.sh -Djgroups.bind_addr=${IP_ADDR} org.jgroups.tests.perf.UUPerf -props jgroups-udp.xml -name ${NODE_NAME}
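As a minimal sketch of the invocation above, the following shell helper only prints the UUPerf command line for the current node, so the values can be checked on each of the 32 nodes before a real run. The function name `uuperf_cmd` is hypothetical; the command itself is the one quoted above, and the helper assumes IP_ADDR and NODE_NAME are set as described.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): prints the UUPerf command line
# for this node instead of executing it.
uuperf_cmd() {
    # Abort with a message if either variable is unset or empty.
    : "${IP_ADDR:?IP_ADDR must be set to the bind address of this node}"
    : "${NODE_NAME:?NODE_NAME must be set to the logical name of this node}"

    # Same command as in the comment above, with the two variables expanded.
    printf 'JG=. bin/jgroups.sh -Djgroups.bind_addr=%s org.jgroups.tests.perf.UUPerf -props jgroups-udp.xml -name %s\n' \
        "$IP_ADDR" "$NODE_NAME"
}
```

With IP_ADDR and NODE_NAME exported, calling `uuperf_cmd` prints the command with both values filled in; it does not start the test itself.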
I'll put it in my test queue in hyperion.
I just tried again: http://www.qa.jboss.com/~mlinhard/hyperion/run160-elas-dist-32-ER11-partial/logs/analysis/views.html and couldn't replicate it. I'm going to do two more tests, plus the tests proposed by Dan.
I'd lower the priority of this, though. It doesn't seem to happen often, and the only consequence is lowered performance during a view change.
Other runs where I didn't reproduce this:
partial elasticity test 30->32: http://www.qa.jboss.com/~mlinhard/hyperion/run161-elas-dist-32-ER11-partial/logs/analysis/views.html
full elasticity test 16->32->16: http://www.qa.jboss.com/~mlinhard/hyperion/run162-elas-dist-32-ER11/logs/analysis/views.html
Dan, I'll move the further tests to the back of the hyperion test queue.
Prabhat concludes: Can't reproduce in hyperion lab.