Bug 1017372
| Summary: | Increase permissions_validity_in_ms setting for storage node | |||
|---|---|---|---|---|
| Product: | [Other] RHQ Project | Reporter: | John Sanda <jsanda> | |
| Component: | Database | Assignee: | John Sanda <jsanda> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.9 | CC: | hrupp | |
| Target Milestone: | --- | |||
| Target Release: | RHQ 4.10 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1017432 (view as bug list) | Environment: | ||
| Last Closed: | 2014-04-23 12:32:18 UTC | Type: | Bug | |
| Bug Depends On: | ||||
| Bug Blocks: | 1017432 | |||

I committed the change to master. I set the timeout to 10 minutes though. I had been testing with 10 minutes, not 5.

master commit hash: d61b7ed441b25

Bulk closing of 4.10 issues.

If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.

Description of problem:

The storage node uses org.apache.cassandra.auth.CassandraAuthorizer for authorization checks. This imposes a non-trivial amount of overhead because authorization checks are now performed for each read/write request. To mitigate that overhead, a local cache of permissions is kept. The lifetime of a cache entry is set by the permissions_validity_in_ms property in cassandra.yaml, which defaults to two seconds.

When a node comes under heavy load, I have on several occasions seen read timeout exceptions, even on writes. This is because the authorization check very frequently has to query the system_auth.permissions table once cached entries expire. The exceptions in rhq-storage.log look like this:

ERROR [Native-Transport-Requests:1101] 2013-10-07 14:06:52,730 ErrorMessage.java (line 210) Unexpected exception during request
com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
    at org.apache.cassandra.service.ClientState.authorize(ClientState.java:290)
    at org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:170)
    at org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:163)
    at org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:147)
    at org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:67)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:100)
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:223)
    at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:121)
    at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:287)

I want to set permissions_validity_in_ms to five minutes, which substantially reduces the overhead of the authorization checks but still keeps the permissions from getting too stale.
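
For illustration, the relevant cassandra.yaml entries might look like the fragment below. This is only a sketch, not the committed change itself: the value shown is 10 minutes converted to milliseconds, following the comment that the change on master used 10 minutes rather than the five proposed in the description. The layout of the storage node's actual cassandra.yaml is assumed, and surrounding settings are omitted.

```yaml
# Sketch of the relevant cassandra.yaml entries (surrounding settings omitted).
authorizer: org.apache.cassandra.auth.CassandraAuthorizer

# Lifetime of a cached permissions entry, in milliseconds.
# The default is 2000 (two seconds).
#   5 minutes  = 5 * 60 * 1000  = 300000
#   10 minutes = 10 * 60 * 1000 = 600000
permissions_validity_in_ms: 600000
```

A longer validity period means fewer reads of system_auth.permissions under load, at the cost of permission changes taking up to that long to be seen by a node.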