Move Remote Querying via Hot Rod Java client from Tech Preview to Full Support.
The following features should be supported:
1) Keyword and range queries
2) Wildcard queries
3) Combining queries: 'and', 'not', 'or' operations
4) Sorting, filtering, and pagination of results
Configuration should be easy to use (minimal number of configuration steps and dependencies), and performance should be comparable to Embedded Querying (within 15%).
Partial Commit (see comments)
Infinispan Engineering Team
Note: the points below apply to both embedded and remote query DSL, both being supported.
1) Keyword and range queries - YES
2) Wildcard queries - we can support the "like" statement
3) Combining queries: 'and', 'not', 'or' operations - YES
4) Sorting, filtering, and pagination of results - YES for sorting and filtering
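As an illustration of item 2, the sketch below shows how a "like" pattern can cover wildcard matching. It assumes SQL/JPA-style wildcards ('%' for any sequence of characters, '_' for exactly one character). The class and method names are hypothetical and are not part of the Infinispan API; the server-side implementation is expected to translate to a Lucene query rather than a regular expression.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of the Infinispan API. */
final class LikePatternSketch {

    /** Translates a "like" pattern (assumed '%' and '_' wildcards) into a regex. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");   // '%' matches any sequence of characters
            } else if (c == '_') {
                regex.append('.');    // '_' matches exactly one character
            } else {
                // Everything else is matched literally
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns true if the whole value matches the "like" pattern. */
    static boolean matches(String likePattern, String value) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Infini%", "Infinispan")); // true
        System.out.println(matches("H_t Rod", "Hot Rod"));    // true
        System.out.println(matches("%.java", "Query-java"));  // false: '.' is literal
    }
}
```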
Regarding the pagination of results: we support pagination, but without stitching. A user can execute a query specifying a page size and request only the 5th page of the result set, but to get the 6th page the query must be re-executed with a different page number. In other words, we don't support executing the query once and then fetching pages on demand from the server.
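The pagination behaviour described above can be sketched with a small in-memory model. This is only a simulation under stated assumptions: the class PagedQuerySketch, its 1-based page numbers, and the "value >= minValue" query are all hypothetical and stand in for a real query executed on the server. The point is that every page request runs the whole query again.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model of pagination without stitching. */
final class PagedQuerySketch {
    private final List<Integer> data;
    private int executions = 0;

    PagedQuerySketch(List<Integer> data) {
        this.data = new ArrayList<>(data);
    }

    /** Runs the full query (filter + sort) and returns only the requested page. */
    List<Integer> page(int minValue, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        executions++; // no state is kept between calls, so each page is a new execution
        List<Integer> matches = new ArrayList<>();
        for (int value : data) {
            if (value >= minValue) {
                matches.add(value);
            }
        }
        Collections.sort(matches);
        int from = Math.min((pageNumber - 1) * pageSize, matches.size());
        int to = Math.min(from + pageSize, matches.size());
        return new ArrayList<>(matches.subList(from, to));
    }

    int executions() {
        return executions;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 30; i >= 1; i--) {
            data.add(i);
        }
        PagedQuerySketch query = new PagedQuerySketch(data);
        System.out.println(query.page(1, 5, 5)); // [21, 22, 23, 24, 25]
        System.out.println(query.page(1, 6, 5)); // [26, 27, 28, 29, 30]
        System.out.println(query.executions());  // 2
    }
}
```

Fetching pages 5 and 6 costs two full executions in this model, which is the trade-off the paragraph above describes.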
B. Ease-of-use in configuration
The following items cause a bad user experience and are fixable within the scope of JDG 6.4 (Dec '14).
- Select which protobuf fields are to be indexed (ISPN-3718). This allows for customized, partial indexing.
- Automatically set up shared indexing when indexing is enabled (ISPN-4340). By default, the user is no longer required to configure indexing, which is currently a very laborious procedure.
- Use a specific cache for managing protofiles (ISPN-4357). This would replace the current JMX approach and make schema administration much easier.
- Use the protoparser library instead of relying on binary descriptors generated by protoc (ISPN-3480). Compiling protofiles and uploading binary content through JMX is no longer needed.
The execution of a remote query consists of the execution of an embedded query, to which the following steps are added:
- a round-trip RPC over the Hot Rod client
- query string parsing and conversion to a Lucene query
We estimate that these steps add a fixed amount of time to every query. However, the second step is complex, relies on external libraries (ANTLR), and hasn't been properly tested, especially under load. Starting performance testing in the early stages of the 6.4 release cycle (July) is critical in order to detect and fix any third-party performance issues.
The 15% performance guarantee the PRD mentions can't be made without this testing.
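To make the 15% target concrete: if the remote steps really are a fixed time addition, the relative overhead depends on how long the embedded query itself takes. The sketch below works through that arithmetic. The class name and all timings are made-up illustration values, not measurements.

```java
/** Hypothetical arithmetic helper; the timings used below are invented examples. */
final class OverheadSketch {
    /** Target from the PRD: remote querying within 15% of embedded querying. */
    static final double MAX_RELATIVE_OVERHEAD = 0.15;

    /** Fixed remote overhead expressed as a fraction of the embedded query time. */
    static double relativeOverhead(double embeddedMillis, double fixedOverheadMillis) {
        if (embeddedMillis <= 0) {
            throw new IllegalArgumentException("embeddedMillis must be positive");
        }
        return fixedOverheadMillis / embeddedMillis;
    }

    /** True if the remote query stays within the 15% target. */
    static boolean withinTarget(double embeddedMillis, double fixedOverheadMillis) {
        return relativeOverhead(embeddedMillis, fixedOverheadMillis) <= MAX_RELATIVE_OVERHEAD;
    }

    public static void main(String[] args) {
        // Assumed 2 ms of fixed overhead (RPC round trip + parsing + conversion)
        System.out.println(withinTarget(10.0, 2.0)); // false: 2 / 10 = 20%
        System.out.println(withinTarget(20.0, 2.0)); // true:  2 / 20 = 10%
    }
}
```

Under this model the same fixed cost passes the target for slow queries and fails it for fast ones, so the measured size of the fixed cost matters.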
D. Limitations and risks
- During execution, the full result of a query ends up on a single node at a point in time (there is no streaming of query results). If the query selects a large amount of data, this can cause an OOM on either the server or the client. It would be good to ask the field whether this is an acceptable limitation.
- Having performance testing running early in the 6.4 timeline is critical in order to give us time to fix any third-party issues found.