Interface | Description |
---|---|
CharStream | This interface describes a character stream that maintains line and column number positions of the characters. |
KeywordSyntaxParserConstants | Token literal values and constants. |

Class | Description |
---|---|
KeywordQueryParser | Helper class that makes it easy to create a boolean combination of twig queries. |
KeywordSyntaxParser | Parser for the standard Lucene syntax. |
KeywordSyntaxParserTokenManager | Token Manager. |
Token | Describes the input token stream. |

Exception | Description |
---|---|
ParseException | This exception is thrown when parse errors are encountered. |

Error | Description |
---|---|
TokenMgrError | Token Manager Error. |

A keyword query can be either a twig expression that represents a TwigQuery, or a boolean expression that represents a BooleanQuery.

The syntax allows a custom datatype to be applied to any part of the query.

A twig expression is defined by the special character ':'. For example, the following query

    a : b

matches documents where the term "a" appears in one of the top-level nodes (i.e., one of the top-level field names) and the term "b" appears in one of its child nodes (i.e., one of the values of the field).

    +---+
    | a |
    +-+-+
      |
    +-+-+
    | b |
    +---+

In a twig query, the child clause has a NodeBooleanClause.Occur.MUST occurrence by default.

Multiple children can be associated with the same top-level node (i.e., one of the top-level field names) by using a JSON-like array syntax:

    a : [ b , c ]

Here the term "a" appears in one of the top-level nodes, the term "b" appears in one of its child nodes (i.e., one of the values of the field), and the term "c" appears in a second child node.

         +---+
         | a |
         +-+-+
           |
      +----+----+
    +-+-+     +-+-+
    | b |     | c |
    +---+     +---+

A JSON-like object syntax is also supported. For example, the following query

    { a : b , c : d }

queries for nested objects in a JSON document. This is syntactic sugar for

    * : [ a : b , c : d ]

where the twigs a : b and c : d become children of a parent with an empty top-level node.

         +---+
         | * |
         +-+-+
           |
      +----+----+
    +-+-+     +-+-+
    | a |     | c |
    +---+     +---+
      |         |
    +-+-+     +-+-+
    | b |     | d |
    +---+     +---+

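The desugaring of the object syntax can be pictured with a small tree model. The sketch below is only an illustration: `TwigSketch`, `Node`, `object`, and `render` are hypothetical names that are not part of this package, and the rendering covers only the twig, array, and wildcard forms described above.

```java
import java.util.ArrayList;
import java.util.List;

public class TwigSketch {
    /** Hypothetical node: a label plus child twigs (not a class of this package). */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        /** Adds a child and returns this node so calls can be chained. */
        Node child(Node c) { children.add(c); return this; }
    }

    /** Object syntax sugar: { t1 , t2 } becomes a wildcard root over the given twigs. */
    static Node object(Node... twigs) {
        Node root = new Node("*");
        for (Node t : twigs) root.child(t);
        return root;
    }

    /** Renders a node back into the keyword syntax. */
    static String render(Node n) {
        if (n.children.isEmpty()) return n.label;
        if (n.children.size() == 1) return n.label + " : " + render(n.children.get(0));
        List<String> parts = new ArrayList<>();
        for (Node c : n.children) parts.add(render(c));
        return n.label + " : [ " + String.join(" , ", parts) + " ]";
    }

    public static void main(String[] args) {
        // Model of the query { a : b , c : d }
        Node q = object(new Node("a").child(new Node("b")),
                        new Node("c").child(new Node("d")));
        System.out.println(render(q)); // * : [ a : b , c : d ]
    }
}
```

The model keeps the wildcard as an ordinary label, which is enough to show the shape of the tree; it says nothing about how the real parser represents an unconstrained node.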
The query syntax allows a wildcard * to be used as a node in the twig query. For example, a twig query with no constraint on the top-level node is written as

    * : b

The same goes for setting no constraint on the child:

    a : *

A chain of wildcards can be used to set a node constraint on a specific descendant. The query

    a : * : * : b

defines a twig with the term "a" occurring on the top-level node and the term "b" occurring on a node three levels below.

Wildcards can be used with any of the previous syntaxes.
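As a rough illustration of how the number of wildcards relates to the depth of the descendant, the following sketch builds such a chain as a query string. `WildcardChain` and `descendant` are hypothetical names introduced here; they are not part of this package.

```java
public class WildcardChain {
    /**
     * Builds "top : * : ... : target" so that target sits levelsBelow levels
     * under top. Hypothetical helper, shown only to illustrate the syntax.
     */
    static String descendant(String top, String target, int levelsBelow) {
        if (levelsBelow < 1) {
            throw new IllegalArgumentException("levelsBelow must be >= 1");
        }
        StringBuilder sb = new StringBuilder(top);
        // One wildcard per intermediate level between top and target.
        for (int i = 1; i < levelsBelow; i++) {
            sb.append(" : *");
        }
        return sb.append(" : ").append(target).toString();
    }

    public static void main(String[] args) {
        System.out.println(descendant("a", "b", 1)); // a : b
        System.out.println(descendant("a", "b", 3)); // a : * : * : b
    }
}
```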
The query syntax allows you to create complex twig queries by nesting arrays, objects, and other twigs. For example, the query

    a : { * : b , a : [ { b : c : d } , { e : a }, g ] }

corresponds to the following query tree:

               +---+
               | a |
               +-+-+
                 |
          +------+--------+
          |               |
        +-+-+           +-+-+
        | * |           | a |
        +-+-+           +-+-+
          |               |
        +-+-+      +------+------+
        | b |      |      |      |
        +---+    +-+-+  +-+-+  +-+-+
                 | * |  | * |  | g |
                 +-+-+  +-+-+  +---+
                   |      |
                 +-+-+  +-+-+
                 | b |  | e |
                 +-+-+  +-+-+
                   |      |
                 +-+-+  +-+-+
                 | c |  | a |
                 +-+-+  +---+
                   |
                 +-+-+
                 | d |
                 +---+

A boolean expression follows the Lucene query syntax, except for the ':' character, which does not define a field query but is instead used to build a twig query. For example, the query

    a AND b

matches documents where the terms "a" and "b" occur in any node of the JSON tree.

A boolean combination of twig queries is also possible:

    (a : b) AND (c : d)

matches JSON documents where both twigs occur.

The complete Lucene query syntax, e.g., grouping, boolean operators or range queries, can be used to match a single node of a twig. For example, the query

    a : b AND c

matches JSON documents where the term "a" occurs on the top-level node and both terms "b" and "c" occur in a child node.

       +---+
       | a |
       +-+-+
         |
    +---------+
    | b AND c |
    +---------+

The twig operator ':' has priority over the boolean operators. Therefore, the query

    a : b AND c : d

matches documents as in the previous query, with the additional constraint that a child node with an occurrence of the term "d" must be present. It is the same as the query

    a : (b AND c) : d

       +---+
       | a |
       +-+-+
         |
    +---------+
    | b AND c |
    +---------+
         |
       +-+-+
       | d |
       +---+

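One way to picture this priority rule is to split the query on every ':' that is not enclosed in brackets; each resulting segment is then the boolean expression for one level of the twig. The sketch below is a simplification under that assumption and is not the grammar implemented by KeywordSyntaxParser: `TwigLevels` and `levels` are hypothetical names, and quoted phrases and escaped characters are ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class TwigLevels {
    /** Splits a query on ':' characters that are outside (), [] and {}. */
    static List<String> levels(String query) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0; // current bracket nesting depth
        for (char ch : query.toCharArray()) {
            if (ch == '(' || ch == '[' || ch == '{') depth++;
            else if (ch == ')' || ch == ']' || ch == '}') depth--;
            if (ch == ':' && depth == 0) {
                // A top-level ':' closes the current level of the twig.
                out.add(cur.toString().trim());
                cur.setLength(0);
            } else {
                cur.append(ch);
            }
        }
        out.add(cur.toString().trim());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(levels("a : b AND c : d"));     // [a, b AND c, d]
        System.out.println(levels("(a : b) AND (c : d)")); // [(a : b) AND (c : d)]
    }
}
```

In the first query the boolean expression `b AND c` ends up as a single level between `a` and `d`, while in the second query the parentheses keep each ':' inside its own group, so the whole expression stays one boolean combination of two twigs.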
Some terms, such as numbers, need to be analyzed in a specific way in order to be correctly indexed and searched. For those terms to be searchable, the keyword syntax provides a way to set how a query term should be analyzed, using a function-like syntax:

    datatype( ... )

Any query elements inside the parentheses are processed using the datatype.

A mapping from a datatype label to an Analyzer is set through the configuration key KeywordQueryConfigHandler.KeywordConfigurationKeys.DATATYPES_ANALYZERS.

For example, the range query below searches for documents where the field name contains the term age and the values are integers ranging from 5 to 50:

    age : int( [ 5 TO 50 ] )

The keyword parser in that example is configured to use IntNumericAnalyzer for the datatype int.

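To see why the datatype matters for a range query, the sketch below compares the same range check under a string ordering and under an integer ordering. It is a plain-Java stand-in, not the API of this package: `DatatypeSketch`, `DATATYPES`, and `inRange` are hypothetical names, and a `Comparator<String>` takes the place of a real Analyzer, which would instead encode the terms for the index.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class DatatypeSketch {
    /** Hypothetical registry: datatype label -> ordering used for its terms. */
    static final Map<String, Comparator<String>> DATATYPES = new HashMap<>();
    static {
        // Plain strings are compared lexicographically.
        DATATYPES.put("string", String::compareTo);
        // Integers are compared by numeric value.
        DATATYPES.put("int",
            (x, y) -> Integer.compare(Integer.parseInt(x), Integer.parseInt(y)));
    }

    /** Checks lower <= value <= upper under the ordering of the given datatype. */
    static boolean inRange(String datatype, String value, String lower, String upper) {
        Comparator<String> cmp = DATATYPES.get(datatype);
        if (cmp == null) {
            throw new IllegalArgumentException("unknown datatype: " + datatype);
        }
        return cmp.compare(lower, value) <= 0 && cmp.compare(value, upper) <= 0;
    }

    public static void main(String[] args) {
        // Is the value 9 inside [ 5 TO 50 ]?
        System.out.println(inRange("int", "9", "5", "50"));    // true
        System.out.println(inRange("string", "9", "5", "50")); // false
    }
}
```

Under the string ordering "9" sorts after "50", so the value falls outside the range; wrapping the range in the int datatype is what makes the numeric comparison apply.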
The top-level node of a twig query is by default set to use the datatype JSONDatatype.JSON_FIELD. Any query element that is not wrapped in a custom datatype uses the datatype XSDDatatype.XSD_STRING.

"Marie Antoinette"

    genre : Drama

Such a twig query is the basic building block to query a particular field name of a JSON object. The field name is always the root of the twig query and the field value is defined as a child clause.

More complex twig queries can be constructed by using nested twig queries or using more than one child clause.

    director : { last_name : Eastwood , first_name : Clint }

    (genre : Drama) AND (year : 2010)

Copyright © 2014. All rights reserved.