Site Search in ofmind.net

How to write search phrases

Space-separated words are treated as a conjunction (combined with AND).

To use a disjunction, write OR explicitly.
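
The difference between implicit AND and explicit OR can be illustrated with a toy matcher. This is only a sketch of the semantics, not the site's implementation: the function name `matches` and the whitespace tokenization are assumptions made for illustration, and the grouping follows the precedence implied by the syntax definition on this page (explicit AND binds tighter than OR, and space-separated parts are combined last).

```python
def matches(document: str, query: str) -> bool:
    """Toy model of AND/OR over plain words (hypothetical helper).

    It ignores quoting, grouping, modifiers, fields, and every other
    feature of the real query syntax.
    """
    doc_words = set(document.lower().split())
    tokens = query.split()
    pos = 0

    def conjunction() -> bool:
        # word *( "AND" word )
        nonlocal pos
        result = tokens[pos].lower() in doc_words
        pos += 1
        while pos + 1 < len(tokens) and tokens[pos] == "AND":
            result = (tokens[pos + 1].lower() in doc_words) and result
            pos += 2
        return result

    def disjunction() -> bool:
        # conjunction *( "OR" conjunction )
        nonlocal pos
        result = conjunction()
        while pos + 1 < len(tokens) and tokens[pos] == "OR":
            pos += 1
            result = conjunction() or result
        return result

    # Space-separated parts are implicitly ANDed together.
    result = bool(tokens)
    while pos < len(tokens):
        result = disjunction() and result
    return result


doc = "a sunny day in spring"
print(matches(doc, "sunny day"))       # both words are required
print(matches(doc, "sunny rain"))      # 'rain' is missing
print(matches(doc, "rain OR spring"))  # one alternative suffices
```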

Common advanced queries

Boolean operations and grouping, +/- modifiers, field searches, quoted phrases, wildcards, fuzzy words and phrases, value ranges, boosting, etc.

All features of the standard Lucene query syntax are available.

Regular expression queries

Regular expression queries are written between slashes, as in /abc?de/, and behave similarly to wildcard queries.

The syntax is /pattern/flags, where the flags are optional. A regular expression can also be combined with field search and boosting.

Valid patterns and flags are the same as in Java standard regular expressions.

Example: /0x[a-f0-9]+/i^3 OR title:/\p{Alpha}{6,8}/
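
To get a feel for what the first pattern in the example accepts, the following sketch tests a few candidate terms with Python's `re` module. It is only an approximation: the search engine uses Java regular expressions, the `i` flag is mapped here to `re.IGNORECASE`, and whole-term matching (`fullmatch`) is an assumption about how the pattern is applied to indexed terms.

```python
import re

# Approximation of the query /0x[a-f0-9]+/i using Python's regex engine.
# Assumption: the pattern must match an entire indexed term.
hex_literal = re.compile(r"0x[a-f0-9]+", re.IGNORECASE)

for term in ["0x1f", "0XDEADBEEF", "0x", "x1f"]:
    print(term, bool(hex_literal.fullmatch(term)))
```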

Span queries

You can append WITHIN <n> after a parenthesized group (...) to perform a so-called NEAR search. For example, (sunny day) WITHIN 2 matches documents in which both sunny and day appear no more than 2 words apart.
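
The NEAR condition can be sketched as a check on word positions. This is a minimal illustration, not the engine's algorithm: the helper name `within` is hypothetical, word order is ignored, and "n words apart" is assumed to mean that at most n other words lie between the two terms.

```python
def within(words: list[str], a: str, b: str, n: int) -> bool:
    """Return True if some occurrence of a and of b are at most n words apart.

    Assumption: distance is the number of words strictly between the two
    occurrences, so adjacent words have distance 0.
    """
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) - 1 <= n for i in pos_a for j in pos_b if i != j)


print(within("a sunny and warm day".split(), "sunny", "day", 2))
print(within("sunny but rather cold winter day".split(), "sunny", "day", 2))
```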

You can also append WITHIN [<m> TO <n>] for a position range search, where m is the start position and n is the end position. Either bound may be given as *, which means 0 for m and infinity for n. For example, (example) WITHIN [* TO 4] requires that example appear within the first 5 words.

Wildcards, fuzzy words, and regular expressions are available inside the (...) of a span query. In contrast, they cannot be used in phrases with fuzzy slop, "..."~n. Moreover, spans can be nested, and the OR and NOT operators work on spans.
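
The position range can be read as a zero-based, inclusive range of word positions, which is why an upper bound of 4 covers the first 5 words. The sketch below illustrates this reading; the helper name `in_position_range` is hypothetical, and `None` stands in for `*`.

```python
from typing import Optional


def in_position_range(words: list[str], term: str,
                      start: Optional[int] = None,
                      end: Optional[int] = None) -> bool:
    """Return True if term occurs at a position p with start <= p <= end.

    Positions are zero-based. None plays the role of '*': 0 for the start
    and no upper limit for the end.
    """
    lo = 0 if start is None else start
    for position, word in enumerate(words):
        if word == term and position >= lo and (end is None or position <= end):
            return True
    return False


early = "this is an example of a position query".split()
late = "one two three four five example".split()
print(in_position_range(early, "example", None, 4))  # position 3
print(in_position_range(late, "example", None, 4))   # position 5
```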

All docs queries

If you specify * for both the field name and the value, as in *:*, the query matches all documents.

This is useful when combined with other restrictions such as NOT.

Range queries

The classic Lucene query parser supports only closed intervals [... TO ...] and open intervals {... TO ...}. This system also supports half-open intervals {... TO ...] and [... TO ...}.
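
The four bracket combinations differ only in whether each endpoint is included. As a rough illustration (the helper `in_range` is hypothetical and compares plain Python values, not indexed terms), a square bracket includes its endpoint and a curly brace excludes it:

```python
def in_range(value, start: str, low, high, end: str) -> bool:
    """Interval membership for the four bracket combinations.

    start is '[' (inclusive) or '{' (exclusive);
    end is ']' (inclusive) or '}' (exclusive).
    """
    above = value >= low if start == "[" else value > low
    below = value <= high if end == "]" else value < high
    return above and below


print(in_range(5, "[", 1, 5, "]"))  # closed: both endpoints included
print(in_range(5, "{", 1, 5, "}"))  # open: both endpoints excluded
print(in_range(5, "{", 1, 5, "]"))  # upper endpoint included only
print(in_range(1, "{", 1, 5, "]"))  # lower endpoint excluded
```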

Full syntax definition

Query         = 1*Disjunction
Disjunction   = Conjunction *( Or Conjunction ) 
Conjunction   = Clause *( And Clause )
Clause        = [ Modifier ] [ Field ] ( Term / Group ) [ Boost ]
Modifier      = "+" / "-" / Not
Field         = ( Word / "*" ) ":"
Group         = "(" Query ")" [ Within ( Proximity / Position ) ]
Term          = Single / Phrase / Range / WildCard / RegExp / "*"

Single        = ( Word / Number ) [ Fuzzy ]
Phrase        = DQUOTE *QuotedChar DQUOTE [ Fuzzy ]
Range         = RangeStart RangeLimit [ To ] RangeLimit RangeEnd
RangeLimit    = 1*< any UChar except WildCardChar or DQUOTE or "," or RangeEnd > / DQUOTE *QuotedChar DQUOTE / "*"
WildCard      = TermStartChar *( TermChar / WildCardChar )
RegExp        = "/" *RegExpChar "/" *( "i" / "d" / "m" / "s" / "u" / "x" / "C" / "L" )
Fuzzy         = "~" Number
Boost         = "^" Number
Proximity     = Integer [ "INORDER" ]
Position      = "[" ( Integer / "*" ) [ To ] ( Integer / "*" ) "]"

And           = "AND" / "&&"
Or            = "OR" / "||"
Not           = "NOT" / "!"
Within        = "WITHIN" / "~~"
To            = "TO" / ","
RangeStart    = "[" / "{"
RangeEnd      = "]" / "}"
Word          = TermStartChar *TermChar
Number        = 1*DIGIT [ "." 1*DIGIT ]
Integer       = 1*DIGIT

WildCardChar  = "*" / "?"
SpaceChar     = SP / HTAB / CR / LF / %x000B / %x000C / %x0085 / %x00A0 / %x1680 / %x180E
SpaceChar     =/ %x2000-200D / %x2028 / %x2029 / %x202F / %x205F / %x2060 / %x2800 / %x3000 / %xFEFF
ReservedChar  = "+" / "-" / "!" / "(" / ")" / ":" / "^" / "[" / "]" / "{" / "}" / DQUOTE / "~" / "\" / "/" / "@"
EscapedChar   = "\" UChar
TermStartChar = < any UChar except ReservedChar or WildCardChar or SpaceChar > / EscapedChar
TermChar      = TermStartChar / "+" / "-" / "/" / "@"
QuotedChar    = < any UChar except DQUOTE or "\" > / EscapedChar
RegExpChar    = < any UChar except "/" or "\" > / EscapedChar
UChar         = %x000001-10FFFF
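
Based on the ReservedChar, WildCardChar, and EscapedChar rules above, literal text can be made safe for use as a single term by prefixing special characters with a backslash. The following is a minimal sketch under stated assumptions: the helper name `escape_term` is hypothetical, `str.isspace()` only approximates the SpaceChar rule (it does not cover every code point listed there), and keywords such as AND, OR, NOT, WITHIN, and TO are not handled.

```python
# Characters from the ReservedChar rule.
RESERVED = set('+-!()^[]{}"~\\/@:')
# Characters from the WildCardChar rule.
WILDCARDS = set("*?")


def escape_term(text: str) -> str:
    """Backslash-escape reserved, wildcard, and whitespace characters."""
    out = []
    for ch in text:
        if ch in RESERVED or ch in WILDCARDS or ch.isspace():
            out.append("\\")
        out.append(ch)
    return "".join(out)


print(escape_term("C++ (draft)"))
print(escape_term("a:b*"))
```

Escaping more characters than strictly necessary is harmless here, because an escaped character is always accepted wherever a term character is.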

List of fields

field name     description
uri            URI
mimetype       MIME types
title          titles
author         authors
date.created   creation date
date.modified  modification date
keyword        keywords
subject        subject areas

The default field, used when the field name is omitted, includes body, title, keywords, etc.

Note that not all documents have all fields.