indexing - Lucene acting up on OrientDB when confronted with fuzzy queries -


I have indexed a property on OrientDB using Lucene's keyword analyzer:

CREATE INDEX snippet.ssdeep ON snippet (ssdeep) FULLTEXT ENGINE LUCENE METADATA {"analyzer":"org.apache.lucene.analysis.core.KeywordAnalyzer"}

The field contains simhashes that I have indexed as a test.

Now when I search using Lucene, I only get a response for exact queries, not for fuzzy queries (despite escaping the query text).

For instance, given the field value "192:d4e1gdzyduzrw9afcb+a66ancczmx9n2p:2e1gw18a66ac/yp", the following query yields 1 record:

SELECT FROM snippet WHERE ssdeep LUCENE "192\\:d4e1gdzyduzrw9afcb\\+a66ancczmx9n2p\\:2e1gw18a66ac\\/yp"

While this query yields no records:

SELECT FROM snippet WHERE ssdeep LUCENE "192\\:d4e1gdzyduzrw9afcb\\+a66ancczmx9n2p\\:2e1gw18a66ac\\/yp~0.9"

I wonder what is preventing Lucene from finding approximate results. More particularly, is Lucene (or KeywordAnalyzer) not suited to fuzzy searching on such strings, or is the interface between Lucene and OrientDB the cause?

That is, I have other full-text Lucene indexes on the same database that work, but those fields contain ordinary text and are analyzed using the simple or standard analyzers. This is the field I need a full-text index on, and it fails to work.

The problem is letter case. StandardAnalyzer, SimpleAnalyzer, and EnglishAnalyzer all lowercase text before indexing terms. KeywordAnalyzer doesn't.

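Since wildcard, fuzzy, and other expanded, multi-term queries aren't analyzed, the QueryParser, by default, lowercases these types of query. The lowercased fuzzy term is then compared against indexed terms that kept their original case, so a mixed-case value can fall outside the edit distance a fuzzy query allows.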
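To make the case mismatch concrete, here is a minimal plain-Java sketch (no Lucene dependency) that measures how far a lowercased term drifts from its mixed-case original. The class name `FuzzyCaseDemo`, the helper `levenshtein`, and the sample hash are all made up for illustration, and the limit of 2 edits reflects my understanding of Lucene's `FuzzyQuery` maximum rather than anything OrientDB documents.

```java
import java.util.Locale;

public class FuzzyCaseDemo {

    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions and substitutions each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Hypothetical mixed-case hash, stored as-is because KeywordAnalyzer preserves case.
        String indexedTerm = "192:d4E1GdZyDuZRw9AfCb+A66AnCCzMX9n2P";
        // What a lowercasing query parser would actually search for.
        String fuzzyTerm = indexedTerm.toLowerCase(Locale.ROOT);

        int distance = levenshtein(indexedTerm, fuzzyTerm);
        int maxEdits = 2; // assumed upper bound on edits for a Lucene fuzzy query

        System.out.println("edit distance = " + distance);
        System.out.println("within fuzzy range = " + (distance <= maxEdits));
    }
}
```

Every uppercase letter in the stored value counts as a substitution against the lowercased query term, so a hash with more than a couple of capitals would not match regardless of the similarity given after the `~`.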

I don't know what OrientDB exposes of Lucene to allow you to do this effectively, but the two best solutions in Lucene are:

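  1. Disable QueryParser lowercasing of these types of queries:

    queryParser.setLowercaseExpandedTerms(false);
  2. Use a custom analyzer that combines KeywordTokenizer with LowerCaseFilter:

    // Imports assume the Lucene 5.x/6.x package layout.
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.core.KeywordTokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;

    public class LowercaseKeywordAnalyzer extends Analyzer {
        @Override
        protected TokenStreamComponents createComponents(String fieldName) {
            // Emit the whole field value as a single token, then lowercase it.
            Tokenizer source = new KeywordTokenizer();
            TokenStream filter = new LowerCaseFilter(source);
            return new TokenStreamComponents(source, filter);
        }
    }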

I know neither if nor how these are exposed in OrientDB, but hopefully this points you in the right direction.

