Examine Analyser : Stop words versus Case Sensitive
Hi
Currently I am using the StandardAnalyzer in my Lucene configuration.
Now I have stumbled upon the following little problem. My users should be able to search for IT (as in information technology / ICT / ...). But "it" is a so-called stop word, so it is left out of the index by Examine. (The same goes for the simple analyzer.)
I've read that the "whitespace analyzer" keeps stop words in the index. But this doesn't solve the problem 100%, because the whitespace analyzer creates a case-sensitive index! That is not what you want to offer for a site search, of course.
So how can I solve this? Writing my own analyzer seems a bit of overkill.
Kind regards
I found the solution already.
The SimpleAnalyzer doesn't have a StopFilter and is apparently case insensitive. http://lucene.apache.org/java/3_0_1/api/core/index.html
But its tokenizer is different. Does anyone have experience with this?
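To make the tokenizer difference concrete, here is a rough sketch in plain Python that only imitates the three analyzers; it does not call Lucene or Examine. The function names, the short stop list, and the sample sentence are made up for illustration, and the "standard-like" version is a crude approximation of what the StandardAnalyzer really does.

```python
import re

# Tiny illustrative stop list (an assumption); Lucene's default English set is longer.
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "to"}

def whitespace_tokens(text):
    """Imitates a whitespace analyzer: split on whitespace, keep case."""
    return text.split()

def simple_tokens(text):
    """Imitates a simple analyzer: keep runs of letters only, then lowercase.
    No stop-word removal, but digits are lost because they are not letters."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def standard_like_tokens(text):
    """Crude stand-in for a standard analyzer: word tokens (digits kept),
    lowercased, then stop words removed."""
    words = [t.lower() for t in re.findall(r"\w+", text)]
    return [t for t in words if t not in STOP_WORDS]

sample = "Our IT team supports 24 PCs"
print(whitespace_tokens(sample))     # "IT" kept, but case-sensitive
print(simple_tokens(sample))         # "it" kept and lowercased, "24" gone
print(standard_like_tokens(sample))  # "it" removed as a stop word
```

If this picture is right, the main side effect of switching to the SimpleAnalyzer would be that numbers and anything else that is not a letter no longer end up in the index.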
Kind regards
Guys,
See http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search
Regards
Ismail
Thanks Ismail, that's a very helpful topic you mentioned!