Examine Analyser: Stop words versus Case Sensitive

Damiaan 438 posts 1290 karma points MVP 3x c-trib

Nov 27, 2011 @ 22:43

Hi

Currently I am using the StandardAnalyzer in my Lucene configuration.

Now I have stumbled upon the following little problem. My users should be able to search for "IT" (as in information technology, ICT, ...). But "it" is a so-called stop word, so Examine leaves it out of the index. (The same seems to go for the SimpleAnalyzer.)

I've read that the WhitespaceAnalyzer keeps stop words in the index. But this doesn't solve the problem completely, because the WhitespaceAnalyzer creates a case-sensitive index, which is of course not what you want to offer for a site search.
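To make the trade-off concrete, here is a small Python sketch that imitates, in a very simplified way, what the two analyzers do with text containing "IT". It does not call Lucene or Examine; the function names, the regex tokenization and the shortened stop-word list are assumptions for illustration only.

```python
import re

# Assumed, shortened stop-word list for illustration. Lucene's default
# English set is longer, but it also contains "it".
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "to"}


def standard_like(text):
    """Roughly imitate StandardAnalyzer: tokenize, lowercase, drop stop words."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    return [t for t in tokens if t not in STOP_WORDS]


def whitespace_like(text):
    """Roughly imitate WhitespaceAnalyzer: split on whitespace, keep case."""
    return text.split()


print(standard_like("IT consultancy"))    # ['consultancy']
print(whitespace_like("IT consultancy"))  # ['IT', 'consultancy']
```

In the first case "IT" never reaches the index. In the second case it does, but only as the exact token "IT", so a query for "it" would not match it.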

So how can I solve this? Writing my own analyzer seems like overkill.

Kind regards

Damiaan 438 posts 1290 karma points MVP 3x c-trib

Nov 27, 2011 @ 23:09

I found the solution already.

The SimpleAnalyzer has no StopFilter and is apparently case-insensitive. http://lucene.apache.org/java/3_0_1/api/core/index.html

But its tokenizer is different. Does anyone have experience with this?
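As far as I understand it, the difference is that the SimpleAnalyzer splits text at every non-letter character and then lowercases the result, so digits and punctuation never end up inside a token. A rough Python imitation of that behaviour (not Lucene code; the function name and the regex are assumptions for illustration):

```python
import re


def simple_like(text):
    """Roughly imitate SimpleAnalyzer: runs of letters only, lowercased."""
    # [^\W\d_]+ matches runs of letters: word characters minus digits and "_".
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


print(simple_like("IT-manager ISO9001"))  # ['it', 'manager', 'iso']
```

So "IT" stays searchable regardless of case, but a term such as "ISO9001" is indexed only as "iso". Whether that matters depends on how often users search for terms containing numbers.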

Kind regards

Ismail Mayat 4511 posts 10059 karma points MVP 2x admin c-trib

Nov 28, 2011 @ 11:20

Guys,

See http://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search

Regards

Ismail

Damiaan 438 posts 1290 karma points MVP 3x c-trib

Nov 28, 2011 @ 12:20

Thanks Ismail, that's a very helpful topic you mentioned!
