  • Andreas Kristensen 26 posts 176 karma points
    May 18, 2020 @ 07:37

    "The" seems to be ignored in search

    I have a site with a lot of nodes whose names contain "the double".

    If I search for the term "the double", I get results for "the", but no results containing "the double".

    If I then search for "double", I get the expected results that contain "the double" in the node name.

    What gives?

  • Marc Goodson 1451 posts 9716 karma points MVP 5x c-trib
    May 20, 2020 @ 22:45

    Hi Andreas

    Examine is a wrapper around Lucene, and the Lucene standard analyzer has a list of English 'stop words' that it doesn't index for efficiency reasons, I guess because they are so frequent that a search for 'a', 'it' or 'the' would be meaningless. So when you search for 'the double', the 'the' part is ignored.

    The full list of stop words is in this code sample:

    http://alvinalexander.com/java/jwarehouse/lucene/src/java/org/apache/lucene/analysis/StopAnalyzer.java.shtml

    There is an explanation of the issue in the ezSearch repo, for V7:

    https://github.com/umco/umbraco-ezsearch/issues/23

    One workaround described there is to remove the stop words from the search terms.

    Otherwise, you could use a different analyser.
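
    To illustrate the first workaround, here is a rough, language-neutral sketch in Python (not Examine or Umbraco code) of stripping stop words from the search terms before they are passed to the searcher. The function name strip_stop_words is made up for this example, and the word list is my recollection of Lucene's default English stop set, so please check it against the StopAnalyzer source linked above before relying on it.

```python
import re

# Assumption: this mirrors Lucene's default English stop word set as I
# recall it; verify against the StopAnalyzer source linked in this post.
LUCENE_ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
    "that", "the", "their", "then", "there", "these", "they", "this",
    "to", "was", "will", "with",
})


def strip_stop_words(query: str) -> list[str]:
    """Hypothetical helper: lower-case the query, split it into word
    tokens, and drop any token that is in the stop word set."""
    terms = re.findall(r"\w+", query.lower())
    return [term for term in terms if term not in LUCENE_ENGLISH_STOP_WORDS]


if __name__ == "__main__":
    # Only the terms that the analyzer would actually index remain.
    print(strip_stop_words("the double"))
```

    If the returned list is empty (for example, the user searched only for "the"), the calling code has to decide what to do, since no term in the query would ever match the index.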

    regards

    Marc
