
  • Josip 187 posts 652 karma points c-trib
    Oct 31, 2019 @ 13:57
    Josip
    0

    Examine search problem with words with umlaut

    Hi all,

    Examine search returns 0 results for German words with an umlaut (Vögel, Löwen, Bären, etc.), but it works fine if I search without the umlaut: Vogel, Lowen, Baren, etc.

    I don't know where the problem is. This is part of my code:

    // Pick the culture-specific node name field to search in.
    string nodeName = Culture == "en-US" ? "nodeName_en-us" : "nodeName_de-de";
    ExamineManager.Instance.TryGetIndex("ExternalIndex", out var ContentIndex);
    var ContentSearcher = ContentIndex.GetSearcher();
    List<IPublishedContent> ContentNodes = new List<IPublishedContent>();
    if (!String.IsNullOrEmpty(searchTerm))
    {
        var examineQuery = ContentSearcher.CreateQuery("content").NodeTypeAlias("category");
        if (searchTerm.Contains(" "))
        {
            // Several words: turn each one into a wildcard term.
            var terms = searchTerm.Split(' ').Select(x => x.MultipleCharacterWildcard()).ToArray();

            examineQuery.And().GroupedOr(new List<string> { nodeName }, terms);
        }
        else
        {
            // Single word: one wildcard term.
            examineQuery.And().GroupedOr(new List<string> { nodeName }, searchTerm.MultipleCharacterWildcard());
        }

        var results = examineQuery.Execute();
    
  • Ismail Mayat 4511 posts 10059 karma points MVP 2x admin c-trib
    Nov 04, 2019 @ 14:09
    Ismail Mayat
    100

    Josip,

    Get rid of the wildcard and it will work. When content is indexed, it is run through the standard analyzer and ends up in the index ASCII-flattened, which means umlauts and similar characters are replaced with their ASCII equivalents.

    When you search, the query also goes through the analyzer and gets flattened, so it should all match. However, a wildcard term is not run through ASCII flattening, so it won't match.
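A toy model of that mismatch, using plain .NET only and no Lucene or Examine calls. The `WildcardMismatchDemo` class, its `IndexedTerms` array, and `PrefixMatches` are hypothetical names made up for illustration; the array stands in for what the index might hold after analysis, and the real index contents may differ.

```csharp
using System;
using System.Linq;

public static class WildcardMismatchDemo
{
    // Assumption: stand-in for terms stored in the index after analysis
    // (lowercased and ASCII-flattened, so "Vögel" is stored as "vogel").
    static readonly string[] IndexedTerms = { "vogel", "lowen", "baren" };

    // Models a trailing-wildcard lookup: the prefix is compared to the
    // stored terms as-is, without being flattened first.
    public static bool PrefixMatches(string prefix) =>
        IndexedTerms.Any(t => t.StartsWith(prefix, StringComparison.Ordinal));
}
```

In this model `PrefixMatches("vog")` finds `vogel`, while `PrefixMatches("vög")` finds nothing, because the umlaut in the prefix never gets flattened.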

  • Josip 187 posts 652 karma points c-trib
    Nov 04, 2019 @ 14:16
    Josip
    0

    Hi Ismail,

    Should I use Fuzzy instead of MultipleCharacterWildcard? Would I get the same results?

    BR

    Josip

  • Ismail Mayat 4511 posts 10059 karma points MVP 2x admin c-trib
    Nov 04, 2019 @ 14:17
    Ismail Mayat
    0

    No idea, you will have to test it. Try it without the wildcard first. If that works and you still want the wildcard, I can show you code that ASCII-flattens the query first; then your wildcarding will work.
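A minimal sketch of flattening a search term before wildcarding, assuming that stripping diacritics via Unicode normalization is close enough to what the analyzer does at index time. `QueryFolding` and `FoldToAscii` are hypothetical names, and this is not the code from Lucene's own folding filter; it only covers characters that decompose into a base letter plus a combining mark, with `ß` handled by hand.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class QueryFolding
{
    // Hypothetical helper: approximates ASCII flattening, e.g. "Vögel" -> "Vogel".
    public static string FoldToAscii(string term)
    {
        if (string.IsNullOrEmpty(term)) return term;

        // "ß" has no canonical decomposition, so map it explicitly,
        // then split accented letters into base letter + combining mark.
        string decomposed = term.Replace("ß", "ss").Normalize(NormalizationForm.FormD);

        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks (the umlaut dots), keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With a helper like this, the wildcard call from the original post would become something like `QueryFolding.FoldToAscii(searchTerm).MultipleCharacterWildcard()`, so the wildcard term already matches the flattened form in the index.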

  • Josip 187 posts 652 karma points c-trib
    Nov 04, 2019 @ 14:19
    Josip
    0

    Ok thanks,

    but yes, MultipleCharacterWildcard was causing the problem; without it, it works as expected.

    Thanks a lot

    BR

    Josip

  • Ismail Mayat 4511 posts 10059 karma points MVP 2x admin c-trib
    Nov 04, 2019 @ 14:21
    Ismail Mayat
    0

    OK, so try it with fuzzy, and if that doesn't work then you need to run it through https://gist.github.com/ismailmayat/1b42271b883e31962d72091d17f0bae8. Don't forget to update the Lucene version to 3, as this code uses 2.9. It also has links that explain more about the issue.

  • Josip 187 posts 652 karma points c-trib
    Nov 04, 2019 @ 14:25
    Josip
    0

    It looks like I am getting the same results with fuzzy, and with good results for words with umlauts. I will test it more, but I am already happy with this.
