  • RoboDog 21 posts 41 karma points
    Aug 12, 2010 @ 12:38

    Lucene Index and case

    Hi, I'm having an issue with the Lucene index. For example, suppose I have a doctype called customer, which in turn has a property called name. When I create a document and enter the name property as "John Doe", it is stored in the index as "john doe". How do I retain the correct casing?

  • Aaron Powell 1708 posts 3044 karma points c-trib
    Aug 12, 2010 @ 14:40

    This answer was originally posted here: http://our.umbraco.org/forum/developers/extending-umbraco/10999-Examine-Questions?p=0#comment42951

    You have to store the data twice, once analyzed (case-insensitive, for searching) and once un-analyzed (case-preserved, for display), as Lucene isn't really designed for data retrieval.

    To do this with Examine you have to attach to the UmbracoExamine.LuceneExamineIndexer.DocumentWriting event (which may have moved into the LuceneEngine with the latest check-ins; I'm not 100% sure).

    This event is fired in a Lucene scope and gives you access to the Lucene Document object as it's being written. That Document is where you'll need to add the un-analyzed version of your content.

    Here's an example of how we did it in a recent project for showing in search results:

    void indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
    {
        var doc = e.Document;

        // Find the title
        string title = !e.Fields.ContainsKey("PageTitle") || string.IsNullOrEmpty(e.Fields["PageTitle"])
            ? e.Fields["nodeName"]
            : e.Fields["PageTitle"];

        // Default content is nothing:
        string content = string.Empty;

        // Unless a description is found:
        if (e.Fields.ContainsKey("Description") && !string.IsNullOrEmpty(e.Fields["Description"]))
        {
            content = e.Fields["Description"];
        }
        // Or BodyContent is found:
        else if (e.Fields.ContainsKey("BodyContent") && !string.IsNullOrEmpty(e.Fields["BodyContent"]))
        {
            content = e.Fields["BodyContent"];
        }

        // Store the title and content with text casing unchanged
        doc.Add(new Field("__PageTitle", title, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("__Content", content, Field.Store.YES, Field.Index.NOT_ANALYZED));
    }

    When we display the search results, we show the __PageTitle and __Content fields, not the 'real' fields.

    Check out this article I wrote to better understand the Store and Index concepts: www.aaron-powell.com/documents-in-lucene-net
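
    As a rough illustration of the display side, here is a minimal sketch of reading the case-preserved fields back out of a search result. It assumes the Examine searcher API (ExamineManager.Instance.SearchProviderCollection, Search(string, bool), SearchResult.Fields) behaves this way in your version and that stored fields such as __PageTitle are exposed through Fields. The searcher name "MySearcher" and the class and method names are hypothetical.

    ```csharp
    using System.Collections.Generic;
    using Examine;

    public static class SearchResultDisplay
    {
        // Returns "title: content" lines built from the un-analyzed fields.
        public static IEnumerable<string> Describe(string term)
        {
            // "MySearcher" is a hypothetical name; use the one from your Examine config.
            var searcher = ExamineManager.Instance.SearchProviderCollection["MySearcher"];

            foreach (SearchResult result in searcher.Search(term, true))
            {
                string title;
                string content;

                // Fall back to an empty string if a field was not stored for this node.
                if (!result.Fields.TryGetValue("__PageTitle", out title)) title = string.Empty;
                if (!result.Fields.TryGetValue("__Content", out content)) content = string.Empty;

                yield return title + ": " + content;
            }
        }
    }
    ```

    The search itself still runs against the analyzed fields; only the text shown to the user comes from the __ fields.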

  • RoboDog 21 posts 41 karma points
    Aug 12, 2010 @ 15:10

    That's cool, exactly what I was looking for :) One stupid question: how and where do I attach to the event from within my code, so that it executes when the index runs?

  • Aaron Powell 1708 posts 3044 karma points c-trib
    Aug 13, 2010 @ 01:06

    You can use ApplicationBase, just as you would for an Umbraco event, or you can use an HttpModule and wire it up early in the life cycle.

    I'd go with ApplicationBase personally.
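
    A minimal sketch of the ApplicationBase approach, not a definitive implementation. It assumes the indexer can be fetched by name from ExamineManager.Instance.IndexProviderCollection and cast to the LuceneExamineIndexer type mentioned earlier in the thread. The indexer name "MyIndexer" and the class name are hypothetical, and the namespace of DocumentWritingEventArgs may differ between Examine builds.

    ```csharp
    using Examine;
    using Examine.LuceneEngine;   // assumed location of DocumentWritingEventArgs; may differ by version
    using Lucene.Net.Documents;
    using umbraco.BusinessLogic;
    using UmbracoExamine;

    // Umbraco creates an instance of each ApplicationBase subclass at startup,
    // so the constructor is a convenient place to subscribe to events.
    public class ExamineEventWiring : ApplicationBase
    {
        public ExamineEventWiring()
        {
            // "MyIndexer" is a hypothetical name; use the one from your Examine config.
            var indexer = ExamineManager.Instance.IndexProviderCollection["MyIndexer"] as LuceneExamineIndexer;

            if (indexer != null)
            {
                indexer.DocumentWriting += indexer_DocumentWriting;
            }
        }

        void indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
        {
            // Add a stored, un-analyzed copy of the node name so its casing survives.
            if (e.Fields.ContainsKey("nodeName"))
            {
                e.Document.Add(new Field("__PageTitle", e.Fields["nodeName"],
                    Field.Store.YES, Field.Index.NOT_ANALYZED));
            }
        }
    }
    ```

    Because the subscription happens once at application start, the handler runs every time the index writes a document, with no further wiring needed.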
