Thursday, March 29, 2012

The book is out!

If you haven’t already seen my tweets or noticed it yourself, “Working with Microsoft FAST Search Server 2010 for SharePoint” is now on sale in print at both the Amazon and O’Reilly web sites. The e-book should arrive shortly as well.

It’s been fun and a lot of work creating the book, but together with my co-authors Marcus Johansson and Robert Piddocke we managed to complete it.
I hope, and believe, that everyone will find something interesting and useful in it, and I’d love to hear your comments and thoughts on it.



I would also like to thank my employer, Puzzlepart, for supporting my writing and giving me time during the workday for parts of it. Very valuable indeed!

2 comments:

  1. Hi Mikael,

    I enjoyed your book. Contains a lot of valuable information. However, I still have a question regarding normalization, which I hope you can answer:

    A customer of ours has PDF files which have been OCR'ed. This process sometimes introduces spaces inside words; for example, "Microsoft" becomes "Mic r o soft". The idea was to fix this during indexing by changing the body; however, this seems to be read-only.

    Can this be changed? Or is there another property that needs to be specified as output parameter for the extension which also influences the hit highlighting?

    Thank you for your help!

    Replies
    1. Hi,
      Unfortunately, you cannot write back to the body field (unless you write custom pipeline stages in Python, the ESP way).

      A workaround is to write your modified content to a new crawled property, e.g. "fixedbody", and configure this property to be searchable. This adds it by default to the full-text index "content", and you should see highlights. You can also map it to a custom managed property if you want to give it a higher priority when searching.

      How much content you put in this property is up to you, but keep in mind that you could get duplicate highlights, since your text will match in both "body" and "fixedbody".
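
      As a rough illustration of the "fix" step itself, here is a minimal Python sketch that rejoins OCR-split fragments when their concatenation is a known word. The function name, the vocabulary set and the max_parts limit are all hypothetical and not part of FAST; how the result gets written to "fixedbody" depends on your pipeline extensibility setup and is not shown here.

```python
def rejoin_ocr_fragments(text, vocabulary, max_parts=6):
    """Merge runs of whitespace-separated fragments that form a known word.

    `vocabulary` is a hypothetical set of trusted words (product names,
    customer terms, ...). Matching is case-insensitive, and the casing
    found in the text is kept.
    """
    vocab = {word.lower() for word in vocabulary}
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        merged_len = 0
        # Try the longest run first, so "Mic r o soft" is merged as a whole.
        for n in range(min(max_parts, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + n])
            if candidate.lower() in vocab:
                result.append(candidate)
                merged_len = n
                break
        if merged_len:
            i += merged_len
        else:
            # No known word starts here; keep the token unchanged.
            result.append(tokens[i])
            i += 1
    return " ".join(result)


fixed = rejoin_ocr_fragments("Install Mic r o soft FAST Search", {"Microsoft"})
print(fixed)
```

      Note that this sketch collapses all whitespace to single spaces and does not handle punctuation attached to a fragment (e.g. "soft."), so treat it as a starting point rather than a finished normalizer.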

      Does this help?
