Wednesday, January 13, 2016

Getting diacritics to work with thesaurus in SharePoint search

When you live in Europe you tend to decorate the English a-z letters every so often. If you don’t know what I mean, here are some examples:

ö å ï é à

The important part when you create your thesaurus CSV file, as mentioned on TechNet, is to save the file in UTF-8 format for it to work. Also save it without a BOM/signature, to be super sure.

In Notepad++ you find the setting under the Encoding menu.

image

In Visual Studio you pick Save As, click the arrow on the Save button, pick Save with Encoding…, and then pick Unicode (UTF-8 without signature) – Codepage 65001 as your encoding.

image

image

The same goes for any other character set as well – stick with UTF-8 without signature.
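
If you would rather check or fix the file from a script than from an editor, here is a minimal Python sketch of the same idea. The function names are made up for illustration and are not part of any SharePoint tooling. All it assumes is that the file is already UTF-8, possibly with the three signature bytes (EF BB BF) at the start.

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 signature (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the same bytes without a leading UTF-8 signature, if any."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


def save_without_bom(path: Path) -> None:
    """Rewrite a file in place as UTF-8 without signature.

    Hypothetical helper: it assumes the file is already UTF-8 and
    raises UnicodeDecodeError if it is not (for example Windows-1252),
    so a wrongly encoded file is not silently passed through.
    """
    data = strip_utf8_bom(path.read_bytes())
    data.decode("utf-8")  # validation only; the bytes are written unchanged
    path.write_bytes(data)
```

Calling `save_without_bom(Path("thesaurus.csv"))` (the file name is just an example) leaves the diacritics untouched and only removes the signature if one is there.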

Happy synonyming, and thanks to Elio Struyf for asking me this question!