Friday, June 3, 2016

Bringing out the client side hammer - The one thing you should learn about SharePoint search in 2016


SharePoint 2013 has been out for a good while now, with SharePoint 2016 well on its way, and over the years some major flaws in SharePoint search have crystallized for me, especially if you are working in SharePoint Online. So if you plan on learning just one thing about how SharePoint search works this year, this post is for you!

The top four flaws in my opinion are:

  • Best bets / promoted results via query rules
  • Synonym handling / Thesaurus
  • How query rules trigger
  • Remove custom noise words from the query

This post covers a client side solution to the last three points above. I started on it back in January, based on a conversation with Thomas Mølbach at Microsoft about solving synonyms in SharePoint Online. I’ve had it linger for a while, and finally Elio Struyf finished the code and pushed me to get it out there. The solution can be found at the GitHub SPCSR project.

The gist of the code is that it hooks into the JavaScript page lifecycle of the search client side web parts, allowing you to intercept the query and modify it before it’s sent over to the server. Due to a lack of documentation it took a while to get this working, but now the sample code is out there for anyone to use and build upon. The only thing you have to remember is to set your search web parts to run in asynchronous mode. If you don’t, the script will not inject itself on the first query, only on subsequent ones performed on that same page, which may or may not be an issue for you depending on what you want to accomplish.
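To make the idea of intercepting a query concrete, here is a minimal TypeScript sketch of a hook pipeline. It does not use the real SharePoint search object model: the OutgoingQuery shape, the QueryInterceptor class and the OriginalQuery variable name are all assumptions made up for illustration, and the actual wiring into the web part lifecycle lives in the project code.

```typescript
// Hypothetical shape of a query that is about to be sent to the server.
// This is NOT the SharePoint object model, only a stand-in for illustration.
interface OutgoingQuery {
  text: string;                      // what the user typed
  variables: Record<string, string>; // custom query variables to send along
}

type QueryHook = (query: OutgoingQuery) => OutgoingQuery;

// Collects hooks and runs them in registration order before the query is issued.
class QueryInterceptor {
  private hooks: QueryHook[] = [];

  register(hook: QueryHook): void {
    this.hooks.push(hook);
  }

  run(query: OutgoingQuery): OutgoingQuery {
    return this.hooks.reduce((current, hook) => hook(current), query);
  }
}

// Usage: tidy the query text, then expose it as a (hypothetical) custom variable.
const interceptor = new QueryInterceptor();
interceptor.register((q) => ({ ...q, text: q.text.trim() }));
interceptor.register((q) => ({
  ...q,
  variables: { ...q.variables, OriginalQuery: q.text },
}));

const issued = interceptor.run({ text: "  vacation policy ", variables: {} });
console.log(issued.text, issued.variables);
```

Each hook receives the result of the previous one, so rewrites such as noise word removal and synonym expansion can be stacked without knowing about each other.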

Best bets

Basically it goes like this: creating a query rule for each best bet is tedious and hard to manage in a good way. The UI for it is just not adequate. In my article Better Best Bets with Lists, I show an alternative way using SharePoint lists and one query rule.

Synonym handling

The procedure for adding synonyms is too cumbersome. For on-premises you create a CSV file which you then import using PowerShell, effectively blocking power users from performing the task. You either have to set up a scheduled task to do the import at intervals, or call your favorite IT pro with server access. Also, two-way synonyms require two rows in the CSV file. All in all, hard to maintain and manage for a person tasked with search administration.

The other option available is to use Query Rules to do a query expansion when certain words trigger. The issue with this solution, as for other query rules based functionality like best bets, is that:

  • If you write a “complex” query using operators (AND/OR/ANY/etc.) or quotes, then rules won’t fire.
  • If you want to expand more than one term using two query rules, that doesn’t work, as you can have only one query rewriting rule.
  • Managing more than 10-20 query rules using the current UI is just not feasible – the old 2010 UI worked a lot better.
  • Editing a two-way synonym using a query rule requires you to trigger on both terms and rewrite the query to include both.

Custom query variables to the rescue

If you have ever configured a search web part, you know about query templates and query variables. The two most common ones you see are {SearchBoxQuery} and {searchTerms}, which contain the query being passed in. I go over most of the other out of the box ones in my post on query variables.

A perhaps hidden gem is that you can create your own custom query variables. If you are on-premises, MSDN even has a sample hidden in the User segmentation documentation, and I have some code on CodePlex which I used in my SPC12 session on people search and extensibility (available on Channel9). At SPC14 (available on Channel9) I did something similar, but this time using some hacky JavaScript.

After investing some more time in how the JavaScript object model works for search pages, I came up with a more generic piece of code which could easily be expanded to suit your exact need.

Using our GitHub solution, synonyms are now handled in a maintainable SharePoint list, and you can use the {SynonymQuery} query variable in your query template instead. A full list of the query variables can be found on the project page.
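As a rough illustration of what list-driven synonym expansion can look like, here is a small TypeScript sketch. It is not the project's code: the SynonymRow shape (term, synonyms, twoWay) is an assumed stand-in for rows read from a SharePoint list, and the function names are made up. It only handles whitespace-separated terms and ignores quoted phrases and operators in the incoming query.

```typescript
// Assumed shape of one row from a synonym list; the column names are hypothetical.
type SynonymRow = { term: string; synonyms: string[]; twoWay: boolean };

// Build a lookup from lower-cased term to its synonyms.
// A two-way row is registered in both directions, so it needs only one list item.
function buildSynonymMap(rows: SynonymRow[]): Map<string, Set<string>> {
  const map = new Map<string, Set<string>>();
  const add = (from: string, to: string): void => {
    const key = from.toLowerCase();
    if (!map.has(key)) map.set(key, new Set<string>());
    map.get(key)!.add(to.toLowerCase());
  };
  for (const row of rows) {
    for (const synonym of row.synonyms) {
      add(row.term, synonym);
      if (row.twoWay) add(synonym, row.term);
    }
  }
  return map;
}

// Multi-word synonyms must be quoted to stay together as a phrase.
function quoteIfPhrase(term: string): string {
  return /\s/.test(term) ? `"${term}"` : term;
}

// Rewrite each term that has synonyms into an (a OR b) group.
// The result is what a variable such as {SynonymQuery} could carry.
function buildSynonymQuery(query: string, map: Map<string, Set<string>>): string {
  return query
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => {
      const synonyms = map.get(term.toLowerCase());
      if (!synonyms || synonyms.size === 0) return term;
      const alternatives = [term, ...Array.from(synonyms)].map(quoteIfPhrase);
      return `(${alternatives.join(" OR ")})`;
    })
    .join(" ");
}

const synonymMap = buildSynonymMap([
  { term: "tv", synonyms: ["television"], twoWay: true },
]);
console.log(buildSynonymQuery("cheap tv", synonymMap)); // cheap (tv OR television)
```

Because the original search terms are left untouched and only the custom variable carries the expansion, several terms can be expanded in the same query, which is the case that a single query rewriting rule cannot cover.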

Also, by re-writing the query into a custom variable, you can still write query rules which trigger on the original search terms as long as they are non-complex. You could also add logic to remove complex query operators from the query and move them over to a query variable to make sure triggering always works.

Noise words

Getting hit highlighting on common words like any, are, as, at and be, and also including them in the query itself, can give poor recall and poor extracts. Our solution contains a list of noise words and you can edit this to suit your own needs. Potentially the solution could be re-built to include the noise words in a term set or in a SharePoint list instead.
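A minimal TypeScript sketch of noise word filtering could look like the following. The word list is just the five examples mentioned above, the function name is made up, and the fallback for an all-noise query is my own assumption rather than something the project necessarily does. Quoted phrases are not treated specially here.

```typescript
// Example noise words only; a real list would be longer and editable.
const NOISE_WORDS = new Set<string>(["any", "are", "as", "at", "be"]);

// Drop noise words from a whitespace-separated query, case-insensitively.
function removeNoiseWords(query: string, noise: Set<string> = NOISE_WORDS): string {
  const terms = query.split(/\s+/).filter((term) => term.length > 0);
  const kept = terms.filter((term) => !noise.has(term.toLowerCase()));
  // Assumption: if every term was a noise word, keep the original terms
  // rather than send an empty query.
  return kept.length > 0 ? kept.join(" ") : terms.join(" ");
}

console.log(removeNoiseWords("are there any policies at work")); // there policies work
```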

Performance issues with {User.} variables

There are a couple of posts out there relating to performance issues using {User.} variables in your query templates.

The issue is that, depending on whether you put your query template in a web part or in a result source, the server doing the expansion has to query the UPA and possibly the term store once per variable per search request from a user. Nothing is cached or re-used. And especially with SharePoint 2013 this can lead to cross server requests, increasing latency quite a lot, especially in environments with many users.

What we have implemented is a solution which pulls in all the synced UPA properties from a site collection’s hidden user info list, and you can use these as query variables instead. The values are cached per session right now, but we are looking at adding browser cache as well. Instead of using {User.} you can use {spcsrUser.} which solves many of the expansion cases.
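The caching idea can be sketched in TypeScript roughly like this. Everything here is an assumption for illustration: the KeyValueStore interface mimics the getItem/setItem part of the browser's sessionStorage, the load callback stands in for whatever call reads the user info list (in a real page that call would be asynchronous), and the function names are made up. Only the {spcsrUser.} prefix comes from the project.

```typescript
// Minimal subset of the sessionStorage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type UserProps = Record<string, string>;

// Return cached user properties if present; otherwise load once and cache them.
// `load` is a hypothetical stand-in for reading the user info list.
function getUserProps(store: KeyValueStore, cacheKey: string, load: () => UserProps): UserProps {
  const cached = store.getItem(cacheKey);
  if (cached !== null) return JSON.parse(cached) as UserProps;
  const props = load();
  store.setItem(cacheKey, JSON.stringify(props));
  return props;
}

// Replace {spcsrUser.Name} tokens with cached values; unknown tokens are left as-is.
function expandUserVariables(template: string, props: UserProps): string {
  return template.replace(/\{spcsrUser\.([^}]+)\}/g, (match: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(props, name) ? props[name] : undefined;
    return value !== undefined ? value : match;
  });
}
```

With this in place the properties are fetched at most once per session, and every later query on the page expands its variables in the browser instead of asking the server to look them up again.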

Bonus variables - because we can!

Since we now have code to inject custom variables, why not throw in some more out of the box, right? Knowing that people create lists with different date related columns we decided to add a bunch of date related ones such as weekdays, hours, month names and numbers.

If you have a column named WeekDay with an associated managed property you could add a filter like: WeekDay:{WeekDay}, and it would expand with the name of the current day. Right now the code includes English weekday and month names, but this can be edited to suit your needs.
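Date variables like these are cheap to compute on the client. The TypeScript sketch below shows one way to do it; only {WeekDay} is a name taken from the example above, while MonthName, MonthNumber and Hour are hypothetical names chosen for illustration (see the project page for the real list). It uses the browser's local time.

```typescript
const WEEKDAY_NAMES = [
  "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday",
];
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Build date related variables from the given point in time (local time zone).
// Apart from WeekDay, the variable names are assumptions.
function dateVariables(now: Date): Record<string, string> {
  return {
    WeekDay: WEEKDAY_NAMES[now.getDay()],
    MonthName: MONTH_NAMES[now.getMonth()],
    MonthNumber: String(now.getMonth() + 1), // getMonth() is zero-based
    Hour: String(now.getHours()),
  };
}

// Replace {Name} tokens that we know about; anything else, such as
// {searchTerms}, is left for the server to expand.
function expandVariables(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match
  );
}

const publishedOn = new Date(2016, 5, 3, 14, 30); // 3 June 2016, local time
console.log(expandVariables("{searchTerms} WeekDay:{WeekDay}", dateVariables(publishedOn)));
// {searchTerms} WeekDay:Friday
```

Swapping in other languages is a matter of replacing the two name arrays.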

Summary

Using the sample code from https://github.com/SPCSR/HelperFunctions/tree/master/SPO-Search-Improvements you can get started with synonym and noise word handling, as well as get some new query variables. You also have a great starting point for adding logic on the page to pull in more data needed to modify a query, and to do so asynchronously.

Add caching where it makes sense and use this awesome hammer module to continue to deliver great search solutions.

The next improvement on my list is to show how you can use the code to do user segmentation triggering of query rules, a smart way to accomplish contextual search. If you have an idea, do a pull request and we’ll include it.