Friday, April 4, 2014

S15E02: KQL–On Kryptonite

Want the book? Go get it!

This is the second episode of my series “SharePoint Search Queries Explained - The Series”. See the intro post for links to all episodes.

In episode one I covered the basic query operators of KQL. In this episode I will cover the advanced ones and show samples of how they can be combined in useful scenarios, together with some nice-to-know managed properties.

Note: Search terms entered are case-insensitive but the operators must be in uppercase.
NEAR (NEAR operator)
Description: Specifies that the terms in the query must appear within a specified distance of each other, counted in tokens. The operator takes two or more terms and a numeric value n that sets the maximum distance. The default value of n is 8.
Usage:
<expression> NEAR(n=4) <expression>
or
<expression> NEAR(n) <expression>
Example: Searching for recipe NEAR(3) meatballs NEAR(2) swedish returns items where the term recipe is within 3 terms of meatballs, and the term meatballs is within 2 terms of swedish. As discussed in the comments, chaining two NEAR operators like this may not parse as expected; grouping each pair, as in (recipe NEAR meatballs) (meatballs NEAR swedish), is the suggested workaround.

OK:
This is a recipe of tasty Swedish Meatballs

FAIL:
A Swedish cook creates tasty meatballs from an old family recipe

ONEAR (Ordered NEAR)
Description: Ordered variant of the NEAR operator, where the terms must appear in the specified order. The default value of n is 8.
Usage:
<expression> ONEAR(n=4) <expression>
or
<expression> ONEAR(n) <expression>
Example: Searching for recipe ONEAR(3) meatballs ONEAR(2) swedish returns items where the term recipe is within 3 terms of meatballs, and the term meatballs is within 2 terms of swedish. The terms must also appear in the order recipe, meatballs, swedish.

OK:
A recipe for good meatballs comes from Swedish people

FAIL:
Show me a recipe for Swedish meatballs

XRANK (rank-modifying operator)
Description: XRANK modifies the rank values of items in the result set based on an expression and a boost value. XRANK can be applied against both the full-text index and managed properties. Queries take the following format.
Usage:
<match expression> XRANK(cb=100, rb=0.4, pb=0.4, avgb=0.4, stdb=0.4, nb=0.4, n=200) <rank expression>
Example: Searching for swedish meatballs XRANK(nb=0.5) spicy returns all items containing swedish and meatballs, and gives a normalized boost of 0.5 to any item that also contains the term spicy.
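
To make the distance counting in the NEAR and ONEAR examples concrete, here is a minimal Python sketch that checks the four OK/FAIL sentences above. It is not how SharePoint evaluates proximity internally. The counting convention (at most n tokens between the two terms) is an assumption inferred from the examples in this post, and the function names tokenize and near are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def near(text, left, right, n=8, ordered=False):
    """Return True if left and right occur with at most n tokens between them.

    Assumption: distance is the number of tokens between the two terms.
    ordered=True mimics ONEAR, where left must appear before right.
    """
    tokens = tokenize(text)
    left_positions = [i for i, t in enumerate(tokens) if t == left.lower()]
    right_positions = [i for i, t in enumerate(tokens) if t == right.lower()]
    for i in left_positions:
        for j in right_positions:
            if i == j:
                continue
            if ordered and j < i:
                continue
            if abs(j - i) - 1 <= n:
                return True
    return False


near_ok = "This is a recipe of tasty Swedish Meatballs"
near_fail = "A Swedish cook creates tasty meatballs from an old family recipe"
onear_fail = "Show me a recipe for Swedish meatballs"

print(near(near_ok, "recipe", "meatballs", n=3))      # True
print(near(near_fail, "recipe", "meatballs", n=3))    # False
print(near(onear_fail, "meatballs", "swedish", n=2, ordered=True))  # False
```

Under this convention the NEAR FAIL sentence fails because four terms sit between meatballs and recipe, and the ONEAR FAIL sentence fails only because swedish comes before meatballs.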

The XRANK operator can take 7 different parameters. As noted on MSDN, you will typically only use the normalized boost parameter (nb), as this parameter provides the necessary control to promote or demote a particular item without taking standard deviation into account. If you are using the Dynamic Ordering capabilities of the Query Builder, you will note that it uses a combination of Constant Boost (cb) and Standard Deviation Boost (stdb). I have found cases where nb has no effect at all, so using cb with stdb might be a good approach.

The only way to actually get this working is to try out different values on real data. The SharePoint 2013 Query Tool is an excellent way to get started, as it lets you look at the rank value per item.
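
When experimenting with boost values, it can help to generate a handful of query variants up front and run them one by one in the Query Tool. The sketch below only builds the query strings; it does not call SharePoint. The helper name xrank_variants and the chosen boost values are arbitrary, for illustration only.

```python
def xrank_variants(match_expr, rank_expr, param="nb", values=(0.1, 0.5, 1.0)):
    """Build one KQL query string per boost value for side-by-side comparison."""
    return [f"{match_expr} XRANK({param}={value}) {rank_expr}" for value in values]


# Print the variants so they can be pasted into a search tool one at a time.
for query in xrank_variants("swedish meatballs", "spicy"):
    print(query)
```

Passing param="cb" instead produces the same queries with a constant boost, which makes it easy to compare the two parameters on the same data.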

Putting it all together
Parentheses can be used to group together different parts of a query, and may be used on either side of an operator (AND, OR, NOT, NEAR, ONEAR, XRANK). They may also be nested.

Sample 1
The query below will match items which include either swedish or norwegian together with both meat and sheep, without having the term awful in them. In addition, the items have to be last modified within the past year. Items which contain michelin guide in the title will be promoted by 0.5 rank points. Items which contain the term fish will be promoted by 0.1 rank points. Items which have both michelin guide in the title and contain fish will get 0.6 rank points added in total.

((((ANY(swedish norwegian) AND (meat sheep)) XRANK(cb=0.5) title:"michelin guide") write:"last year") XRANK(cb=0.1) fish) -awful
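
The nesting in Sample 1 is easier to follow when the query is assembled one layer at a time. The Python sketch below rebuilds the same query string (written with straight quotes) from the inside out. The helpers group and xrank are hypothetical names for plain string formatting, not part of any SharePoint API.

```python
def group(expr):
    """Wrap an expression in parentheses."""
    return f"({expr})"


def xrank(match_expr, rank_expr, cb):
    """Attach a constant-boost XRANK clause to a match expression."""
    return f"{match_expr} XRANK(cb={cb}) {rank_expr}"


# Innermost layer: the terms every result must match.
base = group("ANY(swedish norwegian) AND (meat sheep)")
# Boost items with "michelin guide" in the title.
boosted_title = group(xrank(base, 'title:"michelin guide"', 0.5))
# Restrict to items modified within the past year.
recent = group(f'{boosted_title} write:"last year"')
# Boost items containing fish, then exclude items containing awful.
boosted_fish = group(xrank(recent, "fish", 0.1))
query = f"{boosted_fish} -awful"

print(query)
```

Each XRANK wraps everything built so far as its match expression, which is why the two boosts are applied independently and can add up on the same item.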

Sample 2
The query below will match items of the Document content type which are Excel files.
SPContentType:Document fileextension:xlsx

Sample 3
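The query below will match items with the term finance in documents which are either Word documents or Excel workbooks. Repeated restrictions on the same property are combined as a union, so any one of the listed extensions is enough for a match.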

finance fileextension:doc fileextension:docx fileextension:xls fileextension:xlsx
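
If the list of extensions changes often, the restriction part of a query like Sample 3 can be generated rather than typed by hand. This is a small sketch under that assumption; with_extensions is a made-up helper name, and it relies on repeated restrictions on the same property being matched as a union.

```python
def with_extensions(terms, extensions):
    """Append one fileextension restriction per extension to the search terms."""
    restrictions = " ".join(f"fileextension:{ext}" for ext in extensions)
    return f"{terms} {restrictions}"


print(with_extensions("finance", ["doc", "docx", "xls", "xlsx"]))
```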

Sample 4
The query below will match web sites, lists, libraries and folders which have the term secret in the name.

IsContainer:1 secret

Sample 5
The query below will match files which have the term secret in the name.

IsDocument:1 secret

Sample 6
List only documents stored in OneDrive libraries.

IsMyDocuments:1

Sample 7
List all documents from a specific library or list, filtered by the ID of the list.

ListId:18D36608-943C-4173-8770-589ABBC5B786

4 comments:

  1. Hi Mikael

    I have bought your book and am reading it now.
    I have a question about the NEAR operator in Ch. 2, KQL Advanced.

    Given the rule you state below, I think the following should also be OK when querying for [recipe NEAR(3) meatballs NEAR(2) swedish].
    Can you tell me where my thinking is wrong?

    OK:
    This is a recipe of tasty Meatballs term1 term2 Swedish.


    =============================
    Searching for recipe NEAR(3) meatballs NEAR(2) swedish returns items where the term recipe is within 3 terms of meatballs.
    And the term meatballs is within 2 terms of swedish.

    OK:
    This is a recipe of tasty Swedish Meatballs

    FAIL:
    A Swedish cook creates tasty meatballs from an old family recipe
    =============================

    Replies
    1. not sure, but maybe it ignores noise words like «an». just a thought. or it could be a bug of sorts :)

  2. I mean the book should read like this:
    OK:
    This is a recipe of tasty term1 Meatballs term2 term3 Swedish.

    Am I right?

    Replies
    1. It seems a double NEAR parses incorrectly. You need to do: (recipe NEAR meatballs) (meatballs NEAR swedish)

      I cannot say whether this used to work before, but as you point out, it doesn't work now.
