Thursday, March 27, 2014

S15E01: KQL – The Basics

Want the book? Go get it!

This is the first episode in my series “SharePoint Search Queries Explained - The Series”. See the intro post for links to all episodes.

SharePoint 2013 includes two query languages which can be used to formulate your search queries. The Keyword Query Language (KQL) and the FAST Query Language (FQL). KQL is the topic for episode 1 and 2, and is the language you will mostly use when writing search queries, and is aimed at end-users. FQL has some extended capabilities over KQL, but you will usually solve your queries using KQL. FQL is also trickier to execute using the UI/CSOM/REST, but more on that in episode 10.

The basis of KQL is a set of operators and special characters you can use to formulate your queries. The KQL version included in SharePoint 2013 also have some enhancements brought over from FQL. This first episode will focus on the basics of KQL while the next will expand on some of the new special operators available (NEAR, ONEAR, XRANK) as well as crafting complex queries.

Keyword Operators

Keyword operators are what you will use the most as an end user formulating a query in a search box. Most users will only use variants of the top two as well, perhaps venturing into doing phrase searches.
Note: Search terms entered are case-insensitive but the operators must be in uppercase.
Operator Description Usage Example
termA termB Free text query One or more terms entered after each other with an implicit AND between each term A search for Swedish meatballs returns only items that have swedish and meatballs somewhere in them but not necessarily together.
AND/+ Boolean AND operator (the + character may also be used) This is the default operator and need not be specified. It sets the query to return only items with termA and TermB in them. Swedish AND meatballs and Swedish +meatballs and Swedish meatballs are essentially the same query.
"termA termB" Double quotation marks Placing double quotation marks around a multiple-term query makes the query a phrase search - only items that match the terms together in an exact phrase are returned. Searching for “Swedish meatballs” returns only items with swedish and meatballs together. The terms must follow each other exactly and in the specified order.
OR Boolean OR operator This is the disjunct operator and specifies that matching either of the terms on either side of this operator will satisfy the query. That is, items with either termA or TermB are returned. Searching for Swedish OR meatballs returns any item with the term swedish in it and any item with the term meatballs in it.
NOT/- NOT operator (the – character may also be used) This is the negation operator; it sets a trailing term to be an exclusion query. That is, any item that contains this term is excluded from the result set. Searching for Swedish NOT meatballs returns items that match Swedish but no items with the term meatballs in them are returned. Swedish -Meatballs is essentially the same query.
* Prefix matching wildcard (postfix is not supported) The wildcard operator can be added to the end of partial words to match terms with 0 or more trailing characters. Essentially, all terms starting with the entered characters up to the wildcard are matched. Searching for Swed* Meat* matches Swedish meatballs, Sweden meatballs, Sweden Meatloaf, Swedish Meat, and so on. These terms are not necessarily together. Phrase search is not supported with wildcards in KQL.
WORDS(termA, termB)
or
ANY(termA termB)
Synonym operator The synonym operator allows you to specify terms that should be considered synonyms of each other in the query. TermA and TermB should be considered to have the same meaning and, therefore, be searched for. This is equivalent to using the OR operator between termA and termB.

This operator cannot be used with property queries.
Searching for WORDS(Swedish, Svensk) returns all items with the term Swedish and all items with the term Svensk.

The difference between WORDS and ANY is that with ANY the terms are ranked as if they were the same term and not by their individual weight. An item with one instance of Swedish and two instances of Svensk would rank the same as an item with three instances of Swedish.
ALL(termA termB) Boolean AND operator Enclosing terms with the ALL operator is the same as writing the terms with a boolean AND between them.

This operator cannot be used with property queries.
Searching for ALL(Swedish meatballs) returns only items with Swedish and meatballs together.
() Parenthesis Parentheses are used to enclose and isolate a specific part of a complex query. If an opening parenthesis is used, a closing parenthesis must be provided. Meatballs NOT (Danish OR Norwegian) represents a query where all items with the term meatballs returns as long as the terms Danish or Norwegian are not also present.

Parentheses can also be nested. (Danish OR (Norwegian AND Swedish)) NOT (meatballs OR sausage) represent a query where items containing either danish or norwegian together with swedish, but not meatballs or sausage.

Property Queries and Operators

It is important to note that the property restriction can not include whitespace between the property name, the property operator, and the property value. If a space is encountered, the query is treated as a free-text query. Also note that a property restriction is limited to 2,048 characters.
In order to use a managed property in a property restriction the managed property must be marked as Queryable.
Date properties have a resolution of one day, so you cannot limit items between smaller intervals. Any hours you enter into the comparisons will be ignored.
Operator Description Supported Property types Example
= Equals Text, DateTime, Integer, Decimal, Double, YesNo fileextension:docx
: Contains Text, DateTime, Integer, Decimal, Double, YesNo author:mikael
author:”mikael svenson”
="term*" Starts with Text author="mikael*"

You need to type a full term before the wildcard. "mik*" will for example not work.
< Less than DateTime, Integer, Decimal, Double write<2014-03-27

Using yyyy-MM-dd is a safe way to ensure parsing of dates are exact
> Greater than DateTime, Integer, Decimal, Double write>2014-03-27
<= Less than or equal DateTime, Integer, Decimal, Double size<=1000000
>= Grater than or equal DateTime, Integer, Decimal, Double size>=1000000
<> Does not equal Text, DateTime, Integer, Decimal, Double, YesNo write<>2014-03-27
.. Range DateTime, Integer, Decimal, Double

Use: lower..higher
size:2000..3000 is equal to
size>=2000 AND size<=3000
If two property restrictions with the same property is added to a query, then an OR is performed between them, and not the implicit AND which is used by default with all other terms.
Some examples:
Items where the author is either mikael or garth.
author:mikael author:garth
Items where the author is mikael and garth in the same item
author:mikael AND author:garth
Items where the content type is Document and author is mikael
spcontenttype:document author:mikael

Property Queries and date intervals

In addition to specifying the date manually in a property query you can use some nifty built in variables.
Interval Description Example
today Represents the time from the beginning of the current day until the end of the current day. write=today
yesterday Represents the time from the beginning of the day until the end of the day that precedes the current day. write=yesterday
this week Represents the time from the beginning of the current week until the end of the current week. The culture in which the query text was formulated is taken into account to determine the first day of the week. write<”this week”
this month Represents the time from the beginning of the current month until the end of the current month. write<”this month”
last month Represents the entire month that precedes the current month. write:”last month”
this year Represents the time from the beginning of the current year until the end of the current year. write<”this year”
last year Represents the entire year that precedes the current year. write>”last year”

Query for managed metadata

The clue to query against managed metadata is to make sure the columns are site columns, so that you get an automatically created managed property for the column which you can query against. You can use either the label or the unique identifier for the term when querying. If you however want to query on a term which has siblings you have to use another notation:
  • To query for all items that are tagged with a term: GP0|#
    • It is also possible to query against the term default label as well
  • To query for all items that are tagged with a child of term: GPP|#
    • (Note that the parent term is NOT included in the results)
  • To query for all items that are tagged with a term from a term set: GTSet|#
    • (Note that the parent term is NOT included in the results)
If you query against owstaxIdTaxKeyword, then hierarchies will not work, only the individual term.

If you have a site column named location tied to a taxonomy, you will have the corresponding managed property yo query against: owstaxIdLocation.

Given the following taxonomy

World (fc01ae6d-8ed3-4872-9cef-d2199d52d61c)
      India (c8a43f13-5ea1-45f2-b46d-3a1986a1cbd7)
            Mumbai (ad491ed9-c21c-46d9-896c-c0d148957c60)
            Delhi (c195b6e0-9062-446a-9af1-8ec1a642fede)
      France (17587ed2-8433-45a4-9f4b-6825164fcd09)
            Paris (01031cfe-2492-47f1-8723-45c63ef70ec9)
            Lyon (3b2137a9-3c3a-4676-a50a-14f72ab29175)

All restaurants in Mumbai
owstaxIdLocation:"GP0|#ad491ed9-c21c-46d9-896c-c0d148957c60"
or
owstaxIdLocation:"Mumbai"

All restaurants in the World
owstaxIdLocation:"GTSet|#c8a43f13-5ea1-45f2-b46d-3a1986a1cbd7"

All restaurants in India
owstaxIdLocation:"GPP|#c8a43f13-5ea1-45f2-b46d-3a1986a1cbd7"

All restaurants in India, including India itself
owstaxIdLocation:"GPP|#c8a43f13-5ea1-45f2-b46d-3a1986a1cbd7" AND owstaxIdLocation:"GP0|#c8a43f13-5ea1-45f2-b46d-3a1986a1cbd7"

All restaurants which has a tagged location
owstaxIdLocation:"GTSet|#fc01ae6d-8ed3-4872-9cef-d2199d52d61c"

All items not tagged from a term set in a list
The clue is to add enough including terms to what you want, and then exclude the taxonomy column based on the term set id.

-owstaxIdLocation:"GTSet|#fc01ae6d-8ed3-4872-9cef-d2199d52d61c" contentclass:STS_ListItem ListID:72b79215-34e1-43fc-8cd5-16d594872aeb

Boost restaurants tagged with Delhi, if the users location is Delhi, and also boost restaurants which are not tagged at all
See S15E02 for more information on using XRANK and S15E03 for information on Query Variables. This sample was inspired by a conversation I had with Unni Løland.

(({searchTerms}) {?XRANK(cb=1) owstaxIdLocation:{User.SPS-Location}} ) {?XRANK(cb=1) -owstaxIdLocation:"GTSet|#fc01ae6d-8ed3-4872-9cef-d2199d52d61c"}

References

13 comments:

  1. Hi,
    is it also possible to use a dynamic field instead of a term-ID, for instance a user profile property? Something like: GPP|#User. ? I cant seem to get this to work...

    ReplyDelete
    Replies
    1. owstaxidlocation:GPP|#{\User.Taxfield.Id}
      should work

      Delete
  2. I don't get the "All restaurants in the World" example, is that correct? The id corresponds to India and what happens with France?

    ReplyDelete
  3. Hi Mikael! Is it possible to create other date intervals than the predefined. I want to filter everything which is elder than three months but that don't seem to be possible?

    ReplyDelete
    Replies
    1. Hi,
      By using query variables in the query template you can do this:
      write<{Today-90}

      See http://techmikael.blogspot.com/2014/05/s15e03-query-variables-constant-trouble.html

      Delete
    2. For some reason I couldn't get that to work directly in the searchbox but when I put in via the query generator in the search source it works fine!

      Next problem: I want to combine this argument with a specific part of my site, so it's only there the older items are filtered away. I've tried -(write<{Today-90} AND path:http://domainname/sitename/), (-write<{Today-90} AND path:http://domainname/sitename/) and (-write<{Today-90} AND -path:http://domainname/sitename/) but none of them work as intended.

      Delete
    3. You cannot use {} in the query itself. Query variables is part of the query template framework and must be used in the web part query template settings (query generator/wizard) or in a result source.

      Delete
    4. OK, that makes sense, but the queries above combining 'write' and 'path' are made on the result source and don't work. Using -write<{Today-90} alone works fine, but not in combination with path.

      Delete
    5. First off I would not do -write<{Today-90}, but rather do write>{Today-90} to reduce complexity.

      The following query template should work just fine:

      write>{Today-90} path:http://domainname/sitename

      This will give documents newer than 90 days from that site. No need to type AND as it's implicit between properties of different types.

      Delete
    6. Good point with the rewrite, but my challenge is that this restriction only applies to part of the source (http://domainname/sitename) but not for the rest of the domain (http://domainname/[anything else but sitename])

      Delete
    7. spsiteurl:https://domain.com/sites/sitecoll will limit to just that one site collection

      Delete
  4. Hi Mikael. Do you know if the querying for managed metadata using owstaxidxxx:guid should still work in hybrid search setups? I was earlier using a query like owstaxIdGlobalBrands:"a69d1f79-3d27-4d66-8b72-a04f88b0e087" that gave me results tagged with that parent term and all its children. Trying something similar now but not getting anything back. Testing in the non-hybrid setup still gives me expected results.

    ReplyDelete