Tuesday, December 30, 2014

Limit search results to “Documents”

A frequently asked question is how to set up a Document Search that lists documents only, typically Office formats like Word, Excel and PowerPoint. But the answer is not that black and white, and can differ from company to company.

One query you might consider executing to list documents only in SharePoint is IsDocument:true.

This query returns everything SharePoint deems a file or web page, that is, (almost) anything that is not a list item. Office documents, images, zip files… you name it, it will be part of the result set.

Which brings up the question: What is a document?
And this is where human judgment comes into play: you should be opinionated and not rely entirely on computing power. In my recent project we're starting off by defining a document as:
  • A file authored in an Office client application (or similar)
  • E-mails saved as a file
  • PDF files / Microsoft XPS
  • Web pages in SharePoint (Wiki, Blog etc)
  • NOT a template file
  • NOT a multimedia file
This yields the following query to show what we deem a document:
IsDocument:1 FileType:doc* FileType:xl* FileType:ppt*
FileType:one FileType:aspx FileType:htm* FileType:eml
FileType:msg FileType:od* FileType:pdf FileType:vd*
FileType:vsd* FileType:vss* FileType:xps FileType:rtf
-FileType:xlt* -FileType:pot* -FileType:dot* 
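
Since the list of extensions is likely to change as the definition of a document evolves, it can help to generate the query string from two lists instead of editing it by hand. The following is a minimal sketch, not part of the original setup: the helper name `build_document_query` and the list names are hypothetical, and `od*` is assumed to be meant as an included prefix.

```python
# Hypothetical helper for illustration; the names are not part of SharePoint.
INCLUDED_PREFIXES = [
    "doc*", "xl*", "ppt*", "one", "aspx", "htm*", "eml", "msg",
    "od*", "pdf", "vd*", "vsd*", "vss*", "xps", "rtf",
]
EXCLUDED_PREFIXES = ["xlt*", "pot*", "dot*"]  # template formats


def build_document_query(included, excluded):
    """Join FileType restrictions into a single KQL query string."""
    parts = ["IsDocument:1"]
    # One FileType restriction per included extension or prefix.
    parts += [f"FileType:{ext}" for ext in included]
    # A leading minus marks an exclusion, as on the last line of the query.
    parts += [f"-FileType:{ext}" for ext in excluded]
    return " ".join(parts)


query = build_document_query(INCLUDED_PREFIXES, EXCLUDED_PREFIXES)
print(query)
```

Adding or dropping a format then becomes a one-line change to either list, and the generated string can be pasted into the query box to verify the results.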

If you have any input as to what should be included/omitted feel free to leave a comment.

As for implementation, we have created a custom Result Source with this filter, which is used by a search vertical in the search center.
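
If you want to test such a Result Source outside the search center, one option is the search REST endpoint, which takes a `querytext` and an optional `sourceid` parameter. The sketch below only builds the request URL; the function name `search_url`, the site address and the all-zero GUID are placeholders, and doubling single quotes is the usual OData convention, assumed here to apply.

```python
from urllib.parse import quote


def search_url(site_url, kql, source_id=None):
    """Build a GET URL for the SharePoint search REST endpoint."""
    # Assumption: embedded single quotes are doubled, as in OData string literals.
    literal = kql.replace("'", "''")
    base = site_url.rstrip("/")
    url = f"{base}/_api/search/query?querytext='{quote(literal, safe='')}'"
    if source_id:
        # The Result Source already holds the document filter.
        url += f"&sourceid='{source_id}'"
    return url


# Placeholder site and GUID, for illustration only.
url = search_url(
    "https://contoso.example/sites/search",
    "budget report",
    source_id="00000000-0000-0000-0000-000000000000",
)
print(url)
```

Because the filter lives in the Result Source, the query text only needs to carry the user's search terms.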