Tuesday, August 9, 2011

How-to: Create a content collection in code

(Update: Read http://techmikael.blogspot.com/2011/08/working-with-content-collections-via.html for how to do this in a SharePoint context)


When you install FAST Search for SharePoint, it creates a default content collection named “sp”, where all content crawled via the FAST Content SSA is stored.

You also have the option to create new collections. This is typically something you would do for the FAST-specific connectors (FAST Enterprise Web Crawler, FAST Database Connector, FAST Lotus Notes Connector), so that each source can be managed on its own, for example by clearing out all content in one collection.

A content collection is merely a logical grouping of content inside FAST Search for SharePoint: every indexed item has an additional field named “meta.collection” attached. It does not affect the physical layout of how FAST stores the search index.

In PowerShell you can use the command New-FASTSearchContentCollection to create new collections, but sometimes you want to do this in code, for example via SharePoint feature activation.

Once you have a reference to the ContentContext object, it’s one line of code to create a new collection:
ContentContext contentContext = new ContentContext();
contentContext.Collections.AddCollection("notes", "Lotus notes connection");

Except that if you run this code, you get an exception:

Unhandled Exception: Microsoft.SharePoint.Search.Extended.Administration.Common.AdminException: Invalid pipeline name 'Office14'.

The name “Office14” is a static string retrieved from Microsoft.SharePoint.Search.Extended.Administration.Content.CollectionConstants.Office14

So how do we fix this? Listing the installed pipelines, for example with PowerShell, shows the correct name of the pipeline:

PS C:\FASTSearch> Get-FASTSearchDocumentProcessingPipeline
Name
----
Office14 (webcluster)
Attachments (webcluster)

The correct name is “Office14 (webcluster)”, not “Office14”. Modifying the initial code to use the overload that also takes the name of the pipeline solves the error:

contentContext.Collections.AddCollection("notes", "Lotus notes connection", "Office14 (webcluster)");

The “webcluster” part of the name is something which is carried over from FAST ESP.
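If you would rather not hard-code the full string, the qualified name can be composed from its two parts. This is only a minimal sketch: it assumes the "<pipeline> (<cluster>)" pattern seen in the PowerShell output above holds on your installation, and PipelineNames and Qualify are hypothetical helper names, not part of the FAST Search for SharePoint API.

```csharp
using System;

// Hypothetical helper, not part of the FAST Search for SharePoint API.
public static class PipelineNames
{
    // Builds "<pipeline> (<cluster>)", the pattern seen in the
    // Get-FASTSearchDocumentProcessingPipeline output (an assumption).
    public static string Qualify(string pipeline, string cluster)
    {
        if (string.IsNullOrEmpty(pipeline))
            throw new ArgumentException("Pipeline name is required.", "pipeline");
        if (string.IsNullOrEmpty(cluster))
            throw new ArgumentException("Cluster name is required.", "cluster");

        return string.Format("{0} ({1})", pipeline, cluster);
    }
}

public static class Program
{
    public static void Main()
    {
        string pipelineName = PipelineNames.Qualify("Office14", "webcluster");
        Console.WriteLine(pipelineName); // prints "Office14 (webcluster)"

        // With the FAST administration assemblies referenced, the result would
        // be passed to the three-argument overload shown earlier:
        // contentContext.Collections.AddCollection("notes", "Lotus notes connection", pipelineName);
    }
}
```

It is still worth checking the actual name with Get-FASTSearchDocumentProcessingPipeline before relying on the composed string.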