Wednesday, June 15, 2011

Clearing items from a specific content source

Your Search Service Application, whether it is the standard SharePoint Search or FAST Search for SharePoint, has a link called Index Reset which will clear all the searchable items in your index.

For FAST Search for SharePoint you also have to call Clear-FASTSearchContentCollection on the FS4SP farm via PowerShell.
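As a rough sketch, the FS4SP side of a full reset could look like the following when run from the FAST Search for SharePoint shell on an admin server. The collection name "sp" is an assumption (it is the usual default for FS4SP content); check the output of Get-FASTSearchContentCollection for the name used on your farm.

```powershell
# Load the FAST cmdlets if the shell has not already done so
Add-PSSnapin Microsoft.FASTSearch.PowerShell -ErrorAction SilentlyContinue

# List the content collections to confirm the name before clearing anything
Get-FASTSearchContentCollection

# Assumption: the collection is named "sp"; replace with your own name
$collectionName = "sp"

# Remove every item from that collection
Clear-FASTSearchContentCollection -Name $collectionName
```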

This is all fine if you want to remove everything, but sometimes you only want to remove part of the index, for example the items from a particular content source such as SharePoint or your file server.


The trick is to remove the start addresses from the content source and then re-add them. Removing a start address from a content source triggers a delete mechanism which removes all items crawled from that start address.

Here’s a sample PowerShell script which removes all items from my file server content source, called File, which resides on my Search Service Application, called FASTContent. It is probably something to turn into a .ps1 file or a cmdlet.

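# Name of the content source to clear, and the Search Service Application hosting it
$sourceName = "File"
$contentSSA = "FASTContent"

# Look up the content source
$source = Get-SPEnterpriseSearchCrawlContentSource -Identity $sourceName -SearchApplication $contentSSA

# Save the current start addresses as strings
$startaddresses = $source.StartAddresses | ForEach-Object { $_.OriginalString }

# Removing the start addresses triggers deletion of the items crawled from them
$source.StartAddresses.Clear()

# Re-add the saved start addresses
ForEach ($address in $startaddresses ){ $source.StartAddresses.Add($address) }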
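
Since the script is a candidate for a .ps1 file or cmdlet, here is one way it might be wrapped as a reusable function. This is only a sketch: the function name Reset-ContentSourceStartAddresses and its parameter names are made up for illustration, and it assumes it runs in the SharePoint 2010 Management Shell (or any session where the Microsoft.SharePoint.PowerShell snap-in is loaded).

```powershell
# Hypothetical helper; the name and parameters are illustrative only
function Reset-ContentSourceStartAddresses {
    param(
        [Parameter(Mandatory = $true)] [string] $SourceName,
        [Parameter(Mandatory = $true)] [string] $SearchApplication
    )

    # Look up the content source on the given Search Service Application
    $source = Get-SPEnterpriseSearchCrawlContentSource -Identity $SourceName -SearchApplication $SearchApplication

    # Save the current start addresses as strings; @() keeps a single address as an array
    $addresses = @($source.StartAddresses | ForEach-Object { $_.OriginalString })

    # Removing the start addresses triggers deletion of the items crawled from them
    $source.StartAddresses.Clear()

    # Re-add the saved addresses; Out-Null hides the value returned by Add()
    foreach ($address in $addresses) {
        $source.StartAddresses.Add($address) | Out-Null
    }

    # Return the addresses that were re-added so the caller can verify them
    $addresses
}

# Usage, with the names from the script above
Reset-ContentSourceStartAddresses -SourceName "File" -SearchApplication "FASTContent"
```

Returning the re-added addresses makes it easy to confirm that nothing was lost between the clear and the re-add.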

5 comments:

  1. Thanks, sharing this is very useful :)

  2. Can we do this for a MOSS 2007 site? If so, please let me know the steps.

    Replies
    1. You can try to do it manually by removing the start addresses, saving, then adding them.

      You can also do it programmatically. Take a look at http://msdn.microsoft.com/en-us/library/aa551656(office.12).aspx which should get you started.

  3. Thanks for your reply, Mikael.
    I have an issue with one content source, and I now want to run a full crawl on it after resetting the index for that content source only.

    Our index server has multiple content sources.
    If I remove the start address from the problematic content source and delete it, will that reset the indexes of all content sources?

    Replies
    1. Hi,
      You can either try to remove the start addresses, or remove the content source itself. This should clear up any items for that content source, not affecting the others.
