Monday, January 17, 2011

How to “spy” the data in a custom pipeline extensibility stage with FS4SP

In the old FAST, a much-used stage during development was the “Spy” stage. It dumps a log file of all current attributes and their values at that point in the content processing pipeline.

Fortunately for us, this stage still exists in FS4SP, and it might help you when testing and debugging your crawling.

To enable the spy stage, first stop the FAST configserver:

nctrl stop configserver

Second, open up %FASTSEARCH%\etc\pipelineconfig.xml
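Since this step edits a live configuration file by hand, it can be worth keeping a copy first. The snippet below is a minimal sketch of that, not part of FS4SP itself: the helper name backup_config and the ".bak" naming scheme are my own assumptions, and the only thing taken from the post is the location of pipelineconfig.xml.

```python
import shutil
import time


def backup_config(config_path):
    """Copy config_path to a timestamped sibling file and return the backup path.

    Hypothetical helper for illustration; the naming scheme is an assumption.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = "{}.{}.bak".format(config_path, stamp)
    # copy2 also preserves the file's timestamps where the platform allows it
    shutil.copy2(config_path, backup_path)
    return backup_path


# Example usage (assumes the FASTSEARCH environment variable is set):
#   import os
#   cfg = os.path.join(os.environ["FASTSEARCH"], "etc", "pipelineconfig.xml")
#   print(backup_config(cfg))
```

If the edit goes wrong, copying the backup over pipelineconfig.xml and restarting the configserver should bring back the previous pipeline.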

Typically you want to add your spy stage right before or after the custom extensibility stage. In the example below I have added it before.
[image: pipelineconfig.xml with the spy stage added before the custom extensibility stage]

After the edit, save your file and start the configserver again:

nctrl start configserver

If you watch the %FASTSEARCH%\var\log folder during indexing, you will see a file named spy.txt appear, which contains all the fields currently available to you.

The file is overwritten for each processed document, so it only contains information from the latest one. If you index using only one document processor, it’s still a valuable tool during development for checking that your custom stage receives the data you expect.
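Because spy.txt only ever holds the latest document, one workaround is to copy it aside every time it changes. The following is a rough sketch of that idea under a few assumptions: the function names, the snapshot folder and the polling interval are all made up for illustration, and simple polling can miss documents that are processed faster than the interval.

```python
import os
import shutil
import time


def snapshot_if_changed(spy_path, out_dir, last_mtime):
    """Copy spy_path into out_dir if its modification time differs from last_mtime.

    Returns the modification time that was snapshotted, or last_mtime if
    nothing was copied. Hypothetical helper, not part of FS4SP.
    """
    try:
        mtime = os.stat(spy_path).st_mtime_ns
    except OSError:
        return last_mtime  # spy.txt has not been written yet
    if mtime == last_mtime:
        return last_mtime  # same version as last time, nothing to do
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, "spy-{}.txt".format(mtime))
    try:
        shutil.copy2(spy_path, target)
    except OSError:
        return last_mtime  # file may be mid-write; retry on the next poll
    return mtime


def watch(spy_path, out_dir, interval=0.5):
    """Poll spy_path forever, keeping one snapshot per observed change."""
    last = None
    while True:
        last = snapshot_if_changed(spy_path, out_dir, last)
        time.sleep(interval)


# Example usage (assumes the FASTSEARCH environment variable is set):
#   log_dir = os.path.join(os.environ["FASTSEARCH"], "var", "log")
#   watch(os.path.join(log_dir, "spy.txt"), os.path.join(log_dir, "spy-snapshots"))
```

Each snapshot is named after the modification time of the version it was copied from, so the files sort in processing order. Stop the watcher with Ctrl+C when you are done indexing.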