Friday, February 26, 2010

Directory Search with multiple filters in .Net

Directory.GetFiles in .Net 3.5 and Directory.EnumerateFiles in .Net 4.0 neither supports multiple patterns when searching for files. The reason is that it uses FindFirstFile / FindNextFile of kernel32.dll which lacks the support.

Initial thought would be to create an extension method to the Directory class, but since it’s a static class that’s not possible. The second best choice is to create a short helper class instead. What we do is do a wildcard search with “*” and filter the results with a regular expression.

If you return a large result set the new Enumerable version in .Net 4.0 is preferable as it returns values to act on as you go along.

public static class MyDirectory
{
// Works in .Net 3.5 - you might want to create several overloads
public static string[] GetFiles(string path, string searchPatternExpression, SearchOption searchOption)
{
if (searchPatternExpression == null) searchPatternExpression = string.Empty;
Regex reSearchPattern = new Regex(searchPatternExpression);
return Directory.GetFiles(path, "*", searchOption).Where(file => reSearchPattern.IsMatch(Path.GetFileName(file))).ToArray();
}

// Works in .Net 4.0 - inferred overloads with default values
public static IEnumerable<string> GetFiles(string path, string searchPatternExpression = "", SearchOption searchOption = SearchOption.TopDirectoryOnly)
{
Regex reSearchPattern = new Regex(searchPatternExpression);
return Directory.EnumerateFiles(path, "*", searchOption).Where(file => reSearchPattern.IsMatch(Path.GetFileName(file)));
}

// Works in .Net 4.0 - takes same patterns as old method, and executes in parallel
public static IEnumerable<string> GetFiles(string path, string[] searchPatterns, SearchOption searchOption = SearchOption.TopDirectoryOnly)
{
return searchPatterns.AsParallel().SelectMany(searchPattern => Directory.EnumerateFiles(path, searchPattern, searchOption));
}
}