Friday, February 26, 2010

Directory Search with multiple filters in .Net

Directory.GetFiles in .Net 3.5 and Directory.EnumerateFiles in .Net 4.0 neither supports multiple patterns when searching for files. The reason is that it uses FindFirstFile / FindNextFile of kernel32.dll which lacks the support.

Initial thought would be to create an extension method to the Directory class, but since it’s a static class that’s not possible. The second best choice is to create a short helper class instead. What we do is do a wildcard search with “*” and filter the results with a regular expression.

If you return a large result set the new Enumerable version in .Net 4.0 is preferable as it returns values to act on as you go along.

public static class MyDirectory
{
// Works in .Net 3.5 - you might want to create several overloads
public static string[] GetFiles(string path, string searchPatternExpression, SearchOption searchOption)
{
if (searchPatternExpression == null) searchPatternExpression = string.Empty;
Regex reSearchPattern = new Regex(searchPatternExpression);
return Directory.GetFiles(path, "*", searchOption).Where(file => reSearchPattern.IsMatch(Path.GetFileName(file))).ToArray();
}

// Works in .Net 4.0 - inferred overloads with default values
public static IEnumerable<string> GetFiles(string path, string searchPatternExpression = "", SearchOption searchOption = SearchOption.TopDirectoryOnly)
{
Regex reSearchPattern = new Regex(searchPatternExpression);
return Directory.EnumerateFiles(path, "*", searchOption).Where(file => reSearchPattern.IsMatch(Path.GetFileName(file)));
}

// Works in .Net 4.0 - takes same patterns as old method, and executes in parallel
public static IEnumerable<string> GetFiles(string path, string[] searchPatterns, SearchOption searchOption = SearchOption.TopDirectoryOnly)
{
return searchPatterns.AsParallel().SelectMany(searchPattern => Directory.EnumerateFiles(path, searchPattern, searchOption));
}
}



2 comments:

  1. I'm in .NET 4.0 and am using the bottom overload successfully as it does exactly what I was doing, but in half the time.

    I like the idea if using LINQ with the results but wouldn't have known about how to execute it in parallel so pretty like that.

    Very nice code. Thanks for sharing!

    ReplyDelete
  2. Thank you for nice and concise article. You taught me
    1. Why we don't have multiple filters in EnumerateFiles
    2. Introduced me to ParallelEnumerables
    3. SelectMany

    Thank you11

    ReplyDelete