Saturday, December 25, 2010

Identifying target machine (32bit or 64bit) with ClickOnce deployment

The title comes from a question on Stack Overflow, and the problem is this: when you have third-party libraries compiled explicitly for x86 or x64, how do you check which environment you are running under so you can instantiate the correct version?
My test project for the scenario contains three parts: “test”, my third-party library; “ClickOnceTest”, the main application; and “Loader”, the application which starts “ClickOnceTest” with the correct version of the “test” library.
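
The check itself is small; here is a minimal sketch of the idea (my own illustration with a hypothetical folder layout, not the post's Loader code), using the pointer size of the running process to pick the right build:

using System;
using System.IO;
using System.Reflection;

class ArchitectureCheck
{
    static void Main()
    {
        // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit process
        string archFolder = IntPtr.Size == 8 ? "x64" : "x86";

        // Hypothetical layout: the x86 and x64 builds of "test" live in sibling folders
        string libPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, archFolder, "test.dll");
        Assembly testLib = Assembly.LoadFrom(libPath);
        Console.WriteLine("Loaded " + testLib.FullName + " from " + archFolder);
    }
}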

Thursday, December 23, 2010

How To: Change location of index files for FAST Search for SharePoint

[Update: The only supported way of moving the data folder is using symlinks ( eg. mklink.exe) - http://support.microsoft.com/kb/2506015]


When installing FS4SP it will by default put the index files below the folder where FAST Search is installed. If you installed to C:\FASTSearch, the files reside in:
  • C:\FASTSearch\data\data_index (the search index)
  • C:\FASTSearch\data\data_fixml (raw format files)
Often you would like to put this data on a separate volume. If you don’t know which files to edit, you can resort to using mklink.exe and symlink the above folders to another volume.

Wednesday, December 22, 2010

How To: Debug and log FAST Search for SharePoint pipeline extensibility stages with Visual Studio

One of the most powerful features of FAST Search for SharePoint is the ability to do work on the indexed data before it’s made searchable. This can include extracting location names from the documents being indexed, or enriching the data from external sources, for example adding financial data to a customer’s CRM record based on a lookup key. Only your imagination limits the possibilities.
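
As a rough sketch of what such a stage looks like, here is a minimal console program (my assumptions about the contract: FS4SP hands the stage the paths to an input and an output XML document, and writing to a log file is the simplest way to see what it receives):

using System;
using System.IO;
using System.Xml.Linq;

class PipelineStage
{
    // args[0] = path to the input XML, args[1] = path to the output XML (assumed contract)
    static int Main(string[] args)
    {
        XDocument input = XDocument.Load(args[0]);

        // Log what the stage received - handy since it runs unattended during a crawl
        File.AppendAllText(@"c:\temp\stage.log",
            DateTime.Now + Environment.NewLine + input + Environment.NewLine);

        // Uncomment to attach Visual Studio while the stage is running:
        // System.Diagnostics.Debugger.Launch();

        // ... inspect or add crawled properties here before writing the output ...
        input.Save(args[1]);
        return 0;
    }
}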

Saturday, December 18, 2010

Doing blended search results in SharePoint–Part 2: The Custom CoreResultsWebPart Way

(Part 1: The Hackish Way)
In Part 1 I used two Search Core Results Web Parts and a bit of jQuery magic to achieve the look of blended search results.

This time we will create our own CoreResultsWebPart and inject the blended results into the result xml before it is transformed into html. In addition to blending in news results I decided to get some images as well. I did this by importing a “Federated Location” for Flickr. The location definition can be found at “Flickr Search Connector for SharePoint Server, Search Server, and FAST Search for SharePoint”.
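
The skeleton of such a web part could look something like this (a minimal sketch only; I'm assuming GetXPathNavigator is the override point available in your SharePoint version, and the blending logic itself is left out):

using System.Xml.XPath;
using Microsoft.Office.Server.Search.WebControls;

public class BlendedCoreResultsWebPart : CoreResultsWebPart
{
    protected override XPathNavigator GetXPathNavigator(string viewPath)
    {
        // Get the regular result xml from the base web part
        XPathNavigator results = base.GetXPathNavigator(viewPath);

        // ... query the news scope and the Flickr federated location here,
        // and append the extra <Result> nodes to the xml before it is
        // handed over to the xslt transform ...

        return results;
    }
}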

Wednesday, December 8, 2010

Doing blended results in SharePoint–Part 1: The Hackish Way

A comment from a colleague on my previous blog post, “XSLT creation revisited for SharePoint 2010 Search and a small search tip”, asked how I would do blended search results in SharePoint Search. I have come up with three ways of doing this, where I will demonstrate the quick and dirty one in this post, and save the “best architectural” version for last. So watch out for part 2 and 3 of this topic in the weeks to come.

The method described in this post is suitable for non-developers.

Tuesday, November 30, 2010

Increasing the summary length in FS4SP

In the settings for the Core Results Web Part you have the possibility to set the length of your hit summary. The default is 185 characters, and the upper limit seems to be somewhere around 400 when running against FAST Search Server 2010 for SharePoint.

Friday, November 26, 2010

XSLT creation revisited for SharePoint 2010 Search and a small search tip

Search tip

If you search with only a hash “#”, then you will do an empty search and all results are returned.
When modifying the xslt for the Core Search Result Webpart it’s nice to know what data is actually included in the xml.
The SharePoint 2010 documentation has a section called “How to: View Search Results XML Data”, which also existed for 2007. This time around it includes the important (obsolete in HTML5) XMP tag, which makes rendering xml a breeze. Best practice is to use the PRE tag, but then you have to html encode your tags for them to render correctly.

Friday, November 12, 2010

Creating Zip files with System.IO.Packaging namespace

Although originally created to support working with Office Open XML documents, this namespace can be used to create zip files as well.
The only drawbacks I have found are that you end up with an additional xml file at the root of your zip file, [Content_Types].xml, which lists the mapping of file extensions to mime types, and that you cannot have spaces or non-ascii characters in your filenames.
If you can live with this, there is no need to rely on an external library.
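
Here is a minimal sketch of creating a zip this way (hypothetical paths; you need a reference to WindowsBase.dll for the System.IO.Packaging namespace):

using System;
using System.IO;
using System.IO.Packaging;

class ZipSample
{
    static void Main()
    {
        using (Package zip = Package.Open(@"c:\temp\sample.zip", FileMode.Create))
        {
            // Part names cannot contain spaces or non-ascii characters
            Uri partUri = PackUriHelper.CreatePartUri(new Uri("readme.txt", UriKind.Relative));
            PackagePart part = zip.CreatePart(partUri, "text/plain", CompressionOption.Normal);

            byte[] data = File.ReadAllBytes(@"c:\temp\readme.txt");
            using (Stream stream = part.GetStream())
            {
                stream.Write(data, 0, data.Length);
            }
        }
        // The resulting sample.zip will also contain [Content_Types].xml at the root
    }
}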

Thursday, November 11, 2010

Why the Enterprise Search Web Parts are sealed

I can’t claim to know the real political reasons behind this; according to Corey Roth’s blog post last December, the answer is that “it’s by design”. In my mind “by design” is not a real reason.
For non-.Net programmers, a sealed web part means that you cannot inherit from it to do your own customizations.
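
In code terms (illustrative names only):

public sealed class SomeSearchWebPart
{
    // ...
}

// Compile error CS0509: cannot derive from sealed type 'SomeSearchWebPart'
public class MyCustomSearchWebPart : SomeSearchWebPart
{
}
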
So, why are they really sealed?

Thursday, November 4, 2010

Reading Excel Sheets (xlsx) with .Net

The most common way to read Excel sheets up until recently was to use ADO.Net with the ACE OLEDB driver. It works, but you have to install the latest drivers etc., and isn’t it time to do it differently?
Third party solutions like Aspose aside, it’s possible to do this with all native .Net code. As many may or may not know, .Net 3.0 introduced the System.IO.Packaging namespace used to work with the Office Open XML format used by Office 2007 and newer. Files with docx, xlsx and pptx extensions are all created in this format, and are basically zipped xml file structures. Rename a xlsx file to zip, open it in your favorite zip browser, and you will see something like this:
(screenshot: an xlsx file opened as a zip archive)
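
From there, reading a sheet is a matter of opening the package and pulling out the right part. A minimal sketch (hypothetical path; the first worksheet normally lives at /xl/worksheets/sheet1.xml):

using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;

class XlsxPeek
{
    static void Main()
    {
        using (Package xlsx = Package.Open(@"c:\temp\book1.xlsx", FileMode.Open, FileAccess.Read))
        {
            // Part names inside the package always start with "/"
            PackagePart sheet = xlsx.GetPart(new Uri("/xl/worksheets/sheet1.xml", UriKind.Relative));

            XmlDocument doc = new XmlDocument();
            using (Stream stream = sheet.GetStream())
            {
                doc.Load(stream);
            }
            Console.WriteLine(doc.DocumentElement.Name); // "worksheet"
        }
    }
}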

Monday, October 25, 2010

Know your SharePoint version

This post is not so technical, but relates to the business side of SharePoint. I’m doing a FAST Search for SharePoint project with a customer. The company has 3000 employees and was told it could use FS4SP with the Standard version of SharePoint, meaning the standard CALs. I mentioned in a meeting some weeks ago that I was 99% sure this was not possible, and that you need the Enterprise license in order to use FS4SP.

Starting the install today showed I was right. The Standard version does not have the FAST options for a Search Service Application. Adding in an Enterprise serial number enabled all the good pieces.

But, for 3000 users, going from Standard to Enterprise will cost you. Using ballpark figures from http://www.sharepointconfig.com/2010/05/indicative-sharepoint-2010-licencing-costs/ where a standard CAL is ~£60 and an eCAL is £60+£53=£113, we are talking forking out £339,000 instead of £180,000, a whopping 88% increase. And to top it off, we are talking about an offshore company where there are 3 shifts during a 24h period, and people work 2 weeks and are off for 4, so the number of licenses in use at any one time will never be more than around 350.

At times like this I’m happy to be a tech and not a buyer :)

Friday, August 20, 2010

Voting and StackOverflow

While watching TV in the evening or waiting for compiles at work to finish I often spend my time browsing and answering questions on StackOverflow.

Today I got my first “Nice Answer” badge, meaning you got 10+ votes on an answer. My answer was on a rather non-important question: Which operator is faster: ?: or &&

The question is irrelevant in day to day work, and that’s what I wrote in my answer. Readability is more important than which piece of code is faster. At least for things which are already super fast.

What I find interesting is that answers to questions like these get the most votes. Harder, more difficult questions tend to get fewer votes.

So why is this? My only explanation is that topics/questions which many people understand get more votes simply because more people read the question. If the question is outside your programming skills, you can’t evaluate the answers, and most likely will not spend time understanding it, so you won’t vote on it (if you ever decide to read the question at all).

Monday, August 16, 2010

Boot a little bit faster with Windows 7

It’s nothing new, but thought I’d mention it for those who are not aware of it.

When booting Windows 7, or Vista, it will only use one cpu or core when executing the programs in the boot process. Changing this might not help you at all, but if a lot of services etc. are started upon boot, it might improve your boot time a little bit.

  1. Click on: [Start] (Round windows button on your taskbar)
  2. Type in: msconfig
  3. Hit: [Enter]
  4. Click on the tab: Boot
  5. Click: Advanced options…
  6. Check: Number of processors
  7. Select the number for your machine, 2 in my case as I have a dual core, without hyper-threading.
  8. Click on: [OK]

That’s it, it will now utilize more of your cpu during boot.


Tuesday, July 13, 2010

To Tuple or Not To Tuple

Yesterday I answered a question on StackOverflow about “What is the most intuitive way to ask for objects in pairs?” The question was really about using KeyValuePair<,>, but a lot of the answers suggested using Tuple<> instead, which is a new construct in .Net 4.

From the MSDN documentation we read:

“A tuple is a data structure that has a specific number and sequence of values. The Tuple<T1, T2> class represents a 2-tuple, or pair, which is a tuple that has two components. A 2-tuple is similar to a KeyValuePair<TKey, TValue> structure.”

As long as you only have two elements, it doesn’t really matter if you use Tuple<,> or KeyValuePair<,>. But keep in mind that a Tuple is a class while KeyValuePair is a struct, so for certain scenarios one would be preferable over the other. (A Tuple can have up to seven direct elements, with an eighth slot reserved for nesting another Tuple to create an n-tuple.)

So over to the real question, should you use it, or when should you use it? In my opinion it boils down to expressing the intent of your code.

Consider the two following lines of code:

List<Tuple<int, int>> list = new List<Tuple<int, int>>();
List<Point> list = new List<Point>();

They can both be used for holding an x/y coordinate, but the Point struct clearly is more expressive, letting the reader know we’re talking about coordinates.

This does not mean we should not use the Tuple class. The following is also expressive and shows intent:

Tuple<Man, Woman> couple = new Tuple<Man, Woman>(m, w);

This leads me to the conclusion that it’s ok to use Tuple<> as long as the types in the Tuple are expressive, meaning they are not base types. A base type says nothing about what it holds. An int can hold a coordinate, an age or any other number of things, but if you wrap your coupled data in a pairing class you can express what you are working with.

Would you use

var bmiList = new List<Tuple<double, double>>();
var bmi = new Tuple<double, double>(180,75);
bmiList.Add(bmi);

or

class BMI
{
    public double Height;
    public double Weight;
}

var bmiList = new List<BMI>();
BMI bmi = new BMI { Height = 180, Weight = 75 };
bmiList.Add(bmi);

A good use for Tuples is in methods that need to return multiple values, where the values are only used locally/once at the call site and not passed around.

Here’s an example where a Tuple could be preferable to multiple out parameters.

public Tuple<bool, Stream, long> GetStreamAndSpaceAvail(string path)
{
    if (File.Exists(path))
        return new Tuple<bool, Stream, long>(true, File.OpenRead(path), new DriveInfo("c:").AvailableFreeSpace);
    return new Tuple<bool, Stream, long>(false, null, 0);
}

public void usage()
{
    Tuple<bool, Stream, long> result = GetStreamAndSpaceAvail("somepath");
    if (result.Item1 && result.Item3 > 1000)
    {
        result.Item2.Write(...);
    }
}

compared to

public bool GetStreamAndSpaceAvail(string path, out Stream stream, out long freeSpace)
{
    freeSpace = new DriveInfo("c:").AvailableFreeSpace;
    if (File.Exists(path))
    {
        stream = File.OpenRead(path);
        return true;
    }
    stream = null;
    return false;
}

public void usage()
{
    Stream s;
    long freeSpace;
    if (GetStreamAndSpaceAvail("somepath", out s, out freeSpace) && freeSpace > 1000)
    {
        s.Write(...);
    }
}

I’d love to hear others’ opinions on this as well.

Monday, July 12, 2010

Check if an element is truly hidden with jQuery

I’m tinkering with a web application and have a situation where I want to select only elements which are visible on the page. Initially I tried using the is(":visible") selector in jQuery (v1.4.2), but it seems to be broken, at least for IE.

My solution was inspired by an old posting, and I created the following extension which checks the styles on the current element and, if the visibility is inherited, checks the parent element.

$.extend(
    $.expr[":"],
    {
        reallyhidden: function (a) {
            var obj = $(a);
            while ((obj.css("visibility") == "inherit" && obj.css("display") != "none") && obj.parent()) {
                obj = obj.parent();
            }
            return (obj.css("visibility") == "hidden" || obj.css('display') == 'none');
        }
    }
);

and use it like this:

if (element.is(':reallyhidden')) return false;

Monday, June 28, 2010

Get along with WCF 4 and jQuery Ajax

Initially I thought this was going to be a breeze, but as I experienced, it was closer to rough sea. But as any experienced sea creature knows, rough sea is just like a breeze.

web.config

<system.serviceModel>
  <services>
    <service name="StbSetupGUI.HtmlParser">
      <endpoint address="" behaviorConfiguration="StbSetupGUI.AjaxAspNetAjaxBehavior" binding="webHttpBinding" contract="StbSetupGUI.HtmlParser" />
    </service>
  </services>
  <behaviors>
    <endpointBehaviors>
      <behavior name="StbSetupGUI.AjaxAspNetAjaxBehavior">
        <enableWebScript />
      </behavior>
    </endpointBehaviors>
    <serviceBehaviors>
      <behavior name="">
        <serviceMetadata httpGetEnabled="true" />
        <serviceDebug includeExceptionDetailInFaults="true" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
  <serviceHostingEnvironment aspNetCompatibilityEnabled="true"
                             multipleSiteBindingsEnabled="true" />
</system.serviceModel>

The most important part is the <enableWebScript /> element. (This config also has exception details turned on.) This is added for you when you add an AJAX-enabled WCF Service to your project, so no snag there.

Service class

The service class will be automatically decorated with the AspNetCompatibilityRequirements attribute if you choose to add an AJAX-enabled WCF Service. Still cruising ahead.

[ServiceContract]
[AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]
public class HtmlParser{...}

WCF Methods

I wanted to use POST for my calls and decorated the method with the WebInvoke attribute. I initially used WebGet, which worked fine, but WebInvoke sent the waves crashing over my head.

[OperationContract]
[WebInvoke(RequestFormat = WebMessageFormat.Json)]
public string GetText(string cssPath, string url){...}

jQuery call

This is where I had the most trouble getting it right, due to my stubbornness in using WebInvoke.

First I had issues getting the json format correct, so I decided to use the json2 library to encode my parameters correctly with double and single quotes.

Next, the content type made me slightly seasick. Most examples specify this as “application/json; charset=utf-8”. This just gave me error upon error. In the end I removed the charset part and it all played ball. Who cares about utf-8 anyway, right?

var path = "H1";
var url = "http://something";
$.ajax({
    type: "POST",
    url: "HtmlParser.svc/GetText",
    contentType: "application/json",
    data: JSON.stringify({ cssPath: path, url: url }),
    dataType: "json",
    success: AjaxSuccess,
    error: AjaxFailed
});

And if you want readable exceptions, parse result.responseText to a json object in order to get at the details of the error message returned from WCF. The WCF details reside in a property called ExceptionDetail. So the key properties to remember are responseText and ExceptionDetail.

function AjaxFailed(result) {
    var res = JSON.parse(result.responseText);
    if (res.ExceptionDetail) {
        alert(res.Message);
        return;
    }
};

Fairly easy, but a lot of small things can go wrong. It took me a couple of days of trial and error (and a lot of Fiddling) to get it all 100% working. It was a breeze :D

Monday, April 19, 2010

How To Shrink a VMWare Disk With Windows 2008 R2

First off, the lesson to be learned:

Never ever create a bigger disk for your VM than necessary. It is WAY simpler to grow it later or attach more disks compared to shrinking the existing one!

Update - November 2010
With the new VMWare Converter v4.3 you don't have to use v3.03 anymore.

(Solution is located at the bottom of the article)

I was recently assigned the task of moving a VMWare image created in VMWare Player 3 over to our ESX server (ESXi 4). In theory a simple task, in practice a lot of grief. Why, you might ask?

The person creating the image had allocated a physical disk of 500GB, just to be on the safe side, while only 19GB were actually needed. This is perfectly dandy under VMWare Player since you don’t have to actually allocate all this space on your host disk, but when moving over to ESX, all 500GB will be allocated, which is very much a waste on the SAN.

An earlier memory that you could copy and resize just the partitions of the disk with VMWare Converter made me fire up VMWare Converter 4. I went back and forth all over the menus but couldn’t find the options I was looking for. Naturally my next step was to install VMWare Converter 3.03 (actually it took me close to 3h to get to that conclusion). And there the options were. What product manager decided to pull an excellent feature from a newer release?

Next I had to wait for the conversion to finish, dum di dum... and for it to crash at 97%. There are many articles on the 97% crash in VMWare Converter, and in my case it was that it couldn’t configure the newly created machine. The disk had converted just fine, but wouldn’t boot. (It took me three conversions at 97% to figure this out.) The reason it couldn’t configure it is that Windows Server 2008 R2 is not supported in v3 of Converter, nor is it in v4.

Ok, so now I have a new vmdk file of 25GB, which failed to boot under VMWare Player with the following Windows boot error: Status: 0xc0000225. In order to solve this I booted from the R2 dvd image inside VMWare Player and did this sequence:

  1. Select your keyboard layout and click "Next", and on the next screen click "Repair your computer"
  2. At the System Recovery Options screen, select your instance of Windows Server 2008 R2 OS from the list, and click "Next"
  3. Select the "Command Prompt" option and type:
    cd recovery
    startrep
  4. Click “Finish” when done, and Windows starts normally.

One step left: I uninstalled VMWare Converter 3.03 and installed v4. Then I started a conversion to move it over to the ESX server. (This last step could have worked with 3.03 as well, but I didn’t try it.) Converting it to ESXi went perfectly, except that I had to remove and re-add the NIC to get it up and running.

Recipe:

  1. Shrink partitions with VMWare Converter 3.03
  2. Fix Windows boot with Repair option on Windows Server 2008 R2 media
  3. Convert image to ESXi with VMWare Converter
  4. Remove and add NIC

Tuesday, April 13, 2010

MSDN Partner Benefits for Visual Studio 2010

[Update - link to the VS2010 and MSDN licensing white paper]

The company I work for is a Microsoft Gold Partner with an MSDN subscription, and this includes licenses for the new Visual Studio 2010 released yesterday. As I’ve been singled out to administer Microsoft licenses, I dived into the Microsoft Partner site to figure out what licenses we are entitled to use regarding Visual Studio 2010.

Friday, April 9, 2010

Show “Message Options” in Outlook 2010

Every now and then it’s interesting to look at the smtp headers of an e-mail, and with Outlook 2010 they seem to have taken a leave of absence. But do not fear, this is how to get them back:

Choose Options in the Backstage (File) menu.


This will bring up the Options screen for Outlook 2010.


  1. Click “Customize Ribbon”
  2. Choose to filter by “Commands Not in Ribbon”
  3. Find Message Options

Next you need to create a new group, which you can place under the “Home” tab.


  1. Select “Home”
  2. Click “New Group”

Now it’s time to add “Message Options” to the group we just made by clicking the “Add” button. Then click “OK” to close the dialog.


In your ribbon you now have a new group with the “Message Options” command.


When you are viewing a message and click this new button you will see the headers of the actual message like we wanted.


Tuesday, April 6, 2010

Compiling Linq to SQL the Lazy Way

In the March issue of MSDN Magazine there was an article about precompiling Linq queries in order to optimize query speed for queries being executed numerous times.

This was perfect for the current project I’m working on, and I set out to change my code, which originally looked like this:

string Original(int refId)
{
    var query = DbContext.Notes
        .Where(note => note.CaseId == refId)
        .Select(note => note.Text);
    return string.Join(";", query);
}

Creating a static compiled query along the lines of the article changed the code to this:

private static Func<DataContext, int, IEnumerable<string>> _compiledQuery;

private Func<DataContext, int, IEnumerable<string>> GetQuery()
{
    if (_compiledQuery == null)
    {
        _compiledQuery = CompiledQuery.Compile((DataContext db, int refId) =>
            db.Notes
                .Where(note => note.CaseId == refId)
                .Select(note => note.Text));
    }
    return _compiledQuery;
}

string Compiled(int refId)
{
    var query = GetQuery().Invoke(DbContext, refId);
    return string.Join(";", query);
}

This is your regular code: check if the query has been created, and if not, instantiate it. What I don’t like with this approach, now that I’m a .Net 4.0 guy, is that you might compile it twice if two threads access it at the same time, since it’s not thread safe. Putting double-checked locking in there would also cloud readability.

Certainly no big issue, but since we now have the wonderful Lazy<T> class we can write the code like this instead:

private static Lazy<Func<DataContext, int, IEnumerable<string>>> NotesQuery =
    new Lazy<Func<DataContext, int, IEnumerable<string>>>(
        () => CompiledQuery.Compile((DataContext db, int refId) =>
            db.Notes
                .Where(note => note.CaseId == refId)
                .Select(note => note.Text))
    );

string Lazy(int refId)
{
    var query = NotesQuery.Value.Invoke(DbContext, refId);
    return string.Join(";", query);
}

Not as clean as the first version, but certainly less messy than the intermediate one. Using Lazy<T> on shared instances is a good way to ensure they are created only once and to avoid threading issues. And if you never use it, which could be the case for a function in a general business layer, you won't compile it if you don't need it.

If we could hide some of the signature it would look and read even better.

Wednesday, March 31, 2010

SharePoint 2010: Search-Driven Portals


On April 7th, 2010 (1pm Pacific Time) I will be presenting at a Microsoft TechNet webcast together with a colleague. The topic at hand: how can we use search engines as a source of content for internet facing portals?

TechNet Webcast: SharePoint 2010: Search-Driven Portals (Level 200)

Websites today are often statically authored, meaning the components on the page have been put there by a person, and they show the same content to all visitors.

What if you could leverage the user’s context and intent in a better fashion together with the features of an enterprise search engine? Then package this in reusable business components in order to author better content to the end-user?

In this webcast we will propose FAST together with SharePoint as a starting platform to achieve this. By combining your business logic in workflows you can reuse logic and content across pages in a more dynamic way. Think of search as a more intelligent CMS tool, and define search as lookups to the source necessary to achieve a task, not just queries against the search engine.

Sunday, March 14, 2010

Ajax and jQuery in the Enterprise, is it such a good idea?

For the past six months I’ve been working on a web application for a business with around 800 employees. The employees are geographically spread out and access most of their applications by logging onto a Citrix server.

Now tracking back to the start of the project. I didn’t know there was a Citrix environment and started out developing the application screens in a traditional way. It’s a read-only application and all actions can be performed with url parameters. Therefore all actions on the page navigate to a new url, using GET.

In a previous project I had been playing around with jQuery and ajax, and figured I could beef up the user experience by tapping into behind the scenes calls and DOM manipulation. Less refresh and flicker and updating relevant parts of the UI generally gives a better user experience.

Since all action points were links it was fairly easy to ajax’ify them with jQuery.

$("A").each(function() {
$(this).click(
function(event) {
event.preventDefault();
navigate($(this).attr("href"));
});
}
);


The ajax function to execute the calls is also simple. It retrieves the #container element of the page retrieved (“code”) and inserts it into the page viewed by the user. I would define this as a “poor man’s ajax”, but it required very little work.


function navigate(navurl) {
    $.ajax({
        url: navurl,
        dataType: "html",
        success: function(code) {
            $("#container").html($('div #container', code).html());
        }
    });
    return false;
}


In my development and test environments this worked like a charm, and even in production - when I accessed it from my laptop.

Then came the problems, test users were accessing it from a browser within the company Citrix environment.

A sub-second action on my machine suddenly took anywhere from 5 to 40 seconds on Citrix. Investigating the matter showed that the Citrix server was using around 80%+ CPU at most times. The ajax call is executed fairly fast, but the line

$("#container").html($('div #container', code).html())

performed really slowly. What it does is load the returned html into the DOM, traverse it to fetch the html from the #container element, and then find the #container element on the current page and replace the html. This actually uses a fair amount of cpu. On a stand-alone machine this is not an issue, but on a loaded Citrix server it is.

So there were two options, scrap the ajax calls, or try to fix it. Being stubborn by nature I went for the fix.

The fix was fairly easy. I cached a reference to the current page’s #container element in a global variable, and replaced the DOM search of the returned page with placeholders and good old-fashioned substring.


var contentContainer = $("#container");

function navigate(navurl) {
    $.ajax({
        url: navurl,
        dataType: "html",
        success: function(code) {
            var start = code.indexOf("<!-- cStart -->");
            var end = code.indexOf("<!-- cEnd -->");
            var html = code.substring(start, end);
            contentContainer.html(html);
        }
    });
    return false;
}


In effect I removed the two DOM traversals, and as expected indexOf and substring perform fast.

This shows that an application might behave very differently in an enterprise environment, since there are many factors to consider. Doing initial research to see how many resources are available for your application is a must for choosing the right strategy. This is equally true for desktop applications: how many colors are available, and can the graphics card handle WPF transitions, etc.?

Friday, February 26, 2010

Directory Search with multiple filters in .Net

Neither Directory.GetFiles in .Net 3.5 nor Directory.EnumerateFiles in .Net 4.0 supports multiple patterns when searching for files. The reason is that they use FindFirstFile / FindNextFile from kernel32.dll, which lacks the support.

My initial thought was to create an extension method for the Directory class, but since it’s a static class that’s not possible. The second best choice is to create a short helper class instead. What we do is a wildcard search with “*” and then filter the results with a regular expression.

If you return a large result set the new Enumerable version in .Net 4.0 is preferable as it returns values to act on as you go along.

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class MyDirectory
{
    // Works in .Net 3.5 - you might want to create several overloads.
    // Note: the parameter types are the same as the .Net 4.0 version below,
    // so keep only the one matching your framework (or give them different names).
    public static string[] GetFiles(string path, string searchPatternExpression, SearchOption searchOption)
    {
        if (searchPatternExpression == null) searchPatternExpression = string.Empty;
        Regex reSearchPattern = new Regex(searchPatternExpression);
        return Directory.GetFiles(path, "*", searchOption)
            .Where(file => reSearchPattern.IsMatch(Path.GetFileName(file)))
            .ToArray();
    }

    // Works in .Net 4.0 - inferred overloads with default values
    public static IEnumerable<string> GetFiles(string path, string searchPatternExpression = "", SearchOption searchOption = SearchOption.TopDirectoryOnly)
    {
        Regex reSearchPattern = new Regex(searchPatternExpression);
        return Directory.EnumerateFiles(path, "*", searchOption)
            .Where(file => reSearchPattern.IsMatch(Path.GetFileName(file)));
    }

    // Works in .Net 4.0 - takes the same patterns as the old method, and executes in parallel
    public static IEnumerable<string> GetFiles(string path, string[] searchPatterns, SearchOption searchOption = SearchOption.TopDirectoryOnly)
    {
        return searchPatterns.AsParallel()
            .SelectMany(searchPattern => Directory.EnumerateFiles(path, searchPattern, searchOption));
    }
}
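
A quick usage sketch (my own illustration, assuming the .Net 4.0 overloads above), matching .txt and .cs files below c:\temp with both the regex and the multi-pattern versions:

var byRegex = MyDirectory.GetFiles(@"c:\temp", @"\.(txt|cs)$", SearchOption.AllDirectories);
var byPatterns = MyDirectory.GetFiles(@"c:\temp", new[] { "*.txt", "*.cs" }, SearchOption.AllDirectories);

foreach (string file in byRegex)
    Console.WriteLine(file);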



Wednesday, February 17, 2010

Blazing fast IPC in .Net 4: WCF vs. Signaling and Shared Memory

[Update 2011-02-02: Did a test against NamedPipeServerStream and NamedPipeClientStream which i mention in a comment at the end]

An MSDN article from 2007 compares the speed of WCF vs. .Net Remoting, and shows the speed increase WCF gives over Remoting in an IPC scenario using named pipes as transport. With the introduction of the System.IO.MemoryMappedFiles namespace in .Net 4, and a blog post by Salva Patuel which outlines that almost all communication inside Windows uses memory mapped files at its core, I had to try this myself with the new capabilities in the .Net 4 framework.
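
The core of the shared memory plus signaling approach looks roughly like this (a minimal sketch with made-up names, not the actual benchmark code):

using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

class SharedMemoryWriter
{
    static void Main()
    {
        // Writer side: put a value in shared memory and signal the other process
        using (var mmf = MemoryMappedFile.CreateOrOpen("Demo_Map", 1024))
        using (var accessor = mmf.CreateViewAccessor())
        using (var dataReady = new EventWaitHandle(false, EventResetMode.AutoReset, "Demo_DataReady"))
        {
            accessor.Write(0, 42L); // write a long at offset 0
            dataReady.Set();        // wake up the reader

            // The reader process opens the same named objects and does the reverse:
            //   var mmf = MemoryMappedFile.OpenExisting("Demo_Map");
            //   var dataReady = EventWaitHandle.OpenExisting("Demo_DataReady");
            //   dataReady.WaitOne();
            //   long value = mmf.CreateViewAccessor().ReadInt64(0);
            Console.WriteLine("Value written and signaled");
        }
    }
}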

Friday, February 12, 2010

I want to copy those files! – A tale on how to get files from a “locked down” virtual machine over RDP

I recently attended a course where we first connected to a virtual machine through a custom application, and then connected to the VM via RDP.

The VM had a folder full of lab exercises which would be nice to have locally for reference. But how could I move these files? The VMs were not on the internet, so file transfer to a repository on the internet was out of the question.

Mapping local drives was also blocked, as it should be for security reasons. The only available mechanism which carried data back and forth between my local machine and the VM was the clipboard.

An idea is born!

Since this was a developer course and we had Visual Studio available, it was actually a no-brainer.

  1. Zip up the labs folder in explorer
  2. Run the following code:
    static void Main(string[] args)
    {
        string base64 = Convert.ToBase64String(File.ReadAllBytes(@"C:\Student\Labs.zip"));
        File.WriteAllText(@"C:\Student\labs.txt", base64);
    }
  3. Open up the base64 encoded file in visual studio
  4. Copy the content to the clipboard
  5. Paste the clipboard to a text file locally
  6. Run the following code to decode it all:
    static void Main(string[] args)
    {
        string base64 = File.ReadAllText(@"c:\temp\labs.txt");
        File.WriteAllBytes(@"c:\temp\labz.zip", Convert.FromBase64String(base64));
    }
Presto, and the lab files were transferred.

Sunday, January 24, 2010

Code improvement tools - NDepend

After I released the first version of Disk Based Data Structures, a library for persisting collections on disk (Dictionary<>, List<>), I had the opportunity to try NDepend. As I worked towards a bugfixing/refactoring release, I used NDepend on the code to clue me in to where I should focus my efforts.

If you haven’t used NDepend, it’s basically a tool which gives you a lot of metrics on your code base along with visual representations.

Code metrics can be scary at first, but once you understand what they tell you, they really help. For instance, in the “Abstractness vs. Instability” chart my assemblies are plotted in the green area down to the right. This is due to the fact that I don’t expose many interfaces at all. The library is a concrete implementation of already established interfaces, so it’s ok to be down in that corner.

I’ve also changed the visibility from public to internal/private for many classes which were only used internally, in order to provide a cleaner public interface. All prompted by a metric:

WARN IF Count > 0 IN SELECT TOP 10 METHODS WHERE CouldBePrivate

My Serializer assembly has a very low relational cohesion, which actually went down from 1.28 to 1.25. And that’s expected, since I added one more serializer to the project and none of the serializers have anything to do with each other. But it makes more sense to bundle them up together than to have multiple assemblies.

The most useful metrics for my refactoring release were identifying long methods and the ones with high cyclomatic complexity. This allowed me to break them up into more understandable pieces of code. I try to write short code, but sometimes you forget. By using a metrics tool it’s easy to find those pains and bring them out in the open, especially in old code. We all know our old code is worse than what we write today ;)



Another interesting fact is that while my code grew 40% in line count, my comment coverage increased by 2%. So during the refactoring process I wrote more comments. Not only did I break up the code, I documented it as well.

NDepend has its own query language, so you can easily create your own code insights, or you can modify the existing ones.

All in all, I’m glad I stumbled over NDepend, and it’s become as natural an addition as R#, and I will most likely include it in the automatic build process in future projects.

Sunday, January 10, 2010

.NET Serialization Performance Comparison

After reading the blog post from James Newton-King on the serialization speed of the new release of Json.NET, I decided to benchmark the different serializers I have in my Disk Based Data Structures project. The serialization is done to a byte array. (The project contains a factory class which benchmarks your data type and returns the fastest serializer.)

AltSerialize can be found at codeproject, and the .Net implementations of Google Protocol Buffers at Google Code.

For the first test I used the same class hierarchy as the Json.NET benchmark.


The serialization sizes were as follows:

  • BinaryFormatter: 2937 bytes
  • AltSerialize: 610 bytes
  • DataContractSerializer: 1237 bytes
  • protobuf-net: 245 bytes

The second test is done on a well defined struct located at the bottom of this posting.


The serialization sizes were as follows:

  • BinaryFormatter: 303 bytes
  • DataContractSerializer: 272 bytes
  • AltSerialize: 150 bytes
  • Marshal.Copy: 144 bytes
  • Unsafe pointers: 144 bytes

As you can see, the memory copying variants are a lot faster than the other serializers when it comes to structs laid out sequentially in memory. AltSerialize is also fairly quick, as it uses Marshal.Copy as well. The big winner is the version using pointers to copy the data. It’s 10x faster than Marshal.Copy on serialization and 17x on deserialization. Compared to the DataContractSerializer we’re talking almost 100x on serializing and over 250x on deserializing.

But remember that these tests were done on 100,000 iterations. For all normal purposes they would all work just fine.

If speed is important to you and you do a lot of serializing, you can gain a lot by choosing the right serializer.

[DataContract]
[Serializable]
[StructLayout(LayoutKind.Sequential)]
public struct Coordinate
{
    [DataMember(Order = 1)]
    public float X;
    [DataMember(Order = 2)]
    public float Y;
    [DataMember(Order = 3)]
    public float Z;
    [DataMember(Order = 4)]
    [MarshalAs(UnmanagedType.Currency)]
    public decimal Focus;
    [DataMember(Order = 5)]
    [MarshalAs(UnmanagedType.Struct)]
    public Payload Payload;
}

[DataContract]
[Serializable]
[StructLayout(LayoutKind.Sequential, Size = 113)]
public struct Payload
{
    [DataMember(Order = 1)]
    public byte Version;
    [DataMember(Order = 2)]
    public byte Data;
}
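
For reference, the unsafe pointer approach measured above boils down to something like this (a minimal sketch of the technique, not the project's actual implementation; requires compiling with /unsafe):

public static unsafe byte[] Serialize(Coordinate value)
{
    byte[] buffer = new byte[sizeof(Coordinate)];
    fixed (byte* ptr = buffer)
    {
        *(Coordinate*)ptr = value;   // copy the struct straight into the byte array
    }
    return buffer;
}

public static unsafe Coordinate Deserialize(byte[] buffer)
{
    fixed (byte* ptr = buffer)
    {
        return *(Coordinate*)ptr;    // reinterpret the bytes as a struct again
    }
}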

Saturday, January 9, 2010

Disk based data structures – Release 2

The second release is now available for download at Codeplex.

Changelog

  • Dictionary<TKey,TValue> class now persists all data to disk, so you should not run out of memory on a 64bit system. Only available disk space matters.
  • Strings can now be used for key/values. Strings don’t have a default empty constructor so I’ve added code to make them work.
  • I’ve included protobuf-net (Google Protocol Buffers) as a serializer. It’s very fast and efficient on size, but requires decorating your classes either with DataContract/DataMember attributes or ProtoContract/ProtoMember attributes (see the small example after this list). Check out the Getting Started section on protobuf-net.
  • Improved locking throughout the code.
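
As a small illustration of the kind of decoration protobuf-net expects (my own example, not taken from the library docs):

[DataContract]
public class Person
{
    [DataMember(Order = 1)]
    public string Name { get; set; }

    [DataMember(Order = 2)]
    public int Age { get; set; }
}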