Sunday, February 22, 2009

FastForward’09

This year was my first visit to the FastForward conference. The theme of the conference was “Engage your user”, something which has been on my mind for the last couple of years.

The conference was held at the Mirage Hotel in Las Vegas, a place where they certainly do their best to engage their users. A well-chosen spot indeed (I lost only $40; slot machine sounds are not what make me feel warm and fuzzy).

Travelling 17 hours each way is a strain, but it was worth it. Talking to FAST customers and others working with search technology was very inspiring, and we all agree that things are moving in the right direction. More and more information is accessible for search, and people are going to great lengths to create search-driven applications and interfaces which are intuitive for the end user. Seems like Enterprise 2.0 might finally be around the corner.

2009 might just be the best year for search yet!

Sunday, February 1, 2009

Going unsafe in managed code – give me speed!

After doing the array comparison article, my mind has been working subconsciously on another matter I’ve thought about for several years. What is the fastest possible way to serialize/deserialize an object in .Net?

One way is using the built-in serialization in .Net with a BinaryFormatter or a SoapFormatter. This is the most general way and works for “all” cases. If you know a bit more about the data you want to serialize you can improve speed quite a lot.

In my article Using memory mapped files to conserve physical memory for large arrays I solved serialization of structs and other value types by using Marshal.StructureToPtr and Marshal.Copy to get a byte array that I could write to disk afterwards (because I didn’t know better at the time). This works for any struct that contains only value types. My weekend testing showed that with explicit layout on a struct or class I can omit the Marshal.StructureToPtr step and use only Marshal.Copy.
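To make the StructureToPtr route concrete, here is a minimal sketch of the round trip through unmanaged memory. The struct `Sample` and the helper class `MarshalSerializer` are names made up for this illustration; they are not taken from the earlier article, which may have done this differently.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example struct containing only value types.
[StructLayout(LayoutKind.Sequential)]
public struct Sample
{
    public int Id;
    public double Value;
}

// Hypothetical helper; a sketch of the StructureToPtr + Copy approach.
public static class MarshalSerializer
{
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] buffer = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // Step 1: marshal the struct into unmanaged memory.
            Marshal.StructureToPtr(value, ptr, false);
            // Step 2: copy the unmanaged bytes into a managed array.
            Marshal.Copy(ptr, buffer, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return buffer;
    }

    public static T FromBytes<T>(byte[] buffer) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        if (buffer.Length < size)
            throw new ArgumentException("Buffer too small.", "buffer");
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // Reverse direction: managed bytes -> unmanaged memory -> struct.
            Marshal.Copy(buffer, 0, ptr, size);
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

The resulting byte array can then be written to disk or to a memory mapped file. Note the two copies per direction (struct to unmanaged block, unmanaged block to array), which is the overhead the faster approaches try to avoid.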

Now over to the unsafe bit. By using pointers directly and skipping the marshalling altogether, speed improves even more. This spurred me to continue on my Disk Based Dictionary project, which will benefit from both memory mapped files and fast serialization. My approach will be to analyze the type of object being used. If it’s an object with explicit layout or a base value type, I will use fast pointer-to-pointer copying. If it’s an object with only value types but implicit layout, I’ll go with StructureToPtr. For an object with reference types I will use normal serialization, or check whether it implements a BinaryWriter/BinaryReader interface for writing out the values manually.
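As an illustration of the pointer approach, here is a small sketch for one struct with explicit layout. `Point3` and `UnsafeSerializer` are hypothetical names used only for this example, and the code must be compiled with unsafe code enabled (the `/unsafe` compiler switch). It is one possible way to do a direct copy, not necessarily the code the Disk Based Dictionary will end up with.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type; explicit layout fixes every field at a known byte offset.
[StructLayout(LayoutKind.Explicit, Size = 12)]
public struct Point3
{
    [FieldOffset(0)] public int X;
    [FieldOffset(4)] public int Y;
    [FieldOffset(8)] public int Z;
}

// Hypothetical helper; a sketch of copying through pointers without Marshal.
public static class UnsafeSerializer
{
    public static unsafe byte[] ToBytes(Point3 value)
    {
        byte[] buffer = new byte[sizeof(Point3)];
        // Pin the array so the garbage collector cannot move it while we write.
        fixed (byte* dst = buffer)
        {
            // Reinterpret the start of the array as a Point3 and store the value.
            *(Point3*)dst = value;
        }
        return buffer;
    }

    public static unsafe Point3 FromBytes(byte[] buffer)
    {
        if (buffer.Length < sizeof(Point3))
            throw new ArgumentException("Buffer too small.", "buffer");
        fixed (byte* src = buffer)
        {
            // Read the bytes back as a Point3 in a single copy.
            return *(Point3*)src;
        }
    }
}
```

Compared with the Marshal route there is no unmanaged allocation and only one copy per direction. The price is that the code is tied to a concrete type whose layout contains no references.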

The library will then work for lazy coders who don’t need killer performance, but also for the speed-conscious ones.
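The type analysis could be sketched with reflection roughly as follows. Everything here (`CopyStrategy`, `StrategyChooser`, and the three sample types) is a hypothetical name for illustration, and the check is deliberately simplified: it looks only at the fields declared on the type itself, one level deep, and it leaves out the check for a custom BinaryWriter/BinaryReader interface.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical names for the three paths described in the text.
public enum CopyStrategy { PointerCopy, StructureToPtr, GeneralSerialization }

// Hypothetical sample types used to exercise the chooser.
[StructLayout(LayoutKind.Explicit)]
public struct ExplicitPair
{
    [FieldOffset(0)] public int A;
    [FieldOffset(4)] public int B;
}

public struct PlainPair { public int A; public double B; }

public class Person { public string Name; public int Age; }

public static class StrategyChooser
{
    private const BindingFlags InstanceFields =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    public static CopyStrategy Choose(Type type)
    {
        // Base value types (int, double, ...) can be copied directly.
        if (type.IsPrimitive)
            return CopyStrategy.PointerCopy;

        // Simplification: only the type's own fields are inspected, so a
        // nested struct that holds a reference would not be detected.
        bool onlyValueTypeFields = type.GetFields(InstanceFields)
                                       .All(f => f.FieldType.IsValueType);
        if (!onlyValueTypeFields)
            return CopyStrategy.GeneralSerialization;

        // Explicit layout: field offsets are known, use the pointer copy.
        if (type.IsExplicitLayout)
            return CopyStrategy.PointerCopy;

        // Struct with only value types but implicit layout.
        if (type.IsValueType)
            return CopyStrategy.StructureToPtr;

        // Anything else falls back to normal serialization.
        return CopyStrategy.GeneralSerialization;
    }
}
```

A caller would do something like `StrategyChooser.Choose(typeof(PlainPair))` once per type and cache the answer, so the reflection cost is not paid on every read or write.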

If I’m lucky with inspiration I’ll have it done this week before I go to Vegas.

If you’re wondering why I bother with these things, it’s because I used to work with search engines, where speed vs. memory is a big issue. In my current job doing SharePoint consulting it’s all a waste of time, since SQL Server will always be the bottleneck :)