
Sunday, January 10, 2010

.NET Serialization Performance Comparison

After reading James Newton-King's blog post on the serialization speed of the new Json.NET release, I decided to benchmark the different serializers I have in my Disk Based Data Structures project. Serialization is done to a byte array. (The project contains a factory class which benchmarks your data type and returns the fastest serializer.)

AltSerialize can be found at CodeProject, and the .Net implementations of Google Protocol Buffers at Google Code.

For the first test I used the same class hierarchy as Json.NET.

[Image: benchmark chart for the first test]

The serialization sizes were as follows:

BinaryFormatter 2937 bytes
AltSerialize 610 bytes
DataContractSerializer 1237 bytes
protobuf-net 245 bytes

The second test is done on a well-defined struct, shown at the bottom of this post.

[Image: benchmark chart for the second test]

The serialization sizes were as follows:

BinaryFormatter 303 bytes
DataContractSerializer 272 bytes
AltSerialize 150 bytes
Marshal.Copy 144 bytes
Unsafe pointers 144 bytes

As you can see, the memory copying variants are a lot faster than the other serializers when it comes to structs laid out sequentially in memory. AltSerialize is also fairly quick, as it uses Marshal.Copy as well. The big winner is the version using pointers to copy the data. It's about 10x faster than Marshal.Copy on serialization and 17x faster on deserialization. Compared to the DataContractSerializer, we're talking almost 100x faster on serializing and over 250x faster on deserializing.

But remember that each test ran 100,000 iterations. For all normal purposes any of these serializers would work just fine.

If speed matters to you and you serialize a lot, you can gain speed by choosing the right serializer.
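To make the difference between the two memory copying variants concrete, here is a minimal sketch of both. It is not the code from the benchmark: `Point3` and `StructBytes` are hypothetical names invented for this illustration, and the struct is kept to three floats so it is trivially blittable. The pointer variant must be compiled with unsafe code enabled (`/unsafe`).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical blittable struct used only for this illustration.
[StructLayout(LayoutKind.Sequential)]
public struct Point3
{
    public float X;
    public float Y;
    public float Z;
}

public static class StructBytes
{
    // Marshal variant: marshal the struct into unmanaged memory,
    // then copy that memory into a managed byte array.
    public static byte[] ToBytesMarshal(Point3 value)
    {
        int size = Marshal.SizeOf(typeof(Point3));
        var bytes = new byte[size];
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, buffer, false);
            Marshal.Copy(buffer, bytes, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
        return bytes;
    }

    // Pointer variant: pin the array and write the struct straight into it.
    public static unsafe byte[] ToBytesUnsafe(Point3 value)
    {
        var bytes = new byte[sizeof(Point3)];
        fixed (byte* dest = bytes)
        {
            *(Point3*)dest = value;
        }
        return bytes;
    }

    // Pointer variant of deserialization: reinterpret the pinned bytes as the struct.
    public static unsafe Point3 FromBytesUnsafe(byte[] bytes)
    {
        if (bytes == null || bytes.Length < sizeof(Point3))
            throw new ArgumentException("Buffer is too small for a Point3.", "bytes");
        fixed (byte* src = bytes)
        {
            return *(Point3*)src;
        }
    }
}
```

The pointer variant skips both the unmanaged allocation and the marshalling step, which is presumably where most of the speed difference comes from.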

using System;
using System.Runtime.InteropServices;
using System.Runtime.Serialization;

[DataContract]
[Serializable]
[StructLayout(LayoutKind.Sequential)]
public struct Coordinate
{
    [DataMember(Order = 1)]
    public float X;
    [DataMember(Order = 2)]
    public float Y;
    [DataMember(Order = 3)]
    public float Z;
    [DataMember(Order = 4)]
    [MarshalAs(UnmanagedType.Currency)]
    public decimal Focus;
    [DataMember(Order = 5)]
    [MarshalAs(UnmanagedType.Struct)]
    public Payload Payload;
}

[DataContract]
[Serializable]
[StructLayout(LayoutKind.Sequential, Size = 113)]
public struct Payload
{
    [DataMember(Order = 1)]
    public byte Version;
    [DataMember(Order = 2)]
    public byte Data;
}

Tuesday, December 15, 2009

Filling an array with a default value

After following the discussion on Stack Overflow about how to initialize a byte array with a value, I decided to do some benchmarking for fun.

[Update 2014-09-13]
As the question on SO has evolved, I have now included both the PInvoke and Memset methods in my tests. The most interesting observation is that the Memset method performs excellently on 64bit, but poorly on 32bit. If you are compiling for 32bit, go with unsafe or PInvoke; if you are running 64bit, the Memset delegate is the way to go.

Here are the results, and the code follows below (run on Windows 7 64bit dual core).

Array length: 1048576
Iterations: 1000
Enumerable: 00:00:12.3817249
Parallel Enumerable: 00:00:17.6506353
For loop: 00:00:01.7095341
Parallel for loop: 00:00:06.9824434
Unsafe: 00:00:00.7028914


Here are the results running on Windows 8.1, 64bit i5 processor (Lenovo T420s) with .Net 4.5.1.

Array Length: 1048576
32bit execution - 5000 iterations

EnumerableFill: 00:00:50.1071043
ParallelEnumerableFill: 00:01:12.2980480
ForLoopFill: 00:00:05.3504656
ParallellForLoopFill: 00:00:45.5518340
UnsafeFill: 00:00:02.2804084
MemsetFill: 00:00:03.9383964
PInvokeFill: 00:00:02.4391258

32bit execution - 10000 iterations
UnsafeFill: 00:00:04.1653674
MemsetFill: 00:00:07.2561020
PInvokeFill: 00:00:04.2709875

64bit execution - 10000 iterations

UnsafeFill: 00:00:03.9618905
MemsetFill: 00:00:03.5594970
PInvokeFill: 00:00:03.8012791

using System;
using System.Diagnostics;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

namespace FillArrayBenchmark
{
    internal class Program
    {
        // Delegate around a dynamic method that emits the IL "initblk" instruction (a memset).
        private static readonly Action<IntPtr, byte, int> MemsetDelegate;
        private static int _arrLength = 1048576;

        static Program()
        {
            var dynamicMethod = new DynamicMethod("Memset", MethodAttributes.Public | MethodAttributes.Static,
                CallingConventions.Standard,
                null, new[] {typeof (IntPtr), typeof (byte), typeof (int)}, typeof (Util), true);

            // Push address, fill value and byte count, then let initblk fill the block.
            ILGenerator generator = dynamicMethod.GetILGenerator();
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Ldarg_1);
            generator.Emit(OpCodes.Ldarg_2);
            generator.Emit(OpCodes.Initblk);
            generator.Emit(OpCodes.Ret);

            MemsetDelegate =
                (Action<IntPtr, byte, int>) dynamicMethod.CreateDelegate(typeof (Action<IntPtr, byte, int>));
        }

        private static void Main(string[] args)
        {
            EnumerableFill(12);
            ParallelEnumerableFill(12);
            ForLoopFill(12);
            ParallellForLoopFill(12);
            UnsafeFill(12);
            MemsetFill(12);
            PInvokeFill(12);

            int iteration = 10000;
            Stopwatch sw;
            byte b = 129;
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                EnumerableFill(b);
            }
            sw.Stop();
            Console.WriteLine("EnumerableFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                ParallelEnumerableFill(b);
            }
            sw.Stop();
            Console.WriteLine("ParallelEnumerableFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                ForLoopFill(b);
            }
            sw.Stop();
            Console.WriteLine("ForLoopFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                ParallellForLoopFill(b);
            }
            sw.Stop();
            Console.WriteLine("ParallellForLoopFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                UnsafeFill(b);
            }
            sw.Stop();
            Console.WriteLine("UnsafeFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                MemsetFill(b);
            }
            sw.Stop();
            Console.WriteLine("MemsetFill: " + sw.Elapsed);
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iteration; i++)
            {
                PInvokeFill(b);
            }
            sw.Stop();
            Console.WriteLine("PInvokeFill: " + sw.Elapsed);
        }

        private static void EnumerableFill(byte value)
        {
            byte[] a = Enumerable.Repeat(value, _arrLength).ToArray();
        }

        private static void ParallelEnumerableFill(byte value)
        {
            byte[] a = ParallelEnumerable.Repeat(value, _arrLength).ToArray();
        }

        private static byte[] ForLoopFill(byte value)
        {
            var a = new byte[_arrLength];
            for (int i = 0; i < _arrLength; i++)
            {
                a[i] = value;
            }
            return a;
        }

        private static byte[] ParallellForLoopFill(byte value)
        {
            var a = new byte[_arrLength];
            Parallel.For(0, _arrLength, i => { a[i] = value; });
            return a;
        }

        private static unsafe byte[] UnsafeFill(byte value)
        {
            Int64 fillValue = BitConverter.ToInt64(new[] {value, value, value, value, value, value, value, value}, 0);

            var a = new byte[_arrLength];
            Int64* src = &fillValue;
            fixed (byte* ptr = &a[0])
            {
                var dest = (Int64*) ptr;
                int length = _arrLength;
                while (length >= 8)
                {
                    *dest = *src;
                    dest++;
                    length -= 8;
                }
                var bDest = (byte*) dest;
                for (byte i = 0; i < length; i++)
                {
                    *bDest = value;
                    bDest++;
                }
            }
            return a;
        }

        public static byte[] MemsetFill(byte value)
        {
            var a = new byte[_arrLength];
            GCHandle gcHandle = GCHandle.Alloc(a, GCHandleType.Pinned);
            MemsetDelegate(gcHandle.AddrOfPinnedObject(), value, _arrLength);
            gcHandle.Free();
            return a;
        }

        private static byte[] PInvokeFill(byte value)
        {
            var arr = new byte[_arrLength];
            GCHandle gch = GCHandle.Alloc(arr, GCHandleType.Pinned);
            MemSet(gch.AddrOfPinnedObject(), value, _arrLength);
            gch.Free();
            return arr;
        }

        [DllImport("msvcrt.dll",
            EntryPoint = "memset",
            CallingConvention = CallingConvention.Cdecl,
            SetLastError = false)]
        public static extern IntPtr MemSet(IntPtr dest, int value, int count);
    }

    public static class Util
    {
        // Delegate around a dynamic method that emits the IL "initblk" instruction (a memset).
        private static readonly Action<IntPtr, byte, int> MemsetDelegate;

        static Util()
        {
            var dynamicMethod = new DynamicMethod("Memset", MethodAttributes.Public | MethodAttributes.Static,
                CallingConventions.Standard,
                null, new[] {typeof (IntPtr), typeof (byte), typeof (int)}, typeof (Util), true);

            // Push address, fill value and byte count, then let initblk fill the block.
            ILGenerator generator = dynamicMethod.GetILGenerator();
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Ldarg_1);
            generator.Emit(OpCodes.Ldarg_2);
            generator.Emit(OpCodes.Initblk);
            generator.Emit(OpCodes.Ret);

            MemsetDelegate =
                (Action<IntPtr, byte, int>) dynamicMethod.CreateDelegate(typeof (Action<IntPtr, byte, int>));
        }

        public static void Memset(byte[] array, byte what, int length)
        {
            GCHandle gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
            MemsetDelegate(gcHandle.AddrOfPinnedObject(), what, length);
            gcHandle.Free();
        }
    }
}
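
The Stopwatch block repeated for every fill method in Main could be condensed into a small helper. This is only a sketch of one way to do it: `Bench` and `Measure` are hypothetical names that are not part of the benchmark above, and the extra delegate call adds a small, constant overhead to each iteration.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Runs the action once as a warm-up (so JIT compilation is not timed),
    // then times the requested number of iterations and prints the result.
    public static TimeSpan Measure(string name, int iterations, Action action)
    {
        action();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        Console.WriteLine(name + ": " + sw.Elapsed);
        return sw.Elapsed;
    }
}
```

With this helper, each measurement in Main would shrink to a single line such as `Bench.Measure("ForLoopFill", 10000, () => ForLoopFill(129));`.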

Sunday, February 1, 2009

Going unsafe in managed code – give me speed!

After doing the array comparison article, my mind has been working subconsciously on another matter I’ve thought about for several years. What is the fastest possible way to serialize/deserialize an object in .Net?

One way is using the built-in serialization in .Net with a BinaryFormatter or a SoapFormatter. This is the most general way and works for “all” cases. If you know a bit more about the data you want to serialize you can improve speed quite a lot.

In my article Using memory mapped files to conserve physical memory for large arrays I solve the serialization of structs or value types by using Marshal.StructureToPtr and Marshal.Copy to get a byte array I can write to disk afterwards (because I didn’t know better at the time). This works for any struct that contains only value types. My weekend testing showed that if I use explicit layout on a struct or class, I can omit the Marshal.StructureToPtr step and use Marshal.Copy directly.

Now over to the unsafe bit. By using pointers directly and skipping the marshalling altogether we improve speed even more. This fueled me to continue on my Disk Based Dictionary project, which will benefit both from memory mapped files and fast serializing. My approach will be to analyze the type of object being used. If it’s an object with explicit layout or a base value type, I will use fast pointer to pointer copying. If it’s an object with only value types but implicit layout, I’ll go with StructureToPtr. For an object with reference types I will use normal serialization, or check if it implements a BinaryWriter/BinaryReader interface for writing out the values manually.
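
The type analysis described above could be sketched roughly as follows. This is an illustration under my own assumptions, not the library's code: `SerializationStrategy`, `StrategySelector` and the sample structs are hypothetical names, the check only looks at value types (explicit layout classes are not handled), and the custom BinaryWriter/BinaryReader interface check is left out.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

public enum SerializationStrategy { PointerCopy, StructureToPtr, GeneralSerializer }

public static class StrategySelector
{
    public static SerializationStrategy Choose(Type type)
    {
        // Base value types (int, float, byte, ...) can be copied directly.
        if (type.IsPrimitive) return SerializationStrategy.PointerCopy;

        // Structs holding only value types: pointer copy when the layout is
        // explicit, StructureToPtr otherwise.
        if (type.IsValueType && ContainsOnlyValueTypes(type))
        {
            return type.IsExplicitLayout
                ? SerializationStrategy.PointerCopy
                : SerializationStrategy.StructureToPtr;
        }

        // Anything with reference types falls back to normal serialization.
        return SerializationStrategy.GeneralSerializer;
    }

    // Recursively checks that every instance field is a value type.
    private static bool ContainsOnlyValueTypes(Type type)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in type.GetFields(flags))
        {
            Type fieldType = field.FieldType;
            if (fieldType.IsPrimitive || fieldType.IsEnum) continue;
            if (!fieldType.IsValueType || !ContainsOnlyValueTypes(fieldType)) return false;
        }
        return true;
    }
}

// Hypothetical sample types for trying out the selector.
[StructLayout(LayoutKind.Explicit)]
public struct ExplicitSample
{
    [FieldOffset(0)] public int A;
    [FieldOffset(4)] public float B;
}

public struct ImplicitSample
{
    public int A;
    public double B;
}

public struct WithReference
{
    public int A;
    public string Name;
}
```

For example, `StrategySelector.Choose(typeof(ExplicitSample))` selects the pointer copy, while `typeof(WithReference)` falls back to the general serializer because of its string field.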

The library will then work for the lazy coder who doesn’t need killer performance, but also for the conscientious ones who care about speed.

If I’m lucky with inspiration I’ll have it done this week before I go to Vegas.

If you’re wondering why I bother with these things it’s because I used to work with search engines where speed vs. memory is a big issue. In my current job doing SharePoint consulting it’s all a waste of time since the SQL server will always be the bottleneck :)