Search This Blog

2011-03-13

PLINQ: is it worth to use?

Last month I had a speaker corner with my colleagues: "Are you still using loops? - We are coming to you!".
We always face the scalability issues in our work and one of the most evident scalability holes is processing great amount of data. One of the solutions could be the concurrent executing on multi-core CPUs. So the main point of the discussion was that we should not use loops any more because they are obsolete :-). Loops like: for, foreach, while, etc are not lead us to the atomic programming and as a result we are not really ready to develop effectively for the concurrent execution.
There were not only .Net developers participating the speaker corner, so they resented what they can use instead? Fortunately C# has LINQ and even more - Parallel LINQ. .NET 4.0 natively provides us with such a chance to use it without a necessity to be a concurrency guru.

Parallel LINQ (PLINQ) is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically utilizes multiple processors or cores for execution when they are available. PLINQ becomes increasingly crucial to ensuring the scalability of software on future parallel microprocessor architectures. In fact, threads and locks won't even come up unless you really want to dive under the hood to understand how it all works. PLINQ is a key component of Parallel FX, the next generation of concurrency support in the .NET.
Of course, using  concurrent processing one should think about external resource: if you actively work with it within the query, most time the threads are in synchronization state so you're apt to be quite disappointed about the performance.
I am not going to dive deep into the concurrent development theory, I am just want to share some observations weather PLINQ is really able to help with performance improvements or not. Let's write some simple application and profile it. Let's say we have some CollectionProcessor class that calculates a math expression for each element in the collection of numbers and that sums all elements.

public class CollectionProcessor
{
  private static double ItemProccess(int item)
  {
    return Math.Cos(10005000 * item) / item;
  }

  public static double ForEachSerial(IEnumerable<int> collection)
  {
    double sum = 0d;

    foreach (int i in collection)
    {
      sum += ItemProccess(i);
    }

    return sum;
  }

  public static double LinqSerial(IEnumerable<int> collection)
  {
    return collection.Select(ItemProccess).Sum();
  }

  public static double ForEachParallel(IEnumerable<int> collection)
  {
    double sum = 0d;

    foreach (int i in collection.AsParallel())
    {
      sum += ItemProccess(i);
    }

    return sum;
  }

  public static double LinqParallel(IEnumerable<int> collection)
  {
    return collection.AsParallel().Select(ItemProccess).Sum();
  }
}

Here we have four solutions: foreach loop, LINQ, foreach loop using parallel mode and PLINQ.
Let's test the results now and take 100050000 elements for the collection of numbers.

[TestFixture]
public class CollectionProcessorTest
{
  private IEnumerable Collection { get; set; }

  [SetUp]
  public void SetUp()
  {
    Collection = ParallelEnumerable.Range(0, 100050000);
  }

  [Test]
  public void ForEachSerialTest()
  {
    Stopwatch stopwatch = new Stopwatch();

    stopwatch.Start();
    CollectionProcessor.ForEachSerial(this.Collection);
    stopwatch.Stop();

    Console.WriteLine(stopwatch.ElapsedMilliseconds);
  }

  [Test]
  public void LinqSerialTest()
  {
    Stopwatch stopwatch = new Stopwatch();

    stopwatch.Start();
    CollectionProcessor.LinqSerial(this.Collection);
    stopwatch.Stop();

    Console.WriteLine(stopwatch.ElapsedMilliseconds);
  }

  [Test]
  public void ForEachParallelTest()
  {
    Stopwatch stopwatch = new Stopwatch();

    stopwatch.Start();
    CollectionProcessor.ForEachParallel(this.Collection);
    stopwatch.Stop();

    Console.WriteLine(stopwatch.ElapsedMilliseconds);
  }

  [Test]
  public void LinqParallelTest()
  {
    Stopwatch stopwatch = new Stopwatch();

    stopwatch.Start();
    CollectionProcessor.LinqParallel(this.Collection);
    stopwatch.Stop();

    Console.WriteLine(stopwatch.ElapsedMilliseconds);
  }
}

I have Intel Core Quad CPU (4-cores) running on 3,3 GHz. The time of execution results are:
ForEachSerialTest : 16 sec;
LinqSerialTest : 16,3 sec;
ForEachParallelTest : 15,5 sec;
LinqParallelTest :  3,9 sec
Hey! Here we have such a great  time of execution improvements: up to 4 times! Also, it was very nice to see the Core-gadget while PLINQ was running: 

But foreach loop even in concurrent mode had not shown real improvements.
Ok, let's profile the results. 
There are not such a great performance improvements while profiler is working, but still it exists. The expanded tree is: 
Now we can see that MoveNext operation to iterate takes more time than item processing. So I don't see any sense to use .AsParallel() with foreach loop. 
ItemProcess takes the same time for sequential test (approx. 12 sec) but PLINQ requires only 8 sec. But it also creates more threads:
And that is the point! With PLINQ we have 4 threads working in parallel. 
The interesting thing is how PLINQ  works inside. I have tried to look at the systems calls more thoroughly. 
First, let's take a common LINQ query:

This looks simple: it creates expression tree: ItemProcess - WhereSelectEnumerableIterator - Sum.
But the PLINQ tree is a little bit different:-) I even had to flatten the picture.
The call tree is impressive! It uses lots of system commands to start and execute threads. It obviously requires a lot of resources to start the execution. So if you query is not really shared in the parallel tasks you would get serious performance issues with PLINQ.
In the conclusion, PLINQ allows LINQ developers to take advantage of parallel hardware — including achieving better performance and building more intrinsically scalable software — without needing to understand concurrency or data parallelism. But my own experience with PLINQ had shown that one should be very careful with it, test and profile the resulted query to insure it really gives benefits of running in parallel and does not wait for thread synchronization all the time.

P.S.
There are nice samples to start with PLINQ.

5 comments:

  1. Nice) I havn't had a chance to use PLINQ, but results looks impressive :)

    ReplyDelete
  2. Cool!
    What about generated code? Is it possible to read in reflector?

    ReplyDelete
  3. About Reflector, code can be observed like methods chain:

    For serial test:
    public static double LinqSerial(IEnumerable collection)
    {
    return collection.Select(new Func(CollectionProcessor.ItemProccess)).Sum();
    }

    For parallel one:
    public static double LinqParallel(IEnumerable collection)
    {
    return collection.AsParallel().Select(new Func(CollectionProcessor.ItemProccess)).Sum();
    }

    Used .Net Reflector 6, settings for .NET 4.0.

    ReplyDelete