LINQ : Performance problems

LINQ is great. Really, it is.

But when you have lots of records to insert or delete on a SQL Server, then it's performance sucks. Thankfully, it's very easy to keep using the best bits of LINQ, but work around the bottlenecks.

>Deleting records

Let's start with an example of deleting records using LINQ.

In my Northwind database, I want to delete all [Products] records, where the product name starts with "Banana Milk Shake".

using (NorthwindDataContext dc = new NorthwindDataContext())
{
    dc.Products.DeleteAllOnSubmit(dc.Products.Where(p => p.ProductName.StartsWith("Banana Milk Shake")));
    dc.SubmitChanges();
}

If you have a few records to delete, this script is perfectly acceptable, readable and maintainable. However, if you needed to delete several thousand records, then this code would crawl.

At this point, sadly, it's time to go back to running raw SQL queries.

using (NorthwindDataContext dc = new NorthwindDataContext())
{
    dc.ExecuteCommand("DELETE FROM [Products] WHERE [ProductName] LIKE 'Banana Milk Shake%'");
}

Yes, I know, I know.. Putting raw SQL in your applications is a bad idea. The whole point of LINQ is to rid our beautiful C# code of statements like this.

I absolutely agree.

But if, one day, you find your code hitting a bottleneck while deleting SQL Server records, well, slip in a bit of SQL, with an apology in the comments, along with benchmark tests showing how much faster the raw SQL is.

As you'll see later on, deleting records using a SQL command can be 20 times faster than using LINQ's DeleteOnSubmit() function. And the LINQ demo application (which you can download below) lets you see the improvements in performance yourself.

Inserting records using a Bulk Insert

Of course, you could use the same trick to insert records. You could build up a length SQL command to run an INSERT INTO.. command, rather than using LINQ's InsertOnSubmit() function. But let's look at a better way, using Bulk Inserts.

Time for another example: this time I'm going to insert 5,000 records into the [Products] table in the [Northwind] database.

Here's how our regular LINQ would look:

const int RECORDS_TO_INSERT = 5000;

using (NorthwindDataContext dc = new NorthwindDataContext())
{
    for (int n = 0; n < RECORDS_TO_INSERT; n++)
    {
        // Note: The [Products] table's Primary Key, ProductID, is an identity column, so we don't need to specify a value for each record we're about to create.
        Product newProduct = new Product()
        {
            ProductName = "Product " + n.ToString(),
            UnitPrice = 3999,
            UnitsInStock = 2,
            UnitsOnOrder = 0
        };
        dc.Products.InsertOnSubmit(newProduct);
    }
    dc.SubmitChanges();
}

Once again, if we were just inserting a few records, then this code would be perfectly fine. But even inserting 5,000 records, this can take a lot of time using LINQ.

We can improve this by creating a List<> variable, containing the set of records that we want to insert into the [Products] table, then call a new function, BulkCopyToDatabase, which inserts these records into our SQL Server database using bulk insert.

(You can download the source code for this DataTableHelper class below.)

const int RECORDS_TO_INSERT = 5000;

List<Product> recordsToBeInserted = new List<Product>();
using (NorthwindDataContext dc = new NorthwindDataContext())
{
    for (int n = 0; n < RECORDS_TO_INSERT; n++)
    {
        Product newProduct = new Product()
        {
            ProductName = "Product " + n.ToString(),
            UnitPrice = 3999,
            UnitsInStock = 2,
            UnitsOnOrder = 0,
            Discontinued = false
        };
        recordsToBeInserted.Add(newProduct);
    }
    // Insert this List<> of records into the [Products] table in our database, using a Bulk Insert
    DataTableHelper.BulkCopyToDatabase(recordsToBeInserted, "Products", dc);
}

Well, that was easy! We just needed to make three changes to our code, then call a function to do the bulk insert:

DataTableHelper.BulkCopyToDatabase(recordsToBeInserted, "Products", dc);

You'll actually find versions of this BulkCopyToDatabase function all over the internet. I've chosen to add an extra (second) parameter which takes the SQL Server table name.
LINQ-to-SQL will often create LINQ classes with modified versions of your database's table names, such as turning the [Products] table into a class called Product (rather than Products).
Without this table name parameter, our bulk-insert class would incorrectly try to insert our records into a table called [Product], rather than [Products].
 


So, what does this DataTableHelper class do ?

Well, it copies the records from your List<> variable into a DataTable, then gets Microsoft's SqlBulkCopy class to do the inserting into your SQL Server database.

The nice thing about using this DataTableHelper class is that it's easy to slip into existing code, with very few code changes.

Demo program

The attached Visual Studio 2008 C# WinForms application lets you see how fast regular LINQ is at inserting and deleting, compared to these two new methods.

Screenshot

The "Run test.." buttons in the first column insert and delete a set of [Products] records using the regular LINQ InsertOnSubmit() and DeleteOnSubmit() commands.

The "Run test.." buttons in the second column use the new methods, described above.

After pressing one of the "Run test.." buttons, the label at the bottom of the dialog will tell you how many records were inserted or deleted, and how long it took to run.

Here are some typical results that I've got, using a local SQL Server database, and inserting sets of 50,000 [Products] records at a time.

Regular LINQ Bulk Insert / raw SQL Improvement
INSERT test 40.1 s 4.5 s 9 times faster
DELETE test 20.5 s 0.9 s 20 times faster

Not bad, hey ?

With improvements like this, yes, it's worth slipping in the occasional SQL command or Bulk Insert class into your code !

And you can use this trick in your WinForms, WPF and ASP.Net applications.

Downloads

LINQtest.zip Simple LINQ demo, demonstrating how to insert and delete records using this new method.
LINQlibraries.zip
The two library files you'll need to include, to use this library in your own applications.

References

SqlBulkCopy Class
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx


 

Comments

blog comments powered by Disqus