10

I've noticed that database queries run faster when adding .ToArray() or .ToList() to queries. Is this because the data set is loaded into memory and all subsequent queries are done in-memory rather than doing further expensive database calls?

What should the limit be for in-memory storage for database queries as I'm concerned that using up too much internal memory will slow down the entire project as I am sure that taking up too much in-memory storage could slow things down dramatically.

EDIT: This is for Linq-SQL queries. I'm using SQL Server 2008.

Example 1: performing a large database query and filtering in-memory

I have a database table with 5 000 rows. I query that entire table (e.g. SELECT* From Clients). My next few queries are based on fields from the previous query: a) Get all male clients; b) Get all female clients c) Get all clients where FirstName starts with A.

Example 2: performing more frequent database calls

Using the same Client table with 5000 rows, I need to perform 3 queries a) Get all male clients; b) Get all female clients c) Get all clients where FirstName starts with A. I do all the queries via database calls and not in-memory.

Which method is more efficient?

4
  • 3
    It is pretty much just caching in a very simple guise. Why run a query over and over that takes 2 minutes each time when you can run it once and remember the results? This, unfortunately, has no hard and fast answer. You need to understand each case. You cannot just blanket apply ToArray to all situations, it won't always make things faster. As for any limits, who knows? That depends on lots of things. Commented Jan 28, 2014 at 14:48
  • 6
    You need to give a concrete example. In some cases it would make things much slower, by pulling in the whole database table before filtering, for example.
    – Jon Skeet
    Commented Jan 28, 2014 at 14:49
  • 1
    Is this for Linq-to-Sql? EF? Regardless, .ToArray() and .ToList() are extension methods that run on IEnumerable<T> and actually 'Enumerate' the results. Until you actually need the IEnumerable<T> enumerated, it will defer execution and try to produce the most efficient expression to return the enumerated result. Enumerating the results early (eager-loading) by calling .ToArray() or .ToList() takes the hit up front but allows you to reuse that query over and over without having to enumerate the results. Commented Jan 28, 2014 at 14:55
  • Also using ToList() or ToArray() saving result in memory will not always give us fresh data. You will create some thing like catching so you need to update data time to time to get fresh data.
    – Sameer
    Commented Jan 28, 2014 at 15:07

1 Answer 1

28

NOTE: I used an entity framework "context" on my queries. The answer applies to Entity Framework or LINQ-to-SQL though.

ToArray() and ToList() will never speed a query. They may, however, make it appear that way if you are accidentally running the query multiple times. This is a common mistake when using LINQ. For example:

// Construct a query to find all the cute puppies
var data = myContext.Puppies.Where(puppy => puppy.Type == PuppyTypes.Cute);

The above line DOES NOT actually query the database for cute puppies. Instead, it constructs a query that will, at some future point, query for the puppies. This is called Delayed Execution. Quiz question: What type is "data" in the example above?

The query is actually run when you enumerate through the variable data. This variable is of type IEnumerable. IEnumerables are different from Lists or Arrays. An IEnumerable is merely a promise that the data can be obtained. It does not mean you actually have the data.

// Display the puppies
foreach (Puppy p in data) { Console.WriteLine(p.Name); }

The foreach call above will force the execution of the query. The puppies will go onto the screen, but they will not be cached in an array or a list anywhere. So now, if you do this:

// Display their owner's names
foreach (Puppy p in data) { Console.Writeline(p.OwnerName); }

This foreach causes the query to execute again. In fact, if new puppies have been added or removed in the database in between the two calls, you might even get different results! So suppose instead, we added ToList() to the initial call.

// Construct a query to find all the cute puppies
var data = myContext.Puppies.Where(puppy => puppy.Type == PuppyTypes.Cute).ToList();

Now, data is of type List. Under the hood, this creates an IEnumerable, then does a foreach through it, and puts the results into a list. So now, if you display the puppy name and owner, it will not query the database twice. It is instead looping through the list twice.

If you do a search for Delayed Execution you will find lots of interesting uses for this, as well as caveats like the one I listed. In general, you want to make sure you only run your query once. If that requires a ToList() then great. But don't add ToList() unnecessarily. Using one last example:

var data = myContext.Puppies.Where(puppy => puppy.Type == PuppyTypes.Cute).ToList();
foreach (Puppy p in data) { Console.WriteLine(p.Name + "," + p.OwnerName); }

Should you have the ToList() here? No! Because that would add the overhead of a list when you really don't need the list. Even worse:

var data = myContext.Puppies.Where(puppy => puppy.Type == PuppyTypes.Cute).ToList();
Console.Writeline(data.Count());

That's even worse! It pulled all the puppies into a list in memory, then counted them. But that query never even requires grabbing the puppies at all. Done this way:

var data = myContext.Puppies.Where(puppy => puppy.Type == PuppyTypes.Cute);
Console.Writeline(data.Count());

It actually tells the SQL server to count the number of puppies, and never wastes bandwidth or memory actually loading the puppies out of the database or sending it to C# or any of that.

I hope this helps!

4
  • great answer my friend +1
    – WiiMaxx
    Commented Jan 28, 2014 at 15:19
  • 3
    This variable is of type IEnumerable - surely you meant IQueryable
    – Jamiec
    Commented Jan 28, 2014 at 16:07
  • @Jamiec is correct. The data context returns an IQueryable, which is in turn an IEnumerable as well. I didn't want to get into the details of IQueryable.
    – Moby Disk
    Commented Apr 21, 2014 at 18:06
  • @Moby Disk PuppyTypes.Cute- that's cool!)) Nice answer, thanks!
    – AsValeO
    Commented Mar 14, 2015 at 9:56

Not the answer you're looking for? Browse other questions tagged or ask your own question.