Published: Thu Jan 20 2022

Using the C# SQL DataReader for getting around slow ORM performance

In the average application, ORMs such as Entity Framework or NHibernate are great for about 95% of your database access. They work very well for inserts, updates, deletes and most selects.

It's common knowledge that complex selects pulling data from a lot of different tables can result in inefficient queries, but one of the other scenarios they struggle with is bulk data fetching. ORMs are very much focused on dealing with individual records, or small numbers of individual records. And this works well most of the time; if you have an order management system, you probably don't want to retrieve half a million order records from your database at once. That's far more than your users can focus on!

But databases are flexible things and not every database table is backing a results grid in a CRUD app. There are scenarios where it makes sense to be dealing with half a million records, especially when those records are fairly small pieces of data.

Imagine you have a word list table consisting of just an ID (a 64-bit integer) and a word (varchar(16)). Half a million records of those sounds like a lot, but it's in the region of 10MB of data - certainly well within what you'd hope a database-driven application can handle. An SQL database can return that almost instantaneously, but with an ORM the response can take seconds. I don't know exactly why, but if I had to guess I'd say that allocating memory for, and populating, an entity object to represent every single record adds a lot of overhead.
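The 10MB figure is a back-of-envelope estimate. Assuming 8 bytes for the integer ID and up to 16 single-byte characters per word, and ignoring per-row storage overhead:

```csharp
using System;

// Rough size estimate for the word list table described above.
// Assumes 8 bytes per ID and up to 16 bytes per varchar(16) word;
// real storage adds per-row overhead, so treat this as a lower bound.
long records = 500_000;
long bytesPerRecord = 8 + 16;
long totalBytes = records * bytesPerRecord; // 12,000,000 bytes
Console.WriteLine($"{totalBytes / (1024.0 * 1024.0):F1} MB"); // about 11.4 MB
```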

When dealing with bulk data you often don't really want high level objects. You'd often be happy with a list of dictionaries or even just a dictionary of strings keyed by the record ID, or perhaps you can get away with some routine that just reads through the data as a stream, calculating something as you go.

Using the SqlDataReader

So, how do we bypass the ORM to give us the raw data without all the overhead?

The answer is: the SqlDataReader.

SqlDataReader is a built-in .NET class designed as a more powerful but less convenient way of streaming results from an SQL query. Despite being less convenient, you'll be pleased to know that DataReaders are actually very simple to understand and use.

Conceptually, an SqlDataReader is a forward-only cursor over a result set. It consumes the tabular data streamed back from the SQL server one row at a time and provides typed methods for reading each column as a .NET primitive.

What that means is you iterate over each row of the results with .Read(), and use the typed getters such as .GetInt64(fieldIndex) and .GetString(fieldIndex) to read each field from the current row.

It looks something like this:

// Requires System.Data.SqlClient (or Microsoft.Data.SqlClient in newer projects).
using System.Collections.Generic;
using System.Data.SqlClient;

public IDictionary<long, string> ReadTable()
{
  var statement = "SELECT id, word FROM wordlist";

  using (SqlConnection connection = new SqlConnection("connection string goes here"))
  using (SqlCommand command = new SqlCommand(statement, connection))
  {
    connection.Open();

    using (SqlDataReader reader = command.ExecuteReader()) 
    {
      var words = new Dictionary<long, string>();
      while (reader.Read()) 
      {
        // The indices of GetInt64() and GetString()
        // refer to the field positions, i.e. id = 0, word = 1, 
        // as this was the order of our SELECT statement.
        var id = reader.GetInt64(0);
        var word = reader.GetString(1);
        words[id] = word;
      }
      return words;
    }
  }
}

This should give you nearly the same performance as running the query directly in an SQL console.
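And if, as mentioned earlier, you only need to calculate something as you go rather than keep the records around, you can skip building a collection entirely. A minimal sketch along the same lines (the wordlist table, the CountLongWords name, and the connection string are placeholders, as above):

```csharp
using System.Data.SqlClient;

public static class WordStats
{
    // Streams the word list row by row and counts words at or above
    // a minimum length, without ever holding the result set in memory.
    public static int CountLongWords(string connectionString, int minLength)
    {
        var statement = "SELECT word FROM wordlist";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(statement, connection))
        {
            connection.Open();

            using (var reader = command.ExecuteReader())
            {
                var count = 0;
                while (reader.Read())
                {
                    // Field index 0 = word, matching the SELECT above.
                    if (reader.GetString(0).Length >= minLength)
                        count++;
                }
                return count;
            }
        }
    }
}
```

Because nothing is accumulated, memory use stays flat no matter how many rows come back.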