Optimizing Long Running Processes with EF6 – Postmortem

In: C#

4 Feb 2018

One of the issues my workplace has been dealing with over the past several months has been intermittent slow performance of long running processes.

In several cases it was impacting the entire company until the long running process had completed, it was doing pessimistic locking on reading data, causing deadlocks, and running really really slow.

Other team members were responsible for investigating and addressing the issues.

I disagreed with how they were going about it.

TLDR;

Skip to the end if you don’t want to read the whole story.

Backstory

Our team had just finished a 4 month project to replace store procedures with entity framework 6 (EF6).

In order to investigate the issues, team members involved decided that running SQL Profiler on the production server while the slowdowns were happening would identify a smoking gun.

It’s obviously a database issue, and fixing the offending execution plan(s) will fix the issues, right? Well… no.

It found 1-2 issues and brought down the time for 1 of the long running processes down from hours to just under an hour, but ignored the underlying problem.

Long running processes were taking 10x longer to run, compared to previously using stored procedures.

So this happens several times, each time the business is getting more and more frustrated with the lack of resolution of this issue.

And so this is where I get handed the problem with an open time frame to investigate and fix all long running processes on the system.

My Investigation

Running SQL Profiler in production when the issue is happening IMO is the wrong way to investigate this issue, it’s ignoring the gorilla in the room.

The EF6 project needed to be investigated and profiled. I wrote a lightweight tracer that logged method execution times to a file, and wrapped every method that interacted to the database and got to work.

What did I find?

I didn’t find 1 smoking gun, I found heaps of little smoking guns, adding up to a small platoon.

  1. Un-optimized EF Queries
    • Fetching large amounts of unnecessary data from the db (poor fetching strategies)
    • Read Queries with change tracking over large sets
    • Bulk inserts with change tracking over n entity inserts
  2. Pessimistic Locking
  3. Sequential Processing over mutually exclusive aggregate roots

Optimizing EF Queries

First order of business is to fix the offending EF Queries.

There’s lots of great information on the net about fixing these issues.

Personally I went through the list here provided on

Reduce unnecessary retrieved data

By returning only the data that we’re going to use will reduce the overhead of each query.

E.g. If we only want the list of pupil names of a school, it’s not necessary to retrieve the entire school.

Transform queries from this

List schools = db.Schools     
  .Where(s => s.City == city)
  .Include(x => x.Pupils)
  .ToList();

To this

var pupils = db.Pupils
  .Where(p => p.SchoolId == schoolId)
  .Select(x =>  new { x.FirstName, x.LastName })
  .ToList();

Optimize read-only queries

If you don’t need change tracking (i.e. to perform writes), add .AsNoTracking() to your query to ensure it will not be loaded into the context


List entities;

using (var context = new DBContext())
    entities = context.YourEntity
	    .AsNoTracking()
	    .Where(x => x.Name == "Example")
	    .ToList();

// ... some operation on entities that does not require a save

Speed up Bulk Adds

If you are simply adding a number of objects then saving them, EF will check whether each of those new objects has changed. To avoid this, turn off AutoDetectChangesEnabled before adding, then back on again afterwards.

using (var context = new EFContext()) {
    context.Configuration.AutoDetectChangesEnabled = false;
    try {
        // Perform adds
        context.SaveChanges();
    } 
    finally {
        context.Configuration.AutoDetectChangesEnabled = true;
    }
}

I like adding the finally clause so that even if there is a problem with SaveChanges() or the adds we will reset the AutoDetect Changes for any other operations used in this context.

Pessimistic Locking

By default, out of the box if you don’t configure SQL Server it will come with Read Committed Transaction Isolation (RCTI) at the Database Level, or Pessimistic Locking.

What RCTI essentially does is creates shared locks whenever a transaction tries to read data and blocks all other read transactions from reading that row.

Now you can imagine if you had some common query that cycled over all members of an account, if any other query lets say a long running process fetched and performed actions on those members. It could cause some significant slowdowns while waiting for locks to clear for both read and write operations.

Switching to Optimistic Locking for Reads (Read Committed Snapshot Transaction Isolation)

Read Committed Snapshot Transaction Isolation or RCSTI works a little differently. While reading it doesn’t take out a shared lock but instead relies on row versioning for its writes.

Anytime RCSTI writes data to a row, it will version the row and provide the previous version to read operations during its write operation. It will create shared locks for concurrent writes, however reads are extremely fast as there’s no locking overhead and it’s not waiting on a shared lock to clear so more than 1 transaction can read the same data at a time.

RCSTI isn’t a holy grail, it comes with it’s own issues that need to be considered.

I recommend going through this guide before implementing RCSTI for yourself. I’ve gone for a hybrid approach myself, and only enabling RCSTI on selective queries after I’ve reviewed the performance and side effects of possible out of date reads.

implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide

Sequential Processing

With all of the above issues addressed, this is by far the easiest one to address last. With the Task Parallel Library introduced to C# this has now become trivial.

The entities that were being processed in sequence were all mutually exclusive aggregate roots. This means that they were perfect candidates for multi-threading or processing in parallel. With only a single connection to the database to perform multiple reads collate data, perform calculations and then write back out to the database on conclusion, neither the database nor the processor were being taxed in any way. The time taken was essentially overhead for setting up and performing data transformations.

The End Result

The end result from all of the optimization?

For the same processing batch

  • Stored Procedures averaged 20 mins
  • Un-optimized EF6 averaged between 50-70+ mins (bad mkay?)
  • Optimized EF6 averaged 22 mins (EF overhead)
  • Optimized EF6 with TPL < 4 mins

The stored procedure was performing the entire processing inside itself as a long running procedure and so didn’t lend itself to parallel processing. Along with other issues inherent with having any business logic hidden away in a database level structure.

With converting to EF6, optimizing and then performing mutually exclusive tasks in parallel.

I’ve not only addressed the EF6 issue, but improved the performance throughput over the original stored procedure.

Next step – micro services processing

Postmortem

So if faced with the same issue again, I’d perform these steps to fix.

  1. Put tracing around all functions or methods that access the database
  2. Optimize EF Queries
    • Review Fetching Strategies
    • Remove Change Tracking for Read Queries that don’t need it
    • Remove Change Tracking for Bulk inserts
  3. Replace with Read Committed Snapshot Transaction Isolation where appropriate
  4. Parallel Processing over mutually exclusive aggregate roots

Simply investigating the database in isolation is only looking at part of the whole picture with your eyes closed, in the dark, in the basement.

It smells down there, could be GRU’s.

Comment Form

About Justin

justin

Justin is a Senior Software Engineer living in Brisbane. A Polyglot Developer proficient in multiple programming languages including [C#, C/C++, Java, Android, Ruby..]. He's currently taking an active interest in Teaching Kids to Code, Functional Programming, Robotics, 3D Printers, RC Quad-Copters and Augmented Reality.

About This Blog

Software Engineering is an art form, a tricky art form that takes as much raw talent as it does technical know how. I'll be posting articles on professional tips and tricks, dos and donts, and tutorials.

profile for Justin Shield on Stack Exchange, a network of free, community-driven Q&A sites

Photostream