Every once in a while when I’m working with Rails, I face a little problem like this: given two models, Book and Category, book belonging to a category, list all books ordered by its category’s name, showing it alongside with the it’s name.

That’s easy enough, right?

ca1 = Category.create! name: 'Category A'
ca2 = Category.create! name: 'Category B'
Book.create! name: 'Book A', category: ca1
Book.create! name: 'Book B', category: ca1
Book.create! name: 'Book C', category: ca2

category_name = Category.arel_table[:name]
Book.eager_load(:category).order(category_name).each do |book|
  puts "#{book.category.name} - #{book.name}"
end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C

The code above generates the SQL below:

SELECT "books"."id" AS t0_r0, "books"."name" AS t0_r1,
"books"."category_id" AS t0_r2, "books"."created_at" AS t0_r3,
"books"."updated_at" AS t0_r4, "categories"."id" AS t1_r0,
"categories"."name" AS t1_r1, "categories"."created_at" AS t1_r2,
"categories"."updated_at" AS t1_r3 FROM "books"
LEFT OUTER JOIN "categories" ON "categories"."id" = "books"."category_id"
ORDER BY "categories"."name";

I don’t like this SQL. If I have a lot of books and categories, its performance will not be as good as it could. Let’s make it better.

category_name = Category.arel_table[:name]
Book.joins(:category).preload(:category).order(category_name) \
  .each do |book|
    puts "#{book.category.name} - #{book.name}"
  end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C

The SQL now will be:

SELECT "books".* FROM "books"
INNER JOIN "categories" ON "categories"."id" = "books"."category_id"
ORDER BY "categories"."name";

SELECT "categories".* FROM "categories" WHERE "categories"."id" IN (1, 2);

Despite the fact that two queries need to be run for this approach, it is, in fact, better than the former. It’s faster, when you have a large amount of data.

Very well. Now, let’s create a fourth book:

Book.create! name: 'Book D'
category_name = Category.arel_table[:name]
Book.joins(:category).preload(:category).order(category_name) \
  .each do |book|
    puts "#{book.category.name} - #{book.name}"
  end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C

Did you notice something wrong? The fourth book is not shown on the list. That’s because it has no category. The INNER JOIN generated by ActiveRecord excludes this record from the query.

A Known Solution

The solution here, as you might know, is to use LEFT JOIN instead of INNER JOIN. The first algorithm shown here uses it LEFT JOIN, but it’s slower.

Digging on Google brought me to this solution:

category_name = Category.arel_table[:name]
Book.joins('LEFT JOIN categories ON books.category_id = categories.id') \
  .preload(:category).order(category_name).each do |book|
    puts "#{book.category.try(:name)} - #{book.name}"
  end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C
#=>  - Book D

The pros:

  • It works;
  • it’s faster than the eager_load.

But it looks terrible, don’t you think? The cons:

  • It’s not database agnostic;
  • I have to explicitly define in raw SQL how these two tables are called and how they are related to each other;
  • It’s not DRY, since I already defined their names and relashionship in the model.

Arel to the Rescue!

I found another solution:

categories = Category.arel_table
books = Book.arel_table

books_categories = books.join(categories, Arel::Nodes::OuterJoin) \
  .on(books[:category_id].eq(categories[:id])).join_sources

category_name = Category.arel_table[:name]
Book.joins(books_categories).preload(:category).order(category_name) \
  .each do |book|
    puts "#{book.category.try(:name)} - #{book.name}"
  end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C
#=>  - Book D

The pros:

  • It’s database agnostic;
  • I do not need to know the tables’ names.

The cons:

  • It’s too much code to do too little;
  • I still have to specify how the relashionship is done.

Based on the Arel solution, I created a gem that adds this functionality to ActiveRecord:

  gem 'left_join'

Now, the version using the gem:

category_name = Category.arel_table[:name]
Book.left_join(:category).preload(:category).order(category_name) \
  .each do |book|
    puts "#{book.category.try(:name)} - #{book.name}"
  end
#=> Category A - Book A
#=> Category A - Book B
#=> Category B - Book C
#=>  - Book D

Much cleaner, right? The advantages:

  • It’s faster than eager_load;
  • It’s database agnostic;
  • I don’t need to write any raw SQL;
  • I don’t need to repeat information about tables or relationships;
  • It works like joins does, so it’s possible to LEFT JOIN a chain of associations.

That’s all for now. Thanks for reading!