Raw PHP and MySQL Eager Loading

p3pj7joyvnm

Working with complicated and joined data sets can be tricky for MySQL database performance. For example, What if you need to load multiple authors and their books from a database. How would you work around the N+1 problem and avoid unnecessary queries to the database without an expensive join statement? The answer is quite simple.

Use eager loading.

Eager Loading

Eager loading is merely fetching records preemptively. For example, when the author records are requested, you can aggregate their IDs. Then, you can query for only the books used by those authors beforehand.

Note that our example could use a join but I’m using a simple example to demonstrate the point.

SELECT * FROM `authors` LIMIT 3;

If this query returns the authors of the IDs 1,2,3 you can save queries on books by running just one select statement,

SELECT * FROM `books` WHERE `author_id` IN ( 1,2,3 );

PHP: Solutions

There are two great ways to work with eager loading in PHP.

  1. Using a handy ORM.
  2. Manually using PHP’s PDO.

ORM

An ORM such as Eloquent in Laravel makes eager loading is effortless. Laracasts has a great video explaining the situation and then covers the solution, and there is a more recent video if you have an account.

PDO

There are times you do not want the overhead of an ORM. You might also want to build eager loading yourself. In these cases, you need to use PHP’s PDO class.

Let’s say we have 10 authors. If you queried their books without eager loading, you would run 11 SQL queries.

$authors = $pdo->query("SELECT * FROM `authors` LIMIT 10;");

// Query books 10 times!
foreach($authors as $author) {
    var_dump($author);
    $id = (int) $author['id'];
    
    $books = $pdo->query("SELECT * FROM `books` WHERE `author_id` = {$id}");
    foreach($books as $book) {
        var_dump($book);
    }
}

The code is beautiful and brief but at the expense of speed. 11 queries are not ideal. Imagine the crawling speed of 100 authors. Ouch!

If you use eager loading, you can have the same net result with only 2 queries.

$authors = $pdo->query("SELECT * FROM `authors` LIMIT 10;");

// Setup eager list of IDs
$ids = [];
foreach($authors as $author) {
    $ids[] = (int) $author['id'];
}
$author_ids = implode(',', $ids);

// Get all books and group them
$grouped_books = [];
$all_books = $pdo->query("SELECT * FROM `books` WHERE `author_id` IN ( {$author_ids} )");
foreach($all_books as $book) {
    $grouped_books[$book['author_id']][] = $book;
}

// Final loop
foreach ($authors as $author) {
    var_dump($author);
    var_dump($grouped_books[ $author['id'] ]);
}

Limiting

This appreach works well for simple queries. However, if you wanted to limit the number of books per author to 3 the query becomes complex.

  1. What if each author has a different numbers of books published when compared to the other authors?
  2. What if you only want a subset of books per author?

If you have 3 authors each with a different number of books and you only want the latest 3 books from each author you can not use LIMIT 9 and be on your merry way.

To solve this issue you will need to use ROW_NUMBER. In MySQL 8 you have access to this window function. Let’s use ROW_NUMBER to limit the result set:

SELECT book_title, author_id, id, 
ROW_NUMBER() OVER (PARTITION BY author_id ORDER BY id) as row_number
FROM books WHERE row_number < 4 AND `author_id` IN ( {$author_ids} );

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.