Tray2.se Logo

Properly formed foreign keys are your best friends

A few days ago a user at the Laracasts forum had performance issue with one of his queries that used joins. While a join can cause some performance slow-downs, it shouldn't be anywhere near the 12 seconds he claimed that the query took. This post here will explain what the issue probably was and how to speed that up. I will be using two tables, authors and books in my example. There is a classic one-to-many relationship between them.

I've created a fresh Laravel project for this, configured the database, and have created the following using the artisan command.

  • Author model
  • Author factory
  • Authors migration
  • Book model
  • Book factory
  • Books migration

So let's start with setting up the migrations for our two tables.

Our authors table will just contain the name of the author.

1public function up()
2{
3 Schema::create('authors', function (Blueprint $table) {
4 $table->id();
5 $table->string('author');
6 $table->timestamps();
7 });
8}

And for simplicity’s sake we keep the columns in the books table to a bare minimum as well.

1public function up()
2{
3 Schema::create('books', function (Blueprint $table) {
4 $table->id();
5 $table->unsignedBigInteger('author_id');
6 $table->string('title');
7 $table->integer('published');
8 $table->timestamps();
9 });
10}

Now let's update the factories that we created as well, so that we can easily generate fake data.

In the AuthorFactory we add a faker for the name.

1public function definition()
2{
3 return [
4 'name' => $this->faker->name(),
5 ];
6}

And for the BookFactory we add fakers for the title and published columns. We also call the AuthorFactory to create the author for the book.

1public function definition()
2{
3 return [
4 'title' => $this->faker->words(3, true),
5 'published' => $this->faker->year(),
6 'author_id' => Author::factory()->create()->id,
7 ];
8}

Now that we defined the migrations, model and factories let's migrate our database, and fake up some data to play around with.

1php artisan migrate

We should see something similar to this.

1INFO Preparing database.
2 
3 Creating migration table ......................................... 86ms DONE
4 
5 INFO Running migrations.
6 
7 2014_10_12_000000_create_users_table ............................ 155ms DONE
8 2014_10_12_100000_create_password_resets_table ................... 88ms DONE
9 2019_08_19_000000_create_failed_jobs_table ...................... 125ms DONE
10 2019_12_14_000001_create_personal_access_tokens_table ........... 172ms DONE
11 2022_09_13_032339_create_authors_table ........................... 26ms DONE
12 2022_09_13_032349_create_books_table ............................. 29ms DONE

Now you can use php artisan tinker to run the following code to create the fake data, but I prefer using Tinkerwell.

1Author::factory()->count(1000)->create()->each(function ($author){
2 Book::factory()->count(10)->create([
3 'author_id' => $author->id,
4 ]);
5});

Now we should have 1000 authors and with 10 books each, totaling 10000 books.

1MariaDB [foreign_keys]> select count(*) from authors;
2+----------+
3| count(*) |
4+----------+
5| 1000 |
6+----------+
71 row in set (0,000 sec)
8 
9MariaDB [foreign_keys]> select count(*) from books;
10+----------+
11| count(*) |
12+----------+
13| 10000 |
14+----------+
151 row in set (0,000 sec)

So let's start querying the database.

We will start with a basic join and select all the records.

1SELECT authors.name, books.title
2FROM books
3JOIN authors
4ON books.author_id = authors.id;

This will give us a list of 10000 books and the author connected to the book.

1| Kade Huel | autem et pariatur |
2| Kade Huel | aperiam aliquid nulla |
3| Kade Huel | et cupiditate sint |
4| Kade Huel | saepe quia at |
5| Kade Huel | ea dolorum accusantium |
6| Kade Huel | ad velit quia |
7| Kade Huel | vitae laborum iusto |
8| Kade Huel | ea vero cum |
9| Kade Huel | consequatur vel porro |
10| Kade Huel | vel assumenda dicta |
11+-------------------------------+----------------------------------------+
1210000 rows in set (0,005 sec)

That is pretty fast, so let's take a look at the execution plan. To do that we put the word EXPLAIN in front of the query like so.

1EXPLAIN SELECT authors.name, books.title
2FROM books
3JOIN authors
4ON books.author_id = authors.id;
1MariaDB [foreign_keys]> EXPLAIN SELECT authors.name, books.title FROM books JOIN authors ON books.author_id = authors.id;
2+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
3| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
4+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
5| 1 | SIMPLE | books | ALL | NULL | NULL | NULL | NULL | 10000 | |
6| 1 | SIMPLE | authors | eq_ref | PRIMARY | PRIMARY | 8 | foreign_keys.books.author_id | 1 | |
7+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
82 rows in set (0,000 sec)

Since we didn't use any conditions other than that the books author_id should equal the id in the authors table.

So let's up the ante a bit and tell it that we want only the books with the author_id of 10.

1SELECT authors.name, books.title
2FROM authors
3JOIN books
4ON books.author_id = authors.id
5WHERE author_id = 10;

The result would look something like this.

1+-----------------+------------------------+
2| name | title |
3+-----------------+------------------------+
4| Blaise Turcotte | maxime nisi ut |
5| Blaise Turcotte | qui dolores est |
6| Blaise Turcotte | assumenda optio velit |
7| Blaise Turcotte | dolor nam voluptatem |
8| Blaise Turcotte | voluptas maiores qui |
9| Blaise Turcotte | omnis cumque molestias |
10| Blaise Turcotte | nisi id quisquam |
11| Blaise Turcotte | eum temporibus maiores |
12| Blaise Turcotte | dolor eum quae |
13| Blaise Turcotte | dolore facere quo |
14+-----------------+------------------------+
1510 rows in set (0,002 sec)

It is still pretty fast as you can see, but if we run an explain on the query we will see something nasty.

1+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
2| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
3+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
4| 1 | SIMPLE | authors | const | PRIMARY | PRIMARY | 8 | const | 1 | |
5| 1 | SIMPLE | books | ALL | NULL | NULL | NULL | NULL | 10000 | Using where |
6+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
72 rows in set (0,000 sec)

Take a look at the rows on the second line in the table, we are looking at 10000 records to find the books that belongs to the author with the id of 10, but what if we change the query and query for the id of the authors table instead of the author_id on the books table?

Well let's find out.

1SELECT authors.name, books.title
2FROM authors
3JOIN books
4ON books.author_id = authors.id
5WHERE authors.id = 10;

Well the time seems to be the same.

1+-----------------+------------------------+
2| name | title |
3+-----------------+------------------------+
4| Blaise Turcotte | maxime nisi ut |
5| Blaise Turcotte | qui dolores est |
6| Blaise Turcotte | assumenda optio velit |
7| Blaise Turcotte | dolor nam voluptatem |
8| Blaise Turcotte | voluptas maiores qui |
9| Blaise Turcotte | omnis cumque molestias |
10| Blaise Turcotte | nisi id quisquam |
11| Blaise Turcotte | eum temporibus maiores |
12| Blaise Turcotte | dolor eum quae |
13| Blaise Turcotte | dolore facere quo |
14+-----------------+------------------------+
1510 rows in set (0,002 sec)

The execution plan as well.

1+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
2| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
3+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
4| 1 | SIMPLE | authors | const | PRIMARY | PRIMARY | 8 | const | 1 | |
5| 1 | SIMPLE | books | ALL | NULL | NULL | NULL | NULL | 10000 | Using where |
6+------+-------------+---------+-------+---------------+---------+---------+-------+-------+-------------+
72 rows in set (0,000 sec)

We are still looking at 10000 rows, but what if we add an index on the author_id column in the books table, sure we can do that, but there is a better way to handle this, and that is called a foreign key constraint, it is an index but with some added benefits, we can make sure that we can't create a book with an author_id that does not exist in the authors table, and we can utilize something called the cascade, which basically is a way to control what happens to a record when a certain action is done on the table that the foreign key references. For example, we can delete all the books belonging to an author when the author is deleted, but we can talk more about that in another post.

So what do we need to do to create a foreign key constraint, well all we need to do is update our migration for the books table, migrate the database, and reseed it with fake data.

So in the books table migration we change the author_id column like so.

1//$table->unsignedBigInteger('author_id');
2 $table->foreignId('author_id')->constrained()

Then we run this command to reseed the database.

php artisan migrate:fresh

After that we can once again seed the database with fake data.

1Author::factory()->count(1000)->create()->each(function ($author){
2 Book::factory()->count(10)->create([
3 'author_id' => $author->id,
4 ]);
5});

Okay, now let's run the query again, remember that the last run took 0,002 sec.

1SELECT authors.name, books.title
2FROM authors
3JOIN books
4ON books.author_id = authors.id
5WHERE authors.id = 10;
1+------------------+-------------------------------+
2| name | title |
3+------------------+-------------------------------+
4| Dr. Neva Kilback | velit eum est |
5| Dr. Neva Kilback | eligendi soluta aut |
6| Dr. Neva Kilback | a velit aut |
7| Dr. Neva Kilback | soluta distinctio dolorem |
8| Dr. Neva Kilback | est aut assumenda |
9| Dr. Neva Kilback | occaecati vero iste |
10| Dr. Neva Kilback | ratione rerum qui |
11| Dr. Neva Kilback | voluptates reiciendis error |
12| Dr. Neva Kilback | laborum laudantium aut |
13| Dr. Neva Kilback | voluptate dignissimos dolores |
14+------------------+-------------------------------+
1510 rows in set (0,000 sec)

Wow, now it took so little time it didn't even register, so let's look at the plan.

1+------+-------------+---------+-------+-------------------------+-------------------------+---------+-------+------+-------+
2| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
3+------+-------------+---------+-------+-------------------------+-------------------------+---------+-------+------+-------+
4| 1 | SIMPLE | authors | const | PRIMARY | PRIMARY | 8 | const | 1 | |
5| 1 | SIMPLE | books | ref | books_author_id_foreign | books_author_id_foreign | 8 | const | 10 | |
6+------+-------------+---------+-------+-------------------------+-------------------------+---------+-------+------+-------+
72 rows in set (0,000 sec)

If we look at the rows column we see that instead of looking at 10000 rows we only look at 10 rows, and if we look as the type column, it is now ref instead of all as it was previously. We can also see that the query now both has the possibility to use and is using the books_author_id_foreign index. This is exactly what we want.

So we gained about 0.0002 seconds in the execution, big deal right?

Yes, it's a huge deal, let me prove it to you.

First we change the migration back to the way we wrote it the first time.

1$table->unsignedBigInteger('author_id');
2//$table->foreignId('author_id')->constrained();

We then migrate fresh to make the changes in the database.

php artisan migrate:fresh

We then make a small change to our fake data producing code, so that we create 10000 authors with 100 books each. This will take some time to generate, we are talking about one million records here so be patient.

1Author::factory()->count(10000)->create()->each(function ($author){
2 Book::factory()->count(100)->create([
3 'author_id' => $author->id,
4 ]);
5});

Now we can run the query again.

1SELECT authors.name, books.title
2FROM authors
3JOIN books
4ON books.author_id = authors.id
5WHERE author_id = 10;

The time has increased by a lot as you can see.

1| Zachary Leannon | qui sint odit |
2| Zachary Leannon | qui enim quia |
3| Zachary Leannon | beatae reiciendis culpa |
4| Zachary Leannon | ut dolorum repudiandae |
5| Zachary Leannon | excepturi dolor explicabo |
6| Zachary Leannon | et sunt optio |
7+-----------------+------------------------------------+
8100 rows in set (0,150 sec)

From 0,002 to 0,150, now that is quite a lot, but understandable since it needs to go through a million records. If you don't believe me we can check the execution plan.

1+------+-------------+---------+-------+---------------+---------+---------+-------+---------+-------------+
2| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
3+------+-------------+---------+-------+---------------+---------+---------+-------+---------+-------------+
4| 1 | SIMPLE | authors | const | PRIMARY | PRIMARY | 8 | const | 1 | |
5| 1 | SIMPLE | books | ALL | NULL | NULL | NULL | NULL | 1000000 | Using where |
6+------+-------------+---------+-------+---------------+---------+---------+-------+---------+-------------+
72 rows in set (0,000 sec)

Now we need to change it back again so that we create the foreign key, but since it took so long time to seed the database, we can try to add the foreign key constraint directly in the database, so we don't have to reseed it.

1ALTER TABLE books
2 ADD FOREIGN KEY
3 (author_id)
4 REFERENCES authors (id);

That took about 5 seconds to complete, while doing reseed would have taken several minutes.

Now let's run our query again and see what happens.

1SELECT authors.name, books.title
2FROM authors
3JOIN books
4ON books.author_id = authors.id
5WHERE author_id = 10;
1| Zachary Leannon | quia error ipsum |
2| Zachary Leannon | asperiores minima ipsam |
3| Zachary Leannon | qui sint odit |
4| Zachary Leannon | qui enim quia |
5| Zachary Leannon | beatae reiciendis culpa |
6| Zachary Leannon | ut dolorum repudiandae |
7| Zachary Leannon | excepturi dolor explicabo |
8| Zachary Leannon | et sunt optio |
9+-----------------+------------------------------------+
10100 rows in set (0,026 sec)

It took 0,026 seconds instead of the 0,150 it did before. So I would say that is a good improvement.

So remember when to always create your foreign keys properly with a foreign key index, it will speed up your application. If you do that your joins will be faster and those 12 seconds I mentioned in the beginning of this post will be nothing but a memory.

There are of course things we need to consider. The data is never inserted the way we did it, we inserted the book 100 at the time, and they all had the same author id, so the database was pretty much ordered from the beginning. Now we did this with one million records, what if you have 50 million records, well the foreign key index will help you decrease the execution time, there are however limits to what an index can do, and if that limit is reached you can use something called partitions, which is way beyond the scope of this post.

//Tray2

© 2024 Tray2.se. All rights reserved.

This site uses Torchlight for syntax highlighting.