Programster's Blog

Tutorials focusing on Linux, programming, and open source

MySQL Left Joins

This is a medium level tutorial for SQL. If you are new to SQL/databases, you may want to check out this beginner's guide to SQL.

Let's start with the definition of a left join (also called the "left outer join") from W3 Schools.

"The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL in the right side when there is no match."

[Image also from W3 Schools]

Reading that is straight forward, but I find that it's more helpful to demonstrate left joins. This will show you their benefits, caveats, and where to use them.

Database Setup

Lets create a databse to play with. On a MySql server, run the following commands:

create database left_joins;
use left_joins

We need to create some data to play with, so let's create our users table, representing all the users on our website.

CREATE TABLE `users` (
    `id` INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `first_name` VARCHAR(30) NOT NULL,
    `last_name` VARCHAR(30) NOT NULL,
    `email` VARCHAR(100) NOT NULL,
    UNIQUE INDEX (`email`)
);

INSERT INTO `users` (`email`, `first_name`, `last_name`) VALUES 
('user1@programster.org', 'Jon', 'Snow'), 
('user2@programster.org', 'Tyrion', 'Lannister'), 
('user3@programster.org', 'Arya', 'Stark'),
('user4@programster.org', 'Joffrey', 'Baratheon');

Now let's imagine that our site has lots of different permissions, for which we need a permissions table because there is a 1 to many relationship (e.g. each user may have any number of permissions, including none at all).

CREATE TABLE `permissions` (
    `id` INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `name` varchar(100) NOT NULL
)

INSERT INTO `permissions` (`name`) VALUES 
('upload_data'), 
('read_data'), 
('process_data'), 
('delete_users'), 
('add_users');

Now let's assign permissions to various users, remembering that a user is likely to have more than one permission assigned to them, or may have none at all!

CREATE TABLE `user_permissions` (
    `user_id` int(11) NOT NULL,
    `permission_id` int(11) NOT NULL,
    PRIMARY KEY (`user_id`,`permission_id`),
    KEY `user_id` (`user_id`),
    KEY `permission_id` (`permission_id`)
);

INSERT INTO `user_permissions` (`user_id`, `permission_id`) 
VALUES 
(1, 2), 
(2, 2), 
(3, 2), 
(3, 5), 
(3, 4), 
(2, 5), 
(3, 1), 
(2, 1)

To make things interesting, let's simulate that a user once existed in our database that had some permissions, but has since been removed by the application, but was not removed cleanly, and is thus still in the user_permissions table but not the users table.

INSERT INTO `user_permissions` (`user_id`, `permission_id`) VALUES (5, 2)

Fetching Data

Now let's imagine that we want to fetch all of the data about a user that has just logged in with their email address. This includes their name and all of their permissions. We could run the following queries which would give us all of the data in two separate tables that we would have to parse on the application level.

SELECT * FROM `users` WHERE `email`='user2@programster.org';

+----+------------+-----------+-----------------------+
| id | first_name | last_name | email                 |
+----+------------+-----------+-----------------------+
|  2 | Tyrion     | Lannister | user2@programster.org |
+----+------------+-----------+-----------------------+

SELECT * FROM `user_permissions` 
WHERE `user_id` in (
    SELECT `id` FROM `users` WHERE `email`='user2@programster.org'
);

+---------+---------------+
| user_id | permission_id |
+---------+---------------+
|       2 |             1 |
|       2 |             2 |
|       2 |             5 |
+---------+---------------+

This queries took 0.00 and 0.01 seconds respectively.

Alternatively, we could submit just one query by using a left join:

SELECT * FROM `users` 
LEFT JOIN `user_permissions` on `users`.`id` = `user_permissions`.`user_id`
WHERE `email`='user2@programster.org';

This query is faster (takes 0.00 secs), and we only had to wait for 1 query instead of two. This also results in all of our data being easier to parse by the application, as shown by the result below:

+----+------------+-----------+-----------------------+---------+---------------+
| id | first_name | last_name | email                 | user_id | permission_id |
+----+------------+-----------+-----------------------+---------+---------------+
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             1 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             2 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             5 |
+----+------------+-----------+-----------------------+---------+---------------+

An inner join with the following query would have fetched the same result which may be even faster:

SELECT * FROM `users`, `user_permissions` 
WHERE `users`.`id` = `user_permissions`.`user_id` 
AND `users`.`email`='user2@programster.org';

This begs the question of why you would want to use a LEFT join in the first place? Imagine that you want a list of all of your users, and their respective permissions. The following query uses a lef join:

SELECT * FROM `users` 
LEFT JOIN `user_permissions` on `users`.`id` = `user_permissions`.`user_id`;

+----+------------+-----------+-----------------------+---------+---------------+
| id | first_name | last_name | email                 | user_id | permission_id |
+----+------------+-----------+-----------------------+---------+---------------+
|  1 | Jon        | Snow      | user1@programster.org |       1 |             2 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             1 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             2 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             5 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             1 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             2 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             4 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             5 |
|  4 | Joffrey    | Baratheon | user4@programster.org |    NULL |          NULL |
+----+------------+-----------+-----------------------+---------+---------------+

Compare this to using an inner join:

SELECT * FROM `users`, `user_permissions` 
WHERE `users`.`id` = `user_permissions`.`user_id`;
+----+------------+-----------+-----------------------+---------+---------------+
| id | first_name | last_name | email                 | user_id | permission_id |
+----+------------+-----------+-----------------------+---------+---------------+
|  1 | Jon        | Snow      | user1@programster.org |       1 |             2 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             1 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             2 |
|  2 | Tyrion     | Lannister | user2@programster.org |       2 |             5 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             1 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             2 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             4 |
|  3 | Arya       | Stark     | user3@programster.org |       3 |             5 |
+----+------------+-----------+-----------------------+---------+---------------+

As you can see, the inner join "forgets" about Joffrey Baratheon because he doesn't have any permissions yet!

Now imagine that you want to clean up your database by finding all of the permissions that are assigned to users that no longer exist. This would require us to perform a Left join on the user permissions table that excludes any matches from the users table.

SELECT * FROM `user_permissions` 
LEFT OUTER JOIN `users` on `users`.`id` = `user_permissions`.`user_id`
WHERE `users`.`id` IS NULL;

+---------+---------------+------+------------+-----------+-------+
| user_id | permission_id | id   | first_name | last_name | email |
+---------+---------------+------+------------+-----------+-------+
|       5 |             2 | NULL | NULL       | NULL      | NULL  |
+---------+---------------+------+------------+-----------+-------+

Conclusion

You should now have a basic understanding of how to perform Left joins, and start to recognize what situations they will be useful in, rather than executing multiple queries.