PostgreSQL - Leaderboard Query Example
Today, I needed to get my PostgreSQL database to generate a leaderboard of a user and their friends from the entire history of scores. The critical thing here is that it should show the "global" ranking of each player against everyone in the game, but only display the user and their friends. E.g if the users we care about were 2, 4, and 6 then it would need to output something like:
user_id | rank | total_score
---------+------+-------------
4 | 2 | 15
6 | 4 | 21
2 | 7 | 42
Setup
CREATE TABLE "score_history" (
"id" int NOT NULL,
"user_id" int NOT NULL,
"score" int NOT NULL,
"created_at" int NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO score_history (id, user_id, score, created_at) VALUES
(1, 1, 10, 1562004653),
(2, 1, 11, 1562004654),
(3, 1, 12, 1562004655),
(4, 2, 13, 1562004656),
(5, 2, 14, 1562004657),
(6, 2, 15, 1562004658),
(7, 3, 16, 1562004659),
(8, 3, 10, 1562004660),
(9, 3, 11, 1562004661),
(10, 4, 12, 1562004662),
(11, 4, 1, 1562004663),
(12, 4, 2, 1562004664),
(13, 5, 3, 1562004665),
(14, 5, 4, 1562004666),
(15, 5, 5, 1562004667),
(16, 6, 6, 1562004668),
(17, 6, 7, 1562004669),
(18, 6, 8, 1562004670),
(19, 7, 9, 1562004671),
(20, 7, 10, 156200472);
Queries
Here we will start simple, and "build up" to our final query which will output the desired effect.
The query below will get the total score of all the users ordered by user_id
SELECT
user_id
, SUM(score) as score_sum
FROM
score_history
GROUP BY user_id
ORDER BY user_id ASC;
user_id | score_sum
---------+-----------
1 | 33
2 | 42
3 | 37
4 | 15
5 | 12
6 | 21
7 | 19
If you wanted to order by the score instead of user, you would just add a simple ORDER BY
clause at the end:
SELECT
user_id
, SUM(score) as total_score
FROM
score_history
GROUP BY user_id
ORDER BY total_score DESC;
user_id | total_score
---------+-------------
2 | 42
3 | 37
1 | 33
6 | 21
7 | 19
4 | 15
5 | 12
Now if we wanted a column which showed the user's rank, we would make use of the RANK()
window function on the nesting of the previous example query:
SELECT
user_id,
RANK () OVER (ORDER BY total_score DESC) AS rank,
total_score
FROM (
SELECT
user_id,
SUM(score) as total_score
FROM
score_history
GROUP BY user_id
ORDER BY user_id ASC
) as myTableAlias
ORDER BY user_id;
RANK()
is like ROW_NUMBER()
except if there are two scores that are the same, they will be given the same rank, and the next rank number will be skipped. E.g. two people could be rank 2, but in such a case, there would be no rank 3.
user_id | rank | total_score
---------+------+-------------
1 | 3 | 33
2 | 1 | 42
3 | 2 | 37
4 | 6 | 15
5 | 7 | 12
6 | 4 | 21
7 | 5 | 19
Finally, when you have hundreds of users and you just want to show a user their ranking against their peers, we would nest the previous query one more time to filter out only the users we care about. In this case we are fetching the "leaderboard" for user 4 and their friends, 2 and 6.
SELECT * FROM
(
SELECT
user_id,
RANK () OVER (ORDER BY total_score DESC) AS rank,
total_score
FROM (
SELECT
user_id,
SUM(score) as total_score
FROM
score_history
GROUP BY user_id
ORDER BY user_id ASC
) as myTableAlias
) as mySecondTableAlias
WHERE user_id IN(4,2,6);
user_id | rank | total_score
---------+------+-------------
2 | 1 | 42
6 | 4 | 21
4 | 6 | 15
Performance
Trying to calculate a global ranking of all players across all time would be quite intensive. Luckily, the parallel queries feature in PostgreSQL should help quite a bit automatically if your server has multiple cores. Databases such as MySQL do not (yet) support this though.
First published: 1st July 2019