Programster's Blog

Tutorials focusing on Linux, programming, and open-source

PostgreSQL - Generated Columns

Similar / Related Posts

About

  • A generated column is a special column that is always computed from other columns.
  • Support for generated columns was added in PostgreSQL 12.

Virtual And Stored

  • PostgreSQL has only implemented support for stored generated columns, and thus the STORED keyword must be provided.
    • The other type of generated column that is not supported is virtual.
  • A stored generated column is one which is calculated whenever the relevant related data is updated/inserted, and the result is stored and thus takes up storage. On the other hand, a virtual one would simply be calculated at the point of running queries, and would not be stored anywhere.
  • These stored generated columns can be indexed, which can dramatically improve query performance.
  • Due to the fact that the value is calculated just once, and stored, this is very efficient for read-heavy workloads, unlike virtual generated columns.

Example

The following example creates a table for rectangles, whose area can be calculated as the width times the height.

CREATE TABLE rectangles (
    "id" UUID PRIMARY KEY,
    "width" DECIMAL(3,2), 
    "height" DECIMAL (3,2), 
    "area" DECIMAL(6,4) GENERATED ALWAYS AS ("width" * "height") STORED
);

We use a stored generated column to store the area of the rectangle so that we don't have to perform this mathematical operation in our queries or application layer. By using a generated column rather than a non-generated one, we do not risk having data out-of-sync. E.g. the application layer updates the width or height, but forgets to update the area.

Optionally Add Index

As stated before, one can index these generated columns, which may be useful in situations where you have millions of rows, and want to quickly sort by the area, or just pick the largest/smallest.

CREATE INDEX on rectangles ("area");

Insert Examples

INSERT INTO rectangles ("id", "width", "height") VALUES 
  ('96c90ad7-7fd6-4e7d-87e7-cf9346cd086a', 3.1, 3.2),
  ('96c90af1-9276-456a-948c-cecdf81fdb9d', 4.5, 2.3),
  ('96c90af1-9357-4189-820a-ea11adbe4f20', 6.4, 1.2)
;

Now we can select everything from the rectangles table, which will now show the area.

SELECT * FROM rectangles;
                  id                  | width | height |  area   
--------------------------------------+-------+--------+---------
 96c90ad7-7fd6-4e7d-87e7-cf9346cd086a |  3.10 |   3.20 |  9.9200
 96c90af1-9276-456a-948c-cecdf81fdb9d |  4.50 |   2.30 | 10.3500
 96c90af1-9357-4189-820a-ea11adbe4f20 |  6.40 |   1.20 |  7.6800
(3 rows)

Now we can find the largest rectangle by simply executing:

SELECT * FROM rectangles ORDER BY area DESC limit 1;
                  id                  | width | height |  area   
--------------------------------------+-------+--------+---------
 96c90af1-9276-456a-948c-cecdf81fdb9d |  4.50 |   2.30 | 10.3500
(1 row)

If we didn't have the generated column, we would have to do something like below, which looks very messy/confusing, especially with all the * symbols.

SELECT 
  *, 
  width * height as area 
FROM rectangles 
ORDER BY area 
DESC LIMIT 1;

This query won't work whilst the table has the generated column called area.

References

Last updated: 3rd September 2023
First published: 15th July 2022