/ mysql

MySQL Enums

Enums can be a great way to store a value for a column when there are only a finite number of possible string values. A common example for this could be the status for a support ticket. These could be open, assigned, resolved, and closed. Another example could be the list of t-shirt sizes which are small, medium, large, and x-large. Technically, you could use an enum to store integer values, but I strongly recommend that you do not do this as it will lead to all kinds of confusion.

Here is an example of creating a table that has an enum for the status.

CREATE TABLE `support_tickets` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `subject` varchar(255) NOT NULL,
    `message` text NOT NULL,
    `status` ENUM('open', 'assigned', 'resolved', 'closed') NOT NULL,
    `modified_timestamp` timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The values you give to the column are given an index starting at 1. So for the definition above, the index for open is 1, and the index for closed is 4. The index plays an important factor in many of the behaviours when interacting with an enum column, so be very careful of the order when defining the enum.

Inserting Values

When inserting values for an enum, you can use either the string representation or the index. So you could use either of the following Insert statements which would have the same result:

INSERT INTO `support_tickets` 
(`subject`, `message`, `status`) 
VALUES("test", "this is a test", "open");
INSERT INTO `support_tickets` 
(`subject`, `message`, `status`) 
VALUES("test", "this is a test", 1);

Gotchas

Below are situations where behaviour doesn't act as you might expect.

Sorting

Sorting of a column that is an enum will sort by the index, and not by alphabetically by the string value. So if you were to sort by ticket status ascending, you would have the open tickets first and closed tickets last, instead of the assigned tickets first and the resolved tickets last. This may or may not be what you want.

If you wish to sort by the string value of an enum rather than by its index then use CAST. For example:

SELECT * FROM support_tickets ORDER BY CAST(status AS CHAR)

Inserting Values

If strict mode is not enabled, then if you were to try to insert a value that was not in the list of possible values, then the value would be set to an empty string with an index of 0. If strict mode is enabled then this would throw an error and insert nothing.

Even if strict mode is enabled and the enum column has no default, then an insert statement that does not provide a value for the enum column will result in having a value of the first index and won't throw an error. This is extremely bad in my opinion and is the number one reason why I won't use ENUMs in my databases.

Reference Tables

Instead of using Enums, I recommend that you use reference tables instead. A refence table is another table that is created which is referenced by ID from other tables. Below is an example of creating a reference table for the support_tickets example given earlier:

CREATE TABLE `statuses` (
    `id` int UNSIGNED NOT NULL AUTO_INCREMENT,
    `status` varchar(255) NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `statuses` VALUES
(1, 'open'),
(2, 'assigned'),
(3, 'resolved'),
(4, 'closed');

Now when we provide a status for a ticket, we need to use the ID of the status in the statuses table, rather than the string value, so the definition of the support_tickets table should be:

CREATE TABLE `support_tickets` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `subject` varchar(255) NOT NULL,
    `message` text NOT NULL,
    `status_id` int UNSIGNED NOT NULL,
    `modified_timestamp` timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

A reference table can be used in all situations where you are considering using an Enum, and will save you a lot of hassle in the long term for a short term pain of having to set up another table. It resolves the gotchas above, and also solves the problems described below.

General Reasons For Not Using ENUMs

Changing Enum Definition

Changing the possible options for an enum is extremely slow, even if all you are doing is adding another option, rather than editing the existing ones. You're probably going to say that your list of options is never going to change, but if I had a penny for every time that I heard that statement and it was incorrect I'd be a rich man. Making the change to a reference table is almost instant and effortless.

Getting Values For A Drop-Down Menu

When creating a form for editing support tickets, you probably want an HTML select menu with the list of values the user can choose for the status. You are going to have to hard-code the possible values in the menu rather than selecting distinct values from your table for that column, because otherwise you are going to prevent the user for selecting a status for a ticket if there isn't already another ticket in the table with that status. This is a chicken and the egg situation. If you had a reference table, then you could select its distinct values and your drop down menu will "self-update" whenever you add/edit the options.

Re-using Enums Across Tables

Lets say you have an website that lists all the roads in the world and it just uses one table called roads, with an enum for the country the road is in. Your country list hardly ever changes, except for the occasional civil-war and break-up of smaller nations. After the company makes lots of money, it decides to do the same thing again, but this time with rivers. You are going to have to recreate that country enum in the new rivers table, and now you have two tables to maintain whenever a nation breaks up. There is now the possibility of data integrity issues arising with one country list being different from the other. The risk is small when there are just two tables, but as you add cities, mountains, and monuments, it starts to become unmanageable. It would be much simpler (and faster) if there was just one countries table that all the other tables referenced using a country_id rather than having their own enum list of countries.

Adding Context

Lets extend the example above and imagining you are a stubborn developer that keeps re-using the country enum across several tables. Your manager asks you to extend the website's behaviour so that it now also provides context about the country the road/river/mountain/monument is in, such as the GDP and population. At this point you are going to have to create a countries table anyway, for storing the GDP and population for each country. The alternative of adding GDP and population columns to each of the tables, filling the values in for every row, and maintaining consistency would be near-impossible. If you had started with a reference table design, performing this task for your manager would be simple. However, converting from using mutliple enums to this reference table with contextual data columns will be much more work.

Conclusion

Enum's can be a quick solution, but I recommend that you don't use them and always use a reference table instead. If your not convinced, or wish to read more, then I recommend reading 8 Reasons Why MySQL's ENUM Data Type Is Evil.

References

Stuart Page

Stuart Page

Stuart is a software developer with a passion for Linux and open source projects.

Read More