Advantages Of Using UUIDs
I started switching over to UUIDs in 2019 and would never choose to go back. When I have to return to old projects that use integers, all the hassles come back to me, so I thought I'd make this post to record all of them as they come to me.
When one uses auto-incrementing integers for unique identifiers, one has to save an object to the database before returning it, so that it can have its ID. When one uses UUIDs, one knows the unique identifier before anything is saved, so one can choose to return the object immediately. This is most useful when one is mass-creating a lot of different objects and would rather save them all to the database in a few bulk insert statements using a multi-query (package helper). This can save a lot of time.
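As a minimal sketch of this idea in Python (the `users` table, the `make_user` helper and the in-memory SQLite database are all assumptions made for illustration, not the package helper mentioned above), each object gets its final ID at construction time and the whole batch is written afterwards:

```python
import sqlite3
import uuid

# Hypothetical schema: one "users" table keyed by a UUID stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)")


def make_user(name):
    # The identifier is known immediately, with no database round trip.
    return {"id": str(uuid.uuid4()), "name": name}


# Build many objects in memory; each already carries its final identifier,
# so other objects could reference these IDs before anything is saved.
users = [make_user(f"user-{i}") for i in range(1000)]

# Persist the whole batch in one call and one transaction,
# rather than issuing and committing one INSERT per object.
with conn:
    conn.executemany("INSERT INTO users (id, name) VALUES (:id, :name)", users)

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

With auto-incrementing integers, each object would normally need its own insert before its ID could be handed to anything else.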
Soft Deletes And Archiving
A lot of systems use a "flag" column on the database to say whether the object has been deleted or archived.
Alternatively, they may use a nullable timestamp called
This is a quick solution, but it prevents the database from enforcing data integrity through foreign keys.
E.g. no real delete operation ever happens, so nothing triggers a cascading delete, and no error is thrown from a RESTRICT clause.
For me, the ideal solution if you want to have "soft deletes" or "soft archiving" is to create separate tables for your "deleted" or archived data. This keeps it away from the data you really care about, and can boost performance (shorter table scans on the "active" data). Best of all, you can keep your referential integrity. Since you use UUIDs instead of auto-incrementing integers, deleting a row does not leave a "gap" in your primary key that might be taken by a new row at a later date (such as if your auto-increment counter resets after a reboot). If you used integers, there is a real chance of such a conflict, and you would have to put in a lot more effort to ensure that moving rows between your active and deleted tables does not cause issues.
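A rough sketch of the separate-table approach, assuming a hypothetical `posts` table with a matching `deleted_posts` table in SQLite (the table names and the `soft_delete` and `restore` helpers are illustrative only):

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id TEXT PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE deleted_posts (id TEXT PRIMARY KEY, title TEXT NOT NULL);
""")


def move_row(source, target, row_id):
    # Copy and delete inside one transaction, so the row never ends up
    # in both tables or in neither. Table names are fixed by the callers.
    with conn:
        conn.execute(
            f"INSERT INTO {target} (id, title) "
            f"SELECT id, title FROM {source} WHERE id = ?",
            (row_id,),
        )
        conn.execute(f"DELETE FROM {source} WHERE id = ?", (row_id,))


def soft_delete(post_id):
    move_row("posts", "deleted_posts", post_id)


def restore(post_id):
    move_row("deleted_posts", "posts", post_id)


def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


post_id = str(uuid.uuid4())
with conn:
    conn.execute("INSERT INTO posts (id, title) VALUES (?, ?)", (post_id, "Hello"))

soft_delete(post_id)
counts_after_delete = (count("posts"), count("deleted_posts"))

restore(post_id)
counts_after_restore = (count("posts"), count("deleted_posts"))
```

Because the row keeps the same UUID in both tables, restoring it cannot collide with a row that was created in the meantime.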
Mistakes Become Obvious (Same IDs on Different Tables)
If all your tables are using integers, then every table has a row with id 1, 2, 3 and so on, which lets some mistakes go unnoticed. With UUIDs, those mistakes become obvious sooner. For example, I was working on a model that was accidentally hydrating using the incorrect class name. Initially this went unseen because all the operations still continued to "work". E.g. loading id 3 from the wrong table still returned an object, and calling delete() on that still worked. I only found this bug when I started putting in foreign keys to lock down the database because I was seeing strange behavior. With UUIDs it would have become obvious sooner, even without the FKs, because my IDs would not be the same between tables.
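The difference can be reproduced with a small sketch. The `users` and `invoices` tables and the `build_db` and `load` helpers below are hypothetical, and the "bug" is simply looking up a user's ID in the wrong table:

```python
import itertools
import sqlite3
import uuid


def build_db(next_id):
    """Create two hypothetical tables and fill each with three rows."""
    conn = sqlite3.connect(":memory:")
    ids = {}
    for table in ("users", "invoices"):
        conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY, label TEXT)")
        ids[table] = [next_id(table) for _ in range(3)]
        conn.executemany(
            f"INSERT INTO {table} (id, label) VALUES (?, ?)",
            [(row_id, f"{table} row") for row_id in ids[table]],
        )
    return conn, ids


def load(conn, table, row_id):
    return conn.execute(
        f"SELECT label FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()


# Integer-style IDs: every table counts 1, 2, 3 independently.
counters = {"users": itertools.count(1), "invoices": itertools.count(1)}
int_conn, int_ids = build_db(lambda table: str(next(counters[table])))

# UUID-style IDs: no two tables share an identifier.
uuid_conn, uuid_ids = build_db(lambda table: str(uuid.uuid4()))

# The bug: a user's ID is looked up in the invoices table.
wrong_int = load(int_conn, "invoices", int_ids["users"][2])     # an unrelated invoice
wrong_uuid = load(uuid_conn, "invoices", uuid_ids["users"][2])  # nothing at all
```

With integers the bad lookup quietly returns someone else's row, while with UUIDs it returns nothing, which tends to fail loudly much closer to the actual mistake.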
Multi-Master Delayed Synchronization
I was recently working on a project where there needed to be two "master" APIs. I.e. there were two deployments of the same codebase, each with its own database, and both needed to be able to work "offline" without being able to talk to each other in case the internet cut out (which it often did). Hence they both needed to be able to write objects to their databases and sync up with each other later. The simplest way I found to do this was to use UUIDs for the unique identifiers, as auto-incrementing integers would inevitably clash. This allowed a very simple codebase and deployment with little stress.
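To illustrate why the sync step stays simple, here is a small sketch that stands in for the two databases with plain dictionaries (the site names, the `create_orders` helper and the "orders" themselves are made up for the example):

```python
import uuid


def create_orders(node_name, count):
    """Simulate one offline node creating rows keyed by UUID."""
    return {str(uuid.uuid4()): f"{node_name} order {n}" for n in range(1, count + 1)}


# Each site works offline and generates its own identifiers.
site_a = create_orders("site-a", 3)
site_b = create_orders("site-b", 3)

# With per-database auto-increment integers, both sites issue 1, 2 and 3.
int_a = {n: f"site-a order {n}" for n in range(1, 4)}
int_b = {n: f"site-b order {n}" for n in range(1, 4)}
clashing_ids = set(int_a) & set(int_b)

# Syncing the UUID-keyed rows is a plain union; nothing needs renumbering.
merged = {**site_a, **site_b}
```

This only covers newly created rows. Edits made to the same row on both sides while offline would still need their own conflict rule.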
Identifiers of resources are often put in the URL. For example consider the following URL for viewing a particular resource:
Now if that resource's identifier was an auto-incrementing integer, it would be pretty obvious to someone that they might be able to just adjust the number to view a different resource. With UUIDs, they will not be able to guess any of the other resources' identifiers. It would not be good to rely on this from a security perspective, but it is worth noting, and the principle could prove useful in niche circumstances.
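A small sketch of the difference, assuming random (version 4) UUIDs; the `neighbours` helper is hypothetical and only shows how trivially integer IDs can be enumerated:

```python
import uuid


def neighbours(int_id):
    # With sequential integers, adjacent resources are one step away.
    return [int_id - 1, int_id + 1]


guessable = neighbours(42)

# A version 4 UUID is 128 bits, of which 4 version bits and 2 variant
# bits are fixed. The remaining 122 bits are random.
resource_id = uuid.uuid4()
RANDOM_BITS = 128 - 4 - 2
possible_ids = 2 ** RANDOM_BITS
```

Knowing one version 4 UUID reveals nothing about the others, but that only makes guessing impractical. Access checks are still needed for anything sensitive.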
More On The Way
As I remember more, I will be sure to put them here.
First published: 29th April 2020