UUID's are not always fun to work with. Troubleshooting values that cannot be easily memorized or communicated across the team is just one of the common points of frustration for young (and not so young) developers. But are UUID's worth the extra headache?
The very short answer is yes, the vast majority of projects will benefit from using UUID's as their primary keys. There are, however, a few key points that you can use as rules of thumb before making your decision.
Are UUID's more secure?
UUID's are incredibly hard to guess and therefore slightly safer to use if your plan is to expose your ID's externally.
If your API exposes a user's regular auto incremented ID of 2202, it is very easy for an attacker to guess that there's a user 2203, 2204 and so on. That means that a security breach discovered for one user could potentially be applied to all your users with minimal effort.
With UUID's, on the other hand, an attacker would need to know the UUID of every single user they intend to attack. UUID's are not incremented, and trying to guess them would simply not work.
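To make the difference concrete, here is a minimal sketch in Python. It assumes randomly generated (version 4) UUID's, which the article does not specify; time-based versions are more predictable. The variable names are purely illustrative.

```python
import uuid

# Sequential IDs: knowing one value reveals its neighbours.
known_id = 2202
guessed_ids = [known_id + offset for offset in range(1, 4)]
print(guessed_ids)  # [2203, 2204, 2205]

# Random (version 4) UUIDs: knowing one value says nothing about the next.
known_uuid = uuid.uuid4()
other_uuid = uuid.uuid4()
print(known_uuid)
print(other_uuid)

# Assumption: version 4 UUIDs, where 122 of the 128 bits are random.
RANDOM_BITS = 122
search_space = 2 ** RANDOM_BITS
print(f"Possible version 4 UUIDs: {search_space:.2e}")
```

The size of the search space is what makes guessing impractical, not any secrecy built into the format.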
With that being said, UUID's should not be thought of as a real security measure. They are security through obscurity and only make things a bit more difficult. If you don't handle access control correctly and your security holes are big enough, UUID's will do nothing to protect your application.
What about performance?
If you are looking for increased performance, UUID's will not give you that. On the contrary.
Regarding storage, UUID usage can start to add up quite a bit.
Your primary keys will not only be stored in the original table; a copy will also be stored in every relation table. That means that the simplest many to many table will be storing 72 characters (two 36-character UUID's) for every single relationship, and things can get worse with a multi-byte character set such as UTF-8, where a fixed-width column may reserve more than one byte per character...
Luckily, modern database versions (such as MySQL 8) offer the possibility to store UUID's in a compact binary format. As great as that is, you will be using 16 bytes per UUID as opposed to 8 bytes per BIGINT ID - or even 4 bytes per INT ID for smaller databases.
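The numbers above are easy to check with Python's standard `uuid` module. This is only a sketch of the size arithmetic; the column types mentioned in the comments are assumptions about how you might store each form, not a recommendation for a specific database.

```python
import uuid

value = uuid.uuid4()

text_form = str(value)      # canonical hyphenated text, e.g. for a CHAR(36) column
binary_form = value.bytes   # compact form, e.g. for a BINARY(16) column

TEXT_CHARS = len(text_form)      # 36
BINARY_BYTES = len(binary_form)  # 16
BIGINT_BYTES = 8
INT_BYTES = 4

# A many to many row stores two keys, one per side of the relationship.
print("text UUID pair:  ", 2 * TEXT_CHARS, "characters")
print("binary UUID pair:", 2 * BINARY_BYTES, "bytes")
print("BIGINT pair:     ", 2 * BIGINT_BYTES, "bytes")
print("INT pair:        ", 2 * INT_BYTES, "bytes")

# The binary form loses nothing: it converts back to the same UUID.
restored = uuid.UUID(bytes=binary_form)
print(restored == value)  # True
```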
Speed is also a concern when it comes to UUID's.
Integer ID's are much easier to index and scan because they are sequential: each new value lands at the end of the index, whereas a random UUID can land anywhere in it.
Integer indexes are also smaller, so they will take much longer to become too big to fit in memory. Once an index no longer fits in memory, the performance of your application may suffer.
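The sketch below illustrates the ordering point, using a sorted Python list as a rough stand-in for an ordered index. It is not how any particular database engine works, and the function name is hypothetical. It simply counts how many new keys have to be placed somewhere other than the end.

```python
import bisect
import uuid


def count_non_tail_inserts(keys):
    """Count the keys that land somewhere other than the end of a sorted list."""
    ordered = []
    non_tail = 0
    for key in keys:
        position = bisect.bisect_right(ordered, key)
        if position != len(ordered):
            non_tail += 1
        ordered.insert(position, key)
    return non_tail


integer_keys = list(range(1, 1001))                     # auto-increment style
uuid_keys = [uuid.uuid4().bytes for _ in range(1000)]   # random 16-byte keys

int_non_tail = count_non_tail_inserts(integer_keys)
uuid_non_tail = count_non_tail_inserts(uuid_keys)

print("integer keys inserted away from the end:", int_non_tail)  # 0
print("UUID keys inserted away from the end:   ", uuid_non_tail)
```

Sequential keys always append, while nearly every random key lands in the middle, which is the kind of scattered write pattern that makes large UUID indexes more expensive to maintain.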
Uniqueness: why should I care?
UUID stands for Universally Unique Identifier and we already alluded to how practically impossible it is to generate a collision. But why is that useful?
The key here is that UUID's are, for all practical purposes, unique across all tables of all databases. Ever!
This means that merging two tables, and distributing databases across multiple servers becomes much easier, sometimes even trivial. Integrating your system with other external systems (and vice-versa) is often easier as well.
In more practical development terms, inserting related records in a single transaction is also easier when you have pre-generated a UUID that you can treat as valid without even needing to double check it in the database. Auto-incremented integer ID's would require a round trip to the database to learn the generated value for this use case.
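Here is a minimal sketch of that pattern using Python's built-in `sqlite3` module with an in-memory database. The `authors` and `posts` tables and their columns are hypothetical, chosen only for illustration. Both keys are generated in application code before anything is sent to the database.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE authors (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    "id TEXT PRIMARY KEY, "
    "author_id TEXT NOT NULL REFERENCES authors(id), "
    "title TEXT NOT NULL)"
)

# Both identifiers exist before the first INSERT runs.
author_id = str(uuid.uuid4())
post_id = str(uuid.uuid4())

with conn:  # commits on success, rolls back if an exception is raised
    conn.execute(
        "INSERT INTO authors (id, name) VALUES (?, ?)",
        (author_id, "Ada"),
    )
    conn.execute(
        "INSERT INTO posts (id, author_id, title) VALUES (?, ?, ?)",
        (post_id, author_id, "Hello"),
    )

row = conn.execute(
    "SELECT a.name, p.title FROM posts p "
    "JOIN authors a ON a.id = p.author_id WHERE p.id = ?",
    (post_id,),
).fetchone()
print(row)  # ('Ada', 'Hello')
```

With auto-incremented keys, the second insert would have to wait for the database to report the ID generated by the first one.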
Why should you probably use UUID's?
So you may be wondering, why is it that earlier we said that "the vast majority of projects will benefit from using UUID's as their primary keys"?
Based on the points above, this topic seems very much like an "it depends" type of question. And that is because it is. The reason why we chose to provide pragmatic advice here is to get you unstuck, rather than feed the "Analysis Paralysis" monster.
The truth is that most applications will not scale up to be insanely huge. Nor will they require insane performance. The applications that eventually reach that stage will most likely have transformed so much over time that they will require design reviews to accommodate the growth appropriately.
We are strong believers that at that point it will be less error prone and easier to write new code, migrate and ensure data integrity if you can ensure that every identifier is unique and cannot accidentally be matched with the wrong record just because "its ID is also 12".
Why not both?
It is a fool's errand to attempt to optimize your app too early. But when all is said and done, if your app reaches the later stages you should start to see pretty clearly where the join performance of UUID's is starting to get in the way.
You will probably also start to see where UUID's are just too helpful to pass up.
At this point every app's solution is going to differ greatly. Check out how Instagram has handled their insane traffic requirements.
There's nothing wrong with using integer ID's and UUID's together, as long as you do it in a consistent manner. In the cases where you decide that multiple servers are not necessary, you could use integer ID's for very large tables which get joined often. Tables where values basically don't change are also good candidates for integer ID's.
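As one possible illustration of mixing the two, the sketch below keeps UUID's for a table whose ID's are exposed externally and uses integer ID's for a lookup table whose values basically don't change. The `users` and `countries` tables are hypothetical, and this is only one way to draw the line, not a prescribed design.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Lookup table with stable values: small integer keys.
conn.execute(
    "CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
# Table whose IDs may be exposed externally: UUID keys.
conn.execute(
    "CREATE TABLE users ("
    "id TEXT PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "country_id INTEGER NOT NULL REFERENCES countries(id))"
)

user_id = str(uuid.uuid4())

with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(
        "INSERT INTO countries (id, name) VALUES (?, ?)",
        [(1, "Brazil"), (2, "Canada")],
    )
    conn.execute(
        "INSERT INTO users (id, name, country_id) VALUES (?, ?, ?)",
        (user_id, "Ada", 2),
    )

row = conn.execute(
    "SELECT u.name, c.name FROM users u "
    "JOIN countries c ON c.id = u.country_id WHERE u.id = ?",
    (user_id,),
).fetchone()
print(row)  # ('Ada', 'Canada')
```

The join column stays a compact integer, while the identifier that leaves your system remains a UUID.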
We hope this article gave you a general idea of the trade-offs in the UUID vs ID battle.
Here is our advice in one paragraph:
Don't spend too much time trying to guess what will be right later. Err on the side of using UUID's, and your specific situation will dictate the adjustments you need to make when the time comes.