They’re called Snowflakes, an idea originally popularised by Twitter for tweet IDs - it’s basically a way of defeating the birthday paradox in a distributed system, without the need for a central coordinator!
The basic idea is similar to a UUIDv1, but instead of baking in the host’s MAC address, you hash something suitably unique about the host/container/microservice/potato field into a unique “worker ID”, and then generate IDs by baking that in, along with the current time, and a rolling sequence number to prevent two IDs generated in the same microsecond from being identical.
This means that there are only two things you need to defend against: worker ID collisions, and your sequence number rolling over twice in the same microsecond - and the latter you can just defend against with a sleep() until the next microsecond.
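The scheme above can be sketched in a few lines. The bit widths here are illustrative (not Twitter's exact layout), and hashing the hostname is just one way of deriving a worker ID, as the text suggests:

```python
import hashlib
import socket
import time

# Illustrative widths, not Twitter's exact layout: the timestamp sits in the
# top bits, above a 10-bit worker ID and a 6-bit per-microsecond sequence.
WORKER_BITS = 10
SEQ_BITS = 6


def worker_id() -> int:
    # Hash something suitably unique about the host into a worker ID.
    digest = hashlib.sha256(socket.gethostname().encode()).digest()
    return int.from_bytes(digest[:2], "big") % (1 << WORKER_BITS)


class Snowflake:
    def __init__(self) -> None:
        self.worker = worker_id()
        self.last_ts = 0
        self.seq = 0

    def next_id(self) -> int:
        ts = time.time_ns() // 1000  # current time in microseconds
        if ts == self.last_ts:
            self.seq += 1
            if self.seq >= (1 << SEQ_BITS):
                # Sequence rolled over in the same microsecond:
                # spin until the clock ticks, then start a fresh sequence.
                while ts <= self.last_ts:
                    ts = time.time_ns() // 1000
                self.seq = 0
        else:
            self.seq = 0
        self.last_ts = ts
        return (ts << (WORKER_BITS + SEQ_BITS)) | (self.worker << SEQ_BITS) | self.seq
```

Each generator instance hands out strictly increasing IDs, and two instances can only collide if their worker IDs do - which is exactly the remaining thing to defend against.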
But they also have another nice property: because they start with a timestamp, they’re sortable in both numeric (Twitter) and base32 (Zevvle, Monzo) forms - and if you want to, you could even omit the creation timestamp entirely and just extract it from the snowflake.
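Both properties fall straight out of the layout. A sketch, assuming illustrative bit widths (10-bit worker ID, 6-bit sequence) and a Crockford-style base32 alphabet - chosen because its ASCII order matches value order, so fixed-width strings sort the same way the numbers do; the exact encodings Zevvle and Monzo use aren't specified here:

```python
# Assumed illustrative layout: microsecond timestamp in the top bits,
# above a 10-bit worker ID and a 6-bit sequence.
WORKER_BITS, SEQ_BITS = 10, 6

# Crockford-style base32 alphabet: digits before letters, so the ASCII
# order of characters matches the order of their 5-bit values.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def created_at_us(snowflake: int) -> int:
    # The creation timestamp is just the top bits of the ID.
    return snowflake >> (WORKER_BITS + SEQ_BITS)


def to_base32(snowflake: int, length: int = 16) -> str:
    # Fixed width + most-significant-group-first + an order-preserving
    # alphabet means string order matches numeric order.
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[snowflake & 31])
        snowflake >>= 5
    return "".join(reversed(chars))
```

So sorting a column of these IDs as plain strings also sorts them by creation time, and the timestamp never needs to be stored separately.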
The prefix (`acc_`, `user_`) is just to make sure it’s never ambiguous what kind of thing an ID refers to; for example, was this transaction (`tx_`) created by a Mastercard message (`mcauthmsg_`) or a BACS record (`bacsrcd_`)?
But it also means that you can build some neat tooling! In Monzo’s internal CLI, `find user_XYZ` pulls up not just that user’s profile, but also their accounts, and their cards. Another `find` on one of those IDs lets you explore other, related systems.
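The core of that kind of tooling is simple prefix dispatch. A hypothetical sketch - the registry, the handler functions, and their return values are all invented for illustration, not Monzo's actual CLI:

```python
# Hypothetical registry mapping each ID prefix to a lookup function.
# In a real tool these would query the owning service for the record.
LOOKUPS = {
    "user": lambda full_id: f"profile for {full_id}",
    "acc": lambda full_id: f"account record {full_id}",
    "tx": lambda full_id: f"transaction {full_id}",
}


def find(prefixed_id: str) -> str:
    # The prefix tells us what kind of thing the ID refers to,
    # so we know which system to ask about it.
    prefix, _, _rest = prefixed_id.partition("_")
    handler = LOOKUPS.get(prefix)
    if handler is None:
        raise ValueError(f"unknown ID type: {prefix!r}")
    return handler(prefixed_id)
```

Because every ID carries its own type, the tool never needs to guess which table or service to query - the prefix routes it.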
This is a big deal, because it turns out that the single most universal need in nearly any distributed system is the ability to store some sort of data (users, accounts, tweets, moderator flags, log lines…), and storing data isn’t much good without an ID to help you find it again - so if your ability to generate IDs has a single point of failure, so does your ability to store data.
Well, apart from your database - but it turns out that if you don’t rely on `AUTOINCREMENT` IDs, you can in many cases get away with building an eventually consistent system, which can cope with a lost database node or two.