They’re called Snowflakes, an idea originally popularised by Twitter for tweet IDs - it’s basically a way of defeating the birthday paradox in a distributed system, without the need for a central coordinator!
The basic idea is similar to a UUIDv1, but instead of baking in the host’s MAC address, you hash something suitably unique about the host/container/microservice/potato field into a unique “worker ID”, and then generate IDs by baking that in, along with the current time, and a rolling sequence number to prevent two IDs generated in the same microsecond from being identical.
This means that there are only two things you need to defend against: worker ID collisions, and your sequence number rolling over twice in the same microsecond - and the latter you can just defend against with a sleep() until the next microsecond.
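The scheme above can be sketched in a few lines. The bit widths here are illustrative (not Twitter's exact layout), and hashing the hostname is just one way of deriving a worker ID, as the text suggests:

```python
import hashlib
import socket
import time

# Illustrative widths, not Twitter's exact layout: the timestamp sits in the
# top bits, above a 10-bit worker ID and a 6-bit per-microsecond sequence.
WORKER_BITS = 10
SEQ_BITS = 6


def worker_id() -> int:
    # Hash something suitably unique about the host into a worker ID.
    digest = hashlib.sha256(socket.gethostname().encode()).digest()
    return int.from_bytes(digest[:2], "big") % (1 << WORKER_BITS)


class Snowflake:
    def __init__(self) -> None:
        self.worker = worker_id()
        self.last_ts = 0
        self.seq = 0

    def next_id(self) -> int:
        ts = time.time_ns() // 1000  # current time in microseconds
        if ts == self.last_ts:
            self.seq += 1
            if self.seq >= (1 << SEQ_BITS):
                # Sequence rolled over in the same microsecond:
                # spin until the clock ticks, then start a fresh sequence.
                while ts <= self.last_ts:
                    ts = time.time_ns() // 1000
                self.seq = 0
        else:
            self.seq = 0
        self.last_ts = ts
        return (ts << (WORKER_BITS + SEQ_BITS)) | (self.worker << SEQ_BITS) | self.seq
```

Each generator instance hands out strictly increasing IDs, and two instances can only collide if their worker IDs do - which is exactly the remaining thing to defend against.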
But they also have another nice property: because they start with a timestamp, they’re sortable in both numeric (Twitter) and base32 (Zevvle, Monzo) forms - and if you want to, you could even omit the creation timestamp entirely and just extract it from the snowflake.
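Both properties fall straight out of the layout. A sketch, assuming illustrative bit widths (10-bit worker ID, 6-bit sequence) and a Crockford-style base32 alphabet - chosen because its ASCII order matches value order, so fixed-width strings sort the same way the numbers do; the exact encodings Zevvle and Monzo use aren't specified here:

```python
# Assumed illustrative layout: microsecond timestamp in the top bits,
# above a 10-bit worker ID and a 6-bit sequence.
WORKER_BITS, SEQ_BITS = 10, 6

# Crockford-style base32 alphabet: digits before letters, so the ASCII
# order of characters matches the order of their 5-bit values.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def created_at_us(snowflake: int) -> int:
    # The creation timestamp is just the top bits of the ID.
    return snowflake >> (WORKER_BITS + SEQ_BITS)


def to_base32(snowflake: int, length: int = 16) -> str:
    # Fixed width + most-significant-group-first + an order-preserving
    # alphabet means string order matches numeric order.
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[snowflake & 31])
        snowflake >>= 5
    return "".join(reversed(chars))
```

So sorting a column of these IDs as plain strings also sorts them by creation time, and the timestamp never needs to be stored separately.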
The prefix (`acc_`, `user_`) is just to make sure it’s never ambiguous what kind of thing an ID refers to; for example, was this transaction (`tx_`) created by a Mastercard message (`mcauthmsg_`) or a BACS record (`bacsrcd_`)?
But it also means that you can build some neat tooling! In Monzo’s internal CLI, `find user_XYZ` pulls up not just that user’s profile, but also their accounts, and their cards. Another `find` on one of those IDs lets you explore other, related systems.
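The core of that kind of tooling is simple prefix dispatch. A hypothetical sketch - the registry, the handler functions, and their return values are all invented for illustration, not Monzo's actual CLI:

```python
# Hypothetical registry mapping each ID prefix to a lookup function.
# In a real tool these would query the owning service for the record.
LOOKUPS = {
    "user": lambda full_id: f"profile for {full_id}",
    "acc": lambda full_id: f"account record {full_id}",
    "tx": lambda full_id: f"transaction {full_id}",
}


def find(prefixed_id: str) -> str:
    # The prefix tells us what kind of thing the ID refers to,
    # so we know which system to ask about it.
    prefix, _, _rest = prefixed_id.partition("_")
    handler = LOOKUPS.get(prefix)
    if handler is None:
        raise ValueError(f"unknown ID type: {prefix!r}")
    return handler(prefixed_id)
```

Because every ID carries its own type, the tool never needs to guess which table or service to query - the prefix routes it.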
This is a big deal, because it turns out that the single most universal need in nearly any distributed system is the ability to store some sort of data (users, accounts, tweets, moderator flags, log lines…), and storing data isn’t much good without an ID to help you find it again - so if your ability to generate IDs has a single point of failure, so does your ability to store data.
Well, apart from your database - but it turns out that if you don’t rely on `AUTOINCREMENT` IDs, you can in many cases get away with building an eventually consistent system, which can cope with a lost database node or two.