Two of them? Dude, WTF?
OAuth at its simplest
If you've come across OAuth, you've probably realised that it works a bit like this:
You use a client - for instance, a social media app on your phone - to log in to a server. The server provides the client with an access token, and the client can use that token to access the resources that belong to you, such as your posts and messages.
That's not a bad way to think about it at a high level, but if you want to build a service which relies on OAuth, it's useful to know a bit more about what's going on inside the boxes. Let's take a closer look at that box labelled "Server".
"The Server" in more detail
OAuth is described in detail in RFC 6749, which jumps right in at the deep end by talking about not one but two servers:
- Resource Server - The server hosting the protected resources, capable of accepting and responding to protected resource requests using access tokens.
- Authorization Server - The server issuing access tokens to the client after successfully authenticating the resource owner and obtaining authorization.
But aren't they both the same server? Well, kinda:
Our conceptual server has two tasks: first it allows the client to log in and gives it a token, then it handles requests from the now-authorized client.
Why is it useful to split the tasks up like this? From RFC6749 again:
The authorization server may be the same server as the resource server or a separate entity. A single authorization server may issue access tokens accepted by multiple resource servers.
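To make the split concrete, here's a toy sketch in Python of one authorization server issuing a token that two separate resource servers both accept. Everything in it is made up for illustration: the class and method names, the in-memory password store, and the shortcut of collapsing the whole authorization flow into a single `log_in` call. Having the resource server ask the authorization server directly about a token is also just one simple way to do it, not something RFC 6749 prescribes.

```python
import secrets


class AuthorizationServer:
    """Authenticates the resource owner and issues access tokens."""

    def __init__(self, passwords):
        self._passwords = passwords  # username -> password (toy credential store)
        self._tokens = {}            # access token -> username

    def log_in(self, username, password):
        # Stand-in for the real authorization flow, which is far more involved.
        if self._passwords.get(username) != password:
            raise PermissionError("authentication failed")
        token = secrets.token_urlsafe(16)
        self._tokens[token] = username
        return token

    def owner_of(self, token):
        """Return the user a token was issued for, or None if it is unknown."""
        return self._tokens.get(token)


class ResourceServer:
    """Hosts protected resources and answers requests that carry a valid token."""

    def __init__(self, auth_server, resources):
        self._auth_server = auth_server
        self._resources = resources  # username -> list of resources

    def get_resources(self, token):
        owner = self._auth_server.owner_of(token)
        if owner is None:
            raise PermissionError("invalid access token")
        return self._resources.get(owner, [])


# One authorization server, two resource servers that trust it.
auth = AuthorizationServer({"alice": "hunter2"})
posts = ResourceServer(auth, {"alice": ["first post"]})
videos = ResourceServer(auth, {"alice": ["holiday.mp4"]})

token = auth.log_in("alice", "hunter2")
alice_posts = posts.get_resources(token)    # same token works here...
alice_videos = videos.get_resources(token)  # ...and here
```

The point to notice is that neither resource server ever sees a password; all they need is a way to check that a token really came from the authorization server they trust.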
Scaling up
There are a few reasons why you might want to split things up this way; one of them is scalability. The resources could be quite large (e.g. photos or videos), and if you have a lot of users storing a lot of large resources, it's useful to be able to add more physical machines as your system grows.
Users shouldn't have to care about this complexity. For one thing, what happens when a new user signs up? It shouldn't be their problem to choose a server to hold their data; they don't know any of the details, such as which servers have spare capacity. Instead they arrive at the authorization server (AS) for the first time and go through the signup process; as part of this, the AS can choose a suitable resource server (RS) to assign them to.
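Here's a rough sketch of what that assignment step could look like. The "most spare capacity wins" rule, the account-count capacity metric, and all of the names are my own assumptions for the sake of the example; a real system might weigh storage, region, load or anything else.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceServer:
    name: str
    capacity: int  # maximum number of accounts (hypothetical metric)
    users: set = field(default_factory=set)

    @property
    def spare(self):
        return self.capacity - len(self.users)


class AuthorizationServer:
    def __init__(self, resource_servers):
        self._servers = list(resource_servers)
        self._home = {}  # username -> ResourceServer holding their data

    def sign_up(self, username):
        if username in self._home:
            raise ValueError(f"{username} already has an account")
        # Assumed policy: pick the server with the most room left.
        server = max(self._servers, key=lambda s: s.spare)
        if server.spare <= 0:
            raise RuntimeError("no resource server has spare capacity")
        server.users.add(username)
        self._home[username] = server
        return server.name

    def add_server(self, server):
        """Grow the system; existing users never notice."""
        self._servers.append(server)


auth = AuthorizationServer([
    ResourceServer("rs1", capacity=2),
    ResourceServer("rs2", capacity=1),
])
assigned = [auth.sign_up(name) for name in ("alice", "bob", "carol")]
```

The user only ever calls `sign_up`; which machine ends up holding their data is entirely the authorization server's decision, and `add_server` lets the operator add capacity later without changing anything the user sees.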
In the real world: ATProto
Now this section of the ATProto docs is a bit easier to understand:
In OAuth terminology, an atproto Personal Data Server (PDS) is a "Resource Server" to which authorized HTTP requests are made using access tokens. Sometimes the PDS is also the "Authorization Server" - which services OAuth authorization flows and token requests - while in other situations a separate "entryway" service acts as the Authorization Server for multiple PDS instances.
BlueSky's engineers have designed the system to allow for two configurations: a simple one and a powerful one.
The simple case: Self-hosted PDS
The first case, where "the PDS is also the authorization server", is ideal for small-scale setups such as self-hosting, where there are only a handful of user accounts on the system. To make this setup as easy as possible to install, the PDS has a basic AS built into it, and we can draw a diagram with only one "server" that looks almost as simple as the one I started with:
Install the PDS, and off you go, without needing to worry about how your authentication works in any detail; it just works, out of the box.
The powerful case: BlueSky
The second case, where "a separate entryway service acts as the authorization server", is more appropriate for setups with a large number of user accounts.
When you sign up to BlueSky, first you arrive at their "entryway" service at bsky.social. You create your account, and the entryway assigns you to one of several PDS servers, all named after mushrooms:
As the network grows, they can add as many mushrooms as they need to accommodate new users, and perhaps even move users between the mushrooms, without anyone outside BlueSky needing to know or care about the details. The key to this flexibility is in separating the authorization server from the PDSes, so that it can oversee the whole process and manage the assignment of users to PDSes.
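From a client's point of view, the two configurations can be told apart by asking the PDS which authorization server it trusts. As I understand the atproto OAuth profile, a PDS publishes resource-server metadata with an `authorization_servers` field (the field name comes from OAuth protected resource metadata, RFC 9728), and I'm assuming it lists exactly one server. The sketch below skips the network request and works on hand-written metadata; the hostnames and the `describe_setup` helper are hypothetical.

```python
def describe_setup(pds_url, metadata):
    """Say which of the two configurations a PDS appears to be using.

    `metadata` is the already-parsed JSON document the PDS publishes about
    itself as a resource server (assumed shape, see the lead-in above).
    """
    servers = metadata.get("authorization_servers", [])
    if len(servers) != 1:
        # Assumption: a PDS names exactly one authorization server.
        raise ValueError("expected exactly one authorization server")
    auth_url = servers[0].rstrip("/")
    if auth_url == pds_url.rstrip("/"):
        return "self-hosted: the PDS is its own authorization server"
    return f"entryway: {auth_url} authorizes requests for {pds_url}"


# Hand-written example documents, not real responses.
self_hosted = {
    "resource": "https://pds.example.com",
    "authorization_servers": ["https://pds.example.com"],
}
behind_entryway = {
    "resource": "https://mushroom.example.net",
    "authorization_servers": ["https://entryway.example.net"],
}

simple_case = describe_setup("https://pds.example.com", self_hosted)
powerful_case = describe_setup("https://mushroom.example.net", behind_entryway)
```

Either way the client ends up talking to the same two roles from RFC 6749; the only difference is whether they live at one address or two.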