Phil gives a talk about Small Scale ATproto at ATmosphere Conference 2025 in Seattle
I'm Phil, my pronouns are they/them. Thanks. I'm a mom to some plants, non-binary, but I haven't updated my passport gender marker, so I was able to get into the country relatively easily. Border control has fresh fingerprints from me—it’s not easy for everybody to get here. I know we also have folks here thanks to external sponsorships like the Umoya Fellowship, and the conference itself has also supported people in getting here. Building community takes work and intentional effort, and what we build isn't separate from the world around us, even—and maybe especially—when we're deep in a social networking protocol. Blaine said technology matters but it doesn't matter more than people, and I really like that.
I'm not going to be as values-forward in this talk as I usually would be, but I want to say that I'm going to talk about ATproto at the small scale. There will be a bit of overlap with Devon's talk earlier. I’ll be a little meaner but I’ll try to wrap it up optimistically, because I am pretty optimistic.
Self-hosting is what I’ve spent the most time on in ATproto. By that, I mean running small versions of services that tap into larger third-party lexicons. I’ll also talk about building new apps from scratch on ATproto. I think these are distinct but related.
I’ve already used the word lexicon here. I’ll generally use it to refer to app-specific data. So when I say “lexicon,” I mean the data a platform like Bluesky uses. I’ll also use “AppView,” which in this talk just means backend.
Blaine took us through a great explanation of what building on ATproto can look like, but I want to start with the choice to use ATproto, especially compared to a classic app architecture.
The classic approach might be: client app → API → SQL database. This is well-trodden—you have choices of frameworks for APIs, platforms for the client app, JSON or gRPC for communication. It’s well-established, and you can usually find answers to any problems quickly.
Conceptually, ATproto is like taking all the data locked inside a backend service and pulling it out into personal data servers (PDSs). It’s “inside out.” You still have a similar app structure, but the data is decentralized across PDSs.
With a SQL backend, you get:
Typed data with relationships.
Constraints (like unique keys).
Query optimizers built over decades of research.
Clear source of truth.
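To make the list above concrete, here is a minimal sketch of what a SQL backend gives you for free, using Python's built-in sqlite3 module. The table names and the like helper are hypothetical and are not taken from the talk.

```python
import sqlite3


def make_classic_db() -> sqlite3.Connection:
    """Create an in-memory database with typed, related, constrained tables."""
    db = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(
        """
        CREATE TABLE users (
            id     INTEGER PRIMARY KEY,
            handle TEXT NOT NULL UNIQUE
        );
        CREATE TABLE posts (
            id        INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES users(id),
            body      TEXT NOT NULL
        );
        CREATE TABLE likes (
            user_id INTEGER NOT NULL REFERENCES users(id),
            post_id INTEGER NOT NULL REFERENCES posts(id),
            UNIQUE (user_id, post_id)
        );
        """
    )
    return db


def like(db: sqlite3.Connection, user_id: int, post_id: int) -> bool:
    """Insert a like; return False if the database rejects it."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO likes (user_id, post_id) VALUES (?, ?)",
                (user_id, post_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate like, or a like pointing at a post that does not exist.
        return False
```

The database itself refuses a duplicate like or a like on a missing post. That kind of guarantee is what you give up once the records live in other people's PDSs.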
If you scale up, SQL is strong. But if you go to extreme scale—like Facebook or Twitter—you eventually outgrow it and start making trade-offs: read replicas, microservices, event logs, denormalized data, etc. Each introduces complexity for your developers.
ATproto turns the data service inside out, so even small apps get the benefits of that high-scale architecture without the operational burden of building it. It also changes a few assumptions:
Read more than write: Most new apps have more reads than writes.
Durability: Traditional apps assume that once you acknowledge a write, it is persisted. ATproto shifts this because data is stored in users’ PDSs, so bad data can arrive, and it may be your app’s responsibility to handle it.
Event sequencing: There’s no global clock. If a post and a like happen simultaneously, you can’t assume the like came after seeing the post. This adds complexity in your app logic.
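One way to cope with the ordering problem is to park likes whose target post has not been seen yet and attach them once the post arrives. The sketch below is an illustration under that assumption; the LikeIndex class and its method names are hypothetical, and everything is kept in memory for brevity.

```python
from collections import defaultdict


class LikeIndex:
    """Tolerates a like arriving before the post it points at."""

    def __init__(self) -> None:
        self.posts: set[str] = set()
        self.likes: defaultdict[str, set[str]] = defaultdict(set)
        # Likes whose target post we have not seen yet.
        self.pending: defaultdict[str, set[str]] = defaultdict(set)

    def on_post(self, post_uri: str) -> None:
        self.posts.add(post_uri)
        # Attach any likes that showed up before the post did.
        self.likes[post_uri] |= self.pending.pop(post_uri, set())

    def on_like(self, post_uri: str, like_uri: str) -> None:
        if post_uri in self.posts:
            self.likes[post_uri].add(like_uri)
        else:
            self.pending[post_uri].add(like_uri)

    def like_count(self, post_uri: str) -> int:
        # .get avoids creating empty entries for unknown posts.
        return len(self.likes.get(post_uri, ()))
```

A real service would also need to expire pending entries, since some targets may never show up.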
One advantage: durability in your own app is less critical, because the event log allows replay. One caveat: you can’t apply backpressure to PDSs, so if your app goes viral or there are network issues, you may end up with a backlog to process.
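Replay is easiest to see with a cursor. The sketch below assumes events are (sequence number, kind, target) tuples in a plain list. That shape is a simplification for illustration, not the actual firehose format.

```python
def replay(events, state, cursor=0):
    """Apply every event newer than `cursor` to `state`; return the new cursor.

    `state` maps a target to its like count and is updated in place.
    """
    for seq, kind, target in events:
        if seq <= cursor:
            continue  # already processed before the restart
        if kind == "like":
            state[target] = state.get(target, 0) + 1
        elif kind == "unlike":
            state[target] = state.get(target, 0) - 1
        cursor = seq
    return cursor
```

If the process crashes, restarting from the last saved cursor (or from zero with an empty state) rebuilds the same counts, which is why losing local state is less of a disaster than it would be for a classic backend.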
I wanted to track likes without relying on Bluesky’s infrastructure. So I built a backlink aggregator starting with likes.
Each like in ATproto is a record containing a link to the post, stored in the liker’s PDS.
To see who liked a post, you need to listen to all likes across all PDSs.
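For reference, a Bluesky like is an app.bsky.feed.like record whose subject field holds the liked post's AT URI and CID. The sketch below turns one such record into a backlink and rejects malformed input. The Backlink type and the like_to_backlink helper are hypothetical names, and how the did, rkey, and record are pulled off the firehose is left out.

```python
from typing import NamedTuple, Optional

LIKE_COLLECTION = "app.bsky.feed.like"


class Backlink(NamedTuple):
    source: str  # AT URI of the like record itself
    target: str  # AT URI of the record being liked


def like_to_backlink(did: str, rkey: str, record: dict) -> Optional[Backlink]:
    """Return the backlink for a like record, or None if it is malformed."""
    if record.get("$type") != LIKE_COLLECTION:
        return None
    subject = record.get("subject")
    if not isinstance(subject, dict):
        return None
    target = subject.get("uri")
    # Records come from arbitrary PDSs, so never trust their shape.
    if not isinstance(target, str) or not target.startswith("at://"):
        return None
    return Backlink(source=f"at://{did}/{LIKE_COLLECTION}/{rkey}", target=target)
```

Returning None for bad records rather than raising keeps one malformed like from stalling the whole stream.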
I used tools like ATProto Browser and atp.tools to inspect PDSs.
I used SQLite as a simple key-value store for my aggregator:
Indexed likes by target post.
Stored deleted likes in a separate table, because delete events only contain the record address, not content.
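One possible reading of that layout, sketched with Python's sqlite3 module: likes are keyed by target, and a delete is recorded as a tombstone holding only the record address. The table and function names are assumptions for illustration and are not the aggregator's actual schema.

```python
import sqlite3


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.executescript(
        """
        -- Likes keyed by the post they point at, so lookups by target are cheap.
        CREATE TABLE IF NOT EXISTS likes (
            target TEXT NOT NULL,
            source TEXT NOT NULL,
            PRIMARY KEY (target, source)
        ) WITHOUT ROWID;
        -- Tombstones: a delete event only tells us the like's own address.
        CREATE TABLE IF NOT EXISTS deleted_likes (
            source TEXT PRIMARY KEY
        ) WITHOUT ROWID;
        """
    )
    return db


def record_like(db: sqlite3.Connection, source: str, target: str) -> None:
    with db:
        db.execute(
            "INSERT OR IGNORE INTO likes (target, source) VALUES (?, ?)",
            (target, source),
        )


def record_delete(db: sqlite3.Connection, source: str) -> None:
    # No need to find out which post the like pointed at.
    with db:
        db.execute(
            "INSERT OR IGNORE INTO deleted_likes (source) VALUES (?)", (source,)
        )


def likers(db: sqlite3.Connection, target: str) -> list[str]:
    """Return the like records for a post, minus any that were deleted."""
    rows = db.execute(
        """
        SELECT source FROM likes
        WHERE target = ?
          AND source NOT IN (SELECT source FROM deleted_likes)
        ORDER BY source
        """,
        (target,),
    )
    return [row[0] for row in rows]
```

With this shape the delete handler never has to look up the like's target, and the cost of filtering moves to read time.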
Performance was tricky: SQLite could handle up to ~5 million writes with tuning. I also explored LSM-tree databases (e.g., RocksDB), which are better for writes. Key takeaway: avoid reading in your write path when using LSM trees.
The aggregator is self-hosted on a Raspberry Pi 5 (16 GB RAM), but even older Pi models work. It processes the firehose in real-time, backfilling efficiently.
Key Lessons
Reads in write path are expensive for LSM trees.
Deleted records need careful handling. ATproto is adding improvements in Sync 1.1.
Event logs and backlinks are fundamental to building apps and aggregators in ATproto.
You don’t need to host everything yourself: Small-scale self-hosting is feasible. Larger-scale apps may need cluster databases.
The videos from ATmosphereConf 2025 held in Seattle, Washington, are being republished along with transcripts as part of the process of preparing for ATmosphereConf 2026, taking place March 26th - 29th in Vancouver, Canada.