Persistent multiplayer state without chaos

How I keep a live multiplayer game consistent with PostgreSQL holding the truth and Redis doing the fast work

May 25, 2026

In a single-player game, state is easy. There’s one player, one save file, and as long as you don’t corrupt the file, you’re fine.

In a live multiplayer game, state is the hardest problem you’ll have. Two players can act on the same target at the same time. A scheduled job (a build, a laundering job, a raid) needs to resolve at the right moment regardless of who’s online. A player can come back after three days and expect their progress to be where they left it, plus whatever happened in the world while they were gone.

You can’t naively keep this in memory. You can’t lazily flush to disk on logout. You need a real persistence story, and you need it to handle concurrent access without melting.

Here’s the architecture I landed on.

The problem: two kinds of state, one source of truth

A multiplayer game has two flavors of state:

1. Authoritative state. Player resources, owned nodes, queued actions, completed achievements. If this is wrong, the game is broken. Players notice within seconds and rage-quit.

2. Hot state. Things you read constantly but recompute often: target defense snapshots, online presence, active raid timers, leaderboard pages. If this is slightly stale, no one cares.

Trying to put both in the same store is what breaks games. Either you put hot state in PostgreSQL and your hot path is one query per click, or you put authoritative state in Redis and one OOM kills your economy.

The fix is to be honest about which is which.

The design choice: PostgreSQL is truth, Redis is speed

The rule I follow:

- PostgreSQL holds the source of truth. Player rows, node rows, queued jobs, transactions. Every authoritative mutation goes through a transaction.

- Redis holds derived or hot state. Cached node lists, defense snapshots, presence, locks for periodic jobs. Anything in Redis can be rebuilt from PostgreSQL.

If Redis disappears tomorrow, the game is slow but correct. If PostgreSQL disappears tomorrow, the game is gone. That asymmetry tells you exactly what each store is for.

The implementation: transactions, repositories, and a thin cache layer

The backend is Go (Echo + pgx), organized in a standard three-layer layout:

api/ — Echo handlers, mostly thin
game/ — services that orchestrate game logic
db/ — repositories with explicit SQL

Every authoritative mutation is wrapped in a transaction. Reads of hot data first hit Redis, fall through to PostgreSQL on miss, and the cache is invalidated whenever the underlying row is written.
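To make the read path and the invalidation order concrete, here is a minimal sketch in which plain Go maps stand in for both PostgreSQL and Redis. NodeRepo, Cache, mapCache, and the node:<id> key format are hypothetical names chosen for illustration, not the HiddenWars code. The point is only the shape: read through on a miss, drop the key on every write.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical authoritative row.
type Node struct {
	ID int
	HP int
}

// Cache is a stand-in for Redis: only Get, Set, and Del are needed here.
type Cache interface {
	Get(key string) (Node, bool)
	Set(key string, n Node)
	Del(key string)
}

// mapCache is an in-memory Cache so the sketch runs without Redis.
type mapCache map[string]Node

func (c mapCache) Get(k string) (Node, bool) { n, ok := c[k]; return n, ok }
func (c mapCache) Set(k string, n Node)      { c[k] = n }
func (c mapCache) Del(k string)              { delete(c, k) }

// NodeRepo pairs the source of truth (a map standing in for PostgreSQL)
// with the cache. dbReads counts how often the slow path is taken.
type NodeRepo struct {
	db      map[int]Node
	cache   Cache
	dbReads int
}

func key(id int) string { return fmt.Sprintf("node:%d", id) }

// Get reads through the cache: a hit returns immediately, a miss loads
// from the source of truth and populates the cache.
func (r *NodeRepo) Get(id int) (Node, error) {
	if n, ok := r.cache.Get(key(id)); ok {
		return n, nil
	}
	r.dbReads++
	n, ok := r.db[id]
	if !ok {
		return Node{}, errors.New("node not found")
	}
	r.cache.Set(key(id), n)
	return n, nil
}

// SetHP writes the authoritative row first, then drops the cached copy
// before returning, so the next read cannot see the old value.
func (r *NodeRepo) SetHP(id, hp int) error {
	n, ok := r.db[id]
	if !ok {
		return errors.New("node not found")
	}
	n.HP = hp
	r.db[id] = n
	r.cache.Del(key(id))
	return nil
}

func main() {
	repo := &NodeRepo{db: map[int]Node{1: {ID: 1, HP: 100}}, cache: mapCache{}}
	a, _ := repo.Get(1) // miss: loads from db
	b, _ := repo.Get(1) // hit: served from cache
	_ = repo.SetHP(1, 40)
	c, _ := repo.Get(1) // miss again after invalidation
	fmt.Println(a.HP, b.HP, c.HP, repo.dbReads)
}
```

Running it prints 100 100 40 2: three reads, but only two of them touch the source of truth, and the read after the write sees the new value.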

A single Redis pattern that pays for itself many times over: distributed locks for periodic jobs. A live game runs background tickers — heat decay, build completion, passive income, raid resolution. If you scale to two backend instances, every one of those tickers will fire on every instance and double-apply. The fix is a short-lived Redis lock keyed by job name:

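// Try to take the per-job lock; the TTL equals the tick interval.
ok, err := s.redis.GetClient().SetNX(ctx, "heat_decay:lock", "1", s.interval).Result()
if err != nil {
    // Redis could not be reached: skip this tick rather than risk a double apply.
    return
}
if !ok {
    // Another instance holds the lock, so skip this tick.
    return
}
affected, err := s.players.DecreaseAllHeat(ctx, heatDecayPerTick)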

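SetNX with a TTL equal to the tick interval means at most one instance applies the decay per window, and because nothing releases the lock explicitly, a crashed holder's lock simply expires on its own. The guarantee leans on the job finishing well inside the interval. Timer and network jitter around the expiry can occasionally cause a window to be skipped, which is harmless for something like heat decay; a TTL slightly shorter than the interval reduces that risk. No leader election, no ZooKeeper, no surprises.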
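The lock semantics can be modeled without a Redis server. The sketch below is an assumption-laden stand-in: fakeRedis, newFakeRedis, tick, epoch, and window are hypothetical names, time is passed in explicitly instead of read from a clock, and the fake is not concurrency-safe (real Redis runs SET NX atomically on the server). It only shows why two instances ticking in the same window apply the job once.

```go
package main

import (
	"fmt"
	"time"
)

// epoch and window are fixed values so the example is deterministic.
var epoch = time.Date(2026, 5, 25, 12, 0, 0, 0, time.UTC)

const window = time.Minute

// fakeRedis models just enough of Redis for this sketch: keys that
// expire at a deadline.
type fakeRedis struct {
	expiry map[string]time.Time
}

func newFakeRedis() *fakeRedis {
	return &fakeRedis{expiry: map[string]time.Time{}}
}

// SetNX sets key only if it is absent or expired, mirroring the
// behavior of SET key value NX PX ttl.
func (r *fakeRedis) SetNX(key string, ttl time.Duration, now time.Time) bool {
	if exp, held := r.expiry[key]; held && now.Before(exp) {
		return false
	}
	r.expiry[key] = now.Add(ttl)
	return true
}

// tick is what each backend instance runs on its timer. It returns true
// when this instance won the lock and applied the job.
func tick(r *fakeRedis, job string, interval time.Duration, now time.Time, apply func()) bool {
	if !r.SetNX(job+":lock", interval, now) {
		return false // another instance already ran this window
	}
	apply()
	return true
}

func main() {
	r := newFakeRedis()
	applied := 0
	decay := func() { applied++ }

	// Two instances fire a few milliseconds apart in the same window.
	a := tick(r, "heat_decay", window, epoch, decay)
	b := tick(r, "heat_decay", window, epoch.Add(5*time.Millisecond), decay)
	// One interval later the lock has expired and the job can run again.
	c := tick(r, "heat_decay", window, epoch.Add(window), decay)

	fmt.Println(a, b, c, applied)
}
```

This prints true false true 2: the second instance is turned away inside the window, and the job becomes runnable again once the TTL has elapsed.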

For per-player mutations, the lock is just a SELECT ... FOR UPDATE inside a transaction. PostgreSQL is good at this and you do not need to invent a lock manager.
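As a rough sketch of that row lock, the function below uses Go's standard database/sql package rather than pgx so it compiles without extra dependencies. The players table, the credits column, SpendCredits, debit, and ErrInsufficient are all hypothetical names for illustration. The balance check is split into a pure function so the rule can be exercised without a database.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// ErrInsufficient is returned when the locked balance cannot cover the cost.
var ErrInsufficient = errors.New("insufficient credits")

// debit is the pure rule applied while the row is locked.
func debit(balance, cost int64) (int64, error) {
	if cost < 0 {
		return balance, errors.New("negative cost")
	}
	if balance < cost {
		return balance, ErrInsufficient
	}
	return balance - cost, nil
}

// SpendCredits locks one player row, checks the balance, and writes the
// new value inside a single transaction.
func SpendCredits(ctx context.Context, db *sql.DB, playerID, cost int64) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // does nothing once Commit has succeeded

	var balance int64
	// FOR UPDATE makes other transactions that want this row wait until
	// this one commits or rolls back.
	err = tx.QueryRowContext(ctx,
		`SELECT credits FROM players WHERE id = $1 FOR UPDATE`, playerID).Scan(&balance)
	if err != nil {
		return err
	}
	next, err := debit(balance, cost)
	if err != nil {
		return err // the deferred Rollback releases the row lock
	}
	if _, err := tx.ExecContext(ctx,
		`UPDATE players SET credits = $1 WHERE id = $2`, next, playerID); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	// SpendCredits needs a live PostgreSQL connection, so only the rule
	// is exercised here.
	next, err := debit(100, 30)
	fmt.Println(next, err)
	_, err = debit(10, 30)
	fmt.Println(err)
}
```

Two concurrent calls for the same player serialize on the row lock: the second one reads the balance only after the first has committed, so it cannot spend credits that are already gone.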

The trade-off: cache invalidation and write amplification

The price you pay:

- Cache invalidation is real work. Every authoritative mutation has to invalidate the right Redis keys. Get this wrong and you serve stale data for minutes. The way I keep it sane is to make caching opt-in per repository method, and to always invalidate before returning success — never after. Slightly slower writes, but no zombie data.

- Write amplification. Sometimes a single user action mutates four or five rows: player, node, queued job, achievement, telemetry. Each of those is a write to PostgreSQL plus possibly a cache invalidation. You can’t avoid this, but you can keep it inside one transaction so the result is atomic and the cost is a single commit. Each statement is still its own round trip unless you batch them, which pgx supports.

- Background tickers cost. A naïve ticker that scans every player every minute will not survive 50k accounts. The pattern I use is to push almost all “due work” into a finishes_at column and have one ticker scan for finishes_at <= now() ordered by that column. PostgreSQL can serve that query from an index in a few milliseconds even at scale.
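The finishes_at idea can be illustrated in memory. In the sketch below a slice kept sorted by FinishesAt plays the role of a B-tree index on that column, so draining touches only the due entries instead of every job. Job, Queue, Add, DrainDue, t0, and at are hypothetical names, and this is an analogy for the indexed query, not a replacement for it.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// t0 is a fixed reference time so the example is deterministic.
var t0 = time.Date(2026, 5, 25, 12, 0, 0, 0, time.UTC)

// at returns t0 plus h hours.
func at(h int) time.Time { return t0.Add(time.Duration(h) * time.Hour) }

// Job is a hypothetical queued action (a build, a launder job, a raid).
type Job struct {
	ID         int
	FinishesAt time.Time
}

// Queue keeps jobs sorted by FinishesAt, standing in for an index on a
// finishes_at column.
type Queue struct {
	jobs []Job
}

// Add inserts a job at its sorted position.
func (q *Queue) Add(j Job) {
	i := sort.Search(len(q.jobs), func(i int) bool {
		return q.jobs[i].FinishesAt.After(j.FinishesAt)
	})
	q.jobs = append(q.jobs, Job{})
	copy(q.jobs[i+1:], q.jobs[i:])
	q.jobs[i] = j
}

// DrainDue removes and returns every job with FinishesAt <= now, oldest
// first. The binary search finds the boundary without visiting jobs
// that are not due yet.
func (q *Queue) DrainDue(now time.Time) []Job {
	n := sort.Search(len(q.jobs), func(i int) bool {
		return q.jobs[i].FinishesAt.After(now)
	})
	due := append([]Job(nil), q.jobs[:n]...)
	q.jobs = q.jobs[n:]
	return due
}

func main() {
	q := &Queue{}
	q.Add(Job{ID: 3, FinishesAt: at(3)})
	q.Add(Job{ID: 1, FinishesAt: at(1)})
	q.Add(Job{ID: 2, FinishesAt: at(2)})

	// The ticker fires two hours in: jobs 1 and 2 are due, job 3 is not.
	for _, j := range q.DrainDue(at(2)) {
		fmt.Println("resolve job", j.ID)
	}
	fmt.Println("still pending:", len(q.jobs))
}
```

The output is resolve job 1, resolve job 2, then still pending: 1. The ticker's work scales with the number of due jobs, which is the property the indexed finishes_at query is after.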

The general lesson: cheap reads are non-negotiable in a live game, but never at the cost of correctness on writes. If you ever feel tempted to "just write to Redis and flush later", stop. That's how you lose a player's progress and lose them as a customer.

How this looks in HiddenWars

HiddenWars players build and extend their botnets continuously. Each player owns a set of nodes; nodes have state (healthy, degraded, corrupted, offline), HP, build queues, and active jobs. On top of that, players run laundering jobs, hacks against other players, and bounty contracts.

- PostgreSQL holds players, nodes, build queues, launder jobs, hacks, bounties, achievements, and the audit log of every meaningful action.
- Redis holds the dashboard's paginated node list, the target defense snapshot used during a hack, presence ("who's online right now"), and the per-job locks for tickers like heat decay, build completion, and bounty expiry.

Every action endpoint is a Go handler that opens a transaction, runs a few repository calls, commits, then invalidates the relevant Redis keys before responding. That gives consistent state across players and a hot dashboard that returns in single-digit milliseconds even when the player has hundreds of nodes.

The thing I underestimated when I started was how much of the game design depends on this architecture being boring. Players don't notice good state management. They notice when their action vanishes, or when their resources jitter, or when a build that should have finished an hour ago is still pending. Removing those failures is what makes the game feel solid.

Takeaways

- Pick a single source of truth. PostgreSQL is a great default.
- Use Redis for things you can rebuild from PostgreSQL — caches, locks, presence — never for authoritative state.
- Distributed locks for periodic jobs are one SetNX call away. You probably don't need leader election.
- Push due work into a finishes_at column and let one ticker drain it. Don't scan everything every minute.
- Invalidate caches before returning success. Stale reads after a successful write feel like bugs to players, because they are.

A live game is, more than anything else, a database problem dressed up as a game. The closer you can get the database design to the game design, the less of your time goes into firefighting state corruption.

---

This is one of the systems behind HiddenWars, a browser-based multiplayer hacking strategy game.
