Table of contents
- Your database will fail at 2 AM. The question is: will you care?
- Architecture overview
- The failover dance
- Storage architecture
- What's configurable
- Why not managed PaaS?
- Quick start
- Why we open-sourced this
- Go break things (safely)
Your database will fail at 2 AM. The question is: will you care?
It's 2 AM. Your primary PostgreSQL instance just died. Your phone is buzzing. You're half-asleep, SSHing into production in your underwear, trying to remember which node is the replica and whether you can safely promote it without losing data.
Or — hear me out — you stay asleep.

That's the pitch. We built a Terraform module that deploys a production-ready, self-healing PostgreSQL cluster on Oracle Cloud Infrastructure. Three nodes, automatic failover, and a load balancer smart enough to know who the leader is — all provisioned with a single terraform apply.
It's open source: github.com/obytes/oci-postgres-cluster
Architecture overview
The cluster runs three nodes: two PostgreSQL instances managed by Patroni, and one lightweight etcd witness for quorum. A Network Load Balancer sits in front and routes traffic exclusively to the current primary.
Here's what each piece does:
- Patroni is the brain. It manages PostgreSQL lifecycle — startup, replication, health monitoring, and leader election. When the primary disappears, Patroni on the replica detects it and triggers promotion.
- etcd is the vote. A distributed key-value store that holds the cluster state and provides consensus. It prevents split-brain scenarios where both nodes think they're the leader (that's the nightmare).
- The witness node is the tiebreaker. It runs etcd only — no PostgreSQL, no block volumes, minimal compute. Its sole purpose is to maintain quorum: with 3 etcd members, any 2 can form a majority (2/3). Without it, losing one node means losing quorum entirely.
- The NLB is the router. It doesn't just check if port 5432 is open. It sends HTTP GET /primary to Patroni's REST API on port 8008. Only the actual leader returns 200 OK. Replicas return 503. Dead nodes return nothing. This is the secret sauce — your application connects to one stable IP, and the NLB always knows where to send traffic (see the Terraform sketch just below).
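For the curious, here's roughly what a leadership-aware backend set looks like with the OCI Terraform provider. Treat it as a sketch: the resource names and numbers below are illustrative, not copied from the module.

resource "oci_network_load_balancer_backend_set" "postgres_primary" {
  name                     = "postgres-primary"
  network_load_balancer_id = oci_network_load_balancer_network_load_balancer.this.id # placeholder reference
  policy                   = "FIVE_TUPLE"

  # Ask Patroni who the leader is, instead of asking PostgreSQL whether it's alive.
  health_checker {
    protocol           = "HTTP"
    port               = 8008        # Patroni REST API
    url_path           = "/primary"  # 200 OK only on the current leader
    return_code        = 200
    interval_in_millis = 10000       # illustrative cadence
  }
}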
The failover dance
So what actually happens when the primary dies? Let's walk through it.

Step by step:
- Node 1 crashes. Patroni's TTL expires after ~10 seconds. The etcd lease is gone.
- Quorum check. Node 2's etcd + the witness still form a majority (2 out of 3 members). Consensus holds.
- Promotion. Patroni on Node 2 sees the leader key is vacant, acquires the lock, and promotes PostgreSQL from replica to primary. This takes ~5-15 seconds depending on WAL replay.
- Health check flips. The NLB's next health check hits Node 2's /primary endpoint — 200 OK. Node 1 is unreachable — connection refused. The NLB updates its routing table.
- Traffic reroutes. Your application's next query goes to the NLB, which sends it to Node 2. Done.
Total downtime: 15-30 seconds. No SSH. No runbooks. No 3 AM heroics.

The critical detail most PostgreSQL HA setups get wrong: using a dumb TCP health check on port 5432. That tells you PostgreSQL is running, not that it's the leader. A newly promoted replica still has port 5432 open during the transition. A deposed primary might still accept connections briefly. The Patroni HTTP API eliminates this ambiguity entirely — it's the single source of truth for leadership.
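In Terraform terms the difference is only a few lines, but it changes what "healthy" means. Both snippets below are health_checker blocks that would sit inside an NLB backend set like the one sketched earlier; they're illustrative, not lifted from the module.

# The naive check: "is something listening on 5432?"
health_checker {
  protocol = "TCP"
  port     = 5432
}

# The leadership-aware check: "are you the Patroni leader right now?"
health_checker {
  protocol    = "HTTP"
  port        = 8008
  url_path    = "/primary"
  return_code = 200
}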
Storage architecture
Each PostgreSQL node gets three dedicated block volumes, plus the boot volume. Separating these isn't just organizational — it's about I/O isolation.
- Data volume (/pgdata) — Your tables, indexes, and TOAST data. Sized generously at 1 TB by default.
- WAL volume (/pgwal) — Write-Ahead Logs on a separate volume means sequential WAL writes don't compete with random data I/O. This matters under heavy write loads.
- Backup volume (/pgbackup) — Dedicated space for pg_basebackup and WAL archival. Keeps backups from eating into data or WAL headroom.
All three support OCI KMS encryption — pass a kms_key_id and every volume is encrypted with your customer-managed key. Skip it and OCI's platform-managed encryption is used by default. All volume sizes are configurable.
The setup uses LVM under the hood (volume groups and logical volumes), and the user-data scripts detect volumes by size — making the whole process idempotent across instance restarts.
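To make that concrete, here's a sketch of what one of those dedicated volumes could look like when provisioned through the OCI provider. The resource and variable names are illustrative, and the real module also handles availability domains, volume attachments, and the LVM layout omitted here.

# Illustrative only: a dedicated WAL volume with a distinct size (so the
# user-data scripts can identify it) and optional customer-managed encryption.
resource "oci_core_volume" "pgwal" {
  compartment_id      = var.compartment_id
  availability_domain = var.availability_domain
  display_name        = "myapp-node1-pgwal"
  size_in_gbs         = 200
  kms_key_id          = var.kms_key_id # null means OCI platform-managed encryption
}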
What's configurable
This isn't a rigid template you fork and find-replace. Everything is parameterized:
- PostgreSQL version — defaults to 15, but pass any major version
- etcd version — defaults to v3.5.17
- Compute specs — separate sizing for PostgreSQL nodes and the witness (the witness defaults to 1 OCPU / 8 GB because it only runs etcd — no need for 32 GB of RAM on a tiebreaker)
- Volume sizes — data, WAL, backup, and boot volumes are all independently configurable
- PostgreSQL tuning — max_connections, shared_buffers, and effective_cache_size via a simple object variable (sketched after this list)
- KMS encryption — optional, just pass the key OCID
- Network — bring your own VCN, subnet, and reserved private IPs
- NSG rules — fully customizable ingress rules for each service port
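As a sketch of what tuning looks like from the caller's side: apart from kms_key_id, the variable names below are illustrative guesses, so check the module's variables.tf for the real interface.

module "postgres-cluster" {
  source = "github.com/obytes/oci-postgres-cluster//modules/postgres-cluster"
  # ...required arguments as in the Quick start below...

  # Illustrative variable names; consult variables.tf for the actual interface.
  postgres_version = "16"

  postgres_config = {
    max_connections      = 500
    shared_buffers       = "8GB"
    effective_cache_size = "24GB"
  }

  kms_key_id = var.kms_key_id
}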
And importantly: the module doesn't hard-code a provider configuration. It never declares its own OCI provider — it inherits whatever the caller has configured. That's not an accident; that's following Terraform best practices for reusable modules.
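Concretely, the root configuration declares and configures the OCI provider, and the module simply uses it. This is standard Terraform behavior rather than anything specific to this module:

# The caller owns the provider configuration; a module without its own
# provider block inherits the caller's default configuration automatically.
provider "oci" {
  region = "eu-frankfurt-1" # example region
}

module "postgres-cluster" {
  source = "github.com/obytes/oci-postgres-cluster//modules/postgres-cluster"
  # ...arguments as in the Quick start below...

  # Only needed if you use aliased provider configurations:
  # providers = { oci = oci.some_alias }
}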
Why not managed PaaS?

Managed databases are great — until they're not. You want to pin a specific PostgreSQL minor version? Tune wal_level and max_wal_size? Use a custom extension? Control exactly where your data lives, how it's encrypted, and when backups run?
Self-hosted PostgreSQL HA gives you the knobs. Patroni + etcd is a battle-tested pattern used by GitLab, Zalando, and plenty of companies running PostgreSQL at scale. This module packages that pattern into something you can terraform apply in minutes instead of spending a week writing cloud-init scripts.
Quick start
module "postgres-cluster" {
source = "github.com/obytes/oci-postgres-cluster//modules/postgres-cluster"
prefix = "myapp"
cluster_name = "MYAPP-POSTGRES"
compartment_id = var.compartment_id
vcn_id = var.vcn_id
subnet_id = var.subnet_id
subnet_cidr = "10.0.2.0/24"
family_shape = "VM.Standard.E4.Flex"
image_id = data.oci_core_images.oracle_linux.images[0].id
postgres_instance_specs = { ocpus = 4, memory = 32 }
reserved_private_ips = [
cidrhost("10.0.2.0/24", 175), # node1
cidrhost("10.0.2.0/24", 176), # node2
cidrhost("10.0.2.0/24", 177), # witness
]
ssh_authorized_keys_postgres = var.ssh_public_key
}
That's the minimum. For the full example with KMS encryption, NSG rules, and all the trimmings, check out examples/complete/.
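And one hypothetical way to consume the cluster, assuming the module exposes the load balancer's address as an output (the output name below is a guess, so check outputs.tf for the real one):

# Hypothetical output name; the point is that the app targets one stable address.
output "postgres_endpoint" {
  value = "postgresql://app_user@${module.postgres-cluster.nlb_ip_address}:5432/appdb"
}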
Why we open-sourced this
We built this module, battle-tested it in production, and decided the community should have it. There's no catch. No "enterprise tier" behind a paywall. No "contact sales for the HA version."

PostgreSQL high availability shouldn't be a mystery. The Patroni + etcd pattern is well-documented, but packaging it into a clean, reusable Terraform module for OCI — with proper health checks, storage separation, KMS encryption, and a witness node — takes real engineering time. We already spent that time. Now you don't have to.
Go break things (safely)
The repo is live: github.com/obytes/oci-postgres-cluster
- Star it if this saved you time
- Try it in a dev environment first (obviously)
- Open issues if you find bugs — we're actively maintaining this
- PRs are welcome — check out CONTRIBUTING.md for guidelines

Now go deploy a cluster and sleep through your next failover.