The Postgres consolidation strategy makes a clear argument: for the overwhelming majority of SaaS and AI workloads, Postgres is the right database. But honest engineering also means knowing exactly where that argument stops — and what to do when you get there.
This is not a balanced “it depends” piece. It gives you measurable thresholds, a decision framework, and a real production case study with specific numbers. When Wingify hit Postgres’s analytical limits, they migrated their OLAP layer to ClickHouse, cut infrastructure costs by 80%, and reduced aggregation query latency from 30–50 seconds to 100–300 milliseconds.
The core principle: consolidate first, measure, then break out only when measurable pain forces the move.
Where does the Postgres consolidation thesis have limits?
Postgres is a row-oriented transactional database optimised for row-level reads and writes with ACID guarantees. That is its strength for the workloads that make up the vast majority of SaaS applications. The structural exception is OLAP — and the limitations are architectural, not configurational.
I/O amplification. Postgres stores data row-by-row. An aggregation query needing two columns from a 50-column row reads every column in every row. At analytical scale — hundreds of millions of rows — this saturates disk bandwidth structurally. Not because of missing indexes. Because of how it stores data.
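To make the amplification concrete, here is a sketch against a hypothetical wide events table (table and column names are illustrative):

```sql
-- Hypothetical 50-column table: a row store packs all 50 columns of
-- each row together on disk, page by page.
SELECT country, count(*) AS sessions
FROM events                    -- every page read pulls in full rows,
WHERE day >= '2026-01-01'      -- so this 2-column aggregate still
GROUP BY country;              -- reads all 50 columns of every row
```

In a columnar store, the same query touches only the `country` and `day` column files; in Postgres, it touches everything.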
Autovacuum contention. Analytical queries hold transactions open for long stretches, and while any transaction is open, Postgres cannot clean up the dead tuples that transaction might still see. The result is table bloat that degrades OLTP performance. Picture arriving in the morning to find your read replica lagging by hours because a heavy analytical query blocked autovacuum overnight.
Absence of vectorised execution. Postgres processes data one value at a time. ClickHouse uses SIMD CPU instructions to process millions of values in a single cycle. At analytical scale, this compounds with I/O amplification in a way that no amount of tuning fixes.
Reaching this point is a diagnostic signal, not a failure. Postgres did its job — the workload has just grown to require a specialist.
What workload characteristics mark the OLAP vs OLTP boundary?
The boundary is not data volume alone. It is aggregate query patterns, real-time freshness requirements, and row counts — in combination.
OLTP is Postgres’s native domain. Transactions, user records, session management, job queues, embeddings with pgvector. Frequent, low-latency, row-level reads and writes. Postgres does this natively and well.
OLAP is a different beast. Aggregate-heavy queries across large datasets — COUNT, SUM, AVG, GROUP BY across hundreds of millions to billions of rows, with sub-second latency requirements. A SaaS analytics dashboard showing live A/B test results across millions of sessions is an OLAP workload. Postgres is not the right tool for it.
Postgres materialised views help, but only to a point. They require a manual REFRESH and still store data in row format, so I/O amplification persists. They delay the problem — they do not solve it.
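A minimal sketch of the pattern and its limitation (view and table names are illustrative):

```sql
-- Postgres materialised view: results are stored in row format,
-- and the view only reflects data as of its last refresh.
CREATE MATERIALIZED VIEW daily_sessions AS
SELECT day, country, count(*) AS sessions
FROM events
GROUP BY day, country;

-- Must be re-run (e.g. on a cron schedule) to pick up new rows;
-- between refreshes, every read sees stale data.
REFRESH MATERIALIZED VIEW daily_sessions;
```

The refresh interval is a hard floor on data freshness, which is exactly the constraint real-time dashboards cannot live with.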
ClickHouse’s AggregatingMergeTree engine with Materialised Views is architecturally different. Data is stored column-by-column, enabling 15–20x compression ratios. Materialised Views update automatically on every insert — raw events go in, the MV aggregates them, and read queries hit the pre-aggregated MV. No manual refresh. No I/O amplification.
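A sketch of the ClickHouse pattern, with illustrative table names:

```sql
-- Raw events land in a plain MergeTree table.
CREATE TABLE events_raw (
    day     Date,
    country String,
    user_id UInt64
) ENGINE = MergeTree
ORDER BY (day, user_id);

-- Target table stores partial aggregate states, not raw rows.
CREATE TABLE daily_agg (
    day      Date,
    country  String,
    sessions AggregateFunction(count)
) ENGINE = AggregatingMergeTree
ORDER BY (day, country);

-- The MV fires on every INSERT into events_raw, keeping
-- daily_agg up to date with no manual refresh.
CREATE MATERIALIZED VIEW daily_agg_mv TO daily_agg AS
SELECT day, country, countState() AS sessions
FROM events_raw
GROUP BY day, country;

-- Reads merge the partial states instead of scanning raw events.
SELECT day, country, countMerge(sessions) AS sessions
FROM daily_agg
GROUP BY day, country;
```

The `countState()` / `countMerge()` pair is the "pre-aggregated partial states" mechanism: inserts accumulate states, reads combine them.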
The diagnostic checklist. If two or more of these are true, you have hit architectural limits, not a tuning gap:
- Aggregation queries (GROUP BY, COUNT, SUM across large tables) taking more than 5 seconds
- Autovacuum visibly lagging or blocked by long-running analytical queries
- Analytical batch jobs degrading OLTP query latency when they run concurrently
- shared_blks_read in pg_stat_statements extremely high for analytical queries despite good indexing
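A quick way to surface the last signal, assuming the pg_stat_statements extension is installed:

```sql
-- Rank statements by shared buffer blocks read from disk/cache.
-- Analytical queries near the top despite good indexes point to
-- I/O amplification, not a missing index.
SELECT query,
       calls,
       shared_blks_read,
       mean_exec_time AS mean_ms
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
```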
What does Wingify’s ClickHouse migration actually show?
Wingify built VWO (Visual Website Optimiser), an A/B testing platform serving over 10,000 customers. Their core product requires real-time aggregation of experiment results at scale. It is a demanding workload.
The before state. Wingify’s analytics ran against Postgres. As data volumes grew, aggregation query latency degraded to 30–50 seconds. Vertical scaling reached diminishing returns. Sharding introduced more complexity than it resolved. They had hit the wall.
The migration. They moved their analytics layer to ClickHouse using AggregatingMergeTree with Materialised Views. Raw events go into a base table, the MV aggregates them in real time, and read queries hit the pre-aggregated summary — never raw data.
The result. Aggregation query latency dropped from 30–50 seconds to 100–300 milliseconds. That is a 100–200x improvement. Infrastructure costs fell by 80%. Postgres stayed as the OLTP system of record.
And here is the thing worth noting: Wingify’s 80% cost reduction only materialised because they started on Postgres and moved when measurable evidence demanded it. Not before.
What is the decision framework for breaking out to ClickHouse?
Work through these in order before concluding you need ClickHouse.
Step 1 — Confirm it is architectural, not a tuning gap. Run EXPLAIN ANALYZE on your worst queries. Check pg_stat_statements I/O metrics and autovacuum lag via pg_stat_user_tables. If pg_partman partitioning has not been implemented, do that first. If queries are still above 5 seconds after all of this, you have a structural problem. Not a config problem.
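Two of those checks as queries (the EXPLAIN target stands in for your own worst query):

```sql
-- Autovacuum lag: tables carrying the most dead tuples, and when
-- autovacuum last managed to run on them.
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Query plan with buffer counts; "shared read" numbers far above
-- what the result size justifies indicate I/O amplification.
EXPLAIN (ANALYZE, BUFFERS)
SELECT country, count(*)
FROM events          -- placeholder for your slowest aggregate
GROUP BY country;
```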
Step 2 — Assess workload composition. If more than 20% of your query load is aggregate-heavy, a dedicated OLAP store is justified. Below that threshold, the overhead of a second system is unlikely to be worth it.
Step 3 — Check real-time freshness requirements. If dashboards require sub-second data staleness at more than a few hundred queries per minute, Postgres Materialised Views cannot meet it. The manual REFRESH cycle creates an unbridgeable latency floor.
Step 4 — Estimate data volume trajectory. If row counts are approaching hundreds of millions and growth points to billions within 12–18 months, architect for ClickHouse now, not in crisis.
Step 5 — Consider the operational cost honestly. ClickHouse adds real complexity: Keeper nodes, schema evolution constraints, columnar modelling. A team below five engineers should look at pg_duckdb first — approximately 500x OLAP speedup without a second database. If your dataset is below roughly 100GB and you do not need real-time streaming ingestion, pg_duckdb is likely sufficient. It is the right call far more often than people expect.
The migration bridge. When ClickHouse is the right call, PeerDB (open-source) or ClickPipes (managed, built into ClickHouse Cloud) handle Postgres-to-ClickHouse change data capture via logical replication. Postgres stays as the write system of record. Application code does not change. The migration is far less disruptive than it sounds.
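On the Postgres side, the prerequisites for logical-replication CDC look roughly like this (publication and table names are illustrative; consult the PeerDB or ClickPipes docs for the exact setup):

```sql
-- Logical decoding must be enabled; this requires a restart.
ALTER SYSTEM SET wal_level = 'logical';

-- A publication defines which tables stream row-level changes.
CREATE PUBLICATION clickhouse_pub FOR TABLE events, orders;

-- The CDC tool then creates a replication slot against this
-- publication and streams inserts/updates/deletes to ClickHouse.
```

Because changes flow out of the WAL, the application keeps writing to Postgres exactly as before.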
When does pgvectorscale stop being enough for vector search?
The same principle applies to vector search. The upgrade path is: pgvector → pgvectorscale → dedicated vector database. pgvectorscale from Timescale extends Postgres’s useful range dramatically via StreamingDiskANN. At moderate scale, benchmarks show 28x lower p95 latency than Pinecone, 16x higher throughput, and 75% cost reduction.
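A sketch of what enabling StreamingDiskANN looks like, assuming a self-hosted Postgres where the extension is available (table name and embedding dimension are illustrative):

```sql
-- Extension name is vectorscale; CASCADE pulls in pgvector.
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    embedding vector(1536)   -- pgvector type; dimension is illustrative
);

-- StreamingDiskANN index for cosine similarity search.
CREATE INDEX ON documents
USING diskann (embedding vector_cosine_ops);
```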
One constraint to know upfront: pgvectorscale is not available on AWS RDS or Aurora. If your AI workloads need StreamingDiskANN and you are on managed Postgres, you face a hosting decision before you face a vector database decision. Self-host, or use a platform that supports it, such as Timescale Cloud.
The genuine breakout threshold is billion-vector scale with strict latency SLAs and metadata filtering requirements that the upgrade path cannot resolve. For a detailed analysis, see pgvectorscale at billion-vector scale.
How does managed Postgres (RDS/Aurora) compare to BYOC for AI-intensive workloads?
For AI-intensive workloads, the hosting model determines your IOPS ceiling, extension availability, and cost structure. It matters more than most teams realise until they are already hitting limits.
AWS RDS and Aurora are the correct default for most teams. The constraints appear at scale: EBS-backed storage throttles IOPS, pgvectorscale is unavailable, and OLAP-heavy workloads force RAM overprovisioning. For a lot of teams, that is fine. Until it is not.
BYOC (Bring Your Own Cloud) deploys Postgres inside your own cloud account with managed-service operations but infrastructure control. NVMe-backed storage removes IOPS limits and makes vector index rebuilds non-punishing. The 2026 ClickHouse Cloud + Ubicloud offering is a good example — NVMe-backed Postgres colocated with ClickHouse compute, up to 10x IOPS over EBS. For regulated industries, BYOC keeps data within your own cloud account boundary, so HIPAA and SOC 2 compliance follows naturally.
If your AI workloads are IOPS-constrained on RDS, or you need pgvectorscale, BYOC is the right move. When Postgres consolidation makes sense covers the broader hosting decision.
The principle: consolidate first, measure, then break out
Instrument before you conclude. EXPLAIN ANALYZE for query plan inspection, pg_stat_statements for I/O-heavy query identification, pg_stat_user_tables for autovacuum lag. These tools tell you whether you have hit an architectural constraint or a tuning gap. Use them.
Wingify’s 80% cost reduction only means something because they started on Postgres and moved when the data demanded it — not before. If you are earlier in the decision journey, the Postgres consolidation strategy is the right starting point. For the full business case, the honest exceptions to the consolidation thesis covers what the numbers mean for your own infrastructure spend.
FAQ
What are the signs that Postgres is no longer sufficient for analytics workloads?
Aggregation queries taking more than 5 seconds is the primary structural signal. Autovacuum lagging or blocked by long-running analytical queries is the operational symptom. OLTP latency degrading when analytical batch jobs run concurrently is the second red flag. High shared_blks_read in pg_stat_statements despite good indexing confirms I/O amplification. If pg_partman partitioning and materialised view refreshes have not resolved the latency, you are past the tuning horizon and into architectural territory.
How does ClickHouse’s columnar storage differ from Postgres row-based storage?
Postgres stores each row contiguously — an analytical query reading two columns from a 50-column row still reads the entire row. ClickHouse stores each column contiguously, reading only the columns you need, with 15–20x compression ratios. Columnar layout also enables vectorised execution via SIMD CPU instructions. At billion-row scale, this produces the 100–200x latency improvements documented in Wingify’s migration.
What is ClickHouse AggregatingMergeTree and how does it enable real-time aggregation?
AggregatingMergeTree stores pre-aggregated partial states — running SUM, COUNT — rather than raw rows. Combined with a Materialised View, aggregations update automatically on every INSERT. Read queries hit the pre-aggregated view, not the raw table. That is the mechanism behind Wingify’s latency reduction from 30–50 seconds to 100–300 milliseconds. Postgres materialised views require a manual REFRESH and store data in row format. They are a read optimisation, not a real-time aggregation engine.
Why did Wingify migrate from Postgres to ClickHouse?
Wingify’s VWO platform serves over 10,000 customers and requires real-time aggregation of A/B experiment results. As data volumes grew, aggregation query latency degraded to 30–50 seconds. The migration to ClickHouse using AggregatingMergeTree and Materialised Views reduced latency to 100–300 milliseconds and cut infrastructure costs by 80%. Postgres remained the OLTP system of record.
What is PeerDB and how does it help migrate from Postgres to ClickHouse?
PeerDB is an open-source CDC (Change Data Capture) tool built for Postgres-to-ClickHouse replication. It uses PostgreSQL logical replication slots to stream row-level changes in near-real time. ClickPipes is the managed version, integrated into ClickHouse Cloud. Postgres stays as the write system of record. Application code does not change.
Is pgvectorscale available on AWS RDS?
No — pgvectorscale is not available on AWS RDS or Aurora as of 2026. Teams who need StreamingDiskANN performance must self-host or use a platform that supports it, such as Timescale Cloud. If you are on RDS, you face a hosting decision before you face a vector database decision.
When should I use pg_duckdb instead of migrating to ClickHouse?
pg_duckdb embeds DuckDB's vectorised execution engine inside Postgres — approximately 500x OLAP speedup without a second database. It is right for teams below five engineers, datasets below roughly 100GB, or workloads without real-time streaming ingestion requirements. ClickHouse is the right choice when you need real-time aggregation at hundreds of millions to billions of rows, or continuous streaming write workloads.
What database workloads does Postgres NOT handle well in 2026?
Real-time aggregation at hundreds of millions to billions of rows with sub-second latency — ClickHouse wins here. Streaming analytics at very high write velocity — Postgres materialised views cannot keep pace. Vector search at billion-vector scale with strict latency SLAs and complex metadata filtering — dedicated vector databases handle adaptive pre/post-filter switching that pgvector does not. At petabyte scale with multi-tenant enterprise requirements, scale-out warehouses are more appropriate.
What is the “Postgres Wall”?
The Postgres Wall is the inflection point at which analytical performance degrades structurally — not because of missing configuration, but because of the architectural constraints of row-oriented storage. In production, it manifests as your read replica lagging by hours because a heavy analytical query blocked autovacuum. Reaching it is a diagnostic signal, not a failure — the workload has grown to require a specialist.