Disclaimer: This blog post is automatically generated from project documentation and technical proposals using AI assistance. The content represents our development journey and architectural decisions. Code examples are simplified illustrations and may not reflect the exact production implementation.
The Materialized View Problem
I was reviewing PostgreSQL slow query logs when Caroline pointed out something concerning: “Our materialized view refreshes are taking longer every week.”
She pulled up the stats:
SELECT pid, query, query_start, now() - query_start AS refresh_duration
FROM pg_stat_activity
WHERE query ILIKE 'REFRESH MATERIALIZED VIEW%'
ORDER BY refresh_duration DESC;
The match_statistics view was taking 18 seconds to refresh. The player_momentum view was at 12 seconds.
“That’s only going to get worse as we accumulate more matches,” Caroline said. “And these views lock tables during refresh.”
Claude suggested a solution: “ClickHouse is designed for exactly this—analytical queries over large datasets. Keep PostgreSQL for transactions, use ClickHouse for analytics.”
We decided to migrate.
Architecture: OLTP vs OLAP
Caroline drew the architecture on the whiteboard:
| PostgreSQL (OLTP) | ClickHouse (OLAP) |
|---|---|
| Match state | Shot history |
| Event sourcing | Score changes |
| Transactions | Analytics queries |
| Current scores | Time-series analysis |
The key insight: separate transactional and analytical workloads.
“PostgreSQL is perfect for event sourcing,” Claude explained. “It has strong consistency, transactions, and foreign keys. But for analytical queries over millions of shots, ClickHouse is 10-100x faster.”
I asked: “How do we keep them in sync?”
“NATS,” Caroline answered. “Events flow through NATS to both databases. PostgreSQL stores the current state, ClickHouse stores the analytical history.”
Data Flow
Match Events → NATS Stream → Projections Service
                                     │
                          ┌──────────┴──────────┐
                          ↓                     ↓
                     PostgreSQL            ClickHouse
                  (Current state)          (Analytics)
Phase 1: Infrastructure Setup (1 Day)
Caroline started with the Docker Compose configuration:
clickhouse:
  image: clickhouse/clickhouse-server:24.3
  container_name: clickhouse
  ports:
    - '8123:8123' # HTTP interface
    - '9000:9000' # Native protocol
  volumes:
    - clickhouse-data:/var/lib/clickhouse
    - ./db/clickhouse/migrations:/docker-entrypoint-initdb.d
  environment:
    CLICKHOUSE_DB: shock
    CLICKHOUSE_USER: shock
    CLICKHOUSE_PASSWORD: shock
“ClickHouse automatically runs .sql files from /docker-entrypoint-initdb.d on first startup,” she explained. “No manual migration runner needed.”
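Before touching the schema, it is worth confirming that the container and credentials are wired up. A minimal check with the official @clickhouse/client package might look like this (connection values mirror the Compose file above and are illustrative):

import { createClient } from '@clickhouse/client';

// Connects over the HTTP interface exposed on port 8123 and pings the server.
const clickhouse = createClient({
  url: 'http://localhost:8123',
  username: 'shock',
  password: 'shock',
  database: 'shock',
});

const ping = await clickhouse.ping();
if (!ping.success) {
  throw new Error('ClickHouse is not reachable');
}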
We created the initial schema:
-- db/clickhouse/migrations/001_initial_schema.sql
CREATE DATABASE IF NOT EXISTS shock;
CREATE TABLE IF NOT EXISTS shock.events (
event_id String,
match_id String,
event_type String,
event_timestamp DateTime64(3),
event_data String, -- Full JSON for flexibility
created_at DateTime64(3) DEFAULT now64(3)
) ENGINE = ReplacingMergeTree(created_at)
PARTITION BY toYYYYMM(event_timestamp)
ORDER BY (match_id, event_timestamp, event_id);
Caroline noted: “We use ReplacingMergeTree with created_at as the version column. This handles duplicate inserts—NATS delivers at-least-once, so we need deduplication.”
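One caveat: ReplacingMergeTree deduplicates only when background merges run, so a read can still see both copies of an event for a while. When a query must be duplicate-free, FINAL forces deduplication at read time. A sketch using the client from the check above (the WHERE clause is illustrative):

// FINAL collapses duplicate rows using the ReplacingMergeTree version column
// (created_at), trading some query speed for exact results.
const dedupedEvents = await clickhouse.query({
  query: `
    SELECT event_id, event_type, event_timestamp
    FROM shock.events FINAL
    WHERE match_id = {matchId:String}
    ORDER BY event_timestamp
  `,
  query_params: { matchId: 'some-match-id' },
  format: 'JSONEachRow',
});

For large scans, grouping with argMax(..., created_at) is the usual cheaper alternative to FINAL.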
Phase 2: Event Projection (1 Day)
Next, we needed to stream events from NATS to ClickHouse. Caroline extended our existing projections service:
ClickHouse Client Setup
// app/packages/matches-app/index.mts
import { createClient } from '@clickhouse/client';
const clickhouse = createClient({
url: config.clickhouse.url,
username: config.clickhouse.username,
password: config.clickhouse.password,
database: config.clickhouse.database,
});
const app = new Application(database, nats, clickhouse, tracer, logger);
“Now ClickHouse is a first-class dependency,” Caroline explained. “Just like PostgreSQL.”
Batch Processing
We created a ClickHouseWriter class with batch support:
import type { ClickHouseClient } from '@clickhouse/client';

class ClickHouseWriter {
  constructor(private readonly clickhouse: ClickHouseClient) {}

  async write(events: DomainEvent[]): Promise<void> {
    // Map domain events to flat rows matching the shock.events schema
    const rows = events.map((event) => ({
      event_id: event.id,
      match_id: event.detail.matchId,
      event_type: event.type,
      event_timestamp: new Date(event.detail.eventTime),
      event_data: JSON.stringify(event.detail),
    }));

    await this.clickhouse.insert({
      table: 'shock.events',
      values: rows,
      format: 'JSONEachRow',
    });
  }
}
Caroline integrated it with the projections service:
// app/packages/matches-projections/index.mts
const batchDuplex = new BatchDuplex({
maxBatchSize: 10,
maxWaitTime: 5000, // 5 seconds
});
consumer
.messages()
.pipe(batchDuplex)
.on('data', async (batch) => {
try {
await clickhouseWriter.write(batch);
} catch (error) {
logger.error(error, 'Failed to write batch to ClickHouse');
}
});
“We batch 10 events or wait 5 seconds, whichever comes first,” she explained. “This balances latency and throughput.”
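BatchDuplex is an in-house helper; conceptually it is just a size-or-time buffer. A rough equivalent written as a Node.js Transform stream, to show the shape of the idea (a sketch, not the production class):

import { Transform, type TransformCallback } from 'node:stream';

// Flushes a batch when it reaches maxBatchSize, or when maxWaitTime elapses,
// whichever comes first.
class BatchTransform<T> extends Transform {
  private batch: T[] = [];
  private timer: NodeJS.Timeout | null = null;

  constructor(private readonly maxBatchSize = 10, private readonly maxWaitTime = 5000) {
    super({ objectMode: true });
  }

  _transform(item: T, _enc: BufferEncoding, done: TransformCallback): void {
    this.batch.push(item);
    if (this.batch.length >= this.maxBatchSize) {
      this.emitBatch();
    } else if (!this.timer) {
      this.timer = setTimeout(() => this.emitBatch(), this.maxWaitTime);
    }
    done();
  }

  _flush(done: TransformCallback): void {
    this.emitBatch();
    done();
  }

  private emitBatch(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.batch.length > 0) {
      this.push(this.batch); // downstream receives an array of events
      this.batch = [];
    }
  }
}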
I asked: “What if a batch fails?”
“Retry with exponential backoff,” Claude answered. “3 attempts with 1s, 2s, 4s delays. If all fail, we log the error and continue processing the next batch.”
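The retry wrapper itself is small. A hedged sketch of that loop (the helper name is ours; the delays and attempt count follow Claude's description):

// Retry a ClickHouse batch write up to 3 times with 1s, 2s, 4s backoff.
// If every attempt fails, log and move on so the consumer keeps draining NATS.
async function writeWithRetry(
  writer: ClickHouseWriter,
  batch: DomainEvent[],
  attempts = 3,
): Promise<void> {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      await writer.write(batch);
      return;
    } catch (error) {
      if (attempt === attempts - 1) {
        logger.error(error, 'ClickHouse batch failed after retries, skipping');
        return;
      }
      const delayMs = 1000 * 2 ** attempt; // 1s, 2s, 4s
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}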
Phase 3: Denormalized Schema (1 Day)
The initial schema stored everything as JSON. Caroline proposed denormalization:
“We should extract hot columns—the fields we query most often—into native ClickHouse types for better performance.”
Shots Table
CREATE TABLE IF NOT EXISTS shock.shots (
event_id String,
match_id String,
event_timestamp DateTime64(3),
-- Denormalized hot columns
player UInt8, -- Changed from String to save space
shot_type String, -- forehand, backhand, serve, volley
shot_outcome String, -- WINNER, ERROR, FORCED_ERROR, NEUTRAL
winner_team UInt8, -- 0 or 1 (team that won the point)
text String, -- Original description
-- Full event for flexibility
event_data String,
created_at DateTime64(3)
) ENGINE = ReplacingMergeTree(created_at)
PARTITION BY (match_id, toYYYYMMDD(event_timestamp))
ORDER BY (match_id, player, shot_type, event_timestamp);
Caroline explained the benefits:
- Storage: player as UInt8 (1 byte) vs String (~10 bytes) = ~90% savings
- Query speed: Native types avoid JSON parsing overhead (compare the two queries sketched after this list)
- Indexes: Sparse indexes on player and shot_type for filtering
- Compression: ClickHouse compresses native types better than JSON
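To make the query-speed point concrete, here is a rough before/after for a single filter, assuming a matchId in scope and a shotType key inside event_data (both names are illustrative): the JSON-only events table parses event_data on every row, while the denormalized shots table filters a native column.

// JSON route: every row pays for JSONExtractString over event_data
const viaJson = await clickhouse.query({
  query: `
    SELECT count()
    FROM shock.events
    WHERE match_id = {matchId:String}
      AND JSONExtractString(event_data, 'shotType') = 'forehand'
  `,
  query_params: { matchId },
});

// Denormalized route: shot_type is a native column covered by the sorting key
const viaColumn = await clickhouse.query({
  query: `
    SELECT count()
    FROM shock.shots
    WHERE match_id = {matchId:String}
      AND shot_type = 'forehand'
  `,
  query_params: { matchId },
});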
Score Changes Table
CREATE TABLE IF NOT EXISTS shock.score_changes (
event_id String,
match_id String,
event_timestamp DateTime64(3),
sequence UInt32,
-- Current game score
current_game_point_0 String, -- e.g., '0', '15', '30', '40', 'AD'
current_game_point_1 String,
-- Set and match status
current_set UInt8,
match_finished Boolean,
match_winner Int8, -- -1 if not finished, 0 or 1 for winner
-- Sets data (JSON due to variable length)
sets String,
event_data String,
created_at DateTime64(3)
) ENGINE = ReplacingMergeTree(created_at)
PARTITION BY (match_id, toYYYYMMDD(event_timestamp))
ORDER BY (match_id, sequence, event_timestamp);
Phase 4: Query Migration (2 Days)
Now the hard part: migrating 11 analytical queries from PostgreSQL to ClickHouse.
Migration Strategy
Caroline created a systematic approach:
- Feature flags - Run both PostgreSQL and ClickHouse versions in parallel (see the sketch after this list)
- Identical interfaces - Same input parameters, same output structure
- Query optimization - Leverage ClickHouse-specific features
- Testing - Compare results between implementations
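A minimal sketch of what the flag plus parallel comparison could look like (the flag name and query-handler names are placeholders, not the production wiring):

// While the flag is off, serve PostgreSQL results and run ClickHouse in
// shadow mode, logging any mismatch before trusting the new path.
async function getPlayerMomentum(matchId: string): Promise<unknown[]> {
  if (config.featureFlags.clickhouseAnalytics) {
    return getPlayerMomentumClickhouse(matchId);
  }

  const fromPostgres = await getPlayerMomentumPostgres(matchId);
  try {
    const fromClickhouse = await getPlayerMomentumClickhouse(matchId);
    if (JSON.stringify(fromClickhouse) !== JSON.stringify(fromPostgres)) {
      logger.warn({ matchId }, 'ClickHouse result differs from PostgreSQL');
    }
  } catch (error) {
    logger.error(error, 'ClickHouse shadow query failed');
  }
  return fromPostgres;
}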
Example: Player Momentum
The PostgreSQL version used materialized views:
-- PostgreSQL materialized view
CREATE MATERIALIZED VIEW match_momentum AS
SELECT
  match_id,
  -- One-minute buckets derived from the shot timestamp
  date_trunc('minute', event_timestamp) AS time_window,
  player,
  SUM(CASE WHEN winner = player THEN 1 ELSE 0 END) AS points_won,
  SUM(CASE WHEN winner != player THEN 1 ELSE 0 END) AS points_lost
FROM shots
GROUP BY match_id, time_window, player;
Caroline migrated it to ClickHouse:
// get.player_momentum.clickhouse.query.mts
const query = `
WITH time_windows AS (
SELECT
match_id,
player,
floor(date_diff('second', match_start, event_timestamp) / 60) as time_window,
countIf(winner_team = team) as points_won,
countIf(winner_team != team) as points_lost
FROM shock.shots
WHERE match_id = {matchId:String}
GROUP BY match_id, player, time_window
),
momentum_calc AS (
  SELECT
    time_window,
    player,
    points_won,
    points_lost,
    -- Exponential decay (factor 0.7) over a lookback of 4 windows;
    -- lagInFrame is ClickHouse's equivalent of lag()
    points_won
      + 0.7 * lagInFrame(points_won, 1, 0) OVER w
      + 0.49 * lagInFrame(points_won, 2, 0) OVER w
      + 0.343 * lagInFrame(points_won, 3, 0) OVER w as weighted_points_won,
    points_lost
      + 0.7 * lagInFrame(points_lost, 1, 0) OVER w
      + 0.49 * lagInFrame(points_lost, 2, 0) OVER w
      + 0.343 * lagInFrame(points_lost, 3, 0) OVER w as weighted_points_lost
  FROM time_windows
  WINDOW w AS (
    PARTITION BY player
    ORDER BY time_window
    ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
  )
)
SELECT
  time_window,
  player,
  round(weighted_points_won / (weighted_points_won + weighted_points_lost) * 100, 2) as momentum_pct
FROM momentum_calc;
`;
const result = await clickhouse.query({
query,
query_params: { matchId },
});
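One detail the snippet above leaves out is reading the rows back: the client returns a result set that still has to be parsed. Something like this works (the row type is inferred from the SELECT list, and the exact json() shape depends on the client version and output format):

type MomentumRow = { time_window: number; player: number; momentum_pct: number };

// With the default JSON output format, the parsed body carries the rows under `data`.
const { data: momentum } = await result.json<MomentumRow>();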
“The logic is identical to PostgreSQL,” Caroline said. “But ClickHouse processes it 10-20x faster because of columnar storage and vectorized execution.”
Complex Query: Pressure Analysis
The most challenging migration was GetPressureAnalysis. It combines three scoring components:
- Base Weight (0.3-1.0): Shot outcome impact
- Situation Weight (1.0-5.0): Game score context (deuce, game point, advantage)
- Momentum Factor (0.5-1.5): Recent point trend with exponential decay
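For intuition, the three components multiply per shot: a WINNER hit at advantage during a positive run scores roughly 1.0 × 5.0 × 1.3 ≈ 6.5, while a neutral rally shot at an ordinary score with flat momentum lands near 0.5 × 2.5 × 1.0 ≈ 1.25. The same stroke counts several times more under pressure.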
Caroline showed me the ClickHouse query:
const query = `
WITH scored_shots AS (
SELECT
s.event_timestamp,
s.player,
s.shot_outcome,
-- Base weight from outcome
multiIf(
s.shot_outcome = 'WINNER', 1.0,
s.shot_outcome = 'FORCED_ERROR', 0.8,
s.shot_outcome = 'UNFORCED_ERROR', 0.6,
s.shot_outcome = 'LET', 0.3,
0.5
) as base_weight,
-- Situation weight from game score
multiIf(
sc.current_game_point_0 = 'AD' OR sc.current_game_point_1 = 'AD', 5.0,
sc.current_game_point_0 = '40' AND sc.current_game_point_1 = '30', 4.0,
sc.current_game_point_0 = '40' AND sc.current_game_point_1 = '40', 3.5,
2.5
) as situation_weight,
      -- Last 5 points before this shot (may be fewer early in a match)
      groupArray(5)(s.winner_team) OVER (
        PARTITION BY s.match_id
        ORDER BY s.event_timestamp
        ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING
      ) as recent_winners,
      -- Momentum factor: exponential decay of 0.2 per point; arrayEnumerate keeps
      -- the index array the same length as recent_winners on early rows
      1.0 + (
        arrayReduce('sum',
          arrayMap((w, i) -> if(w = team, exp(-0.2 * (i - 1)), -exp(-0.2 * (i - 1))),
            recent_winners,
            arrayEnumerate(recent_winners)
          )
        ) / 5.0
      ) * 0.5 as momentum_factor
FROM shock.shots s
ASOF LEFT JOIN shock.score_changes sc
ON s.match_id = sc.match_id
AND sc.event_timestamp <= s.event_timestamp
WHERE s.match_id = {matchId:String}
)
SELECT
player,
round(avg(base_weight * situation_weight * momentum_factor), 3) as pressure_score
FROM scored_shots
GROUP BY player;
`;
I was amazed: “You’re using ASOF LEFT JOIN to correlate shots with the closest previous score change?”
Caroline nodded: “ClickHouse’s time-series joins are perfect for this. And arrayMap + arrayReduce let us calculate momentum directly in SQL—no intermediate tables.”
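Concretely: if a shot is recorded at, say, 00:12:07.250 and score changes exist at 00:12:05.100 and 00:12:09.800, the condition sc.event_timestamp <= s.event_timestamp makes the ASOF join attach the 00:12:05.100 row, the score that was actually on the board when the shot was played, not the one recorded after it.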
Performance Results
After migrating all 11 queries, we benchmarked:
| Query | PostgreSQL (P95) | ClickHouse (P95) | Speedup |
|---|---|---|---|
| Player Momentum | 450ms | 42ms | 10.7x |
| Team Momentum | 380ms | 38ms | 10.0x |
| Pressure Analysis | 820ms | 68ms | 12.1x |
| Player Profiles | 620ms | 54ms | 11.5x |
| Match Statistics | 580ms | 48ms | 12.1x |
| Head-to-Head | 740ms | 71ms | 10.4x |
| Shot Type Stats | 410ms | 36ms | 11.4x |
| Set Stats | 390ms | 34ms | 11.5x |
| Situation Performance | 680ms | 58ms | 11.7x |
| Outcomes Over Time | 520ms | 46ms | 11.3x |
| Match Streaks | 710ms | 62ms | 11.5x |
| Average | 573ms | 51ms | 11.2x |
Caroline was thrilled: “We’re seeing 11x faster queries on average. And this gap will grow as we accumulate more data.”
Phase 5: PostgreSQL Cleanup (1 Day)
With all queries migrated, we cleaned up PostgreSQL:
Materialized Views Dropped
DROP MATERIALIZED VIEW IF EXISTS match_statistics;
DROP MATERIALIZED VIEW IF EXISTS match_momentum;
DROP MATERIALIZED VIEW IF EXISTS match_streaks;
DROP MATERIALIZED VIEW IF EXISTS match_pressure_analysis;
DROP MATERIALIZED VIEW IF EXISTS player_shot_type_statistics;
DROP MATERIALIZED VIEW IF EXISTS player_set_statistics;
DROP MATERIALIZED VIEW IF EXISTS player_situation_performance;
DROP MATERIALIZED VIEW IF EXISTS player_head_to_head_statistics;
“We dropped 8 materialized views,” Caroline said. “That’s 8 fewer things to maintain, 8 fewer background refresh jobs, and a lot less disk space.”
Disk Space Savings
Before cleanup:
PostgreSQL database size: 2.4 GB
- Transactional data: 680 MB
- Materialized views: 1.72 GB
After cleanup:
PostgreSQL database size: 680 MB
ClickHouse database size: 890 MB (compressed)
Total: 1.57 GB (34% reduction)
“We saved about 830 MB,” I noted. “And queries are 11x faster.”
What Stayed in PostgreSQL
We kept transactional data in PostgreSQL:
- Event sourcing tables: matches, match_events, match_projections
- Current match state: Live scores, active matches
- Reference tables: shot_types, shot_outcomes
- Transactional queries: GetMatch, GetMatchScore, GetLatestMatches
Claude explained: “PostgreSQL excels at transactional workloads—strong consistency, ACID guarantees, foreign keys. ClickHouse excels at analytical workloads—columnar storage, vectorized execution, distributed queries.”
Challenges and Lessons Learned
Challenge 1: Column Name Mismatches
Caroline hit an error early on:
DB::Exception: Unknown identifier: outcome
“The PostgreSQL table had outcome, but the ClickHouse table had shot_outcome,” she explained. “I had to update 6 queries.”
Lesson: Consistent naming across databases is critical during migration.
Challenge 2: Subquery Cloning Errors
ClickHouse doesn’t support correlated subqueries in some contexts:
-- This fails in ClickHouse: correlated subquery referencing the outer row
SELECT s.player,
  (SELECT avg(points) FROM shots i WHERE i.team = s.team) AS avg_team_points
FROM shots s;
Caroline’s fix: Use window functions instead:
-- This works: the same aggregate expressed as a window over each team
SELECT player, avg(points) OVER (PARTITION BY team) AS avg_team_points FROM shots;
Lesson: ClickHouse window functions are more powerful than subqueries for analytical queries.
Challenge 3: Timestamp Casting
Date functions in ClickHouse require explicit casting:
-- This fails
SELECT toStartOfMinute(event_timestamp) FROM shots;
-- This works
SELECT toStartOfMinute(CAST(event_timestamp AS DateTime64)) FROM shots;
Lesson: ClickHouse type system is stricter than PostgreSQL—explicit casts prevent runtime errors.
Challenge 4: Type Consistency
We changed player from String to UInt8 across the entire stack:
// Before (domain/events.mts)
interface ShotPlayedEvent {
player: string; // '1', '2', '3', '4'
}
// After
interface ShotPlayedEvent {
player: 1 | 2 | 3 | 4; // Numeric union type
}
Caroline explained: “This cascaded through commands, events, projections, and queries. But it caught bugs—we found places where player IDs were strings when they should be numbers.”
Lesson: Strict typing prevents errors, but requires discipline during migration.
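During the transition, events written before the change still carried string players, so a small parse step at the projection boundary keeps both shapes flowing. A minimal sketch, assuming both legacy and new payloads arrive (the helper name is ours, not the production code):

type PlayerId = 1 | 2 | 3 | 4;

// Accept both legacy string IDs ('1'..'4') and the new numeric IDs, rejecting
// anything else so bad data fails loudly at the projection boundary.
function parsePlayerId(value: string | number): PlayerId {
  const numeric = typeof value === 'string' ? Number.parseInt(value, 10) : value;
  if (numeric === 1 || numeric === 2 || numeric === 3 || numeric === 4) {
    return numeric;
  }
  throw new Error(`Invalid player id: ${String(value)}`);
}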
Monitoring and Observability
Caroline added OpenTelemetry spans for all ClickHouse operations:
// SpanStatusCode comes from '@opentelemetry/api'; the tracer is injected into the writer
async write(events: DomainEvent[]): Promise<void> {
  return this.tracer.startActiveSpan('clickhouse.write', async (span) => {
    span.setAttribute('event.count', events.length);
    // rows is built from events exactly as in the earlier ClickHouseWriter example
    const rows = events.map((event) => ({
      event_id: event.id,
      match_id: event.detail.matchId,
      event_type: event.type,
      event_timestamp: new Date(event.detail.eventTime),
      event_data: JSON.stringify(event.detail),
    }));
    try {
      await this.clickhouse.insert({ table: 'shock.events', values: rows, format: 'JSONEachRow' });
      span.setStatus({ code: SpanStatusCode.OK });
    } catch (error) {
      span.recordException(error as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw error;
    } finally {
      span.end();
    }
  });
}
“Now we can see ClickHouse latency in Grafana,” she said. “Inserts, queries, errors—all traced.”
Mermaid Diagram: Migration Architecture
graph TB
subgraph EventSource["📝 Event Source"]
MATCHES[Match Events<br/>ShotPlayed, ScoreChanged]
end
subgraph MessageBus["📨 NATS JetStream"]
STREAM[MATCHES Stream<br/>At-least-once delivery]
end
subgraph Projections["🔄 Projections Service"]
CONSUMER[Event Consumer<br/>BatchDuplex 10/5s]
PG_WRITER[PostgreSQL Writer<br/>Current state]
CH_WRITER[ClickHouse Writer<br/>Analytics history]
end
subgraph PostgreSQL["💾 PostgreSQL (OLTP)"]
PG_TABLES[(matches<br/>match_events<br/>match_projections)]
PG_QUERIES[Transactional Queries<br/>GetMatch, GetMatchScore]
end
subgraph ClickHouse["📊 ClickHouse (OLAP)"]
CH_TABLES[(shots<br/>score_changes<br/>events)]
CH_QUERIES[Analytical Queries<br/>11 migrated queries]
end
subgraph API["🌐 API Layer"]
API_ROUTES[API Routes<br/>Split by workload]
end
MATCHES --> STREAM
STREAM --> CONSUMER
CONSUMER --> PG_WRITER
CONSUMER --> CH_WRITER
PG_WRITER --> PG_TABLES
CH_WRITER --> CH_TABLES
PG_TABLES --> PG_QUERIES
CH_TABLES --> CH_QUERIES
PG_QUERIES --> API_ROUTES
CH_QUERIES --> API_ROUTES
classDef source fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef message fill:#fff3e0,stroke:#f57c00,stroke-width:2px
classDef projection fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef postgres fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px
classDef clickhouse fill:#fff9c4,stroke:#f57f17,stroke-width:2px
classDef api fill:#ffebee,stroke:#c62828,stroke-width:2px
class MATCHES source
class STREAM message
class CONSUMER,PG_WRITER,CH_WRITER projection
class PG_TABLES,PG_QUERIES postgres
class CH_TABLES,CH_QUERIES clickhouse
class API_ROUTES api
Takeaways
- Separate OLTP from OLAP - PostgreSQL for transactions, ClickHouse for analytics. Each database optimized for its workload.
- Denormalize for performance - Extract hot columns to native types. Avoid JSON parsing overhead.
- Batch for efficiency - 10 events or 5 seconds. Balance latency and throughput.
- Feature flags for safe migration - Run both implementations in parallel. Compare results before switching.
- ClickHouse time-series features are powerful - ASOF JOIN, window functions, array operations.
Caroline summarized: “We went from 8 materialized views refreshing every 5 minutes to real-time ClickHouse queries that are 11x faster.”
Claude agreed: “This is how you scale analytics. PostgreSQL handles state, ClickHouse handles history, NATS keeps them in sync.”
I was just happy our dashboard loaded instantly now.