Shared data Namespace — Home for dm, ClickHouse, and the Ingest Tier
Status: Proposed · May 2026
Stand up a new k8s namespace, data, in the existing prod cluster as the physical home for the environment-agnostic data layer: a dedicated PostgreSQL cluster for the dm schema (plus ingest outbox and pg-boss), a namespace-neutral alias for the existing ClickHouse instance (which stays in the production namespace), and the single scraper tier from the External Data Ingestion proposal.
This is an infrastructure addendum to two in-flight proposals. It doesn't change their app-layer design, only where the shared pieces physically live.
Problem
Two proposals in flight (External Data Ingestion, Datamining Schema (dm)) both want a home for shared, environment-agnostic data and workloads that today live inside production:
- A single ingest tier (one scraper, one webhook endpoint) plus its outbox + pg-boss queue.
- The dm.* schema (channel identity, slug history, probe schedule, stream sessions, event subscriptions, scrape log, webhook dedupe, chat-export bookkeeping).
- ClickHouse: already shared by both envs today, but it lives in the production namespace, whose name implies otherwise.
Default placement in the two proposals (open-question #2 of External Data Ingestion: "reuse production's PG with an ingest.* schema") works, but it couples three independent failure domains:
- Ingest backlog can pile up against the same Postgres the production app commits user-visible transactions to.
- Backup, PITR, and resource policies are forced to a single common denominator.
- The kick-external-schema proposal explicitly contemplates a future physical extraction of dm. Landing it in production's Postgres now means doing that extraction twice.
The pre-production state is the only cheap moment to lay this layout down. Once production is live, every move is a coordinated cutover.
What's on the cluster today
| Namespace | Contains | Role |
|---|---|---|
| production | vf-pg, chi-vf-prod (ClickHouse), cloudflared | Will hold production app stack; currently hosts ClickHouse |
| stage | api, worker, webhooks, migrator, vf-pg, cloudflared | Stage app stack |
| operators | cnpg, cho (Altinity ClickHouse Operator) | DB operators |
| monitoring / ops / system | Grafana / Prom / TLS / Flux | Cluster plumbing |
ClickHouse already sits in production and is consumed by both envs — the naming lies. Stage's vf-pg is per-env user data only; production's vf-pg is being provisioned for the same role.
Proposal
Add one namespace, data, that owns the shared infrastructure layer:
    clusters/prod/data/
      releases/
        vf-pg-dm.yaml        # CNPG cluster: dm schema + ingest outbox + pg-boss
        chi-vf-alias.yaml    # ExternalName Service aliasing ClickHouse; the CHI stays in production/
        ingest.yaml          # the single scraper tier from external-data-ingestion
      routes/                # wh.verifluence.io ingress
      monitoring/            # data-namespace alert routes
1. New PostgreSQL cluster vf-pg-dm in data
Owns three schema families, none of which belong in a per-env app DB:
| Schema | Source | Why here |
|---|---|---|
| dm.* | Datamining Schema | Environment-agnostic; both envs soft-FK into it via public.streamer_channel_link.channel_id |
| ingest.* (ingest_outbox, ingest_outbox_delivery) | External Data Ingestion §INGEST-7 | Transactional outbox for the fan-out; colocated with its producer (ingest tier) |
| pgboss | Today in stage's vf-pg | Ingest fan-out job queue; belongs with the outbox, not with stage app data |
Stage and production app DBs (stage/vf-pg, production/vf-pg) hold only public.* plus auth.* after this lands. They soft-reference dm.streamer_channel.id per the kick-external-schema design — same code, different physical DB.
2. Cross-namespace alias for ClickHouse (CHI stays in production)
Updated after evaluation: the original draft proposed moving the ClickHouseInstallation (CHI) custom resource into data. We reconsidered: PVCs are namespaced, so a clean move requires a PV re-bind dance with ClickHouse downtime (or a backup/restore). The only functional gain was naming honesty, which an ExternalName Service delivers at zero cost. The CHI itself stays in production.
Add a permanent ExternalName Service chi-vf in data that CNAMEs to clickhouse-vf.production.svc.cluster.local. Every consumer points at the namespace-neutral name chi-vf.data.svc.cluster.local; kube-dns resolves the CNAME directly with no proxy hop.
The Prometheus per-pod scrape continues to target the real headless name chi-vf-prod-0-0.production.svc.cluster.local. Per-pod scraping needs a real Service with endpoints, not an ExternalName alias, and the namespace stays consistent with the CHI's actual home.
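To make the namespace-neutral name concrete from a consumer's side, here is a minimal sketch that derives the in-cluster DNS name of a Service and builds a ClickHouse HTTP URL from it. The helper names `serviceDns` and `clickhouseHttpUrl` are hypothetical, and the sketch assumes the default `cluster.local` cluster domain and the HTTP port 8123 listed under NetworkPolicies.

```typescript
// Hypothetical helpers for illustration; not taken from the codebase.

/** In-cluster DNS name of a Service, assuming the default cluster domain. */
function serviceDns(service: string, namespace: string, clusterDomain = "cluster.local"): string {
  return `${service}.${namespace}.svc.${clusterDomain}`;
}

/** ClickHouse HTTP endpoint for a host; 8123 is assumed to be the HTTP port in use. */
function clickhouseHttpUrl(host: string, port = 8123): string {
  return `http://${host}:${port}`;
}

// Consumers configure the alias in data, not the CHI's real Service in production.
const aliasHost = serviceDns("chi-vf", "data");
const clickhouseUrl = clickhouseHttpUrl(aliasHost);
```

With this shape, relocating ClickHouse later would only change the target of the ExternalName Service; the value consumers hold in `CLICKHOUSE_URL` would stay the same.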
3. Move the ingest workload to data
The External Data Ingestion proposal places ingest in the production namespace because it's "shared infra living somewhere." data is a better home — colocated with its outbox DB (vf-pg-dm) and its raw-event sink (chi-vf). wh.verifluence.io becomes a data-namespace ingress.
This resolves open questions §1 and §2 of External Data Ingestion in one move.
Why this is better than the proposals' default placement
| Concern | Reuse production PG (default) | New data namespace |
|---|---|---|
| Failure domain | Ingest pg-boss outage piles up on prod app's WAL/locks | Isolated DB; production app unaffected |
| Backup / PITR | Same policy for both — ingest's churny outbox forces prod's policy | Separate policy; dm can have tighter PITR for analytics, prod-app keeps its own |
| Resource contention | ClickHouse + scrapers compete with prod app for CPU/IO | Schedule data workloads to a separate node-pool or taint |
| Schema ownership clarity | dm.*, ingest.*, pgboss.* co-mingled with public.* in prod | One DB per concern; one Flux folder per namespace; one secret per role |
| Future DB split (kick-external-schema Phase B) | Need to extract dm out of prod — pg_dump --schema=dm + cutover | Done by construction — no second migration |
| Compliance posture | Sumsub PII payloads live next to operator/streamer accounts | Raw-payload store in its own namespace with its own RBAC + NetworkPolicy |
| Naming honesty | production namespace ≠ "shared infra" | data = shared infra; production = production app stack |
Touchpoints
Flux
- New clusters/prod/system/namespaces/data.yaml
- New clusters/prod/data/ kustomization (mirrors the shape of stage/ and production/)
- Add clusters/prod/data/releases/chi-vf-alias.yaml: ExternalName Service pointing at clickhouse-vf.production; chi-vf-prod.yaml stays in production/releases/
- New clusters/prod/data/releases/vf-pg-dm.yaml (CNPG Cluster)
- New clusters/prod/data/releases/ingest.yaml (the scraper tier from External Data Ingestion §INGEST-1)
- Ingress for wh.verifluence.io moves from stage/production to data
App secrets
- Stage + prod app pods gain DM_DATABASE_URL (read/write into vf-pg-dm)
- Ingest tier gains INGEST_DATABASE_URL (same DB; writes outbox)
- Stage + prod fan-out clients gain PGBOSS_DATABASE_URL if they need direct queue access (most don't, since fan-out is owned by ingest)
- Existing DATABASE_URL keeps pointing at the env-local Postgres for public.* / auth.* only
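The split between the env-local DSN and the shared one is easy to get wrong at deploy time, so an app pod might validate both at startup. The following is a minimal sketch under that assumption: only the variable names come from the list above, while `loadDatabaseUrls`, `requireVar`, and the `DatabaseUrls` shape are hypothetical.

```typescript
// Hypothetical startup check; only the variable names come from the proposal.
interface DatabaseUrls {
  databaseUrl: string;        // env-local Postgres: public.* / auth.*
  dmDatabaseUrl: string;      // vf-pg-dm: dm.*
  pgbossDatabaseUrl?: string; // only for clients that need direct queue access
}

type EnvVars = Record<string, string | undefined>;

/** Returns the variable's value, or throws if it is unset or empty. */
function requireVar(env: EnvVars, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

/** Collects the DSNs an app pod needs; PGBOSS_DATABASE_URL stays optional. */
function loadDatabaseUrls(env: EnvVars): DatabaseUrls {
  const urls: DatabaseUrls = {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    dmDatabaseUrl: requireVar(env, "DM_DATABASE_URL"),
  };
  const pgboss = env["PGBOSS_DATABASE_URL"];
  if (pgboss) {
    urls.pgbossDatabaseUrl = pgboss;
  }
  return urls;
}
```

In a Node process this would be called once as `loadDatabaseUrls(process.env)`, so a pod with a missing DM_DATABASE_URL fails fast instead of at its first dm.* query.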
NetworkPolicies
- stage and production namespaces: egress allowed to data/vf-pg-dm:5432 and data/chi-vf:8123,9000
- data namespace: egress to the public internet (Kick, Sumsub, X, Scrape.do, ScraperAPI); ingress only on wh.verifluence.io
- Default deny everywhere else
Migrations
- New migrator release dm-migrator owns dm.* and ingest.* migrations against vf-pg-dm. Distinct from the per-env app migrator that runs public.*.
- Phase A of kick-external-schema (migration 0123) splits into two scripts:
  - In env DBs: create public.streamer_channel_link (no FK; soft reference across DBs)
  - In vf-pg-dm: create dm.streamer_channel, dm.streamer_channel_slug_history, move/rebuild dm.* descendants
- The Phase A backfill becomes a one-off Job with two DSNs: it reads from production/vf-pg and writes to data/vf-pg-dm. Run once during cutover.
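The two-DSN backfill Job could be structured as a batched copy loop. The sketch below is one possible shape, not the planned implementation: the `ChannelRow` columns and the `BackfillSource` / `BackfillTarget` interfaces are hypothetical stand-ins for the two database connections, and it assumes rows carry a positive, monotonically ordered integer `id` so that keyset pagination works.

```typescript
// Hypothetical row shape and interfaces; the real columns are defined by migration 0123.
interface ChannelRow {
  id: number;
  slug: string;
}

/** Read side: the env DB (production/vf-pg in the proposal). */
interface BackfillSource {
  // Returns up to `limit` rows with id > afterId, ordered by id ascending.
  readBatch(afterId: number, limit: number): Promise<ChannelRow[]>;
}

/** Write side: data/vf-pg-dm. */
interface BackfillTarget {
  // Assumed idempotent (for example an upsert keyed on id) so the Job can be re-run.
  upsert(rows: ChannelRow[]): Promise<void>;
}

/** Copies all rows in id order, one batch at a time; returns the number of rows copied. */
async function backfill(
  source: BackfillSource,
  target: BackfillTarget,
  batchSize = 1000,
): Promise<number> {
  let afterId = 0; // assumes ids are positive
  let copied = 0;
  for (;;) {
    const rows = await source.readBatch(afterId, batchSize);
    const last = rows[rows.length - 1];
    if (last === undefined) {
      return copied; // empty batch: source exhausted
    }
    await target.upsert(rows);
    copied += rows.length;
    afterId = last.id; // keyset cursor for the next batch
  }
}
```

Because the write side is assumed to be an upsert, a Job that dies mid-run can simply be restarted from the beginning without producing duplicates in vf-pg-dm.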
Code
- Add DM_DB to the Env interface alongside DB; postgres() driver instances for each.
- api/src/streamer_channel.ts helper queries DM_DB; everything in public.* keeps querying DB.
- One-line touch on every existing call site that crosses the boundary. It is typed at the env level, so the compiler enforces it.
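One way to get the compiler enforcement mentioned above is to give the two clients distinguishable types. The sketch below illustrates that idea under assumptions: `SqlClient`, its `scope` tag, `findChannelById`, and `loadChannel` are hypothetical names, and the real code would use the `postgres()` driver's own client type rather than this stand-in. Only `Env`, `DB`, `DM_DB`, and `dm.streamer_channel.id` come from the proposal.

```typescript
// Hypothetical stand-in for a driver client. The `scope` tag is an assumption
// added for illustration; it lets the type checker tell the two clients apart.
interface SqlClient<Scope extends string> {
  readonly scope: Scope;
  query(text: string, params: readonly unknown[]): Promise<unknown[]>;
}

interface Env {
  DB: SqlClient<"app">;   // env-local Postgres: public.* and auth.*
  DM_DB: SqlClient<"dm">; // vf-pg-dm: dm.*
}

// Helper in the spirit of api/src/streamer_channel.ts: it accepts only the dm client.
function findChannelById(dm: SqlClient<"dm">, id: number): Promise<unknown[]> {
  return dm.query("SELECT * FROM dm.streamer_channel WHERE id = $1", [id]);
}

// A call site that crosses the boundary names DM_DB explicitly.
function loadChannel(env: Env, id: number): Promise<unknown[]> {
  return findChannelById(env.DM_DB, id);
  // findChannelById(env.DB, id) would be rejected by the type checker:
  // SqlClient<"app"> is not assignable to SqlClient<"dm">.
}
```

The tag costs one property per client and turns "queried the wrong database" from a runtime error (relation does not exist) into a compile-time one.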
Migration sequence
1. Land the namespace skeleton. data.yaml namespace + empty Flux kustomization. No workloads yet. Verify that the CHO and CNPG operators see it.
2. Add ClickHouse alias. Create the ExternalName Service chi-vf in data pointing at clickhouse-vf.production. Rolling-deploy stage apps with CLICKHOUSE_URL repointed to chi-vf.data.svc.cluster.local. The CHI itself does not move.
3. Provision vf-pg-dm. CNPG Cluster + initial backup target. Empty schemas. No app traffic yet.
4. Stand up ingest tier in data. Per External Data Ingestion §INGEST-1, write-only to vf-pg-dm and chi-vf. Pollers in dry-run mode (outbox writes, fan-out paused).
5. Run kick-external-schema Phase A (migration 0123) against vf-pg-dm + env DBs in parallel. Backfill dm.streamer_channel from production/vf-pg.public.streamer_channels.
6. Switch app code to use DM_DB. Repoint reads/writes per the kick-external-schema code-change-scope table.
7. Enable fan-out (External Data Ingestion §INGEST-8). Stage starts consuming. Once green, strip stage's direct pollers (§INGEST-2).
8. Production launches against the final layout, with no second cutover.
Steps 1–3 are idempotent and reversible. Step 4 onward inherits the rollout plan in the External Data Ingestion proposal.
Open questions
- CNPG topology for vf-pg-dm. Is a single primary + WAL archive enough for dm (mostly append/upsert, ~10–20k writes/h)? Or 1 primary + 1 replica from day one?
- auth.* placement. Stays per-env (user-account data, not env-agnostic). Confirm; nothing in the proposals suggests otherwise.
- ClickHouse cutover safety: resolved. CHI stays in production; the data/chi-vf ExternalName provides cross-namespace DNS without moving the workload.
- Naming. data vs dm vs ingest vs shared. Recommendation: data, which covers PG + CH + ingest workload + future Kafka/JetStream without renaming.
- Backups across two PG clusters. Standardise on the same CNPG backup secret + S3 bucket prefix per cluster (vf-pg-prod/, vf-pg-stage/, vf-pg-dm/). Different retention per cluster is fine.
- Production launch coupling. This proposal is on the critical path only if we want production to launch against the final layout. If launch slips, this can ship after, at the cost of one extra dm-extraction migration later.