Bulk analytics operations
This page covers deployment, configuration, monitoring, retention, and troubleshooting for HEAT Bulk Analytics DB and the bulk analytics pipeline.
Deployment checklist
Standard deploy/k8s manifests include:
| Resource | Namespace | Purpose |
|---|---|---|
| heat-bulk-analytics-postgres StatefulSet | heat | PostgreSQL 15, 32Gi PVC |
| heat-bulk-analytics-postgres Service | heat | Cluster-internal DNS |
| heat-bulk-analytics-postgres-secrets | heat | POSTGRES_USER, POSTGRES_PASSWORD, connection string |
After apply:
- Confirm pod heat-bulk-analytics-postgres-0 is Running.
- Rebuild and redeploy core-api and system-utils when bulk analytics code changes.
- Restart core-api so setup registers DataSource HEAT Bulk Analytics DB (same automatic pattern as HEAT Managed Object Store).
- Confirm system-utils runner is linked to node-output-query, system-bulk-tabular-writer, and system-bulk-analytics-query (preset bootstrap on core-api startup).
No manual DataSource entry in Cluster Manager is required for standard deploys.
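The post-apply checks above could be scripted roughly as follows. This is a sketch under assumptions, not part of the standard manifests: it assumes core-api runs as a Deployment named core-api in the heat namespace, which this page does not confirm, and the function names are hypothetical.

```shell
# Sketch of the post-apply checks. Assumption: core-api is a Deployment
# named "core-api" in the "heat" namespace; adjust to your manifests.
NAMESPACE="${NAMESPACE:-heat}"
PG_POD="heat-bulk-analytics-postgres-0"

# Print the pod phase (for example "Running") reported by the API server.
pg_pod_phase() {
  kubectl -n "$NAMESPACE" get pod "$PG_POD" -o jsonpath='{.status.phase}'
}

# Restart core-api only when analytics Postgres is Running, so that setup
# can register the DataSource on startup.
restart_core_api_if_pg_running() {
  phase="$(pg_pod_phase)"
  if [ "$phase" != "Running" ]; then
    echo "analytics Postgres pod is '$phase', not Running; skipping restart" >&2
    return 1
  fi
  kubectl -n "$NAMESPACE" rollout restart deployment/core-api
  kubectl -n "$NAMESPACE" rollout status deployment/core-api --timeout=180s
}
```

Calling restart_core_api_if_pg_running after an apply covers the first and third checklist items; the runner links still need to be confirmed in Cluster Manager.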
Platform configuration
| Key | Default | Purpose |
|---|---|---|
| bulk_analytics.enabled | true | When true, session delete best-effort drops bulk_ni_{nodeInstanceId} databases for bulk-writer nodes on that session |
To opt out of retention cleanup only (ingest still works when Postgres is deployed), do either of the following:
- Set platform key bulk_analytics.enabled to false, or
- Set environment variable BULK_ANALYTICS_ENABLED=false on core-api before startup.
The DataSource is still registered when analytics Postgres is deployed; the key does not gate DataSource registration.
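One possible way to apply the environment-variable opt-out on a running cluster is kubectl set env. The sketch below assumes core-api is a Deployment named core-api in the heat namespace (not confirmed by this page); the function names are hypothetical.

```shell
# Assumption: core-api is a Deployment named "core-api" in namespace "heat".
# Setting the variable changes the pod template, so Kubernetes rolls out
# new core-api pods that read the value at startup.
disable_bulk_retention_cleanup() {
  kubectl -n heat set env deployment/core-api BULK_ANALYTICS_ENABLED=false
}

# Show the value currently set on the Deployment (prints nothing if unset).
show_bulk_analytics_env() {
  kubectl -n heat set env deployment/core-api --list \
    | grep '^BULK_ANALYTICS_ENABLED=' || true
}
```

A value set this way may be overwritten the next time the manifests are applied, so a permanent opt-out probably belongs in the manifests or the platform key.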
See Platform configuration for operator-facing keys.
Resource planning
| Resource | Guidance |
|---|---|
| Analytics Postgres PVC | 32Gi base; monitor growth during large ingests |
| system-utils memory | Allow headroom for large outputIds JSON and CSV parsing (deployment supports up to 4Gi) |
| Ingest duration | Sequential per-output ingest; 100k+ outputs may take hours |
| Query maxRows | Cap large SELECT results (default 10000) |
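To watch PVC growth during a large ingest, one option is to ask Postgres for the size of each bulk database. The sketch below reuses the pod and role names from the troubleshooting command on this page; it assumes that role is allowed to call pg_database_size on the bulk databases, and the function name is hypothetical.

```shell
# List the on-disk size of every bulk_ni_* database, largest first.
# The backslashes make LIKE treat the underscores as literal characters.
bulk_db_sizes() {
  kubectl -n heat exec heat-bulk-analytics-postgres-0 -- \
    psql -U heat_bulk_analytics -d postgres -At -c \
    "SELECT datname, pg_size_pretty(pg_database_size(datname))
       FROM pg_database
      WHERE datname LIKE 'bulk\_ni\_%'
      ORDER BY pg_database_size(datname) DESC;"
}
```

Running bulk_db_sizes periodically during an ingest gives a rough growth rate to compare against the 32Gi base.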
Monitoring ingest
Watch the writer node in Cluster Manager or session detail:
| Signal | Meaning |
|---|---|
| StatusDetails: Starting bulk ingest: | Pipeline started |
| StatusDetails: Ingesting output … | Working through outputIds |
| StatusDetails: Ingested output … | Output completed |
| StatusDetails: Skipped (already ingested) | Idempotent re-run hit ledger |
| LastState: Processing | Still running |
| LastState: ProcessingSucceeded | Catalog emitted |
Reprocess after failure is safe: _ingested_outputs prevents duplicate inserts for completed output ids.
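As a cross-check on StatusDetails, the ledger itself can be counted. This is a sketch under assumptions: it takes the database name to be the literal prefix bulk_ni_ plus the node instance id as written on this page, and it assumes _ingested_outputs is visible on the default search_path. The function name is hypothetical.

```shell
# Count ledger rows (completed outputs) for one writer node.
# Assumptions: database name is bulk_ni_<nodeInstanceId>, and the
# _ingested_outputs table is on the default search_path.
ingested_output_count() {
  node_instance_id="$1"
  kubectl -n heat exec heat-bulk-analytics-postgres-0 -- \
    psql -U heat_bulk_analytics -d "bulk_ni_${node_instance_id}" -At -c \
    'SELECT count(*) FROM _ingested_outputs;'
}
```

A count that keeps rising between calls suggests the writer is still making progress; a count that is unchanged after a reprocess of already completed outputs is consistent with the idempotent skip.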
Retention and cleanup
When bulk_analytics.enabled is true, deleting a session triggers best-effort DROP DATABASE bulk_ni_{nodeInstanceId} for each system-bulk-tabular-writer node on that session. Failure to reach analytics Postgres does not block session delete.
Analytics Postgres holds derivative telemetry (including user GUID columns from CSV). Manage cluster-internal access and retention according to your deployment policy. Do not expose analytics Postgres via public ingress.
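Because the drop on session delete is best-effort, databases can be left behind when analytics Postgres was unreachable. The sketch below shows one way to list them and drop one by hand. It reuses the pod and role names from the troubleshooting command on this page and assumes that role may drop the databases; the function names are hypothetical.

```shell
# List every bulk_ni_* database that currently exists.
list_bulk_dbs() {
  kubectl -n heat exec heat-bulk-analytics-postgres-0 -- \
    psql -U heat_bulk_analytics -d postgres -At -c \
    "SELECT datname FROM pg_database WHERE datname LIKE 'bulk\_ni\_%' ORDER BY datname;"
}

# Drop one database by its exact name as printed by list_bulk_dbs.
# The name is double-quoted in SQL so names containing hyphens stay valid.
drop_bulk_db() {
  db_name="$1"
  case "$db_name" in
    bulk_ni_*) ;;
    *) echo "refusing to drop '$db_name': not a bulk_ni_ database" >&2
       return 1 ;;
  esac
  case "$db_name" in
    *[!A-Za-z0-9_-]*)
       echo "refusing to drop '$db_name': unexpected characters" >&2
       return 1 ;;
  esac
  kubectl -n heat exec heat-bulk-analytics-postgres-0 -- \
    psql -U heat_bulk_analytics -d postgres -c \
    "DROP DATABASE IF EXISTS \"${db_name}\";"
}
```

DROP DATABASE fails while other sessions are connected to the target database, so a manual drop may need to wait until the writer and any queries have finished.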
Troubleshooting
DataSource 'HEAT Bulk Analytics DB' not configured
Cause: core-api setup has not registered the DataSource, or core-api started before analytics Postgres was available.
Fix:
- Confirm heat-bulk-analytics-postgres pod is Running.
- Restart core-api (setup runs on startup and updates stale connection strings when secrets change).
28P01: password authentication failed
Cause: Postgres was initialized with credentials that do not match the DataSource connection string (often after secret fixes on an existing PVC).
Fix (keep data):
kubectl -n heat exec -it heat-bulk-analytics-postgres-0 -- \
  psql -U heat_bulk_analytics -d postgres -c "ALTER USER heat_bulk_analytics PASSWORD '<password-from-secret>';"
Fix (fresh database): delete the StatefulSet PVC and redeploy so Postgres re-initializes with current secrets.
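For the keep-data fix, the password placeholder can be filled from the secret listed in the deployment checklist. A minimal sketch, assuming the secret stores the password under the key POSTGRES_PASSWORD as that table suggests; the function name is hypothetical.

```shell
# Read POSTGRES_PASSWORD from heat-bulk-analytics-postgres-secrets.
# Secret data is base64-encoded by Kubernetes, so decode it before use.
pg_password_from_secret() {
  kubectl -n heat get secret heat-bulk-analytics-postgres-secrets \
    -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 --decode
}
```

The decoded value is printed to standard output, so prefer capturing it in a variable over echoing it to a shared terminal or log.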
No runner available for this node template
Cause: Node template not linked to system-utils runner in the environment database.
Fix: Restart core-api after deploy (preset bootstrap links bulk templates). Reprocess the node.
Writer slow but progressing
Expected for v1: sequential output fetch, CSV parse, and per-row INSERT. Progress in StatusDetails confirms healthy operation.
All columns are text in catalog
Cause: table was created before column type inference was deployed, or column values were too mixed to infer types.
Fix: drop bulk_ni_{writerNodeInstanceId} and re-run the writer on a fresh database.
Query fails on numeric comparison
Cause: the column remained text (see above), or the SQL references the wrong quoted identifier.
Fix: use catalog qualifiedFrom and valueKind; add CAST only for string columns.
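When the catalog is not at hand, the column types the writer actually created can be read from the standard information_schema. The sketch below reuses the pod and role names from the password fix on this page; the function name and the example column are hypothetical.

```shell
# List column types in one bulk database. A data_type of "text" on a column
# you expected to be numeric matches the "All columns are text" case.
# For such a column a comparison needs an explicit cast, for example
#   WHERE CAST("duration_ms" AS numeric) > 100
# where "duration_ms" is a hypothetical column name.
bulk_column_types() {
  db_name="$1"   # exact name, for example bulk_ni_<writerNodeInstanceId>
  kubectl -n heat exec heat-bulk-analytics-postgres-0 -- \
    psql -U heat_bulk_analytics -d "$db_name" -At -c \
    "SELECT table_schema, table_name, column_name, data_type
       FROM information_schema.columns
      WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
      ORDER BY table_schema, table_name, ordinal_position;"
}
```

The column_name values in the output are the exact identifiers to double-quote in SQL, which helps rule out the wrong-identifier cause.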
Security notes
- Analytics Postgres is cluster-internal only.
- system-bulk-analytics-query allows SELECT only; no writes through the query node.
- CSV may contain external user GUIDs; resolve display names in presentation layers per HEAT pseudonymisation rules, not inside bulk SQL.
Core API inspection (operators)
For tools that cannot reach analytics Postgres directly, Core API exposes read-only /api/bulk-analytics routes (writer nodeInstanceId, catalog, table list, ad-hoc SQL). These routes use the same auth and network trust as other Core API calls (HEAT_API_TOKEN). The full endpoint catalog is in the internal engineering doc _internal/bulk-analytics-api (not on the public docs nav).