Bulk analytics operations

This page covers deployment, configuration, monitoring, retention, and troubleshooting for HEAT Bulk Analytics DB and the bulk analytics pipeline.

Deployment checklist

Standard deploy/k8s manifests include:

Resource	Namespace	Purpose
`heat-bulk-analytics-postgres` StatefulSet	`heat`	PostgreSQL 15, 32Gi PVC
`heat-bulk-analytics-postgres` Service	`heat`	Cluster-internal DNS
`heat-bulk-analytics-postgres-secrets`	`heat`	`POSTGRES_USER`, `POSTGRES_PASSWORD`, connection string

After apply:

Confirm pod heat-bulk-analytics-postgres-0 is Running.
Rebuild and redeploy core-api and system-utils when bulk analytics code changes.
Restart core-api so setup registers DataSource HEAT Bulk Analytics DB (same automatic pattern as HEAT Managed Object Store).
Confirm system-utils runner is linked to node-output-query, system-bulk-tabular-writer, and system-bulk-analytics-query (preset bootstrap on core-api startup).

No manual DataSource entry in Cluster Manager is required for standard deploys.
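
The post-apply checks above can be scripted. The following is a minimal sketch, not a supported tool: the namespace and pod name come from the manifest table, while the Deployment name core-api is an assumption about how core-api is deployed in your cluster.

```shell
#!/usr/bin/env bash
# Sketch only. NAMESPACE and POD come from the manifest table above;
# the Deployment name "core-api" is an assumption, adjust it to your cluster.
NAMESPACE="heat"
POD="heat-bulk-analytics-postgres-0"
CORE_API_DEPLOYMENT="core-api"

# Print the pod phase reported by Kubernetes (for example "Running").
pod_phase() {
  kubectl -n "$NAMESPACE" get pod "$POD" -o jsonpath='{.status.phase}'
}

# Restart core-api only once the analytics Postgres pod is Running,
# so that setup can register the DataSource on startup.
restart_core_api_if_ready() {
  local phase
  phase="$(pod_phase)"
  if [ "$phase" != "Running" ]; then
    echo "pod $POD is in phase '$phase'; not restarting core-api" >&2
    return 1
  fi
  kubectl -n "$NAMESPACE" rollout restart "deployment/$CORE_API_DEPLOYMENT"
}
```

Call restart_core_api_if_ready after applying the manifests; a non-zero exit status means the pod was not yet Running and nothing was restarted.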

Platform configuration

Key	Default	Purpose
`bulk_analytics.enabled`	`true`	When `true`, deleting a session drops, on a best-effort basis, the `bulk_ni_{nodeInstanceId}` database of each bulk-writer node on that session

To opt out of retention cleanup only (ingest still works when Postgres is deployed), do either of the following:

Set platform key bulk_analytics.enabled to false, or
Set environment variable BULK_ANALYTICS_ENABLED=false on core-api before startup.

The DataSource is still registered when analytics Postgres is deployed; the key does not gate DataSource registration.
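
For the environment-variable route, one option is kubectl set env. This is a hedged sketch: the Deployment name core-api is an assumption, and if your manifests are managed declaratively the variable belongs in the manifest instead.

```shell
#!/usr/bin/env bash
# Sketch only. The Deployment name "core-api" is an assumption.
NAMESPACE="heat"
CORE_API_DEPLOYMENT="core-api"

# Set BULK_ANALYTICS_ENABLED=false on the core-api pod template.
# Changing the pod template rolls out new pods, so the value is in place
# before the new core-api process starts.
disable_retention_cleanup() {
  kubectl -n "$NAMESPACE" set env "deployment/$CORE_API_DEPLOYMENT" \
    BULK_ANALYTICS_ENABLED=false
}
```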

See the Platform configuration page for operator-facing keys.

Resource planning

Resource	Guidance
Analytics Postgres PVC	32Gi base; monitor growth during large ingests
system-utils memory	Allow headroom for large `outputIds` JSON and CSV parsing (deployment supports up to 4Gi)
Ingest duration	Sequential per-output ingest; 100k+ outputs may take hours
Query `maxRows`	Cap large SELECT results (default 10000)

Monitoring ingest

Watch the writer node in Cluster Manager or session detail:

Signal	Meaning
`StatusDetails`: `Starting bulk ingest:`	Pipeline started
`StatusDetails`: `Ingesting output …`	Working through `outputIds`
`StatusDetails`: `Ingested output …`	Output completed
`StatusDetails`: `Skipped (already ingested)`	Idempotent re-run found the output in the `_ingested_outputs` ledger
`LastState`: `Processing`	Still running
`LastState`: `ProcessingSucceeded`	Catalog emitted

Reprocessing after a failure is safe: the _ingested_outputs ledger prevents duplicate inserts for output ids that already completed.
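
If StatusDetails strings are collected outside the UI, the signal table can be applied mechanically. The sketch below matches only the prefixes listed in the table; the rest of each message is not documented here, and how the strings are obtained is outside the sketch. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: map a StatusDetails string to the meaning in the signal table.
# Only the documented prefixes are matched; anything else is "unknown".
classify_status_details() {
  case "$1" in
    "Starting bulk ingest:"*)       echo "pipeline started" ;;
    "Ingesting output "*)           echo "working through outputIds" ;;
    "Ingested output "*)            echo "output completed" ;;
    *"Skipped (already ingested)"*) echo "skipped, already in ledger" ;;
    *)                              echo "unknown" ;;
  esac
}
```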

Retention and cleanup

When bulk_analytics.enabled is true, deleting a session triggers best-effort DROP DATABASE bulk_ni_{nodeInstanceId} for each system-bulk-tabular-writer node on that session. Failure to reach analytics Postgres does not block session delete.
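
Because the drop is best-effort, it can be useful to check which bulk_ni_ databases still exist. A minimal sketch, assuming psql is run inside the Postgres pod with the heat_bulk_analytics role used in the troubleshooting section of this page:

```shell
#!/usr/bin/env bash
# Sketch only. Pod name and Postgres role are taken from this page;
# running psql through kubectl exec is one option among several.
NAMESPACE="heat"
POD="heat-bulk-analytics-postgres-0"
PG_USER="heat_bulk_analytics"

# Print one bulk_ni_* database name per line.
# The backslashes escape "_" so LIKE treats it literally, not as a wildcard.
list_bulk_databases() {
  kubectl -n "$NAMESPACE" exec "$POD" -- \
    psql -U "$PG_USER" -d postgres -At \
    -c "SELECT datname FROM pg_database WHERE datname LIKE 'bulk\\_ni\\_%' ORDER BY datname;"
}
```

Databases that remain for sessions already deleted are candidates for manual cleanup under your retention policy.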

Analytics Postgres holds derivative telemetry (including user GUID columns from CSV). Govern cluster-internal access and retention according to your deployment policy. Do not expose analytics Postgres via public ingress.

Troubleshooting

`DataSource 'HEAT Bulk Analytics DB' not configured`

Cause: core-api setup has not registered the DataSource, or core-api started before analytics Postgres was available.

Fix:

Confirm heat-bulk-analytics-postgres pod is Running.
Restart core-api (setup runs on startup and updates stale connection strings when secrets change).

`28P01: password authentication failed`

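Cause: Postgres was initialized with credentials that do not match the DataSource connection string. This often happens when secrets are corrected after the PVC was first initialized, because Postgres applies the secret credentials only when it initializes an empty data directory.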

Fix (keep data):


```shell
kubectl -n heat exec -it heat-bulk-analytics-postgres-0 -- \
  psql -U heat_bulk_analytics -d postgres -c "ALTER USER heat_bulk_analytics PASSWORD '<password-from-secret>';"
```

Fix (fresh database): delete the StatefulSet PVC and redeploy so Postgres re-initializes with current secrets.
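
The <password-from-secret> value used in the keep-data fix can be read from the Kubernetes secret. A hedged sketch: the secret name and the POSTGRES_PASSWORD key come from the manifest table on this page, and it is an assumption that this key holds the same password the DataSource connection string uses.

```shell
#!/usr/bin/env bash
# Sketch only. Secret name and key come from the manifest table on this page;
# that the key matches the DataSource password is an assumption.
NAMESPACE="heat"
SECRET="heat-bulk-analytics-postgres-secrets"

# Print the decoded POSTGRES_PASSWORD value from the secret.
read_postgres_password() {
  kubectl -n "$NAMESPACE" get secret "$SECRET" \
    -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d
}
```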

`No runner available for this node template`

Cause: The node template is not linked to the system-utils runner in the environment database.

Fix: Restart core-api after deploy (preset bootstrap links bulk templates). Reprocess the node.

Writer slow but progressing

This is expected for v1, which fetches outputs sequentially, parses each CSV, and issues one INSERT per row. Progress updates in StatusDetails confirm that the writer is healthy.

All columns are `text` in catalog

Cause: table was created before column type inference was deployed, or column values were too mixed to infer types.

Fix: drop bulk_ni_{writerNodeInstanceId} and re-run the writer on a fresh database.
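
Dropping the writer database could look like the sketch below. It assumes the database name is exactly bulk_ni_ followed by the node instance id as given, and that psql is run inside the Postgres pod as on the rest of this page. The function name is hypothetical, and the command deletes all ingested data for that writer.

```shell
#!/usr/bin/env bash
# Sketch only. Pod name and Postgres role are taken from this page.
NAMESPACE="heat"
POD="heat-bulk-analytics-postgres-0"
PG_USER="heat_bulk_analytics"

# Drop bulk_ni_<id>. The id is validated first so that nothing other than
# letters, digits, "_" and "-" can reach the SQL statement.
drop_bulk_database() {
  local node_instance_id="$1"
  case "$node_instance_id" in
    ""|*[!A-Za-z0-9_-]*)
      echo "refusing unexpected id: '$node_instance_id'" >&2
      return 1
      ;;
  esac
  kubectl -n "$NAMESPACE" exec "$POD" -- \
    psql -U "$PG_USER" -d postgres \
    -c "DROP DATABASE IF EXISTS \"bulk_ni_${node_instance_id}\";"
}
```

After the drop, re-run the writer node so it recreates the database with inferred column types.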

Query fails on numeric comparison

Cause: the column remained text (see above), or the SQL references the wrong quoted identifier.

Fix: use the qualifiedFrom and valueKind values from the catalog; add CAST only for string columns.

Security notes

Analytics Postgres is cluster-internal only.
system-bulk-analytics-query allows SELECT only; no writes through the query node.
CSV may contain external user GUIDs; resolve display names in presentation layers per HEAT pseudonymisation rules, not inside bulk SQL.

Core API inspection (operators)

For tools that cannot reach analytics Postgres directly, Core API exposes read-only /api/bulk-analytics routes (writer nodeInstanceId, catalog, table list, ad-hoc SQL). These routes use the same authentication (HEAT_API_TOKEN) and network trust as other Core API calls. The full endpoint catalog is in the internal engineering doc _internal/bulk-analytics-api (not on the public docs nav).