This documentation is provided with the HEAT environment and is relevant for this HEAT instance only.

Audio Utils Runner

The Audio Utils runner (audio-utils-runner) processes captured or uploaded audio. It normalizes input to PCM WAV, emits JSON metrics, runs speech-to-text, and combines metrics with transcripts for voice analysis.

Node template selection

| Template | Purpose | Limitations | Details |
| --- | --- | --- | --- |
| convert-to-pcm | Decode WAV/AIFF/AIFC uploads and emit PCM WAV plus a per-file acceptance report. | Non-audio blobs are skipped (node does not fail the whole batch); resample/bit-depth changes need numpy/librosa in the image. | convert-to-pcm |
| audio-metrics | Compute JSON audio metrics (duration, levels, and related fields) per accepted file. | Tolerates mixed non-audio upstream blobs; marks invalid items with parseError. | audio-metrics |
| generate-transcript | Speech-to-text (Whisper, faster-whisper, whisperx, wav2vec) keyed by content hash. | Heavy CPU/GPU and model dependencies; needs audio-metrics and PCM map inputs; air-gap sites must bundle models. | generate-transcript |
| voice-analysis | Merge metrics and transcript signals into a combined JSON analysis document. | Expects wired upstream parents; evolving output schema; not a dashboard publisher. | voice-analysis |
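
To make the audio-metrics behaviour in the table more concrete, the following is a minimal sketch of per-blob metrics with tolerant error handling. It is not the runner's implementation: it uses only the Python standard library, handles 16-bit PCM WAV only, and every name in it (describe_blob, make_test_wav, and the fields contentHash, durationSeconds, peak, rms) is hypothetical. Only the parseError marker comes from the table, and the use of SHA-256 for the content hash is an assumption.

```python
import hashlib
import io
import math
import struct
import wave


def make_test_wav(samples, sample_rate=8000):
    """Build a mono 16-bit PCM WAV in memory (helper for this example only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def describe_blob(blob: bytes) -> dict:
    """Return a metrics dict for one blob, or a dict carrying parseError."""
    # Assumption: SHA-256 stands in for whatever content hash the runner uses.
    key = hashlib.sha256(blob).hexdigest()
    try:
        with wave.open(io.BytesIO(blob), "rb") as w:
            if w.getsampwidth() != 2:
                raise wave.Error("this sketch handles 16-bit PCM only")
            rate = w.getframerate()
            channels = w.getnchannels()
            n_frames = w.getnframes()
            raw = w.readframes(n_frames)
    except (wave.Error, EOFError) as exc:
        # Non-audio or truncated input: mark the item instead of raising,
        # so one bad blob does not abort the rest of the batch.
        return {"contentHash": key, "parseError": str(exc)}

    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw[: count * 2])
    peak = max((abs(s) for s in samples), default=0)
    rms = math.sqrt(sum(s * s for s in samples) / count) if count else 0.0
    return {
        "contentHash": key,
        "durationSeconds": n_frames / rate if rate else 0.0,
        "channels": channels,
        "peak": peak,
        "rms": rms,
    }


if __name__ == "__main__":
    batch = [make_test_wav([0, 1000, -1000, 0]), b"not audio"]
    for item in batch:
        print(describe_blob(item))
```

Returning a marked record rather than raising mirrors the tolerance described for mixed upstream blobs: a caller can filter on the presence of parseError and still process every valid item.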

Typical pipeline