Audio transcription support (#18398)

* install new packages for transcription support

* add config options

* audio maintainer modifications to support transcription

* pass main config to audio process

* embeddings support

* api and transcription post processor

* embeddings maintainer support for post processor

* live audio transcription with sherpa and faster-whisper

* update dispatcher with live transcription topic

* frontend websocket

* frontend live transcription

* frontend changes for speech events

* i18n changes

* docs

* mqtt docs

* fix linter

* use float16 and small model on gpu for real-time

* fix return value and use requestor to embed description instead of passing embeddings

* run real-time transcription in its own thread

* tweaks

* publish live transcriptions on their own topic instead of tracked_object_update

* config validator and docs

* clarify docs
This commit is contained in:
Josh Hawkins
2025-05-27 10:26:00 -05:00
committed by Blake Blackshear
parent 2385c403ee
commit 6dc36fcbb4
29 changed files with 2322 additions and 51 deletions

View File

@@ -71,3 +71,8 @@ prometheus-client == 0.21.*
# TFLite
tflite_runtime @ https://github.com/frigate-nvr/TFlite-builds/releases/download/v2.17.1/tflite_runtime-2.17.1-cp311-cp311-linux_x86_64.whl; platform_machine == 'x86_64'
tflite_runtime @ https://github.com/feranick/TFlite-builds/releases/download/v2.17.1/tflite_runtime-2.17.1-cp311-cp311-linux_aarch64.whl; platform_machine == 'aarch64'
# audio transcription
sherpa-onnx==1.12.*
faster-whisper==1.1.*
librosa==0.11.*
soundfile==0.13.*