Add automatic RKNN conversion and support for semantic search model (#19676)

* Create RKNN model runner and use for jina v1 clip
* Formatting
* Handle model type inference
* Properly provide input to RKNN
* Adjust rknn conversion
* Update docs
* Formatting
* Fix path handling
* Handle inputs
* Cleanup
* Change normalization for better accuracy
* Clarify supported models
* Remove testing
2025-09-26 19:41:29 +08:00 · 2025-08-21 05:30:14 -06:00
parent efeb089ff8
commit 1be84d6833
4 changed files with 233 additions and 23 deletions
--- a/docs/docs/configuration/hardware_acceleration_enrichments.md
+++ b/docs/docs/configuration/hardware_acceleration_enrichments.md
@@ -5,11 +5,11 @@ title: Enrichments

 # Enrichments

-Some of Frigate's enrichments can use a discrete GPU for accelerated processing.
+Some of Frigate's enrichments can use a discrete GPU / NPU for accelerated processing.

 ## Requirements

-Object detection and enrichments (like Semantic Search, Face Recognition, and License Plate Recognition) are independent features. To use a GPU for object detection, see the [Object Detectors](/configuration/object_detectors.md) documentation. If you want to use your GPU for any supported enrichments, you must choose the appropriate Frigate Docker image for your GPU and configure the enrichment according to its specific documentation.
+Object detection and enrichments (like Semantic Search, Face Recognition, and License Plate Recognition) are independent features. To use a GPU / NPU for object detection, see the [Object Detectors](/configuration/object_detectors.md) documentation. If you want to use your GPU for any supported enrichments, you must choose the appropriate Frigate Docker image for your GPU / NPU and configure the enrichment according to its specific documentation.

 - **AMD**

@@ -23,6 +23,9 @@ Object detection and enrichments (like Semantic Search, Face Recognition, and Li
  - Nvidia GPUs will automatically be detected and used for enrichments in the `-tensorrt` Frigate image.
  - Jetson devices will automatically be detected and used for enrichments in the `-tensorrt-jp6` Frigate image.

+- **RockChip**
+  - RockChip NPU will automatically be detected and used for semantic search (v1 only) in the `-rk` Frigate image.
+
 Utilizing a GPU for enrichments does not require you to use the same GPU for object detection. For example, you can run the `tensorrt` Docker image for enrichments and still use other dedicated hardware like a Coral or Hailo for object detection. However, one combination that is not supported is TensorRT for object detection and OpenVINO for enrichments.

 :::note
--- a/docs/docs/configuration/semantic_search.md
+++ b/docs/docs/configuration/semantic_search.md
@@ -78,7 +78,7 @@ Switching between V1 and V2 requires reindexing your embeddings. The embeddings

 ### GPU Acceleration

-The CLIP models are downloaded in ONNX format, and the `large` model can be accelerated using GPU hardware, when available. This depends on the Docker build that is used. You can also target a specific device in a multi-GPU installation.
+The CLIP models are downloaded in ONNX format, and the `large` model can be accelerated using GPU / NPU hardware, when available. This depends on the Docker build that is used. You can also target a specific device in a multi-GPU installation.

 ```yaml
 semantic_search:
@@ -90,7 +90,7 @@ semantic_search:

 :::info

-If the correct build is used for your GPU and the `large` model is configured, then the GPU will be detected and used automatically. 
+If the correct build is used for your GPU / NPU and the `large` model is configured, then the GPU / NPU will be detected and used automatically. 
 Specify the `device` option to target a specific GPU in a multi-GPU system (see [onnxruntime's provider options](https://onnxruntime.ai/docs/execution-providers/)). 
 If you do not specify a device, the first available GPU will be used.