Compare commits

...

7 Commits

Author SHA1 Message Date
Josh Hawkins
aa1c0ded62 more docs clarity 2025-10-03 06:41:43 -05:00
Josh Hawkins
248d934d89 improve live view console errors 2025-10-03 06:36:54 -05:00
Nicolas Mowen
2d45ea271e Refactor object genai to be a post-processor (#20331)
* Refactor object genai to be a post-processor

* Include function correctly
2025-10-02 12:48:11 -06:00
Nicolas Mowen
37999abbe6 Improve review summary performance (#20328)
* Undo vite

* Balance the prompt

* Round duration

* Calculate context size to determine number of images

* Increase number of images
2025-10-02 10:17:25 -05:00
Nicolas Mowen
2030809a6d Make keyboard shortcuts consistent (#20326)
* Make keyboard shortcuts consistent

* Cleanup

* Refactor prevent default to not require separate input

* Fix

* Implement escape for reviews

* Implement escape for explore

* Send content ref to get page changes for free
2025-10-02 07:21:37 -06:00
Josh Hawkins
85ace6a6be Add input focused boolean to face library keyboard listener (#20325)
Because the "a" key is used by the keyboard listener for select all, this would prevent it from being used in the tracked object details pane. This change mimics what is already done in Explore.
2025-10-02 06:31:09 -06:00
Nicolas Mowen
ed6b892200 Fix object genai prompt access (#20322) 2025-10-02 05:48:16 -06:00
31 changed files with 755 additions and 451 deletions

View File

@@ -250,6 +250,7 @@ Note that disabling a camera through the config file (`enabled: False`) removes
- Check go2rtc configuration for transcoding (e.g., audio to AAC/OPUS).
- Test with a different stream via the UI dropdown (if `live -> streams` is configured).
- For WebRTC-specific issues, ensure port 8555 is forwarded and candidates are set (see [WebRTC Extra Configuration](#webrtc-extra-configuration)).
- If your cameras are streaming at a high resolution, your browser may be struggling to load all of the streams before the buffering timeout occurs. Frigate prioritizes showing a true live view as quickly as possible. If the fallback occurs often, change your live view settings to use a lower bandwidth substream.
3. **It doesn't seem like my cameras are streaming on the Live dashboard. Why?**

View File

@@ -0,0 +1,349 @@
"""Post processor for object descriptions using GenAI."""
import datetime
import logging
import os
import threading
from pathlib import Path
from typing import TYPE_CHECKING, Any
import cv2
import numpy as np
from peewee import DoesNotExist
from frigate.comms.inter_process import InterProcessRequestor
from frigate.config import CameraConfig, FrigateConfig
from frigate.const import CLIPS_DIR, UPDATE_EVENT_DESCRIPTION
from frigate.data_processing.post.semantic_trigger import SemanticTriggerProcessor
from frigate.data_processing.types import PostProcessDataEnum
from frigate.genai import GenAIClient
from frigate.models import Event
from frigate.types import TrackedObjectUpdateTypesEnum
from frigate.util.builtin import EventsPerSecond, InferenceSpeed
from frigate.util.image import create_thumbnail, ensure_jpeg_bytes
from frigate.util.path import get_event_thumbnail_bytes
if TYPE_CHECKING:
from frigate.embeddings import Embeddings
from ..post.api import PostProcessorApi
from ..types import DataProcessorMetrics
logger = logging.getLogger(__name__)
MAX_THUMBNAILS = 10
class ObjectDescriptionProcessor(PostProcessorApi):
def __init__(
self,
config: FrigateConfig,
embeddings: "Embeddings",
requestor: InterProcessRequestor,
metrics: DataProcessorMetrics,
client: GenAIClient,
semantic_trigger_processor: SemanticTriggerProcessor | None,
):
super().__init__(config, metrics, None)
self.config = config
self.embeddings = embeddings
self.requestor = requestor
self.metrics = metrics
self.genai_client = client
self.semantic_trigger_processor = semantic_trigger_processor
self.tracked_events: dict[str, list[Any]] = {}
self.early_request_sent: dict[str, bool] = {}
self.object_desc_speed = InferenceSpeed(self.metrics.object_desc_speed)
self.object_desc_dps = EventsPerSecond()
self.object_desc_dps.start()
def __handle_frame_update(
self, camera: str, data: dict, yuv_frame: np.ndarray
) -> None:
"""Handle an update to a frame for an object."""
camera_config = self.config.cameras[camera]
# no need to save our own thumbnails if genai is not enabled
# or if the object has become stationary
if not data["stationary"]:
if data["id"] not in self.tracked_events:
self.tracked_events[data["id"]] = []
data["thumbnail"] = create_thumbnail(yuv_frame, data["box"])
# Limit the number of thumbnails saved
if len(self.tracked_events[data["id"]]) >= MAX_THUMBNAILS:
# Always keep the first thumbnail for the event
self.tracked_events[data["id"]].pop(1)
self.tracked_events[data["id"]].append(data)
# check if we're configured to send an early request after a minimum number of updates received
if camera_config.objects.genai.send_triggers.after_significant_updates:
if (
len(self.tracked_events.get(data["id"], []))
>= camera_config.objects.genai.send_triggers.after_significant_updates
and data["id"] not in self.early_request_sent
):
if data["has_clip"] and data["has_snapshot"]:
event: Event = Event.get(Event.id == data["id"])
if (
not camera_config.objects.genai.objects
or event.label in camera_config.objects.genai.objects
) and (
not camera_config.objects.genai.required_zones
or set(data["entered_zones"])
& set(camera_config.objects.genai.required_zones)
):
logger.debug(f"{camera} sending early request to GenAI")
self.early_request_sent[data["id"]] = True
threading.Thread(
target=self._genai_embed_description,
name=f"_genai_embed_description_{event.id}",
daemon=True,
args=(
event,
[
data["thumbnail"]
for data in self.tracked_events[data["id"]]
],
),
).start()
def __handle_frame_finalize(
self, camera: str, event: Event, thumbnail: bytes
) -> None:
"""Handle the finalization of a frame."""
camera_config = self.config.cameras[camera]
if (
camera_config.objects.genai.enabled
and camera_config.objects.genai.send_triggers.tracked_object_end
and (
not camera_config.objects.genai.objects
or event.label in camera_config.objects.genai.objects
)
and (
not camera_config.objects.genai.required_zones
or set(event.zones) & set(camera_config.objects.genai.required_zones)
)
):
self._process_genai_description(event, camera_config, thumbnail)
def __regenerate_description(self, event_id: str, source: str, force: bool) -> None:
"""Regenerate the description for an event."""
try:
event: Event = Event.get(Event.id == event_id)
except DoesNotExist:
logger.error(f"Event {event_id} not found for description regeneration")
return
if self.genai_client is None:
logger.error("GenAI not enabled")
return
camera_config = self.config.cameras[event.camera]
if not camera_config.objects.genai.enabled and not force:
logger.error(f"GenAI not enabled for camera {event.camera}")
return
thumbnail = get_event_thumbnail_bytes(event)
# ensure we have a jpeg to pass to the model
thumbnail = ensure_jpeg_bytes(thumbnail)
logger.debug(
f"Trying {source} regeneration for {event}, has_snapshot: {event.has_snapshot}"
)
if event.has_snapshot and source == "snapshot":
snapshot_image = self._read_and_crop_snapshot(event)
if not snapshot_image:
return
embed_image = (
[snapshot_image]
if event.has_snapshot and source == "snapshot"
else (
[data["thumbnail"] for data in self.tracked_events[event_id]]
if len(self.tracked_events.get(event_id, [])) > 0
else [thumbnail]
)
)
self._genai_embed_description(event, embed_image)
def process_data(self, frame_data: dict, data_type: PostProcessDataEnum) -> None:
"""Process a frame update."""
self.metrics.object_desc_dps.value = self.object_desc_dps.eps()
if data_type != PostProcessDataEnum.tracked_object:
return
state: str | None = frame_data.get("state", None)
if state is not None:
logger.debug(f"Processing {state} for {frame_data['camera']}")
if state == "update":
self.__handle_frame_update(
frame_data["camera"], frame_data["data"], frame_data["yuv_frame"]
)
elif state == "finalize":
self.__handle_frame_finalize(
frame_data["camera"], frame_data["event"], frame_data["thumbnail"]
)
def handle_request(self, topic: str, data: dict[str, Any]) -> str | None:
"""Handle a request."""
if topic == "regenerate_description":
self.__regenerate_description(
data["event_id"], data["source"], data["force"]
)
return None
def _read_and_crop_snapshot(self, event: Event) -> bytes | None:
"""Read, decode, and crop the snapshot image."""
snapshot_file = os.path.join(CLIPS_DIR, f"{event.camera}-{event.id}.jpg")
if not os.path.isfile(snapshot_file):
logger.error(
f"Cannot load snapshot for {event.id}, file not found: {snapshot_file}"
)
return None
try:
with open(snapshot_file, "rb") as image_file:
snapshot_image = image_file.read()
img = cv2.imdecode(
np.frombuffer(snapshot_image, dtype=np.int8),
cv2.IMREAD_COLOR,
)
# Crop snapshot based on region
# provide full image if region doesn't exist (manual events)
height, width = img.shape[:2]
x1_rel, y1_rel, width_rel, height_rel = event.data.get(
"region", [0, 0, 1, 1]
)
x1, y1 = int(x1_rel * width), int(y1_rel * height)
cropped_image = img[
y1 : y1 + int(height_rel * height),
x1 : x1 + int(width_rel * width),
]
_, buffer = cv2.imencode(".jpg", cropped_image)
return buffer.tobytes()
except Exception:
return None
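# Illustrative note (not part of the original diff): event.data["region"] is read
# here as relative [x, y, width, height] fractions, so for a 1920x1080 snapshot with
# a hypothetical region of [0.25, 0.25, 0.5, 0.5] the crop above covers x 480..1440
# and y 270..810, while the [0, 0, 1, 1] default keeps the full frame for manual
# events that have no region.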
def _process_genai_description(
self, event: Event, camera_config: CameraConfig, thumbnail
) -> None:
if event.has_snapshot and camera_config.objects.genai.use_snapshot:
snapshot_image = self._read_and_crop_snapshot(event)
if not snapshot_image:
return
num_thumbnails = len(self.tracked_events.get(event.id, []))
# ensure we have a jpeg to pass to the model
thumbnail = ensure_jpeg_bytes(thumbnail)
embed_image = (
[snapshot_image]
if event.has_snapshot and camera_config.objects.genai.use_snapshot
else (
[data["thumbnail"] for data in self.tracked_events[event.id]]
if num_thumbnails > 0
else [thumbnail]
)
)
if camera_config.objects.genai.debug_save_thumbnails and num_thumbnails > 0:
logger.debug(f"Saving {num_thumbnails} thumbnails for event {event.id}")
Path(os.path.join(CLIPS_DIR, f"genai-requests/{event.id}")).mkdir(
parents=True, exist_ok=True
)
for idx, data in enumerate(self.tracked_events[event.id], 1):
jpg_bytes: bytes | None = data["thumbnail"]
if jpg_bytes is None:
logger.warning(f"Unable to save thumbnail {idx} for {event.id}.")
else:
with open(
os.path.join(
CLIPS_DIR,
f"genai-requests/{event.id}/{idx}.jpg",
),
"wb",
) as j:
j.write(jpg_bytes)
# Generate the description. Call happens in a thread since it is network bound.
threading.Thread(
target=self._genai_embed_description,
name=f"_genai_embed_description_{event.id}",
daemon=True,
args=(
event,
embed_image,
),
).start()
# Delete tracked events based on the event_id
if event.id in self.tracked_events:
del self.tracked_events[event.id]
def _genai_embed_description(self, event: Event, thumbnails: list[bytes]) -> None:
"""Embed the description for an event."""
start = datetime.datetime.now().timestamp()
camera_config = self.config.cameras[event.camera]
description = self.genai_client.generate_object_description(
camera_config, thumbnails, event
)
if not description:
logger.debug("Failed to generate description for %s", event.id)
return
# fire and forget description update
self.requestor.send_data(
UPDATE_EVENT_DESCRIPTION,
{
"type": TrackedObjectUpdateTypesEnum.description,
"id": event.id,
"description": description,
"camera": event.camera,
},
)
# Embed the description
if self.config.semantic_search.enabled:
self.embeddings.embed_description(event.id, description)
# Check semantic trigger for this description
if self.semantic_trigger_processor is not None:
self.semantic_trigger_processor.process_data(
{"event_id": event.id, "camera": event.camera, "type": "text"},
PostProcessDataEnum.tracked_object,
)
# Update inference timing metrics
self.object_desc_speed.update(datetime.datetime.now().timestamp() - start)
self.object_desc_dps.update()
logger.debug(
"Generated description for %s (%d images): %s",
event.id,
len(thumbnails),
description,
)
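
A minimal sketch of the thumbnail retention policy used in __handle_frame_update above (illustrative only, reusing the same MAX_THUMBNAILS constant and pop(1) behavior): the first thumbnail of an event is always kept, and once the cap is reached the oldest of the remaining entries is dropped before the newest update is appended.

MAX_THUMBNAILS = 10

def append_update(history: list[dict], update: dict) -> None:
    """Keep at most MAX_THUMBNAILS entries, always preserving the first one."""
    if len(history) >= MAX_THUMBNAILS:
        history.pop(1)  # index 0 (the event's first thumbnail) is never evicted
    history.append(update)

history: list[dict] = []
for i in range(15):
    append_update(history, {"frame": i})
# history now holds frame 0 plus the 9 most recent frames (6 through 14)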

View File

@@ -43,6 +43,21 @@ class ReviewDescriptionProcessor(PostProcessorApi):
self.review_descs_dps = EventsPerSecond()
self.review_descs_dps.start()
def calculate_frame_count(self) -> int:
"""Calculate optimal number of frames based on context size."""
# With our preview images (180px tall), each image should be ~100 tokens
# Be conservative here to avoid overly long query times from sending too many images
context_size = self.genai_client.get_context_size()
if context_size > 10000:
return 20
elif context_size > 6000:
return 16
elif context_size > 4000:
return 12
else:
return 8
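# Worked example (illustrative, not part of the diff): using the ~100 tokens-per-image
# estimate from the comment above, an Ollama backend with num_ctx=8000 falls into the
# ">6000" bucket and sends 16 frames (~1600 tokens of images), while 128K-token
# OpenAI/Gemini-class contexts cap out at 20 frames (~2000 tokens), leaving the rest
# of the window for the prompt text and the model's JSON response.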
def process_data(self, data, data_type):
self.metrics.review_desc_dps.value = self.review_descs_dps.eps()
@@ -176,7 +191,6 @@ class ReviewDescriptionProcessor(PostProcessorApi):
camera: str,
start_time: float,
end_time: float,
desired_frame_count: int = 12,
) -> list[str]:
preview_dir = os.path.join(CACHE_DIR, "preview_frames")
file_start = f"preview_{camera}"
@@ -203,6 +217,8 @@ class ReviewDescriptionProcessor(PostProcessorApi):
all_frames.append(os.path.join(preview_dir, file))
frame_count = len(all_frames)
desired_frame_count = self.calculate_frame_count()
if frame_count <= desired_frame_count:
return all_frames
@@ -235,7 +251,7 @@ def run_analysis(
"start": datetime.datetime.fromtimestamp(final_data["start_time"]).strftime(
"%A, %I:%M %p"
),
"duration": final_data["end_time"] - final_data["start_time"],
"duration": round(final_data["end_time"] - final_data["start_time"]),
}
objects = []

View File

@@ -22,6 +22,8 @@ class DataProcessorMetrics:
yolov9_lpr_pps: Synchronized
review_desc_speed: Synchronized
review_desc_dps: Synchronized
object_desc_speed: Synchronized
object_desc_dps: Synchronized
classification_speeds: dict[str, Synchronized]
classification_cps: dict[str, Synchronized]
@@ -38,6 +40,8 @@ class DataProcessorMetrics:
self.yolov9_lpr_pps = manager.Value("d", 0.0)
self.review_desc_speed = manager.Value("d", 0.0)
self.review_desc_dps = manager.Value("d", 0.0)
self.object_desc_speed = manager.Value("d", 0.0)
self.object_desc_dps = manager.Value("d", 0.0)
self.classification_speeds = manager.dict()
self.classification_cps = manager.dict()

View File

@@ -3,14 +3,10 @@
import base64
import datetime
import logging
import os
import threading
from multiprocessing.synchronize import Event as MpEvent
from pathlib import Path
from typing import Any, Optional
from typing import Any
import cv2
import numpy as np
from peewee import DoesNotExist
from frigate.comms.detections_updater import DetectionSubscriber, DetectionTypeEnum
@@ -30,16 +26,12 @@ from frigate.comms.recordings_updater import (
RecordingsDataTypeEnum,
)
from frigate.comms.review_updater import ReviewDataSubscriber
from frigate.config import CameraConfig, FrigateConfig
from frigate.config import FrigateConfig
from frigate.config.camera.camera import CameraTypeEnum
from frigate.config.camera.updater import (
CameraConfigUpdateEnum,
CameraConfigUpdateSubscriber,
)
from frigate.const import (
CLIPS_DIR,
UPDATE_EVENT_DESCRIPTION,
)
from frigate.data_processing.common.license_plate.model import (
LicensePlateModelRunner,
)
@@ -50,6 +42,7 @@ from frigate.data_processing.post.audio_transcription import (
from frigate.data_processing.post.license_plate import (
LicensePlatePostProcessor,
)
from frigate.data_processing.post.object_descriptions import ObjectDescriptionProcessor
from frigate.data_processing.post.review_descriptions import ReviewDescriptionProcessor
from frigate.data_processing.post.semantic_trigger import SemanticTriggerProcessor
from frigate.data_processing.real_time.api import RealTimeProcessorApi
@@ -67,13 +60,8 @@ from frigate.db.sqlitevecq import SqliteVecQueueDatabase
from frigate.events.types import EventTypeEnum, RegenerateDescriptionEnum
from frigate.genai import get_genai_client
from frigate.models import Event, Recordings, ReviewSegment, Trigger
from frigate.types import TrackedObjectUpdateTypesEnum
from frigate.util.builtin import serialize
from frigate.util.image import (
SharedMemoryFrameManager,
calculate_region,
ensure_jpeg_bytes,
)
from frigate.util.image import SharedMemoryFrameManager
from frigate.util.path import get_event_thumbnail_bytes
from .embeddings import Embeddings
@@ -235,20 +223,30 @@ class EmbeddingMaintainer(threading.Thread):
AudioTranscriptionPostProcessor(self.config, self.requestor, metrics)
)
semantic_trigger_processor: SemanticTriggerProcessor | None = None
if self.config.semantic_search.enabled:
semantic_trigger_processor = SemanticTriggerProcessor(
db,
self.config,
self.requestor,
metrics,
self.embeddings,
)
self.post_processors.append(semantic_trigger_processor)
if any(c.objects.genai.enabled_in_config for c in self.config.cameras.values()):
self.post_processors.append(
SemanticTriggerProcessor(
db,
ObjectDescriptionProcessor(
self.config,
self.requestor,
metrics,
self.embeddings,
self.requestor,
self.metrics,
self.genai_client,
semantic_trigger_processor,
)
)
self.stop_event = stop_event
self.tracked_events: dict[str, list[Any]] = {}
self.early_request_sent: dict[str, bool] = {}
# recordings data
self.recordings_available_through: dict[str, float] = {}
@@ -337,11 +335,8 @@ class EmbeddingMaintainer(threading.Thread):
camera_config = self.config.cameras[camera]
# no need to process updated objects if face recognition, lpr, genai are disabled
if (
not camera_config.objects.genai.enabled
and len(self.realtime_processors) == 0
):
# no need to process updated objects if no processors are active
if len(self.realtime_processors) == 0 and len(self.post_processors) == 0:
return
# Create our own thumbnail based on the bounding box and the frame time
@@ -361,57 +356,17 @@ class EmbeddingMaintainer(threading.Thread):
for processor in self.realtime_processors:
processor.process_frame(data, yuv_frame)
# no need to save our own thumbnails if genai is not enabled
# or if the object has become stationary
if self.genai_client is not None and not data["stationary"]:
if data["id"] not in self.tracked_events:
self.tracked_events[data["id"]] = []
data["thumbnail"] = self._create_thumbnail(yuv_frame, data["box"])
# Limit the number of thumbnails saved
if len(self.tracked_events[data["id"]]) >= MAX_THUMBNAILS:
# Always keep the first thumbnail for the event
self.tracked_events[data["id"]].pop(1)
self.tracked_events[data["id"]].append(data)
# check if we're configured to send an early request after a minimum number of updates received
if (
self.genai_client is not None
and camera_config.objects.genai.send_triggers.after_significant_updates
):
if (
len(self.tracked_events.get(data["id"], []))
>= camera_config.objects.genai.send_triggers.after_significant_updates
and data["id"] not in self.early_request_sent
):
if data["has_clip"] and data["has_snapshot"]:
event: Event = Event.get(Event.id == data["id"])
if (
not camera_config.objects.genai.objects
or event.label in camera_config.objects.genai.objects
) and (
not camera_config.objects.genai.required_zones
or set(data["entered_zones"])
& set(camera_config.objects.genai.required_zones)
):
logger.debug(f"{camera} sending early request to GenAI")
self.early_request_sent[data["id"]] = True
threading.Thread(
target=self._genai_embed_description,
name=f"_genai_embed_description_{event.id}",
daemon=True,
args=(
event,
[
data["thumbnail"]
for data in self.tracked_events[data["id"]]
],
),
).start()
for processor in self.post_processors:
if isinstance(processor, ObjectDescriptionProcessor):
processor.process_data(
{
"camera": camera,
"data": data,
"state": "update",
"yuv_frame": yuv_frame,
},
PostProcessDataEnum.tracked_object,
)
self.frame_manager.close(frame_name)
@@ -424,12 +379,13 @@ class EmbeddingMaintainer(threading.Thread):
break
event_id, camera, updated_db = ended
camera_config = self.config.cameras[camera]
# expire in realtime processors
for processor in self.realtime_processors:
processor.expire_object(event_id, camera)
thumbnail: bytes | None = None
if updated_db:
try:
event: Event = Event.get(Event.id == event_id)
@@ -446,23 +402,6 @@ class EmbeddingMaintainer(threading.Thread):
# Embed the thumbnail
self._embed_thumbnail(event_id, thumbnail)
# Run GenAI
if (
camera_config.objects.genai.enabled
and camera_config.objects.genai.send_triggers.tracked_object_end
and self.genai_client is not None
and (
not camera_config.objects.genai.objects
or event.label in camera_config.objects.genai.objects
)
and (
not camera_config.objects.genai.required_zones
or set(event.zones)
& set(camera_config.objects.genai.required_zones)
)
):
self._process_genai_description(event, camera_config, thumbnail)
# call any defined post processors
for processor in self.post_processors:
if isinstance(processor, LicensePlatePostProcessor):
@@ -492,16 +431,25 @@ class EmbeddingMaintainer(threading.Thread):
{"event_id": event_id, "camera": camera, "type": "image"},
PostProcessDataEnum.tracked_object,
)
elif isinstance(processor, ObjectDescriptionProcessor):
if not updated_db:
continue
processor.process_data(
{
"event": event,
"camera": camera,
"state": "finalize",
"thumbnail": thumbnail,
},
PostProcessDataEnum.tracked_object,
)
else:
processor.process_data(
{"event_id": event_id, "camera": camera},
PostProcessDataEnum.tracked_object,
)
# Delete tracked events based on the event_id
if event_id in self.tracked_events:
del self.tracked_events[event_id]
def _expire_dedicated_lpr(self) -> None:
"""Remove plates not seen for longer than expiration timeout for dedicated lpr cameras."""
now = datetime.datetime.now().timestamp()
@@ -570,9 +518,16 @@ class EmbeddingMaintainer(threading.Thread):
event_id, source, force = payload
if event_id:
self.handle_regenerate_description(
event_id, RegenerateDescriptionEnum(source), force
)
for processor in self.post_processors:
if isinstance(processor, ObjectDescriptionProcessor):
processor.handle_request(
"regenerate_description",
{
"event_id": event_id,
"source": RegenerateDescriptionEnum(source),
"force": force,
},
)
def _process_frame_updates(self) -> None:
"""Process event updates"""
@@ -622,208 +577,9 @@ class EmbeddingMaintainer(threading.Thread):
self.frame_manager.close(frame_name)
def _create_thumbnail(self, yuv_frame, box, height=500) -> Optional[bytes]:
"""Return jpg thumbnail of a region of the frame."""
frame = cv2.cvtColor(yuv_frame, cv2.COLOR_YUV2BGR_I420)
region = calculate_region(
frame.shape, box[0], box[1], box[2], box[3], height, multiplier=1.4
)
frame = frame[region[1] : region[3], region[0] : region[2]]
width = int(height * frame.shape[1] / frame.shape[0])
frame = cv2.resize(frame, dsize=(width, height), interpolation=cv2.INTER_AREA)
ret, jpg = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
if ret:
return jpg.tobytes()
return None
def _embed_thumbnail(self, event_id: str, thumbnail: bytes) -> None:
"""Embed the thumbnail for an event."""
if not self.config.semantic_search.enabled:
return
self.embeddings.embed_thumbnail(event_id, thumbnail)
def _process_genai_description(
self, event: Event, camera_config: CameraConfig, thumbnail
) -> None:
if event.has_snapshot and camera_config.objects.genai.use_snapshot:
snapshot_image = self._read_and_crop_snapshot(event, camera_config)
if not snapshot_image:
return
num_thumbnails = len(self.tracked_events.get(event.id, []))
# ensure we have a jpeg to pass to the model
thumbnail = ensure_jpeg_bytes(thumbnail)
embed_image = (
[snapshot_image]
if event.has_snapshot and camera_config.objects.genai.use_snapshot
else (
[data["thumbnail"] for data in self.tracked_events[event.id]]
if num_thumbnails > 0
else [thumbnail]
)
)
if camera_config.objects.genai.debug_save_thumbnails and num_thumbnails > 0:
logger.debug(f"Saving {num_thumbnails} thumbnails for event {event.id}")
Path(os.path.join(CLIPS_DIR, f"genai-requests/{event.id}")).mkdir(
parents=True, exist_ok=True
)
for idx, data in enumerate(self.tracked_events[event.id], 1):
jpg_bytes: bytes = data["thumbnail"]
if jpg_bytes is None:
logger.warning(f"Unable to save thumbnail {idx} for {event.id}.")
else:
with open(
os.path.join(
CLIPS_DIR,
f"genai-requests/{event.id}/{idx}.jpg",
),
"wb",
) as j:
j.write(jpg_bytes)
# Generate the description. Call happens in a thread since it is network bound.
threading.Thread(
target=self._genai_embed_description,
name=f"_genai_embed_description_{event.id}",
daemon=True,
args=(
event,
embed_image,
),
).start()
def _genai_embed_description(self, event: Event, thumbnails: list[bytes]) -> None:
"""Embed the description for an event."""
camera_config = self.config.cameras[event.camera]
description = self.genai_client.generate_object_description(
camera_config, thumbnails, event
)
if not description:
logger.debug("Failed to generate description for %s", event.id)
return
# fire and forget description update
self.requestor.send_data(
UPDATE_EVENT_DESCRIPTION,
{
"type": TrackedObjectUpdateTypesEnum.description,
"id": event.id,
"description": description,
"camera": event.camera,
},
)
# Embed the description
if self.config.semantic_search.enabled:
self.embeddings.embed_description(event.id, description)
# Check semantic trigger for this description
for processor in self.post_processors:
if isinstance(processor, SemanticTriggerProcessor):
processor.process_data(
{"event_id": event.id, "camera": event.camera, "type": "text"},
PostProcessDataEnum.tracked_object,
)
else:
continue
logger.debug(
"Generated description for %s (%d images): %s",
event.id,
len(thumbnails),
description,
)
def _read_and_crop_snapshot(self, event: Event, camera_config) -> bytes | None:
"""Read, decode, and crop the snapshot image."""
snapshot_file = os.path.join(CLIPS_DIR, f"{event.camera}-{event.id}.jpg")
if not os.path.isfile(snapshot_file):
logger.error(
f"Cannot load snapshot for {event.id}, file not found: {snapshot_file}"
)
return None
try:
with open(snapshot_file, "rb") as image_file:
snapshot_image = image_file.read()
img = cv2.imdecode(
np.frombuffer(snapshot_image, dtype=np.int8),
cv2.IMREAD_COLOR,
)
# Crop snapshot based on region
# provide full image if region doesn't exist (manual events)
height, width = img.shape[:2]
x1_rel, y1_rel, width_rel, height_rel = event.data.get(
"region", [0, 0, 1, 1]
)
x1, y1 = int(x1_rel * width), int(y1_rel * height)
cropped_image = img[
y1 : y1 + int(height_rel * height),
x1 : x1 + int(width_rel * width),
]
_, buffer = cv2.imencode(".jpg", cropped_image)
return buffer.tobytes()
except Exception:
return None
def handle_regenerate_description(
self, event_id: str, source: str, force: bool
) -> None:
try:
event: Event = Event.get(Event.id == event_id)
except DoesNotExist:
logger.error(f"Event {event_id} not found for description regeneration")
return
if self.genai_client is None:
logger.error("GenAI not enabled")
return
camera_config = self.config.cameras[event.camera]
if not camera_config.objects.genai.enabled and not force:
logger.error(f"GenAI not enabled for camera {event.camera}")
return
thumbnail = get_event_thumbnail_bytes(event)
# ensure we have a jpeg to pass to the model
thumbnail = ensure_jpeg_bytes(thumbnail)
logger.debug(
f"Trying {source} regeneration for {event}, has_snapshot: {event.has_snapshot}"
)
if event.has_snapshot and source == "snapshot":
snapshot_image = self._read_and_crop_snapshot(event, camera_config)
if not snapshot_image:
return
embed_image = (
[snapshot_image]
if event.has_snapshot and source == "snapshot"
else (
[data["thumbnail"] for data in self.tracked_events[event_id]]
if len(self.tracked_events.get(event_id, [])) > 0
else [thumbnail]
)
)
self._genai_embed_description(event, embed_image)

View File

@@ -32,7 +32,7 @@ def register_genai_provider(key: GenAIProviderEnum):
class GenAIClient:
"""Generative AI client for Frigate."""
def __init__(self, genai_config: GenAIConfig, timeout: int = 60) -> None:
def __init__(self, genai_config: GenAIConfig, timeout: int = 120) -> None:
self.genai_config: GenAIConfig = genai_config
self.timeout = timeout
self.provider = self._init_provider()
@@ -66,12 +66,15 @@ class GenAIClient:
context_prompt = f"""
Please analyze the sequence of images ({len(thumbnails)} total) taken in chronological order from the perspective of the {review_data["camera"].replace("_", " ")} security camera.
**Normal activity patterns for this property:**
{activity_context_prompt}
Your task is to provide a clear, accurate description of the scene that:
1. States exactly what is happening based on observable actions and movements.
2. Evaluates whether the observable evidence suggests normal activity for this property or genuine security concerns.
3. Assigns a potential_threat_level based on the definitions below, applying them consistently.
Provide an objective assessment. The goal is accuracy—neither missing genuine threats nor over-flagging routine activity for this property.
**IMPORTANT: Start by checking if the activity matches the normal patterns above. If it does, assign Level 0. Only consider higher threat levels if the activity clearly deviates from normal patterns or shows genuine security concerns.**
When forming your description:
- **CRITICAL: Only describe objects explicitly listed in "Detected objects" below.** Do not infer or mention additional people, vehicles, or objects not present in the detected objects list, even if visual patterns suggest them. If only a car is detected, do not describe a person interacting with it unless "person" is also in the detected objects list.
@@ -81,10 +84,7 @@ When forming your description:
- Consider the full sequence chronologically: what happens from start to finish, how duration and actions relate to the location and objects involved.
- **Use the actual timestamp provided in "Activity started at"** below for time of day context—do not infer time from image brightness or darkness. Unusual hours (late night/early morning) should increase suspicion when the observable behavior itself appears questionable. However, recognize that some legitimate activities can occur at any hour.
- Identify patterns that suggest genuine security concerns: testing doors/windows on vehicles or buildings, accessing unauthorized areas, attempting to conceal actions, extended loitering without apparent purpose, taking items, behavior that clearly doesn't align with the zone context and detected objects.
- **Weigh all evidence holistically**: Consider the complete picture including zone, objects, time, and actions together. A single ambiguous action should not override strong contextual evidence of normal activity. The overall pattern determines the threat level.
**Normal activity patterns for this property:**
{activity_context_prompt}
- **Weigh all evidence holistically**: Start by checking if the activity matches the normal patterns above. If it does, assign Level 0. Only consider Level 1 if the activity clearly deviates from normal patterns or shows genuine security concerns that warrant attention.
Your response MUST be a flat JSON object with:
- `scene` (string): A narrative description of what happens across the sequence from start to finish. **Only describe actions you can actually observe happening in the frames provided.** Do not infer or assume actions that aren't visible (e.g., if you see someone walking but never see them sit, don't say they sat down). Include setting, detected objects, and their observable actions. Avoid speculation or filling in assumed behaviors. Your description should align with and support the threat level you assign.
@@ -93,9 +93,9 @@ Your response MUST be a flat JSON object with:
{get_concern_prompt()}
Threat-level definitions:
- 0 — Normal activity: What you observe is consistent with expected activity for this property type. The observable evidence—considering zone context, detected objects, and timing together—supports a benign explanation. Use this for routine activities even if minor ambiguous elements exist.
- 1 — Potentially suspicious: Observable behavior raises genuine security concerns that warrant human review. The evidence doesn't support a routine explanation when you consider the zone, objects, and actions together. Examples: testing doors/windows on vehicles or structures, accessing areas that don't align with the activity, taking items that likely don't belong to them, behavior clearly inconsistent with the zone and context, or activity that lacks any visible legitimate indicators. Reserve this level for situations that actually merit closer attention—not routine activities for this property.
- 2 — Immediate threat: Clear evidence of forced entry, break-in, vandalism, aggression, weapons, theft in progress, or active property damage.
- 0 — **Normal activity (DEFAULT)**: What you observe matches the normal activity patterns above or is consistent with expected activity for this property type. The observable evidence—considering zone context, detected objects, and timing together—supports a benign explanation. **Use this level for routine activities even if minor ambiguous elements exist.**
- 1 — **Potentially suspicious**: Observable behavior raises genuine security concerns that warrant human review. The evidence doesn't support a routine explanation and clearly deviates from the normal patterns above. Examples: testing doors/windows on vehicles or structures, accessing areas that don't align with the activity, taking items that likely don't belong to them, behavior clearly inconsistent with the zone and context, or activity that lacks any visible legitimate indicators. **Only use this level when the activity clearly doesn't match normal patterns.**
- 2 — **Immediate threat**: Clear evidence of forced entry, break-in, vandalism, aggression, weapons, theft in progress, or active property damage.
Sequence details:
- Frame 1 = earliest, Frame {len(thumbnails)} = latest
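For illustration only (this is not part of the diff), a reply in the flat JSON shape the prompt above demands might look like the sketch below; `scene` and `potential_threat_level` are named in the prompt text, while any additional fields come from get_concern_prompt(), which is not shown in this hunk.
example_response = {
    "scene": "A delivery driver walks up the driveway, leaves a package at the front door, and returns to the van within about a minute.",
    "potential_threat_level": 0,
}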
@@ -234,9 +234,9 @@ Rules for the report:
) -> Optional[str]:
"""Generate a description for the frame."""
try:
prompt = camera_config.genai.object_prompts.get(
prompt = camera_config.objects.genai.object_prompts.get(
event.label,
camera_config.genai.prompt,
camera_config.objects.genai.prompt,
).format(**model_to_dict(event))
except KeyError as e:
logger.error(f"Invalid key in GenAI prompt: {e}")
@@ -253,6 +253,10 @@ Rules for the report:
"""Submit a request to the provider."""
return None
def get_context_size(self) -> int:
"""Get the context window size for this provider in tokens."""
return 4096
def get_genai_client(config: FrigateConfig) -> Optional[GenAIClient]:
"""Get the GenAI client."""

View File

@@ -71,3 +71,7 @@ class OpenAIClient(GenAIClient):
if len(result.choices) > 0:
return result.choices[0].message.content.strip()
return None
def get_context_size(self) -> int:
"""Get the context window size for Azure OpenAI."""
return 128000

View File

@@ -53,3 +53,8 @@ class GeminiClient(GenAIClient):
# No description was generated
return None
return description
def get_context_size(self) -> int:
"""Get the context window size for Gemini."""
# Gemini Pro Vision has a 1M token context window
return 1000000

View File

@@ -54,3 +54,9 @@ class OllamaClient(GenAIClient):
except (TimeoutException, ResponseError) as e:
logger.warning("Ollama returned an error: %s", str(e))
return None
def get_context_size(self) -> int:
"""Get the context window size for Ollama."""
return self.genai_config.provider_options.get("options", {}).get(
"num_ctx", 4096
)
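A small, hypothetical illustration of how this lookup resolves (the "options"/"num_ctx" keys mirror the code above; the concrete values are made up):
provider_options = {"options": {"num_ctx": 8192}}  # hypothetical Ollama provider options
num_ctx = provider_options.get("options", {}).get("num_ctx", 4096)
# num_ctx == 8192 here; with no options configured the 4096 default applies,
# which calculate_frame_count() maps to 12 review frames.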

View File

@@ -66,3 +66,8 @@ class OpenAIClient(GenAIClient):
except (TimeoutException, Exception) as e:
logger.warning("OpenAI returned an error: %s", str(e))
return None
def get_context_size(self) -> int:
"""Get the context window size for OpenAI."""
# OpenAI GPT-4 Vision models have a 128K-token context window
return 128000

View File

@@ -361,6 +361,14 @@ def stats_snapshot(
embeddings_metrics.review_desc_dps.value, 2
)
if embeddings_metrics.object_desc_speed.value > 0.0:
stats["embeddings"]["object_description_speed"] = round(
embeddings_metrics.object_desc_speed.value * 1000, 2
)
stats["embeddings"]["object_descriptions"] = round(
embeddings_metrics.object_desc_dps.value, 2
)
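# Illustrative example (not part of the diff): with object_desc_speed = 1.25 s and
# object_desc_dps = 0.4, the keys added above would report
# "object_description_speed": 1250.0 (milliseconds) and
# "object_descriptions": 0.4 (descriptions generated per second).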
for key in embeddings_metrics.classification_speeds.keys():
stats["embeddings"][f"{key}_classification_speed"] = round(
embeddings_metrics.classification_speeds[key].value * 1000, 2

View File

@@ -995,7 +995,26 @@ def get_histogram(image, x_min, y_min, x_max, y_max):
return cv2.normalize(hist, hist).flatten()
def ensure_jpeg_bytes(image_data):
def create_thumbnail(
yuv_frame: np.ndarray, box: tuple[int, int, int, int], height=500
) -> Optional[bytes]:
"""Return jpg thumbnail of a region of the frame."""
frame = cv2.cvtColor(yuv_frame, cv2.COLOR_YUV2BGR_I420)
region = calculate_region(
frame.shape, box[0], box[1], box[2], box[3], height, multiplier=1.4
)
frame = frame[region[1] : region[3], region[0] : region[2]]
width = int(height * frame.shape[1] / frame.shape[0])
frame = cv2.resize(frame, dsize=(width, height), interpolation=cv2.INTER_AREA)
ret, jpg = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
if ret:
return jpg.tobytes()
return None
def ensure_jpeg_bytes(image_data: bytes) -> bytes:
"""Ensure image data is jpeg bytes for genai"""
try:
img_array = np.frombuffer(image_data, dtype=np.uint8)

View File

@@ -70,7 +70,10 @@ export default function ExportCard({
(editName.update?.length ?? 0) > 0
) {
submitRename();
return true;
}
return false;
},
);

View File

@@ -109,6 +109,7 @@ export default function ReviewCard({
useKeyboardListener(["Shift"], (_, modifiers) => {
bypassDialogRef.current = modifiers.shift;
return false;
});
const handleDelete = useCallback(() => {

View File

@@ -75,6 +75,7 @@ export default function ReviewActionGroup({
useKeyboardListener(["Shift"], (_, modifiers) => {
setBypassDialog(modifiers.shift);
return false;
});
const handleDelete = useCallback(() => {

View File

@@ -62,6 +62,7 @@ export default function SearchActionGroup({
useKeyboardListener(["Shift"], (_, modifiers) => {
setBypassDialog(modifiers.shift);
return false;
});
const handleDelete = useCallback(() => {

View File

@@ -83,7 +83,7 @@ export default function PtzControlPanel({
],
(key, modifiers) => {
if (modifiers.repeat || !key) {
return;
return true;
}
if (["1", "2", "3", "4", "5", "6", "7", "8", "9"].includes(key)) {
@@ -95,34 +95,36 @@ export default function PtzControlPanel({
) {
sendPtz(`preset_${ptz.presets[presetNumber - 1]}`);
}
return;
return true;
}
if (!modifiers.down) {
sendPtz("STOP");
return;
return true;
}
switch (key) {
case "ArrowLeft":
sendPtz("MOVE_LEFT");
break;
return true;
case "ArrowRight":
sendPtz("MOVE_RIGHT");
break;
return true;
case "ArrowUp":
sendPtz("MOVE_UP");
break;
return true;
case "ArrowDown":
sendPtz("MOVE_DOWN");
break;
return true;
case "+":
sendPtz(modifiers.shift ? "FOCUS_IN" : "ZOOM_IN");
break;
return true;
case "-":
sendPtz(modifiers.shift ? "FOCUS_OUT" : "ZOOM_OUT");
break;
return true;
}
return false;
},
);

View File

@@ -175,6 +175,8 @@ export default function ReviewDetailDialog({
if (key == "Esc" && modifiers.down && !modifiers.repeat) {
setIsOpen(false);
}
return true;
});
const Overlay = isDesktop ? Sheet : MobilePage;

View File

@@ -60,7 +60,7 @@ export function GenericVideoPlayer({
["ArrowDown", "ArrowLeft", "ArrowRight", "ArrowUp", " ", "f", "m"],
(key, modifiers) => {
if (!modifiers.down || modifiers.repeat) {
return;
return true;
}
switch (key) {
@@ -92,6 +92,8 @@ export function GenericVideoPlayer({
}
break;
}
return true;
},
);

View File

@@ -88,7 +88,7 @@ function MSEPlayer({
(error: LivePlayerError, description: string = "Unknown error") => {
// eslint-disable-next-line no-console
console.error(
`${camera} - MSE error '${error}': ${description} See the documentation: https://docs.frigate.video/configuration/live`,
`${camera} - MSE error '${error}': ${description} See the documentation: https://docs.frigate.video/configuration/live/#live-view-faq`,
);
onError?.(error);
},
@@ -484,7 +484,10 @@ function MSEPlayer({
videoRef.current
) {
onDisconnect();
handleError("stalled", "Media playback has stalled.");
handleError(
"stalled",
`Media playback has stalled after ${timeoutDuration / 1000} seconds due to insufficient buffering or a network interruption.`,
);
}
}, timeoutDuration),
);

View File

@@ -144,7 +144,7 @@ export default function VideoControls({
const onKeyboardShortcut = useCallback(
(key: string | null, modifiers: KeyModifiers) => {
if (!modifiers.down) {
return;
return true;
}
switch (key) {
@@ -174,6 +174,8 @@ export default function VideoControls({
onPlayPause(!isPlaying);
break;
}
return true;
},
// only update when preview only changes
// eslint-disable-next-line react-hooks/exhaustive-deps

View File

@@ -42,7 +42,7 @@ export default function WebRtcPlayer({
(error: LivePlayerError, description: string = "Unknown error") => {
// eslint-disable-next-line no-console
console.error(
`${camera} - WebRTC error '${error}': ${description} See the documentation: https://docs.frigate.video/configuration/live`,
`${camera} - WebRTC error '${error}': ${description} See the documentation: https://docs.frigate.video/configuration/live/#live-view-faq`,
);
onError?.(error);
},
@@ -339,7 +339,10 @@ export default function WebRtcPlayer({
document.visibilityState === "visible" &&
pcRef.current != undefined
) {
handleError("stalled", "WebRTC connection stalled.");
handleError(
"stalled",
"Media playback has stalled after 3 seconds due to insufficient buffering or a network interruption.",
);
}
}, 3000),
);

View File

@@ -1,4 +1,4 @@
import { useCallback, useEffect } from "react";
import { MutableRefObject, useCallback, useEffect, useMemo } from "react";
export type KeyModifiers = {
down: boolean;
@@ -9,9 +9,17 @@ export type KeyModifiers = {
export default function useKeyboardListener(
keys: string[],
listener: (key: string | null, modifiers: KeyModifiers) => void,
preventDefault: boolean = true,
listener?: (key: string | null, modifiers: KeyModifiers) => boolean,
contentRef?: MutableRefObject<HTMLDivElement | null>,
) {
const pageKeys = useMemo(
() =>
contentRef != undefined
? ["ArrowDown", "ArrowUp", "PageDown", "PageUp"]
: [],
[contentRef],
);
const keyDownListener = useCallback(
(e: KeyboardEvent) => {
// @ts-expect-error we know this field exists
@@ -26,14 +34,44 @@ export default function useKeyboardListener(
shift: e.shiftKey,
};
if (keys.includes(e.key)) {
if (contentRef && pageKeys.includes(e.key)) {
switch (e.key) {
case "ArrowDown":
contentRef.current?.scrollBy({
top: 100,
behavior: "smooth",
});
break;
case "ArrowUp":
contentRef.current?.scrollBy({
top: -100,
behavior: "smooth",
});
break;
case "PageDown":
contentRef.current?.scrollBy({
top: contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
case "PageUp":
contentRef.current?.scrollBy({
top: -contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
}
} else if (keys.includes(e.key) && listener) {
const preventDefault = listener(e.key, modifiers);
if (preventDefault) e.preventDefault();
listener(e.key, modifiers);
} else if (e.key === "Shift" || e.key === "Control" || e.key === "Meta") {
} else if (
listener &&
(e.key === "Shift" || e.key === "Control" || e.key === "Meta")
) {
listener(null, modifiers);
}
},
[keys, listener, preventDefault],
[keys, pageKeys, listener, contentRef],
);
const keyUpListener = useCallback(
@@ -49,10 +87,13 @@ export default function useKeyboardListener(
shift: false,
};
if (keys.includes(e.key)) {
e.preventDefault();
listener(e.key, modifiers);
} else if (e.key === "Shift" || e.key === "Control" || e.key === "Meta") {
if (listener && keys.includes(e.key)) {
const preventDefault = listener(e.key, modifiers);
if (preventDefault) e.preventDefault();
} else if (
listener &&
(e.key === "Shift" || e.key === "Control" || e.key === "Meta")
) {
listener(null, modifiers);
}
},

View File

@@ -13,12 +13,13 @@ import { Button } from "@/components/ui/button";
import { Dialog, DialogContent, DialogTitle } from "@/components/ui/dialog";
import { Input } from "@/components/ui/input";
import { Toaster } from "@/components/ui/sonner";
import useKeyboardListener from "@/hooks/use-keyboard-listener";
import { useSearchEffect } from "@/hooks/use-overlay-state";
import { cn } from "@/lib/utils";
import { DeleteClipType, Export } from "@/types/export";
import axios from "axios";
import { useCallback, useEffect, useMemo, useState } from "react";
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
import { isMobile } from "react-device-detect";
import { useTranslation } from "react-i18next";
@@ -109,6 +110,11 @@ function Exports() {
[mutate, t],
);
// Keyboard Listener
const contentRef = useRef<HTMLDivElement | null>(null);
useKeyboardListener([], undefined, contentRef);
return (
<div className="flex size-full flex-col gap-2 overflow-hidden px-1 pt-2 md:p-2">
<Toaster closeButton={true} />
@@ -194,7 +200,10 @@ function Exports() {
<div className="w-full overflow-hidden">
{exports && filteredExports && filteredExports.length > 0 ? (
<div className="scrollbar-container grid size-full gap-2 overflow-y-auto sm:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4">
<div
ref={contentRef}
className="scrollbar-container grid size-full gap-2 overflow-y-auto sm:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4"
>
{Object.values(exports).map((item) => (
<ExportCard
key={item.name}

View File

@@ -46,7 +46,14 @@ import { FaceLibraryData, RecognizedFaceData } from "@/types/face";
import { FaceRecognitionConfig, FrigateConfig } from "@/types/frigateConfig";
import { TooltipPortal } from "@radix-ui/react-tooltip";
import axios from "axios";
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
import {
MutableRefObject,
useCallback,
useEffect,
useMemo,
useRef,
useState,
} from "react";
import { isDesktop, isMobile } from "react-device-detect";
import { Trans, useTranslation } from "react-i18next";
import {
@@ -109,6 +116,7 @@ export default function FaceLibrary() {
const [upload, setUpload] = useState(false);
const [addFace, setAddFace] = useState(false);
// input focus for keyboard shortcuts
const onUploadImage = useCallback(
(file: File) => {
const formData = new FormData();
@@ -260,28 +268,37 @@ export default function FaceLibrary() {
// keyboard
useKeyboardListener(["a", "Escape"], (key, modifiers) => {
if (modifiers.repeat || !modifiers.down) {
return;
}
const contentRef = useRef<HTMLDivElement | null>(null);
useKeyboardListener(
["a", "Escape"],
(key, modifiers) => {
if (!modifiers.down) {
return true;
}
switch (key) {
case "a":
if (modifiers.ctrl) {
if (selectedFaces.length) {
setSelectedFaces([]);
} else {
setSelectedFaces([
...(pageToggle === "train" ? trainImages : faceImages),
]);
switch (key) {
case "a":
if (modifiers.ctrl && !modifiers.repeat) {
if (selectedFaces.length) {
setSelectedFaces([]);
} else {
setSelectedFaces([
...(pageToggle === "train" ? trainImages : faceImages),
]);
}
return true;
}
}
break;
case "Escape":
setSelectedFaces([]);
break;
}
});
break;
case "Escape":
setSelectedFaces([]);
return true;
}
return false;
},
contentRef,
);
useEffect(() => {
setSelectedFaces([]);
@@ -401,6 +418,7 @@ export default function FaceLibrary() {
(pageToggle == "train" ? (
<TrainingGrid
config={config}
contentRef={contentRef}
attemptImages={trainImages}
faceNames={faces}
selectedFaces={selectedFaces}
@@ -409,6 +427,7 @@ export default function FaceLibrary() {
/>
) : (
<FaceGrid
contentRef={contentRef}
faceImages={faceImages}
pageToggle={pageToggle}
selectedFaces={selectedFaces}
@@ -601,6 +620,7 @@ function LibrarySelector({
type TrainingGridProps = {
config: FrigateConfig;
contentRef: MutableRefObject<HTMLDivElement | null>;
attemptImages: string[];
faceNames: string[];
selectedFaces: string[];
@@ -609,6 +629,7 @@ type TrainingGridProps = {
};
function TrainingGrid({
config,
contentRef,
attemptImages,
faceNames,
selectedFaces,
@@ -691,7 +712,10 @@ function TrainingGrid({
setInputFocused={() => {}}
/>
<div className="scrollbar-container flex flex-wrap gap-2 overflow-y-scroll p-1">
<div
ref={contentRef}
className="scrollbar-container flex flex-wrap gap-2 overflow-y-scroll p-1"
>
{Object.entries(faceGroups).map(([key, group]) => {
const event = events?.find((ev) => ev.id == key);
return (
@@ -1029,6 +1053,7 @@ function FaceAttempt({
}
type FaceGridProps = {
contentRef: MutableRefObject<HTMLDivElement | null>;
faceImages: string[];
pageToggle: string;
selectedFaces: string[];
@@ -1036,12 +1061,15 @@ type FaceGridProps = {
onDelete: (name: string, ids: string[]) => void;
};
function FaceGrid({
contentRef,
faceImages,
pageToggle,
selectedFaces,
onClickFaces,
onDelete,
}: FaceGridProps) {
const { t } = useTranslation(["views/faceLibrary"]);
const sortedFaces = useMemo(
() => (faceImages || []).sort().reverse(),
[faceImages],
@@ -1051,13 +1079,14 @@ function FaceGrid({
return (
<div className="absolute left-1/2 top-1/2 flex -translate-x-1/2 -translate-y-1/2 flex-col items-center justify-center text-center">
<LuFolderCheck className="size-16" />
(t("nofaces"))
{t("nofaces")}
</div>
);
}
return (
<div
ref={contentRef}
className={cn(
"scrollbar-container gap-2 overflow-y-scroll p-1",
isDesktop ? "flex flex-wrap" : "grid grid-cols-2 md:grid-cols-4",

View File

@@ -56,14 +56,16 @@ function Live() {
useKeyboardListener(["f"], (key, modifiers) => {
if (!modifiers.down) {
return;
return true;
}
switch (key) {
case "f":
toggleFullscreen();
break;
return true;
}
return false;
});
// document title

View File

@@ -337,7 +337,7 @@ function Logs() {
["PageDown", "PageUp", "ArrowDown", "ArrowUp"],
(key, modifiers) => {
if (!key || !modifiers.down || !lazyLogWrapperRef.current) {
return;
return true;
}
const container =
@@ -346,7 +346,7 @@ function Logs() {
const logLineHeight = container?.querySelector(".log-line")?.clientHeight;
if (!logLineHeight) {
return;
return true;
}
const scrollAmount = key.includes("Page")
@@ -354,6 +354,7 @@ function Logs() {
: logLineHeight;
const direction = key.includes("Down") ? 1 : -1;
container?.scrollBy({ top: scrollAmount * direction });
return true;
},
);

View File

@@ -37,7 +37,14 @@ import { cn } from "@/lib/utils";
import { CustomClassificationModelConfig } from "@/types/frigateConfig";
import { TooltipPortal } from "@radix-ui/react-tooltip";
import axios from "axios";
import { useCallback, useEffect, useMemo, useState } from "react";
import {
MutableRefObject,
useCallback,
useEffect,
useMemo,
useRef,
useState,
} from "react";
import { isDesktop, isMobile } from "react-device-detect";
import { Trans, useTranslation } from "react-i18next";
import { LuPencil, LuTrash2 } from "react-icons/lu";
@@ -226,30 +233,38 @@ export default function ModelTrainingView({ model }: ModelTrainingViewProps) {
// keyboard
useKeyboardListener(["a", "Escape"], (key, modifiers) => {
if (modifiers.repeat || !modifiers.down) {
return;
}
const contentRef = useRef<HTMLDivElement | null>(null);
useKeyboardListener(
["a", "Escape"],
(key, modifiers) => {
if (!modifiers.down) {
return true;
}
switch (key) {
case "a":
if (modifiers.ctrl) {
if (selectedImages.length) {
setSelectedImages([]);
} else {
setSelectedImages([
...(pageToggle === "train"
? trainImages || []
: dataset?.[pageToggle] || []),
]);
switch (key) {
case "a":
if (modifiers.ctrl && !modifiers.repeat) {
if (selectedImages.length) {
setSelectedImages([]);
} else {
setSelectedImages([
...(pageToggle === "train"
? trainImages || []
: dataset?.[pageToggle] || []),
]);
}
return true;
}
}
break;
case "Escape":
setSelectedImages([]);
break;
}
});
break;
case "Escape":
setSelectedImages([]);
return true;
}
return false;
},
contentRef,
);
useEffect(() => {
setSelectedImages([]);
@@ -370,6 +385,7 @@ export default function ModelTrainingView({ model }: ModelTrainingViewProps) {
{pageToggle == "train" ? (
<TrainGrid
model={model}
contentRef={contentRef}
classes={Object.keys(dataset || {})}
trainImages={trainImages || []}
trainFilter={trainFilter}
@@ -380,6 +396,7 @@ export default function ModelTrainingView({ model }: ModelTrainingViewProps) {
/>
) : (
<DatasetGrid
contentRef={contentRef}
modelName={model.name}
categoryName={pageToggle}
images={dataset?.[pageToggle] || []}
@@ -579,6 +596,7 @@ function LibrarySelector({
}
type DatasetGridProps = {
contentRef: MutableRefObject<HTMLDivElement | null>;
modelName: string;
categoryName: string;
images: string[];
@@ -587,6 +605,7 @@ type DatasetGridProps = {
onDelete: (ids: string[]) => void;
};
function DatasetGrid({
contentRef,
modelName,
categoryName,
images,
@@ -602,7 +621,10 @@ function DatasetGrid({
);
return (
<div className="flex flex-wrap gap-2 overflow-y-auto p-2">
<div
ref={contentRef}
className="scrollbar-container flex flex-wrap gap-2 overflow-y-auto p-2"
>
{classData.map((image) => (
<div
className={cn(
@@ -658,6 +680,7 @@ function DatasetGrid({
type TrainGridProps = {
model: CustomClassificationModelConfig;
contentRef: MutableRefObject<HTMLDivElement | null>;
classes: string[];
trainImages: string[];
trainFilter?: TrainFilter;
@@ -668,6 +691,7 @@ type TrainGridProps = {
};
function TrainGrid({
model,
contentRef,
classes,
trainImages,
trainFilter,
@@ -726,8 +750,9 @@ function TrainGrid({
return (
<div
ref={contentRef}
className={cn(
"flex flex-wrap gap-2 overflow-y-auto p-2",
"scrollbar-container flex flex-wrap gap-2 overflow-y-auto p-2",
isMobile && "justify-center",
)}
>

View File

@@ -650,42 +650,41 @@ function DetectionReview({
// keyboard
useKeyboardListener(["a", "r", "PageDown", "PageUp"], (key, modifiers) => {
if (modifiers.repeat || !modifiers.down) {
return;
}
useKeyboardListener(
["a", "r", "Escape"],
(key, modifiers) => {
if (!modifiers.down) {
return true;
}
switch (key) {
case "a":
if (modifiers.ctrl) {
onSelectAllReviews();
}
break;
case "r":
if (selectedReviews.length > 0) {
currentItems?.forEach((item) => {
if (selectedReviews.includes(item.id)) {
item.has_been_reviewed = true;
markItemAsReviewed(item);
}
});
switch (key) {
case "a":
if (modifiers.ctrl && !modifiers.repeat) {
onSelectAllReviews();
return true;
}
break;
case "r":
if (selectedReviews.length > 0 && !modifiers.repeat) {
currentItems?.forEach((item) => {
if (selectedReviews.includes(item.id)) {
item.has_been_reviewed = true;
markItemAsReviewed(item);
}
});
setSelectedReviews([]);
return true;
}
break;
case "Escape":
setSelectedReviews([]);
}
break;
case "PageDown":
contentRef.current?.scrollBy({
top: contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
case "PageUp":
contentRef.current?.scrollBy({
top: -contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
}
});
return true;
}
return false;
},
contentRef,
);
return (
<>

View File

@@ -309,21 +309,25 @@ export default function LiveCameraView({
useKeyboardListener(["m"], (key, modifiers) => {
if (!modifiers.down) {
return;
return true;
}
switch (key) {
case "m":
if (supportsAudioOutput) {
setAudio(!audio);
return true;
}
break;
case "t":
if (supports2WayTalk) {
setMic(!mic);
return true;
}
break;
}
return false;
});
// layout state

View File

@@ -308,16 +308,24 @@ export default function SearchView({
const onKeyboardShortcut = useCallback(
(key: string | null, modifiers: KeyModifiers) => {
if (!modifiers.down || !uniqueResults || inputFocused) {
return;
if (inputFocused) {
return false;
}
if (!modifiers.down || !uniqueResults) {
return true;
}
switch (key) {
case "a":
if (modifiers.ctrl) {
if (modifiers.ctrl && !modifiers.repeat) {
onSelectAllObjects();
return true;
}
break;
case "Escape":
setSelectedObjects([]);
return true;
case "ArrowLeft":
if (uniqueResults.length > 0) {
const currentIndex = searchDetail
@@ -334,8 +342,7 @@ export default function SearchView({
setSearchDetail(uniqueResults[newIndex]);
}
break;
return true;
case "ArrowRight":
if (uniqueResults.length > 0) {
const currentIndex = searchDetail
@@ -351,28 +358,18 @@ export default function SearchView({
setSearchDetail(uniqueResults[newIndex]);
}
break;
case "PageDown":
contentRef.current?.scrollBy({
top: contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
case "PageUp":
contentRef.current?.scrollBy({
top: -contentRef.current.clientHeight / 2,
behavior: "smooth",
});
break;
return true;
}
return false;
},
[uniqueResults, inputFocused, onSelectAllObjects, searchDetail],
);
useKeyboardListener(
["a", "ArrowLeft", "ArrowRight", "PageDown", "PageUp"],
["a", "Escape", "ArrowLeft", "ArrowRight"],
onKeyboardShortcut,
!inputFocused,
contentRef,
);
// scroll into view