Compare commits


59 Commits

Author SHA1 Message Date
blakeblackshear
700bd1e3ef use a thread to capture frames from the subprocess so it can be killed properly 2019-07-30 19:11:22 -05:00
Alexis Birkill
c9e9f7a735 Fix comparison of object x-coord against mask (#52) 2019-07-30 19:11:22 -05:00
blakeblackshear
aea4dc8724 a few fixes 2019-07-30 19:11:22 -05:00
blakeblackshear
12d5007b90 add required packages for VAAPI 2019-07-30 19:11:22 -05:00
blakeblackshear
8970e73f75 comment formatting and comment out mask in example config 2019-07-30 19:11:22 -05:00
blakeblackshear
1ba006b24f add some comments to the sample config 2019-07-30 19:11:22 -05:00
blakeblackshear
4a58f16637 tweak the label position 2019-07-30 19:11:22 -05:00
blakeblackshear
436b876b24 add support for ffmpeg hwaccel params and better mask handling 2019-07-30 19:11:22 -05:00
blakeblackshear
a770ab7f69 specify a client id for frigate 2019-07-30 19:11:22 -05:00
blakeblackshear
806acaf445 update dockerignore and debug option 2019-07-30 19:11:22 -05:00
Kyle Niewiada
c653567cc1 Add area labels to bounding boxes (#47)
* Add object size to the bounding box

Remove script from Dockerfile

Fix framerate command

Move default value for framerate

update dockerfile

dockerfile changes

Add person_area label to surrounding box

Update dockerfile

ffmpeg config bug

Add `person_area` label to `best_person` frame

Resolve debug view showing area label for non-persons

Add object size to the bounding box

Add object size to the bounding box

* Move object area outside of conditional to work with all object types
2019-07-30 19:11:22 -05:00
blakeblackshear
8fee8f86a2 take_frame config example 2019-07-30 19:11:22 -05:00
blakeblackshear
59a4b0e650 add ability to process every nth frame 2019-07-30 19:11:22 -05:00
blakeblackshear
834a3df0bc added missing scripts 2019-07-30 19:11:22 -05:00
blakeblackshear
c41b104997 extra ffmpeg params to reduce latency 2019-07-30 19:11:22 -05:00
blakeblackshear
7028b05856 add a benchmark script 2019-07-30 19:11:22 -05:00
blakeblackshear
2d22a04391 reduce verbosity of ffmpeg 2019-07-30 19:11:22 -05:00
blakeblackshear
baa587028b use a regular subprocess for ffmpeg, refactor bounding box drawing 2019-07-30 19:11:22 -05:00
blakeblackshear
2b51dc3e5b experimental: running ffmpeg directly and capturing raw frames 2019-07-30 19:11:22 -05:00
blakeblackshear
9f8278ea8f working odroid build, still needs hwaccel 2019-07-30 19:11:22 -05:00
Blake Blackshear
56b9c754f5 Update README.md 2019-06-18 06:19:13 -07:00
Blake Blackshear
5c4f5ef3f0 Create FUNDING.yml 2019-06-18 06:15:05 -07:00
Blake Blackshear
8c924896c5 Merge pull request #36 from drcrimzon/patch-1
Add MQTT connection error handling
2019-05-15 07:10:53 -05:00
Mike Wilkinson
2c2f0044b9 Remove error redundant check 2019-05-14 11:09:57 -04:00
Mike Wilkinson
874e9085a7 Add MQTT connection error handling 2019-05-14 08:34:14 -04:00
Blake Blackshear
e791d6646b Merge pull request #34 from blakeblackshear/watchdog
0.1.2
2019-05-11 07:43:09 -05:00
blakeblackshear
3019b0218c make the threshold configurable per region. fixes #31 2019-05-11 07:39:27 -05:00
blakeblackshear
6900e140d5 add a watchdog to the capture process to detect silent failures. fixes #27 2019-05-11 07:16:15 -05:00
Blake Blackshear
911c1b2bfa Merge pull request #32 from tubalainen/patch-2
Clarification on username and password for MQTT
2019-05-11 07:14:19 -05:00
Blake Blackshear
f4587462cf Merge pull request #33 from tubalainen/patch-3
Update of the home assistant integration example
2019-05-11 07:14:01 -05:00
tubalainen
cac1faa8ac Update of the home assistant integration example
sensor to binary_sensor
device_class type "moving" does not exist, update to "motion"
2019-05-10 16:47:40 +02:00
tubalainen
9525bae5a3 Clarification on username and password for MQTT 2019-05-10 16:36:22 +02:00
blakeblackshear
dbcfd109f6 fix missing import 2019-05-10 06:19:39 -05:00
Blake Blackshear
f95d8b6210 Merge pull request #26 from blakeblackshear/mask
add the ability to mask the standing location of a person
2019-05-01 06:43:32 -05:00
blakeblackshear
4dacf02ef9 add the ability to mask the standing location of a person 2019-04-30 20:35:22 -05:00
Blake Blackshear
3e803b6a03 Merge pull request #20 from blakeblackshear/edgetpu
edgetpu
2019-03-30 08:28:36 -05:00
blakeblackshear
7a7f507781 update diagram 2019-03-30 08:22:41 -05:00
blakeblackshear
e0b9b616ce cleanup and update readme 2019-03-30 07:58:31 -05:00
blakeblackshear
4476bd8a13 log capture process pid 2019-03-29 21:18:20 -05:00
blakeblackshear
5aa3775c77 create a camera object for each camera in the config 2019-03-29 21:14:24 -05:00
blakeblackshear
edf0cd36df add back flask endpoints 2019-03-29 21:02:40 -05:00
blakeblackshear
0279121d77 WIP: convert to camera class 2019-03-29 20:49:27 -05:00
blakeblackshear
8774e537dc implementing a config file for a single camera 2019-03-28 07:30:58 -05:00
blakeblackshear
0514eeac03 switch to a thread for object detection 2019-03-27 20:44:57 -05:00
blakeblackshear
a074945394 missing param and updated readme 2019-03-27 06:55:32 -05:00
blakeblackshear
a26d2217d4 implement min person size again 2019-03-27 06:45:27 -05:00
blakeblackshear
200d769003 removing motion detection 2019-03-27 06:17:00 -05:00
blakeblackshear
48aa245914 convert docker build to x86 2019-03-26 05:44:33 -05:00
blakeblackshear
ada8ffccf9 fix for queue size growing too large 2019-03-25 20:35:44 -05:00
blakeblackshear
bca4e78e9a use a queue instead 2019-03-25 06:24:36 -05:00
blakeblackshear
7d3027e056 looping over all regions with motion. ugly, but working 2019-03-20 07:11:38 -05:00
blakeblackshear
c406fda288 fixes 2019-03-19 06:29:58 -05:00
blakeblackshear
8ff9a982b6 start the detection process 2019-03-18 07:48:04 -05:00
blakeblackshear
f2c205be99 prep frames for object detection in a separate process 2019-03-18 07:24:24 -05:00
blakeblackshear
862aa2d3f0 only resize when needed 2019-03-17 20:12:31 -05:00
blakeblackshear
8bae05cfe2 first working version, single region and motion detection disabled 2019-03-17 09:03:52 -05:00
blakeblackshear
de9c3f4d74 wait 5 seconds to clear the motion flag 2019-03-15 20:16:19 -05:00
blakeblackshear
c12e19349e only cleanup old objects when motion is detected so stationary objects are still detected 2019-03-15 20:15:41 -05:00
blakeblackshear
afb70f11a8 switch mqtt to a binary on/off instead of sending a message for each score 2019-03-12 20:54:43 -05:00
20 changed files with 784 additions and 734 deletions

.dockerignore

@@ -1 +1,6 @@
README.md
README.md
diagram.png
.gitignore
debug
config/
*.pyc

.github/FUNDING.yml (new file)

@@ -0,0 +1 @@
ko_fi: blakeblackshear

Dockerfile

@@ -1,68 +1,70 @@
FROM ubuntu:16.04
FROM ubuntu:18.04
# Install system packages
RUN apt-get -qq update && apt-get -qq install --no-install-recommends -y python3 \
python3-dev \
python-pil \
python-lxml \
python-tk \
ARG DEVICE
# Install packages for apt repo
RUN apt-get -qq update && apt-get -qq install --no-install-recommends -y \
apt-transport-https \
ca-certificates \
curl \
wget \
gnupg-agent \
dirmngr \
software-properties-common \
&& rm -rf /var/lib/apt/lists/*
COPY scripts/install_odroid_repo.sh .
RUN if [ "$DEVICE" = "odroid" ]; then \
sh /install_odroid_repo.sh; \
fi
RUN apt-get -qq update && apt-get -qq install --no-install-recommends -y \
python3 \
# OpenCV dependencies
ffmpeg \
build-essential \
cmake \
git \
libgtk2.0-dev \
pkg-config \
libavcodec-dev \
libavformat-dev \
libswscale-dev \
libtbb2 \
libtbb-dev \
cmake \
unzip \
pkg-config \
libjpeg-dev \
libpng-dev \
libtiff-dev \
libjasper-dev \
libdc1394-22-dev \
x11-apps \
wget \
vim \
ffmpeg \
unzip \
libavcodec-dev \
libavformat-dev \
libswscale-dev \
libv4l-dev \
libxvidcore-dev \
libx264-dev \
libgtk-3-dev \
libatlas-base-dev \
gfortran \
python3-dev \
# Coral USB Python API Dependencies
libusb-1.0-0 \
python3-pip \
python3-pil \
python3-numpy \
libc++1 \
libc++abi1 \
libunwind8 \
libgcc1 \
# VAAPI drivers for Intel hardware accel
libva-drm2 libva2 i965-va-driver vainfo \
&& rm -rf /var/lib/apt/lists/*
# Install core packages
RUN wget -q -O /tmp/get-pip.py --no-check-certificate https://bootstrap.pypa.io/get-pip.py && python3 /tmp/get-pip.py
RUN pip install -U pip \
numpy \
matplotlib \
notebook \
jupyter \
pandas \
moviepy \
tensorflow \
keras \
autovizwidget \
Flask \
imutils \
paho-mqtt
# Install tensorflow models object detection
RUN GIT_SSL_NO_VERIFY=true git clone -q https://github.com/tensorflow/models /usr/local/lib/python3.5/dist-packages/tensorflow/models
RUN wget -q -P /usr/local/src/ --no-check-certificate https://github.com/google/protobuf/releases/download/v3.5.1/protobuf-python-3.5.1.tar.gz
# Download & build protobuf-python
RUN cd /usr/local/src/ \
&& tar xf protobuf-python-3.5.1.tar.gz \
&& rm protobuf-python-3.5.1.tar.gz \
&& cd /usr/local/src/protobuf-3.5.1/ \
&& ./configure \
&& make \
&& make install \
&& ldconfig \
&& rm -rf /usr/local/src/protobuf-3.5.1/
# Add dataframe display widget
RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension
paho-mqtt \
PyYAML
# Download & build OpenCV
# TODO: use multistage build to reduce image size:
# https://medium.com/@denismakogon/pain-and-gain-running-opencv-application-with-golang-and-docker-on-alpine-3-7-435aa11c7aec
# https://www.merixstudio.com/blog/docker-multi-stage-builds-python-development/
RUN wget -q -P /usr/local/src/ --no-check-certificate https://github.com/opencv/opencv/archive/4.0.1.zip
RUN cd /usr/local/src/ \
&& unzip 4.0.1.zip \
@@ -73,18 +75,35 @@ RUN cd /usr/local/src/ \
&& cmake -D CMAKE_INSTALL_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local/ .. \
&& make -j4 \
&& make install \
&& ldconfig \
&& rm -rf /usr/local/src/opencv-4.0.1
# Download and install EdgeTPU libraries for Coral
RUN wget https://dl.google.com/coral/edgetpu_api/edgetpu_api_latest.tar.gz -O edgetpu_api.tar.gz --trust-server-names \
&& tar xzf edgetpu_api.tar.gz
COPY scripts/install_edgetpu_api.sh edgetpu_api/install.sh
RUN cd edgetpu_api \
&& /bin/bash install.sh
# Copy a python 3.6 version
RUN cd /usr/local/lib/python3.6/dist-packages/edgetpu/swig/ \
&& ln -s _edgetpu_cpp_wrapper.cpython-35m-arm-linux-gnueabihf.so _edgetpu_cpp_wrapper.cpython-36m-arm-linux-gnueabihf.so
# symlink the model and labels
RUN wget https://dl.google.com/coral/canned_models/mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite -O mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite --trust-server-names
RUN wget https://dl.google.com/coral/canned_models/coco_labels.txt -O coco_labels.txt --trust-server-names
RUN ln -s mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite /frozen_inference_graph.pb
RUN ln -s /coco_labels.txt /label_map.pbtext
# Minimize image size
RUN (apt-get autoremove -y; \
apt-get autoclean -y)
# Set TF object detection available
ENV PYTHONPATH "$PYTHONPATH:/usr/local/lib/python3.5/dist-packages/tensorflow/models/research:/usr/local/lib/python3.5/dist-packages/tensorflow/models/research/slim"
RUN cd /usr/local/lib/python3.5/dist-packages/tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=.
WORKDIR /opt/frigate/
ADD frigate frigate/
COPY detect_objects.py .
COPY benchmark.py .
CMD ["python3", "-u", "detect_objects.py"]
CMD ["python3", "-u", "detect_objects.py"]

README.md

@@ -1,18 +1,20 @@
<a href='https://ko-fi.com/P5P7XGO9' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://az743702.vo.msecnd.net/cdn/kofi4.png?v=2' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
# Frigate - Realtime Object Detection for RTSP Cameras
**Note:** This version requires the use of a [Google Coral USB Accelerator](https://coral.withgoogle.com/products/accelerator/)
Uses OpenCV and Tensorflow to perform realtime object detection locally for RTSP cameras. Designed for integration with HomeAssistant or others via MQTT.
- Leverages multiprocessing and threads heavily with an emphasis on realtime over processing every frame
- Allows you to define specific regions (squares) in the image to look for motion/objects
- Motion detection runs in a separate process per region and signals to object detection to avoid wasting CPU cycles looking for objects when there is no motion
- Object detection with Tensorflow runs in a separate process per region
- Detected objects are placed on a shared mp.Queue and aggregated into a list of recently detected objects in a separate thread
- A person score is calculated as the sum of all scores/5
- Motion and object info is published over MQTT for integration into HomeAssistant or others
- Allows you to define specific regions (squares) in the image to look for objects
- No motion detection (for now)
- Object detection with Tensorflow runs in a separate thread
- Object info is published over MQTT for integration into HomeAssistant as a binary sensor
- An endpoint is available to view an MJPEG stream for debugging
![Diagram](diagram.png)
## Example video
## Example video (from older version)
You see multiple bounding boxes because it draws bounding boxes from all frames in the past 1 second where a person was detected. Not all of the bounding boxes were from the current frame.
[![](http://img.youtube.com/vi/nqHbCtyo4dY/0.jpg)](http://www.youtube.com/watch?v=nqHbCtyo4dY "Frigate")
@@ -22,24 +24,16 @@ Build the container with
docker build -t frigate .
```
Download a model from the [zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md).
Download the cooresponding label map from [here](https://github.com/tensorflow/models/tree/master/research/object_detection/data).
The `mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite` model is included and used by default. You can use your own model and labels by mounting files in the container at `/frozen_inference_graph.pb` and `/label_map.pbtext`. Models must be compatible with the Coral according to [this](https://coral.withgoogle.com/models/).
Run the container with
```
docker run --rm \
-v <path_to_frozen_detection_graph.pb>:/frozen_inference_graph.pb:ro \
-v <path_to_labelmap.pbtext>:/label_map.pbtext:ro \
--privileged \
-v /dev/bus/usb:/dev/bus/usb \
-v <path_to_config_dir>:/config:ro \
-p 5000:5000 \
-e RTSP_URL='<rtsp_url>' \
-e REGIONS='<box_size_1>,<x_offset_1>,<y_offset_1>,<min_person_size_1>,<min_motion_size_1>,<mask_file_1>:<box_size_2>,<x_offset_2>,<y_offset_2>,<min_person_size_2>,<min_motion_size_2>,<mask_file_2>' \
-e MQTT_HOST='your.mqtthost.com' \
-e MQTT_USER='username' \
-e MQTT_PASS='password' \
-e MQTT_TOPIC_PREFIX='cameras/1' \
-e DEBUG='0' \
-e RTSP_PASSWORD='password' \
frigate:latest
```
@@ -48,100 +42,59 @@ Example docker-compose:
frigate:
container_name: frigate
restart: unless-stopped
privileged: true
image: frigate:latest
volumes:
- <path_to_frozen_detection_graph.pb>:/frozen_inference_graph.pb:ro
- <path_to_labelmap.pbtext>:/label_map.pbtext:ro
- /dev/bus/usb:/dev/bus/usb
- <path_to_config>:/config
ports:
- "127.0.0.1:5000:5000"
- "5000:5000"
environment:
RTSP_URL: "<rtsp_url>"
REGIONS: "<box_size_1>,<x_offset_1>,<y_offset_1>,<min_person_size_1>,<min_motion_size_1>,<mask_file_1>:<box_size_2>,<x_offset_2>,<y_offset_2>,<min_person_size_2>,<min_motion_size_2>,<mask_file_2>"
MQTT_HOST: "your.mqtthost.com"
MQTT_USER: "username" #optional
MQTT_PASS: "password" #optional
MQTT_TOPIC_PREFIX: "cameras/1"
DEBUG: "0"
RTSP_PASSWORD: "password"
```
Here is an example `REGIONS` env variable:
`350,0,300,5000,200,mask-0-300.bmp:400,350,250,2000,200,mask-350-250.bmp:400,750,250,2000,200,mask-750-250.bmp`
A `config.yml` file must exist in the `config` directory. See example [here](config/config.yml).
First region broken down (all are required):
- `350` - size of the square (350px by 350px)
- `0` - x coordinate of upper left corner (top left of image is 0,0)
- `300` - y coordinate of upper left corner (top left of image is 0,0)
- `5000` - minimum person bounding box size (width*height for bounding box of identified person)
- `200` - minimum number of changed pixels to trigger motion
- `mask-0-300.bmp` - a bmp file with the masked regions as pure black, must be the same size as the region
Mask files go in the `/config` directory.
Access the mjpeg stream at http://localhost:5000
Access the mjpeg stream at `http://localhost:5000/<camera_name>` and the best person snapshot at `http://localhost:5000/<camera_name>/best_person.jpg`
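The snapshot endpoint can be polled by any HTTP client. A minimal sketch using only the Python standard library, assuming frigate is reachable on localhost and a camera named `back` as in the sample config:

```python
import urllib.request

# Hypothetical camera name; it must match a key under `cameras:` in config.yml
camera_name = "back"
url = "http://localhost:5000/{}/best_person.jpg".format(camera_name)

# Download the current best-person snapshot and save it to disk
with urllib.request.urlopen(url) as resp, open("best_person.jpg", "wb") as f:
    f.write(resp.read())
```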
## Integration with HomeAssistant
```
camera:
- name: Camera Last Person
platform: generic
still_image_url: http://<ip>:5000/best_person.jpg
still_image_url: http://<ip>:5000/<camera_name>/best_person.jpg
binary_sensor:
- name: Camera Motion
- name: Camera Person
platform: mqtt
state_topic: "cameras/1/motion"
device_class: motion
availability_topic: "cameras/1/available"
sensor:
- name: Camera Person Score
platform: mqtt
state_topic: "cameras/1/objects"
state_topic: "frigate/<camera_name>/objects"
value_template: '{{ value_json.person }}'
unit_of_measurement: '%'
availability_topic: "cameras/1/available"
device_class: motion
availability_topic: "frigate/available"
```
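Outside of Home Assistant, the same topic can be watched directly with paho-mqtt (already a frigate dependency). A minimal sketch, assuming the broker host from the sample config and a camera named `back`; the payload shape (`{"person": "ON"}` / `{"person": "OFF"}`) matches the `MqttObjectPublisher` change shown later in this diff:

```python
import json
import paho.mqtt.client as mqtt

# Hypothetical values for illustration; use your own broker and camera name
MQTT_HOST = "mqtt.server.com"
TOPIC = "frigate/back/objects"

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    print("person:", payload.get("person"))  # "ON" or "OFF"

client = mqtt.Client()
client.on_message = on_message
client.connect(MQTT_HOST, 1883, 60)
client.subscribe(TOPIC)
client.loop_forever()
```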
## Tips
- Lower the framerate of the RTSP feed on the camera to reduce the CPU usage for capturing the feed
- Use SSDLite models to reduce CPU usage
## Future improvements
- [ ] Build tensorflow from source for CPU optimizations
- [x] Remove motion detection for now
- [x] Try running object detection in a thread rather than a process
- [x] Implement min person size again
- [x] Switch to a config file
- [x] Handle multiple cameras in the same container
- [ ] Attempt to figure out coral symlinking
- [ ] Add object list to config with min scores for mqtt
- [ ] Move mjpeg encoding to a separate process
- [ ] Simplify motion detection (check entire image against mask, resize instead of gaussian blur)
- [ ] See if motion detection is even worth running
- [ ] Scan for people across entire image rather than specfic regions
- [ ] Dynamically resize detection area and follow people
- [ ] Add ability to turn detection on and off via MQTT
- [ ] MQTT motion occasionally gets stuck ON
- [ ] Output movie clips of people for notifications, etc.
- [ ] Integrate with homeassistant push camera
- [ ] Merge bounding boxes that span multiple regions
- [ ] Switch to a config file
- [ ] Allow motion regions to be different than object detection regions
- [ ] Implement mode to save labeled objects for training
- [ ] Try and reduce CPU usage by simplifying the tensorflow model to just include the objects we care about
- [ ] Look into GPU accelerated decoding of RTSP stream
- [ ] Send video over a socket and use JSMPEG
- [ ] Look into neural compute stick
## Building Tensorflow from source for CPU optimizations
https://www.tensorflow.org/install/source#docker_linux_builds
used `tensorflow/tensorflow:1.12.0-devel-py3`
## Optimizing the graph (cant say I saw much difference in CPU usage)
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md#optimizing-for-deployment
```
docker run -it -v ${PWD}:/lab -v ${PWD}/../back_camera_model/models/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb:/frozen_inference_graph.pb:ro tensorflow/tensorflow:1.12.0-devel-py3 bash
bazel build tensorflow/tools/graph_transforms:transform_graph
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/frozen_inference_graph.pb \
--out_graph=/lab/optimized_inception_graph.pb \
--inputs='image_tensor' \
--outputs='num_detections,detection_scores,detection_boxes,detection_classes' \
--transforms='
strip_unused_nodes(type=float, shape="1,300,300,3")
remove_nodes(op=Identity, op=CheckNumerics)
fold_constants(ignore_errors=true)
fold_batch_norms
fold_old_batch_norms'
```
- [x] Look into neural compute stick

benchmark.py (new file)

@@ -0,0 +1,20 @@
import statistics
import numpy as np
from edgetpu.detection.engine import DetectionEngine

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = '/frozen_inference_graph.pb'

# Load the edgetpu engine and labels
engine = DetectionEngine(PATH_TO_CKPT)

frame = np.zeros((300,300,3), np.uint8)
flattened_frame = np.expand_dims(frame, axis=0).flatten()

detection_times = []

for x in range(0, 1000):
    objects = engine.DetectWithInputTensor(flattened_frame, threshold=0.1, top_k=3)
    detection_times.append(engine.get_inference_time())

print("Average inference time: " + str(statistics.mean(detection_times)))

config/back-mask.bmp (new binary file, 1.8 MiB; not shown)

config/config.yml (new file)

@@ -0,0 +1,65 @@
web_port: 5000

mqtt:
  host: mqtt.server.com
  topic_prefix: frigate
  # user: username # Optional -- Uncomment for use
  # password: password # Optional -- Uncomment for use

cameras:
  back:
    rtsp:
      user: viewer
      host: 10.0.10.10
      port: 554
      # values that begin with a "$" will be replaced with environment variable
      password: $RTSP_PASSWORD
      path: /cam/realmonitor?channel=1&subtype=2

    ################
    ## Optional mask. Must be the same dimensions as your video feed.
    ## The mask works by looking at the bottom center of the bounding box for the detected
    ## person in the image. If that pixel in the mask is a black pixel, it ignores it as a
    ## false positive. In my mask, the grass and driveway visible from my backdoor camera
    ## are white. The garage doors, sky, and trees (anywhere it would be impossible for a
    ## person to stand) are black.
    ################
    # mask: back-mask.bmp

    ################
    # Allows you to limit the framerate within frigate for cameras that do not support
    # custom framerates. A value of 1 tells frigate to look at every frame, 2 every 2nd frame,
    # 3 every 3rd frame, etc.
    ################
    take_frame: 1

    ################
    # Optional hardware acceleration parameters for ffmpeg. If your hardware supports it, it can
    # greatly reduce the CPU power used to decode the video stream. You will need to determine which
    # parameters work for your specific hardware. These may work for those with Intel hardware that
    # supports QuickSync.
    ################
    # ffmpeg_hwaccel_args:
    #   - -hwaccel
    #   - vaapi
    #   - -hwaccel_device
    #   - /dev/dri/renderD128
    #   - -hwaccel_output_format
    #   - yuv420p

    regions:
      - size: 350
        x_offset: 0
        y_offset: 300
        min_person_area: 5000
        threshold: 0.5
      - size: 400
        x_offset: 350
        y_offset: 250
        min_person_area: 2000
        threshold: 0.5
      - size: 400
        x_offset: 750
        y_offset: 250
        min_person_area: 2000
        threshold: 0.5
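A minimal sketch of the mask check described in the comments above, assuming a full-frame grayscale mask and bounding-box coordinates in full-frame pixels (illustrative only, not the in-repo implementation):

```python
import cv2

# Load the mask as a single-channel image (same dimensions as the video feed)
mask = cv2.imread("/config/back-mask.bmp", cv2.IMREAD_GRAYSCALE)

def ignore_as_false_positive(obj, mask):
    # Look at the bottom center of the person's bounding box
    x = int((obj['xmin'] + obj['xmax']) / 2)
    y = obj['ymax']
    # A black pixel (value 0) means the person is "standing" somewhere impossible
    return mask[y, x] == 0
```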

Binary file removed (239 KiB; not shown)

Binary file removed (313 KiB; not shown)

Binary file removed (313 KiB; not shown)

detect_objects.py

@@ -1,247 +1,99 @@
import os
import cv2
import imutils
import time
import datetime
import ctypes
import logging
import multiprocessing as mp
import threading
import json
from contextlib import closing
import queue
import yaml
import numpy as np
from object_detection.utils import visualization_utils as vis_util
from flask import Flask, Response, make_response, send_file
from flask import Flask, Response, make_response
import paho.mqtt.client as mqtt
from frigate.util import tonumpyarray
from frigate.mqtt import MqttMotionPublisher, MqttObjectPublisher
from frigate.objects import ObjectParser, ObjectCleaner, BestPersonFrame
from frigate.motion import detect_motion
from frigate.video import fetch_frames, FrameTracker
from frigate.object_detection import detect_objects
from frigate.video import Camera
from frigate.object_detection import PreppedQueueProcessor
RTSP_URL = os.getenv('RTSP_URL')
with open('/config/config.yml') as f:
CONFIG = yaml.safe_load(f)
MQTT_HOST = os.getenv('MQTT_HOST')
MQTT_USER = os.getenv('MQTT_USER')
MQTT_PASS = os.getenv('MQTT_PASS')
MQTT_TOPIC_PREFIX = os.getenv('MQTT_TOPIC_PREFIX')
MQTT_HOST = CONFIG['mqtt']['host']
MQTT_PORT = CONFIG.get('mqtt', {}).get('port', 1883)
MQTT_TOPIC_PREFIX = CONFIG.get('mqtt', {}).get('topic_prefix', 'frigate')
MQTT_USER = CONFIG.get('mqtt', {}).get('user')
MQTT_PASS = CONFIG.get('mqtt', {}).get('password')
# REGIONS = "350,0,300,50:400,350,250,50:400,750,250,50"
# REGIONS = "400,350,250,50"
REGIONS = os.getenv('REGIONS')
DEBUG = (os.getenv('DEBUG') == '1')
WEB_PORT = CONFIG.get('web_port', 5000)
DEBUG = (CONFIG.get('debug', '0') == '1')
def main():
DETECTED_OBJECTS = []
recent_motion_frames = {}
# Parse selected regions
regions = []
for region_string in REGIONS.split(':'):
region_parts = region_string.split(',')
region_mask_image = cv2.imread("/config/{}".format(region_parts[5]), cv2.IMREAD_GRAYSCALE)
region_mask = np.where(region_mask_image==[0])
regions.append({
'size': int(region_parts[0]),
'x_offset': int(region_parts[1]),
'y_offset': int(region_parts[2]),
'min_person_area': int(region_parts[3]),
'min_object_size': int(region_parts[4]),
'mask': region_mask,
# Event for motion detection signaling
'motion_detected': mp.Event(),
# create shared array for storing 10 detected objects
# note: this must be a double even though the value you are storing
# is a float. otherwise it stops updating the value in shared
# memory. probably something to do with the size of the memory block
'output_array': mp.Array(ctypes.c_double, 6*10)
})
# capture a single frame and check the frame shape so the correct array
# size can be allocated in memory
video = cv2.VideoCapture(RTSP_URL)
ret, frame = video.read()
if ret:
frame_shape = frame.shape
else:
print("Unable to capture video stream")
exit(1)
video.release()
# compute the flattened array length from the array shape
flat_array_length = frame_shape[0] * frame_shape[1] * frame_shape[2]
# create shared array for storing the full frame image data
shared_arr = mp.Array(ctypes.c_uint16, flat_array_length)
# create shared value for storing the frame_time
shared_frame_time = mp.Value('d', 0.0)
# Lock to control access to the frame
frame_lock = mp.Lock()
# Condition for notifying that a new frame is ready
frame_ready = mp.Condition()
# Condition for notifying that motion status changed globally
motion_changed = mp.Condition()
# Condition for notifying that objects were parsed
objects_parsed = mp.Condition()
# Queue for detected objects
object_queue = mp.Queue()
# shape current frame so it can be treated as an image
frame_arr = tonumpyarray(shared_arr).reshape(frame_shape)
# start the process to capture frames from the RTSP stream and store in a shared array
capture_process = mp.Process(target=fetch_frames, args=(shared_arr,
shared_frame_time, frame_lock, frame_ready, frame_shape, RTSP_URL))
capture_process.daemon = True
# for each region, start a separate process for motion detection and object detection
detection_processes = []
motion_processes = []
for region in regions:
detection_process = mp.Process(target=detect_objects, args=(shared_arr,
object_queue,
shared_frame_time,
frame_lock, frame_ready,
region['motion_detected'],
frame_shape,
region['size'], region['x_offset'], region['y_offset'],
region['min_person_area'],
DEBUG))
detection_process.daemon = True
detection_processes.append(detection_process)
motion_process = mp.Process(target=detect_motion, args=(shared_arr,
shared_frame_time,
frame_lock, frame_ready,
region['motion_detected'],
motion_changed,
frame_shape,
region['size'], region['x_offset'], region['y_offset'],
region['min_object_size'], region['mask'],
DEBUG))
motion_process.daemon = True
motion_processes.append(motion_process)
# start a thread to store recent motion frames for processing
frame_tracker = FrameTracker(frame_arr, shared_frame_time, frame_ready, frame_lock,
recent_motion_frames, motion_changed, [region['motion_detected'] for region in regions])
frame_tracker.start()
# start a thread to store the highest scoring recent person frame
best_person_frame = BestPersonFrame(objects_parsed, recent_motion_frames, DETECTED_OBJECTS,
motion_changed, [region['motion_detected'] for region in regions])
best_person_frame.start()
# start a thread to parse objects from the queue
object_parser = ObjectParser(object_queue, objects_parsed, DETECTED_OBJECTS)
object_parser.start()
# start a thread to expire objects from the detected objects list
object_cleaner = ObjectCleaner(objects_parsed, DETECTED_OBJECTS)
object_cleaner.start()
# connect to mqtt and setup last will
def on_connect(client, userdata, flags, rc):
def on_connect(client, userdata, flags, rc):
print("On connect called")
if rc != 0:
if rc == 3:
print ("MQTT Server unavailable")
elif rc == 4:
print ("MQTT Bad username or password")
elif rc == 5:
print ("MQTT Not authorized")
else:
print ("Unable to connect to MQTT: Connection refused. Error code: " + str(rc))
# publish a message to signal that the service is running
client.publish(MQTT_TOPIC_PREFIX+'/available', 'online', retain=True)
client = mqtt.Client()
client = mqtt.Client(client_id="frigate")
client.on_connect = on_connect
client.will_set(MQTT_TOPIC_PREFIX+'/available', payload='offline', qos=1, retain=True)
if not MQTT_USER is None:
client.username_pw_set(MQTT_USER, password=MQTT_PASS)
client.connect(MQTT_HOST, 1883, 60)
client.connect(MQTT_HOST, MQTT_PORT, 60)
client.loop_start()
# start a thread to publish object scores (currently only person)
mqtt_publisher = MqttObjectPublisher(client, MQTT_TOPIC_PREFIX, objects_parsed, DETECTED_OBJECTS)
mqtt_publisher.start()
# start thread to publish motion status
mqtt_motion_publisher = MqttMotionPublisher(client, MQTT_TOPIC_PREFIX, motion_changed,
[region['motion_detected'] for region in regions])
mqtt_motion_publisher.start()
# start the process of capturing frames
capture_process.start()
print("capture_process pid ", capture_process.pid)
# start the object detection processes
for detection_process in detection_processes:
detection_process.start()
print("detection_process pid ", detection_process.pid)
# start the motion detection processes
for motion_process in motion_processes:
motion_process.start()
print("motion_process pid ", motion_process.pid)
# Queue for prepped frames, max size set to (number of cameras * 5)
max_queue_size = len(CONFIG['cameras'].items())*5
prepped_frame_queue = queue.Queue(max_queue_size)
cameras = {}
for name, config in CONFIG['cameras'].items():
cameras[name] = Camera(name, config, prepped_frame_queue, client, MQTT_TOPIC_PREFIX)
prepped_queue_processor = PreppedQueueProcessor(
cameras,
prepped_frame_queue
)
prepped_queue_processor.start()
for name, camera in cameras.items():
camera.start()
print("Capture process for {}: {}".format(name, camera.get_capture_pid()))
# create a flask app that encodes frames a mjpeg on demand
app = Flask(__name__)
@app.route('/best_person.jpg')
def best_person():
frame = np.zeros(frame_shape, np.uint8) if best_person_frame.best_frame is None else best_person_frame.best_frame
ret, jpg = cv2.imencode('.jpg', frame)
@app.route('/<camera_name>/best_person.jpg')
def best_person(camera_name):
best_person_frame = cameras[camera_name].get_best_person()
if best_person_frame is None:
best_person_frame = np.zeros((720,1280,3), np.uint8)
ret, jpg = cv2.imencode('.jpg', best_person_frame)
response = make_response(jpg.tobytes())
response.headers['Content-Type'] = 'image/jpg'
return response
@app.route('/')
def index():
@app.route('/<camera_name>')
def mjpeg_feed(camera_name):
# return a multipart response
return Response(imagestream(),
return Response(imagestream(camera_name),
mimetype='multipart/x-mixed-replace; boundary=frame')
def imagestream():
def imagestream(camera_name):
while True:
# max out at 5 FPS
time.sleep(0.2)
# make a copy of the current detected objects
detected_objects = DETECTED_OBJECTS.copy()
# lock and make a copy of the current frame
with frame_lock:
frame = frame_arr.copy()
# convert to RGB for drawing
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# draw the bounding boxes on the screen
for obj in detected_objects:
vis_util.draw_bounding_box_on_image_array(frame,
obj['ymin'],
obj['xmin'],
obj['ymax'],
obj['xmax'],
color='red',
thickness=2,
display_str_list=["{}: {}%".format(obj['name'],int(obj['score']*100))],
use_normalized_coordinates=False)
for region in regions:
color = (255,255,255)
if region['motion_detected'].is_set():
color = (0,255,0)
cv2.rectangle(frame, (region['x_offset'], region['y_offset']),
(region['x_offset']+region['size'], region['y_offset']+region['size']),
color, 2)
# convert back to BGR
frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
frame = cameras[camera_name].get_current_frame_with_objects()
# encode the image into a jpg
ret, jpg = cv2.imencode('.jpg', frame)
yield (b'--frame\r\n'
b'Content-Type: image/jpeg\r\n\r\n' + jpg.tobytes() + b'\r\n\r\n')
app.run(host='0.0.0.0', debug=False)
app.run(host='0.0.0.0', port=WEB_PORT, debug=False)
capture_process.join()
for detection_process in detection_processes:
detection_process.join()
for motion_process in motion_processes:
motion_process.join()
frame_tracker.join()
best_person_frame.join()
object_parser.join()
object_cleaner.join()
mqtt_publisher.join()
camera.join()
if __name__ == '__main__':
main()
main()

Binary file changed (308 KiB before, 283 KiB after; not shown)

frigate/motion.py (deleted)

@@ -1,109 +0,0 @@
import datetime
import numpy as np
import cv2
import imutils
from . util import tonumpyarray
# do the actual motion detection
def detect_motion(shared_arr, shared_frame_time, frame_lock, frame_ready, motion_detected, motion_changed,
frame_shape, region_size, region_x_offset, region_y_offset, min_motion_area, mask, debug):
# shape shared input array into frame for processing
arr = tonumpyarray(shared_arr).reshape(frame_shape)
avg_frame = None
avg_delta = None
frame_time = 0.0
motion_frames = 0
while True:
now = datetime.datetime.now().timestamp()
with frame_ready:
# if there isnt a frame ready for processing or it is old, wait for a signal
if shared_frame_time.value == frame_time or (now - shared_frame_time.value) > 0.5:
frame_ready.wait()
# lock and make a copy of the cropped frame
with frame_lock:
cropped_frame = arr[region_y_offset:region_y_offset+region_size, region_x_offset:region_x_offset+region_size].copy().astype('uint8')
frame_time = shared_frame_time.value
# convert to grayscale
gray = cv2.cvtColor(cropped_frame, cv2.COLOR_BGR2GRAY)
# apply image mask to remove areas from motion detection
gray[mask] = [255]
# apply gaussian blur
gray = cv2.GaussianBlur(gray, (21, 21), 0)
if avg_frame is None:
avg_frame = gray.copy().astype("float")
continue
# look at the delta from the avg_frame
frameDelta = cv2.absdiff(gray, cv2.convertScaleAbs(avg_frame))
if avg_delta is None:
avg_delta = frameDelta.copy().astype("float")
# compute the average delta over the past few frames
# the alpha value can be modified to configure how sensitive the motion detection is.
# higher values mean the current frame impacts the delta a lot, and a single raindrop may
# register as motion, too low and a fast moving person wont be detected as motion
# this also assumes that a person is in the same location across more than a single frame
cv2.accumulateWeighted(frameDelta, avg_delta, 0.2)
# compute the threshold image for the current frame
current_thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]
# black out everything in the avg_delta where there isnt motion in the current frame
avg_delta_image = cv2.convertScaleAbs(avg_delta)
avg_delta_image[np.where(current_thresh==[0])] = [0]
# then look for deltas above the threshold, but only in areas where there is a delta
# in the current frame. this prevents deltas from previous frames from being included
thresh = cv2.threshold(avg_delta_image, 25, 255, cv2.THRESH_BINARY)[1]
# dilate the thresholded image to fill in holes, then find contours
# on thresholded image
thresh = cv2.dilate(thresh, None, iterations=2)
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
motion_found = False
# loop over the contours
for c in cnts:
# if the contour is big enough, count it as motion
contour_area = cv2.contourArea(c)
if contour_area > min_motion_area:
motion_found = True
if debug:
cv2.drawContours(cropped_frame, [c], -1, (0, 255, 0), 2)
x, y, w, h = cv2.boundingRect(c)
cv2.putText(cropped_frame, str(contour_area), (x, y),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 100, 0), 2)
else:
break
if motion_found:
motion_frames += 1
# if there have been enough consecutive motion frames, report motion
if motion_frames >= 3:
# only average in the current frame if the difference persists for at least 3 frames
cv2.accumulateWeighted(gray, avg_frame, 0.01)
motion_detected.set()
with motion_changed:
motion_changed.notify_all()
else:
# when no motion, just keep averaging the frames together
cv2.accumulateWeighted(gray, avg_frame, 0.01)
motion_frames = 0
if motion_detected.is_set():
motion_detected.clear()
with motion_changed:
motion_changed.notify_all()
if debug and motion_frames == 3:
cv2.imwrite("/lab/debug/motion-{}-{}-{}.jpg".format(region_x_offset, region_y_offset, datetime.datetime.now().timestamp()), cropped_frame)
cv2.imwrite("/lab/debug/avg_delta-{}-{}-{}.jpg".format(region_x_offset, region_y_offset, datetime.datetime.now().timestamp()), avg_delta_image)

frigate/mqtt.py

@@ -1,29 +1,6 @@
import json
import threading
class MqttMotionPublisher(threading.Thread):
def __init__(self, client, topic_prefix, motion_changed, motion_flags):
threading.Thread.__init__(self)
self.client = client
self.topic_prefix = topic_prefix
self.motion_changed = motion_changed
self.motion_flags = motion_flags
def run(self):
last_sent_motion = ""
while True:
with self.motion_changed:
self.motion_changed.wait()
# send message for motion
motion_status = 'OFF'
if any(obj.is_set() for obj in self.motion_flags):
motion_status = 'ON'
if last_sent_motion != motion_status:
last_sent_motion = motion_status
self.client.publish(self.topic_prefix+'/motion', motion_status, retain=False)
class MqttObjectPublisher(threading.Thread):
def __init__(self, client, topic_prefix, objects_parsed, detected_objects):
threading.Thread.__init__(self)
@@ -43,11 +20,11 @@ class MqttObjectPublisher(threading.Thread):
with self.objects_parsed:
self.objects_parsed.wait()
# add all the person scores in detected objects and
# average over past 1 seconds (5fps)
# add all the person scores in detected objects
detected_objects = self._detected_objects.copy()
avg_person_score = sum([obj['score'] for obj in detected_objects if obj['name'] == 'person'])/5
payload['person'] = int(avg_person_score*100)
person_score = sum([obj['score'] for obj in detected_objects if obj['name'] == 'person'])
# if the person score is more than 100, set person to ON
payload['person'] = 'ON' if int(person_score*100) > 100 else 'OFF'
# send message for objects if different
new_payload = json.dumps(payload, sort_keys=True)
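A small worked example of the new threshold logic above, with made-up scores:

```python
# Two recent 'person' detections with scores 0.7 and 0.5
person_score = 0.7 + 0.5
payload = {'person': 'ON' if int(person_score * 100) > 100 else 'OFF'}
print(payload)  # 120 > 100, so {'person': 'ON'}
```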

frigate/object_detection.py

@@ -1,114 +1,112 @@
import datetime
import time
import cv2
import threading
import numpy as np
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
from edgetpu.detection.engine import DetectionEngine
from . util import tonumpyarray
# TODO: make dynamic?
NUM_CLASSES = 90
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = '/label_map.pbtext'
# Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES,
use_display_name=True)
category_index = label_map_util.create_category_index(categories)
# Function to read labels from text files.
def ReadLabelFile(file_path):
with open(file_path, 'r') as f:
lines = f.readlines()
ret = {}
for line in lines:
pair = line.strip().split(maxsplit=1)
ret[int(pair[0])] = pair[1].strip()
return ret
# do the actual object detection
def tf_detect_objects(cropped_frame, sess, detection_graph, region_size, region_x_offset, region_y_offset, debug):
# Expand dimensions since the model expects images to have shape: [1, None, None, 3]
image_np_expanded = np.expand_dims(cropped_frame, axis=0)
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
class PreppedQueueProcessor(threading.Thread):
def __init__(self, cameras, prepped_frame_queue):
# Each box represents a part of the image where a particular object was detected.
boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
# Each score represent how level of confidence for each of the objects.
# Score is shown on the result image, together with the class label.
scores = detection_graph.get_tensor_by_name('detection_scores:0')
classes = detection_graph.get_tensor_by_name('detection_classes:0')
num_detections = detection_graph.get_tensor_by_name('num_detections:0')
# Actual detection.
(boxes, scores, classes, num_detections) = sess.run(
[boxes, scores, classes, num_detections],
feed_dict={image_tensor: image_np_expanded})
if debug:
if len([value for index,value in enumerate(classes[0]) if str(category_index.get(value).get('name')) == 'person' and scores[0,index] > 0.5]) > 0:
vis_util.visualize_boxes_and_labels_on_image_array(
cropped_frame,
np.squeeze(boxes),
np.squeeze(classes).astype(np.int32),
np.squeeze(scores),
category_index,
use_normalized_coordinates=True,
line_thickness=4)
cv2.imwrite("/lab/debug/obj-{}-{}-{}.jpg".format(region_x_offset, region_y_offset, datetime.datetime.now().timestamp()), cropped_frame)
# build an array of detected objects
objects = []
for index, value in enumerate(classes[0]):
score = scores[0, index]
if score > 0.5:
box = boxes[0, index].tolist()
objects.append({
'name': str(category_index.get(value).get('name')),
'score': float(score),
'ymin': int((box[0] * region_size) + region_y_offset),
'xmin': int((box[1] * region_size) + region_x_offset),
'ymax': int((box[2] * region_size) + region_y_offset),
'xmax': int((box[3] * region_size) + region_x_offset)
})
return objects
def detect_objects(shared_arr, object_queue, shared_frame_time, frame_lock, frame_ready,
motion_detected, frame_shape, region_size, region_x_offset, region_y_offset,
min_person_area, debug):
# shape shared input array into frame for processing
arr = tonumpyarray(shared_arr).reshape(frame_shape)
# Load a (frozen) Tensorflow model into memory before the processing loop
detection_graph = tf.Graph()
with detection_graph.as_default():
od_graph_def = tf.GraphDef()
with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def, name='')
sess = tf.Session(graph=detection_graph)
frame_time = 0.0
while True:
now = datetime.datetime.now().timestamp()
# wait until motion is detected
motion_detected.wait()
with frame_ready:
# if there isnt a frame ready for processing or it is old, wait for a new frame
if shared_frame_time.value == frame_time or (now - shared_frame_time.value) > 0.5:
frame_ready.wait()
threading.Thread.__init__(self)
self.cameras = cameras
self.prepped_frame_queue = prepped_frame_queue
# make a copy of the cropped frame
with frame_lock:
cropped_frame = arr[region_y_offset:region_y_offset+region_size, region_x_offset:region_x_offset+region_size].copy()
frame_time = shared_frame_time.value
# Load the edgetpu engine and labels
self.engine = DetectionEngine(PATH_TO_CKPT)
self.labels = ReadLabelFile(PATH_TO_LABELS)
# convert to RGB
cropped_frame_rgb = cv2.cvtColor(cropped_frame, cv2.COLOR_BGR2RGB)
# do the object detection
objects = tf_detect_objects(cropped_frame_rgb, sess, detection_graph, region_size, region_x_offset, region_y_offset, debug)
for obj in objects:
# ignore persons below the size threshold
if obj['name'] == 'person' and (obj['xmax']-obj['xmin'])*(obj['ymax']-obj['ymin']) < min_person_area:
continue
obj['frame_time'] = frame_time
object_queue.put(obj)
def run(self):
# process queue...
while True:
frame = self.prepped_frame_queue.get()
# Actual detection.
objects = self.engine.DetectWithInputTensor(frame['frame'], threshold=frame['region_threshold'], top_k=3)
# print(self.engine.get_inference_time())
# parse and pass detected objects back to the camera
parsed_objects = []
for obj in objects:
box = obj.bounding_box.flatten().tolist()
parsed_objects.append({
'frame_time': frame['frame_time'],
'name': str(self.labels[obj.label_id]),
'score': float(obj.score),
'xmin': int((box[0] * frame['region_size']) + frame['region_x_offset']),
'ymin': int((box[1] * frame['region_size']) + frame['region_y_offset']),
'xmax': int((box[2] * frame['region_size']) + frame['region_x_offset']),
'ymax': int((box[3] * frame['region_size']) + frame['region_y_offset'])
})
self.cameras[frame['camera_name']].add_objects(parsed_objects)
# should this be a region class?
class FramePrepper(threading.Thread):
def __init__(self, camera_name, shared_frame, frame_time, frame_ready,
frame_lock,
region_size, region_x_offset, region_y_offset, region_threshold,
prepped_frame_queue):
threading.Thread.__init__(self)
self.camera_name = camera_name
self.shared_frame = shared_frame
self.frame_time = frame_time
self.frame_ready = frame_ready
self.frame_lock = frame_lock
self.region_size = region_size
self.region_x_offset = region_x_offset
self.region_y_offset = region_y_offset
self.region_threshold = region_threshold
self.prepped_frame_queue = prepped_frame_queue
def run(self):
frame_time = 0.0
while True:
now = datetime.datetime.now().timestamp()
with self.frame_ready:
# if there isnt a frame ready for processing or it is old, wait for a new frame
if self.frame_time.value == frame_time or (now - self.frame_time.value) > 0.5:
self.frame_ready.wait()
# make a copy of the cropped frame
with self.frame_lock:
cropped_frame = self.shared_frame[self.region_y_offset:self.region_y_offset+self.region_size, self.region_x_offset:self.region_x_offset+self.region_size].copy()
frame_time = self.frame_time.value
# Resize to 300x300 if needed
if cropped_frame.shape != (300, 300, 3):
cropped_frame = cv2.resize(cropped_frame, dsize=(300, 300), interpolation=cv2.INTER_LINEAR)
# Expand dimensions since the model expects images to have shape: [1, 300, 300, 3]
frame_expanded = np.expand_dims(cropped_frame, axis=0)
# add the frame to the queue
if not self.prepped_frame_queue.full():
self.prepped_frame_queue.put({
'camera_name': self.camera_name,
'frame_time': frame_time,
'frame': frame_expanded.flatten().copy(),
'region_size': self.region_size,
'region_threshold': self.region_threshold,
'region_x_offset': self.region_x_offset,
'region_y_offset': self.region_y_offset
})
else:
print("queue full. moving on")

frigate/objects.py

@@ -2,22 +2,7 @@ import time
import datetime
import threading
import cv2
from object_detection.utils import visualization_utils as vis_util
class ObjectParser(threading.Thread):
def __init__(self, object_queue, objects_parsed, detected_objects):
threading.Thread.__init__(self)
self._object_queue = object_queue
self._objects_parsed = objects_parsed
self._detected_objects = detected_objects
def run(self):
while True:
obj = self._object_queue.get()
self._detected_objects.append(obj)
# notify that objects were parsed
with self._objects_parsed:
self._objects_parsed.notify_all()
from . util import draw_box_with_label
class ObjectCleaner(threading.Thread):
def __init__(self, objects_parsed, detected_objects):
@@ -28,14 +13,18 @@ class ObjectCleaner(threading.Thread):
def run(self):
while True:
# wait a bit before checking for expired frames
time.sleep(0.2)
# expire the objects that are more than 1 second old
now = datetime.datetime.now().timestamp()
# look for the first object found within the last second
# (newest objects are appended to the end)
detected_objects = self._detected_objects.copy()
num_to_delete = 0
for obj in detected_objects:
if now-obj['frame_time']<1:
if now-obj['frame_time']<2:
break
num_to_delete += 1
if num_to_delete > 0:
@@ -44,80 +33,56 @@ class ObjectCleaner(threading.Thread):
# notify that parsed objects were changed
with self._objects_parsed:
self._objects_parsed.notify_all()
# wait a bit before checking for more expired frames
time.sleep(0.2)
# Maintains the frame and person with the highest score from the most recent
# motion event
class BestPersonFrame(threading.Thread):
def __init__(self, objects_parsed, recent_frames, detected_objects, motion_changed, motion_regions):
def __init__(self, objects_parsed, recent_frames, detected_objects):
threading.Thread.__init__(self)
self.objects_parsed = objects_parsed
self.recent_frames = recent_frames
self.detected_objects = detected_objects
self.motion_changed = motion_changed
self.motion_regions = motion_regions
self.best_person = None
self.best_frame = None
def run(self):
motion_start = 0.0
motion_end = 0.0
while True:
# while there is motion
while len([r for r in self.motion_regions if r.is_set()]) > 0:
# wait until objects have been parsed
with self.objects_parsed:
self.objects_parsed.wait()
# wait until objects have been parsed
with self.objects_parsed:
self.objects_parsed.wait()
# make a copy of detected objects
detected_objects = self.detected_objects.copy()
detected_people = [obj for obj in detected_objects if obj['name'] == 'person']
# make a copy of the recent frames
recent_frames = self.recent_frames.copy()
# make a copy of detected objects
detected_objects = self.detected_objects.copy()
detected_people = [obj for obj in detected_objects if obj['name'] == 'person']
# get the highest scoring person
new_best_person = max(detected_people, key=lambda x:x['score'], default=self.best_person)
# get the highest scoring person
new_best_person = max(detected_people, key=lambda x:x['score'], default=self.best_person)
# if there isnt a person, continue
if new_best_person is None:
continue
# if there isnt a person, continue
if new_best_person is None:
continue
# if there is no current best_person
if self.best_person is None:
# if there is no current best_person
if self.best_person is None:
self.best_person = new_best_person
# if there is already a best_person
else:
now = datetime.datetime.now().timestamp()
# if the new best person is a higher score than the current best person
# or the current person is more than 1 minute old, use the new best person
if new_best_person['score'] > self.best_person['score'] or (now - self.best_person['frame_time']) > 60:
self.best_person = new_best_person
# if there is already a best_person
else:
now = datetime.datetime.now().timestamp()
# if the new best person is a higher score than the current best person
# or the current person is more than 1 minute old, use the new best person
if new_best_person['score'] > self.best_person['score'] or (now - self.best_person['frame_time']) > 60:
self.best_person = new_best_person
if not self.best_person is None and self.best_person['frame_time'] in recent_frames:
best_frame = recent_frames[self.best_person['frame_time']]
best_frame = cv2.cvtColor(best_frame, cv2.COLOR_BGR2RGB)
# draw the bounding box on the frame
vis_util.draw_bounding_box_on_image_array(best_frame,
self.best_person['ymin'],
self.best_person['xmin'],
self.best_person['ymax'],
self.best_person['xmax'],
color='red',
thickness=2,
display_str_list=["{}: {}%".format(self.best_person['name'],int(self.best_person['score']*100))],
use_normalized_coordinates=False)
# convert back to BGR
self.best_frame = cv2.cvtColor(best_frame, cv2.COLOR_RGB2BGR)
motion_end = datetime.datetime.now().timestamp()
# wait for the global motion flag to change
with self.motion_changed:
self.motion_changed.wait()
motion_start = datetime.datetime.now().timestamp()
# make a copy of the recent frames
recent_frames = self.recent_frames.copy()
if not self.best_person is None and self.best_person['frame_time'] in recent_frames:
best_frame = recent_frames[self.best_person['frame_time']]
label = "{}: {}% {}".format(self.best_person['name'],int(self.best_person['score']*100),int(self.best_person['area']))
draw_box_with_label(best_frame, self.best_person['xmin'], self.best_person['ymin'],
self.best_person['xmax'], self.best_person['ymax'], label)
self.best_frame = cv2.cvtColor(best_frame, cv2.COLOR_RGB2BGR)

frigate/util.py

@@ -1,5 +1,26 @@
import numpy as np
import cv2
# convert shared memory array into numpy array
def tonumpyarray(mp_arr):
return np.frombuffer(mp_arr.get_obj(), dtype=np.uint16)
return np.frombuffer(mp_arr.get_obj(), dtype=np.uint8)
def draw_box_with_label(frame, x_min, y_min, x_max, y_max, label):
color = (255,0,0)
cv2.rectangle(frame, (x_min, y_min),
(x_max, y_max),
color, 2)
font_scale = 0.5
font = cv2.FONT_HERSHEY_SIMPLEX
# get the width and height of the text box
size = cv2.getTextSize(label, font, fontScale=font_scale, thickness=2)
text_width = size[0][0]
text_height = size[0][1]
line_height = text_height + size[1]
# set the text start position
text_offset_x = x_min
text_offset_y = 0 if y_min < line_height else y_min - (line_height+8)
# make the coords of the box with a small padding of two pixels
textbox_coords = ((text_offset_x, text_offset_y), (text_offset_x + text_width + 2, text_offset_y + line_height))
cv2.rectangle(frame, textbox_coords[0], textbox_coords[1], color, cv2.FILLED)
cv2.putText(frame, label, (text_offset_x, text_offset_y + line_height - 3), font, fontScale=font_scale, color=(0, 0, 0), thickness=2)

frigate/video.py

@@ -1,95 +1,323 @@
import os
import time
import datetime
import cv2
import threading
from . util import tonumpyarray
# fetch the frames as fast a possible, only decoding the frames when the
# detection_process has consumed the current frame
def fetch_frames(shared_arr, shared_frame_time, frame_lock, frame_ready, frame_shape, rtsp_url):
# convert shared memory array into numpy and shape into image array
arr = tonumpyarray(shared_arr).reshape(frame_shape)
# start the video capture
video = cv2.VideoCapture()
video.open(rtsp_url)
# keep the buffer small so we minimize old data
video.set(cv2.CAP_PROP_BUFFERSIZE,1)
bad_frame_counter = 0
while True:
# check if the video stream is still open, and reopen if needed
if not video.isOpened():
success = video.open(rtsp_url)
if not success:
time.sleep(1)
continue
# grab the frame, but dont decode it yet
ret = video.grab()
# snapshot the time the frame was grabbed
frame_time = datetime.datetime.now()
if ret:
# go ahead and decode the current frame
ret, frame = video.retrieve()
if ret:
# Lock access and update frame
with frame_lock:
arr[:] = frame
shared_frame_time.value = frame_time.timestamp()
# Notify with the condition that a new frame is ready
with frame_ready:
frame_ready.notify_all()
bad_frame_counter = 0
else:
print("Unable to decode frame")
bad_frame_counter += 1
else:
print("Unable to grab a frame")
bad_frame_counter += 1
if bad_frame_counter > 100:
video.release()
video.release()
import ctypes
import multiprocessing as mp
import subprocess as sp
import numpy as np
from . util import tonumpyarray, draw_box_with_label
from . object_detection import FramePrepper
from . objects import ObjectCleaner, BestPersonFrame
from . mqtt import MqttObjectPublisher
# Stores 2 seconds worth of frames when motion is detected so they can be used for other threads
class FrameTracker(threading.Thread):
def __init__(self, shared_frame, frame_time, frame_ready, frame_lock, recent_frames, motion_changed, motion_regions):
def __init__(self, shared_frame, frame_time, frame_ready, frame_lock, recent_frames):
threading.Thread.__init__(self)
self.shared_frame = shared_frame
self.frame_time = frame_time
self.frame_ready = frame_ready
self.frame_lock = frame_lock
self.recent_frames = recent_frames
self.motion_changed = motion_changed
self.motion_regions = motion_regions
def run(self):
frame_time = 0.0
while True:
# while there is motion
while len([r for r in self.motion_regions if r.is_set()]) > 0:
now = datetime.datetime.now().timestamp()
# wait for a frame
with self.frame_ready:
# if there isnt a frame ready for processing or it is old, wait for a signal
if self.frame_time.value == frame_time or (now - self.frame_time.value) > 0.5:
self.frame_ready.wait()
# lock and make a copy of the frame
with self.frame_lock:
frame = self.shared_frame.copy().astype('uint8')
frame_time = self.frame_time.value
# add the frame to recent frames
self.recent_frames[frame_time] = frame
now = datetime.datetime.now().timestamp()
# wait for a frame
with self.frame_ready:
# if there isnt a frame ready for processing or it is old, wait for a signal
if self.frame_time.value == frame_time or (now - self.frame_time.value) > 0.5:
self.frame_ready.wait()
# lock and make a copy of the frame
with self.frame_lock:
frame = self.shared_frame.copy()
frame_time = self.frame_time.value
# add the frame to recent frames
self.recent_frames[frame_time] = frame
# delete any old frames
stored_frame_times = list(self.recent_frames.keys())
for k in stored_frame_times:
if (now - k) > 2:
del self.recent_frames[k]
# delete any old frames
stored_frame_times = list(self.recent_frames.keys())
for k in stored_frame_times:
if (now - k) > 2:
del self.recent_frames[k]
def get_frame_shape(rtsp_url):
    # capture a single frame and check the frame shape so the correct array
    # size can be allocated in memory
    video = cv2.VideoCapture(rtsp_url)
    ret, frame = video.read()
    frame_shape = frame.shape
    video.release()
    return frame_shape
def get_rtsp_url(rtsp_config):
    if (rtsp_config['password'].startswith('$')):
        rtsp_config['password'] = os.getenv(rtsp_config['password'][1:])
    return 'rtsp://{}:{}@{}:{}{}'.format(rtsp_config['user'],
        rtsp_config['password'], rtsp_config['host'], rtsp_config['port'],
        rtsp_config['path'])
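A quick sketch of how get_rtsp_url resolves a config block; the host, port, and credentials below are placeholders, and the leading '$' tells the function to pull the password from an environment variable:

import os

os.environ['CAMERA_PASSWORD'] = 'secret'  # hypothetical env var named by the config

rtsp_config = {
    'user': 'admin',
    'password': '$CAMERA_PASSWORD',       # resolved via os.getenv by get_rtsp_url
    'host': '192.168.1.10',
    'port': '554',
    'path': '/live'
}

print(get_rtsp_url(rtsp_config))
# rtsp://admin:secret@192.168.1.10:554/live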
class CameraWatchdog(threading.Thread):
    def __init__(self, camera):
        threading.Thread.__init__(self)
        self.camera = camera

    def run(self):
        while True:
            # wait a bit before checking
            time.sleep(10)

            if (datetime.datetime.now().timestamp() - self.camera.frame_time.value) > 2:
                print("last frame is more than 2 seconds old, restarting camera capture...")
                self.camera.start_or_restart_capture()
                time.sleep(5)
# Thread to read the stdout of the ffmpeg process and update the current frame
class CameraCapture(threading.Thread):
    def __init__(self, camera):
        threading.Thread.__init__(self)
        self.camera = camera

    def run(self):
        frame_num = 0
        while True:
            if self.camera.ffmpeg_process.poll() != None:
                print("ffmpeg process is not running. exiting capture thread...")
                break

            raw_image = self.camera.ffmpeg_process.stdout.read(self.camera.frame_size)

            if len(raw_image) == 0:
                print("ffmpeg didn't return a frame. something is wrong. exiting capture thread...")
                break

            frame_num += 1
            if (frame_num % self.camera.take_frame) != 0:
                continue

            with self.camera.frame_lock:
                self.camera.frame_time.value = datetime.datetime.now().timestamp()
                self.camera.current_frame[:] = (np
                    .frombuffer(raw_image, np.uint8)
                    .reshape(self.camera.frame_shape))

            # Notify with the condition that a new frame is ready
            with self.camera.frame_ready:
                self.camera.frame_ready.notify_all()
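As a sanity check on the raw pipe format, a small standalone sketch (synthetic bytes, no ffmpeg involved) showing how a rawvideo rgb24 buffer of height * width * 3 bytes maps back onto an image array the way the capture thread does:

import numpy as np

# hypothetical frame geometry; the real value comes from get_frame_shape()
frame_shape = (480, 640, 3)
frame_size = frame_shape[0] * frame_shape[1] * frame_shape[2]  # bytes per rgb24 frame

# stand-in for one frame read from ffmpeg's stdout
raw_image = bytes(frame_size)

frame = np.frombuffer(raw_image, np.uint8).reshape(frame_shape)
print(frame.shape)  # (480, 640, 3)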
class Camera:
    def __init__(self, name, config, prepped_frame_queue, mqtt_client, mqtt_prefix):
        self.name = name
        self.config = config
        self.detected_objects = []
        self.recent_frames = {}

        self.rtsp_url = get_rtsp_url(self.config['rtsp'])
        self.take_frame = self.config.get('take_frame', 1)
        self.ffmpeg_hwaccel_args = self.config.get('ffmpeg_hwaccel_args', [])
        self.regions = self.config['regions']
        self.frame_shape = get_frame_shape(self.rtsp_url)
        self.frame_size = self.frame_shape[0] * self.frame_shape[1] * self.frame_shape[2]
        self.mqtt_client = mqtt_client
        self.mqtt_topic_prefix = '{}/{}'.format(mqtt_prefix, self.name)

        # create a numpy array for the current frame and initialize it to zeros
        self.current_frame = np.zeros(self.frame_shape, np.uint8)
        # create a shared value for storing the frame_time
        self.frame_time = mp.Value('d', 0.0)
        # Lock to control access to the frame
        self.frame_lock = mp.Lock()
        # Condition for notifying that a new frame is ready
        self.frame_ready = mp.Condition()
        # Condition for notifying that objects were parsed
        self.objects_parsed = mp.Condition()

        self.ffmpeg_process = None
        self.capture_thread = None

        # for each region, create a separate thread to resize the region and prep for detection
        self.detection_prep_threads = []
        for region in self.config['regions']:
            # set a default threshold of 0.5 if not defined
            if not 'threshold' in region:
                region['threshold'] = 0.5
            if not isinstance(region['threshold'], float):
                print('Threshold is not a float. Setting to 0.5 default.')
                region['threshold'] = 0.5
            self.detection_prep_threads.append(FramePrepper(
                self.name,
                self.current_frame,
                self.frame_time,
                self.frame_ready,
                self.frame_lock,
                region['size'], region['x_offset'], region['y_offset'], region['threshold'],
                prepped_frame_queue
            ))

        # start a thread to store recent frames for processing
        self.frame_tracker = FrameTracker(self.current_frame, self.frame_time,
            self.frame_ready, self.frame_lock, self.recent_frames)
        self.frame_tracker.start()

        # start a thread to store the highest scoring recent person frame
        self.best_person_frame = BestPersonFrame(self.objects_parsed, self.recent_frames, self.detected_objects)
        self.best_person_frame.start()

        # start a thread to expire objects from the detected objects list
        self.object_cleaner = ObjectCleaner(self.objects_parsed, self.detected_objects)
        self.object_cleaner.start()

        # start a thread to publish object scores (currently only person)
        mqtt_publisher = MqttObjectPublisher(self.mqtt_client, self.mqtt_topic_prefix, self.objects_parsed, self.detected_objects)
        mqtt_publisher.start()

        # create a watchdog thread for the capture process
        self.watchdog = CameraWatchdog(self)

        # load in the mask for person detection
        if 'mask' in self.config:
            self.mask = cv2.imread("/config/{}".format(self.config['mask']), cv2.IMREAD_GRAYSCALE)
        else:
            self.mask = None

        if self.mask is None:
            self.mask = np.zeros((self.frame_shape[0], self.frame_shape[1], 1), np.uint8)
            self.mask[:] = 255
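For reference, a hypothetical config dict showing the keys this constructor reads; the values are illustrative only and would normally come from the yaml config:

camera_config = {
    'rtsp': {'user': 'admin', 'password': '$CAMERA_PASSWORD',
             'host': '192.168.1.10', 'port': '554', 'path': '/live'},
    'take_frame': 1,               # process every nth frame read from ffmpeg
    'ffmpeg_hwaccel_args': [],     # optional hwaccel flags passed through to ffmpeg
    'mask': 'back-mask.bmp',       # optional, loaded from /config/ as grayscale
    'regions': [
        {'size': 350, 'x_offset': 0, 'y_offset': 300,
         'threshold': 0.5, 'min_person_area': 5000}
    ]
}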
    def start_or_restart_capture(self):
        if not self.ffmpeg_process is None:
            print("Killing the existing ffmpeg process...")
            self.ffmpeg_process.kill()
            self.ffmpeg_process.wait()
            print("Waiting for the capture thread to exit...")
            self.capture_thread.join()
            self.ffmpeg_process = None
            self.capture_thread = None

        # create the process to capture frames from the RTSP stream and store in a shared array
        print("Creating a new ffmpeg process...")
        self.start_ffmpeg()

        print("Creating a new capture thread...")
        self.capture_thread = CameraCapture(self)
        print("Starting a new capture thread...")
        self.capture_thread.start()
    def start_ffmpeg(self):
        ffmpeg_global_args = [
            '-hide_banner', '-loglevel', 'panic'
        ]

        ffmpeg_input_args = [
            '-avoid_negative_ts', 'make_zero',
            '-fflags', 'nobuffer',
            '-flags', 'low_delay',
            '-strict', 'experimental',
            '-fflags', '+genpts',
            '-rtsp_transport', 'tcp',
            '-stimeout', '5000000',
            '-use_wallclock_as_timestamps', '1'
        ]

        ffmpeg_cmd = (['ffmpeg'] +
            ffmpeg_global_args +
            self.ffmpeg_hwaccel_args +
            ffmpeg_input_args +
            ['-i', self.rtsp_url,
             '-f', 'rawvideo',
             '-pix_fmt', 'rgb24',
             'pipe:'])

        print(" ".join(ffmpeg_cmd))

        self.ffmpeg_process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE, bufsize=self.frame_size)
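With no hwaccel args configured, the printed command comes out roughly like this (the RTSP URL is a placeholder):

ffmpeg -hide_banner -loglevel panic -avoid_negative_ts make_zero -fflags nobuffer -flags low_delay -strict experimental -fflags +genpts -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://user:pass@host:554/path -f rawvideo -pix_fmt rgb24 pipe:

Decoding to rawvideo rgb24 on stdout, combined with bufsize=self.frame_size, is intended to let the capture thread pull one full frame per read of frame_size bytes.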
    def start(self):
        self.start_or_restart_capture()
        # start the object detection prep threads
        for detection_prep_thread in self.detection_prep_threads:
            detection_prep_thread.start()
        self.watchdog.start()

    def join(self):
        self.capture_thread.join()

    def get_capture_pid(self):
        return self.ffmpeg_process.pid
    def add_objects(self, objects):
        if len(objects) == 0:
            return

        for obj in objects:
            # Store object area to use in bounding box labels
            obj['area'] = (obj['xmax']-obj['xmin'])*(obj['ymax']-obj['ymin'])

            if obj['name'] == 'person':
                # find the matching region
                region = None
                for r in self.regions:
                    if (
                        obj['xmin'] >= r['x_offset'] and
                        obj['ymin'] >= r['y_offset'] and
                        obj['xmax'] <= r['x_offset']+r['size'] and
                        obj['ymax'] <= r['y_offset']+r['size']
                    ):
                        region = r
                        break

                # if the min person area is larger than the
                # detected person, don't add it to detected objects
                if region and 'min_person_area' in region and region['min_person_area'] > obj['area']:
                    continue

                # compute the coordinates of the person and make sure
                # the location isn't outside the bounds of the image (can happen from rounding)
                y_location = min(int(obj['ymax']), len(self.mask)-1)
                x_location = min(int((obj['xmax']-obj['xmin'])/2.0)+obj['xmin'], len(self.mask[0])-1)

                # if the person is in a masked location, continue
                if self.mask[y_location][x_location] == [0]:
                    continue

            self.detected_objects.append(obj)

        with self.objects_parsed:
            self.objects_parsed.notify_all()
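A small worked example of the mask check above, using a synthetic mask and a hypothetical detection box; the coordinate math is the same bottom-center lookup add_objects performs:

import numpy as np

# synthetic 480x640 mask: 255 (allowed) everywhere except a blacked-out strip at the bottom
mask = np.full((480, 640, 1), 255, np.uint8)
mask[400:, :] = 0

# hypothetical person bounding box
obj = {'xmin': 100, 'ymin': 300, 'xmax': 200, 'ymax': 450}

y_location = min(int(obj['ymax']), len(mask) - 1)
x_location = min(int((obj['xmax'] - obj['xmin']) / 2.0) + obj['xmin'], len(mask[0]) - 1)

print(mask[y_location][x_location] == [0])  # [ True] -> this detection would be skipped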
    def get_best_person(self):
        return self.best_person_frame.best_frame

    def get_current_frame_with_objects(self):
        # make a copy of the current detected objects
        detected_objects = self.detected_objects.copy()
        # lock and make a copy of the current frame
        with self.frame_lock:
            frame = self.current_frame.copy()

        # draw the bounding boxes on the screen
        for obj in detected_objects:
            label = "{}: {}% {}".format(obj['name'], int(obj['score']*100), int(obj['area']))
            draw_box_with_label(frame, obj['xmin'], obj['ymin'], obj['xmax'], obj['ymax'], label)

        for region in self.regions:
            color = (255,255,255)
            cv2.rectangle(frame, (region['x_offset'], region['y_offset']),
                (region['x_offset']+region['size'], region['y_offset']+region['size']),
                color, 2)

        # convert to BGR
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)

        return frame
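One way a caller might consume the annotated frame; this is a hypothetical snippet, not the project's actual endpoint, and `camera` stands for a constructed Camera instance:

import cv2

frame = camera.get_current_frame_with_objects()   # BGR image with boxes and labels drawn
ret, jpg = cv2.imencode('.jpg', frame)
if ret:
    with open('/tmp/latest.jpg', 'wb') as f:
        f.write(jpg.tobytes())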

View File

@@ -0,0 +1,50 @@
#!/bin/bash
set -e

CPU_ARCH=$(uname -m)
OS_VERSION=$(uname -v)

echo "CPU_ARCH ${CPU_ARCH}"
echo "OS_VERSION ${OS_VERSION}"

if [[ "${CPU_ARCH}" == "x86_64" ]]; then
  echo "Recognized as Linux on x86_64."
  LIBEDGETPU_SUFFIX=x86_64
  HOST_GNU_TYPE=x86_64-linux-gnu
elif [[ "${CPU_ARCH}" == "armv7l" ]]; then
  echo "Recognized as Linux on ARM32 platform."
  LIBEDGETPU_SUFFIX=arm32
  HOST_GNU_TYPE=arm-linux-gnueabihf
elif [[ "${CPU_ARCH}" == "aarch64" ]]; then
  echo "Recognized as generic ARM64 platform."
  LIBEDGETPU_SUFFIX=arm64
  HOST_GNU_TYPE=aarch64-linux-gnu
fi

if [[ -z "${HOST_GNU_TYPE}" ]]; then
  echo "Your platform is not supported."
  exit 1
fi

echo "Using maximum operating frequency."

LIBEDGETPU_SRC="libedgetpu/libedgetpu_${LIBEDGETPU_SUFFIX}.so"
LIBEDGETPU_DST="/usr/lib/${HOST_GNU_TYPE}/libedgetpu.so.1.0"

# Runtime library.
echo "Installing Edge TPU runtime library [${LIBEDGETPU_DST}]..."
if [[ -f "${LIBEDGETPU_DST}" ]]; then
  echo "File already exists. Replacing it..."
  rm -f "${LIBEDGETPU_DST}"
fi
cp -p "${LIBEDGETPU_SRC}" "${LIBEDGETPU_DST}"
ldconfig
echo "Done."

# Python API.
WHEEL=$(ls edgetpu-*-py3-none-any.whl 2>/dev/null)
if [[ $? == 0 ]]; then
  echo "Installing Edge TPU Python API..."
  python3 -m pip install --no-deps "${WHEEL}"
  echo "Done."
fi
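Once the runtime library and wheel are installed, a quick smoke test of the Python API might look like the following; the model path is a placeholder, and it assumes the edgetpu package's DetectionEngine (not shown in this diff) is the API being targeted:

# verify the Edge TPU runtime and Python API are usable (model path is a placeholder)
from edgetpu.detection.engine import DetectionEngine

engine = DetectionEngine('/path/to/mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite')
print('Edge TPU engine loaded, input tensor shape:', engine.get_input_tensor_shape())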

View File

@@ -0,0 +1,5 @@
#!/bin/bash
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys D986B59D
echo "deb http://deb.odroid.in/5422-s bionic main" > /etc/apt/sources.list.d/odroid.list