added ReID example
@@ -98,6 +98,8 @@ See the [example](example) directory.
 * [PPOCR Detect](example/ppocr#ppocr-detect) - Takes an image and detects areas of text.
 * [PPOCR Recognise](example/ppocr#ppocr-recognise) - Takes an area of text and performs OCR on it.
 * [PPOCR System](example/ppocr#ppocr-system) - Combines both Detect and Recognise.
+* Tracking
+  * [Re-Identification Demo](example/reid) - Re-Identify (ReID) similar objects for tracking, uses batch processing.
 * Streaming
   * [HTTP Stream with ByteTrack Tracking](example/stream) - Demo that streams a video over HTTP with YOLO object detection and ByteTrack object tracking.
 * Slicing Aided Hyper Inference
example/reid/README.md (new file, 188 lines)
@@ -0,0 +1,188 @@
# Re-Identification (ReID)

## Overview

Object trackers like ByteTrack can be used to track visible objects frame-to-frame,
but they rely on the assumption that an object's appearance and location change
smoothly over time. If a person goes behind a building or is briefly hidden
by another passerby, the tracker can lose that object's identity. When that same
person reemerges, the tracker often treats them as a new object, assigning a new ID.
This makes analyzing a person's complete path through a scene difficult
and makes counting unique objects much harder.

Re-Identification (ReID) models help solve this problem by using embedding features
which encode an object into a fixed-length vector that captures distinctive
patterns, shapes, or other visual signatures. When an object disappears and
then reappears you can compare the newly detected object's embedding against a list of
past objects. If the two embeddings are similar enough (their Cosine or Euclidean
distance falls below a chosen threshold), you can confidently link the new detection
back to the original track ID.
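
To make the matching step concrete, here is a minimal sketch using this
library's `postprocess/reid` helpers (shown in full further below). The
embedding values and the `0.51` threshold are illustrative placeholders only:

```
package main

import (
	"fmt"

	"github.com/swdee/go-rknnlite/postprocess/reid"
)

func main() {
	// two hypothetical L2-normalized embeddings produced by a ReID model,
	// one from the stored track and one from a newly detected object
	track := []float32{0.12, -0.03, 0.98}
	candidate := []float32{0.10, -0.01, 0.99}

	// lower Euclidean distance means the objects are more similar
	dist := reid.EuclideanDistance(track, candidate)

	// 0.51 is the threshold this example uses, tune it for your dataset
	if dist < 0.51 {
		fmt.Println("same object, re-link to the existing track ID")
	} else {
		fmt.Println("different object, assign a new track ID")
	}
}
```
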
## Datasets

The [OSNet model](https://paperswithcode.com/paper/omni-scale-feature-learning-for-person-re) is
lightweight and provides good accuracy for re-identification tasks, however
it must be trained on a dataset to identify specific object classes.

This example uses the [Market1501](https://paperswithcode.com/dataset/market-1501)
dataset trained for re-identifying people.

To support other object classes such as Vehicles, Faces, or Animals, you
will need to source a suitable dataset and train a model accordingly.

## Occlusion Example

In the [people walking video](https://github.com/swdee/go-rknnlite-data/raw/master/people-walking.mp4)
a lady wearing a CK-branded jacket appears at the beginning of the scene and becomes
occluded by passersby. When she reappears, ByteTrack detects her as a new person.

![screen grab](https://github.com/swdee/go-rknnlite-data/raw/master/docs/reid-idswitch.jpg)

## Usage

Make sure you have downloaded the data files first. You only need to do
this once for all examples.

```
cd example/
git clone --depth=1 https://github.com/swdee/go-rknnlite-data.git data
```

Command line usage:
```
$ go run reid.go -h

Usage of /tmp/go-build147978858/b001/exe/reid:
  -d string
        Data file containing object co-ordinates (default "../data/reid-objects.dat")
  -e float
        The Euclidean distance [0.0-1.0], a value less than this defines a match (default 0.51)
  -i string
        Image file to run inference on (default "../data/reid-walking.jpg")
  -m string
        RKNN compiled model file (default "../data/models/rk3588/osnet-market1501-batch8-rk3588.rknn")
  -p string
        Rockchip CPU Model number [rk3562|rk3566|rk3568|rk3576|rk3582|rk3588] (default "rk3588")
```

Run the ReID example on rk3588 or replace with your platform model.
```
cd example/reid/
go run reid.go -p rk3588
```

This results in the following output:
```
Driver Version: 0.9.6, API Version: 2.3.0 (c949ad889d@2024-11-07T11:35:33)
Model Input Number: 1, Ouput Number: 1
Input tensors:
  index=0, name=input, n_dims=4, dims=[8, 256, 128, 3], n_elems=786432, size=786432, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-14, scale=0.018658
Output tensors:
  index=0, name=output, n_dims=2, dims=[8, 512, 0, 0], n_elems=4096, size=4096, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.018782
Comparing object 0 at (0,0,134,361)
 Object 0 at (0,0,134,361) has euclidean distance: 0.000000 (same person)
 Object 1 at (134,0,251,325) has euclidean distance: 0.423271 (same person)
 Object 2 at (251,0,326,208) has euclidean distance: 0.465061 (same person)
 Object 3 at (326,0,394,187) has euclidean distance: 0.445583 (same person)
Comparing object 1 at (394,0,513,357)
 Object 0 at (0,0,134,361) has euclidean distance: 0.781510 (different person)
 Object 1 at (134,0,251,325) has euclidean distance: 0.801649 (different person)
 Object 2 at (251,0,326,208) has euclidean distance: 0.680299 (different person)
 Object 3 at (326,0,394,187) has euclidean distance: 0.686542 (different person)
Comparing object 2 at (513,0,588,246)
 Object 0 at (0,0,134,361) has euclidean distance: 0.860921 (different person)
 Object 1 at (134,0,251,325) has euclidean distance: 0.873663 (different person)
 Object 2 at (251,0,326,208) has euclidean distance: 0.870753 (different person)
 Object 3 at (326,0,394,187) has euclidean distance: 0.820761 (different person)
Comparing object 3 at (588,0,728,360)
 Object 0 at (0,0,134,361) has euclidean distance: 0.762738 (different person)
 Object 1 at (134,0,251,325) has euclidean distance: 0.800668 (different person)
 Object 2 at (251,0,326,208) has euclidean distance: 0.763694 (different person)
 Object 3 at (326,0,394,187) has euclidean distance: 0.769597 (different person)
Model first run speed: batch preparation=3.900093ms, inference=47.935686ms, post processing=262.203µs, total time=52.097982ms
done
```

### Docker

To run the ReID example using the prebuilt docker image, make sure the data
files have been downloaded first, then run:

```
# from project root directory
docker run --rm \
  --device /dev/dri:/dev/dri \
  -v "$(pwd):/go/src/app" \
  -v "$(pwd)/example/data:/go/src/data" \
  -v "/usr/include/rknn_api.h:/usr/include/rknn_api.h" \
  -v "/usr/lib/librknnrt.so:/usr/lib/librknnrt.so" \
  -w /go/src/app \
  swdee/go-rknnlite:latest \
  go run ./example/reid/reid.go -p rk3588
```

### Interpreting Results

The above example uses people detected with a YOLOv5 model and then cropped to
create the sample input.

![objects](https://github.com/swdee/go-rknnlite-data/raw/master/docs/reid-objects.jpg)

Objects A1 to A4 represent the same person, and objects B1, C1, and D1 are other
people from the same scene.

The first set of comparisons:
```
Comparing object 0 [A1] at (0,0,134,361)
 Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.000000 (same person)
 Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.423271 (same person)
 Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.465061 (same person)
 Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.445583 (same person)
```

Object 0 is A1; when compared to itself it has a Euclidean distance of 0.0.
Objects 1-3 are A2 to A4, each of which has a similar
distance ranging from 0.42 to 0.46.

The Euclidean distance ranges from 0.0 (same object) to 1.0 (different object), so
the lower the distance, the more similar the objects are. A threshold of `0.51`
defines the maximum distance at which two objects are considered
the same. Your use case and datasets may require calibrating
the ideal threshold, as sketched below.
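
One simple way to calibrate is to sweep candidate thresholds over pairs of
crops you have labelled yourself as same/different and keep the value that
classifies the most pairs correctly. This is only an illustrative sketch; the
`LabelledPair` type and its data are hypothetical:

```
package main

import (
	"fmt"

	"github.com/swdee/go-rknnlite/postprocess/reid"
)

// LabelledPair is a hypothetical record holding the embeddings of two crops
// and whether they are known to be the same person.
type LabelledPair struct {
	A, B []float32
	Same bool
}

// bestThreshold sweeps candidate thresholds and returns the one that
// classifies the most labelled pairs correctly using Euclidean distance.
func bestThreshold(pairs []LabelledPair) float32 {
	var best float32
	bestCorrect := -1

	for t := float32(0.0); t <= 1.0; t += 0.01 {
		correct := 0

		for _, p := range pairs {
			match := reid.EuclideanDistance(p.A, p.B) < t

			if match == p.Same {
				correct++
			}
		}

		if correct > bestCorrect {
			bestCorrect = correct
			best = t
		}
	}

	return best
}

func main() {
	// pairs would be built from your own labelled crops
	var pairs []LabelledPair
	fmt.Println("calibrated threshold:", bestThreshold(pairs))
}
```
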
The remaining results compare the people B1, C1, and D1.

```
Comparing object 1 [B1] at (394,0,513,357)
 Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.781510 (different person)
 Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.801649 (different person)
 Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.680299 (different person)
 Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.686542 (different person)
Comparing object 2 [C1] at (513,0,588,246)
 Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.860921 (different person)
 Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.873663 (different person)
 Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.870753 (different person)
 Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.820761 (different person)
Comparing object 3 [D1] at (588,0,728,360)
 Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.762738 (different person)
 Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.800668 (different person)
 Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.763694 (different person)
 Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.769597 (different person)
```

All of these other people have a Euclidean distance greater than 0.68, indicating
they are different people.

## Postprocessing

[Convenience functions](https://github.com/swdee/go-rknnlite-data/raw/master/postprocess/reid.go)
are provided for calculating the Euclidean distance or Cosine similarity,
depending on how the model has been trained.
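
As a usage sketch of these helpers (the raw int8 values below are placeholders;
in practice the quantized tensor, scale, and zero-point come from the model's
output attributes, as in `reid.go` below):

```
package main

import (
	"fmt"

	"github.com/swdee/go-rknnlite/postprocess/reid"
)

func main() {
	// placeholder quantized INT8 output tensors for two objects
	rawA := []int8{12, -3, 98, 5}
	rawB := []int8{10, -1, 99, 4}

	// scale and zero-point as reported by the model's output attributes
	scale, zp := float32(0.018782), int32(-128)

	// dequantize and L2-normalize to unit-length float vectors
	fpA := reid.DequantizeAndL2Normalize(rawA, scale, zp)
	fpB := reid.DequantizeAndL2Normalize(rawB, scale, zp)

	// for L2-normalized vectors CosineSimilarity is just the dot product
	fmt.Println("cosine similarity:", reid.CosineSimilarity(fpA, fpB))
	fmt.Println("cosine distance:", reid.CosineDistance(fpA, fpB))
	fmt.Println("euclidean distance:", reid.EuclideanDistance(fpA, fpB))
}
```
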
example/reid/reid.go (new file, 524 lines)
@@ -0,0 +1,524 @@
package main

import (
	"bufio"
	"flag"
	"fmt"
	"github.com/swdee/go-rknnlite"
	"github.com/swdee/go-rknnlite/postprocess/reid"
	"gocv.io/x/gocv"
	"image"
	"log"
	"os"
	"strconv"
	"strings"
	"time"
)

func main() {

	// disable logging timestamps
	log.SetFlags(0)

	// read in cli flags
	modelFile := flag.String("m", "../data/models/rk3588/osnet-market1501-batch8-rk3588.rknn", "RKNN compiled model file")
	imgFile := flag.String("i", "../data/reid-walking.jpg", "Image file to run inference on")
	objsFile := flag.String("d", "../data/reid-objects.dat", "Data file containing object co-ordinates")
	rkPlatform := flag.String("p", "rk3588", "Rockchip CPU Model number [rk3562|rk3566|rk3568|rk3576|rk3582|rk3588]")
	euDist := flag.Float64("e", 0.51, "The Euclidean distance [0.0-1.0], a value less than this defines a match")
	flag.Parse()

	err := rknnlite.SetCPUAffinityByPlatform(*rkPlatform, rknnlite.FastCores)

	if err != nil {
		log.Printf("Failed to set CPU Affinity: %v", err)
	}

	// check if user specified model file or if default is being used. if default
	// then pick the default platform model to use.
	if f := flag.Lookup("m"); f != nil && f.Value.String() == f.DefValue && *rkPlatform != "rk3588" {
		*modelFile = strings.ReplaceAll(*modelFile, "rk3588", *rkPlatform)
	}

	// create rknn runtime instance
	rt, err := rknnlite.NewRuntimeByPlatform(*rkPlatform, *modelFile)

	if err != nil {
		log.Fatal("Error initializing RKNN runtime: ", err)
	}

	// set runtime to leave output tensors as int8
	rt.SetWantFloat(false)

	// optional querying of model file tensors and SDK version for printing
	// to stdout. not necessary for production inference code
	err = rt.Query(os.Stdout)

	if err != nil {
		log.Fatal("Error querying runtime: ", err)
	}

	// load objects file
	objs, err := ParseObjects(*objsFile)

	if err != nil {
		log.Fatal("Error parsing objects: ", err)
	}

	// load image
	img := gocv.IMRead(*imgFile, gocv.IMReadColor)

	if img.Empty() {
		log.Fatal("Error reading image from: ", *imgFile)
	}

	// convert colorspace
	srcImg := gocv.NewMat()
	gocv.CvtColor(img, &srcImg, gocv.ColorBGRToRGB)

	defer img.Close()
	defer srcImg.Close()

	start := time.Now()

	// create a batch to process all images in the compare and dataset lists
	// in a single forward pass
	batch := rknnlite.NewBatch(
		int(rt.InputAttrs()[0].Dims[0]),
		int(rt.InputAttrs()[0].Dims[2]),
		int(rt.InputAttrs()[0].Dims[1]),
		int(rt.InputAttrs()[0].Dims[3]),
		rt.GetInputTypeFloat32(),
	)
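
	// note: the OSNet model used here is compiled with a batch size of 8
	// (input dims [8, 256, 128, 3]), so all of the cropped objects below
	// are embedded in a single forward pass on the NPU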

	// scale size is the size of the input tensor dimensions to scale the object to
	scaleSize := image.Pt(int(rt.InputAttrs()[0].Dims[1]), int(rt.InputAttrs()[0].Dims[2]))

	// add the compare images to the batch
	for _, cmpObj := range objs.Compare {
		err := AddObjectToBatch(batch, srcImg, cmpObj, scaleSize)

		if err != nil {
			log.Fatal("Error creating batch: ", err)
		}
	}

	// add the dataset images to the batch
	for _, dtObj := range objs.Dataset {
		err := AddObjectToBatch(batch, srcImg, dtObj, scaleSize)

		if err != nil {
			log.Fatal("Error creating batch: ", err)
		}
	}

	defer batch.Close()

	endBatch := time.Now()

	// run inference on the batch
	outputs, err := rt.Inference([]gocv.Mat{batch.Mat()})

	endInference := time.Now()

	if err != nil {
		log.Fatal("Runtime inferencing failed with error: ", err)
	}

	// get total number of compare objects
	totalCmp := len(objs.Compare)

	// compare each object to those objects in the dataset for similarity
	for i, cmpObj := range objs.Compare {
		// get the compare object's output
		cmpOutput, err := batch.GetOutputInt(i, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))

		if err != nil {
			log.Fatal("Getting output tensor failed with error: ", err)
		}

		log.Printf("Comparing object %d at (%d,%d,%d,%d)\n", i,
			cmpObj.X1, cmpObj.Y1, cmpObj.X2, cmpObj.Y2)

		for j, dtObj := range objs.Dataset {
			// get each object's outputs
			nextOutput, err := batch.GetOutputInt(totalCmp+j, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))

			if err != nil {
				log.Fatal("Getting output tensor failed with error: ", err)
			}

			dist := CompareObjects(
				cmpOutput,
				nextOutput,
				outputs.OutputAttributes().Scales[0],
				outputs.OutputAttributes().ZPs[0],
			)

			// check euclidean distance to determine match of same person or not
			objRes := "different person"

			if dist < float32(*euDist) {
				objRes = "same person"
			}

			log.Printf(" Object %d at (%d,%d,%d,%d) has euclidean distance: %f (%s)\n",
				j,
				dtObj.X1, dtObj.Y1, dtObj.X2, dtObj.Y2,
				dist, objRes)
		}
	}

	endCompare := time.Now()

	log.Printf("Model first run speed: batch preparation=%s, inference=%s, post processing=%s, total time=%s\n",
		endBatch.Sub(start).String(),
		endInference.Sub(endBatch).String(),
		endCompare.Sub(endInference).String(),
		endCompare.Sub(start).String(),
	)

	// free outputs allocated in C memory after you have finished post processing
	err = outputs.Free()

	if err != nil {
		log.Fatal("Error freeing Outputs: ", err)
	}

	// close runtime and release resources
	err = rt.Close()

	if err != nil {
		log.Fatal("Error closing RKNN runtime: ", err)
	}

	log.Println("done")

	/*
		//CompareObject(rt, srcImg, cmpObj, objs.Dataset)

		//rgbImg := img.Clone()

		frameWidth := 67
		frameHeight := 177

		roiRect1 := image.Rect(497, 195, 497+frameWidth, 195+frameHeight)

		// cklady
		//roiRect1 := image.Rect(0, 0, 134, 361)

		roiImg1 := rgbImg.Region(roiRect1)

		cropImg1 := rgbImg.Clone()
		scaleSize1 := image.Pt(int(rt.InputAttrs()[0].Dims[1]), int(rt.InputAttrs()[0].Dims[2]))
		gocv.Resize(roiImg1, &cropImg1, scaleSize1, 0, 0, gocv.InterpolationArea)

		defer img.Close()
		defer rgbImg.Close()
		defer cropImg1.Close()
		defer roiImg1.Close()

		gocv.IMWrite("/tmp/frame-master.jpg", cropImg1)

		batch := rt.NewBatch(
			int(rt.InputAttrs()[0].Dims[0]),
			int(rt.InputAttrs()[0].Dims[2]),
			int(rt.InputAttrs()[0].Dims[1]),
			int(rt.InputAttrs()[0].Dims[3]),
		)
		err = batch.Add(cropImg1)

		if err != nil {
			log.Fatal("Error creating batch: ", err)
		}
		defer batch.Close()

		// perform inference on image file
		outputs, err := rt.Inference([]gocv.Mat{batch.Mat()})

		if err != nil {
			log.Fatal("Runtime inferencing failed with error: ", err)
		}

		output, err := batch.GetOutputInt(0, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))

		if err != nil {
			log.Fatal("Getting output tensor failed with error: ", err)
		}

		fingerPrint := DequantizeAndL2Normalize(
			output,
			outputs.OutputAttributes().Scales[0],
			outputs.OutputAttributes().ZPs[0],
		)

		// seed the EMA fingerprint to the master
		emaFP := make([]float32, len(fingerPrint))
		copy(emaFP, fingerPrint)
		const alpha = 0.9 // smoothing factor

		hash, err := FingerprintHash(fingerPrint)

		if err != nil {
			log.Fatalf("hashing failed: %v", err)
		}

		log.Println("object fingerprint:", hash)

		// free outputs allocated in C memory after you have finished post processing
		err = outputs.Free()

		if err != nil {
			log.Fatal("Error freeing Outputs: ", err)
		}

		// sample 2 images

		yOffsets := []int{1, 195, 388}
		xOffsets := []int{497, 565, 633, 701, 769, 836, 904}

		images := [][]int{}

		for _, ny := range yOffsets {
			for _, nx := range xOffsets {
				images = append(images, []int{nx, ny})
			}
		}

		// ck lady

		// images := [][]int{
		// 	{134, 0, 117, 325},
		// 	{251, 0, 75, 208},
		// 	{326, 0, 68, 187},
		// }

		// Image 2
		for frame, next := range images {

			roiRect2 := image.Rect(next[0], next[1], next[0]+frameWidth, next[1]+frameHeight)
			// ck lady
			//roiRect2 := image.Rect(next[0], next[1], next[0]+next[2], next[1]+next[3])
			roiImg2 := rgbImg.Region(roiRect2)

			cropImg2 := rgbImg.Clone()
			scaleSize2 := image.Pt(int(rt.InputAttrs()[0].Dims[1]), int(rt.InputAttrs()[0].Dims[2]))
			gocv.Resize(roiImg2, &cropImg2, scaleSize2, 0, 0, gocv.InterpolationArea)

			defer cropImg2.Close()
			defer roiImg2.Close()

			gocv.IMWrite(fmt.Sprintf("/tmp/frame-%d.jpg", frame), cropImg2)

			start := time.Now()

			batch.Clear()
			err = batch.Add(cropImg2)

			if err != nil {
				log.Fatal("Error creating batch: ", err)
			}

			outputs, err = rt.Inference([]gocv.Mat{batch.Mat()})

			if err != nil {
				log.Fatal("Runtime inferencing failed with error: ", err)
			}

			endInference := time.Now()

			output, err := batch.GetOutputInt(0, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))

			if err != nil {
				log.Fatal("Getting output tensor failed with error: ", err)
			}

			fingerPrint2 := DequantizeAndL2Normalize(
				output,
				outputs.OutputAttributes().Scales[0],
				outputs.OutputAttributes().ZPs[0],
			)

			// sim := CosineSimilarity(fingerPrint, fingerPrint2)
			// dist := CosineDistance(fingerPrint, fingerPrint2)
			// fmt.Printf("Frame %d, cosine similarity: %f, distance=%f\n", frame, sim, dist)

			// compute Euclidean (L2) distance directly
			dist := EuclideanDistance(fingerPrint, fingerPrint2)

			// 3) compute vs EMA
			emaDist := EuclideanDistance(emaFP, fingerPrint2)

			endDetect := time.Now()

			objRes := "different person"
			if emaDist < 0.51 {
				objRes = "same person"
			}

			fmt.Printf("Frame %d, euclidean distance: %f, ema=%f (%s)\n", frame, dist, emaDist, objRes)

			log.Printf(" Inference=%s, detect=%s, total time=%s\n",
				endInference.Sub(start).String(),
				endDetect.Sub(endInference).String(),
				endDetect.Sub(start).String(),
			)

			// free outputs allocated in C memory after you have finished post processing
			err = outputs.Free()

			if err != nil {
				log.Fatal("Error freeing Outputs: ", err)
			}

			// 4) update the EMA fingerprint
			if frame >= 7 && frame <= 13 {

				// emaFP = α*emaFP + (1-α)*fp2
				for i := range emaFP {
					emaFP[i] = alpha*emaFP[i] + (1-alpha)*fingerPrint2[i]
				}
				// 5) re-normalize emaFP back to unit length
				var sum float32
				for _, v := range emaFP {
					sum += v * v
				}
				norm := float32(math.Sqrt(float64(sum)))
				if norm > 0 {
					for i := range emaFP {
						emaFP[i] /= norm
					}
				}
			}

		}

		// close runtime and release resources
		err = rt.Close()

		if err != nil {
			log.Fatal("Error closing RKNN runtime: ", err)
		}

		log.Println("done")
	*/
}

// Box holds object bounding box coordinates (x1, y1, x2, y2)
type Box struct {
	X1, Y1, X2, Y2 int
}

// Objects is a struct to represent the compare and dataset objects parsed
// from the objects data file
type Objects struct {
	Compare []Box
	Dataset []Box
}

// ParseObjects reads the TOML-like objects data file and returns the two lists
// of objects and their bounding box coordinates
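//
// An illustrative file layout (reconstructed from the parser below; each
// data line holds x1,y1,x2,y2 box coordinates):
//
//	[compare]
//	0,0,134,361
//
//	[dataset]
//	134,0,251,325
//	251,0,326,208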
func ParseObjects(path string) (*Objects, error) {

	f, err := os.Open(path)

	if err != nil {
		return nil, err
	}

	defer f.Close()

	objs := &Objects{}
	section := "" // either "compare" or "dataset"
	scanner := bufio.NewScanner(f)

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())

		// skip blank or comment
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}

		// section header
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			section = strings.ToLower(line[1 : len(line)-1])
			continue
		}

		// data line, expect four ints separated by commas
		fields := strings.Split(line, ",")

		if len(fields) != 4 {
			return nil, fmt.Errorf("invalid data line %q", line)
		}

		nums := make([]int, 4)

		for i, fstr := range fields {
			v, err := strconv.Atoi(strings.TrimSpace(fstr))

			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", fstr, err)
			}

			nums[i] = v
		}

		// define box
		box := Box{nums[0], nums[1], nums[2], nums[3]}

		switch section {

		case "compare":
			objs.Compare = append(objs.Compare, box)

		case "dataset":
			objs.Dataset = append(objs.Dataset, box)

		default:
			return nil, fmt.Errorf("line %q outside of a known section", line)
		}
	}

	if err := scanner.Err(); err != nil {
		return nil, err
	}

	return objs, nil
}

// AddObjectToBatch adds the cropped object from the source image to the batch
// for running inference on
func AddObjectToBatch(batch *rknnlite.Batch, srcImg gocv.Mat, obj Box,
	scaleSize image.Point) error {

	// get the object's region of interest from the source Mat
	objRect := image.Rect(obj.X1, obj.Y1, obj.X2, obj.Y2)
	objRoi := srcImg.Region(objRect)

	objImg := objRoi.Clone()
	gocv.Resize(objRoi, &objImg, scaleSize, 0, 0, gocv.InterpolationArea)

	defer objRoi.Close()
	defer objImg.Close()

	return batch.Add(objImg)
}

// CompareObjects compares the outputs of two objects
func CompareObjects(objA []int8, objB []int8, scales float32,
	ZPs int32) float32 {

	// get the fingerprint of both objects
	fpA := reid.DequantizeAndL2Normalize(objA, scales, ZPs)
	fpB := reid.DequantizeAndL2Normalize(objB, scales, ZPs)

	// compute Euclidean (L2) distance directly
	return reid.EuclideanDistance(fpA, fpB)
}

postprocess/reid/reid.go (new file, 129 lines)
@@ -0,0 +1,129 @@
package reid

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"math"
)

// DequantizeAndL2Normalize converts a quantized int8 vector "q" into a float32 vector,
// applies dequantization using the provided scale "s" and zero-point "z",
// and then normalizes the result to unit length using L2 normalization.
//
// This is commonly used to convert quantized embedding vectors back to a
// normalized float form for comparison or similarity calculations.
//
// If the resulting vector has zero magnitude, the function returns the
// unnormalized dequantized vector.
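//
// Dequantization follows x[i] = (q[i] - z) * s; the result is then divided
// by the vector's L2 norm so the returned embedding has unit length.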
func DequantizeAndL2Normalize(q []int8, s float32, z int32) []float32 {

	N := len(q)
	x := make([]float32, N)

	// dequantize
	for i := 0; i < N; i++ {
		x[i] = float32(int32(q[i])-z) * s
	}

	// compute L2 norm
	var sumSquares float32

	for _, v := range x {
		sumSquares += v * v
	}

	norm := float32(math.Sqrt(float64(sumSquares)))

	if norm == 0 {
		// avoid /0
		return x
	}

	// normalize
	for i := 0; i < N; i++ {
		x[i] /= norm
	}

	return x
}

// FingerprintHash takes an L2-normalized []float32 and returns
// a hex-encoded SHA-256 hash of its binary representation.
func FingerprintHash(feat []float32) (string, error) {

	buf := new(bytes.Buffer)

	// write each float32 in little-endian
	for _, v := range feat {
		if err := binary.Write(buf, binary.LittleEndian, v); err != nil {
			return "", err
		}
	}

	sum := sha256.Sum256(buf.Bytes())

	return hex.EncodeToString(sum[:]), nil
}

// CosineSimilarity returns the cosine of the angle between vectors a and b.
// Assumes len(a)==len(b). If you have already L2-normalized them,
// this is just their dot-product.
func CosineSimilarity(a, b []float32) float32 {

	var dot float32

	for i := range a {
		dot += a[i] * b[i]
	}

	// If not already normalized, you'd divide by norms here.
	return dot
}

// CosineDistance returns 1 - cosine similarity. For L2-normalized vectors
// this lies in [0,2], and small values mean "very similar".
func CosineDistance(a, b []float32) float32 {
	return 1 - CosineSimilarity(a, b)
}

// EuclideanDistance returns the L2 distance between two vectors.
// Lower means "more similar" when your features are L2-normalized.
func EuclideanDistance(a, b []float32) float32 {

	var sum float32

	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}

	return float32(math.Sqrt(float64(sum)))
}

// NormalizeVec normalizes the input float32 slice to unit length and returns
// a new slice. If the input vector has zero magnitude, it returns the original
// slice unchanged.
func NormalizeVec(v []float32) []float32 {

	norm := float32(0.0)

	for _, x := range v {
		norm += x * x
	}

	if norm == 0 {
		return v // avoid division by zero
	}

	norm = float32(math.Sqrt(float64(norm)))

	out := make([]float32, len(v))

	for i, x := range v {
		out[i] = x / norm
	}

	return out
}