added ReID example

swdee
2025-07-07 15:31:18 +12:00
parent 2f87fc87e6
commit d18166968f
4 changed files with 843 additions and 0 deletions

README.md

@@ -98,6 +98,8 @@ See the [example](example) directory.
   * [PPOCR Detect](example/ppocr#ppocr-detect) - Takes an image and detects areas of text.
   * [PPOCR Recognise](example/ppocr#ppocr-recognise) - Takes an area of text and performs OCR on it.
   * [PPOCR System](example/ppocr#ppocr-system) - Combines both Detect and Recognise.
+* Tracking
+  * [Re-Identification Demo](example/reid) - Re-Identify (ReID) similar objects for tracking, uses batch processing.
 * Streaming
   * [HTTP Stream with ByteTrack Tracking](example/stream) - Demo that streams a video over HTTP with YOLO object detection and ByteTrack object tracking.
 * Slicing Aided Hyper Inference

example/reid/README.md (new file, 188 lines)

@@ -0,0 +1,188 @@
# Re-Identification (ReID)

## Overview

Object trackers like ByteTrack can be used to track visible objects frame-to-frame,
but they rely on the assumption that an object's appearance and location change
smoothly over time. If a person goes behind a building or is briefly hidden
by another passerby, the tracker can lose that object's identity. When that same
person re-emerges, the tracker often treats them as a new object, assigning a new ID.
This makes analyzing a person's complete path through a scene difficult
and counting unique objects much harder.

Re-Identification (ReID) models help solve this problem by using embedding features
which encode an object into a fixed-length vector that captures distinctive
patterns, shapes, or other visual signatures. When an object disappears and
then reappears you can compare the newly detected object's embedding against a list of
past objects. If the distance (Cosine or Euclidean) is within a chosen
threshold, you can confidently link the new detection back to the
original track ID.
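
This matching step can be sketched in a few lines of Go using this repo's
`postprocess/reid` helpers. The `gallery` map of past track embeddings and the
`matchTrack` helper are illustrative assumptions, not part of the example code:

```
package tracker

import "github.com/swdee/go-rknnlite/postprocess/reid"

// matchTrack returns the ID of the stored track whose embedding is closest
// to the new detection's embedding, or -1 if no stored embedding is within
// the distance threshold
func matchTrack(newFP []float32, gallery map[int][]float32, threshold float32) int {

	bestID, bestDist := -1, threshold

	for id, fp := range gallery {
		// lower Euclidean distance means more similar
		if d := reid.EuclideanDistance(newFP, fp); d < bestDist {
			bestID, bestDist = id, d
		}
	}

	return bestID
}
```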
## Datasets

The [OSNet model](https://paperswithcode.com/paper/omni-scale-feature-learning-for-person-re) is
lightweight and provides good accuracy for re-identification tasks, however
it must be trained on a dataset to identify specific object classes.

This example uses the [Market1501](https://paperswithcode.com/dataset/market-1501)
dataset trained for re-identifying people.

To support other object classifications such as Vehicles, Faces, or Animals, you
will need to source datasets and train models accordingly.
## Occlusion Example

In the [people walking video](https://github.com/swdee/go-rknnlite-data/raw/master/people-walking.mp4)
a lady wearing a CK-branded jacket appears at
the beginning of the scene and becomes occluded by passersby. When she reappears, ByteTrack
detects her as a new person.

![CK Lady](https://github.com/swdee/go-rknnlite-data/raw/master/docimg/reid-ck-lady-movement.jpg)
## Usage

Make sure you have downloaded the data files first for the examples.
You only need to do this once for all examples.
```
cd example/
git clone --depth=1 https://github.com/swdee/go-rknnlite-data.git data
```
Command line usage:
```
$ go run reid.go -h
Usage of /tmp/go-build147978858/b001/exe/reid:
  -d string
        Data file containing object co-ordinates (default "../data/reid-objects.dat")
  -e float
        The Euclidean distance [0.0-1.0], a value less than this defines a match (default 0.51)
  -i string
        Image file to run inference on (default "../data/reid-walking.jpg")
  -m string
        RKNN compiled model file (default "../data/models/rk3588/osnet-market1501-batch8-rk3588.rknn")
  -p string
        Rockchip CPU Model number [rk3562|rk3566|rk3568|rk3576|rk3582|rk3588] (default "rk3588")
```
Run the ReID example on rk3588, or replace with your platform model.
```
cd example/reid/
go run reid.go -p rk3588
```
This will result in the following output:
```
Driver Version: 0.9.6, API Version: 2.3.0 (c949ad889d@2024-11-07T11:35:33)
Model Input Number: 1, Ouput Number: 1
Input tensors:
index=0, name=input, n_dims=4, dims=[8, 256, 128, 3], n_elems=786432, size=786432, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-14, scale=0.018658
Output tensors:
index=0, name=output, n_dims=2, dims=[8, 512, 0, 0], n_elems=4096, size=4096, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.018782
Comparing object 0 at (0,0,134,361)
Object 0 at (0,0,134,361) has euclidean distance: 0.000000 (same person)
Object 1 at (134,0,251,325) has euclidean distance: 0.423271 (same person)
Object 2 at (251,0,326,208) has euclidean distance: 0.465061 (same person)
Object 3 at (326,0,394,187) has euclidean distance: 0.445583 (same person)
Comparing object 1 at (394,0,513,357)
Object 0 at (0,0,134,361) has euclidean distance: 0.781510 (different person)
Object 1 at (134,0,251,325) has euclidean distance: 0.801649 (different person)
Object 2 at (251,0,326,208) has euclidean distance: 0.680299 (different person)
Object 3 at (326,0,394,187) has euclidean distance: 0.686542 (different person)
Comparing object 2 at (513,0,588,246)
Object 0 at (0,0,134,361) has euclidean distance: 0.860921 (different person)
Object 1 at (134,0,251,325) has euclidean distance: 0.873663 (different person)
Object 2 at (251,0,326,208) has euclidean distance: 0.870753 (different person)
Object 3 at (326,0,394,187) has euclidean distance: 0.820761 (different person)
Comparing object 3 at (588,0,728,360)
Object 0 at (0,0,134,361) has euclidean distance: 0.762738 (different person)
Object 1 at (134,0,251,325) has euclidean distance: 0.800668 (different person)
Object 2 at (251,0,326,208) has euclidean distance: 0.763694 (different person)
Object 3 at (326,0,394,187) has euclidean distance: 0.769597 (different person)
Model first run speed: batch preparation=3.900093ms, inference=47.935686ms, post processing=262.203µs, total time=52.097982ms
done
```
### Docker

To run the ReID example using the prebuilt docker image, make sure the data
files have been downloaded first, then run.
```
# from project root directory
docker run --rm \
--device /dev/dri:/dev/dri \
-v "$(pwd):/go/src/app" \
-v "$(pwd)/example/data:/go/src/data" \
-v "/usr/include/rknn_api.h:/usr/include/rknn_api.h" \
-v "/usr/lib/librknnrt.so:/usr/lib/librknnrt.so" \
-w /go/src/app \
swdee/go-rknnlite:latest \
go run ./example/reid/reid.go -p rk3588
```
### Interpreting Results

The above example uses people detected with a YOLOv5 model which are then
cropped to create the sample input.

![CK Lady](https://github.com/swdee/go-rknnlite-data/raw/master/reid-walking.jpg)

Objects A1 to A4 represent the same person, and objects B1, C1, and D1 are other
people from the same scene.
The first set of comparisons:
```
Comparing object 0 [A1] at (0,0,134,361)
Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.000000 (same person)
Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.423271 (same person)
Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.465061 (same person)
Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.445583 (same person)
```
Object 0 is A1; when compared to itself it has a Euclidean distance of 0.0.
Objects 1-3 are A2 to A4; each of these has a similar
distance, ranging from 0.42 to 0.46.

Euclidean distance ranges from 0.0 (same object) to 1.0 (different object), so
the lower the distance, the more similar the objects are. A threshold of `0.51`
defines the maximum distance at which two objects are considered
the same. Your use case and datasets may require calibration of
the ideal threshold.
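
As a side note, because the embeddings are L2-normalized, Euclidean distance
and cosine similarity measure the same thing on different scales:

```
d = sqrt(2 - 2*cos)  =>  cos = 1 - (d^2)/2
```

So the default threshold of `0.51` corresponds to requiring a cosine similarity
of roughly `1 - (0.51^2)/2 ≈ 0.87`.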
The remaining results compare the people B1, C1, and D1.
```
Comparing object 1 [B1] at (394,0,513,357)
Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.781510 (different person)
Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.801649 (different person)
Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.680299 (different person)
Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.686542 (different person)
Comparing object 2 [C1] at (513,0,588,246)
Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.860921 (different person)
Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.873663 (different person)
Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.870753 (different person)
Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.820761 (different person)
Comparing object 3 [D1] at (588,0,728,360)
Object 0 [A1] at (0,0,134,361) has euclidean distance: 0.762738 (different person)
Object 1 [A2] at (134,0,251,325) has euclidean distance: 0.800668 (different person)
Object 2 [A3] at (251,0,326,208) has euclidean distance: 0.763694 (different person)
Object 3 [A4] at (326,0,394,187) has euclidean distance: 0.769597 (different person)
```
All of these other people have a Euclidean distance greater than 0.68, indicating
they are different people.
## Postprocessing

[Convenience functions](../../postprocess/reid/reid.go)
are provided for calculating the Euclidean Distance or Cosine Similarity,
depending on how the model has been trained.
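
A minimal sketch of calling these functions directly is shown below. The
quantized values are made up for illustration, while the scale and zero-point
match the output tensor attributes printed in the example output above:

```
package main

import (
	"fmt"

	"github.com/swdee/go-rknnlite/postprocess/reid"
)

func main() {
	// example quantized embedding vectors as returned in the INT8 output
	// tensor, truncated to four elements for illustration
	rawA := []int8{12, -3, 44, 7}
	rawB := []int8{10, -1, 40, 9}

	// dequantize using the output tensor's scale and zero-point, then
	// L2-normalize so the vectors can be compared directly
	fpA := reid.DequantizeAndL2Normalize(rawA, 0.018782, -128)
	fpB := reid.DequantizeAndL2Normalize(rawB, 0.018782, -128)

	fmt.Println("euclidean distance:", reid.EuclideanDistance(fpA, fpB))
	fmt.Println("cosine distance:", reid.CosineDistance(fpA, fpB))
}
```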

example/reid/reid.go (new file, 524 lines)

@@ -0,0 +1,524 @@
package main

import (
	"bufio"
	"flag"
	"fmt"
	"image"
	"log"
	"os"
	"strconv"
	"strings"
	"time"

	"github.com/swdee/go-rknnlite"
	"github.com/swdee/go-rknnlite/postprocess/reid"
	"gocv.io/x/gocv"
)
func main() {

	// disable logging timestamps
	log.SetFlags(0)

	// read in cli flags
	modelFile := flag.String("m", "../data/models/rk3588/osnet-market1501-batch8-rk3588.rknn", "RKNN compiled model file")
	imgFile := flag.String("i", "../data/reid-walking.jpg", "Image file to run inference on")
	objsFile := flag.String("d", "../data/reid-objects.dat", "Data file containing object co-ordinates")
	rkPlatform := flag.String("p", "rk3588", "Rockchip CPU Model number [rk3562|rk3566|rk3568|rk3576|rk3582|rk3588]")
	euDist := flag.Float64("e", 0.51, "The Euclidean distance [0.0-1.0], a value less than this defines a match")

	flag.Parse()

	err := rknnlite.SetCPUAffinityByPlatform(*rkPlatform, rknnlite.FastCores)
	if err != nil {
		log.Printf("Failed to set CPU Affinity: %v", err)
	}

	// check if user specified model file or if default is being used. if default
	// then pick the default platform model to use.
	if f := flag.Lookup("m"); f != nil && f.Value.String() == f.DefValue && *rkPlatform != "rk3588" {
		*modelFile = strings.ReplaceAll(*modelFile, "rk3588", *rkPlatform)
	}

	// create rknn runtime instance
	rt, err := rknnlite.NewRuntimeByPlatform(*rkPlatform, *modelFile)
	if err != nil {
		log.Fatal("Error initializing RKNN runtime: ", err)
	}

	// set runtime to leave output tensors as int8
	rt.SetWantFloat(false)

	// optional querying of model file tensors and SDK version for printing
	// to stdout. not necessary for production inference code
	err = rt.Query(os.Stdout)
	if err != nil {
		log.Fatal("Error querying runtime: ", err)
	}

	// load objects file
	objs, err := ParseObjects(*objsFile)
	if err != nil {
		log.Fatal("Error parsing objects: ", err)
	}

	// load image
	img := gocv.IMRead(*imgFile, gocv.IMReadColor)
	if img.Empty() {
		log.Fatal("Error reading image from: ", *imgFile)
	}

	// convert colorspace
	srcImg := gocv.NewMat()
	gocv.CvtColor(img, &srcImg, gocv.ColorBGRToRGB)

	defer img.Close()
	defer srcImg.Close()

	start := time.Now()

	// create a batch to process all of the compare and dataset images
	// in a single forward pass
	batch := rknnlite.NewBatch(
		int(rt.InputAttrs()[0].Dims[0]),
		int(rt.InputAttrs()[0].Dims[2]),
		int(rt.InputAttrs()[0].Dims[1]),
		int(rt.InputAttrs()[0].Dims[3]),
		rt.GetInputTypeFloat32(),
	)

	// scale size is the size of the input tensor dimensions to scale the object to
	scaleSize := image.Pt(int(rt.InputAttrs()[0].Dims[1]), int(rt.InputAttrs()[0].Dims[2]))

	// add the compare images to the batch
	for _, cmpObj := range objs.Compare {
		err := AddObjectToBatch(batch, srcImg, cmpObj, scaleSize)
		if err != nil {
			log.Fatal("Error creating batch: ", err)
		}
	}

	// add the dataset images to the batch
	for _, dtObj := range objs.Dataset {
		err := AddObjectToBatch(batch, srcImg, dtObj, scaleSize)
		if err != nil {
			log.Fatal("Error creating batch: ", err)
		}
	}

	defer batch.Close()

	endBatch := time.Now()

	// run inference on the batch
	outputs, err := rt.Inference([]gocv.Mat{batch.Mat()})

	endInference := time.Now()

	if err != nil {
		log.Fatal("Runtime inferencing failed with error: ", err)
	}

	// get total number of compare objects
	totalCmp := len(objs.Compare)

	// compare each object to those objects in the dataset for similarity
	for i, cmpObj := range objs.Compare {

		// get the compare object's output
		cmpOutput, err := batch.GetOutputInt(i, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))
		if err != nil {
			log.Fatal("Getting output tensor failed with error: ", err)
		}

		log.Printf("Comparing object %d at (%d,%d,%d,%d)\n", i,
			cmpObj.X1, cmpObj.Y1, cmpObj.X2, cmpObj.Y2)

		for j, dtObj := range objs.Dataset {

			// get each object's output
			nextOutput, err := batch.GetOutputInt(totalCmp+j, outputs.Output[0], int(outputs.OutputAttributes().DimForDFL))
			if err != nil {
				log.Fatal("Getting output tensor failed with error: ", err)
			}

			dist := CompareObjects(
				cmpOutput,
				nextOutput,
				outputs.OutputAttributes().Scales[0],
				outputs.OutputAttributes().ZPs[0],
			)

			// check euclidean distance to determine if it is the same person or not
			objRes := "different person"

			if dist < float32(*euDist) {
				objRes = "same person"
			}

			log.Printf(" Object %d at (%d,%d,%d,%d) has euclidean distance: %f (%s)\n",
				j,
				dtObj.X1, dtObj.Y1, dtObj.X2, dtObj.Y2,
				dist, objRes)
		}
	}

	endCompare := time.Now()

	log.Printf("Model first run speed: batch preparation=%s, inference=%s, post processing=%s, total time=%s\n",
		endBatch.Sub(start).String(),
		endInference.Sub(endBatch).String(),
		endCompare.Sub(endInference).String(),
		endCompare.Sub(start).String(),
	)

	// free outputs allocated in C memory after you have finished post processing
	err = outputs.Free()
	if err != nil {
		log.Fatal("Error freeing Outputs: ", err)
	}

	// close runtime and release resources
	err = rt.Close()
	if err != nil {
		log.Fatal("Error closing RKNN runtime: ", err)
	}

	log.Println("done")
}
// Box holds object bounding box coordinates (x1, y1, x2, y2)
type Box struct {
	X1, Y1, X2, Y2 int
}

// Objects represents the compare and dataset objects parsed from the
// objects data file
type Objects struct {
	Compare []Box
	Dataset []Box
}

// ParseObjects reads the TOML-like objects data file and returns the two
// lists of objects and their bounding box coordinates. The file contains
// [compare] and [dataset] section headers, each followed by data lines of
// four comma separated integers, eg: "x1, y1, x2, y2"
func ParseObjects(path string) (*Objects, error) {

	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	objs := &Objects{}
	section := "" // either "compare" or "dataset"

	scanner := bufio.NewScanner(f)

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())

		// skip blank or comment
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}

		// section header
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			section = strings.ToLower(line[1 : len(line)-1])
			continue
		}

		// data line, expect four ints separated by commas
		fields := strings.Split(line, ",")
		if len(fields) != 4 {
			return nil, fmt.Errorf("invalid data line %q", line)
		}

		nums := make([]int, 4)

		for i, fstr := range fields {
			v, err := strconv.Atoi(strings.TrimSpace(fstr))
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", fstr, err)
			}
			nums[i] = v
		}

		// define box
		box := Box{nums[0], nums[1], nums[2], nums[3]}

		switch section {
		case "compare":
			objs.Compare = append(objs.Compare, box)
		case "dataset":
			objs.Dataset = append(objs.Dataset, box)
		default:
			return nil, fmt.Errorf("line %q outside of a known section", line)
		}
	}

	if err := scanner.Err(); err != nil {
		return nil, err
	}

	return objs, nil
}

// AddObjectToBatch adds the cropped object from the source image to the
// batch for running inference on
func AddObjectToBatch(batch *rknnlite.Batch, srcImg gocv.Mat, obj Box,
	scaleSize image.Point) error {

	// get the object's region of interest from the source Mat
	objRect := image.Rect(obj.X1, obj.Y1, obj.X2, obj.Y2)
	objRoi := srcImg.Region(objRect)

	// resize the cropped object to the model's input tensor size
	objImg := objRoi.Clone()
	gocv.Resize(objRoi, &objImg, scaleSize, 0, 0, gocv.InterpolationArea)

	defer objRoi.Close()
	defer objImg.Close()

	return batch.Add(objImg)
}

// CompareObjects compares the quantized output vectors of two objects and
// returns the Euclidean distance between their embeddings
func CompareObjects(objA []int8, objB []int8, scales float32,
	ZPs int32) float32 {

	// get the fingerprint of both objects
	fpA := reid.DequantizeAndL2Normalize(objA, scales, ZPs)
	fpB := reid.DequantizeAndL2Normalize(objB, scales, ZPs)

	// compute Euclidean (L2) distance directly
	return reid.EuclideanDistance(fpA, fpB)
}

postprocess/reid/reid.go (new file, 129 lines)

@@ -0,0 +1,129 @@
package reid

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"math"
)

// DequantizeAndL2Normalize converts a quantized int8 vector "q" into a float32 vector,
// applies dequantization using the provided scale "s" and zero-point "z",
// and then normalizes the result to unit length using L2 normalization.
//
// This is commonly used to convert quantized embedding vectors back to a
// normalized float form for comparison or similarity calculations.
//
// If the resulting vector has zero magnitude, the function returns the
// unnormalized dequantized vector.
func DequantizeAndL2Normalize(q []int8, s float32, z int32) []float32 {

	N := len(q)
	x := make([]float32, N)

	// dequantize
	for i := 0; i < N; i++ {
		x[i] = float32(int32(q[i])-z) * s
	}

	// compute L2 norm
	var sumSquares float32

	for _, v := range x {
		sumSquares += v * v
	}

	norm := float32(math.Sqrt(float64(sumSquares)))

	if norm == 0 {
		// avoid division by zero
		return x
	}

	// normalize
	for i := 0; i < N; i++ {
		x[i] /= norm
	}

	return x
}

// FingerprintHash takes an L2-normalized []float32 and returns
// a hex-encoded SHA-256 hash of its binary representation.
func FingerprintHash(feat []float32) (string, error) {

	buf := new(bytes.Buffer)

	// write each float32 in little-endian byte order
	for _, v := range feat {
		if err := binary.Write(buf, binary.LittleEndian, v); err != nil {
			return "", err
		}
	}

	sum := sha256.Sum256(buf.Bytes())

	return hex.EncodeToString(sum[:]), nil
}

// CosineSimilarity returns the cosine of the angle between vectors a and b.
// Assumes len(a)==len(b). If you have already L2-normalized them,
// this is just their dot product.
func CosineSimilarity(a, b []float32) float32 {

	var dot float32

	for i := range a {
		dot += a[i] * b[i]
	}

	// if not already normalized, you'd divide by the norms here
	return dot
}

// CosineDistance returns 1 - CosineSimilarity, a dissimilarity score in
// [0,2] where small values mean "very similar".
func CosineDistance(a, b []float32) float32 {
	return 1 - CosineSimilarity(a, b)
}

// EuclideanDistance returns the L2 distance between two vectors.
// Lower means "more similar" when your features are L2-normalized.
func EuclideanDistance(a, b []float32) float32 {

	var sum float32

	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}

	return float32(math.Sqrt(float64(sum)))
}

// NormalizeVec normalizes the input float32 slice to unit length and returns
// a new slice. If the input vector has zero magnitude, it returns the original
// slice unchanged.
func NormalizeVec(v []float32) []float32 {

	norm := float32(0.0)

	for _, x := range v {
		norm += x * x
	}

	if norm == 0 {
		return v // avoid division by zero
	}

	norm = float32(math.Sqrt(float64(norm)))

	out := make([]float32, len(v))

	for i, x := range v {
		out[i] = x / norm
	}

	return out
}
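
For L2-normalized vectors the Euclidean and cosine measures are linked by
d^2 = 2 * (1 - cosine similarity), so either function can be used with a
suitably scaled threshold. The sketch below (a hypothetical reid_test.go,
not part of this commit) checks that property against the package's functions.

postprocess/reid/reid_test.go (hypothetical)

package reid

import (
	"math"
	"testing"
)

// TestDistanceIdentity checks that for L2-normalized vectors the relation
// d^2 = 2*(1 - cosine similarity) holds for this package's functions
func TestDistanceIdentity(t *testing.T) {

	a := NormalizeVec([]float32{1, 2, 3, 4})
	b := NormalizeVec([]float32{4, 3, 2, 1})

	d := EuclideanDistance(a, b)
	cos := CosineSimilarity(a, b)

	if diff := math.Abs(float64(d*d - 2*(1-cos))); diff > 1e-5 {
		t.Fatalf("identity violated: d^2=%f, 2(1-cos)=%f", d*d, 2*(1-cos))
	}
}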