# BufReader: Zero-Copy Network Reading with Non-Contiguous Memory Buffers

## Table of Contents

- [1. Problem: Traditional Contiguous Memory Buffer Bottlenecks](#1-problem-traditional-contiguous-memory-buffer-bottlenecks)
- [2. Core Solution: Non-Contiguous Memory Buffer Passing Mechanism](#2-core-solution-non-contiguous-memory-buffer-passing-mechanism)
- [3. Performance Validation](#3-performance-validation)
- [4. Usage Guide](#4-usage-guide)

## TL;DR (Key Takeaways)

**Core Innovation**: Non-Contiguous Memory Buffer Passing Mechanism

- Data stored as **sliced memory blocks** in a non-contiguous layout
- References passed via the **ReadRange callback**, zero-copy
- Memory blocks **reused from an object pool**, avoiding allocation and GC

**Performance Data** (streaming server, 100 concurrent streams):

```
bufio.Reader: 79 GB allocated, 134 GCs, 374.6 ns/op
BufReader:    0.6 GB allocated,   2 GCs, 30.29 ns/op

Result: 98.5% GC reduction, 11.6x throughput improvement
```

**Ideal For**: high-concurrency network servers, streaming media, long-running services

---

## 1. Problem: Traditional Contiguous Memory Buffer Bottlenecks

### 1.1 bufio.Reader's Contiguous Memory Model

The standard library `bufio.Reader` uses a **fixed-size contiguous memory buffer**:

```go
type Reader struct {
    buf  []byte // Single contiguous buffer (e.g., 4KB)
    r, w int    // Read/write positions
}

func (b *Reader) Read(p []byte) (n int, err error) {
    // Copy from the contiguous buffer to the target
    n = copy(p, b.buf[b.r:b.w]) // Must copy
    return
}
```

**Cost of Contiguous Memory**:

```
Reading 16KB of data (with a 4KB buffer):

Network → bufio buffer → User buffer
          ↓ (4KB contiguous)    ↓
1st  [████] → Copy to result[0:4KB]
2nd  [████] → Copy to result[4KB:8KB]
3rd  [████] → Copy to result[8KB:12KB]
4th  [████] → Copy to result[12KB:16KB]

Total: 4 network reads + 4 memory copies
Allocates result (16KB of contiguous memory)
```
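
To make the copy chain concrete, here is a minimal, self-contained sketch of the pattern the diagram describes. The 4KB buffer and 16KB payload sizes are illustrative, and `bytes.Reader` stands in for a network connection:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

func main() {
	payload := make([]byte, 16<<10) // 16KB of "network" data
	src := bufio.NewReaderSize(bytes.NewReader(payload), 4<<10)

	result := make([]byte, 0, len(payload)) // contiguous target must be allocated
	chunk := make([]byte, 4<<10)
	copies := 0
	for {
		n, err := src.Read(chunk) // copy #1: out of bufio's internal buffer
		if n > 0 {
			result = append(result, chunk[:n]...) // copy #2: into the contiguous result
			copies++
		}
		if err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
	}
	fmt.Printf("reassembled %d bytes in %d copy steps\n", len(result), copies)
}
```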

### 1.2 Issues in High-Concurrency Scenarios

In streaming servers (100 concurrent connections, 30fps each):

```go
// Typical processing pattern
func handleStream(conn net.Conn) {
    reader := bufio.NewReaderSize(conn, 4096)
    for {
        // Allocate a contiguous buffer for each packet
        packet := make([]byte, 1024) // Allocation 1
        n, _ := reader.Read(packet)  // Copy 1

        // Forward to multiple subscribers
        for _, sub := range subscribers {
            data := make([]byte, n) // Allocations 2-N
            copy(data, packet[:n])  // Copies 2-N
            sub.Write(data)
        }
    }
}

// Performance impact:
// 100 connections × 30fps × (1 + subscribers) allocations = massive temporary memory
// Triggers frequent GC and destabilizes the system
```

**Core Problems**:
1. A contiguous memory layout must be maintained → frequent copying
2. A new buffer is allocated for each packet → massive numbers of temporary objects
3. Forwarding requires multiple copies → CPU wasted on memory operations

## 2. Core Solution: Non-Contiguous Memory Buffer Passing Mechanism

### 2.1 Design Philosophy

BufReader uses **non-contiguous memory block slices**:

```
Data is no longer required to live in contiguous memory:
1. Data is scattered across multiple memory blocks (a slice of blocks)
2. Each block is independently managed and reused
3. Blocks are passed by reference, with no data copying
```

**Core Data Structures**:

```go
type BufReader struct {
    Allocator *ScalableMemoryAllocator // Object pool allocator
    buf       MemoryReader             // Memory block slice
}

type MemoryReader struct {
    Buffers [][]byte // Multiple memory blocks, non-contiguous!
    Size    int      // Total size
    Length  int      // Readable length
}
```

### 2.2 Non-Contiguous Memory Buffer Model

#### Contiguous vs Non-Contiguous Comparison

```
bufio.Reader (Contiguous Memory):
┌─────────────────────────────────┐
│        4KB Fixed Buffer         │
│  [Read][Available]              │
└─────────────────────────────────┘
- Must copy into a contiguous target buffer
- Fixed size limitation
- The already-read portion wastes space

BufReader (Non-Contiguous Memory):
┌──────┐ ┌──────┐ ┌────────┐ ┌──────┐
│Block1│→│Block2│→│ Block3 │→│Block4│
│ 512B │ │ 1KB  │ │  2KB   │ │ 3KB  │
└──────┘ └──────┘ └────────┘ └──────┘
- Each block's reference is passed directly (zero-copy)
- Flexible block sizes
- Recycled immediately after processing
```
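
Although it is not part of BufReader itself, Go's standard library already embraces the same idea on the write side: `net.Buffers` is a `[][]byte` that can be flushed to a socket in a single call, using vectored I/O (writev) on platforms that support it, so non-contiguous blocks never need to be glued together before sending. A minimal sketch:

```go
import "net"

// writeBlocks sends a slice of non-contiguous memory blocks to a socket
// without first copying them into one contiguous buffer.
func writeBlocks(conn net.Conn, blocks [][]byte) error {
	bufs := net.Buffers(blocks)  // [][]byte that implements io.WriterTo
	_, err := bufs.WriteTo(conn) // vectored write (writev) where supported
	return err
}
```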

#### Memory Block Chain Workflow

```mermaid
sequenceDiagram
    participant N as Network
    participant P as Object Pool
    participant B as BufReader.buf
    participant U as User Code

    N->>P: 1st read (returns 512B)
    P-->>B: Block1 (512B) - from pool or new
    B->>B: Buffers = [Block1]

    N->>P: 2nd read (returns 1KB)
    P-->>B: Block2 (1KB) - reused from pool
    B->>B: Buffers = [Block1, Block2]

    N->>P: 3rd read (returns 2KB)
    P-->>B: Block3 (2KB)
    B->>B: Buffers = [Block1, Block2, Block3]

    N->>P: 4th read (returns 1KB)
    P-->>B: Block4 (1KB)
    B->>B: Buffers = [Block1, Block2, Block3, Block4]

    U->>B: ReadRange(4096)
    B->>U: yield(Block1) - pass reference
    B->>U: yield(Block2) - pass reference
    B->>U: yield(Block3) - pass reference
    B->>U: yield(Block4[0:512])

    U->>B: Processing complete
    B->>P: Recycle Block1, Block2, Block3, Block4
    Note over P: Memory blocks return to pool for reuse
```

### 2.3 Zero-Copy Passing: ReadRange API

**Core API**:

```go
func (r *BufReader) ReadRange(n int, yield func([]byte)) error
```

**How It Works**:

```go
// Internal implementation (simplified)
func (r *BufReader) ReadRange(n int, yield func([]byte)) error {
    remaining := n

    // Iterate through the memory block slice
    for _, block := range r.buf.Buffers {
        if remaining <= 0 {
            break
        }

        if len(block) <= remaining {
            // Pass the entire block
            yield(block) // Zero-copy: pass the reference directly!
            remaining -= len(block)
        } else {
            // Pass a portion of the block
            yield(block[:remaining])
            remaining = 0
        }
    }

    // Recycle the processed blocks
    r.recycleFront()
    return nil
}
```

**Usage Example**:

```go
// Read 4096 bytes of data
reader.ReadRange(4096, func(chunk []byte) {
    // chunk is a reference to an original memory block
    // The callback may fire multiple times with different block sizes,
    // e.g.: 512B, 1KB, 2KB, 512B

    processData(chunk) // Process directly, zero-copy!
})

// Characteristics:
// - No target buffer to allocate
// - No data to copy
// - Each chunk is automatically recycled after processing
```
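
When a downstream API genuinely requires contiguous memory (for example, a parser that needs the whole packet at once), the chunks can still be gathered with one explicit, deliberate copy. This is a sketch of that pattern; `parsePacket` is a hypothetical consumer:

```go
// Gather the non-contiguous chunks into one contiguous packet.
// The append is the only copy, paid deliberately and only when needed.
packet := make([]byte, 0, 4096)
err := reader.ReadRange(4096, func(chunk []byte) {
	packet = append(packet, chunk...)
})
if err == nil {
	parsePacket(packet) // hypothetical parser that needs contiguous data
}
```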

### 2.4 Advantages in Real Network Scenarios

**Scenario: read 10KB from the network, where each read returns 500B-2KB**

```
bufio.Reader (Contiguous Memory):
1. Read 2KB into the internal buffer (contiguous)
2. Copy 2KB to the user buffer      ← Copy
3. Read 1.5KB into the internal buffer
4. Copy 1.5KB to the user buffer    ← Copy
5. Read 2KB...
6. Copy 2KB...                      ← Copy
... Repeat ...
Total: multiple network reads + multiple memory copies
Must allocate a 10KB contiguous buffer

BufReader (Non-Contiguous Memory):
1. Read 2KB   → Block1, append to slice
2. Read 1.5KB → Block2, append to slice
3. Read 2KB   → Block3, append to slice
4. Read 2KB   → Block4, append to slice
5. Read 2.5KB → Block5, append to slice
6. ReadRange(10KB):
   → yield(Block1) - 2KB
   → yield(Block2) - 1.5KB
   → yield(Block3) - 2KB
   → yield(Block4) - 2KB
   → yield(Block5) - 2.5KB
Total: multiple network reads + 0 memory copies
No contiguous memory needed, processed block by block
```

### 2.5 Real Application: Stream Forwarding

**Problem Scenario**: 100 concurrent streams, each forwarded to 10 subscribers

**Traditional Approach** (Contiguous Memory):

```go
func forwardStream_Traditional(reader *bufio.Reader, subscribers []net.Conn) {
    packet := make([]byte, 4096) // Alloc 1: contiguous memory
    n, _ := reader.Read(packet)  // Copy 1: from the bufio buffer

    // Copy for each subscriber
    for _, sub := range subscribers {
        data := make([]byte, n) // Allocs 2-11: 10 times
        copy(data, packet[:n])  // Copies 2-11: 10 times
        sub.Write(data)
    }
}
// Per packet: 11 allocations + 11 copies
// 100 streams × 30fps × 11 = 33,000 allocations/sec
```

**BufReader Approach** (Non-Contiguous Memory):

```go
func forwardStream_BufReader(reader *BufReader, subscribers []net.Conn) {
    reader.ReadRange(4096, func(chunk []byte) {
        // chunk is a reference to an original memory block, possibly non-contiguous
        // All subscribers share the same memory block!

        for _, sub := range subscribers {
            sub.Write(chunk) // Send the reference directly, zero-copy
        }
    })
}
// Per packet: 0 allocations + 0 copies
// 100 streams × 30fps × 0 = 0 allocations/sec
```

**Performance Comparison**:
- Allocations: 33,000/sec → 0/sec
- Memory copies: 33,000/sec → 0/sec
- GC pressure: high → very low

### 2.6 Memory Block Lifecycle

```mermaid
stateDiagram-v2
    state "Get from Pool" as Get
    state "Read Network Data" as Read
    state "Append to Slice" as Append
    state "Pass to User" as Pass
    state "User Processing" as Process
    state "Recycle to Pool" as Recycle

    [*] --> Get
    Get --> Read
    Read --> Append
    Append --> Pass
    Pass --> Process
    Process --> Recycle
    Recycle --> Get

    note right of Get
        Reuse existing blocks
        Avoid GC
    end note

    note right of Pass
        Pass reference, zero-copy
        May pass to multiple subscribers
    end note

    note right of Recycle
        Active recycling
        Immediately reusable
    end note
```

**Key Points**:
1. Memory blocks are **circularly reused** through the pool, bypassing GC (see the sketch after this list)
2. References are passed instead of copying data, achieving zero-copy
3. Blocks are recycled immediately after processing, minimizing the memory footprint
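
The internals of `ScalableMemoryAllocator` are beyond the scope of this article, but the recycle loop above can be illustrated with a much smaller stand-in built on `sync.Pool`. This is only an analogy for the lifecycle, not the actual allocator, and `process` is a placeholder for user code:

```go
import (
	"io"
	"sync"
)

// blockPool is a toy stand-in for the object-pool allocator: blocks go
// out for a read, are handed to user code by reference, then come back.
var blockPool = sync.Pool{
	New: func() any { return make([]byte, 2048) },
}

func readOnce(conn io.Reader) error {
	block := blockPool.Get().([]byte) // 1. get a block from the pool
	defer blockPool.Put(block)        // 4. recycle it for the next read

	n, err := conn.Read(block) // 2. fill it from the network
	if err != nil {
		return err
	}
	process(block[:n]) // 3. pass the reference to user code (zero-copy)
	return nil
}
```

Note that `sync.Pool` may drop idle objects at GC time, which is one reason a dedicated allocator with explicit recycling gives more predictable reuse.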

### 2.7 Core Code Implementation

```go
// Create a BufReader (simplified)
func NewBufReader(reader io.Reader) (r *BufReader) {
    r = &BufReader{
        Allocator: NewScalableMemoryAllocator(16384), // Object pool
    }
    r.feedData = func() error {
        // Get a memory block from the pool and read network data into it directly
        buf, err := r.Allocator.Read(reader, r.BufLen)
        if err != nil {
            return err
        }
        // Append to the slice (only adds a reference)
        r.buf.Buffers = append(r.buf.Buffers, buf)
        r.buf.Length += len(buf)
        return nil
    }
    return
}

// Zero-copy reading
func (r *BufReader) ReadRange(n int, yield func([]byte)) error {
    for r.buf.Length < n {
        if err := r.feedData(); err != nil { // Read more data from the network
            return err
        }
    }

    // Pass references block by block
    for _, block := range r.buf.Buffers {
        yield(block) // Zero-copy passing
    }

    // Recycle the processed blocks
    r.recycleFront()
    return nil
}

// Recycle memory blocks to the pool
func (r *BufReader) Recycle() {
    if r.Allocator != nil {
        r.Allocator.Recycle() // Return all blocks to the pool
    }
}
```

## 3. Performance Validation

### 3.1 Test Design

**Real Network Simulation**: each read returns a random size (64-2048 bytes), simulating real network fluctuations; a sketch of such a reader follows the list below.

**Core Test Scenarios**:
1. **Concurrent Network Connection Reading** - simulates 100+ concurrent connections
2. **GC Pressure Test** - demonstrates the difference over long-term running
3. **Streaming Server** - a real business scenario (100 streams × forwarding)
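
The benchmark source is not reproduced here, but a reader that returns a random 64-2048 bytes per call, as the simulation describes, can be sketched like this (`jitterReader` is an illustrative name, not the benchmark's actual type):

```go
import (
	"io"
	"math/rand"
)

// jitterReader simulates a network connection whose reads return
// unpredictable sizes, the way real sockets behave under load.
type jitterReader struct {
	src io.Reader
}

func (j *jitterReader) Read(p []byte) (int, error) {
	n := 64 + rand.Intn(2048-64+1) // uniform in [64, 2048]
	if n > len(p) {
		n = len(p)
	}
	return j.src.Read(p[:n])
}
```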

### 3.2 Performance Test Results

**Test Environment**: Apple M2 Pro, Go 1.23.0

#### GC Pressure Test (Core Comparison)

| Metric | bufio.Reader | BufReader | Improvement |
|--------|-------------|-----------|-------------|
| Operation Latency | 1874 ns/op | 112.7 ns/op | **16.6x faster** |
| Allocation Count | 5,576,659 | 3,918 | **99.93% reduction** |
| Per Operation | 2 allocs/op | 0 allocs/op | **Zero allocation** |
| Throughput | 2.8M ops/s | 45.7M ops/s | **16x improvement** |

#### Streaming Server Scenario

| Metric | bufio.Reader | BufReader | Improvement |
|--------|-------------|-----------|-------------|
| Operation Latency | 374.6 ns/op | 30.29 ns/op | **12.4x faster** |
| Memory Allocation | 79,508 MB | 601 MB | **99.2% reduction** |
| **GC Runs** | **134** | **2** | **98.5% reduction** ⭐ |
| Throughput | 10.1M ops/s | 117M ops/s | **11.6x improvement** |

#### Performance Visualization

```
📊 GC Runs Comparison (Core Advantage)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 134 runs
BufReader    █ 2 runs ← 98.5% reduction!

📊 Total Memory Allocation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 79 GB
BufReader    █ 0.6 GB ← 99.2% reduction!

📊 Throughput Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader █████ 10.1M ops/s
BufReader    ████████████████████████████████████████████████████████ 117M ops/s
```

### 3.3 Why Non-Contiguous Memory Is So Fast

**Reason 1: Zero-Copy Passing**

```go
// bufio - must copy
buf := make([]byte, 1024)
reader.Read(buf) // Copies into contiguous memory

// BufReader - pass a reference
reader.ReadRange(1024, func(chunk []byte) {
    // chunk is the original memory block, no copy
})
```

**Reason 2: Memory Block Reuse**

```
bufio:     Allocate → Use → GC → Reallocate → ...
BufReader: Allocate → Use → Return to pool → Reuse from pool → ...
           ↑ The same memory block is reused repeatedly, no GC
```

**Reason 3: Multi-Subscriber Sharing**

```
Traditional: 1 packet → copied 10 times → 10 subscribers
BufReader:   1 packet → reference passed → 10 subscribers share it
             ↑ Only 1 memory block; all 10 subscribers reference it
```

## 4. Usage Guide

### 4.1 Basic Usage

```go
func handleConnection(conn net.Conn) {
    // Create a BufReader
    reader := util.NewBufReader(conn)
    defer reader.Recycle() // Return all blocks to the pool

    // Zero-copy read and process
    reader.ReadRange(4096, func(chunk []byte) {
        // chunk is a non-contiguous memory block
        // Process it directly, no copy needed
        processChunk(chunk)
    })
}
```

### 4.2 Real-World Use Cases

**Scenario 1: Protocol Parsing**

```go
// Parse an FLV packet (header + data)
func parseFLV(reader *BufReader) {
    // Read the packet type (1 byte)
    packetType, _ := reader.ReadByte()

    // Read the data size (3 bytes)
    dataSize, _ := reader.ReadBE32(3)

    // Skip the timestamp etc. (7 bytes)
    reader.Skip(7)

    // Zero-copy read of the data (may span multiple non-contiguous blocks)
    reader.ReadRange(int(dataSize), func(chunk []byte) {
        // chunk may be the complete data or a portion of it
        // Parse block by block, no need to wait for the complete data
        parseDataChunk(packetType, chunk)
    })
}
```

**Scenario 2: High-Concurrency Forwarding**

```go
// Read from one source, forward to multiple targets
func relay(source *BufReader, targets []io.Writer) {
    source.ReadRange(8192, func(chunk []byte) {
        // All targets share the same memory block
        for _, target := range targets {
            target.Write(chunk) // Zero-copy forwarding
        }
    })
}
```

**Scenario 3: Streaming Server**

```go
// Receive an RTSP stream and distribute it to subscribers
type Stream struct {
    reader      *BufReader
    subscribers []*Subscriber
}

func (s *Stream) Process() {
    s.reader.ReadRange(65536, func(frame []byte) {
        // frame may be part of a video frame (non-contiguous)
        // Send it directly to all subscribers
        for _, sub := range s.subscribers {
            sub.WriteFrame(frame) // Shared memory, zero-copy
        }
    })
}
```

### 4.3 Best Practices

**✅ Correct Usage**:

```go
// 1. Always recycle resources
reader := util.NewBufReader(conn)
defer reader.Recycle()

// 2. Process directly in the callback; don't save references
reader.ReadRange(1024, func(data []byte) {
    processData(data) // ✅ Process immediately
})

// 3. Copy explicitly when retention is needed
var saved []byte
reader.ReadRange(1024, func(data []byte) {
    saved = append(saved, data...) // ✅ Explicit copy
})
```

**❌ Wrong Usage**:

```go
// ❌ Don't save references
var dangling []byte
reader.ReadRange(1024, func(data []byte) {
    dangling = data // Wrong: data will be recycled
})
// dangling is now a dangling reference!

// ❌ Don't forget to recycle
reader := util.NewBufReader(conn)
// Missing: defer reader.Recycle()
// Memory blocks cannot be returned to the pool
```

### 4.4 Performance Optimization Tips

**Tip 1: Batch Processing**

```go
// ✅ Optimized: read multiple packets at once
reader.ReadRange(65536, func(chunk []byte) {
    // One chunk may contain multiple length-prefixed packets
    for len(chunk) >= 4 {
        size := int(binary.BigEndian.Uint32(chunk[:4]))
        if len(chunk) < 4+size {
            break // Packet continues in the next chunk (simplified handling)
        }
        packet := chunk[4 : 4+size]
        processPacket(packet)
        chunk = chunk[4+size:]
    }
})
```

**Tip 2: Choose an Appropriate Block Size**

```go
// Choose based on the application scenario
const (
    SmallPacket  = 4 << 10  // 4KB  - RTSP/HTTP
    MediumPacket = 16 << 10 // 16KB - Audio streams
    LargePacket  = 64 << 10 // 64KB - Video streams
)

reader := util.NewBufReaderWithBufLen(conn, LargePacket)
```

## 5. Summary

### Core Innovation: Non-Contiguous Memory Buffering

BufReader's core is not "better buffering" but **a fundamental change to the memory layout model**:

```
Traditional thinking: data must live in contiguous memory
BufReader:            data can be scattered across blocks and passed by reference

Result:
✓ Zero copies: no reassembly into contiguous memory
✓ Zero allocations: memory blocks reused from the object pool
✓ Zero GC pressure: no temporary objects created
```

### Key Advantages

| Feature | Implementation | Performance Impact |
|---------|---------------|-------------------|
| **Zero-Copy** | Pass memory block references | No copy overhead |
| **Zero Allocation** | Object pool reuse | 98.5% GC reduction |
| **Multi-Subscriber Sharing** | Same block referenced multiple times | 10x+ memory savings |
| **Flexible Block Sizes** | Adapt to network fluctuations | No reassembly needed |

### Ideal Use Cases

| Scenario | Recommended | Reason |
|----------|------------|---------|
| **High-concurrency network servers** | BufReader ⭐ | 98% GC reduction, 10x+ throughput |
| **Stream forwarding** | BufReader ⭐ | Zero-copy multicast, memory sharing |
| **Protocol parsers** | BufReader ⭐ | Parse block by block, no complete packet needed |
| **Long-running services** | BufReader ⭐ | Stable system, minimal GC impact |
| Simple file reading | bufio.Reader | Standard library is sufficient |

### Key Points

Remember when using BufReader:

1. **Accept non-contiguous data**: process each block via the callback
2. **Don't hold references**: data is recycled after the callback returns
3. **Leverage ReadRange**: this is the core zero-copy API
4. **Always call Recycle()**: return the memory blocks to the pool

### Performance Data

**Streaming server (100 concurrent streams, running continuously)**:

```
Estimated over 1 hour of running:

bufio.Reader (Contiguous Memory):
- Allocates 2.8 TB of memory
- Triggers 4,800 GCs
- Frequent system pauses

BufReader (Non-Contiguous Memory):
- Allocates 21 GB of memory (133x less)
- Triggers 72 GCs (67x less)
- Almost no GC impact
```

### Testing and Documentation

**Run Tests**:
```bash
sh scripts/benchmark_bufreader.sh
```

## References

- [GoMem Project](https://github.com/langhuihui/gomem) - Memory object pool implementation
- [Monibuca v5](https://m7s.live) - Streaming media server
- Test code: `pkg/util/buf_reader_benchmark_test.go`

---

**Core Idea**: Eliminate the copying overhead of traditional contiguous buffers through non-contiguous memory block slices and zero-copy reference passing, achieving high-performance network data processing.
|