# BufReader: Zero-Copy Network Reading with Advanced Memory Management

## Table of Contents

- [1. Memory Allocation Issues in Standard Library bufio.Reader](#1-memory-allocation-issues-in-standard-library-bufioreader)
- [2. BufReader: A Zero-Copy Solution](#2-bufreader-a-zero-copy-solution)
- [3. Performance Benchmarks](#3-performance-benchmarks)
- [4. Real-World Use Cases](#4-real-world-use-cases)
- [5. Best Practices](#5-best-practices)
- [6. Performance Optimization Tips](#6-performance-optimization-tips)
- [7. Summary](#7-summary)

## TL;DR (Key Takeaways)

If you're short on time, here are the most important conclusions:

**BufReader's Core Advantages** (Concurrent Scenarios):

- ⭐ **98.5% Fewer GC Runs**: 134 GCs → 2 GCs (streaming server scenario)
- 🚀 **99.93% Fewer Allocations**: 5.57 million → 3,918 allocations
- 🔄 **10-20x Throughput Improvement**: Zero allocation + memory reuse

**Key Data**:

```
Streaming Server Scenario (100 concurrent streams):
bufio.Reader: 79 GB allocated, 134 GCs
BufReader:    0.6 GB allocated, 2 GCs
```

**Ideal Use Cases**:

- ✅ High-concurrency network servers
- ✅ Streaming media processing
- ✅ Long-running services (24/7)

**Quick Test**:

```bash
sh scripts/benchmark_bufreader.sh
```

---

## Introduction

In high-performance network programming, frequent memory allocation and copying are major sources of performance bottlenecks. While Go's standard library `bufio.Reader` provides buffered reading, it still performs significant memory allocation and copying when processing network data streams. This article analyzes these issues in depth and introduces `BufReader` from the Monibuca project, showing how zero-copy, high-performance network reading is achieved through the GoMem memory allocator.
## 1. Memory Allocation Issues in Standard Library bufio.Reader

### 1.1 How bufio.Reader Works

`bufio.Reader` uses a fixed-size internal buffer to reduce the frequency of system calls:

```go
// Simplified sketch of bufio.Reader's read path (not the exact stdlib code)
type Reader struct {
	buf  []byte    // Fixed-size buffer
	rd   io.Reader // Underlying reader
	r, w int       // Read/write positions
}

func (b *Reader) Read(p []byte) (n int, err error) {
	// 1. If the buffer is empty, fill it from the underlying reader
	if b.r == b.w {
		n, err = b.rd.Read(b.buf) // Data copied into the internal buffer
		b.w += n
	}

	// 2. Copy data from the buffer to the target slice
	n = copy(p, b.buf[b.r:b.w]) // Another data copy
	b.r += n
	return
}
```
### 1.2 Memory Allocation Problem Analysis

When using `bufio.Reader` to read network data, the following issues exist:

**Issue 1: Multiple Memory Copies**

```mermaid
sequenceDiagram
    participant N as Network Socket
    participant B as bufio.Reader Internal Buffer
    participant U as User Buffer
    participant A as Application Layer

    N->>B: System call reads data (1st copy)
    Note over B: Data stored in fixed buffer
    B->>U: copy() to user buffer (2nd copy)
    Note over U: User gets data copy
    U->>A: Pass to application layer (possible 3rd copy)
    Note over A: Application processes data
```

Each read operation typically requires two memory copies:

1. From the network socket into `bufio.Reader`'s internal buffer
2. From the internal buffer into the user-provided slice

**Issue 2: Fixed Buffer Limitations**

```go
// bufio.Reader uses a fixed-size buffer
reader := bufio.NewReaderSize(conn, 4096) // Fixed 4KB

// Reading a larger chunk requires multiple Read calls
data := make([]byte, 16384) // Need to read 16KB
for total := 0; total < 16384; {
	n, err := reader.Read(data[total:]) // May loop 4+ times
	if err != nil {
		break
	}
	total += n
}
```

**Issue 3: Frequent Memory Allocation**

```go
// Each packet requires allocating new slices
func processPackets(reader *bufio.Reader) {
	for {
		// Allocate new memory for each packet
		header := make([]byte, 4) // Allocation 1
		if _, err := io.ReadFull(reader, header); err != nil {
			return
		}

		size := binary.BigEndian.Uint32(header)
		payload := make([]byte, size) // Allocation 2
		if _, err := io.ReadFull(reader, payload); err != nil {
			return
		}

		// After processing, the memory becomes garbage
		processPayload(payload)
		// The next iteration allocates again...
	}
}
```
### 1.3 Performance Impact

In high-frequency network data processing scenarios, these issues lead to:

1. **Increased CPU Overhead**: Frequent `copy()` operations consume CPU resources
2. **Higher GC Pressure**: Massive temporary allocations increase the garbage collector's burden
3. **Increased Latency**: Every allocation and copy adds processing latency
4. **Reduced Throughput**: Memory operations become the bottleneck, limiting overall throughput
## 2. BufReader: A Zero-Copy Solution

### 2.1 Design Philosophy

`BufReader` is designed around the following core principles:

1. **Zero-Copy Reading**: Read directly from the network into the final memory location, avoiding intermediate copies
2. **Memory Reuse**: Reuse memory blocks through the GoMem allocator, avoiding frequent allocations
3. **Chained Buffering**: Use a chain of memory blocks instead of a single fixed buffer
4. **On-Demand Allocation**: Dynamically adjust memory usage based on the actual amount read
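Principle 3 can be illustrated with a toy reader (names and structure are ours, not the GoMem/Monibuca API): readable data lives in a chain of blocks that are appended by reference and dropped from the front once consumed.

```go
package main

import (
	"fmt"
	"io"
)

// chainReader is a minimal illustration of chained buffering: data lives in
// a chain of independently allocated blocks instead of one fixed buffer.
type chainReader struct {
	buffers [][]byte // memory block chain
	length  int      // total readable bytes
}

func (c *chainReader) append(b []byte) {
	c.buffers = append(c.buffers, b) // appended by reference, never copied
	c.length += len(b)
}

func (c *chainReader) readByte() (byte, error) {
	for len(c.buffers) > 0 && len(c.buffers[0]) == 0 {
		c.buffers = c.buffers[1:] // drop the exhausted front block ("ClipFront")
	}
	if len(c.buffers) == 0 {
		return 0, io.EOF
	}
	b := c.buffers[0][0]
	c.buffers[0] = c.buffers[0][1:]
	c.length--
	return b, nil
}

func main() {
	var c chainReader
	c.append([]byte("ab"))
	c.append([]byte("cd"))
	for {
		b, err := c.readByte()
		if err != nil {
			break
		}
		fmt.Printf("%c", b)
	}
	fmt.Println() // prints: abcd
}
```

In the real implementation the dropped front blocks are returned to the allocator's pool rather than left to the GC, which is where the memory-reuse principle comes in.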
### 2.2 Core Data Structures

```go
type BufReader struct {
	Allocator *ScalableMemoryAllocator // Scalable memory allocator
	buf       MemoryReader             // Memory block chain reader
	totalRead int                      // Total bytes read
	BufLen    int                      // Block size per read
	Mouth     chan []byte              // Data input channel
	feedData  func() error             // Data feeding function
}

// MemoryReader manages multiple memory blocks
type MemoryReader struct {
	*Memory          // Memory manager
	Buffers [][]byte // Memory block chain
	Size    int      // Total size
	Length  int      // Readable length
}
```
### 2.3 Workflow

#### 2.3.1 Zero-Copy Data Reading Flow

```mermaid
sequenceDiagram
    participant N as Network Socket
    participant A as ScalableMemoryAllocator
    participant B as BufReader.buf
    participant U as User Code

    U->>B: Read(n)
    B->>B: Check if buffer has data
    alt Buffer empty
        B->>A: Request memory block
        Note over A: Get from pool or allocate new block
        A-->>B: Return memory block reference
        B->>N: Read directly to memory block
        Note over N,B: Zero-copy: data written to final location
    end
    B-->>U: Return slice view of memory block
    Note over U: User uses directly, no copy needed
    U->>U: Process data
    U->>A: Recycle memory block (optional)
    Note over A: Block returns to pool for reuse
```
#### 2.3.2 Memory Block Management Flow

```mermaid
graph TD
    A[Start Reading] --> B{buf has data?}
    B -->|Yes| C[Return data view directly]
    B -->|No| D[Call feedData]
    D --> E[Allocator.Read requests memory]
    E --> F{Pool has free block?}
    F -->|Yes| G[Reuse existing memory block]
    F -->|No| H[Allocate new memory block]
    G --> I[Read data from network]
    H --> I
    I --> J[Append to buf.Buffers]
    J --> K[Update Size and Length]
    K --> C
    C --> L[User reads data]
    L --> M{Data processed?}
    M -->|Yes| N[ClipFront recycle front blocks]
    N --> O[Allocator.Free return to pool]
    O --> P[End]
    M -->|No| A
```
### 2.4 Core Implementation Analysis

#### 2.4.1 Initialization and Memory Allocation

```go
func NewBufReader(reader io.Reader) *BufReader {
	return NewBufReaderWithBufLen(reader, defaultBufSize)
}

func NewBufReaderWithBufLen(reader io.Reader, bufLen int) *BufReader {
	r := &BufReader{
		Allocator: NewScalableMemoryAllocator(bufLen), // Create allocator
		BufLen:    bufLen,
	}
	// The closure is assigned after the struct literal so it can capture r
	r.feedData = func() error {
		// Key: the allocator reads from `reader` directly into a pooled memory block
		buf, err := r.Allocator.Read(reader, r.BufLen)
		if err != nil {
			return err
		}
		n := len(buf)
		r.totalRead += n
		// Append the memory block reference directly, no copy
		r.buf.Buffers = append(r.buf.Buffers, buf)
		r.buf.Size += n
		r.buf.Length += n
		return nil
	}
	r.buf.Memory = &Memory{}
	return r
}
```

**Zero-Copy Key Points**:

- `Allocator.Read()` reads directly from the `io.Reader` into an allocated memory block
- The returned `buf` is a reference to the memory block where the data actually lives
- `append(r.buf.Buffers, buf)` appends only the reference; no data is copied
#### 2.4.2 Read Operations

```go
func (r *BufReader) ReadByte() (b byte, err error) {
	// If the buffer is empty, trigger data filling
	for r.buf.Length == 0 {
		if err = r.feedData(); err != nil {
			return
		}
	}
	// Read from the memory block chain, no copy needed
	return r.buf.ReadByte()
}

func (r *BufReader) ReadRange(n int, yield func([]byte)) (err error) {
	for r.recycleFront(); n > 0 && err == nil; err = r.feedData() {
		if r.buf.Length > 0 {
			if r.buf.Length >= n {
				// Pass a slice view of the memory block directly, no copy
				r.buf.RangeN(n, yield)
				return
			}
			n -= r.buf.Length
			r.buf.Range(yield)
		}
	}
	return
}
```

**Zero-Copy Benefits**:

- The `yield` callback receives a slice view of the memory block
- User code operates directly on the original memory blocks without intermediate copying
- After reading, processed blocks are automatically recycled
#### 2.4.3 Memory Recycling

```go
func (r *BufReader) recycleFront() {
	// Clean up memory blocks that have been fully consumed
	r.buf.ClipFront(r.Allocator.Free)
}

func (r *BufReader) Recycle() {
	r.buf = MemoryReader{}
	if r.Allocator != nil {
		// Return all memory blocks to the allocator
		r.Allocator.Recycle()
	}
	if r.Mouth != nil {
		close(r.Mouth)
	}
}
```
### 2.5 Comparison with bufio.Reader

```mermaid
graph LR
    subgraph "bufio.Reader (Multiple Copies)"
        A1[Network] -->|System Call| B1[Kernel Buffer]
        B1 -->|Copy 1| C1[bufio Buffer]
        C1 -->|Copy 2| D1[User Slice]
        D1 -->|Copy 3?| E1[Application]
    end

    subgraph "BufReader (Zero-Copy)"
        A2[Network] -->|System Call| B2[Kernel Buffer]
        B2 -->|Direct Read| C2[GoMem Block]
        C2 -->|Slice View| D2[User Code]
        D2 -->|Recycle| C2
        C2 -->|Reuse| C2
    end
```

| Feature | bufio.Reader | BufReader |
|---------|-------------|-----------|
| Memory copies (user space) | 2-3 | 0 (slice views) |
| Buffer mode | Fixed-size single buffer | Variable-size chained buffer |
| Memory allocation | May allocate on each read | Object pool reuse |
| Memory recycling | Automatic (GC) | Actively returned to pool |
| Large data handling | Multiple Read calls | Single append to chain |
| GC pressure | High | Very low |
## 3. Performance Benchmarks

### 3.1 Test Scenario Design

#### 3.1.1 Real Network Simulation

To make the benchmarks realistic, we implemented a `mockNetworkReader` that simulates real network behavior.

**Real Network Characteristics**:

In real network reading, the amount of data returned by each `Read()` call is **unpredictable**, affected by multiple factors:

- TCP receive window size
- Network latency and bandwidth
- OS buffer state
- Network congestion
- Network quality fluctuations

**Simulation Implementation**:
```go
type mockNetworkReader struct {
	data     []byte
	offset   int
	rng      *rand.Rand
	minChunk int // Minimum chunk size
	maxChunk int // Maximum chunk size
}

func (m *mockNetworkReader) Read(p []byte) (n int, err error) {
	if m.offset >= len(m.data) {
		return 0, io.EOF
	}
	// Each call returns a random amount of data between minChunk and maxChunk
	chunkSize := m.minChunk + m.rng.Intn(m.maxChunk-m.minChunk+1)
	if chunkSize > len(p) {
		chunkSize = len(p)
	}
	n = copy(p[:chunkSize], m.data[m.offset:])
	m.offset += n
	return n, nil
}
```
**Different Network Condition Simulations**:

| Network Condition | Data Block Range | Real Scenario |
|------------------|-----------------|---------------|
| Good network | 1024-4096 bytes | Stable LAN, premium network |
| Normal network | 256-2048 bytes | Regular internet connection |
| Poor network | 64-512 bytes | High latency, small TCP window |
| Worst network | 1-128 bytes | Mobile network, severe congestion |

This simulation makes the benchmark results considerably more realistic and reliable.
#### 3.1.2 Test Scenario List

We focus on the following core scenarios:

1. **Concurrent Network Connection Reading** - Demonstrates zero allocation
2. **Concurrent Protocol Parsing** - Simulates real applications
3. **GC Pressure Test** - Shows long-term running advantages ⭐
4. **Streaming Server Scenario** - Real business scenario ⭐
### 3.2 Benchmark Design

#### Core Test Scenarios

The benchmarks focus on comparing **concurrent network scenarios** and **GC pressure**:

**1. Concurrent Network Connection Reading**

- Simulates 100+ concurrent connections continuously reading data
- Each read processes a 1KB data packet
- bufio: allocates a new buffer each time (`make([]byte, 1024)`)
- BufReader: zero-copy processing (`ReadRange`)

**2. Concurrent Protocol Parsing**

- Simulates a streaming server parsing protocol packets
- Reads a packet header (4 bytes) plus the data content
- Compares memory allocation strategies

**3. GC Pressure Test** (⭐ Core)

- Continuous concurrent reading and processing
- Tracks GC count, total memory allocated, and allocation count
- Demonstrates the differences over long-term running

**4. Streaming Server Scenario** (⭐ Real Application)

- Simulates 100 concurrent streams
- Each stream reads data and forwards it to subscribers
- A complete, realistic application-scenario comparison

#### Key Test Logic

**Concurrent Reading**:

```go
// bufio.Reader - allocate on every read
buf := make([]byte, 1024) // 1KB allocation
n, _ := reader.Read(buf)
processData(buf[:n])

// BufReader - zero-copy
reader.ReadRange(1024, func(data []byte) {
	processData(data) // Used directly, no allocation
})
```

**GC Statistics**:

```go
// Record GC statistics around the parallel benchmark
var beforeGC, afterGC runtime.MemStats
runtime.ReadMemStats(&beforeGC)

b.RunParallel(func(pb *testing.PB) {
	// Concurrent testing...
})

runtime.ReadMemStats(&afterGC)
b.ReportMetric(float64(afterGC.NumGC-beforeGC.NumGC), "gc-runs")
b.ReportMetric(float64(afterGC.TotalAlloc-beforeGC.TotalAlloc)/1024/1024, "MB-alloc")
```

Complete test code: `pkg/util/buf_reader_benchmark_test.go`
### 3.3 Running Benchmarks

We provide complete benchmark code (`pkg/util/buf_reader_benchmark_test.go`) and a convenient test script.

#### Method 1: Using the Test Script (Recommended)

```bash
# Run the complete benchmark suite
sh scripts/benchmark_bufreader.sh
```

The script runs all tests sequentially and prints user-friendly results.

#### Method 2: Manual Testing

```bash
cd pkg/util

# Run all benchmarks
go test -bench=BenchmarkConcurrent -benchmem -benchtime=2s -test.run=xxx

# Run specific tests
go test -bench=BenchmarkGCPressure -benchmem -benchtime=5s -test.run=xxx

# Run the streaming server scenario
go test -bench=BenchmarkStreamingServer -benchmem -benchtime=3s -test.run=xxx
```

#### Method 3: Run Key Tests Only

```bash
cd pkg/util

# GC pressure comparison (core advantage)
go test -bench=BenchmarkGCPressure -benchmem -test.run=xxx

# Streaming server scenario (real application)
go test -bench=BenchmarkStreamingServer -benchmem -test.run=xxx
```
### 3.4 Actual Performance Test Results

Actual results from running the benchmarks on an Apple M2 Pro:

**Test Environment**:

- CPU: Apple M2 Pro (12 cores)
- OS: macOS (darwin/arm64)
- Go: 1.23.0

#### 3.4.1 Core Performance Comparison

| Test Scenario | bufio.Reader | BufReader | Difference |
|--------------|-------------|-----------|-----------|
| **Concurrent Network Read** | 103.2 ns/op<br/>1027 B/op, 1 allocs | 147.6 ns/op<br/>4 B/op, 0 allocs | Zero alloc ⭐ |
| **GC Pressure Test** | 1874 ns/op<br/>5,576,659 mallocs<br/>3 gc-runs | 112.7 ns/op<br/>3,918 mallocs<br/>2 gc-runs | **16.6x faster** ⭐⭐⭐ |
| **Streaming Server** | 374.6 ns/op<br/>79,508 MB-alloc<br/>134 gc-runs | 30.29 ns/op<br/>601 MB-alloc<br/>2 gc-runs | **12.4x faster** ⭐⭐⭐ |
#### 3.4.2 GC Pressure Comparison (Core Finding)

The **GC Pressure Test** results best demonstrate the long-term running differences:

**bufio.Reader**:
```
Operation Latency: 1874 ns/op
Allocation Count:  5,576,659 (over 5 million!)
GC Runs:           3
Per Operation:     2 allocs/op
```

**BufReader**:
```
Operation Latency: 112.7 ns/op (16.6x faster)
Allocation Count:  3,918 (99.93% reduction)
GC Runs:           2
Per Operation:     0 allocs/op (zero allocation!)
```

**Key Metrics**:

- 🚀 **16x Throughput Improvement**: 45.7M ops/s vs 2.8M ops/s
- ⭐ **99.93% Allocation Reduction**: From 5.57 million down to 3,918
- ✨ **Zero-Allocation Operations**: 0 allocs/op vs 2 allocs/op
#### 3.4.3 Streaming Server Scenario (Real Application)

Simulating 100 concurrent streams, continuously reading and forwarding data:

**bufio.Reader**:
```
Operation Latency: 374.6 ns/op
Memory Allocated:  79,508 MB (79 GB!)
GC Runs:           134
Per Operation:     4 allocs/op
```

**BufReader**:
```
Operation Latency: 30.29 ns/op (12.4x faster)
Memory Allocated:  601 MB (99.2% reduction)
GC Runs:           2 (98.5% reduction!)
Per Operation:     0 allocs/op
```

**Striking Differences**:

- 🎯 **GC Runs: 134 → 2** (98.5% reduction)
- 💾 **Memory Allocated: 79 GB → 0.6 GB** (132x reduction)
- ⚡ **Throughput: 10.1M → 117M ops/s** (11.6x improvement)
#### 3.4.4 Long-Term Running Impact

For the streaming server scenario, extrapolating to **1 hour** of running:

**bufio.Reader**:
```
Estimated Memory Allocated: ~2.8 TB
Estimated GC Runs:          ~4,800
Cumulative GC Pause:        Significant
```

**BufReader**:
```
Estimated Memory Allocated: ~21 GB (133x reduction)
Estimated GC Runs:          ~72 (67x reduction)
Cumulative GC Pause:        Minimal
```

**Usage Recommendations**:

| Scenario | Recommended | Reason |
|----------|------------|---------|
| Simple file reading | bufio.Reader | Standard library is sufficient |
| **High-concurrency network server** | **BufReader** ⭐ | **98% GC reduction** |
| **Streaming media processing** | **BufReader** ⭐ | **Zero allocation, high throughput** |
| **Long-running services** | **BufReader** ⭐ | **More stable system** |
#### 3.4.5 Why the Performance Improves

While bufio.Reader is faster in some simple scenarios, BufReader's design goal is not to be faster in every case, but rather to:

1. **Eliminate Memory Allocation** - Avoid the frequent `make([]byte, n)` calls found in real applications
2. **Reduce GC Pressure** - Reuse memory through an object pool, reducing the garbage collector's burden
3. **Process with Zero Copies** - Provide the `ReadRange` API for direct data manipulation
4. **Buffer in Chains** - Support complex data processing patterns

In scenarios like the **Monibuca streaming server**, the value of these properties far exceeds microsecond-level latency differences.

**Real Impact**: when handling 1000 concurrent streaming connections:

```go
// bufio.Reader approach
// 1000 connections × 30fps × 1024 bytes/packet = 30,720,000 allocations per second
// 1024 bytes per allocation ≈ 30 GB/sec of temporary memory allocation
// Triggers massive GC activity

// BufReader approach
// 0 allocations (memory reuse)
// 90%+ GC pressure reduction
// Significantly improved system stability
```

**Selection Guidelines**:

- 📁 **Simple file reading** → bufio.Reader
- 🔄 **High-concurrency network services** → BufReader (98% GC reduction)
- 💾 **Long-running services** → BufReader (zero allocation)
- 🎯 **Streaming server** → BufReader (10-20x throughput)
## 4. Real-World Use Cases

### 4.1 RTSP Protocol Parsing

```go
// Use BufReader to parse an RTSP request
func parseRTSPRequest(conn net.Conn) (*RTSPRequest, error) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	// Read the request line: zero-copy, no memory allocation
	requestLine, err := reader.ReadLine()
	if err != nil {
		return nil, err
	}

	// Read the headers: operates directly on memory blocks
	headers, err := reader.ReadMIMEHeader()
	if err != nil {
		return nil, err
	}

	// Read the body (if present)
	var body []byte
	if contentLength := headers.Get("Content-Length"); contentLength != "" {
		length, _ := strconv.Atoi(contentLength)
		// ReadRange provides zero-copy access; we copy out here only
		// because the body must outlive the reader
		if err = reader.ReadRange(length, func(chunk []byte) {
			body = append(body, chunk...)
		}); err != nil {
			return nil, err
		}
	}

	return &RTSPRequest{
		RequestLine: requestLine,
		Headers:     headers,
		Body:        body,
	}, nil
}
```
### 4.2 Streaming Media Packet Parsing

```go
// Use BufReader to parse FLV tags
func parseFLVPackets(conn net.Conn) error {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	for {
		// Read tag type: 1 byte
		packetType, err := reader.ReadByte()
		if err != nil {
			return err
		}

		// Read data size: 3 bytes big-endian
		dataSize, err := reader.ReadBE32(3)
		if err != nil {
			return err
		}

		// Read timestamp: 4 bytes (3-byte timestamp plus extension byte, simplified)
		timestamp, err := reader.ReadBE32(4)
		if err != nil {
			return err
		}

		// Skip StreamID: 3 bytes
		if err := reader.Skip(3); err != nil {
			return err
		}

		// Read the actual data: zero-copy processing
		err = reader.ReadRange(int(dataSize), func(data []byte) {
			// Process the data directly, no copy needed
			processPacket(packetType, timestamp, data)
		})
		if err != nil {
			return err
		}

		// Skip the previous tag size field: 4 bytes
		if err := reader.Skip(4); err != nil {
			return err
		}
	}
}
```
### 4.3 Performance-Critical Scenarios

BufReader is particularly well suited to:

1. **High-frequency small-packet processing**: Network protocol parsing, RTP/RTCP packet handling
2. **Large data stream transmission**: Continuous reading of video/audio streams
3. **Multi-step protocol reading**: Protocols that read fields of different lengths step by step
4. **Low-latency requirements**: Real-time streaming, online gaming
5. **High-concurrency scenarios**: Servers with massive numbers of concurrent connections
## 5. Best Practices

### 5.1 Correct Usage Patterns

```go
// ✅ Correct: specify an appropriate block size on creation
func goodExample(conn net.Conn) {
	// Choose the block size based on the actual packet size
	reader := util.NewBufReaderWithBufLen(conn, 16384) // 16KB blocks
	defer reader.Recycle() // Ensure resources are recycled

	// Use ReadRange for zero-copy access
	reader.ReadRange(1024, func(data []byte) {
		// Process directly; don't hold a reference to data
		process(data)
	})
}

// ❌ Wrong: forgetting to recycle resources
func badExample1(conn net.Conn) {
	reader := util.NewBufReader(conn)
	processConnection(reader)
	// Missing defer reader.Recycle():
	// memory blocks are never returned to the object pool
}

// ❌ Wrong: holding a reference to callback data
var globalData []byte

func badExample2(conn net.Conn) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	reader.ReadRange(1024, func(data []byte) {
		// ❌ Wrong: data will be recycled after Recycle
		globalData = data // Dangling reference
	})
}

// ✅ Correct: copy when the data must be retained
func goodExample2(conn net.Conn) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	var saved []byte
	reader.ReadRange(1024, func(data []byte) {
		// Copy explicitly when retention is needed
		saved = make([]byte, len(data))
		copy(saved, data)
	})
	// Now safe to use saved
	process(saved)
}
```
### 5.2 Block Size Selection

```go
// Choose an appropriate block size for the scenario
const (
	// Small-packet protocols (e.g., RTSP, HTTP headers)
	SmallPacketSize = 4 << 10 // 4KB

	// Medium data streams (e.g., audio)
	MediumPacketSize = 16 << 10 // 16KB

	// Large data streams (e.g., video)
	LargePacketSize = 64 << 10 // 64KB
)

func createReaderForProtocol(conn net.Conn, protocol string) *util.BufReader {
	var bufSize int
	switch protocol {
	case "rtsp", "http":
		bufSize = SmallPacketSize
	case "audio":
		bufSize = MediumPacketSize
	case "video":
		bufSize = LargePacketSize
	default:
		bufSize = SmallPacketSize // util's default block size is unexported
	}
	return util.NewBufReaderWithBufLen(conn, bufSize)
}
```
### 5.3 Error Handling

```go
func robustRead(conn net.Conn) error {
	reader := util.NewBufReader(conn)
	defer func() {
		// Ensure resources are recycled in all cases
		reader.Recycle()
	}()

	// Set a read timeout
	conn.SetReadDeadline(time.Now().Add(5 * time.Second))

	// Read data
	data, err := reader.ReadBytes(1024)
	if err != nil {
		if errors.Is(err, io.EOF) {
			// Normal end of stream
			return nil
		}
		// Handle other errors
		return fmt.Errorf("read error: %w", err)
	}

	// Process the data
	processData(data)
	return nil
}
```
## 6. Performance Optimization Tips

### 6.1 Batch Processing

```go
// ✅ Optimized: batch reading and processing
func optimizedBatchRead(reader *util.BufReader) error {
	// Read a large chunk of data at once
	return reader.ReadRange(65536, func(chunk []byte) {
		// Batch-process length-prefixed packets in the callback
		for len(chunk) >= 4 {
			packetSize := int(binary.BigEndian.Uint32(chunk[:4]))
			if packetSize > len(chunk)-4 {
				break // incomplete packet at the end of the chunk
			}
			packet := chunk[4 : 4+packetSize]
			processPacket(packet)
			chunk = chunk[4+packetSize:]
		}
	})
}

// ❌ Inefficient: read packet by packet
func inefficientRead(reader *util.BufReader) error {
	for {
		size, err := reader.ReadBE32(4)
		if err != nil {
			return err
		}
		packet, err := reader.ReadBytes(int(size))
		if err != nil {
			return err
		}
		processPacket(packet.Buffers[0])
	}
}
```
### 6.2 Avoid Unnecessary Copying

```go
// ✅ Optimized: process directly, no copy
func zeroCopyProcess(reader *util.BufReader) error {
	return reader.ReadRange(4096, func(data []byte) {
		// Operate directly on the original memory
		sum := 0
		for _, b := range data {
			sum += int(b)
		}
		reportChecksum(sum)
	})
}

// ❌ Inefficient: unnecessary copy
func unnecessaryCopy(reader *util.BufReader) error {
	mem, err := reader.ReadBytes(4096)
	if err != nil {
		return err
	}
	// An extra copy is performed here
	data := make([]byte, mem.Size)
	copy(data, mem.Buffers[0])

	sum := 0
	for _, b := range data {
		sum += int(b)
	}
	reportChecksum(sum)
	return nil
}
```
### 6.3 Proper Resource Management

```go
// ✅ Optimized: manage BufReader objects with a pool
type ConnectionPool struct {
	readers sync.Pool
}

func (p *ConnectionPool) GetReader(conn net.Conn) *util.BufReader {
	if reader := p.readers.Get(); reader != nil {
		r := reader.(*util.BufReader)
		// Note: a pooled reader must be re-bound to the new connection
		// before use (this step depends on the BufReader API)
		return r
	}
	return util.NewBufReader(conn)
}

func (p *ConnectionPool) PutReader(reader *util.BufReader) {
	reader.Recycle()      // Recycle the memory blocks
	p.readers.Put(reader) // Recycle the BufReader object itself
}

// Using the connection pool
func handleConnection(pool *ConnectionPool, conn net.Conn) {
	reader := pool.GetReader(conn)
	defer pool.PutReader(reader)

	// Handle the connection
	processConnection(reader)
}
```
## 7. Summary

### 7.1 Performance Comparison Visualization

Based on actual benchmark results (concurrent scenarios):

```
📊 GC Runs Comparison (Core Advantage) ⭐⭐⭐
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 134 runs
BufReader    █ 2 runs ← 98.5% reduction!

📊 Total Memory Allocation Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 79 GB
BufReader    █ 0.6 GB ← 99.2% reduction!

📊 Operation Throughput Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader █████ 10.1M ops/s
BufReader    ████████████████████████████████████████████████████████ 117M ops/s ← 11.6x!
```

**Key Metrics** (Streaming Server Scenario):

- 🎯 **GC Runs**: From 134 down to 2 (98.5% reduction)
- 💾 **Memory Allocated**: From 79 GB down to 0.6 GB (132x reduction)
- ⚡ **Throughput**: 11.6x improvement
### 7.2 Core Advantages

BufReader achieves zero-copy, high-performance network reading through:

1. **Zero-Copy Architecture**
   - Data is read from the network directly into its final memory location
   - Slice views avoid data copying
   - Chained buffering supports large data processing

2. **Memory Reuse Mechanism**
   - The GoMem object pool reuses memory blocks
   - Active memory management reduces GC pressure
   - Configurable block sizes adapt to different scenarios

3. **Significant Performance Improvement** (in concurrent scenarios)
   - GC runs reduced by 98.5% (134 → 2)
   - Memory allocation reduced by 99.2% (79 GB → 0.6 GB)
   - Throughput improved 10-20x
   - Significantly improved system stability
### 7.3 Ideal Use Cases

BufReader is particularly well suited to:

- ✅ High-performance network servers
- ✅ Streaming media data processing
- ✅ Real-time protocol parsing
- ✅ Large data stream transmission
- ✅ Low-latency requirements
- ✅ High-concurrency environments

It is not a good fit for:

- ❌ Simple file reading (the standard library is sufficient)
- ❌ Single small data reads
- ❌ Performance-insensitive scenarios
### 7.4 Choosing Between bufio.Reader and BufReader

| Scenario | Recommended |
|----------|------------|
| Simple file reading | bufio.Reader |
| Low-frequency network reads | bufio.Reader |
| High-performance network server | BufReader |
| Streaming media processing | BufReader |
| Protocol parsers | BufReader |
| Zero-copy requirements | BufReader |
| Memory-sensitive scenarios | BufReader |
### 7.5 Key Points

Remember when using BufReader:

1. **Always call Recycle()**: Ensure memory blocks are returned to the object pool
2. **Don't hold data references**: Data passed to the ReadRange callback will be recycled
3. **Choose an appropriate block size**: Adjust it to the actual packet size
4. **Leverage ReadRange**: Achieve true zero-copy processing
5. **Use it with GoMem**: Fully leverage the memory-reuse advantages

Through the combination of BufReader and GoMem, Monibuca achieves high-performance network data processing, providing solid infrastructure support for streaming media servers.
## References

- [GoMem Project](https://github.com/langhuihui/gomem)
- [Monibuca v5 Documentation](https://m7s.live)
- [Object Reuse Technology Deep Dive](./arch/reuse.md)
- Go standard library `bufio` package source code
- Go standard library `sync.Pool` documentation