# BufReader: Zero-Copy Network Reading with Advanced Memory Management

## Table of Contents

- [1. Memory Allocation Issues in Standard Library bufio.Reader](#1-memory-allocation-issues-in-standard-library-bufioreader)
- [2. BufReader: A Zero-Copy Solution](#2-bufreader-a-zero-copy-solution)
- [3. Performance Benchmarks](#3-performance-benchmarks)
- [4. Real-World Use Cases](#4-real-world-use-cases)
- [5. Best Practices](#5-best-practices)
- [6. Performance Optimization Tips](#6-performance-optimization-tips)
- [7. Summary](#7-summary)

## TL;DR (Key Takeaways)

If you're short on time, here are the most important conclusions:

**BufReader's Core Advantages** (concurrent scenarios):

- ⭐ **98.5% GC Reduction**: 134 GCs → 2 GCs (streaming server scenario)
- 🚀 **99.93% Fewer Allocations**: 5.57 million → 3,918 allocations
- 🔄 **10-20x Throughput Improvement**: zero allocation + memory reuse

**Key Data**:

```
Streaming Server Scenario (100 concurrent streams):
bufio.Reader: 79 GB allocated, 134 GCs
BufReader:    0.6 GB allocated, 2 GCs
```

**Ideal Use Cases**:

- ✅ High-concurrency network servers
- ✅ Streaming media processing
- ✅ Long-running services (24/7)

**Quick Test**:

```bash
sh scripts/benchmark_bufreader.sh
```

---

## Introduction

In high-performance network programming, frequent memory allocation and copying are major sources of performance bottlenecks. While Go's standard library `bufio.Reader` provides buffered reading, it still performs significant memory allocation and copying when processing network data streams. This article analyzes these issues in depth and introduces `BufReader` from the Monibuca project, showing how zero-copy, high-performance network reading can be achieved with the GoMem memory allocator.

## 1. Memory Allocation Issues in Standard Library bufio.Reader

### 1.1 How bufio.Reader Works

`bufio.Reader` uses a fixed-size internal buffer to reduce system call frequency (simplified):

```go
type Reader struct {
	buf  []byte    // Fixed-size buffer
	rd   io.Reader // Underlying reader
	r, w int       // Read/write positions
}

func (b *Reader) Read(p []byte) (n int, err error) {
	// 1. If the buffer is empty, fill it from the underlying reader
	if b.r == b.w {
		n, err = b.rd.Read(b.buf) // Data copied into the internal buffer
		b.w += n
	}
	// 2. Copy data from the buffer into the target slice
	n = copy(p, b.buf[b.r:b.w]) // Another data copy
	b.r += n
	return
}
```

### 1.2 Memory Allocation Problem Analysis

When reading network data with `bufio.Reader`, the following issues arise:

**Issue 1: Multiple Memory Copies**

```mermaid
sequenceDiagram
    participant N as Network Socket
    participant B as bufio.Reader Internal Buffer
    participant U as User Buffer
    participant A as Application Layer

    N->>B: System call reads data (1st copy)
    Note over B: Data stored in fixed buffer
    B->>U: copy() to user buffer (2nd copy)
    Note over U: User gets data copy
    U->>A: Pass to application layer (possible 3rd copy)
    Note over A: Application processes data
```

Each read operation requires at least two memory copies:

1. From the network socket to `bufio.Reader`'s internal buffer
2. From the internal buffer to the user-provided slice

**Issue 2: Fixed Buffer Limitations**

```go
// bufio.Reader uses a fixed-size buffer
reader := bufio.NewReaderSize(conn, 4096) // Fixed 4KB

// Reading large chunks requires multiple operations
data := make([]byte, 16384) // Need to read 16KB
for total := 0; total < 16384; {
	n, err := reader.Read(data[total:]) // May need several iterations
	if err != nil {
		break
	}
	total += n
}
```

**Issue 3: Frequent Memory Allocation**

```go
// Each read requires allocating new slices
func processPackets(reader *bufio.Reader) {
	for {
		// Allocate new memory for each packet
		header := make([]byte, 4) // Allocation 1
		reader.Read(header)

		size := binary.BigEndian.Uint32(header)
		payload := make([]byte, size) // Allocation 2
		reader.Read(payload)

		// After processing, the memory becomes garbage for the GC
		processPayload(payload)
		// The next iteration allocates again...
	}
}
```

### 1.3 Performance Impact

In high-frequency network data processing, these issues lead to:

1. **Increased CPU Overhead**: frequent `copy()` operations consume CPU resources
2. **Higher GC Pressure**: massive temporary allocations increase the garbage collector's burden
3. **Increased Latency**: every allocation and copy adds processing latency
4. **Reduced Throughput**: memory operations become the bottleneck, limiting overall throughput

## 2. BufReader: A Zero-Copy Solution

### 2.1 Design Philosophy

`BufReader` is designed around four core principles:

1. **Zero-Copy Reading**: read directly from the network into the final memory location, avoiding intermediate copies
2. **Memory Reuse**: reuse memory blocks through the GoMem allocator, avoiding frequent allocations
3. **Chained Buffering**: use a linked chain of memory blocks instead of a single fixed buffer
4. **On-Demand Allocation**: dynamically adjust memory usage based on the actual amount read

### 2.2 Core Data Structures

```go
type BufReader struct {
	Allocator *ScalableMemoryAllocator // Scalable memory allocator
	buf       MemoryReader             // Memory block chain reader
	totalRead int                      // Total bytes read
	BufLen    int                      // Block size per read
	Mouth     chan []byte              // Data input channel
	feedData  func() error             // Data feeding function
}

// MemoryReader manages multiple memory blocks
type MemoryReader struct {
	*Memory          // Memory manager
	Buffers [][]byte // Memory block chain
	Size    int      // Total size
	Length  int      // Readable length
}
```

### 2.3 Workflow

#### 2.3.1 Zero-Copy Data Reading Flow

```mermaid
sequenceDiagram
    participant N as Network Socket
    participant A as ScalableMemoryAllocator
    participant B as BufReader.buf
    participant U as User Code

    U->>B: Read(n)
    B->>B: Check if buffer has data
    alt Buffer empty
        B->>A: Request memory block
        Note over A: Get from pool or allocate new block
        A-->>B: Return memory block reference
        B->>N: Read directly into memory block
        Note over N,B: Zero-copy: data written to final location
    end
    B-->>U: Return slice view of memory block
    Note over U: User uses directly, no copy needed
    U->>U: Process data
    U->>A: Recycle memory block (optional)
    Note over A: Block returns to pool for reuse
```

#### 2.3.2 Memory Block Management Flow

```mermaid
graph TD
    A[Start Reading] --> B{buf has data?}
    B -->|Yes| C[Return data view directly]
    B -->|No| D[Call feedData]
    D --> E[Allocator.Read requests memory]
    E --> F{Pool has free block?}
    F -->|Yes| G[Reuse existing memory block]
    F -->|No| H[Allocate new memory block]
    G --> I[Read data from network]
    H --> I
    I --> J[Append to buf.Buffers]
    J --> K[Update Size and Length]
    K --> C
    C --> L[User reads data]
    L --> M{Data processed?}
    M -->|Yes| N[ClipFront recycles front blocks]
    N --> O[Allocator.Free returns to pool]
    O --> P[End]
    M -->|No| A
```

### 2.4 Core Implementation Analysis

#### 2.4.1 Initialization and Memory Allocation

```go
func NewBufReader(reader io.Reader) *BufReader {
	return NewBufReaderWithBufLen(reader, defaultBufSize)
}

func NewBufReaderWithBufLen(reader io.Reader, bufLen int) *BufReader {
	r := &BufReader{
		Allocator: NewScalableMemoryAllocator(bufLen), // Create allocator
		BufLen:    bufLen,
	}
	r.feedData = func() error {
		// Key: read via the allocator, filling a memory block directly
		buf, err := r.Allocator.Read(reader, r.BufLen)
		if err != nil {
			return err
		}
		n := len(buf)
		r.totalRead += n
		// Append only the memory block reference - no copy
		r.buf.Buffers = append(r.buf.Buffers, buf)
		r.buf.Size += n
		r.buf.Length += n
		return nil
	}
	r.buf.Memory = &Memory{}
	return r
}
```

**Zero-Copy Key Points**:

- `Allocator.Read()` reads from the `io.Reader` directly into an allocated memory block
- The returned `buf` is a reference to the memory block that actually stores the data
- `append(r.buf.Buffers, buf)` appends only the reference - no data is copied

#### 2.4.2 Read Operations

```go
func (r *BufReader) ReadByte() (b byte, err error) {
	// If the buffer is empty, trigger data filling
	for r.buf.Length == 0 {
		if err = r.feedData(); err != nil {
			return
		}
	}
	// Read from the memory block chain - no copy needed
	return r.buf.ReadByte()
}

func (r *BufReader) ReadRange(n int, yield func([]byte)) (err error) {
	for r.recycleFront(); n > 0 && err == nil; err = r.feedData() {
		if r.buf.Length > 0 {
			if r.buf.Length >= n {
				// Pass slice views of the memory blocks directly - no copy
				r.buf.RangeN(n, yield)
				return
			}
			n -= r.buf.Length
			r.buf.Range(yield)
		}
	}
	return
}
```

**Zero-Copy Benefits**:

- The `yield` callback receives slice views of the memory blocks
- User code operates directly on the original memory, with no intermediate copying
- After reading, processed blocks are automatically recycled

#### 2.4.3 Memory Recycling

```go
func (r *BufReader) recycleFront() {
	// Clean up memory blocks that have been fully processed
	r.buf.ClipFront(r.Allocator.Free)
}

func (r *BufReader) Recycle() {
	r.buf = MemoryReader{}
	if r.Allocator != nil {
		// Return all memory blocks to the allocator
		r.Allocator.Recycle()
	}
	if r.Mouth != nil {
		close(r.Mouth)
	}
}
```

### 2.5 Comparison with bufio.Reader

```mermaid
graph LR
    subgraph "bufio.Reader (Multiple Copies)"
        A1[Network] -->|System Call| B1[Kernel Buffer]
        B1 -->|Copy 1| C1[bufio Buffer]
        C1 -->|Copy 2| D1[User Slice]
        D1 -->|Copy 3?| E1[Application]
    end

    subgraph "BufReader (Zero-Copy)"
        A2[Network] -->|System Call| B2[Kernel Buffer]
        B2 -->|Direct Read| C2[GoMem Block]
        C2 -->|Slice View| D2[User Code]
        D2 -->|Recycle| C2
        C2 -->|Reuse| C2
    end
```

| Feature | bufio.Reader | BufReader |
|---------|-------------|-----------|
| Memory copies | 2-3 | 0 (slice views) |
| Buffer mode | Fixed-size single buffer | Variable-size chained buffer |
| Memory allocation | May allocate on each read | Object pool reuse |
| Memory recycling | Automatic via GC | Actively returned to pool |
| Large data handling | Multiple operations needed | Single append to chain |
| GC pressure | High | Very low |

## 3. Performance Benchmarks

### 3.1 Test Scenario Design

#### 3.1.1 Real Network Simulation

To make the benchmarks more realistic, we implemented a `mockNetworkReader` that simulates real network behavior.
**Real Network Characteristics**: in real network reading, the length returned by each `Read()` call is **uncertain**, affected by multiple factors:

- TCP receive window size
- Network latency and bandwidth
- OS buffer state
- Network congestion
- Network quality fluctuations

**Simulation Implementation**:

```go
type mockNetworkReader struct {
	data     []byte
	offset   int
	rng      *rand.Rand
	minChunk int // Minimum chunk size
	maxChunk int // Maximum chunk size
}

func (m *mockNetworkReader) Read(p []byte) (n int, err error) {
	if m.offset >= len(m.data) {
		return 0, io.EOF
	}
	// Return a random amount of data between minChunk and maxChunk
	chunkSize := m.minChunk + m.rng.Intn(m.maxChunk-m.minChunk+1)
	if chunkSize > len(p) {
		chunkSize = len(p)
	}
	n = copy(p[:chunkSize], m.data[m.offset:])
	m.offset += n
	return n, nil
}
```

**Different Network Condition Simulations**:

| Network Condition | Data Block Range | Real Scenario |
|------------------|-----------------|---------------|
| Good network | 1024-4096 bytes | Stable LAN, premium network |
| Normal network | 256-2048 bytes | Regular internet connection |
| Poor network | 64-512 bytes | High latency, small TCP window |
| Worst network | 1-128 bytes | Mobile network, severe congestion |

This simulation makes the benchmark results more realistic and reliable.

#### 3.1.2 Test Scenario List

We focus on the following core scenarios:

1. **Concurrent Network Connection Reading** - demonstrates zero allocation
2. **Concurrent Protocol Parsing** - simulates real applications
3. **GC Pressure Test** - shows long-term running advantages ⭐
4. **Streaming Server Scenario** - real business scenario ⭐

### 3.2 Benchmark Design

#### Core Test Scenarios

The benchmarks focus on comparing **concurrent network scenarios** and **GC pressure**:

**1. Concurrent Network Connection Reading**

- Simulates 100+ concurrent connections continuously reading data
- Each read processes 1KB data packets
- bufio: allocates a new buffer each time (`make([]byte, 1024)`)
- BufReader: zero-copy processing (`ReadRange`)

**2. Concurrent Protocol Parsing**

- Simulates a streaming server parsing protocol packets
- Reads a packet header (4 bytes) plus the data content
- Compares memory allocation strategies

**3. GC Pressure Test** (⭐ core)

- Continuous concurrent reading and processing
- Tracks GC count, total memory allocated, and allocation count
- Demonstrates the differences in long-term operation

**4. Streaming Server Scenario** (⭐ real application)

- Simulates 100 concurrent streams
- Each stream reads data and forwards it to subscribers
- A complete, realistic application comparison

#### Key Test Logic

**Concurrent reading**:

```go
// bufio.Reader - allocate on every read
buf := make([]byte, 1024) // 1KB allocation
n, _ := reader.Read(buf)
processData(buf[:n])

// BufReader - zero-copy
reader.ReadRange(1024, func(data []byte) {
	processData(data) // Direct use, no allocation
})
```

**GC statistics**:

```go
// Record GC statistics
var beforeGC, afterGC runtime.MemStats
runtime.ReadMemStats(&beforeGC)

b.RunParallel(func(pb *testing.PB) {
	// Concurrent testing...
})

runtime.ReadMemStats(&afterGC)
b.ReportMetric(float64(afterGC.NumGC-beforeGC.NumGC), "gc-runs")
b.ReportMetric(float64(afterGC.TotalAlloc-beforeGC.TotalAlloc)/1024/1024, "MB-alloc")
```

Complete test code: `pkg/util/buf_reader_benchmark_test.go`

### 3.3 Running Benchmarks

We provide complete benchmark code (`pkg/util/buf_reader_benchmark_test.go`) and a convenient test script.

#### Method 1: Using the Test Script (Recommended)

```bash
# Run the complete benchmark suite
sh scripts/benchmark_bufreader.sh
```

The script runs all tests sequentially and prints user-friendly results.
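Independent of the full suite, the core allocation difference that the benchmarks measure can be reproduced in miniature with `testing.AllocsPerRun`. The sketch below compares only the two raw access patterns and does not use Monibuca's actual types:

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

var sink []byte // global sink so the compiler cannot elide the work

func main() {
	payload := bytes.Repeat([]byte{0xCD}, 1024)

	// Pattern 1: allocate a fresh 1KB buffer per "read" (bufio-style usage).
	perRead := testing.AllocsPerRun(1000, func() {
		buf := make([]byte, 1024)
		copy(buf, payload)
		sink = buf
	})

	// Pattern 2: reuse one buffer and hand out slice views
	// (BufReader-style usage).
	buf := make([]byte, 1024)
	reused := testing.AllocsPerRun(1000, func() {
		n := copy(buf, payload)
		sink = buf[:n]
	})

	fmt.Printf("allocs per op: %.0f vs %.0f\n", perRead, reused)
}
```

The first pattern pays one heap allocation per operation; the second pays none, because slicing an existing buffer only copies a slice header.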
#### Method 2: Manual Testing

```bash
cd pkg/util

# Run all benchmarks
go test -bench=BenchmarkConcurrent -benchmem -benchtime=2s -test.run=xxx

# Run specific tests
go test -bench=BenchmarkGCPressure -benchmem -benchtime=5s -test.run=xxx

# Run the streaming server scenario
go test -bench=BenchmarkStreamingServer -benchmem -benchtime=3s -test.run=xxx
```

#### Method 3: Run Key Tests Only

```bash
cd pkg/util

# GC pressure comparison (core advantage)
go test -bench=BenchmarkGCPressure -benchmem -test.run=xxx

# Streaming server scenario (real application)
go test -bench=BenchmarkStreamingServer -benchmem -test.run=xxx
```

### 3.4 Actual Performance Test Results

Results from running the benchmarks on an Apple M2 Pro:

**Test Environment**:

- CPU: Apple M2 Pro (12 cores)
- OS: macOS (darwin/arm64)
- Go: 1.23.0

#### 3.4.1 Core Performance Comparison

| Test Scenario | bufio.Reader | BufReader | Difference |
|--------------|-------------|-----------|-----------|
| **Concurrent Network Read** | 103.2 ns/op<br>1027 B/op, 1 allocs | 147.6 ns/op<br>4 B/op, 0 allocs | Zero alloc ⭐ |
| **GC Pressure Test** | 1874 ns/op<br>5,576,659 mallocs<br>3 gc-runs | 112.7 ns/op<br>3,918 mallocs<br>2 gc-runs | **16.6x faster** ⭐⭐⭐ |
| **Streaming Server** | 374.6 ns/op<br>79,508 MB-alloc<br>134 gc-runs | 30.29 ns/op<br>601 MB-alloc<br>2 gc-runs | **12.4x faster** ⭐⭐⭐ |

#### 3.4.2 GC Pressure Comparison (Core Finding)

The **GC Pressure Test** results best demonstrate the long-term running differences:

**bufio.Reader**:

```
Operation Latency: 1874 ns/op
Allocation Count:  5,576,659 times (over 5 million!)
GC Runs:           3 times
Per Operation:     2 allocs/op
```

**BufReader**:

```
Operation Latency: 112.7 ns/op (16.6x faster)
Allocation Count:  3,918 times (99.93% reduction)
GC Runs:           2 times
Per Operation:     0 allocs/op (zero allocation!)
```

**Key Metrics**:

- 🚀 **16x Throughput Improvement**: 45.7M ops/s vs 2.8M ops/s
- ⭐ **99.93% Allocation Reduction**: from 5.57 million to 3,918
- ✨ **Zero-Allocation Operations**: 0 allocs/op vs 2 allocs/op

#### 3.4.3 Streaming Server Scenario (Real Application)

Simulating 100 concurrent streams, continuously reading and forwarding data:

**bufio.Reader**:

```
Operation Latency: 374.6 ns/op
Memory Allocation: 79,508 MB (79 GB!)
GC Runs:           134 times
Per Operation:     4 allocs/op
```

**BufReader**:

```
Operation Latency: 30.29 ns/op (12.4x faster)
Memory Allocation: 601 MB (99.2% reduction)
GC Runs:           2 times (98.5% reduction!)
Per Operation:     0 allocs/op
```

**Striking differences**:

- 🎯 **GC Runs: 134 → 2** (98.5% reduction)
- 💾 **Memory Allocation: 79 GB → 0.6 GB** (132x reduction)
- ⚡ **Throughput: 10.1M → 117M ops/s** (11.6x improvement)

#### 3.4.4 Long-Term Running Impact

For the streaming server scenario, a **1-hour run** extrapolates to:

**bufio.Reader**:

```
Estimated Memory Allocation: ~2.8 TB
Estimated GC Runs:           ~4,800 times
Cumulative GC Pause:         significant
```

**BufReader**:

```
Estimated Memory Allocation: ~21 GB (133x reduction)
Estimated GC Runs:           ~72 times (67x reduction)
Cumulative GC Pause:         minimal
```

**Usage Recommendations**:

| Scenario | Recommended | Reason |
|----------|------------|---------|
| Simple file reading | bufio.Reader | Standard library is sufficient |
| **High-concurrency network server** | **BufReader** ⭐ | **98% GC reduction** |
| **Streaming media processing** | **BufReader** ⭐ | **Zero allocation, high throughput** |
| **Long-running services** | **BufReader** ⭐ | **More stable system** |

#### 3.4.5 Essential Reasons for the Performance Improvement

While bufio.Reader is faster in some simple scenarios, BufReader's design goal is not to be faster in every case, but rather to:

1. **Eliminate memory allocation** - avoid frequent `make([]byte, n)` in real applications
2. **Reduce GC pressure** - reuse memory through an object pool, lightening the collector's load
3. **Process with zero copies** - provide the `ReadRange` API for direct data manipulation
4. **Buffer in chains** - support complex data processing patterns

In scenarios like the **Monibuca streaming server**, the value of these features far exceeds microsecond-level latency differences.
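Of the design points above, chained buffering is the least obvious, so here is a deliberately tiny sketch of the idea. The `chain` type and its methods are invented for this illustration and are not Monibuca's real `MemoryReader`:

```go
package main

import "fmt"

// chain is a toy chained buffer: it appends block *references*
// instead of copying bytes into one contiguous slice.
type chain struct {
	blocks [][]byte
	length int
}

func (c *chain) push(b []byte) {
	c.blocks = append(c.blocks, b) // reference only, no byte copy
	c.length += len(b)
}

// rangeN yields slice views covering the first n bytes across blocks.
func (c *chain) rangeN(n int, yield func([]byte)) {
	for _, b := range c.blocks {
		if n <= 0 {
			return
		}
		if len(b) > n {
			b = b[:n]
		}
		yield(b)
		n -= len(b)
	}
}

func main() {
	var c chain
	c.push([]byte("abc"))
	c.push([]byte("defgh"))

	total := 0
	c.rangeN(6, func(v []byte) { total += len(v) })
	fmt.Println(c.length, total) // 8 6
}
```

Arbitrarily large reads become a cheap `push` of another block, and consumers walk the chain through views; this is the structural reason large data handling needs no reallocation or compaction.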
**Real Impact**: when handling 1000 concurrent streaming connections:

```go
// bufio.Reader approach
// 1000 connections × 30 fps × 1024 bytes/packet = 30,720,000 allocations per second
// At 1024 bytes per allocation, that is ~30 GB/s of temporary memory,
// triggering massive GC activity

// BufReader approach
// 0 allocations (memory reuse)
// 90%+ GC pressure reduction
// Significantly improved system stability
```

**Selection Guidelines**:

- 📁 **Simple file reading** → bufio.Reader
- 🔄 **High-concurrency network services** → BufReader (98% GC reduction)
- 💾 **Long-running services** → BufReader (zero allocation)
- 🎯 **Streaming server** → BufReader (10-20x throughput)

## 4. Real-World Use Cases

### 4.1 RTSP Protocol Parsing

```go
// Use BufReader to parse an RTSP request
func parseRTSPRequest(conn net.Conn) (*RTSPRequest, error) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	// Read the request line: zero-copy, no memory allocation
	requestLine, err := reader.ReadLine()
	if err != nil {
		return nil, err
	}

	// Read the headers: operates directly on memory blocks
	headers, err := reader.ReadMIMEHeader()
	if err != nil {
		return nil, err
	}

	// Read the body (if present)
	var body []byte
	if contentLength := headers.Get("Content-Length"); contentLength != "" {
		length, _ := strconv.Atoi(contentLength)
		// ReadRange provides zero-copy data access
		err = reader.ReadRange(length, func(chunk []byte) {
			body = append(body, chunk...)
		})
		if err != nil {
			return nil, err
		}
	}

	return &RTSPRequest{
		RequestLine: requestLine,
		Headers:     headers,
		Body:        body,
	}, nil
}
```

### 4.2 Streaming Media Packet Parsing

```go
// Use BufReader to parse FLV tags
func parseFLVPackets(conn net.Conn) error {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	for {
		// Read the tag type: 1 byte
		packetType, err := reader.ReadByte()
		if err != nil {
			return err
		}

		// Read the data size: 3 bytes big-endian
		dataSize, err := reader.ReadBE32(3)
		if err != nil {
			return err
		}

		// Read the timestamp: 4 bytes
		timestamp, err := reader.ReadBE32(4)
		if err != nil {
			return err
		}

		// Skip the StreamID: 3 bytes
		if err := reader.Skip(3); err != nil {
			return err
		}

		// Read the actual data: zero-copy processing
		err = reader.ReadRange(int(dataSize), func(data []byte) {
			// Process the data directly, no copy needed
			processPacket(packetType, timestamp, data)
		})
		if err != nil {
			return err
		}

		// Skip the previous tag size field: 4 bytes
		if err := reader.Skip(4); err != nil {
			return err
		}
	}
}
```

### 4.3 Performance-Critical Scenarios

BufReader is particularly suitable for:

1. **High-frequency small packet processing**: network protocol parsing, RTP/RTCP packet handling
2. **Large data stream transmission**: continuous reading of video/audio streams
3. **Multi-step protocol reading**: protocols that read fields of different lengths step by step
4. **Low-latency requirements**: real-time streaming media, online gaming
5. **High-concurrency scenarios**: servers with massive numbers of concurrent connections

## 5. Best Practices

### 5.1 Correct Usage Patterns

```go
// ✅ Correct: specify an appropriate block size on creation
func goodExample(conn net.Conn) {
	// Choose the block size based on actual packet sizes
	reader := util.NewBufReaderWithBufLen(conn, 16384) // 16KB blocks
	defer reader.Recycle()                             // Ensure resources are recycled

	// Use ReadRange for zero-copy reads
	reader.ReadRange(1024, func(data []byte) {
		// Process directly; do not hold a reference to data
		process(data)
	})
}

// ❌ Wrong: forgetting to recycle resources
func badExample1(conn net.Conn) {
	reader := util.NewBufReader(conn)
	// Missing defer reader.Recycle()
	// Memory blocks can never be returned to the object pool
}

// ❌ Wrong: holding a reference to the data
var globalData []byte

func badExample2(conn net.Conn) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	reader.ReadRange(1024, func(data []byte) {
		// ❌ Wrong: data will be recycled after Recycle
		globalData = data // Dangling reference
	})
}

// ✅ Correct: copy when the data must be retained
func goodExample2(conn net.Conn) {
	reader := util.NewBufReader(conn)
	defer reader.Recycle()

	var saved []byte
	reader.ReadRange(1024, func(data []byte) {
		// Explicitly copy when retention is needed
		saved = make([]byte, len(data))
		copy(saved, data)
	})
	// saved is now safe to use
}
```

### 5.2 Block Size Selection

```go
// Choose an appropriate block size for the scenario
const (
	// Small packet protocols (e.g., RTSP, HTTP headers)
	SmallPacketSize = 4 << 10 // 4KB

	// Medium data streams (e.g., audio)
	MediumPacketSize = 16 << 10 // 16KB

	// Large data streams (e.g., video)
	LargePacketSize = 64 << 10 // 64KB
)

func createReaderForProtocol(conn net.Conn, protocol string) *util.BufReader {
	var bufSize int
	switch protocol {
	case "rtsp", "http":
		bufSize = SmallPacketSize
	case "audio":
		bufSize = MediumPacketSize
	case "video":
		bufSize = LargePacketSize
	default:
		bufSize = SmallPacketSize
	}
	return util.NewBufReaderWithBufLen(conn, bufSize)
}
```

### 5.3 Error Handling

```go
func robustRead(conn net.Conn) error {
	reader := util.NewBufReader(conn)
	defer func() {
		// Ensure resources are recycled in all cases
		reader.Recycle()
	}()

	// Set a read timeout
	conn.SetReadDeadline(time.Now().Add(5 * time.Second))

	// Read data
	data, err := reader.ReadBytes(1024)
	if err != nil {
		if err == io.EOF {
			// Normal end of stream
			return nil
		}
		// Handle other errors
		return fmt.Errorf("read error: %w", err)
	}

	// Process the data
	processData(data)
	return nil
}
```

## 6. Performance Optimization Tips

### 6.1 Batch Processing

```go
// ✅ Optimized: batch reading and processing
func optimizedBatchRead(reader *util.BufReader) error {
	// Read a large chunk at once
	return reader.ReadRange(65536, func(chunk []byte) {
		// Batch-process inside the callback
		for len(chunk) > 0 {
			packetSize := int(binary.BigEndian.Uint32(chunk[:4]))
			packet := chunk[4 : 4+packetSize]
			processPacket(packet)
			chunk = chunk[4+packetSize:]
		}
	})
}

// ❌ Inefficient: reading one packet at a time
func inefficientRead(reader *util.BufReader) error {
	for {
		size, err := reader.ReadBE32(4)
		if err != nil {
			return err
		}
		packet, err := reader.ReadBytes(int(size))
		if err != nil {
			return err
		}
		processPacket(packet.Buffers[0])
	}
}
```

### 6.2 Avoid Unnecessary Copying

```go
// ✅ Optimized: direct processing, no copy
func zeroCopyProcess(reader *util.BufReader) error {
	return reader.ReadRange(4096, func(data []byte) {
		// Operate directly on the original memory
		sum := 0
		for _, b := range data {
			sum += int(b)
		}
		reportChecksum(sum)
	})
}

// ❌ Inefficient: an unnecessary copy
func unnecessaryCopy(reader *util.BufReader) error {
	mem, err := reader.ReadBytes(4096)
	if err != nil {
		return err
	}
	// An extra copy is performed here
	data := make([]byte, mem.Size)
	copy(data, mem.Buffers[0])
	sum := 0
	for _, b := range data {
		sum += int(b)
	}
	reportChecksum(sum)
	return nil
}
```

### 6.3 Proper Resource Management

```go
// ✅ Optimized: use an object pool to manage BufReader instances
type ConnectionPool struct {
	readers sync.Pool
}

func (p *ConnectionPool) GetReader(conn net.Conn) *util.BufReader {
	if reader := p.readers.Get(); reader != nil {
		r := reader.(*util.BufReader)
		// Re-initialize for the new connection (re-binding omitted here)
		return r
	}
	return util.NewBufReader(conn)
}

func (p *ConnectionPool) PutReader(reader *util.BufReader) {
	reader.Recycle()      // Recycle the memory blocks
	p.readers.Put(reader) // Recycle the BufReader object itself
}

// Using the connection pool
func handleConnection(pool *ConnectionPool, conn net.Conn) {
	reader := pool.GetReader(conn)
	defer pool.PutReader(reader)

	// Handle the connection
	processConnection(reader)
}
```

## 7. Summary

### 7.1 Performance Comparison Visualization

Based on actual benchmark results (concurrent scenarios):

```
📊 GC Runs Comparison (Core Advantage) ⭐⭐⭐
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 134 runs
BufReader    █ 2 runs  ← 98.5% reduction!

📊 Total Memory Allocation Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader ████████████████████████████████████████████████████████████████ 79 GB
BufReader    █ 0.6 GB  ← 99.2% reduction!

📊 Operation Throughput Comparison
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
bufio.Reader █████ 10.1M ops/s
BufReader    ████████████████████████████████████████████████████████ 117M ops/s  ← 11.6x!
```

**Key Metrics** (streaming server scenario):

- 🎯 **GC Runs**: from 134 to 2 (98.5% reduction)
- 💾 **Memory Allocation**: from 79 GB to 0.6 GB (132x reduction)
- ⚡ **Throughput**: 11.6x improvement

### 7.2 Core Advantages

BufReader achieves zero-copy, high-performance network reading through:

1. **Zero-Copy Architecture**
   - Data is read directly from the network into its final memory location
   - Slice views avoid data copying
   - The chained buffer supports large data processing
2. **Memory Reuse Mechanism**
   - The GoMem object pool reuses memory blocks
   - Active memory management reduces GC pressure
   - Configurable block sizes adapt to different scenarios
3. **Significant Performance Improvement** (in concurrent scenarios)
   - GC runs reduced by 98.5% (134 → 2)
   - Memory allocation reduced by 99.2% (79 GB → 0.6 GB)
   - Throughput improved by 10-20x
   - Significantly improved system stability

### 7.3 Ideal Use Cases

BufReader is particularly suitable for:

- ✅ High-performance network servers
- ✅ Streaming media data processing
- ✅ Real-time protocol parsing
- ✅ Large data stream transmission
- ✅ Low-latency requirements
- ✅ High-concurrency environments

It is not suitable for:

- ❌ Simple file reading (the standard library is sufficient)
- ❌ Single small data reads
- ❌ Performance-insensitive scenarios

### 7.4 Choosing Between bufio.Reader and BufReader

| Scenario | Recommended |
|----------|------------|
| Simple file reading | bufio.Reader |
| Low-frequency network reads | bufio.Reader |
| High-performance network server | BufReader |
| Streaming media processing | BufReader |
| Protocol parsers | BufReader |
| Zero-copy requirements | BufReader |
| Memory-sensitive scenarios | BufReader |

### 7.5 Key Points

Remember when using BufReader:

1. **Always call Recycle()**: ensure memory blocks are returned to the object pool
2. **Don't hold data references**: data passed to the ReadRange callback will be recycled
3. **Choose an appropriate block size**: adjust it to the actual packet size
4. **Leverage ReadRange**: achieve true zero-copy processing
5. **Use it with GoMem**: take full advantage of memory reuse

Through the combination of BufReader and GoMem, Monibuca achieves high-performance network data processing, providing solid infrastructure for streaming media servers.

## References

- [GoMem Project](https://github.com/langhuihui/gomem)
- [Monibuca v5 Documentation](https://m7s.live)
- [Object Reuse Technology Deep Dive](./arch/reuse.md)
- Go standard library `bufio` package source code
- Go standard library `sync.Pool` documentation