Go Senior Engineer's Lecture (MOOC) 008_GMP Scheduler and Go Design Philosophy

Go GMP Scheduler and Design Philosophy

Corresponding videos: 9-2 Go Language Scheduler, 18-1 Understanding Go Language Design, 18-2 Course Summary

1. Evolution of Go Scheduler

1.0 Era: Single-threaded Scheduler (Go 0.x)

  • Only one thread runs goroutines
  • All goroutines wait in a queue
  • Cannot utilize multiple cores

1.1 Era: Multi-threaded Scheduler (Go 1.0)

  • Introduced multi-threading
  • But every thread shared one global run queue behind a single lock, so lock contention became a severe performance bottleneck

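1.2+ Era: GMP Model (Go 1.1 to Present)

Go 1.1 introduced the well-known GMP scheduling model, which gives each P its own local run queue and adds work stealing to relieve the global lock.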

2. Detailed Explanation of GMP Model

    G (Goroutine)      - coroutine, a lightweight user-level thread
    M (Machine/Thread) - operating system thread
    P (Processor)      - logical processor, the scheduling context

2.1 Relationship between the three

                   ┌────────────┐
                   │ Go program │
                   └─────┬──────┘
                         │ creates goroutines
            ┌────────────┼────────────┐
            ▼            ▼            ▼
         ┌─────┐      ┌─────┐      ┌─────┐
         │  G  │      │  G  │      │  G  │    ... (tens of thousands)
         └──┬──┘      └──┬──┘      └──┬──┘
            │            │            │
   ┌────────┴────────┐   │   ┌────────┴────────┐
   │ P's local queue │   │   │ P's local queue │
   │ [G][G][G]       │   │   │ [G][G]          │
   └────────┬────────┘   │   └────────┬────────┘
            │            │            │
         ┌──┴──┐      ┌──┴──┐      ┌──┴──┐
         │  P  │      │  P  │      │  P  │    (GOMAXPROCS of them)
         └──┬──┘      └──┬──┘      └──┬──┘
            │            │            │
         ┌──┴──┐      ┌──┴──┐      ┌──┴──┐
         │  M  │      │  M  │      │  M  │    (created on demand)
         └──┬──┘      └──┬──┘      └──┬──┘
            │            │            │
   ═════════╧════════════╧════════════╧═════════
                 OS kernel threads

2.2 Detailed explanation of each component

G (Goroutine)
- Initial stack size is only 2KB (threads usually 1-8MB)
- Stack can grow and shrink dynamically
- Contains the executing function, stack pointer, program counter, etc.
- States: runnable, running, waiting, completed

M (Machine)
- Corresponds to an operating system thread
- Created by the Go runtime as needed; the default limit is 10000 threads (adjustable with debug.SetMaxThreads)
- M must hold a P to run a G
- When M is blocked by a system call, P will be taken by another M

P (Processor)
- Number determined by GOMAXPROCS (defaults to the number of logical CPUs)
- Each P has a local run queue (up to 256 Gs)
- P is the core of scheduling, deciding which G runs on which M

2.3 Scheduling Policy

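// Inspect and set the number of Ps
runtime.GOMAXPROCS(0)    // returns the current value without changing it
runtime.GOMAXPROCS(4)    // sets it to 4 and returns the previous value
runtime.NumCPU()         // number of logical CPUs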

Scheduling opportunities (possible goroutine switching points):

Switching point    | Description
-------------------+------------------------------------------------
I/O operations     | File, network read/write
channel operations | When sending/receiving blocks
select             | Multiplexing
Waiting for locks  | sync.Mutex, etc.
Function calls     | Compiler inserts checkpoints at function entry
runtime.Gosched()  | Manual yielding
GC                 | STW phase of garbage collection
System calls       | M and P separate when syscall blocks

2.4 Work Stealing

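When a P's local queue is empty, its M looks for work elsewhere, roughly in this order in recent runtimes:

  1. First, it tries to get Gs from the global queue.
  2. If the global queue is also empty → it polls the network poller (without blocking) for Gs whose I/O is ready.
  3. If still none → it picks another P at random and steals half of the Gs from that P's local queue.
  4. If still none → the M releases its P and sleeps.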

    P1 [empty]        P2 [G5 G6 G7 G8]
       │                    │
       │    steal half!     │
       │ ◄───── steal ───── │
       │                    │
    P1 [G7 G8]        P2 [G5 G6]

2.5 System Call Handling (Hand Off)

When a G executes a system call (e.g., file I/O) and blocks M:

  Normal state: M1 ──── P1 ──── G1 (blocked in syscall)

  Hand off:     M1 ──── G1 (still blocked in syscall)
                M2 ──── P1 ──── G2 (P handed to another M)

P will be handed off to an idle M (or a new M will be created) to keep P busy.

2.6 Network Poller (netpoller)

Network I/O does not block the M. Instead, the runtime registers the socket with the platform poller (epoll on Linux, kqueue on BSD and macOS) and parks only the goroutine:

  1. G initiates a network call → G is attached to the netpoller.
  2. M is not blocked and continues to run other Gs.
  3. When the network is ready → G is put back into the runnable queue.

This is why Go can efficiently handle a large number of network connections.

3. Comparison of Goroutines and Threads

Feature              | Goroutine                | OS Thread
---------------------+--------------------------+----------------------
Stack size           | 2KB (dynamic growth)     | 1-8MB (fixed)
Creation cost        | ~0.3μs                   | ~30μs
Switching cost       | ~0.2μs (user mode)       | ~1μs (kernel mode)
Scheduling method    | Cooperative + preemptive | Preemptive
Magnitude            | Millions                 | Thousands
Communication method | channel                  | Shared memory + locks

Preemptive Scheduling (Go 1.14+)

Before Go 1.14, scheduling was cooperative: a goroutine could be descheduled only at the switching points listed earlier, so a CPU-intensive loop that made no function calls could occupy its M indefinitely:

// Before Go 1.14, this goroutine monopolizes its thread
go func() {
    for {} // infinite loop that never yields
}()

Go 1.14+ introduced signal-based preemption (SIGURG), allowing preemption even without function calls.

4. Observing the Scheduler in Practice

# Print scheduler state with GODEBUG
GODEBUG=schedtrace=1000 go run main.go
# Emits one line of scheduler state every 1000ms

# More detail
GODEBUG=schedtrace=1000,scheddetail=1 go run main.go

# Visualize with go tool trace

// Inspect from code
fmt.Println("goroutines:", runtime.NumGoroutine())
fmt.Println("CPU cores:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

5. Go Language Design Philosophy

Corresponding video Ch18: Understanding Go Language Design

5.1 Less is more

  • Only 25 keywords (C has 32, Java has 50+)
  • No classes, replaced by struct + method
  • No inheritance, replaced by composition (embedding)
  • No generics before Go 1.18, to keep the language simple
  • No exceptions: multiple return values plus the error type replace try/catch
  • Only for loops, no while/do-while
  • No ternary operator ?:

5.2 Composition over Inheritance

// Composition, not inheritance
type Animal struct {
    Name string
}
func (a Animal) Speak() string { return a.Name + " speaks" }

type Dog struct {
    Animal  // embedded, not inherited
    Breed string
}
// Dog automatically gets the Speak method, and can override (shadow) it

5.3 Implicit Interface Implementation

// No implements keyword is needed
type Stringer interface {
    String() string
}
// Any type with a String() string method automatically implements Stringer
// → this decouples the interface's definer from its implementers

5.4 Concurrency is a First-Class Citizen

  • go keyword to start goroutines (not a library function)
  • chan is a built-in type (not a Queue from a library)
  • select is language-level multiplexing
  • "Don't communicate by sharing memory; share memory by communicating."

5.5 Toolchain Philosophy

Tool     | Purpose
---------+---------------------------------------------------
gofmt    | Standardizes code style, no style wars
go vet   | Static analysis, finds common errors
go test  | Built-in testing framework, no third-party needed
go doc   | Comments are documentation
go build | Compiles into a single binary, no dependencies
go mod   | Built-in dependency management

5.6 Error Handling Philosophy

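// Handle every error explicitly so that none is hidden
f, err := os.Open("file.txt")
if err != nil {
    // must be handled
}
defer f.Close()
// Verbose, but:
// 1. The error path is clearly visible
// 2. No crash caused by a forgotten catch
// 3. It forces you to think about every possible failure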

5.7 Go Proverbs by Rob Pike

Proverb                                                              | Meaning
---------------------------------------------------------------------+----------------------------------------------------------------
Don't communicate by sharing memory, share memory by communicating   | Prefer channels to locks around shared state
Concurrency is not parallelism                                       | Concurrency is about structure; parallelism is about execution
Channels orchestrate; mutexes serialize                              | Channels coordinate flow; mutexes serialize access
The bigger the interface, the weaker the abstraction                 | Smaller interfaces make stronger abstractions
Make the zero value useful                                           | A type's zero value should be usable without initialization
interface{} says nothing                                             | The empty interface conveys no information about a value
Gofmt's style is no one's favorite, yet gofmt is everyone's favorite | A consistent style matters more than a pretty one
A little copying is better than a little dependency                  | Copying a little code beats pulling in a dependency
Clear is better than clever                                          | Prefer clarity to cleverness
Errors are values                                                    | Errors are ordinary values and can be handled programmatically
Don't just check errors, handle them gracefully                      | Handle errors gracefully rather than merely checking them
Don't panic                                                          | Reserve panic for truly unrecoverable situations

6. Course Summary Mind Map

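Go language core
├── Basic syntax
│   ├── Variables, constants, types
│   ├── Control flow (if/for/switch)
│   └── Functions (multiple return values, closures, defer)
├── Object orientation
│   ├── struct + method (instead of class)
│   ├── Composition (instead of inheritance)
│   └── interface (implicit implementation, duck typing)
├── Functional programming
│   ├── Closures, higher-order functions
│   ├── Decorator/middleware pattern
│   └── Functional Options
├── Error handling
│   ├── error interface, multiple return values
│   ├── panic/recover (only for unrecoverable errors)
│   └── Unified error handling (errWrapper pattern)
├── Testing
│   ├── Table-driven tests
│   ├── Benchmarks
│   ├── pprof profiling
│   └── Example tests (documentation examples)
├── Concurrent programming
│   ├── goroutine (GMP scheduling model)
│   ├── channel (CSP communication model)
│   ├── select (multiplexing)
│   └── sync package (WaitGroup, Mutex)
├── Standard library
│   ├── net/http
│   ├── encoding/json
│   └── html/template
└── Engineering practice
    ├── Single-task → concurrent → distributed crawler
    ├── ElasticSearch integration
    ├── Docker containerization
    └── RPC for distributed communication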

Theme test article, for testing purposes only. Published by: Walker. Please credit the source when reposting: https://walker-learn.xyz/archives/6752
