A Deep Dive into Golang defer: From Practice to Internals

Time: 2023-02-18 16:32:40

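Based on the Go 1.16 source code, this article covers the usage rules, implementation, and optimization history of the defer keyword, and closes with several common use cases. The aim is a complete picture of Go's defer, from everyday use down to its implementation.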

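defer overview

Go provides the defer keyword to handle deferred calls. Syntactically, a defer statement looks almost the same as an ordinary function call. The official documentation describes it as follows: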

A "defer" statement invokes a function whose execution is deferred to the moment the surrounding function returns, either because the surrounding function executed a return statement, reached the end of its function body, or because the corresponding goroutine is panicking.

DeferStmt = "defer" Expression .

The expression must be a function or method call; it cannot be parenthesized. Calls of built-in functions are restricted as for expression statements.

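In short:

  1. defer postpones the execution of a function (the deferred function, not the calling function)
  2. The deferred function is invoked when:
    1. the surrounding function executes a return statement
    2. the surrounding function reaches the end of its body
    3. a panic occurs
  3. Syntax rules:
    1. the expression must be a function or method call
    2. it cannot be parenthesized
    3. calls of built-in functions are restricted as for expression statements

Effective Go has a related description: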

Go's defer statement schedules a function call (the deferred function) to be run immediately before the function executing the defer returns. It's an unusual but effective way to deal with situations such as resources that must be released regardless of which path a function takes to return. The canonical examples are unlocking a mutex or closing a file.

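Put differently, defer schedules the deferred call to run just before the function that executed the defer returns, whichever path that function takes to return.

This short passage captures both the strength and the typical use of defer: releasing resources cleanly, for example unlocking a mutex or closing a file.

The mechanism resembles destructors in languages such as C++, which do cleanup work when a function exits or an object is destroyed, although defer is more flexible.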
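The mutex case mentioned in Effective Go can be sketched as follows. The counter type and its inc method are hypothetical names chosen for this illustration. The point is only that the deferred Unlock runs on the early return path as well as the normal one.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only for this illustration.
type counter struct {
	mu sync.Mutex
	n  map[string]int
}

// inc increments the count for key and returns the new value.
// The deferred Unlock runs on every return path, including the early one.
func (c *counter) inc(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()

	if key == "" {
		return 0 // early return: the mutex is still released
	}
	c.n[key]++
	return c.n[key]
}

func main() {
	c := &counter{n: map[string]int{}}
	// If the early return leaked the lock, the second call would deadlock.
	fmt.Println(c.inc(""), c.inc("a"), c.inc("a")) // prints "0 1 2"
}
```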

defer usage rules

The Go specification summarizes the behavior of defer in one short paragraph:

Each time a "defer" statement executes, the function value and parameters to the call are evaluated as usual and saved anew but the actual function is not invoked. Instead, deferred functions are invoked immediately before the surrounding function returns, in the reverse order they were deferred. That is, if the surrounding function returns through an explicit return statement, deferred functions are executed after any result parameters are set by that return statement but before the function returns to its caller. If a deferred function value evaluates to nil, execution panics when the function is invoked, not when the "defer" statement is executed.

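The paragraph is dense, so here are its main points:

  1. Eager argument evaluation: each time a defer statement executes, the function value and its arguments are evaluated and saved. In other words, the arguments of the deferred function are fixed, and copied, when the defer statement runs.
  2. Timing of the deferred call: the defer statement does not invoke the deferred function. The deferred function is invoked just before the calling function returns.
    1. If the calling function has an explicit return statement, the deferred function runs after all result values have been set (that is, after the return statement has been evaluated) and before the calling function returns to its caller;
    2. If the deferred function value is nil, the panic is triggered when the deferred function is invoked, not when the defer statement executes.
  3. Execution order: deferred functions run in the reverse order of their defer statements.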
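Rule 2.2 is the least obvious of these, so here is a small sketch of it. The function name nilDefer and the event strings are made up for this illustration. It defers a nil function value, shows that the defer statement itself completes, and recovers the panic that occurs when the nil function is finally invoked.

```go
package main

import "fmt"

// nilDefer defers a nil function value and records what happens.
func nilDefer() (events []string) {
	// Registered first, so it runs last and can recover the panic below.
	defer func() {
		if r := recover(); r != nil {
			events = append(events, "panic at invocation")
		}
	}()

	var f func() // nil function value
	defer f()    // the defer statement itself does not panic
	events = append(events, "defer statement executed")
	return events // f is invoked after this, and that call panics
}

func main() {
	fmt.Println(nilDefer()) // prints "[defer statement executed panic at invocation]"
}
```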

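This section demonstrates these points with concrete examples. The next section analyzes the source code to explain the principles and implementation details behind them.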

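Execution order

Deferred calls run in last-in, first-out (LIFO) order.

The code below demonstrates this. main defers deferA, deferB, and deferC in that order, and the output shows them running as deferC, deferB, deferA. This ordering closely resembles that of C++ destructors.

Example code:

// Demonstrates the execution order of defer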
package main

import "fmt"

func deferA() {
 fmt.Println("deferA")
}

func deferB() {
 fmt.Println("deferB")
}

func deferC() {
 fmt.Println("deferC")
}

func main() {
 defer deferA()
 defer deferB()
 defer deferC()
 fmt.Println("main")
}

Output of the example above:

$ go run defer_1.go 
main
deferC
deferB
deferA

Order of defer and return

Example code:

// Verifies the execution order of defer and return
package main

import "fmt"

func deferFunc() int {
 fmt.Println("defer func is called")
 return 0
}

func returnFunc() int {
 fmt.Println("return func is called")
 return 0
}

func returnAndDefer() int {
 defer deferFunc()
 return returnFunc()
}

func main() {
 returnAndDefer()
}

Output of the example above:

$ go run defer_4.go 
return func is called
defer func is called

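The example confirms that the deferred function (deferFunc() here) really runs after the return statement has evaluated returnFunc(). Figure 1 shows the rough sequence.

Figure 1: Execution order of defer and return

Eager evaluation

In the example below, the value of a passed to the deferred call is fixed when the defer statement executes, not when the deferred function actually runs, so the program prints 1.

Example code:

// Demonstrates eager argument evaluation in defer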
package main

import "fmt"

func deferD(d int) {
 fmt.Println(d)
}

func main() {
 a := 1
 defer deferD(a)
 a = a + 1
}

Output of the example above:

$ go run defer_2.go 
1
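For contrast, here is a sketch that places an argument and a closure side by side. The name snapshot is hypothetical. The argument is copied when the defer statement executes, whereas the closure reads the variable only when the deferred function runs.

```go
package main

import "fmt"

// snapshot reports what a deferred call observes in two styles.
func snapshot() (byArg, byClosure int) {
	a := 1
	// The argument a is evaluated now, so v is 1.
	defer func(v int) { byArg = v }(a)
	// The closure reads a when it runs, after the increment below.
	defer func() { byClosure = a }()
	a++
	return 0, 0
}

func main() {
	fmt.Println(snapshot()) // prints "1 2"
}
```

When the deferred call should see the latest value of a variable, a closure (or a pointer argument) is the usual choice; when it should see the value at the time of the defer statement, pass it as an argument.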

Another typical example calls a function inside the arguments of the deferred call.

// Eager argument evaluation, example 2
package main

import "fmt"

func f(index int, value int) int {
 fmt.Printf("index=%d,value=%d\n", index, value)
 return index
}

func main() {
 defer f(1, f(3, 1))
 defer f(2, f(4, 2))
}

Output of the example above:

$ go run defer_2_2.go 
index=3,value=1
index=4,value=2
index=2,value=4
index=1,value=3

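This example illustrates eager evaluation from another angle.

  • defer pushes f(1,f(3,1)): it saves the function address and both arguments, which means calling f(3,1) first --> prints index=3,value=1
  • defer pushes f(2,f(4,2)): it saves the function address and both arguments, which means calling f(4,2) first --> prints index=4,value=2
  • defer pops f(2,f(4,2)) --> prints index=2,value=4
  • defer pops f(1,f(3,1)) --> prints index=1,value=3

Modifying named return values

Example code:

// Demonstrates defer modifying the result of a function with a named return value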
package main

import "fmt"

func func1(a int) (e int) {
 defer func() {
  e = a + 1
 }()
 return a
}

func main() {
 e := func1(1)
 fmt.Println(e)
}

Output of the example above:

$ go run defer_3.go 
2

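The output shows that func1 does not return the value of a. From rule 2.1 in the list of defer characteristics above, the deferred function runs after a has been evaluated and before control returns to the caller, main. The named result e is therefore set to a by return a and then changed to a+1 before the function actually returns to main.

This touches on a related point: when function results are initialized in Go. There are two cases:

  1. Named results: if a result has a name, it is initialized to the zero value of its type at function entry and is in scope for the whole function. It behaves as if a variable with that name were declared on the first line of the function.
  2. Unnamed results: if a result has no name, a temporary variable is created at return time to receive the returned value.

Example code: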

package main

import "fmt"

func funcE() (t int) {
 fmt.Printf("t = %v\n", t)
 return 2
}

func main() {
 funcE()
}

Output of the example above:

$ go run defer_5.go 
t = 0
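The difference between the two cases matters most when a deferred closure is involved. The following sketch, with the hypothetical functions named and unnamed, shows that a deferred closure can change a named result but cannot change an unnamed one, because the returned value has already been copied out of the local variable.

```go
package main

import "fmt"

// named: the deferred closure updates the result variable itself.
func named() (r int) {
	defer func() { r++ }()
	return 1 // sets r = 1, then the deferred closure makes it 2
}

// unnamed: the deferred closure updates a local variable after its
// value has already been copied into the unnamed result.
func unnamed() int {
	r := 1
	defer func() { r++ }()
	return r // the result is already 1 when the closure runs
}

func main() {
	fmt.Println(named(), unnamed()) // prints "2 1"
}
```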

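defer and panic

A panic also ends a function early and hands control to its deferred calls, if there are any. A deferred function may recover the panic or leave it unhandled, and a deferred function may itself panic. The examples below cover these three cases.

Deferred functions that do not recover

Example code:

// Demonstrates defer without recover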
package main

import (
 "fmt"
)

func deferCall() {
 defer func() { fmt.Println("defer 1: before panic print") }()
 defer func() { fmt.Println("defer 2: before panic print") }()
 panic("panic error") // trigger defer out stack
 defer func() { fmt.Println("defer 3: after panic, never exec") }()
}

func main() {
 deferCall()
 fmt.Println("main exec ok")
}

Output of the example above:

$ go run defer_panic_1.go 
defer 2: before panic print
defer 1: before panic print
panic: panic error

goroutine 1 [running]:
main.deferCall()
        /home/work/workspace/defer/defer_panic_1.go:11 +0x68
main.main()
        /home/work/workspace/defer/defer_panic_1.go:16 +0x25
exit status 2

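The output shows that the defer statements before the panic run in LIFO order, while the defer statement after the panic never executes.

Deferred functions that recover

Example code:

// Demonstrates defer with recover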
package main

import (
 "fmt"
)

func deferCall() {
 defer func() { fmt.Println("defer 1: before panic print") }()
 defer func() {
  fmt.Println("defer 2: before panic print,recover")
  if err := recover(); err != nil {
   fmt.Println(err)
  }
 }()
 defer func() { fmt.Println("defer 3: before panic print") }()
 panic("panic error") // trigger defer out stack
 defer func() { fmt.Println("defer 4: after panic, never exec") }()
}
func main() {
 deferCall()
 fmt.Println("main exec ok")
}

Output of the example above:

$ go run defer_panic_2.go 
defer 3: before panic print
defer 2: before panic print,recover
panic error
defer 1: before panic print
main exec ok

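As in the previous case, the defer statements before the panic all run in LIFO order. The difference is that the deferred function in the second defer statement recovers the panic and handles it, so the remaining deferred function still runs, deferCall returns normally, and main continues as usual.

Calling recover() in a deferred function to catch a panic raised by panic() is a common error-handling pattern.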
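One common form of this pattern converts a panic into an ordinary error through a named result. The sketch below uses a hypothetical helper, safeDiv, and relies on the run-time panic that Go raises for integer division by zero.

```go
package main

import "fmt"

// safeDiv is a hypothetical helper that converts a run-time panic
// (here, integer division by zero) into an ordinary error value.
func safeDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			// q keeps its zero value because the return below never completed.
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(safeDiv(6, 3)) // prints "2 <nil>"
	fmt.Println(safeDiv(1, 0)) // q is 0 and err is non-nil
}
```

The named result err is what makes this work: the deferred closure assigns to it after the panic has unwound the function body.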

Deferred functions that panic

Example code:

// Demonstrates a deferred function that panics
package main

import (
 "fmt"
)

func main() {
 defer func() {
  if err := recover(); err != nil {
   fmt.Println("defer1 recover:", err)
  } else {
   fmt.Println("fatal")
  }
 }()
 defer func() {
  if err := recover(); err != nil {
   fmt.Println(err)
  }
  panic("defer panic")
 }()
 panic("main panic")
}

Output of the example above:

$ go run defer_panic_3.go 
main panic
defer1 recover: defer panic

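In this example, main raises main panic, which is recovered and handled by the second defer statement. That deferred function then raises a panic of its own, which is recovered and handled by the first defer statement.

Figure 2: A panic can propagate up the chain of deferred calls or be recovered

defer implementation

This section uses the source code to examine the design and implementation of defer. All Go source code quoted here is from Go 1.16.

How defer executes

Data structures

The defer data structure, runtime._defer: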

type _defer struct {
 siz     int32 // includes both arguments and results
 started bool
 heap    bool
 // openDefer indicates that this _defer is for a frame with open-coded
 // defers. We have only one defer record for the entire frame (which may
 // currently have 0, 1, or more defers active).
 openDefer bool
 sp        uintptr  // sp at time of defer
 pc        uintptr  // pc at time of defer
 fn        *funcval // can be nil for open-coded defers
 _panic    *_panic  // panic that is running defer
 link      *_defer

 // If openDefer is true, the fields below record values about the stack
 // frame and associated function that has the open-coded defer(s). sp
 // above will be the sp for the frame, and pc will be address of the
 // deferreturn call in the function.
 fd   unsafe.Pointer // funcdata for the function associated with the frame
 varp uintptr        // value of varp for the stack frame
 // framepc is the current pc associated with the stack frame. Together,
 // with sp above (which is the sp associated with the stack frame),
 // framepc/sp can be used as pc/sp pair to continue a stack trace via
 // gentraceback().
 framepc uintptr
}

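Meaning of the fields in _defer:

  1. siz: total size in bytes of the arguments and results. That space is allocated directly after the _defer record; the arguments are saved there at registration and copied to the caller's argument and result area when the deferred call runs
  2. started: whether the deferred call has started executing
  3. heap: added by the Go 1.13 optimization; whether the record is heap-allocated
  4. openDefer: whether this defer was optimized with open coding
  5. sp: the caller's stack pointer; it lets a function tell whether the defers it registered have all run
  6. pc: the return address of deferproc
  7. fn: the function passed to the defer keyword, that is, the deferred function; may be nil when open coding is enabled
  8. _panic: the panic that triggered the deferred call; may be nil. When set, it points to the current panic, meaning this defer is being run by that panic
  9. link: the linked-list field; points to the previously registered _defer

The remaining fields support open-coded defers. They make it possible to find deferred functions that were never registered on the list.

Figure 3: The defer structure

Execution mechanism

During intermediate code generation, gc.state.stmt handles the defer statements in a program. Depending on the conditions, it picks one of three mechanisms: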

// stmt converts the statement n to SSA and adds it to s.
func (s *state) stmt(n *Node) {
  // ...
 case ODEFER:
 // ...
  if s.hasOpenDefers {
   s.openDeferRecord(n.Left) // open-coded
  } else {
   d := callDefer // heap-allocated by default
   if n.Esc == EscNever {
    d = callDeferStack // stack-allocated
   }
   s.callResult(n.Left, d)
  }
  // ...
}

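Note: this is the Go 1.16 code. In Go 1.17 and later this code was refactored, but the logic is essentially the same; see ssagen.state.stmt.

Heap allocation, stack allocation, and open coding are the three ways of handling defer. Early versions of Go allocated the runtime._defer record on the heap, which performed poorly. Go 1.13 introduced stack-allocated records, cutting the overhead by about 30%, and Go 1.14 introduced open-coded defers, which make the overhead of the keyword almost negligible.

From gc.state.stmt we can see that:

  1. If open coding is enabled and the function qualifies (s.hasOpenDefers == true), openDeferRecord handles the defer with the open-coded implementation;
  2. If n.Esc == EscNever, the callKind is set to callDeferStack and callResult handles the defer with the stack implementation;
  3. Otherwise the default path applies: the callKind is callDefer and callResult handles the defer with the heap implementation.

The following sections cover the design and implementation of each of the three.

Heap implementation

Heap allocation is the earliest defer implementation in Go, and it was the only one through Go 1.12.

With this scheme the compiler calls gc.state.callResult, which in turn calls gc.state.call, so to the compiler a defer is also a function call.

gc.state.call generates intermediate code for all function and method calls. Its work includes:

  1. Getting the function name, closure pointer, code pointer, and receiver of the call;
  2. Getting the stack address and writing the function or method arguments to the stack;
  3. Calling gc.state.newValue1A and related functions to generate the intermediate code for the call;
  4. If the call is a defer, generating the corresponding ending code block separately;
  5. Getting the address of the return value and finishing the call.

// Calls the function n using the specified call type.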
// Returns the address of the return value (or nil if none).
func (s *state) call(n *Node, k callKind, returnResultAddr bool) *ssa.Value {
 // ...
 if k == callDeferStack { // stack-allocated path, covered later
  // ....
 } else {
  // ...
  // call target
  switch {
  case k == callDefer:
   aux := ssa.StaticAuxCall(deferproc, ACArgs, ACResults) // deferproc creates the defer record
   if testLateExpansion {
    call = s.newValue0A(ssa.OpStaticLECall, aux.LateExpansionResultType(), aux)
    call.AddArgs(callArgs...)
   } else {
    call = s.newValue1A(ssa.OpStaticCall, types.TypeMem, aux, s.mem())
   }
  // ....
  call.AuxInt = stksize // Call operations carry the argsize of the callee along with them
 }
 // ...
}

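As the code shows, defer is lowered much like other Go keywords, through gc.state.call.

The core idea:

  1. A CALL runtime.deferproc instruction is inserted where the defer statement appears;
  2. A CALL runtime.deferreturn instruction is inserted where the function returns;
  3. The goroutine's control structure keeps a list of defers. runtime.deferproc records the deferred expression in that list, and runtime.deferreturn pops entries off the list one by one and runs them;
  4. With several defers the call order is stack-like: the later a defer appears, the earlier it is called.

The compiler inserts the runtime.deferreturn call at the end of every function that uses defer in three steps:

  1. When gc.walkstmt meets an ODEFER node, it calls Curfn.Func.SetHasDefer(true) to set the hasdefer attribute of the current function;
  2. gc.buildssa executes s.hasdefer = fn.Func.HasDefer() to update the hasdefer field of state;
  3. gc.state.exit inserts the runtime.deferreturn call before the function returns, based on the hasdefer field of state.

// exit processes any code that needs to be generated just before returning.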
// It returns a BlockRet block that ends the control flow. Its control value
// will be set to the final memory state.
func (s *state) exit() *ssa.Block {
 if s.hasdefer {
  if s.hasOpenDefers { // open-coded path, covered later
  } else {
   s.rtcall(Deferreturn, true, nil)
  }
 }
}

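That covers what the compiler does with defer, namely how it rewrites the code. What happens at run time?

runtime.deferproc and runtime.deferreturn are functions in the runtime package, and they are the entry points of the run-time side of defer:

  • runtime.deferproc creates a new deferred call;
  • runtime.deferreturn runs all deferred calls when the function finishes.

Creating a deferred call

What runtime.deferproc does:

  1. Calls runtime.newdefer to create a new runtime._defer record;
  2. Inserts the new record into the _defer list of the runtime.g object;
  3. Sets its function pointer fn, program counter pc, and stack pointer sp, and copies the arguments into the adjacent memory;
  4. Finally returns through runtime.return0. runtime.return0 is the only function that does not trigger deferred calls, which avoids recursive calls to runtime.deferreturn.

// Create a new deferred function fn with siz bytes of arguments.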
// The compiler turns a defer statement into a call to this.
//go:nosplit
func deferproc(siz int32, fn *funcval) { // arguments of fn follow fn
 // get the current goroutine
 gp := getg()
 if gp.m.curg != gp {
  // go code on the system stack can't defer
  throw("defer on system stack")
 }

 // the arguments of fn are in a perilous state. The stack map
 // for deferproc does not describe them. So we can't let garbage
 // collection or stack copying trigger until we've copied them out
 // to somewhere safe. The memmove below does that.
 // Until the copy completes, we can only call nosplit routines.
 // get the caller's stack pointer
 sp := getcallersp()
 // locate the arguments by offset from fn
 argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
 callerpc := getcallerpc()
 d := newdefer(siz) // create a new _defer record
 if d._panic != nil {
  throw("deferproc: d.panic != nil after newdefer")
 }
 // Note: the record is inserted at the head of the _defer list, which is why defers run in reverse order
 d.link = gp._defer
 gp._defer = d
 d.fn = fn
 d.pc = callerpc
 d.sp = sp
 switch siz {
 case 0:
  // Do nothing.
 case sys.PtrSize:
  *(*uintptr)(deferArgs(d)) = *(*uintptr)(unsafe.Pointer(argp))
 default:
  memmove(deferArgs(d), unsafe.Pointer(argp), uintptr(siz))
 }

 // deferproc returns 0 normally.
 // a deferred func that stops a panic
 // makes the deferproc return 1.
 // the code the compiler generates always
 // checks the return value and jumps to the
 // end of the function if deferproc returns != 0.
 return0()
 // No code can go here - the C return register has
 // been set and must not be clobbered.
}

runtime.newdefer obtains a runtime._defer from three places, in order:

  1. It tries to take a batch of _defer records from the scheduler's pool, sched.deferpool, and appends them to the current P's pool;
  2. It then takes a free _defer record from the current P's pool, pp.deferpool;
  3. If pp.deferpool has no usable record, it allocates a new one on the heap with runtime.mallocgc.

// Allocate a Defer, usually using per-P pool.
// Each defer must be released with freedefer.  The defer is not
// added to any defer chain yet.
//
// This must not grow the stack because there may be a frame without
// stack map information when this is called.
//
//go:nosplit
func newdefer(siz int32) *_defer {
 var d *_defer
 sc := deferclass(uintptr(siz))
 gp := getg()
 if sc < uintptr(len(p{}.deferpool)) { // take from deferpool
  pp := gp.m.p.ptr()
  if len(pp.deferpool[sc]) == 0 && sched.deferpool[sc] != nil {
   // Take the slow path on the system stack so
   // we don't grow newdefer's stack.
   systemstack(func() {
    lock(&sched.deferlock)
    // first move a batch of _defer records from sched.deferpool into pp.deferpool
    for len(pp.deferpool[sc]) < cap(pp.deferpool[sc])/2 && sched.deferpool[sc] != nil {
     d := sched.deferpool[sc]
     sched.deferpool[sc] = d.link
     d.link = nil
     pp.deferpool[sc] = append(pp.deferpool[sc], d)
    }
    unlock(&sched.deferlock)
   })
  }
  // try to take a free _defer record from pp.deferpool
  if n := len(pp.deferpool[sc]); n > 0 {
   d = pp.deferpool[sc][n-1]
   pp.deferpool[sc][n-1] = nil
   pp.deferpool[sc] = pp.deferpool[sc][:n-1]
  }
 }
 // nothing available, so allocate one
 if d == nil {
  // Allocate new defer+args.
  systemstack(func() {
   total := roundupsize(totaldefersize(uintptr(siz)))
   d = (*_defer)(mallocgc(total, deferType, true))
  })
 }
 d.siz = siz
 d.heap = true
 return d
}

Note: the new runtime._defer record is inserted at the head of the _defer list of the runtime.g object, which is why deferred calls run in reverse order.
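The effect of head insertion can be reproduced with a toy model. The types deferRecord and goroutine below are simplified stand-ins invented for this sketch, not the runtime's real structures. They keep only the fn and link fields and ignore arguments, pools, and panics.

```go
package main

import "fmt"

// deferRecord is a simplified stand-in for runtime._defer.
type deferRecord struct {
	fn   func()
	link *deferRecord // previously registered record
}

// goroutine is a simplified stand-in for runtime.g.
type goroutine struct {
	deferHead *deferRecord
}

// push mimics the head insertion done by deferproc.
func (g *goroutine) push(fn func()) {
	g.deferHead = &deferRecord{fn: fn, link: g.deferHead}
}

// runAll mimics deferreturn: take the head record, unlink it, run it,
// and repeat until the list is empty.
func (g *goroutine) runAll() {
	for g.deferHead != nil {
		d := g.deferHead
		g.deferHead = d.link
		d.fn()
	}
}

// order registers A, B, C and returns the order in which they ran.
func order() []string {
	var out []string
	g := &goroutine{}
	for _, name := range []string{"A", "B", "C"} {
		name := name // per-iteration copy for the closure
		g.push(func() { out = append(out, name) })
	}
	g.runAll()
	return out
}

func main() {
	fmt.Println(order()) // prints "[C B A]"
}
```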

Running deferred calls

runtime.deferreturn takes the first runtime._defer off the current goroutine's _defer list and calls runtime.jmpdefer with the function and arguments to run:

// Run a deferred function if there is one.
// The compiler inserts a call to this at the end of any
// function which calls defer.
// If there is a deferred function, this will call runtime·jmpdefer,
// which will jump to the deferred function such that it appears
// to have been called by the caller of deferreturn at the point
// just before deferreturn was called. The effect is that deferreturn
// is called again and again until there are no more deferred functions.
//
// Declared as nosplit, because the function should not be preempted once we start
// modifying the caller's frame in order to reuse the frame to call the deferred
// function.
//
// The single argument isn't actually used - it just has its address
// taken so it can be matched against pending defers.
//go:nosplit
func deferreturn(arg0 uintptr) {
 gp := getg()
 d := gp._defer // take the first _defer record
 if d == nil {
  return
 }
 // ...
 // open-coded path
 if d.openDefer {
 }

 fn := d.fn
 d.fn = nil
 gp._defer = d.link // remove this record from the g._defer list
 freedefer(d) // release the _defer record
 // If the defer function pointer is nil, force the seg fault to happen
 // here rather than in jmpdefer. gentraceback() throws an error if it is
 // called with a callback on an LR architecture and jmpdefer is on the
 // stack, because the stack trace can be incorrect in that case - see
 // issue #8153).
 _ = fn.fn
 jmpdefer(fn, uintptr(unsafe.Pointer(&arg0))) // call jmpdefer
}

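runtime.jmpdefer is a runtime function written in assembly. Its job is to:

  1. Jump to the code of the deferred function;
  2. Jump back to runtime.deferreturn when that function finishes.

runtime.deferreturn keeps checking whether the current goroutine's _defer list still holds records that have not run, and it only returns once every deferred function has executed.

Stack implementation

The heap implementation has the following problems:

  1. defer information lives mainly on the heap, and copying arguments and results back and forth between heap and stack is slow;
  2. The defer records are chained into a linked list, and linked-list operations are slow too.

Go 1.13 optimized the implementation:

  1. It reduced heap allocation of defer information. The whole record is then registered on the defer list through runtime.deferprocStack:
    1. In the common case the defer information is stored in the local-variable area of the function's stack frame;
    2. Defers inside explicit or implicit loops still need the heap allocation used in Go 1.12.

func (s *state) call(n *Node, k callKind, returnResultAddr bool) *ssa.Value {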
 if k == callDeferStack {
  testLateExpansion = ssa.LateCallExpansionEnabledWithin(s.f)
  // Make a defer struct d on the stack.
  t := deferstruct(stksize)
  d := tempAt(n.Pos, s.curfn, t)

  s.vars[&memVar] = s.newValue1A(ssa.OpVarDef, types.TypeMem, d, s.mem())
  addr := s.addr(d)

  // Must match reflect.go:deferstruct and src/runtime/runtime2.go:_defer.
  // 0: siz
  s.store(types.Types[TUINT32],
   s.newValue1I(ssa.OpOffPtr, types.Types[TUINT32].PtrTo(), t.FieldOff(0), addr),
   s.constInt32(types.Types[TUINT32], int32(stksize)))
  // 1: started, set in deferprocStack
  // 2: heap, set in deferprocStack
  // 3: openDefer
  // 4: sp, set in deferprocStack
  // 5: pc, set in deferprocStack
  // 6: fn
  s.store(closure.Type,
   s.newValue1I(ssa.OpOffPtr, closure.Type.PtrTo(), t.FieldOff(6), addr),
   closure)
  // 7: panic, set in deferprocStack
  // 8: link, set in deferprocStack
  // 9: framepc
  // 10: varp
  // 11: fd

  // Then, store all the arguments of the defer call.
  ft := fn.Type
  off := t.FieldOff(12)
  args := n.Rlist.Slice()

  // Set receiver (for interface calls). Always a pointer.
  if rcvr != nil {
   p := s.newValue1I(ssa.OpOffPtr, ft.Recv().Type.PtrTo(), off, addr)
   s.store(types.Types[TUINTPTR], p, rcvr)
  }
  // Set receiver (for method calls).
  if n.Op == OCALLMETH {
   f := ft.Recv()
   s.storeArgWithBase(args[0], f.Type, addr, off+f.Offset)
   args = args[1:]
  }
  // Set other args.
  for _, f := range ft.Params().Fields().Slice() {
   s.storeArgWithBase(args[0], f.Type, addr, off+f.Offset)
   args = args[1:]
  }

  // Call runtime.deferprocStack with pointer to _defer record.
  ACArgs = append(ACArgs, ssa.Param{Type: types.Types[TUINTPTR], Offset: int32(Ctxt.FixedFrameSize())})
  aux := ssa.StaticAuxCall(deferprocStack, ACArgs, ACResults) // call deferprocStack
 // ...
 } else {
  // heap implementation
 }
 // logic shared by the stack and heap implementations
}

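Because the runtime._defer record is already laid out at compile time, runtime.deferprocStack only has to set the fields that were not initialized by the compiler before linking the stack-allocated record into the goroutine's _defer list.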

// deferprocStack queues a new deferred function with a defer record on the stack.
// The defer record must have its siz and fn fields initialized.
// All other fields can contain junk.
// The defer record must be immediately followed in memory by
// the arguments of the defer.
// Nosplit because the arguments on the stack won't be scanned
// until the defer record is spliced into the gp._defer list.
//go:nosplit
func deferprocStack(d *_defer) { // d is already a _defer record, so this only initializes it and links it into the current goroutine's _defer list
 gp := getg()
 if gp.m.curg != gp {
  // go code on the system stack can't defer
  throw("defer on system stack")
 }
 // siz and fn are already set.
 // The other fields are junk on entry to deferprocStack and
 // are initialized here.
 d.started = false
 d.heap = false
 d.openDefer = false
 d.sp = getcallersp()
 d.pc = getcallerpc()
 d.framepc = 0
 d.varp = 0

 *(*uintptr)(unsafe.Pointer(&d._panic)) = 0
 *(*uintptr)(unsafe.Pointer(&d.fd)) = 0
 // link the initialized _defer record into the current goroutine's _defer list
 *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
 *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d)) 

 return0()
}

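Stack-allocated and heap-allocated runtime._defer records do not differ in any essential way. Only the place of allocation differs, and the rest of the logic is shared, so this approach applies to the vast majority of cases.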
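Loops are the case the stack implementation does not cover. Which allocation strategy the compiler picks is not observable from an ordinary program, so the sketch below only shows the behavioral side, using two hypothetical functions: a defer inside a loop accumulates until the function returns, while wrapping the loop body in a function literal makes each deferred call run at the end of its iteration.

```go
package main

import "fmt"

// loopDefer registers one deferred call per iteration. None of them
// runs until loopDefer itself returns.
func loopDefer() (events []string) {
	for i := 0; i < 2; i++ {
		defer func(i int) {
			events = append(events, fmt.Sprintf("cleanup %d", i))
		}(i)
		events = append(events, fmt.Sprintf("work %d", i))
	}
	return events
}

// scopedDefer wraps each iteration in a function literal, so the
// deferred call runs at the end of that iteration.
func scopedDefer() (events []string) {
	for i := 0; i < 2; i++ {
		func() {
			defer func() {
				events = append(events, fmt.Sprintf("cleanup %d", i))
			}()
			events = append(events, fmt.Sprintf("work %d", i))
		}()
	}
	return events
}

func main() {
	fmt.Println(loopDefer())   // prints "[work 0 work 1 cleanup 1 cleanup 0]"
	fmt.Println(scopedDefer()) // prints "[work 0 cleanup 0 work 1 cleanup 1]"
}
```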

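Open-coded implementation

Go 1.14 optimized defer further:

  1. The compiler inserts code that expands the deferred calls inline in the enclosing function, which avoids creating a _defer record and registering it on the _defer list. This is called open-coded defer.
  2. As in 1.13, it does not apply to defers inside loops.
    1. Performance improves by almost an order of magnitude.
    2. If a panic occurs or runtime.Goexit() is called in a function with open-coded defers, the deferred functions that were never registered on the list cannot be reached through it, so the stack has to be scanned. Extra fields were added to the _defer structure so that those functions can be found.
    3. The net result is that defer became faster and panic became slower.

Enabling open coding

The Go 1.14 optimization is essentially inlining, so it comes with restrictions similar to those on inlined functions:

  1. The function has at most 8 defer statements;
  2. No defer statement in the function executes inside a loop;
  3. The product of the number of return statements and the number of defer statements is at most 15.

As gc.walkstmt shows, open coding is disabled when the function has more than 8 defer statements or a defer statement sits inside a for loop.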

// The max number of defers in a function using open-coded defers. We enforce this
// limit because the deferBits bitmask is currently a single byte (to minimize code size)
const maxOpenDefers = 8
// The result of walkstmt MUST be assigned back to n, e.g.
//  n.Left = walkstmt(n.Left)
func walkstmt(n *Node) *Node {
 // ...
 switch n.Op {
 // ...

 case ODEFER:
  Curfn.Func.SetHasDefer(true)
  Curfn.Func.numDefers++
  if Curfn.Func.numDefers > maxOpenDefers { // maxOpenDefers == 8: more than 8 defers disables open coding
   // Don't allow open-coded defers if there are more than
   // 8 defers in the function, since we use a single
   // byte to record active defers.
   Curfn.Func.SetOpenCodedDeferDisallowed(true)
  }
  if n.Esc != EscNever { // a defer inside a loop disables open coding too
   // If n.Esc is not EscNever, then this defer occurs in a loop,
   // so open-coded defers cannot be used in this function.
   Curfn.Func.SetOpenCodedDeferDisallowed(true)
  }
  fallthrough
  // ...
 return n
}

During SSA generation, gc.buildssa checks the remaining condition for open coding: the product of the number of return statements and the number of defers must not exceed 15.

// buildssa builds an SSA function for fn.
// worker indicates which of the backend workers is doing the processing.
func buildssa(fn *Node, worker int) *ssa.Func {
 // ...
 s.hasOpenDefers = Debug.N == 0 && s.hasdefer && !s.curfn.Func.OpenCodedDeferDisallowed()
 switch {
 case s.hasOpenDefers && (Ctxt.Flag_shared || Ctxt.Flag_dynlink) && thearch.LinkArch.Name == "386":
  // Don't support open-coded defers for 386 ONLY when using shared
  // libraries, because there is extra code (added by rewriteToUseGot())
  // preceding the deferreturn/ret code that is generated by gencallret()
  // that we don't track correctly.
  s.hasOpenDefers = false
 }
 if s.hasOpenDefers && s.curfn.Func.Exit.Len() > 0 {
  // Skip doing open defers if there is any extra exit code (likely
  // copying heap-allocated return values or race detection), since
  // we will not generate that code in the case of the extra
  // deferreturn/ret segment.
  s.hasOpenDefers = false
 }
 if s.hasOpenDefers &&
  s.curfn.Func.numReturns*s.curfn.Func.numDefers > 15 { // the product of returns and defers must not exceed 15
  // Since we are generating defer calls at every exit for
  // open-coded defers, skip doing open-coded defers if there are
  // too many returns (especially if there are multiple defers).
  // Open-coded defers are most important for improving performance
  // for smaller functions (which don't have many returns).
  s.hasOpenDefers = false
 }
 // ... 
}
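The three headline conditions can be condensed into a small predicate. This is only a model of the checks quoted above: funcShape and openCodedEligible are hypothetical names, and the real compiler also takes into account settings this sketch ignores (Debug.N, extra exit code, and the 386 shared-library case).

```go
package main

import "fmt"

// maxOpenDefers mirrors the compiler constant quoted in the article.
const maxOpenDefers = 8

// funcShape is a hypothetical summary of the properties the compiler inspects.
type funcShape struct {
	numDefers   int
	numReturns  int
	deferInLoop bool
}

// openCodedEligible models the three headline checks only.
func openCodedEligible(f funcShape) bool {
	if f.numDefers == 0 {
		return false // nothing to open-code
	}
	if f.numDefers > maxOpenDefers || f.deferInLoop {
		return false // walkstmt disallows open coding
	}
	return f.numReturns*f.numDefers <= 15 // buildssa check
}

func main() {
	fmt.Println(openCodedEligible(funcShape{numDefers: 3, numReturns: 5})) // prints "true"
	fmt.Println(openCodedEligible(funcShape{numDefers: 4, numReturns: 4})) // prints "false"
}
```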

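Setup

If s.hasOpenDefers == true after all the checks above, the open-coded implementation is used, and the following work is done:

Set deferBits. deferBits is a bitmask that records which defers need to run, playing a role similar to the _defer list in the heap and stack implementations.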

func buildssa(fn *Node, worker int) *ssa.Func {
 // ...
 if s.hasOpenDefers {
  // Create the deferBits variable and stack slot.  deferBits is a
  // bitmask showing which of the open-coded defers in this function
  // have been activated.
  deferBitsTemp := tempAt(src.NoXPos, s.curfn, types.Types[TUINT8])
  s.deferBitsTemp = deferBitsTemp
  // For this value, AuxInt is initialized to zero by default
  startDeferBits := s.entryNewValue0(ssa.OpConst8, types.Types[TUINT8])
  s.vars[&deferBitsVar] = startDeferBits
  s.deferBitsAddr = s.addr(deferBitsTemp)
  s.store(types.Types[TUINT8], s.deferBitsAddr, startDeferBits)
  // Make sure that the deferBits stack slot is kept alive (for use
  // by panics) and stores to deferBits are not eliminated, even if
  // all checking code on deferBits in the function exit can be
  // eliminated, because the defer statements were all
  // unconditional.
  s.vars[&memVar] = s.newValue1Apos(ssa.OpVarLive, types.TypeMem, deferBitsTemp, s.mem(), false)
 }
 // ...
}

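Each bit in deferBits indicates whether the corresponding defer statement needs to run. In the figure below, the second and third of the 8 bits have been set to 1 before the function returns, so the functions for those bits run before the return.

Figure 4: Golang deferBits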
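To make the bitmask idea concrete, here is a hand-written simulation of what the generated code does conceptually for a function with two conditional defers. It is a sketch of the idea, not the compiler's actual output; the function run and the strings it records are made up for this illustration.

```go
package main

import "fmt"

// run simulates a function with two conditional defers compiled in the
// open-coded style: reaching a defer sets its bit, and the exit path
// checks the bits from highest to lowest to preserve LIFO order.
func run(first, second bool) []string {
	var out []string
	var deferBits uint8 // one bit per defer statement

	if first {
		deferBits |= 1 << 0 // defer #0 was reached
	}
	if second {
		deferBits |= 1 << 1 // defer #1 was reached
	}
	out = append(out, "body")

	// Function exit: run the active defers, last registered first.
	if deferBits&(1<<1) != 0 {
		deferBits &^= 1 << 1 // clear the bit before the call
		out = append(out, "defer 1")
	}
	if deferBits&(1<<0) != 0 {
		deferBits &^= 1 << 0
		out = append(out, "defer 0")
	}
	return out
}

func main() {
	fmt.Println(run(true, true))  // prints "[body defer 1 defer 0]"
	fmt.Println(run(true, false)) // prints "[body defer 0]"
}
```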

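During intermediate code generation, gc.state.stmt calls gc.state.openDeferRecord to build a gc.openDeferInfo object. Its closure field holds the function to call, rcvr holds the method receiver, and argVals holds the function arguments.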

// Information about each open-coded defer.
type openDeferInfo struct {
 // The ODEFER node representing the function call of the defer
 n *Node
 // If defer call is closure call, the address of the argtmp where the
 // closure is stored.
 closure *ssa.Value
 // The node representing the argtmp where the closure is stored - used for
 // function, method, or interface call, to store a closure that panic
 // processing can use for this defer.
 closureNode *Node
 // If defer call is interface call, the address of the argtmp where the
 // receiver is stored
 rcvr *ssa.Value
 // The node representing the argtmp where the receiver is stored
 rcvrNode *Node
 // The addresses of the argtmps where the evaluated arguments of the defer
 // function call are stored.
 argVals []*ssa.Value
 // The nodes representing the argtmps where the args of the defer are stored
 argNodes []*Node
}

Building the gc.openDeferInfo object:

// openDeferRecord adds code to evaluate and store the args for an open-code defer
// call, and records info about the defer, so we can generate proper code on the
// exit paths. n is the sub-node of the defer node that is the actual function
// call. We will also record funcdata information on where the args are stored
// (as well as the deferBits variable), and this will enable us to run the proper
// defer calls during panics.
func (s *state) openDeferRecord(n *Node) {
 // ...
 opendefer := &openDeferInfo{
  n: n,
 }
 fn := n.Left
 if n.Op == OCALLFUNC {
  // We must always store the function value in a stack slot for the
  // runtime panic code to use. But in the defer exit code, we will
  // call the function directly if it is a static function.
  closureVal := s.expr(fn)
  closure := s.openDeferSave(nil, fn.Type, closureVal)
  opendefer.closureNode = closure.Aux.(*Node)
  if !(fn.Op == ONAME && fn.Class() == PFUNC) {
   opendefer.closure = closure
  }
 } else if n.Op == OCALLMETH {
  if fn.Op != ODOTMETH {
   Fatalf("OCALLMETH: n.Left not an ODOTMETH: %v", fn)
  }
  closureVal := s.getMethodClosure(fn)
  // We must always store the function value in a stack slot for the
  // runtime panic code to use. But in the defer exit code, we will
  // call the method directly.
  closure := s.openDeferSave(nil, fn.Type, closureVal)
  opendefer.closureNode = closure.Aux.(*Node)
 } else {
  if fn.Op != ODOTINTER {
   Fatalf("OCALLINTER: n.Left not an ODOTINTER: %v", fn.Op)
  }
  closure, rcvr := s.getClosureAndRcvr(fn)
  opendefer.closure = s.openDeferSave(nil, closure.Type, closure)
  // Important to get the receiver type correct, so it is recognized
  // as a pointer for GC purposes.
  opendefer.rcvr = s.openDeferSave(nil, fn.Type.Recv().Type, rcvr)
  opendefer.closureNode = opendefer.closure.Aux.(*Node)
  opendefer.rcvrNode = opendefer.rcvr.Aux.(*Node)
 }
 for _, argn := range n.Rlist.Slice() {
  var v *ssa.Value
  if canSSAType(argn.Type) {
   v = s.openDeferSave(nil, argn.Type, s.expr(argn))
  } else {
   v = s.openDeferSave(argn, argn.Type, nil)
  }
  args = append(args, v)
  argNodes = append(argNodes, v.Aux.(*Node))
 }
 opendefer.argVals = args
 opendefer.argNodes = argNodes
 index := len(s.openDefers)
 s.openDefers = append(s.openDefers, opendefer)

 // Update deferBits only after evaluation and storage to stack of
 // args/receiver/interface is successful.
 bitvalue := s.constInt8(types.Types[TUINT8], 1<<uint(index))
 newDeferBits := s.newValue2(ssa.OpOr8, types.Types[TUINT8], s.variable(&deferBitsVar, types.Types[TUINT8]), bitvalue)
 s.vars[&deferBitsVar] = newDeferBits
 s.store(types.Types[TUINT8], s.deferBitsAddr, newDeferBits)
}

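For many defer statements the compiler can tell at compile time whether they execute. When that is the case for the defers in a function, intermediate code generation goes through gc.state.exit, which calls gc.state.openDeferExit to emit the code that checks deferBits before the function returns.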

// exit processes any code that needs to be generated just before returning.
// It returns a BlockRet block that ends the control flow. Its control value
// will be set to the final memory state.
func (s *state) exit() *ssa.Block {
 if s.hasdefer {
  if s.hasOpenDefers {
   if shareDeferExits && s.lastDeferExit != nil && len(s.openDefers) == s.lastDeferCount {
    if s.curBlock.Kind != ssa.BlockPlain {
     panic("Block for an exit should be BlockPlain")
    }
    s.curBlock.AddEdgeTo(s.lastDeferExit)
    s.endBlock()
    return s.lastDeferFinalBlock
   }
   s.openDeferExit()
  } else {
   s.rtcall(Deferreturn, true, nil)
  }
 }
 // ...
}

Execution

When the program contains conditions that can only be decided at run time, runtime.deferreturn still has to decide at run time whether a deferred call runs:

func deferreturn(arg0 uintptr) {
 gp := getg()
 d := gp._defer
 if d.openDefer {
  done := runOpenDeferFrame(gp, d)
  if !done {
   throw("unfinished open-coded defers in deferreturn")
  }
  gp._defer = d.link
  freedefer(d)
  return
 }
}

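This function has a special path for open coding. The runtime calls runtime.runOpenDeferFrame to run the active open-coded deferred functions, which does the following:

  1. Reads deferBits, the number of defers in the function, and related information from the runtime._defer structure;
  2. Loops over the function addresses and arguments, using deferBits to decide whether each function needs to run;
  3. Calls runtime.reflectcallSave to invoke each deferred function that needs to run.

// runOpenDeferFrame runs the active open-coded defers in the frame specified by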
// d. It normally processes all active defers in the frame, but stops immediately
// if a defer does a successful recover. It returns true if there are no
// remaining defers to run in the frame.
func runOpenDeferFrame(gp *g, d *_defer) bool {
 done := true
 fd := d.fd

 // Skip the maxargsize
 _, fd = readvarintUnsafe(fd)
 deferBitsOffset, fd := readvarintUnsafe(fd)
 nDefers, fd := readvarintUnsafe(fd)
 deferBits := *(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset))) // read deferBits

 for i := int(nDefers) - 1; i >= 0; i-- { // walk the bits of deferBits
  // read the funcdata info for this defer
  if deferBits&(1<<i) == 0 {
   // skip defers that do not need to run
   continue
  }
  closure := *(**funcval)(unsafe.Pointer(d.varp - uintptr(closureOffset)))
  d.fn = closure
  // ...
  deferBits = deferBits &^ (1 << i)
  *(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset))) = deferBits
  p := d._panic
  reflectcallSave(p, unsafe.Pointer(closure), deferArgs, argWidth) // run the deferred function
  if p != nil && p.aborted {
   break
  }
  d.fn = nil
  // These args are just a copy, so can be cleared immediately
  memclrNoHeapPointers(deferArgs, uintptr(argWidth))
  if d._panic != nil && d._panic.recovered {
   done = deferBits == 0
   break
  }
 }

 return done
}

runtime.reflectcallSave ultimately runs the deferred function by calling runtime.reflectcall.

// reflectcallSave calls reflectcall after saving the caller's pc and sp in the
// panic record. This allows the runtime to return to the Goexit defer processing
// loop, in the unusual case where the Goexit may be bypassed by a successful
// recover.
func reflectcallSave(p *_panic, fn, arg unsafe.Pointer, argsize uint32) {
 if p != nil { // a panic is in progress
  p.argp = unsafe.Pointer(getargp(0))
  p.pc = getcallerpc()
  p.sp = unsafe.Pointer(getcallersp())
 }
 reflectcall(nil, fn, arg, argsize, argsize)
 if p != nil {
  p.pc = 0
  p.sp = unsafe.Pointer(nil)
 }
}

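If a panic occurs or runtime.Goexit is called in a function with open-coded defers, the deferred functions that were never registered on the list cannot be reached through it, so the stack has to be scanned. Extra fields were added to the _defer structure so that those functions can be found. The result is that defer became faster and panic became slower.

That completes the walk through the implementation of defer. The rest of the article looks at some use cases.

Use cases

Because defer plays a role similar to a C++ destructor, it is well suited to cleanup work.

Releasing resources

C++ uses RAII (Resource Acquisition Is Initialization) to make sure that an acquired resource is eventually released. The key is that the destructor is called automatically when an object's lifetime ends, so the release logic can live in the destructor. Go has no such mechanism, but defer can achieve the same thing and ensure that a resource is released when it is no longer needed.

// Closing a file with defer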
package main

import (
 "fmt"
 "os"
)

func main() {
 fileHandler, err := os.Open("./test.txt")
 if nil != err {
  panic(err)
 }
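 // No error, so schedule the close with defer
 defer func() {
  err := fileHandler.Close()
  if nil != err {
   fmt.Println("defer failed to close the file:", err)
  } else {
   fmt.Println("defer closed the file")
  }
 }()
}

Output of the example above (assuming ./test.txt exists):

defer closed the file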
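When the file is being written rather than read, the error from Close usually matters to the caller. A common variation, sketched here with the hypothetical helpers writeFile and demo, uses a named error result so the deferred closure can report a Close failure. It assumes Go 1.16 or later for os.MkdirTemp and os.ReadFile.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFile is a hypothetical helper. The deferred closure reports a
// Close error through the named result, unless an earlier error exists.
func writeFile(path string, data []byte) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer func() {
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	_, err = f.Write(data)
	return err
}

// demo writes to a temporary directory, reads the file back, and cleans up.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "defer-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir) // cleanup runs on every return path

	path := filepath.Join(dir, "test.txt")
	if err := writeFile(path, []byte("hello")); err != nil {
		return "", err
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	fmt.Println(demo())
}
```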

Reporting

For reporting and logging, defer saves a lot of code, especially when both the success and the failure path have to report or log.

// Logging with defer
package main

import (
 "fmt"
 "math/rand"
 "time"
)

func testDefer() {
 a, b := 0, 1
 defer func(a, b *int) {
  fmt.Printf("a=%d,b=%d\n", *a, *b)
 }(&a, &b)

 rand.Seed(time.Now().Unix())
 if rand.Int()%2 == 0 {
  a, b = 1, 2
 } else {
  return
 }
}
func main() {
 testDefer()
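}

Depending on which branch the random number selects, the program prints one of:

a=1,b=2
a=0,b=1

Putting the reporting or logging in a defer ensures it runs even when the function returns early, so nothing is missed.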
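For the case where success and failure must both be reported, a named error result keeps the reporting in one place. In this sketch, report and process are hypothetical names, and the report hook simply appends to a slice instead of calling a real reporting service.

```go
package main

import (
	"errors"
	"fmt"
)

// reports collects what the hypothetical report hook receives.
var reports []string

// report is a stand-in for a real reporting or logging call.
func report(op string, err error) {
	status := "ok"
	if err != nil {
		status = "fail: " + err.Error()
	}
	reports = append(reports, op+" "+status)
}

// process reports its outcome exactly once, on every return path.
// The closure reads err when it runs, so it sees the final value.
func process(n int) (err error) {
	defer func() { report("process", err) }()
	if n < 0 {
		return errors.New("negative input")
	}
	return nil
}

func main() {
	_ = process(1)
	_ = process(-1)
	fmt.Println(reports) // prints "[process ok process fail: negative input]"
}
```

Writing defer report("process", err) directly would not work here: because arguments are evaluated when the defer statement executes, err would always be nil at that point.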

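Function execution time

Suppose we need to measure how long a function takes. How would we do it in Go?

In other languages, one might do this:

  1. On the first line of the function, record the start time, start
  2. When the function ends, read the current time again, end; end-start is roughly the total execution time

In Go, defer handles this elegantly.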

func slowOperation() {
 defer trace("slowOperation")() // note the trailing parentheses: the function returned by trace is what gets deferred
 time.Sleep(10 * time.Second)
}

func trace(msg string) func() {
 start := time.Now()
 return func() { log.Printf("exit %s time_cost=%s", msg, time.Since(start)) }
}

Output of the example above:

2021/12/04 23:06:42 exit slowOperation time_cost=10.000143172s
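The snippet above omits its package clause and imports. A self-contained variation follows, with a much shorter sleep. Here timed, work, and elapsedAtLeastSleep are hypothetical names, and the elapsed time is stored in a named result instead of being logged so that it can be checked.

```go
package main

import (
	"fmt"
	"time"
)

// timed is a hypothetical variant of trace. It records the start time
// immediately and returns a function that stores the elapsed time.
func timed(elapsed *time.Duration) func() {
	start := time.Now()
	return func() { *elapsed = time.Since(start) }
}

// work sleeps for d and returns how long the whole call took.
func work(d time.Duration) (elapsed time.Duration) {
	defer timed(&elapsed)() // timed runs now; the returned func runs at exit
	time.Sleep(d)
	return 0 // overwritten by the deferred function
}

// elapsedAtLeastSleep reports whether the measured time covers the sleep.
func elapsedAtLeastSleep() bool {
	const d = 10 * time.Millisecond
	return work(d) >= d
}

func main() {
	fmt.Println(elapsedAtLeastSleep()) // prints "true"
}
```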

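Summary

As the walk-through shows, defer is implemented through cooperation between the compiler and the runtime. The implementation did not arrive in its current form in one step. It took several years and several releases:

  • Go 1.12 and earlier: heap allocation
    • The compiler turns the defer keyword into a runtime.deferproc call and inserts runtime.deferreturn before the function that uses defer returns;
    • At run time, runtime.deferproc inserts a new runtime._defer record at the head of the current g._defer list;
    • At run time, runtime.deferreturn takes runtime._defer records off the g._defer list and runs them one by one;
  • Go 1.13: stack allocation
    • When a defer statement executes at most once in the function body, gc.state.call allocates the record on the stack at compile time and calls runtime.deferprocStack;
  • Go 1.14: open coding
    • The compiler decides whether to enable open coding from the number of defer statements and return statements;
    • deferBits and gc.openDeferInfo store the information about the defer statements;
    • If the compiler can determine whether a defer executes, it inserts the corresponding code directly before the function returns; otherwise runtime.deferreturn handles it at run time.

Figure 5: Golang defer optimization path

The three mechanisms do not replace one another. Each is a specialization with stricter conditions than the last: performance keeps improving, while the range of cases each mechanism covers keeps shrinking.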

References

  • [1] https://go.dev/ref/spec#Defer_statements
  • [2] https://go.dev/doc/effective_go#defer
  • [3] Defer, Panic, and Recover
  • [4] defer
  • [5] https://segmentfault.com/a/1190000022112411
  • [6] https://zhuanlan.zhihu.com/p/56557423
  • [7] https://juejin.cn/post/7101887123539623944
  • [8] Introduction of stack-allocated defer: https://go-review.googlesource.com/c/go/+/171758
  • [9] Introduction of open-coded defer: https://go-review.googlesource.com/c/go/+/190098/6
  • [10] Effect of the optimization: https://github.com/golang/proposal/blob/master/design/34481-opencoded-defers.md
  • [11] https://www.topgoer.com/%E5%87%BD%E6%95%B0/%E5%BB%B6%E8%BF%9F%E8%B0%83%E7%94%A8defer.html
  • [12] https://www.luozhiyun.com/archives/523