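gVisor Container Runtime: Getting Started
Before looking at gVisor's architecture, here is a brief introduction to what a container runtime is.
0. Container Runtime
In the sense of the OCI Runtime Specification, a container runtime is a piece of software that creates, starts, runs, and manages the whole lifecycle of container processes on a host. runc, runsc, and kata-runtime are all implementations of the OCI runtime specification. They are usually called low-level container runtimes and are mainly responsible for creating and isolating container processes.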

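In real systems, these low-level runtimes are usually invoked and managed by a higher-level container runtime such as containerd or CRI-O. The high-level runtime pulls and unpacks images, manages the container lifecycle, and works with the low-level runtime through the OCI or CRI interfaces to start containers. Based on their implementation and isolation model, OCI container runtimes can be roughly divided into two groups: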
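- Classic low-level runtimes
| Name | Language | Features | Typical use |
|---|---|---|---|
| runc | Go + C | OCI reference implementation | Default for Docker and containerd |
| crun | C | Lightweight, fast, low memory footprint | Resource-constrained environments (embedded, IoT) |
| runV (deprecated, unmaintained) | Go + QEMU | Lightweight virtualization | Scenarios with strict isolation requirements |
runV is not a typical low-level OCI runtime. It was closer to an early experiment in wrapping a VM behind the container runtime interface, and it has since been abandoned.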
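- Isolation-enhanced runtimes
| Name | Core technology | Features | Typical use |
|---|---|---|---|
| runsc | Go + user-space kernel | Emulates Linux system calls in user space for stronger isolation | Multi-tenant secure containers |
| kata-runtime | QEMU + KVM | One lightweight VM per container | High-isolation scenarios |
| firecracker | Rust + KVM | Ultra-lightweight microVM monitor (a VMM rather than an OCI runtime by itself) | Serverless (AWS Lambda, Fargate) |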
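At run time, the docker-cli used by the operator goes through dockerd to the high-level runtime containerd, which then uses a containerd-shim to call a low-level runtime such as runc, runsc, or kata-runtime to create the container process. For example, with Docker 28.02 (details may differ between versions), the call chain for an ordinary container is: docker-cli -> dockerd -> containerd -> containerd-shim-runc-v2 -> runc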
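The shim name in that call chain is not arbitrary. A minimal sketch, assuming containerd's runtime v2 naming convention (the last two segments of the runtime type become the shim binary suffix); the function name `shimBinary` is hypothetical and only illustrates the mapping:

```go
package main

import (
	"fmt"
	"strings"
)

// shimBinary derives the shim executable name from a containerd runtime type.
// Under the runtime v2 convention the last two dot-separated segments are used:
// "io.containerd.runc.v2" becomes "containerd-shim-runc-v2".
func shimBinary(runtimeType string) (string, error) {
	parts := strings.Split(runtimeType, ".")
	if len(parts) < 2 || parts[len(parts)-2] == "" || parts[len(parts)-1] == "" {
		return "", fmt.Errorf("invalid runtime type %q", runtimeType)
	}
	return "containerd-shim-" + parts[len(parts)-2] + "-" + parts[len(parts)-1], nil
}

func main() {
	for _, rt := range []string{"io.containerd.runc.v2", "io.containerd.runsc.v1"} {
		bin, err := shimBinary(rt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", rt, bin)
	}
}
```

The same rule explains why a gVisor sandbox started through containerd goes through a runsc-specific shim instead of containerd-shim-runc-v2.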
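1. The gVisor Container Runtime
Compared with ordinary containers, gVisor's runsc is an isolation-enhanced container runtime that adds an isolation layer between the containerized application and the host system. As a user-space kernel, gVisor reimplements Linux kernel interfaces in Go, a memory-safe language, and services the system calls made by processes inside the container. Through runsc, its OCI-compliant runtime, gVisor integrates with Docker and Kubernetes so that applications run in a sandboxed container, and advanced users can also run it standalone.
Note: gVisor does not replicate the whole Linux kernel. It aims for a minimal usable kernel interface and implements only the subset of system calls that containers need. The remaining calls fail inside the sandbox; application system calls are not passed straight through to the host kernel, although the Sentry itself issues a restricted set of host system calls to do its work. Run runsc help syscalls for details.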
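Container isolation today mainly relies on the following approaches:
a. Machine-level virtualization, where a virtual machine monitor (VMM) built on KVM or Xen exposes virtualized hardware to a guest kernel
b. Rule-based enforcement, where seccomp, SELinux, and AppArmor let you define fine-grained security policies for an application or container
c. Intercepting the application's system calls and acting as the guest kernel
gVisor takes the third approach to isolate containerized applications from the host. Its main components are shown in the figure below: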

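The core component is the Sentry. As the user-space kernel, it intercepts system calls from processes in the container and handles the implemented ones itself, without forwarding them to the host. Unimplemented system calls fail inside the sandbox instead of reaching the host kernel, which makes the Sentry the heart of the isolation mechanism. The Gofer is a file system proxy process that mediates between the container and the host file system, so the container never accesses host files directly, which strengthens isolation further. gVisor presents applications with an environment equivalent to Linux kernel v4.4, and most applications run unmodified. Because gVisor does not fully implement every system call or every /proc and /sys interface, applications that depend on kernel details or hardware features may hit compatibility problems.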
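The Sentry's role can be pictured as a dispatch table that lives entirely in user space. The following is a toy sketch, not gVisor's actual code: the `sentry` type, the `dispatch` method, and the single `getpid` entry (number 39 on x86-64) are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSys stands in for the ENOSYS error returned for unimplemented calls.
var errNoSys = errors.New("ENOSYS: function not implemented")

// handler services one system call entirely inside the sandbox.
type handler func(args ...uint64) (uint64, error)

// sentry is a toy user-space kernel: it owns its own state (here just a PID)
// and a table of the system calls it implements.
type sentry struct {
	pid   uint64
	table map[int]handler
}

func newSentry() *sentry {
	s := &sentry{pid: 1}
	s.table = map[int]handler{
		// 39 is getpid on x86-64; the answer comes from sandbox state,
		// not from the host kernel.
		39: func(_ ...uint64) (uint64, error) { return s.pid, nil },
	}
	return s
}

// dispatch handles implemented calls locally and rejects everything else.
func (s *sentry) dispatch(sysno int, args ...uint64) (uint64, error) {
	h, ok := s.table[sysno]
	if !ok {
		return 0, errNoSys
	}
	return h(args...)
}

func main() {
	s := newSentry()
	pid, err := s.dispatch(39)
	fmt.Println("getpid:", pid, err)
	_, err = s.dispatch(9999)
	fmt.Println("unknown syscall:", err)
}
```

The point of the design is that the application only ever talks to the table; whatever host system calls the real Sentry needs are a separate, much smaller set.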
2. Installing and Configuring gVisor
sudo apt-get update && \
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg
curl -fsSL https://gvisor.dev/archive.key | sudo gpg --dearmor -o /usr/share/keyrings/gvisor-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/gvisor-archive-keyring.gpg] https://storage.googleapis.com/gvisor/releases release main" | sudo tee /etc/apt/sources.list.d/gvisor.list > /dev/null
sudo apt-get update && sudo apt-get install -y runsc
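Note: depending on your network environment, you may need a proxy to reach Google-hosted sites.
gVisor works out of the box, but for best performance or for production use you can follow the official guides to tune the platform, file I/O, networking, and memory management.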
- platform
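In gVisor, the platform is the concrete technology a sandbox is built on. A platform must implement system call interception, context switching, memory mapping, and related interfaces. There are several ways to implement them, so users usually weigh performance against hardware requirements when choosing a platform.
a. kvm: the Sentry uses the Linux kernel's KVM interface to isolate the container's execution environment from the host. This gives a security boundary similar to a lightweight VM, with the Sentry acting as both guest kernel and VMM. It performs best on bare metal without nested virtualization.
b. systrap: relies on seccomp's SECCOMP_RET_TRAP feature to intercept system calls. The kernel sends SIGSYS to the thread that triggered the system call, which hands control to gVisor to handle it. It replaced ptrace as the default in 2023 and outperforms kvm in nested environments.
c. ptrace: uses PTRACE_SYSEMU to execute user code without letting it issue host system calls directly. Because it depends only on ptrace, this platform runs almost anywhere ptrace is available, even inside VMs without nested virtualization. The context switch overhead is high, though, so applications that make frequent system calls can slow down significantly.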
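The trade-offs in the list above can be condensed into a small decision helper. This is a sketch of the heuristic described in the text, not gVisor's own selection logic; the `hostInfo` fields and `choosePlatform` are hypothetical names.

```go
package main

import "fmt"

// hostInfo captures the facts the platform choice depends on (illustrative only).
type hostInfo struct {
	hasKVM         bool // /dev/kvm is present and usable
	nested         bool // the host is itself a virtual machine
	hasSeccompTrap bool // seccomp with SECCOMP_RET_TRAP is available
}

// choosePlatform follows the guidance above: kvm on bare metal,
// systrap as the general default, ptrace as the last-resort fallback.
func choosePlatform(h hostInfo) string {
	switch {
	case h.hasKVM && !h.nested:
		return "kvm"
	case h.hasSeccompTrap:
		return "systrap"
	default:
		return "ptrace"
	}
}

func main() {
	fmt.Println(choosePlatform(hostInfo{hasKVM: true, hasSeccompTrap: true}))               // bare metal
	fmt.Println(choosePlatform(hostInfo{hasKVM: true, nested: true, hasSeccompTrap: true})) // cloud VM
	fmt.Println(choosePlatform(hostInfo{}))                                                 // minimal host
}
```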
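The platform is selected with runsc's --platform flag. Under Docker, pass it through runtimeArgs in /etc/docker/daemon.json, which also makes it the default for that runtime: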
{
"runtimes": {
"runsc": {
"path": "/usr/bin/runsc",
"runtimeArgs": [
"--platform=kvm"
]
}
}
}
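sudo systemctl restart docker
# The gVisor platform comes from runtimeArgs above. Docker's own --platform flag selects the image OS/architecture, not the sandbox platform.
docker run --runtime=runsc ....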
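Because the platform is fixed per registered runtime, one way to switch platforms per container is to register several runtimes that differ only in runtimeArgs. The sketch below generates such a daemon.json fragment; the runtime names `runsc-kvm` and `runsc-systrap` are assumptions chosen for illustration, and the runsc path is the one used in the configuration above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtime mirrors one entry under "runtimes" in Docker's daemon.json.
type runtime struct {
	Path        string   `json:"path"`
	RuntimeArgs []string `json:"runtimeArgs,omitempty"`
}

// daemonConfig holds only the part of daemon.json relevant here.
type daemonConfig struct {
	Runtimes map[string]runtime `json:"runtimes"`
}

// buildConfig registers one runsc runtime per requested platform.
func buildConfig(platforms ...string) daemonConfig {
	cfg := daemonConfig{Runtimes: map[string]runtime{}}
	for _, p := range platforms {
		cfg.Runtimes["runsc-"+p] = runtime{
			Path:        "/usr/bin/runsc",
			RuntimeArgs: []string{"--platform=" + p},
		}
	}
	return cfg
}

func main() {
	out, err := json.MarshalIndent(buildConfig("kvm", "systrap"), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

With such a file in place, a container would pick its platform through the runtime name, for example docker run --runtime=runsc-kvm.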
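- File I/O
Because gVisor is usually used as a general-purpose sandbox, it supports all I/O patterns by default, and file I/O is often the key factor in the performance of sandboxed processes. gVisor accesses the host file system through the Gofer proxy process. Each sandbox instance starts its own Gofer, which talks to the corresponding Sentry over the LISAFS protocol. By default, a file operation in the sandbox follows this chain: process -> Sentry -> Gofer -> file I/O
a. Use tmpfs as an overlay layer over the sandbox's whole file system. All changes land in tmpfs and the underlying files stay untouched. With the overlay configured, the Sentry performs file operations in the overlay itself and only uses the Gofer when disk I/O is actually needed.
b. The directfs option lets the Sentry operate on files directly through file descriptors provided by the Gofer instead of exchanging data over RPC on every operation. This improves performance while access to the host file system remains controlled.
c. --file-access=shared tells the Sentry that files may be changed from outside the sandbox, so it revalidates cached state instead of assuming exclusive access. Changes made by the host or by other instances sharing the same files become visible, at the cost of slower file access. Under Docker, writes still land in the container's writable layer, so the base image itself is not modified.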
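The overlay in option a. is a copy-on-write scheme. A toy model, with maps standing in for the image layer and the tmpfs layer (all names are hypothetical and nothing here is gVisor code):

```go
package main

import "fmt"

// overlay models a writable in-memory layer stacked on a read-only lower layer.
type overlay struct {
	lower   map[string]string // image contents, never modified
	upper   map[string]string // in-memory layer that receives all writes
	deleted map[string]bool   // whiteouts hiding files that exist in lower
}

func newOverlay(lower map[string]string) *overlay {
	return &overlay{lower: lower, upper: map[string]string{}, deleted: map[string]bool{}}
}

// Read prefers the upper layer and falls back to the lower one.
func (o *overlay) Read(path string) (string, bool) {
	if o.deleted[path] {
		return "", false
	}
	if v, ok := o.upper[path]; ok {
		return v, true
	}
	v, ok := o.lower[path]
	return v, ok
}

// Write only ever touches the upper layer.
func (o *overlay) Write(path, data string) {
	delete(o.deleted, path)
	o.upper[path] = data
}

// Remove hides the file without touching the lower layer.
func (o *overlay) Remove(path string) {
	delete(o.upper, path)
	o.deleted[path] = true
}

func main() {
	image := map[string]string{"/etc/hostname": "image"}
	fs := newOverlay(image)
	fs.Write("/etc/hostname", "sandbox")
	v, _ := fs.Read("/etc/hostname")
	fmt.Println("sandbox sees:", v, "| image still holds:", image["/etc/hostname"])
}
```

Reads that hit the upper layer never leave the Sentry, which is why the overlay reduces Gofer round trips.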
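- Networking
gVisor implements its own network stack. Its goal is to harden the network isolation boundary with the host, not to restrict what sandboxed applications can reach. Seen from outside the sandbox, there is a single AF_PACKET raw socket exchanging data with the host's network device. Seen from inside, gVisor's netstack maintains many fully functional, independent TCP connections.
In the sandbox's network stack, the goroutine attached to the underlying NIC receives packets. TCP packets are handed to a dedicated group of TCP goroutines for asynchronous processing for efficiency. Simpler packets such as UDP are processed to completion by the NIC goroutine itself, up to the point where the data sits in the socket queue for the application to read. The outbound path is the reverse: non-TCP packets are processed directly by the goroutine that made the call, while TCP packets are handled by dedicated goroutines that take them through TCP segmentation, IP routing, and finally the link layer, which writes them to the AF_PACKET socket.
Note: gVisor's network stack is designed as a standalone, reusable user-space stack that is easy to embed in other projects. Its public API is fairly stable, but backward compatibility is not guaranteed and it is not released as a versioned Go module.
The --network=host flag switches gVisor to hostinet mode, in which socket system calls are serviced by the host kernel's network stack. This trades isolation for near-native network performance.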
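The inbound split between inline and asynchronous processing described above can be sketched with a channel and a worker goroutine. This is a simplified illustration with made-up types (`packet`, `dispatch`), not netstack code; a single worker stands in for the TCP goroutine group.

```go
package main

import (
	"fmt"
	"sync"
)

// packet is a minimal stand-in for a received frame.
type packet struct {
	proto   string
	payload string
}

// dispatch plays the NIC goroutine: UDP-like packets are handled inline,
// TCP packets are queued for a dedicated worker goroutine.
func dispatch(pkts []packet) (inline, async []string) {
	tcpQueue := make(chan packet, len(pkts))
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for p := range tcpQueue {
			async = append(async, p.payload) // deferred TCP processing
		}
	}()
	for _, p := range pkts {
		if p.proto == "tcp" {
			tcpQueue <- p // hand off and keep receiving
			continue
		}
		inline = append(inline, p.payload) // processed by the receiving goroutine
	}
	close(tcpQueue)
	wg.Wait() // the worker has finished, so async is safe to read
	return inline, async
}

func main() {
	inline, async := dispatch([]packet{{"tcp", "syn"}, {"udp", "dns"}, {"tcp", "ack"}})
	fmt.Println("inline:", inline, "async:", async)
}
```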
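- Memory management
If the host's Linux kernel supports transparent huge pages (THP) and the feature is enabled for shared memory, memory access by the Sentry, its goroutines, and the applications in the sandbox may get faster (echo advise >/sys/kernel/mm/transparent_hugepage/shmem_enabled)
Beyond the basic installation and configuration covered above, the official documentation also explains in detail how to configure GPUs and TPUs under gVisor. Interested readers can find the details there.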
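Before changing the setting, it can help to check what the host currently uses. The kernel marks the active mode in that sysfs file with square brackets; the sketch below parses it. The helper name `activeTHPMode` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// activeTHPMode extracts the bracketed entry from a THP sysfs file,
// e.g. "always within_size [advise] never deny force" yields "advise".
func activeTHPMode(content string) string {
	for _, field := range strings.Fields(content) {
		if strings.HasPrefix(field, "[") && strings.HasSuffix(field, "]") {
			return strings.Trim(field, "[]")
		}
	}
	return ""
}

func main() {
	const path = "/sys/kernel/mm/transparent_hugepage/shmem_enabled"
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("cannot read", path+":", err)
		return
	}
	fmt.Println("shmem THP mode:", activeTHPMode(string(data)))
}
```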
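3. Basic runsc Commands
runsc subcommands fall into several groups: basic commands, debug commands, helper commands, internal commands, and metrics commands.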
root@peter-ThinkPad-Edge-E530:~# runsc --help
Usage: runsc <flags> <subcommand> <subcommand args>
runsc is the gVisor container runtime.
Functionality is provided by subcommands. For help with a specific subcommand,
use "runsc help <subcommand>".
Subcommands:
checkpoint checkpoint current state of container (experimental)
create create a secure container
delete delete resources held by a container
do Simplistic way to execute a command inside the sandbox. It's to be used for testing only.
events display container events such as OOM notifications, cpu, memory, and IO usage statistics
exec execute new process inside the container
flags describe all known top-level flags
help Print help documentation.
kill sends a signal to the container
list list containers started by runsc with the given root
pause pause suspends all processes in a container
port-forward port forward to a secure container
ps ps displays the processes running inside a container
restore restore a saved state of container (experimental)
resume Resume unpauses a paused container
run create and run a secure container
spec create a new OCI bundle specification file
start start a secure container
state get the state of a container
tar creates tar archives from container filesystems
wait wait on a process inside a container
Subcommands for debug:
debug shows a variety of debug information
read-control read a cgroups control value inside the container
statefile shows information about a statefile
symbolize Convert synthetic instruction pointers from kcov into positions in the runsc source code. Only used when Go coverage is enabled.
usage Usage shows application memory usage across various categories in bytes.
write-control write a cgroups control value inside the container
Subcommands for helpers:
cpu-features list CPU features supported on current machine
install adds a runtime to docker daemon configuration
mitigate mitigate mitigates the underlying system against side channel attacks
nvproxy shows information about nvproxy support
trace manages trace sessions for a given sandbox
uninstall removes a runtime from docker daemon configuration
Subcommands for internal use only:
boot launch a sandbox process
gofer launch a gofer process that proxies access to container files
umount umount the specified directory lazily when one byte is read from sync-fd
Subcommands for metrics:
export-metrics export metric data for the sandbox
metric-metadata export metric metadata of metrics registered in this build, in text proto format
metric-server implements Prometheus metrics HTTP endpoint
Additional help topics (Use "runsc help <topic>" to see help on the topic):
platforms Print a list of available platforms.
syscalls Print compatibility information for syscalls.
Use "runsc flags" for a list of top-level flags
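The lifecycle subcommands (create, start, state, kill, delete) follow the OCI runtime command-line model, so the JSON printed by the state subcommand can be consumed programmatically. The sketch below parses such a document and suggests which subcommands apply next; the sample JSON is made up for illustration, and `nextCommands` is a hypothetical helper reflecting the OCI lifecycle plus runsc's pause/resume.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// state mirrors the fields the OCI runtime spec requires in state output.
type state struct {
	OCIVersion string `json:"ociVersion"`
	ID         string `json:"id"`
	Status     string `json:"status"`
	Pid        int    `json:"pid"`
	Bundle     string `json:"bundle"`
}

// parseState decodes the JSON document printed by a state query.
func parseState(raw []byte) (state, error) {
	var s state
	err := json.Unmarshal(raw, &s)
	return s, err
}

// nextCommands lists subcommands that are meaningful for a given status.
func nextCommands(status string) []string {
	switch status {
	case "created":
		return []string{"start", "kill"}
	case "running":
		return []string{"exec", "ps", "pause", "kill"}
	case "paused":
		return []string{"resume"}
	case "stopped":
		return []string{"delete"}
	default:
		return nil
	}
}

func main() {
	// Illustrative sample only; real output comes from the state subcommand.
	sample := []byte(`{"ociVersion":"1.0.0","id":"demo","status":"running","pid":1234,"bundle":"/tmp/bundle"}`)
	s, err := parseState(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("%s is %s; next: %v\n", s.ID, s.Status, nextCommands(s.Status))
}
```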
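4. Process Monitoring
gVisor lets the host monitor the behavior of processes inside a sandbox. This capability is useful for threat detection, security auditing, behavior analysis, and similar scenarios. It differs from eBPF on native Linux: eBPF can attach kprobes dynamically at almost any location in the kernel while a program runs, whereas gVisor's monitoring is closer to static instrumentation in the style of tracepoints and USDT.
The following command lists the static trace points supported by the Sentry:
# Nearly all points support the full set of context fields:
# time|thread_id|task_start_time|group_id|thread_group_start_time|container_id|credentials|cwd|process_name
# so they are abbreviated as [......] below.
# Some of the syscall/sysno points are omitted as well.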
root@peter-Legion-Y9000P-IAH7H:/tmp/test# runsc trace metadata
POINTS (991)
Name: container/start, optional fields: [env], context fields: [......]
Name: sentry/clone, optional fields: [], context fields: [......]
Name: sentry/execve, optional fields: [binary_info|binary_sha256], context fields: [......]
Name: sentry/exit_notify_parent, optional fields: [], context fields: [time|thread_id|task_start_time|group_id|thread_group_start_time|container_id|credentials|process_name]
Name: sentry/task_exit, optional fields: [], context fields: [......]
Name: syscall/accept/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/accept/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/accept4/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/accept4/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/bind/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/bind/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/chdir/enter, optional fields: [], context fields: [......]
Name: syscall/chdir/exit, optional fields: [], context fields: [......]
Name: syscall/chroot/enter, optional fields: [], context fields: [......]
Name: syscall/chroot/exit, optional fields: [], context fields: [......]
Name: syscall/clone/enter, optional fields: [], context fields: [......]
Name: syscall/clone/exit, optional fields: [], context fields: [......]
Name: syscall/close/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/close/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/connect/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/connect/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/creat/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/creat/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/dup/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/dup/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/dup2/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/dup2/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/dup3/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/dup3/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/eventfd/enter, optional fields: [], context fields: [......]
Name: syscall/eventfd/exit, optional fields: [], context fields: [......]
Name: syscall/eventfd2/enter, optional fields: [], context fields: [......]
Name: syscall/eventfd2/exit, optional fields: [], context fields: [......]
Name: syscall/execve/enter, optional fields: [envv], context fields: [......]
Name: syscall/execve/exit, optional fields: [envv], context fields: [......]
Name: syscall/execveat/enter, optional fields: [fd_path|envv], context fields: [......]
Name: syscall/execveat/exit, optional fields: [fd_path|envv], context fields: [......]
Name: syscall/fchdir/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/fchdir/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/fcntl/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/fcntl/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/fork/enter, optional fields: [], context fields: [......]
Name: syscall/fork/exit, optional fields: [], context fields: [......]
Name: syscall/inotify_add_watch/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/inotify_add_watch/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/inotify_init/enter, optional fields: [], context fields: [......]
Name: syscall/inotify_init/exit, optional fields: [], context fields: [......]
Name: syscall/inotify_init1/enter, optional fields: [], context fields: [......]
Name: syscall/inotify_init1/exit, optional fields: [], context fields: [......]
Name: syscall/inotify_rm_watch/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/inotify_rm_watch/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/open/enter, optional fields: [], context fields: [......]
Name: syscall/open/exit, optional fields: [], context fields: [......]
Name: syscall/openat/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/openat/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/pipe/enter, optional fields: [], context fields: [......]
Name: syscall/pipe/exit, optional fields: [], context fields: [......]
Name: syscall/pipe2/enter, optional fields: [], context fields: [......]
Name: syscall/pipe2/exit, optional fields: [], context fields: [......]
Name: syscall/pread64/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/pread64/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/preadv/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/preadv/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/preadv2/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/preadv2/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/prlimit64/enter, optional fields: [], context fields: [......]
Name: syscall/prlimit64/exit, optional fields: [], context fields: [......]
Name: syscall/pwrite64/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/pwrite64/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/pwritev/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/pwritev/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/pwritev2/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/pwritev2/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/read/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/read/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/readv/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/readv/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/setgid/enter, optional fields: [], context fields: [......]
Name: syscall/setgid/exit, optional fields: [], context fields: [......]
Name: syscall/setresgid/enter, optional fields: [], context fields: [......]
Name: syscall/setresgid/exit, optional fields: [], context fields: [......]
Name: syscall/setresuid/enter, optional fields: [], context fields: [......]
Name: syscall/setresuid/exit, optional fields: [], context fields: [......]
Name: syscall/setsid/enter, optional fields: [], context fields: [......]
Name: syscall/setsid/exit, optional fields: [], context fields: [......]
Name: syscall/setuid/enter, optional fields: [], context fields: [......]
Name: syscall/setuid/exit, optional fields: [], context fields: [......]
Name: syscall/signalfd/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/signalfd/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/signalfd4/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/signalfd4/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/socket/enter, optional fields: [], context fields: [......]
Name: syscall/socket/exit, optional fields: [], context fields: [......]
Name: syscall/socketpair/enter, optional fields: [], context fields: [......]
Name: syscall/socketpair/exit, optional fields: [], context fields: [......]
Name: syscall/sysno/0/enter, optional fields: [], context fields: [......]
Name: syscall/sysno/0/exit, optional fields: [], context fields: [......]
.....
Name: syscall/sysno/441/enter, optional fields: [], context fields: [......]
Name: syscall/sysno/441/exit, optional fields: [], context fields: [......]
Name: syscall/timerfd_create/enter, optional fields: [], context fields: [......]
Name: syscall/timerfd_create/exit, optional fields: [], context fields: [......]
Name: syscall/timerfd_gettime/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/timerfd_gettime/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/timerfd_settime/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/timerfd_settime/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/vfork/enter, optional fields: [], context fields: [......]
Name: syscall/vfork/exit, optional fields: [], context fields: [......]
Name: syscall/write/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/write/exit, optional fields: [fd_path], context fields: [......]
Name: syscall/writev/enter, optional fields: [fd_path], context fields: [......]
Name: syscall/writev/exit, optional fields: [fd_path], context fields: [......]
SINKS (2)
Name: null
Name: remote
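The listing has a regular line format, so it is easy to turn into structured data, for example to check that a point exists before putting it into a session configuration. A sketch assuming the exact format shown above (the `point` type and `parsePoint` are hypothetical names):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// point is one trace point parsed from the metadata listing.
type point struct {
	Name     string
	Optional []string
	Context  []string
}

// pointRe matches lines such as
// "Name: syscall/read/enter, optional fields: [fd_path], context fields: [a|b]".
var pointRe = regexp.MustCompile(`^Name: ([^,]+), optional fields: \[([^\]]*)\], context fields: \[([^\]]*)\]$`)

// splitFields turns "a|b|c" into a slice; an empty list yields nil.
func splitFields(s string) []string {
	if s == "" {
		return nil
	}
	return strings.Split(s, "|")
}

// parsePoint returns false for lines that are not point entries
// (section headers, sink names, elisions).
func parsePoint(line string) (point, bool) {
	m := pointRe.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return point{}, false
	}
	return point{Name: m[1], Optional: splitFields(m[2]), Context: splitFields(m[3])}, true
}

func main() {
	lines := []string{
		"POINTS (991)",
		"Name: syscall/openat/enter, optional fields: [fd_path], context fields: [time|thread_id]",
		"Name: remote",
	}
	for _, l := range lines {
		if p, ok := parsePoint(l); ok {
			fmt.Printf("%s optional=%v context=%v\n", p.Name, p.Optional, p.Context)
		}
	}
}
```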
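In seccheck, gVisor's monitoring framework, the core concept is the trace session, which is described by a JSON configuration file. It determines which points are monitored, which fields are collected, and where the data is sent. The following configuration comes from the official documentation: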
{
"trace_session": {
"name": "Default",
"points": [
{
"name": "sentry/clone"
},
{
"name": "syscall/fork/enter",
"context_fields": [
"group_id",
"process_name"
]
},
{
"name": "syscall/fork/exit",
"context_fields": [
"group_id",
"process_name"
]
},
{
"name": "syscall/execve/enter",
"context_fields": [
"group_id",
"process_name"
]
},
{
"name": "syscall/sysno/35/enter",
"context_fields": [
"group_id",
"process_name"
]
},
{
"name": "syscall/sysno/35/exit"
}
],
"sinks": [
{
"name": "remote",
"config": {
"endpoint": "/tmp/gvisor_events.sock"
}
}
]
}
}
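Writing enter/exit pairs by hand gets repetitive once more system calls are traced. The sketch below generates a configuration with the same shape as the one above; the struct and function names are hypothetical, and only the JSON keys that appear in the configuration above are used.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type point struct {
	Name          string   `json:"name"`
	ContextFields []string `json:"context_fields,omitempty"`
}

type sink struct {
	Name   string            `json:"name"`
	Config map[string]string `json:"config,omitempty"`
}

type traceSession struct {
	Name   string  `json:"name"`
	Points []point `json:"points"`
	Sinks  []sink  `json:"sinks"`
}

type sessionFile struct {
	TraceSession traceSession `json:"trace_session"`
}

// syscallPoints returns the enter and exit points for one named syscall.
func syscallPoints(name string, ctx ...string) []point {
	return []point{
		{Name: "syscall/" + name + "/enter", ContextFields: ctx},
		{Name: "syscall/" + name + "/exit", ContextFields: ctx},
	}
}

// buildSession traces the given syscalls and sends events to a remote sink.
func buildSession(endpoint string, syscalls ...string) sessionFile {
	s := traceSession{
		Name:  "Default",
		Sinks: []sink{{Name: "remote", Config: map[string]string{"endpoint": endpoint}}},
	}
	for _, sc := range syscalls {
		s.Points = append(s.Points, syscallPoints(sc, "group_id", "process_name")...)
	}
	return sessionFile{TraceSession: s}
}

func main() {
	out, err := json.MarshalIndent(buildSession("/tmp/gvisor_events.sock", "fork", "execve"), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```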
docker run --rm --runtime=runsc -d bash -c "while true; do echo looping; sleep 5; done"
CID=dee0da1eafc6b15abffeed1abc6ca968c6d816252ae334435de6f3871fb05e61
sudo runsc --root /var/run/docker/runtime-runc/moby trace create --config session.json ${CID}
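In the example above, the trace session is created with runsc trace after the sandboxed container is already running, so trace points triggered during sandbox initialization may be missed. To capture the complete event stream, configure the trace session at sandbox startup with --pod-init-config. The sinks field in the configuration above sends sandbox events to the gvisor_events.sock socket on the host. The host only needs to run a user-space process that consumes and parses this event stream to implement the desired monitoring or analysis.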
package main

import (
	"fmt"
	"log"
	pbCommon "test/common" // local package providing Header, Handshake, and the point messages
	"unsafe"

	"golang.org/x/sys/unix"
	"google.golang.org/protobuf/proto"
)

const (
	socketPath     = "/tmp/gvisor_events.sock"
	magicNumber    = 0x47565352 // "GVSR"
	protoVersion   = 1
	maxPayloadSize = 64 * 1024 // 64KB
	headerSize     = pbCommon.HeaderStructSize
)

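func main() {
	fd, err := unix.Socket(unix.AF_UNIX, unix.SOCK_SEQPACKET, 0)
	if err != nil {
		log.Fatalf("failed to create socket: %v", err)
	}
	defer unix.Close(fd)

	// Remove a stale socket file from a previous run; otherwise Bind fails.
	if err := unix.Unlink(socketPath); err != nil && err != unix.ENOENT {
		log.Fatalf("failed to remove stale socket: %v", err)
	}
	addr := &unix.SockaddrUnix{Name: socketPath}
	if err := unix.Bind(fd, addr); err != nil {
		log.Fatalf("bind failed: %v", err)
	}
	if err := unix.Listen(fd, 10); err != nil {
		log.Fatalf("listen failed: %v", err)
	}
	fmt.Println("waiting for gVisor to connect...")

	// Each sandbox opens its own connection; handle them concurrently.
	for {
		nfd, sa, err := unix.Accept(fd)
		if err != nil {
			log.Printf("accept error: %v", err)
			continue
		}
		fmt.Printf("connection accepted: %+v\n", sa)
		go handleConn(nfd)
	}
}
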
func handleConn(cfd int) {
	defer unix.Close(cfd)
	if _, err := doHandshake(cfd); err != nil {
		log.Printf("handshake failed: %v", err)
		return
	}
	for {
		// SOCK_SEQPACKET preserves message boundaries: one Read, one message.
		buf := make([]byte, maxPayloadSize)
		n, err := unix.Read(cfd, buf)
		switch {
		case err != nil:
			log.Printf("read failed: %v", err)
			return
		case n == 0:
			log.Printf("connection closed")
			return
		case n < headerSize:
			log.Printf("read failed: message shorter than header")
			return
		}
		// The header is a fixed-layout struct at the start of every message.
		header := *(*pbCommon.Header)(unsafe.Pointer(&buf[0]))
		// The sandbox is untrusted: check the advertised header size before slicing.
		if int(header.HeaderSize) < headerSize || int(header.HeaderSize) > n {
			log.Printf("invalid header size %d in %d-byte message", header.HeaderSize, n)
			return
		}
		payload := buf[header.HeaderSize:n]
		mt := pbCommon.MessageType(header.MessageType)
		fmt.Printf("message received: type=%s, payload length=%d, dropped count=%d\n",
			pbCommon.MessageType_name[int32(mt)],
			len(payload),
			header.DroppedCount,
		)
		switch mt {
		case pbCommon.MessageType_MESSAGE_SYSCALL_FORK:
			var sf pbCommon.Fork
			if err := proto.Unmarshal(payload, &sf); err != nil {
				log.Printf("failed to parse Fork message: %v", err)
				continue
			}
			// Getters are nil-safe if the context data was not collected.
			ctx := sf.GetContextData()
			fmt.Printf("Fork message: Tid=%d, Name=%s, ContainerId=%s\n", ctx.GetThreadId(), ctx.GetProcessName(), ctx.GetContainerId())
		case pbCommon.MessageType_MESSAGE_SYSCALL_EXECVE:
			var se pbCommon.Execve
			if err := proto.Unmarshal(payload, &se); err != nil {
				log.Printf("failed to parse Execve message: %v", err)
				continue
			}
			fmt.Printf("Execve arguments: %+v\n", &se)
		}
	}
}

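// doHandshake exchanges Handshake messages with the sandbox and
// rejects peers that speak an older protocol version.
func doHandshake(fd int) (bool, error) {
	var peerHS pbCommon.Handshake
	data, err := proto.Marshal(&pbCommon.Handshake{
		Version: protoVersion,
	})
	if err != nil {
		return false, err
	}
	if _, err := unix.Write(fd, data); err != nil {
		return false, err
	}
	buf := make([]byte, maxPayloadSize)
	n, err := unix.Read(fd, buf)
	if err != nil {
		return false, err
	}
	if err := proto.Unmarshal(buf[:n], &peerHS); err != nil {
		return false, err
	}
	if peerHS.GetVersion() < protoVersion {
		return false, fmt.Errorf("unsupported peer version %d", peerHS.GetVersion())
	}
	return true, nil
}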
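In gVisor's security model, the Sentry's code is assumed to be potentially compromised and is therefore untrusted. The monitoring process has to create a UNIX socket of type SOCK_SEQPACKET at startup and wait for sandboxes with trace sessions enabled to connect and report the system calls of interest. Every message received from the socket must be validated. Each message carries a header, and its payload is encoded with Protocol Buffers, which can be deserialized safely with the standard Protocol Buffers library. The code above shows how to implement a simple monitoring service on the host in Go. gVisor also provides a tool named tracereplay which, according to the official documentation, can save and replay trace sessions.
gVisor is often used together with Falco, an open-source cloud-native runtime security monitoring tool. The official documentation at https://gvisor.dev/docs/tutorials/falco/ explains in detail how to configure it in Docker or Kubernetes environments, so it is not repeated here.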
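Since every message from the sandbox has to be validated, the header can also be decoded field by field instead of casting raw memory. The sketch below assumes an 8-byte little-endian layout (HeaderSize uint16, MessageType uint16, DroppedCount uint32), which matches the fields used by the monitoring code above but should be checked against the gVisor version in use; `parseMessage` is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerStructSize is the assumed size of the fixed header in bytes.
const headerStructSize = 8

// header mirrors the assumed wire layout of a remote sink message header.
type header struct {
	HeaderSize   uint16
	MessageType  uint16
	DroppedCount uint32
}

// parseMessage validates the header and returns it with the payload slice.
func parseMessage(buf []byte) (header, []byte, error) {
	if len(buf) < headerStructSize {
		return header{}, nil, errors.New("message shorter than header")
	}
	h := header{
		HeaderSize:   binary.LittleEndian.Uint16(buf[0:2]),
		MessageType:  binary.LittleEndian.Uint16(buf[2:4]),
		DroppedCount: binary.LittleEndian.Uint32(buf[4:8]),
	}
	// Never trust the advertised size: it must cover the fixed header
	// and must not point past the end of the message.
	if int(h.HeaderSize) < headerStructSize || int(h.HeaderSize) > len(buf) {
		return header{}, nil, fmt.Errorf("invalid header size %d for %d-byte message", h.HeaderSize, len(buf))
	}
	return h, buf[h.HeaderSize:], nil
}

func main() {
	msg := []byte{8, 0, 3, 0, 2, 0, 0, 0, 'h', 'i'}
	h, payload, err := parseMessage(msg)
	fmt.Println(h, string(payload), err)
}
```

Decoding explicitly avoids the unsafe package and makes the byte order assumption visible in the code.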