Docker, containerd, runC, and docker-shim

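Docker integrates a whole stack of virtualization technology, wrapping everything from the front end down to the lowest layer. The architecture diagram on the official site is drawn from a usage point of view, but it is still a useful reference: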

Docker Architecture Diagram

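That is, the docker command-line client (called dockerctl in the rest of this post) connects to dockerd, and dockerd in turn operates on containers and images. Images can be built locally or pulled from a registry.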

The technical architecture looks more like this:

graph TB
    dc[dockerctl] --> dd[dockerd]
    dd -- gRPC --> cd[containerd]
    cd --> ds1[shim]
    cd --> ds2[shim]
    cd --> ds3[shim]
    ds1 --> rc1[runC]
    ds2 --> rc2[runC]
    ds3 --> rc3[runC]
    rc1 --> c1[container]
    rc2 --> c2[container]
    rc3 --> c3[container]

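dockerctl, that is, the docker CLI, is the user-facing command-line client. Common commands such as docker ps and docker images are received by the CLI, which then calls dockerd to handle them. dockerd is the upper-level API layer for container operations; it passes requests down to containerd over gRPC.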
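To make the CLI-to-dockerd hop concrete, here is a minimal sketch in Python of what a client might do for the two commands mentioned above. It assumes dockerd is listening on its default Unix socket at /var/run/docker.sock and uses the Engine API endpoints GET /containers/json and GET /images/json. The names `CLI_TO_API`, `UnixHTTPConnection`, and `call_dockerd` are made up for this illustration; the real CLI is written in Go and does considerably more.

```python
import http.client
import socket

# A few CLI commands and the Docker Engine API requests they correspond to.
CLI_TO_API = {
    "docker ps": ("GET", "/containers/json"),
    "docker images": ("GET", "/images/json"),
}


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        # The host name is only used for the Host header; no TCP is involved.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def call_dockerd(command, socket_path="/var/run/docker.sock"):
    """Send the API request for `command` to dockerd; return (status, body)."""
    method, path = CLI_TO_API[command]
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

On a host where dockerd is running and the socket is readable, `call_dockerd("docker ps")` should return the HTTP status together with a JSON array describing the running containers.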

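containerd emerged from the standardization of container technology and is compatible with the OCI (Open Container Initiative) standards, so in principle it does not depend on dockerd and can run on its own. Its main responsibilities are image management (images and their metadata) and container execution. Upward, it exposes a gRPC interface to dockerd; downward, it works with runC through docker-containerd-shim, which lets the engine be upgraded independently. containerd is responsible for the container runtime and lifecycle, along with functions such as image building, volume management, and logging. Each of these is handled by a separate functional module: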

  • Executor: The executor implements the actual container runtime.
  • Supervisor: The supervisor monitors and reports container state.
  • Metadata: Stores metadata in a graph database. Used to store any persistent references to images and bundles. Data entered into the database will have schemas coordinated between components to provide access to arbitrary data. Other functionality includes hooks for garbage collection of on-disk resources.
  • Content: Provides access to content addressable storage. All immutable content will be stored here, keyed by content hash.
  • Snapshot: Manages filesystem snapshots for container images. This is analogous to the graphdriver in Docker today. Layers are unpacked into snapshots.
  • Events: Supports the collection and consumption of events for providing consistent, event driven behavior and auditing. Events may be replayed to various modules.
  • Metrics: Each component will export several metrics, accessible via the metrics API. (We may want to promote this to a subsystem.)
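The Content module's idea of "content addressable storage, keyed by content hash" is easy to show in a few lines. The following is only an in-memory sketch in Python, not containerd's implementation; the class name `ContentStore` and the choice of SHA-256 with a `sha256:` prefix are assumptions made for the example.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their own hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical data is stored once.
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        # Raises KeyError if no blob with this digest has been stored.
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)
```

Because the key is a function of the bytes, stored content is effectively immutable: changing the data would change its key. This is also why two images that share a layer only need one copy of that layer.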

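Every container has its own docker-containerd-shim process. The shim needs three arguments to call the runC API and create the container: the container ID, the container directory, and the runtime binary. The container directory contains: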

  • config.json: container configuration
  • init-stderr: standard error
  • init-stdin: standard input
  • init-stdout: standard output
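As a rough illustration of the three arguments and the directory layout just described, the sketch below builds a shim command line and checks a container directory for the four expected entries. Everything here is hypothetical scaffolding: the helper names, the argument order, the demo container ID, and the use of ordinary empty files as stand-ins for the stdio entries are assumptions for the example, not the shim's actual code.

```python
import os
import tempfile

# Entries the container directory is expected to hold, per the list above.
BUNDLE_FILES = ("config.json", "init-stderr", "init-stdin", "init-stdout")


def shim_argv(container_id, bundle_dir, runtime="docker-runc"):
    """Build the argument list: container ID, directory, runtime binary."""
    return ["docker-containerd-shim", container_id, bundle_dir, runtime]


def missing_bundle_files(bundle_dir):
    """Return the expected entries that are absent from bundle_dir."""
    return [name for name in BUNDLE_FILES
            if not os.path.exists(os.path.join(bundle_dir, name))]


def make_demo_bundle(parent_dir, container_id):
    """Create a stand-in container directory using plain files."""
    bundle_dir = os.path.join(parent_dir, container_id)
    os.makedirs(bundle_dir)
    for name in BUNDLE_FILES:
        with open(os.path.join(bundle_dir, name), "w") as handle:
            if name == "config.json":
                handle.write("{}")  # placeholder, not a valid runtime config
    return bundle_dir
```

For example, after `bundle = make_demo_bundle(tempfile.mkdtemp(), "demo123")`, calling `missing_bundle_files(bundle)` returns an empty list, and `shim_argv("demo123", bundle)` gives the command line a supervisor would hand to the operating system.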

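runC is an implementation that Docker provides according to the OCF (Open Container Format). It handles starting and stopping containers, resource isolation, and so on. Docker ships with docker-runc built in by default, and a different runC-compatible implementation can be registered with the --add-runtime option when dockerd is started.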
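Registering an alternative runtime can be sketched as follows. The Python helpers below only assemble the dockerd arguments and the equivalent "runtimes" entry for daemon.json; the runtime name `custom` and the path `/usr/local/bin/my-runc` are placeholders invented for the example, and the helper names are hypothetical.

```python
import json


def dockerd_args(name, path):
    """Arguments that start dockerd with one extra runtime registered."""
    return ["dockerd", "--add-runtime", f"{name}={path}"]


def daemon_json_runtimes(name, path):
    """Equivalent "runtimes" section for dockerd's daemon.json."""
    return {"runtimes": {name: {"path": path}}}


def run_with_runtime(name, image):
    """docker run arguments that select the registered runtime."""
    return ["docker", "run", f"--runtime={name}", image]


if __name__ == "__main__":
    # Placeholder runtime name and binary path, for illustration only.
    print(" ".join(dockerd_args("custom", "/usr/local/bin/my-runc")))
    print(json.dumps(daemon_json_runtimes("custom", "/usr/local/bin/my-runc"), indent=2))
```

Once a runtime is registered either way, an individual container can opt into it with `docker run --runtime=custom ...`, while other containers keep using the default.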