Internet Finance Applications on Xinchuang: Exploration and Practice

2021-08-12 Docker Go GitLab Xinchuang

Overview

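What is Xinchuang?

The term "Xinchuang" (信创) comes from the "Information Technology Application Innovation Working Committee" (信息技术应用创新工作委员会). The committee was founded on March 4, 2016, as a non-profit social organization initiated by enterprises and institutions engaged in research, application, and services for key IT hardware and software technologies.

The Xinchuang industry is the information technology application innovation industry. It is being promoted because, in the past, most of China's underlying IT standards, architectures, products, and ecosystems were defined by foreign commercial IT companies, which carries the risk of restrictions on underlying technology, information security, and the way data is stored.

The global IT ecosystem is expected to shift from the "unipolar" pattern of the past to a "bipolar" one in the future, and China needs to gradually build its own underlying IT architecture and standards. The IT industry ecosystem built on that home-grown architecture and those standards is the core of what the Xinchuang industry means.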

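What does this have to do with us?

In line with the requirements of xxxxx's "xxxxx" (Document [2021] No. 221), xxxxx decided to set up a Xinchuang leadership group and an execution group, and xx and xx were selected as the first batch of pilot projects.

What do we need to do?

Put simply, our long-term goal is full localization, that is, running entirely on domestic technology. Frankly, in my personal view that goal is very hard and very challenging. And the short-term goal? Running our applications on domestic-CPU servers and on domestic databases.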

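Domestic-CPU servers

For the application side, moving to domestic-CPU servers means deploying our programs on servers with domestic CPUs. At present all of our internet finance applications are deployed on the company's microservice platform, Eagle.

At the hardware level, the domestic servers are purchased, deployed, and managed centrally by Exxxe and the company's infrastructure operations team. The application side only needs to release applications to the corresponding Xinchuang cluster. So which options does Exxxe currently offer?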

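Operating system	OS vendor	CPU model	CPU vendor	CPU instruction set	Source of CPU architecture	Ready?
Kylin V10 (银河麒麟)	KylinSoft	Kunpeng 920	Huawei	ARM	Instruction set license	Yes (cluster label armcs1)
Kylin V10 (银河麒麟)	KylinSoft	Phytium (飞腾)	Tianjin Phytium	ARM	Instruction set license	No
Kylin V10 (银河麒麟)	KylinSoft	Hygon (海光)	Tianjin Hygon	x86 (AMD)	IP license	No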

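Exxxe's primary recommendation is the "Kylin + Kunpeng" option. To hedge against server delivery risk, "Kylin + Phytium" was added as a backup; to hedge against the adaptation risk of migrating business systems to the ARM platform, "Kylin + Hygon" was added as a further backup.

The options Xinchuang promotes as primary and backup are the more self-reliant ones based on the ARM instruction set. They also require the most changes on the application side, so they are the ones we focus on.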

ARM VS x86

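The most fundamental difference between ARM and x86 is the instruction set: ARM uses a RISC (reduced instruction set) design, while x86 uses a CISC (complex instruction set) design.

For example, when we write an application in Go, the source code is compiled into machine code for the instruction set architecture of the target platform. Machine code for different platforms is not compatible, so we have to compile specifically for the target platform.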
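To make the point about platform-specific binaries concrete, here is a minimal sketch (not part of the original project) that prints the platform a Go binary was compiled for. The helper name `targetPlatform` is made up for illustration; `runtime.GOOS` and `runtime.GOARCH` are standard library constants fixed at compile time.

```go
package main

import (
	"fmt"
	"runtime"
)

// targetPlatform reports the OS/architecture pair this binary was compiled
// for, in the "os/arch" form Docker also uses (for example "linux/arm64").
func targetPlatform() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	// The values are baked in at build time, so the same source prints
	// different results depending on the GOOS/GOARCH used for the build.
	fmt.Println("compiled for:", targetPlatform())
}
```

Building the same file for two targets yields two binaries that report different platforms, and neither binary will run on the other CPU architecture without emulation.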

ARM	X86
Uses Reduced Instruction Set computing Architecture (RISC).	Uses Complex Instruction Set computing Architecture (CISC).
Executes single instruction per cycle.	Executes complex instruction at a time, and it takes more than a cycle.
Optimization of performance with Software focused approach.	Hardware approach to optimize performance.
Has more general-purpose registers; code tends to take more memory.	Has fewer general-purpose registers; code tends to be denser.
Pipelining of instructions is a unique feature.	Less pipelined.
Faster Execution of Instructions reduces time.	Time to execute is more.
Complex addressing is managed by software.	Inherently designed to handle complex addresses.
Compiler plays a key role in managing operations.	The micro program does the trick.
Multiple Instructions are generated from a complex one and executed individually.	Its Architecture is capable of managing complex statement execution at a time.
Managing code expansion is difficult.	Code expansion is managed easily.
Decoding of instruction is handled easily.	Decoding is handled in a complex way.
Uses available memory for calculations.	Needs supplement memory for calculations.
Deployed in mobile devices, where size, power consumption, and speed matter.	Deployed in servers, desktops, and laptops, where high performance and stability matter.

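Domestic databases

Moving to domestic databases means migrating the databases our applications use to domestic products. The company's domestic database plan has not been fully finalized yet; proofs of concept (POCs) have been run on solutions from the following vendors:

Relational databases: TiDB, OceanBase, Dameng (达梦), GoldenDB, Shenzhou General (神州通用), PolarDB

Non-relational databases: none yet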

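How do we do it?

Because the domestic databases are still in the POC stage, and there is no plan yet to support non-relational databases such as MongoDB, for now we will mainly talk about our practice with domestic-CPU servers.

As mentioned above, the domestic-CPU servers our company favors use the ARM architecture, while our applications and artifacts are all built for x86. So our main tasks are:

Compile the application's executables for the ARM architecture
Update the application's dynamic dependency libraries to the ARM architecture
Update the application's base image or runtime image to the ARM architecture

In essence, then, what we have to do is a migration from x86 to ARM.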

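Building ARM applications

Because programming languages differ, the way to build an ARM application differs too. Below is a brief overview for a few commonly used languages.

For compiled, statically typed languages such as Go, C, and C++, there are two options:

Compile natively on an ARM machine to produce an ARM executable
Cross-compile on an x86_64 machine to produce an ARM executable

The simplest way to build ARM artifacts is the first option above, native compilation on an ARM machine. But we have not yet obtained any ARM machines, so for now cross-compilation is the only route.

For Go, cross-compilation is very simple: the official toolchain already supports it, and we only need to adjust the build command slightly, as follows: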

GOOS=linux GOARCH=arm64 go build -o app main.go
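If the build is driven from a Go tool rather than a shell, the same cross-compilation can be expressed by setting the environment of a `go build` child process. The following is only a sketch: the helper `crossBuildCmd` and its parameters are hypothetical, and adding `CGO_ENABLED=0` is my assumption (it keeps the build pure Go so that no ARM C toolchain is needed), not something the original command sets.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// crossBuildCmd prepares (but does not run) a `go build` invocation for the
// given target platform. The name and parameters are illustrative only.
func crossBuildCmd(goos, goarch, output, pkg string) *exec.Cmd {
	cmd := exec.Command("go", "build", "-o", output, pkg)
	// Inherit the current environment and override the target platform.
	// When a key appears twice in Env, the last value wins.
	// CGO_ENABLED=0 is an assumption: it avoids needing a C cross-compiler.
	cmd.Env = append(os.Environ(),
		"GOOS="+goos,
		"GOARCH="+goarch,
		"CGO_ENABLED=0",
	)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	cmd := crossBuildCmd("linux", "arm64", "app", "main.go")
	fmt.Println("would run:", cmd.Args)
	// Call cmd.Run() to actually build; it is left out here so the sketch
	// has no side effects.
}
```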

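For Node.js, if your code and dependencies are pure JavaScript and involve no C++ compilation, you only need to run the program on an ARM build of the runtime. If C++ compilation is involved, however, Node.js offers no cross-compilation support, and you still have to compile on an ARM machine.

For Java, the situation is similar to Node.js: use an ARM build of the JVM. The JVM provides a layer of abstraction that hides the differences in the underlying CPU architecture, unless native calls are involved.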

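Building ARM images

Building ARM images also comes down to two options:

Install the arm64 version of Docker Engine on an ARM machine and build natively to produce an ARM image
Install the x86_64 version of Docker Engine on an x86_64 machine and cross-build to produce an ARM image

Since we have no ARM machine, we continue in HARD mode.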

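Multi-CPU-architecture images

Before taking on that challenge, we need a quick look at how Docker images support multiple CPU architectures.

Docker images support multiple architectures, meaning a single image can contain several sub-images for different CPU architectures. When we pull or run an image, Docker automatically picks the sub-image that matches the current environment.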
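The selection step can be sketched in Go. A multi-architecture image is described by a manifest list (an OCI image index) whose entries each carry a `platform` with `os` and `architecture` fields; the client picks the entry that matches. The code below is a simplified illustration, not Docker's actual implementation: the function `selectDigest` is hypothetical, and the sample JSON uses placeholder digests rather than real alpine ones.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform and manifest mirror only the fields of an image index that
// matter for platform selection.
type platform struct {
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
}

type manifest struct {
	Digest   string   `json:"digest"`
	Platform platform `json:"platform"`
}

type imageIndex struct {
	Manifests []manifest `json:"manifests"`
}

// selectDigest returns the digest of the first sub-image whose platform
// matches the requested OS and architecture.
func selectDigest(indexJSON []byte, goos, goarch string) (string, error) {
	var idx imageIndex
	if err := json.Unmarshal(indexJSON, &idx); err != nil {
		return "", err
	}
	for _, m := range idx.Manifests {
		if m.Platform.OS == goos && m.Platform.Architecture == goarch {
			return m.Digest, nil
		}
	}
	return "", fmt.Errorf("no sub-image for %s/%s", goos, goarch)
}

// sampleIndex is made-up data; the digests are placeholders.
const sampleIndex = `{"manifests":[
 {"digest":"sha256:aaa","platform":{"os":"linux","architecture":"amd64"}},
 {"digest":"sha256:bbb","platform":{"os":"linux","architecture":"arm64"}}]}`

func main() {
	digest, err := selectDigest([]byte(sampleIndex), "linux", "arm64")
	fmt.Println(digest, err) // prints "sha256:bbb <nil>"
}
```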

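For example, we can look up the alpine:latest image on Docker Hub.

The image contains sub-images for multiple OS/ARCH combinations. Locally (on macOS) we run the following commands:

docker pull alpine:latest
docker inspect alpine:latest

The output contains two key fields, Os: linux and Architecture: amd64. I did not specify a platform, so Docker pulled the variant that matches my environment.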

Building cross-platform images with buildx

Docker buildx is a Docker CLI build plugin provided by Moby/BuildKit. buildx can use QEMU as an emulator to build or run Docker images for multiple CPU architectures such as ARM and x86_64.

Familiar UI from docker build
Full BuildKit capabilities with container driver
Multiple builder instance support
Multi-node builds for cross-platform images
Compose build support
High-level build constructs (bake)
In-container driver support (both Docker and Kubernetes)

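QEMU itself is not the focus of this article, so we will not go into detail. In short, QEMU is an emulator implemented purely in software that can emulate almost any hardware device; buildx uses it to emulate the runtime environment of different CPU architectures when building images.

To build the image, follow these steps.

Make sure the host has Docker >= 19.03, Linux kernel >= 4.8, and binfmt-support >= 2.1.7
Enable Docker's experimental features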

export DOCKER_CLI_EXPERIMENTAL=enabled

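Download and install the buildx binary

# Download buildx (-L follows the redirect that GitHub release URLs return)
mkdir -p ~/.docker/cli-plugins
curl -L -o ~/.docker/cli-plugins/docker-buildx https://github.com/docker/buildx/releases/download/v0.6.0/buildx-v0.6.0.linux-amd64
# Make it executable
chmod a+x ~/.docker/cli-plugins/docker-buildx
# Check the buildx version to confirm the installation succeeded
docker buildx version
# Output: github.com/docker/buildx v0.6.0-docker 11057da37336192bfc57d81e02359ba7ba848e4a

# Register QEMU emulators for all supported architectures (binfmt_misc)
docker run --privileged --rm tonistiigi/binfmt --install all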

Create and enable a builder

# Create mybuilder and switch to it
docker buildx create --use --name mybuilder
# List the platforms mybuilder can build for
docker buildx ls
# Output: default default running linux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6

Write the Go application code

package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Println("GET /ping")
		fmt.Fprintf(w, "pong\n")
	})

	log.Println("server start listen ...")
	log.Fatal(http.ListenAndServe(":9000", nil))
}
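Before packaging the service into an image, the handler can be exercised in-process with the standard `net/http/httptest` package. This is a sketch rather than part of the original project: `pingHandler` and `callPing` are names introduced here, with the handler body copied from the `/ping` route (minus the console log).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// pingHandler has the same response logic as the /ping route of the
// service, pulled out into a named function so it can be called directly.
func pingHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "pong\n")
}

// callPing invokes the handler without opening a network port and returns
// the HTTP status code and response body.
func callPing() (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/ping", nil)
	rec := httptest.NewRecorder()
	pingHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := callPing()
	fmt.Printf("status=%d body=%q\n", code, body) // status=200 body="pong\n"
}
```

The 5-byte body ("pong" plus a newline) is the same response the load test later reports as the document length.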

Write the Dockerfile

# Build stage: compile the Go executable
# TARGETPLATFORM and BUILDPLATFORM are set automatically by buildx
FROM --platform=$TARGETPLATFORM xxxx/go:1.16-alpine3.14 AS builder
ARG TARGETPLATFORM
ARG BUILDPLATFORM

RUN echo "I am running on $BUILDPLATFORM, building for $TARGETPLATFORM"

RUN mkdir /app
ADD . /app
WORKDIR /app

RUN go build -o /app/server .


# Runtime stage: the environment that runs the Go executable
FROM --platform=$TARGETPLATFORM xxxx/alpine:3.11

COPY --from=builder /app/server /go/bin/server
ADD config /go/bin/config

WORKDIR /go/bin
ENTRYPOINT [ "/go/bin/server" ]

Build the ARM image and push it to the registry

# Build an image that supports both arm64 and amd64, and export it locally in OCI format
docker buildx build --platform=linux/arm64,linux/amd64 -o type=oci,dest=- . > image-oci.tar

# Upload the image to the registry
# Why skopeo rather than a direct push? Because the registry our company is building does not support pushing buildx images.
skopeo copy -a oci-archive:image-oci.tar docker://xxxx/templates/cicd:xman

# Inspect the image
docker buildx imagetools inspect xxxx/templates/cicd:xman

Then deploy it to Eagle.

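Time to put it to the test

Plan: use ab to send 100,000 requests at a concurrency of 200, run several rounds, and compare the best result from each side.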
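For readers who want to see what such a test does, the idea behind `ab -n ... -c ...` can be sketched in Go: a fixed number of workers drain a queue of requests while the total time is measured. This is a simplified stand-in, not a replacement for ab. The names `runLoad` and `newPingServer` are hypothetical, and the sketch targets a local test server instead of the Eagle clusters, with much smaller numbers than the real run.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// newPingServer starts a local stand-in for the /ping service.
func newPingServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pong\n")
	}))
}

// runLoad sends `total` GET requests to url using `concurrency` workers and
// returns the number of HTTP 200 responses and the elapsed time.
func runLoad(url string, total, concurrency int) (int64, time.Duration) {
	// Pre-fill a queue with one token per request, then close it so that
	// workers stop once the queue is drained.
	jobs := make(chan struct{}, total)
	for i := 0; i < total; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	var ok int64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := http.Get(url)
				if err != nil {
					continue // count only successful responses
				}
				// Drain and close the body so the connection can be reused.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return ok, time.Since(start)
}

func main() {
	srv := newPingServer()
	defer srv.Close()

	ok, elapsed := runLoad(srv.URL, 1000, 20)
	fmt.Printf("%d ok in %v (%.0f req/s)\n", ok, elapsed, float64(ok)/elapsed.Seconds())
}
```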

ARM

# ab -n 100000 -c 200 eagle_armcs1:32025/api/xxxx/demo/1.0.0/ping

This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking eagle_armcs1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests


Server Software:        nginx/1.18.0
Server Hostname:        eagle_armcs1
Server Port:            32025

Document Path:          /api/xxxx/demo/1.0.0/ping
Document Length:        5 bytes

Concurrency Level:      200
Time taken for tests:   5.893 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      16200000 bytes
HTML transferred:       500000 bytes
Requests per second:    16970.17 [#/sec] (mean)
Time per request:       11.785 [ms] (mean)
Time per request:       0.059 [ms] (mean, across all concurrent requests)
Transfer rate:          2684.73 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    5  25.5      4    1034
Processing:     1    7   8.9      5     267
Waiting:        1    7   8.1      5     267
Total:          1   11  27.0     10    1039

Percentage of the requests served within a certain time (ms)
  50%     10
  66%     11
  75%     11
  80%     12
  90%     16
  95%     21
  98%     28
  99%     33
 100%   1039 (longest request)

x86

# ab -n 100000 -c 200 eagle_cs8:32025/api/xxxx/demo/1.0.0/ping

This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking eagle_cs8 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests


Server Software:        nginx/1.10.2
Server Hostname:        eagle_cs8
Server Port:            32025

Document Path:          /api/xxxx/demo/1.0.0/ping
Document Length:        5 bytes

Concurrency Level:      200
Time taken for tests:   5.034 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      16200000 bytes
HTML transferred:       500000 bytes
Requests per second:    19863.21 [#/sec] (mean)
Time per request:       10.069 [ms] (mean)
Time per request:       0.050 [ms] (mean, across all concurrent requests)
Transfer rate:          3142.42 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    5  51.6      2    1035
Processing:     1    4  10.1      4     236
Waiting:        1    4   9.3      3     231
Total:          1   10  52.6      6    1039

Percentage of the requests served within a certain time (ms)
  50%      6
  66%      7
  75%      8
  80%      8
  90%     11
  95%     14
  98%     27
  99%     31
 100%   1039 (longest request)

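Comparing QPS and CPU across the two load-test reports, the ARM machine still trails x86 to some degree: roughly 16,970 versus 19,863 requests per second, or about 15% lower throughput in these runs.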
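The size of the gap follows directly from the "Requests per second" lines of the two reports. A small sketch of the arithmetic, where the helper name `relativeGap` is introduced here for illustration:

```go
package main

import "fmt"

// relativeGap returns how far value falls below baseline, expressed as a
// percentage of baseline.
func relativeGap(value, baseline float64) float64 {
	return (baseline - value) / baseline * 100
}

func main() {
	// "Requests per second" taken from the ARM and x86 ab reports.
	const armQPS, x86QPS = 16970.17, 19863.21
	fmt.Printf("ARM throughput is %.1f%% below x86\n", relativeGap(armQPS, x86QPS))
	// prints: ARM throughput is 14.6% below x86
}
```

Note that this compares a single best run on each side, so it indicates the order of magnitude of the difference rather than a precise figure.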

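Continuous integration

For those of us who aspire to be 10x programmers, the steps above are too tedious. Can't it be simpler and more elegant? I have 40 to 50 projects of my own to migrate!

Here it comes: http://xxxx/templates/cicd

Taking a Go project as an example, first modify the Dockerfile as follows, then update the .gitlab-ci.yml file, and that is all.

Modify the Dockerfile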

FROM --platform=$TARGETPLATFORM xxxx/go:1.16-alpine3.14 AS builder
ARG TARGETPLATFORM
ARG BUILDPLATFORM

RUN mkdir /app
ADD . /app
WORKDIR /app

RUN go build -o /app/server .


FROM --platform=$TARGETPLATFORM xxxx/alpine:3.11

COPY --from=builder /app/server /go/bin/server
ADD config /go/bin/config

WORKDIR /go/bin
ENTRYPOINT [ "/go/bin/server" ]

Modify .gitlab-ci.yml

include:
  - '/.gitlab-ci/dockerx.yml'
  - '/.gitlab-ci/eagle.yml'

stages:
  - docker
  - deploy

variables:
  ARM_ENABLE: "yes"
  WORKDIR: example_go_buildx
  EAGLE_TEMPLATE_ID: demo-xxxx-1.0.0-kxc1mc1mcuat
  EAGLE_TEMPLATE_ID_TEST: demo-xxxx-1.0.0-armcs1cs8

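Why is it this convenient? Because the whole process above has been built into the CI/CD templates.

Looking ahead

Some say this is turning back the clock, some say we are closing ourselves off from the world again, and some say we will slide into a "lost 20 years" like Japan. I choose to believe Xi Jinping's version: the road is long and hard, but keep walking and you will arrive; keep going without pause and the future is promising.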

FAQ

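Migration is too troublesome and too hard. Can I just run x86 applications or images directly?

How do I migrate a multi-CPU-architecture base image from Docker Hub into the company registry?

As mentioned earlier, docker pull fetches only the image for the current system's CPU architecture, which clearly does not meet our needs.

Take alpine:3.11 as an example: Docker Hub offers it for a large number of CPU architectures. We can complete the migration in one step with the skopeo copy command, and the image copied to the company registry keeps the same set of OS/ARCH variants.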

skopeo copy -a docker://docker.io/library/alpine:3.11 docker://xxxx/alpine:3.11

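What if the moby/buildkit image cannot be downloaded, or downloads too slowly, during a buildx build?

When creating and enabling the builder on the build machine, point it at a local mirror of the buildkit image as follows: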

docker buildx create --driver-opt image=xxxx/moby/buildkit:latest --use

Installing node-oracledb fails

#21 218.6 npm ERR! code 87
#21 218.6 npm ERR! path /opt/app/node_modules/oracledb
#21 218.6 npm ERR! command failed
#21 218.6 npm ERR! command sh -c node package/install.js
#21 218.6 npm ERR! oracledb ERR! NJS-067: a pre-built node-oracledb binary was not found for linux arm64
#21 218.6 npm ERR! oracledb ERR! Try compiling node-oracledb source code using https://oracle.github.io/node-oracledb/INSTALL.html#github

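The error message is quite clear, and someone else has asked the same question: https://github.com/oracle/node-oracledb/issues/1382

Pre-built binaries exist only for x86_64; for ARM you have to compile it yourself.

Images used

Some of the arm64 & amd64 images we used or migrated

Description	Image address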
go 16.3	xxxx/go:v1.0.0
alpine 3.13	xxxx/runtime:v1.0.0
node 6.11.5	xxxx/runtime:node6.11.5
node 16	xxxx/runtime:node16-alpine3.14
dind with buildx	xxxx/dindx:latest
moby build kit	xxxx/moby/buildkit:latest
gitlab runner helper	xxxx/gitlab-runner-helper:alpine