黄玮
What magic does my computer have? What "curse" is on everyone else's computer?
The developers' runtime environment is not the same as the one used by operations, testers, and users!
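How do we get out of the inconsistent-environment trap?
Core issue: communication (across organizations, departments, and roles)
When Dev and Ops are two different people (or departments), requirements analysis and product release eat up a lot of time in "communication" (read: bickering). Merging the two roles into one person or team lets you skip that stage outright.
DevOps combines cultural philosophies, practices, and tools to increase an organization's ability to deliver applications and services at high velocity, evolving and improving products faster than organizations using traditional software development and infrastructure management processes. This speed enables organizations to better serve their customers and compete more effectively in the market.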
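Suppose you are asked to maintain the system described earlier, which is already "working fine", and you now need to upgrade one of its "legacy" components. Where do you start?
What you assume is true turns out not to be
Without automated operations, you may suddenly have to do this in the middle of a trip
Governing a great state is like cooking a small fish. By analogy, picture a restaurant owner and see how the cooking could be automated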
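Continuous integration means that as soon as a developer commits new code, it is built and (unit) tested. From the test results, the developer can tell whether the new code integrates correctly with the existing code.
Continuous delivery builds on continuous integration by deploying the integrated code to production-like environments that more closely resemble the real runtime environment.
For example, after unit tests pass, a developer can deploy the code to a staging environment connected to a database for further testing. If the code checks out, it can then be deployed to production manually.
Continuous deployment builds on continuous delivery by automating the deployment to production.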
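The difference between the three terms comes down to which stages run automatically. The sketch below is only an illustration: the four stage functions are hypothetical placeholders (they just print a message), and `run_pipeline` is a made-up helper, not part of any CI product.

```shell
#!/bin/sh
# Hypothetical stage commands; replace each body with the project's real steps.
build()          { echo "build: compile and package"; }
unit_test()      { echo "test: run unit tests"; }
deploy_staging() { echo "deploy: staging (production-like)"; }
deploy_prod()    { echo "deploy: production"; }

# Run the given stages in order and stop at the first one that fails.
run_pipeline() {
  for stage in "$@"; do
    "$stage" || { echo "stage failed: $stage" >&2; return 1; }
  done
}

# Continuous integration: every commit triggers build + unit tests.
continuous_integration() { run_pipeline build unit_test; }

# Continuous delivery: CI plus automatic deployment to staging;
# the production deployment stays a manual step.
continuous_delivery() { run_pipeline build unit_test deploy_staging; }

# Continuous deployment: the production deployment is automated as well.
continuous_deployment() { run_pipeline build unit_test deploy_staging deploy_prod; }

continuous_deployment
```

Each function is a superset of the previous one, which mirrors how the definitions above build on each other. A failing stage stops the pipeline, so nothing is deployed when the tests fail.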
git push to github
travis-ci build
travis-ci deploy to github pages
+ git add && git push to gitee pages
github pages
gitee pages
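The Dev and Ops roles manage code and runtime configuration independently, without interfering with each other
Unified build rules
Build rules are kept under "version control", so "changes are traceable"
reveal.js is brought in as a "Git submodule"
Project builds depend on CI/CD
Repeatable: Restart build on Travis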
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
It covers version management for code and documents, and the needs of multi-person collaborative development and editing.
Provides Git repository management, code reviews, issue tracking, activity feeds and wikis. GitLab itself is also free software.
Deploy apps. Manage systems. Crush complexity. Ansible helps you build a strong foundation for DevOps.
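It makes development, test, and production environments software-defined, meeting the need for code runtime environments that are consistent, auditable, and automated.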
# Check the OS release and the available ansible version
lsb_release -a
# No LSB modules are available.
# Distributor ID: Ubuntu
# Description: Ubuntu 20.04.2 LTS
# Release: 20.04
# Codename: focal
apt policy ansible
# ansible:
# Installed: (none)
# Candidate: 2.9.6+dfsg-1
# Version table:
# 2.9.6+dfsg-1 500
# 500 http://cn.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
sudo apt-get install ansible
# Verify the currently installed ansible version
ansible --version
# ansible 2.9.6
# config file = /etc/ansible/ansible.cfg
# configured module search path = ['/home/cuc/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
# ansible python module location = /usr/lib/python3/dist-packages/ansible
# executable location = /usr/bin/ansible
# python version = 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
# Installing via pip gets you the latest ansible release (optional)
# ref: https://pypi.org/project/ansible/
# Per https://www.ansible.com/blog/ansible-3.0.0-qa
# To upgrade to Ansible-3.0 from Ansible-2.10: pip install --upgrade ansible.
# To upgrade to Ansible-3.0 from Ansible-2.9 or earlier: pip uninstall ansible; pip install ansible. This is due to a limitation in pip.
# Upgrade: first remove the apt-installed package
sudo apt remove ansible
# Use a PyPI mirror in mainland China to speed up the download
pip3 install ansible -i https://pypi.tuna.tsinghua.edu.cn/simple
# Verify the ansible version installed via pip
pip3 freeze | grep ansible
# ansible==3.2.0
# ansible-base==2.10.7
# The command below can only report the ansible-base version,
# because the ansible executable comes from ansible-base
ansible --version
# ansible 2.10.7
# config file = /etc/ansible/ansible.cfg
# configured module search path = ['/home/cuc/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
# ansible python module location = /home/cuc/.local/lib/python3.8/site-packages/ansible
# executable location = /home/cuc/.local/bin/ansible
# python version = 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
ansible-base was renamed to ansible-core starting with 2.11
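Ansible community package | ansible-core |
---|---|
Uses the new semantic versioning scheme | Continues the "classic Ansible" versioning convention (2.10, 2.11, …) |
Maintains only the latest release | Maintains the latest release plus the two most recent older releases |
Includes the language, runtime, and selected Collections (all-in-one) | Includes the language, runtime, and built-in plugins |
Developed and maintained in the Collections repositories | Developed and maintained in the ansible/ansible repository |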
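Chef and Puppet are two other heavyweight solutions in the DevOps space. What they have in common is an agent installed on every managed host and a pull model: following a preconfigured policy, the agent pulls configuration changes from the management host over a proprietary transport protocol and applies them locally.
Ansible takes the opposite approach: agentless and push. It uses SSH as the channel for remote management and command execution.
Idempotence: running the same operation two or more times does not change the result of the first run.
Most Ansible commands, modules, and operations guarantee this idempotence.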
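To make the definition concrete, here is a minimal sketch of an idempotent operation in plain shell. `ensure_line` is a hypothetical helper written for this illustration, not an Ansible command; it only mimics the "changed" / "ok" style of reporting. A bare `echo ... >> file` is the non-idempotent counterpart, because every run appends another copy.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if it is not already present.
# Prints "changed" when the file was modified and "ok" when nothing was needed.
ensure_line() {
  file=$1
  line=$2
  if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
    echo "ok"
  else
    printf '%s\n' "$line" >> "$file"
    echo "changed"
  fi
}

# Use a throwaway file so the demo touches nothing important.
demo_file=$(mktemp)
first=$(ensure_line "$demo_file" "192.168.56.101")
second=$(ensure_line "$demo_file" "192.168.56.101")
echo "first run: $first, second run: $second"
```

The second call leaves the file exactly as the first call left it, which is the property the definition describes.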
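# Configure passwordless SSH login from A to the root user on B
$ ssh-copy-id -i ~/.ssh/id_rsa.pub cuc@192.168.56.101
# SSH into B and manually install the RSA public key of A's cuc user into /root/.ssh/authorized_keys for B's root user
$ ssh cuc@192.168.56.101
$ sudo mkdir -p /root/.ssh
$ sudo cp /home/cuc/.ssh/authorized_keys /root/.ssh/authorized_keys
# Assume Python has never been installed on B
$ sudo apt-get update && sudo apt-get install -y python-minimal
# Python 2.x reached end of life on 2020-01-01
# python-minimal has accordingly been removed from some distributions; the replacement is python3-minimal
$ exit
# Continue running commands on A
# Verify passwordless SSH login from A to the root user on B
$ ssh root@192.168.56.101
# exit
# Back on A, continue running commands
# Create a local working directory for ansible (optional)
$ mkdir ansible && cd ansible
# The hosts file follows the format of /etc/ansible/hosts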
$ echo -e "[web]\n192.168.56.101" > hosts
$ ansible all -m ping -u root -i hosts
192.168.56.101 | SUCCESS => {
"changed": false,
"ping": "pong"
}
Ansible uses playbooks to define remote-management "scripts"; playbooks are written in YAML.
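As a rough illustration of what such a YAML "script" looks like, the snippet below writes a small playbook to a temporary directory. The file name `site.yml`, the play name, and the choice of nginx tasks are assumptions made for this sketch; the `web` group matches the inventory created above, and `apt` and `service` are standard Ansible modules.

```shell
#!/bin/sh
# Write an illustrative playbook into a throwaway directory.
playbook_dir=$(mktemp -d)
cat > "$playbook_dir/site.yml" <<'EOF'
---
- name: Install and start nginx (illustrative)
  hosts: web
  become: true
  tasks:
    - name: Ensure nginx is installed
      apt:
        name: nginx
        state: present
        update_cache: true
    - name: Ensure nginx is running and enabled at boot
      service:
        name: nginx
        state: started
        enabled: true
EOF

# Only attempt a syntax check when ansible-playbook is actually installed.
if command -v ansible-playbook >/dev/null 2>&1; then
  ansible-playbook --syntax-check "$playbook_dir/site.yml" \
    || echo "syntax check reported a problem" >&2
fi
```

Each task declares a desired state (`present`, `started`) rather than a command to run, which is what lets repeated runs stay idempotent. Running it for real would look like `ansible-playbook site.yml -i hosts`.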
A role is Ansible's abstraction for reusable configuration scripts. A role typically contains variables, tasks, and handlers.
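Collections are an Ansible packaging format that can contain playbooks, roles, modules, and plugins. The modules in the Ansible core repository are gradually being refactored and migrated into collections.
Ansible Galaxy is a community hub, maintained by the Ansible project, for sharing collections and roles. A quick online search for nginx turns up nginxinc/nginx_core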
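# Make sure you are in a directory the current user can write to
$ mkdir roles
# The following command installs the collection under ~/.ansible/collections/ansible_collections/nginxinc/nginx_core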
$ ansible-galaxy collection install nginxinc.nginx_core
# Starting galaxy collection install process
# Process install dependency map
# ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/': <urlopen error [Errno -3] Temporary failure in name resolution>
# If you hit a network error like the one above, use a third-party DNS lookup service to find the "correct" IP for the remote hostname
# ansible-galaxy collection install nginxinc.nginx_core
# Starting galaxy collection install process
# Process install dependency map
# Starting collection install process
# Installing 'nginxinc.nginx_core:0.3.0' to '/home/cuc/.ansible/collections/ansible_collections/nginxinc/nginx_core'
# Downloading https://galaxy.ansible.com/download/nginxinc-nginx_core-0.3.0.tar.gz to /home/cuc/.ansible/tmp/ansible-local-12102kn8levp/tmpn8vk89lk
# ERROR! Unexpected Exception, this is probably a bug: <urlopen error [Errno -3] Temporary failure in name resolution>
# Next, keep working around the polluted DNS resolution results
# wget https://galaxy.ansible.com/download/nginxinc-nginx_core-0.3.0.tar.gz
# --2021-04-12 01:37:40-- https://galaxy.ansible.com/download/nginxinc-nginx_core-0.3.0.tar.gz
# Resolving galaxy.ansible.com (galaxy.ansible.com)... 172.67.68.251, 104.26.1.234, 104.26.0.234, ...
# Connecting to galaxy.ansible.com (galaxy.ansible.com)|172.67.68.251|:443... connected.
# HTTP request sent, awaiting response... 302 Found
# Location: https://ansible-galaxy.s3.amazonaws.com/artifact/bd/f9de1f668f868a872bfdc64df23423e53ad7f08195217437c653bfc97aa2e8?response-content-disposition=attachment%3B%20filename%3Dnginxinc-nginx_core-0.3.0.tar.gz&AWSAccessKeyId=AKIAJZZ23S6M5JUH2EOA&Signature=8InBUVhESAvuX5Ee1CxmqPZYUiY%3D&Expires=1618195061 [following]
# --2021-04-12 01:37:41-- https://ansible-galaxy.s3.amazonaws.com/artifact/bd/f9de1f668f868a872bfdc64df23423e53ad7f08195217437c653bfc97aa2e8?response-content-disposition=attachment%3B%20filename%3Dnginxinc-nginx_core-0.3.0.tar.gz&AWSAccessKeyId=AKIAJZZ23S6M5JUH2EOA&Signature=8InBUVhESAvuX5Ee1CxmqPZYUiY%3D&Expires=1618195061
# Resolving ansible-galaxy.s3.amazonaws.com (ansible-galaxy.s3.amazonaws.com)... failed: Temporary failure in name resolution.
# wget: unable to resolve host address ‘ansible-galaxy.s3.amazonaws.com’
# After manually fixing the two network errors above, two DNS records in total were added to /etc/hosts
# 52.217.8.132 ansible-galaxy.s3.amazonaws.com
# 104.26.1.234 galaxy.ansible.com
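# You can cd into ~/.ansible/collections/ansible_collections/nginxinc/nginx_core
# to inspect the nginx collection that was just downloaded (a pile of config files and ansible scripts)
# cd playbooks to see all the example playbooks
# Since it is all scripts, anything the docs cannot answer can be looked up directly in the code
# Replace 192.168.56.202 with the IP of your own target host
echo -e "[web]\n192.168.56.202 ansible_user=cuc ansible_become=true ansible_become_method=sudo" >> hosts
# Now run the local nginx configuration "code" on the remote host!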
ansible-playbook deploy-nginx.yml -i hosts -K
Verify the result of your first ansible-playbook run!
curl http://192.168.56.202
⛔️ Never blindly download and run other people's code in production or any other important environment!!!
⚠️ That is why we chose to run the experiment above in a clean virtual machine!
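Note: Docker is merely the most popular container technology at the moment. Docker is not a synonym for container technology, just one of several options.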
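Property | Container advantage | Virtual machine advantage |
---|---|---|
Consistent runtime environment | ✔️ | ✔️ |
Application sandboxing | ✔️ | ✔️ |
Small storage footprint | ✔️ | |
Low overhead | ✔️ | |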
Image Spec
Runtime Spec
A solution for managing the full container lifecycle.
Build, Ship, and Run Any App, Anywhere
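Docker is a lightweight, virtualization-style container technology. It provides VM-like isolation and manages images with a layered union file system, which greatly simplifies environment operations.
A Docker container cloud is a one-stop container cloud platform built on Docker, that is, CaaS (Containers as a Service). It can loosely be seen as an upgraded PaaS (Platform as a Service): a CaaS platform built on Docker container technology is more capable, more flexible to use, and easier to deploy.
Docker lets you assemble applications as quickly as stacking shipping containers and, like shipping standard containers, hides code-level differences as far as possible. Docker shortens the time from code testing to production deployment as far as possible. For DevOps, Docker provides consistent, auditable, and automated software delivery, proactively avoiding bugs caused by software depending on, or being incompatible with, specific versions or combinations of an operating system and its runtime environment.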
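Follow the official documentation: https://docs.docker.com/install/linux/docker-ce/ubuntu/ . What follows is outdated and is kept only as "evidence" that the basic installation steps do not change much, although they certainly differ from the method for the latest release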
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
# Import Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add Docker's official package repository
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
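# Confirm that your package source is configured correctly: install the latest docker from Docker's official repository, not an older docker from the Ubuntu archive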
apt-cache madison docker-ce
# docker-ce | 17.12.0~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
# docker-ce | 17.09.1~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
# docker-ce | 17.09.0~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
# ...
# docker-ce | 17.03.1~ce-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
# docker-ce | 17.03.0~ce-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
# Install docker
$ sudo apt-get install -y docker-ce
# Check whether the docker daemon has started automatically
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2017-02-15 14:44:43 CST; 7min ago
Docs: https://docs.docker.com
Main PID: 1464 (dockerd)
Tasks: 16
Memory: 50.5M
CPU: 441ms
CGroup: /system.slice/docker.service
├─1464 /usr/bin/dockerd -H fd://
└─1622 containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim containerd-shim --metrics-interval=0 --sta
An image is a filesystem and parameters to use at runtime. It doesn’t have state and never changes. A container is a running instance of an image.
An image is comparable to a VirtualBox base image, and a container to a VirtualBox differencing image.
A Dockerfile is a recipe which describes the files, environment, and commands that make up an image.
A Dockerfile is comparable to a Makefile.
A local Docker instance is comparable to the local VirtualBox engine.
A Docker registry is like a Git repository, and Docker's official Docker Hub is like GitHub.
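The image / container / Dockerfile relationship can be sketched in a few lines. Everything named here is an assumption for illustration: the `demo-site` tag, the host port 8080, the `RUN_DOCKER` opt-in variable, and the choice of the public `nginx:stable` image as a base. The Docker commands only run when `RUN_DOCKER=1` is set, so the script is harmless on a machine without Docker.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The content the image will serve.
printf 'hello from a container\n' > index.html

# The Dockerfile is the recipe: base image plus the files layered on top.
cat > Dockerfile <<'EOF'
FROM nginx:stable
COPY index.html /usr/share/nginx/html/index.html
EOF

# Opt in explicitly before touching the local Docker daemon.
if [ "${RUN_DOCKER:-0}" = "1" ]; then
  # Build an image (immutable) from the recipe ...
  docker build -t demo-site .
  # ... then start a container (a running instance) from that image.
  docker run -d --rm -p 8080:80 demo-site
fi
```

With the container running, `curl http://localhost:8080` should return the page that was copied into the image. Starting a second container from the same image gives a second, independent instance, much like two differencing disks on one base image.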
$ sudo docker version
Client:
Version: 17.12.0-ce
API version: 1.35
Go version: go1.9.2
Git commit: c97c6d6
Built: Wed Dec 27 20:11:19 2017
OS/Arch: linux/amd64
Server:
Engine:
Version: 17.12.0-ce
API version: 1.35 (minimum version 1.12)
Go version: go1.9.2
Git commit: c97c6d6
Built: Wed Dec 27 20:09:53 2017
OS/Arch: linux/amd64
Experimental: false
$ sudo docker
Usage: docker COMMAND
A self-sufficient runtime for containers
Options:
--config string Location of client config files (default "/home/cuc/.docker")
-D, --debug Enable debug mode
-H, --host list Daemon socket(s) to connect to
-l, --log-level string Set the logging level ("debug"|"info"|"warn"|"error"|"fatal") (default "info")
--tls Use TLS; implied by --tlsverify
--tlscacert string Trust certs signed only by this CA (default "/home/cuc/.docker/ca.pem")
--tlscert string Path to TLS certificate file (default "/home/cuc/.docker/cert.pem")
--tlskey string Path to TLS key file (default "/home/cuc/.docker/key.pem")
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit
Management Commands:
config Manage Docker configs
container Manage containers
image Manage images
network Manage networks
node Manage Swarm nodes
plugin Manage plugins
secret Manage Docker secrets
service Manage services
stack Manage Docker stacks
swarm Manage Swarm
system Manage Docker
trust Manage trust on Docker images (experimental)
volume Manage volumes
Commands:
attach Attach local standard input, output, and error streams to a running container
build Build an image from a Dockerfile
commit Create a new image from a container's changes
cp Copy files/folders between a container and the local filesystem
create Create a new container
diff Inspect changes to files or directories on a container's filesystem
events Get real time events from the server
exec Run a command in a running container
export Export a container's filesystem as a tar archive
history Show the history of an image
images List images
import Import the contents from a tarball to create a filesystem image
info Display system-wide information
inspect Return low-level information on Docker objects
kill Kill one or more running containers
load Load an image from a tar archive or STDIN
login Log in to a Docker registry
logout Log out from a Docker registry
logs Fetch the logs of a container
pause Pause all processes within one or more containers
port List port mappings or a specific mapping for the container
ps List containers
pull Pull an image or a repository from a registry
push Push an image or a repository to a registry
rename Rename a container
restart Restart one or more containers
rm Remove one or more containers
rmi Remove one or more images
run Run a command in a new container
save Save one or more images to a tar archive (streamed to STDOUT by default)
search Search the Docker Hub for images
start Start one or more stopped containers
stats Display a live stream of container(s) resource usage statistics
stop Stop one or more running containers
tag Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
top Display the running processes of a container
unpause Unpause all processes within one or more containers
update Update configuration of one or more containers
version Show the Docker version information
wait Block until one or more containers stop, then print their exit codes
Run 'docker COMMAND --help' for more information on a command.
Jenkins is a self-contained, open source automation server which can be used to automate all sorts of tasks such as building, testing, and deploying software.
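Jenkins is a widely used framework for continuous integration, automated testing, and continuous deployment; some project teams even use it as a workflow management tool.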
Test and Deploy Your Code with Confidence
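A continuous integration service that supports only code hosted on GitHub; it also supports continuous deployment to a set of designated third-party cloud platforms.
Private repositories on GitHub require a paid travis-ci.com plan. Open source projects on GitHub can use travis-ci.org for free; going forward, free service for open source projects on GitHub.com will gradually be consolidated onto travis-ci.com.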
🌰 Example
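Property | shpec | sharness | bats | bats-core | shunit2 | assert.sh |
---|---|---|---|---|---|---|
Latest commit | | | | | | |
Stars | | | | | | |
Latest automated build | | | | | | |
Commits in the past year | | | | | | |
Travis CI | .travis.yml | .travis.yml | .travis.yml | .travis.yml | .travis.yml | .travis.yml |
Test Anything Protocol | Behavior-driven development (BDD) style | ✅ | ✅ | ✅ | Based on the xUnit style | ✅ |
Highlights | Design modeled on RSpec, Jasmine, and mocha | Script automation framework extracted from Git | Many well-known success stories: used by a large number of prominent open source projects | Fork of bats at its last update, 0360811 | Supports many shell types | Small codebase, lightweight test framework |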
GitLab's built-in continuous integration and continuous deployment feature; it is also free to use in the open source Community Edition.
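Property | Jenkins | Travis | GitLab CI |
---|---|---|---|
CI | Supports code repositories from custom sources | Only code hosted on GitHub.com | Only code hosted on GitLab.com or on self-hosted GitLab instances |
CD | Supports custom deployment targets | A handful of officially designated third-party platforms | Supports custom deployment targets |
Docker | ✅ | ✅ | ✅ |
Self-hosted | ✅ | ❌ | ✅ |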
OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.
Open vSwitch is a production quality, multilayer virtual switch licensed under the open source Apache 2.0 license. It is designed to enable massive network automation through programmatic extension, while still supporting standard management interfaces and protocols (e.g. NetFlow, sFlow, IPFIX, RSPAN, CLI, LACP, 802.1ag). In addition, it is designed to support distribution across multiple physical servers similar to VMware’s vNetwork distributed vswitch or Cisco’s Nexus 1000V.
Tcpreplay is a suite of GPLv3 licensed tools written by Aaron Turner for UNIX (and Win32 under Cygwin) operating systems which gives you the ability to use previously captured traffic in libpcap format to test a variety of network devices. It allows you to classify traffic as client or server, rewrite Layer 2, 3 and 4 headers and finally replay the traffic back onto the network and through other devices such as switches, routers, firewalls, NIDS and IPS’s. Tcpreplay supports both single and dual NIC modes for testing both sniffing and inline devices.
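In a nutshell: it lets development and test environments simulate real production traffic.
tcpdump + Tcpreplay or wireshark + Tcpreplay can be used to replay live traffic. This approach can address problems below the TCP layer, such as firewall issues, but it still has the following shortcomings:
TCPCopy mainly addresses traffic replication at the TCP layer and above (for example, HTTP) and is used for server-side traffic replay. Overall, TCPCopy has the following advantages: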
Code Defined Software!
Software Defined Everything!