Renovate: Handling Postgres Schema Migrations

Time: 2023-06-13 09:17:11

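Last October, while reviewing database migration code, I kept going back through a dozen or so existing migration files, struggling to work out what the database schema finally looked like, and the idea of building a schema migration tool was born. The mainstream schema migration tools at the time, whether you wrote raw SQL or a DSL in some language, all required developers to start from the state left by the previous migration and write the changes against that state. For example, to add a created_at field to an existing todos table, I would need to create a new migration file and write something like the following: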

ALTER TABLE todos ADD COLUMN created_at timestamptz;

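Once a database has been maintained for several years, there can be dozens or even hundreds of such migration scripts, which makes them inconvenient to read and maintain. More importantly, writing migration scripts by hand is counterintuitive: it is disconnected from the way we normally modify and update things.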

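So in October I started thinking about how to solve this problem. I looked through some existing open source projects and studied atlas (https://github.com/ariga/atlas), written in Go, in detail. It was the closest to the tool I wanted: you describe the current database schema, and it then generates the migration script automatically. However, atlas's support for Postgres was not very good, and the migration plans it generated were often destructive (for example, drop table followed by create table), which made it unusable in production. In addition, atlas uses HCL, similar to Terraform, to describe the database schema, which drove me crazy: I had to learn a new syntax and build a mental mapping between SQL DDL and HCL before I could comfortably modify the schema.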

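After exploring open source projects without much to show for it, I started thinking about how to solve the problem myself. I had two hard requirements:

Describe the schema in SQL rather than inventing a new language
Generate migration plans that avoid destructive updates wherever possible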

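So I gave the project a name, Renovate, and started writing an RFC (see: https://github.com/tyrchen/renovate/blob/master/rfcs/0001-sql-migration.md) to organize my thoughts and sketch out my own solution. This is the full user flow I wrote down at the time: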

# dump all the schemas into a folder
$ renovate schema init --url postgres://user@localhost:5432/hello
Database schema has successfully dumped into ./hello.

# if schema already exists, before modifying it, it is always a good practice to fetch the latest schema. Fetch will fail if current folder is not under git or it is not up to date with remote repository.
$ renovate schema fetch

# do whatever schema changes you want

# then run plan to see what changes will be applied. When redirected to a file, it just prints all the SQL statements for the migration.
$ renovate schema plan
Table auth.users changed:

create table auth.users(
    id uuid primary key,
    name text not null,
    email text not null,
    password text not null,
-   created_at timestamptz not null,
+   created_at timestamptz not null default now(),
+   updated_at timestamptz not null
);

The following SQLs will be applied:

    alter table auth.users add column updated_at timestamptz not null;
    alter table auth.users alter column created_at set default now();

# then apply the changes
$ renovate apply
Your repo is dirty. Please commit the changes before applying.

$ git commit -a -m "add updated_at column and set default value for created_at"

# now you can directly apply
# apply can use -p to run a previously saved plan or manually edited plan
# the remote schema and the plan being executed will be saved in _meta/plans/202109301022/.
$ renovate apply

The following SQLs will be applied:

    alter table auth.users add column updated_at timestamptz not null;
    alter table auth.users alter column created_at set default now();

Continue (y/n)? y
Successfully applied migration to postgres://user@localhost:5432/hello.
Your repo is updated with the latest schema. See `git diff HEAD~1` for details.

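My rough idea was this: users create a db schema repo and manage schema changes with git. They do not need to think about schema migration; they simply edit the existing schema. When renovate schema plan runs, Renovate fetches the remote schema via pg_dump, parses both the local and the remote SQL into ASTs, and compares the two at the AST level to find the differences.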
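To make the local-versus-remote comparison concrete, here is a minimal sketch. It is not Renovate's code: the `Change` enum and `diff_schemas` function are hypothetical names, and each object is represented by a normalized definition string standing in for a real AST.

```rust
use std::collections::BTreeMap;

/// What happened to one named object between the remote (actual) schema
/// and the local (desired) schema. Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
pub enum Change {
    Added(String),
    Removed(String),
    Altered { name: String, old: String, new: String },
}

/// Compare the local schema with the remote one.
/// Both maps go from object name to a normalized definition
/// (a stand-in for the parsed AST).
pub fn diff_schemas(
    local: &BTreeMap<String, String>,
    remote: &BTreeMap<String, String>,
) -> Vec<Change> {
    let mut changes = Vec::new();
    // Objects present locally are either new or possibly altered.
    for (name, new) in local {
        match remote.get(name) {
            None => changes.push(Change::Added(name.clone())),
            Some(old) if old != new => changes.push(Change::Altered {
                name: name.clone(),
                old: old.clone(),
                new: new.clone(),
            }),
            Some(_) => {} // identical, nothing to do
        }
    }
    // Objects that only exist remotely were removed locally.
    for name in remote.keys() {
        if !local.contains_key(name) {
            changes.push(Change::Removed(name.clone()));
        }
    }
    changes
}
```

Comparing whole definitions only tells you that an object changed. Producing a non-destructive plan still requires descending into the object and diffing its parts, which is where an AST is needed.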

With this idea in place, the next step was to define some of the big data structures. For example, a schema in Postgres can be described like this:

pub struct Schema {
    pub types: BTreeMap<String, DataType>,
    pub tables: BTreeMap<String, Table>,
    pub views: BTreeMap<String, View>,
    pub functions: BTreeMap<String, Function>,
    pub triggers: BTreeMap<String, Trigger>,
}

A table can be described like this:

pub struct Table {
    pub columns: BTreeMap<String, Column>,
    pub constraints: BTreeMap<String, Constraint>,
    pub privileges: BTreeMap<String, Privilege>,
}

Data at every level needs to implement the Planner trait:

pub trait Planner {
    fn diff(&self, remote: &Self) -> Vec<Diff>;
    fn plan(&self, diff: &[Diff]) -> Vec<Plan>;
}
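As an illustration of what diffing and planning could look like at the column level, here is a minimal sketch. The `Column` struct and the `plan_columns` function are simplified, hypothetical stand-ins, not Renovate's actual types, and the function collapses diff and plan into one step for brevity.

```rust
use std::collections::BTreeMap;

/// Simplified column description (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Column {
    pub ty: String,
    pub not_null: bool,
    pub default: Option<String>,
}

/// Produce ALTER TABLE statements that turn the remote columns into the
/// local ones, touching only what actually changed.
pub fn plan_columns(
    table: &str,
    local: &BTreeMap<String, Column>,
    remote: &BTreeMap<String, Column>,
) -> Vec<String> {
    let mut plan = Vec::new();
    for (name, col) in local {
        match remote.get(name) {
            // Column only exists locally: add it.
            None => {
                let mut sql = format!("ALTER TABLE {} ADD COLUMN {} {}", table, name, col.ty);
                if col.not_null {
                    sql.push_str(" NOT NULL");
                }
                if let Some(d) = &col.default {
                    sql.push_str(&format!(" DEFAULT {}", d));
                }
                plan.push(sql);
            }
            // Column exists on both sides: alter only the changed attributes.
            Some(old) => {
                let prefix = format!("ALTER TABLE {} ALTER COLUMN {}", table, name);
                if old.ty != col.ty {
                    plan.push(format!("{} TYPE {}", prefix, col.ty));
                }
                if old.not_null != col.not_null {
                    let action = if col.not_null { "SET" } else { "DROP" };
                    plan.push(format!("{} {} NOT NULL", prefix, action));
                }
                if old.default != col.default {
                    match &col.default {
                        Some(d) => plan.push(format!("{} SET DEFAULT {}", prefix, d)),
                        None => plan.push(format!("{} DROP DEFAULT", prefix)),
                    }
                }
            }
        }
    }
    // Columns that only exist remotely were removed locally.
    for name in remote.keys() {
        if !local.contains_key(name) {
            plan.push(format!("ALTER TABLE {} DROP COLUMN {}", table, name));
        }
    }
    plan
}
```

Fed the auth.users change from the RFC flow (a new default on created_at plus a new updated_at column), this sketch yields one ALTER COLUMN ... SET DEFAULT statement and one ADD COLUMN statement rather than recreating the table.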

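This way, we can work down layer by layer from the top-level schema to a constraint under a column of a table, diff at each level, and produce a migration plan. The overall architecture is as follows (I drew the diagram today, but the general idea has not changed):

[Figure: overall architecture diagram]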

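With the approach settled, I started, on and off, writing basic parsers for each data structure and then implementing the migration planner trait for them. At first I handled the easier cases. For example, when a user modifies an index, we can drop the old index and create the new one, as shown below: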

#[test]
fn changed_index_should_generate_migration() {
    let sql1 = "CREATE INDEX foo ON bar (baz)";
    let sql2 = "CREATE INDEX foo ON bar (ooo)";
    let old: TableIndex = sql1.parse().unwrap();
    let new: TableIndex = sql2.parse().unwrap();
    let diff = old.diff(&new).unwrap().unwrap();
    let migrations = diff.plan().unwrap();
    assert_eq!(migrations[0], "DROP INDEX foo");
    assert_eq!(migrations[1], "CREATE INDEX foo ON bar USING btree (ooo)");
}
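The drop-then-create strategy for a changed index can be sketched without a SQL parser. The `IndexDef` struct and `plan_index_change` function below are hypothetical and assume the index has already been parsed into its parts; Renovate itself works on the parsed AST.

```rust
/// Already-parsed index definition (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexDef {
    pub name: String,
    pub table: String,
    pub method: String,
    pub columns: Vec<String>,
}

impl IndexDef {
    /// Render the CREATE INDEX statement for this definition.
    pub fn create_sql(&self) -> String {
        format!(
            "CREATE INDEX {} ON {} USING {} ({})",
            self.name,
            self.table,
            self.method,
            self.columns.join(", ")
        )
    }
}

/// If the definition changed, drop the old index and create the new one.
/// An unchanged index produces an empty plan.
pub fn plan_index_change(old: &IndexDef, new: &IndexDef) -> Vec<String> {
    if old == new {
        return Vec::new();
    }
    vec![format!("DROP INDEX {}", old.name), new.create_sql()]
}
```

Dropping and recreating is acceptable for an index because no table data is lost, which is why this was one of the easier cases.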

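After close to two thousand lines of code written in fits and starts, I got stuck on table migration. The sheer number of data structures and states involved is daunting. Many column-level changes require picking through AST details bit by bit, which is quite painful. So I set it aside.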

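Last Thursday, our annual Hackathon at Tubi started again. I had several projects I wanted to try:

Continue developing Renovate and push it toward a usable product
Build a tool that generates UI from JSON
Build a serverless framework using pulumi + CloudFront function + CloudFront + lambda function (deno layer + deno code)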

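After much consideration, I chose to continue with Renovate, because I was not sure whether, if I left it any longer, it would go the way of my other unfinished projects and be shelved forever.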

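So, over four days including the weekend, minus meetings, interviews, taking my kid to and from after-school classes, and other overhead, I spent about 30 hours on this project and wrote more than 2,500 additional lines of code.

That includes 57 unit tests and 1 CLI test (covering 5 CLI test cases), and the project has 73% test coverage overall.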

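The finished product is already very close to what I picture as a database migration tool. You can try it for yourself in the code repository at https://github.com/tyrchen/renovate. I recorded a simple demo with asciinema (https://asciinema.org/a/N7Pd3gDPGFcpCddREJKAKTtbx); take a look if you can. If not, here is a low-resolution gif:

[Figure: low-resolution gif of the demo]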

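In this demo, I first used pgcli to create a todo table in an empty neon db, then used renovate schema init to fetch the neon db's schema and create a local schema repo. I then modified the database schema, adding fields, and used renovate schema plan and renovate schema apply to generate and run the migration. Everything went as smooth as silk.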

Some takeaways

From 1 to 100

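Technically, Renovate is not a particularly challenging project: once the approach is settled, the rest is just work. That work has two parts: 1) supporting AST diffs for a sprawling variety of SQL statements, and 2) hiding as much detail as possible to give users a good experience. Yet I often focus too much on going from 0 to 1, happily building one PoC after another, and neglect going from 1 to 100. Had it not been for this Hackathon, Renovate would almost have become yet another of my PoCs. Over the past 4 days, I basically fixed one detail and moved on to the next, releasing nearly 20 unremarkable minor versions along the way. Each did little more than add support for default constraints or fix the parsing of varchar(256)[], but it is exactly these trivial features, one after another, that add up to Renovate's current, fairly decent user experience.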

Treat trait design as part of architecture and design

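Traits are the soul of software development in Rust, and we should be thinking about them as early as the architecture design stage. Beyond that, traits can also be introduced for local pieces of code during implementation (local design). While handling the overall db schema plan, I ran into a DRY problem: the data structures could be BTreeMap<_, T>, BTreeMap<_, BTreeMap<_, T>>, BTreeMap<_, BTreeSet<T>>, and so on. They share a similar diff structure. Writing roughly the same code for each structure makes maintenance expensive; using macros (macro_rules) makes the code painful to read and to refactor later. Here, a trait is the best solution: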

trait SchemaPlan {
    fn diff_altered(&self, remote: &Self, verbose: bool) -> Result<Vec<String>>;
    fn diff_added(&self, verbose: bool) -> Result<Vec<String>>;
    fn diff_removed(&self, verbose: bool) -> Result<Vec<String>>;
}

impl<T> SchemaPlan for T
where
    T: NodeItem + Clone + FromStr<Err = anyhow::Error> + PartialEq + Eq + 'static,
    NodeDiff<T>: MigrationPlanner<Migration = String> { ... }

impl<T> SchemaPlan for BTreeMap<String, T>
where
    T: NodeItem + Clone + FromStr<Err = anyhow::Error> + PartialEq + Eq + 'static,
    NodeDiff<T>: MigrationPlanner<Migration = String> { ... }

impl<T> SchemaPlan for BTreeSet<T>
where
    T: NodeItem + Clone + FromStr<Err = anyhow::Error> + PartialEq + Eq + Ord + Hash + 'static,
    NodeDiff<T>: MigrationPlanner<Migration = String> { ... }

fn schema_diff<K, T>(
    local: &BTreeMap<K, T>,
    remote: &BTreeMap<K, T>,
    verbose: bool,
) -> Result<Vec<String>>
where
    K: Hash + Eq + Ord,
    T: SchemaPlan,
{ ... }

With the trait in place, a single schema_diff does the work of several copies, without any worry about maintainability.
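To show how one generic function can serve several container shapes, here is a stripped-down, compilable sketch of the same pattern. It is an assumption-laden simplification rather than Renovate's implementation: the `verbose` flag and `Result` are dropped, and the leaf type `Stmt` (which carries ready-made create and drop SQL) is hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified version of the trait: no `verbose` flag, no `Result`.
pub trait SchemaPlan {
    fn diff_altered(&self, remote: &Self) -> Vec<String>;
    fn diff_added(&self) -> Vec<String>;
    fn diff_removed(&self) -> Vec<String>;
}

/// Hypothetical leaf object that already knows its create and drop SQL.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Stmt {
    pub create: String,
    pub drop: String,
}

impl SchemaPlan for Stmt {
    fn diff_altered(&self, remote: &Self) -> Vec<String> {
        if self == remote {
            vec![]
        } else {
            // Replace the remote object with the local one.
            vec![remote.drop.clone(), self.create.clone()]
        }
    }
    fn diff_added(&self) -> Vec<String> {
        vec![self.create.clone()]
    }
    fn diff_removed(&self) -> Vec<String> {
        vec![self.drop.clone()]
    }
}

/// A set of items is itself plannable: compare by set difference.
impl<T: SchemaPlan + Ord> SchemaPlan for BTreeSet<T> {
    fn diff_altered(&self, remote: &Self) -> Vec<String> {
        let mut out = Vec::new();
        for item in remote.difference(self) {
            out.extend(item.diff_removed());
        }
        for item in self.difference(remote) {
            out.extend(item.diff_added());
        }
        out
    }
    fn diff_added(&self) -> Vec<String> {
        self.iter().flat_map(|i| i.diff_added()).collect()
    }
    fn diff_removed(&self) -> Vec<String> {
        self.iter().flat_map(|i| i.diff_removed()).collect()
    }
}

/// One function for every map whose values implement `SchemaPlan`.
pub fn schema_diff<K: Ord, T: SchemaPlan>(
    local: &BTreeMap<K, T>,
    remote: &BTreeMap<K, T>,
) -> Vec<String> {
    let mut out = Vec::new();
    for (key, item) in remote {
        if !local.contains_key(key) {
            out.extend(item.diff_removed());
        }
    }
    for (key, item) in local {
        match remote.get(key) {
            Some(old) => out.extend(item.diff_altered(old)),
            None => out.extend(item.diff_added()),
        }
    }
    out
}
```

In this sketch the same `schema_diff` accepts both `BTreeMap<K, Stmt>` and `BTreeMap<K, BTreeSet<Stmt>>`. A nested `BTreeMap` could be supported the same way by adding one more `impl`.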

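In the Renovate project, I designed the following traits:

CommandExecutor: handles CLI commands in a uniform way
SchemaLoader: loads and parses SQL from the local repo and the remote db
SqlSaver: saves SQL
NodeItem: unifies the behavior of db objects
Differ: diffs objects in the database
MigrationPlanner: handles migration
DeltaItem: generates fine-grained migrations
SqlFormatter: formats SQL
MigrationExecutor: executes migrations
SchemaPlan: see above

Together they form the backbone of Renovate.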

Avoid macro_rules; prefer generic functions

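I used to have a bad habit: whenever I had complex, repetitive logic, I would casually turn it into a macro_rules macro for reuse. However, macros are hard to read and awkward to unit test, and many tools (lint tools, for example) support them poorly. So before reaching for macro_rules, consider whether a generic function could replace it. The schema_diff above started out as a macro; only after some major refactoring did it take its current form. Generic functions do come with type annotations that are hard on the eyes, but the benefits are well worth the refactoring.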
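A toy comparison of the two styles, using made-up names (`removed_keys_macro!` and `removed_keys`) rather than anything from Renovate: both find the keys that exist remotely but not locally.

```rust
use std::collections::BTreeMap;

// Macro version: the body is only type-checked after it has been expanded
// at a call site, and its "signature" says nothing about the accepted types.
macro_rules! removed_keys_macro {
    ($local:expr, $remote:expr) => {
        $remote
            .keys()
            .filter(|k| !$local.contains_key(*k))
            .cloned()
            .collect::<Vec<_>>()
    };
}

// Generic version: the bounds are noisier to read, but the function is
// type-checked once, shows up in docs, and is easy to unit test.
pub fn removed_keys<K: Ord + Clone, V>(
    local: &BTreeMap<K, V>,
    remote: &BTreeMap<K, V>,
) -> Vec<K> {
    remote
        .keys()
        .filter(|k| !local.contains_key(*k))
        .cloned()
        .collect()
}
```

The bodies are nearly identical. What changes is that the generic function states its requirements (`K: Ord + Clone`) up front instead of leaving them to be discovered at each expansion.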

Do it, and you win

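There is a famous line in the film Let the Bullets Fly: "Fight, and you will win." I tweaked it slightly for this heading. At the start of the hackathon I had no idea where Renovate would end up. But by quickly building the most basic user interface for a naive first version and showing it to others (I recorded my screen and posted it in the company's hackathon Slack channel), I got meaningful feedback. Based on that feedback, I adjusted the CLI user experience and thought through how to make Renovate work across different environments (development, production, and so on).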

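For the screen recording, I picked up asciinema again after a long time away; later, to make the recording look better on GitHub, I found agg, a tool that converts asciinema recordings to gif. Bit by bit, I improved the user experience and the documentation, and while making the product better I picked up some new tools along the way.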

Meanwhile, I grew more comfortable with Rust and more adept at using recursion to handle headache-inducing ASTs.

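Sometimes it is really hard to tell whether "the capable do more" or "those who do more become capable." On a journey like this the destination certainly matters, but the scenery along the way is a reward in itself. Without this hackathon I most likely would not have written this article, and would have missed a chance to look in the mirror, take stock, and examine myself. So, whatever happens, just do it; by doing it you are already on the way to winning.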

Cover image: AI-generated. Prompt: optimus prime is cooking Italian noodles for a cute toddler, bumblebee makes laughs at him. Digital art
