Self-Hosted MongoDB in Practice: MongoDB Sharded Clusters

Published: 2023-03-14 11:24:53

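Generally speaking, there are two ways to improve the performance of a database or computing system. The first is simply to replace our regular servers with more powerful ones; this is usually called vertical scaling (scaling up).

The main drawback of vertical scaling is that it has limits: it cannot grow indefinitely, for reasons such as the hardware reaching its physical limits or the cloud provider being unable to offer a more powerful server.

The second way is to use servers of the same capacity and increase their number; this is usually called horizontal scaling (scaling out).

When the data volume is large, we need to split the data into shards that run on different machines, to reduce the pressure on CPU, memory, and IO. Sharding is this database partitioning technique.

MongoDB sharding is similar to horizontal partitioning in MySQL. Broadly, a database can be scaled in two ways: vertical scaling and horizontal partitioning.

Vertical scaling: add more CPU, memory, disk space, and so on.

Horizontal partitioning: split the data into shards and serve it uniformly through a cluster.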

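A MongoDB sharded cluster consists of the following components:

shard: Each shard contains a subset of the sharded data. Each shard is deployed as a replica set.
mongos: mongos acts as the query router, providing the interface between client applications and the sharded cluster. Starting in MongoDB 4.4, mongos supports hedged reads to minimize latency.
config servers: Config servers store the cluster's metadata and configuration settings.

MongoDB shards data at the collection level, distributing a collection's data across the shards in the cluster.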
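To make the routing idea concrete, here is a minimal sketch of how a router could map a shard-key value to a shard through a table of chunk ranges. The chunk table, its boundaries, and the `routeToShard` function are illustrative assumptions and not MongoDB internals; the real mongos reads chunk metadata from the config servers.

```javascript
// Hypothetical chunk table: each chunk covers the half-open range [min, max)
// of shard-key values and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 0, shard: "shard1" },
  { min: 0, max: 1000, shard: "shard2" },
  { min: 1000, max: Infinity, shard: "shard1" },
];

// Return the name of the shard whose chunk range contains the key.
function routeToShard(chunkTable, key) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  if (!chunk) {
    throw new Error(`no chunk covers shard key ${key}`);
  }
  return chunk.shard;
}

console.log(routeToShard(chunks, -5));   // shard1
console.log(routeToShard(chunks, 42));   // shard2
console.log(routeToShard(chunks, 5000)); // shard1
```

Because a query that carries the shard key resolves to a single chunk, it can be sent to one shard only; a query without it would have to be sent to every shard.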

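A production cluster must provide data redundancy and high availability. For a production-grade sharded cluster, consider the following points:

Deploy the config servers as a 3-member replica set
Deploy each shard as a 3-member replica set
Deploy one or more mongos routers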
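The choice of 3 members can be motivated with a little arithmetic: a replica set needs a majority of its voting members to elect a primary. The sketch below is plain JavaScript with hypothetical helper names (`majority`, `faultTolerance`); it only illustrates the counting and does not talk to a server.

```javascript
// Votes needed to elect a primary in a replica set with `members` voting members.
function majority(members) {
  return Math.floor(members / 2) + 1;
}

// Number of members that can fail while a primary can still be elected.
function faultTolerance(members) {
  return members - majority(members);
}

for (const n of [1, 2, 3, 5]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

With 3 members the majority is 2, so one node can fail without losing the primary, which agrees with the `"majorityVoteCount" : 2` value in the `rs.status()` output later in this article. A 2-member set tolerates no failures at all.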

Environment Preparation

The demo environment:

Hostname	IP	Roles
mongo01.tyun.cn	10.20.20.19	mongos1 (27017), config1 (27000), shard0 primary (27010)
mongo02.tyun.cn	10.20.20.11	mongos2 (27017), config2 (27000), shard0 secondary (27010)
mongo03.tyun.cn	10.20.20.41	mongos3 (27017), config3 (27000), shard0 secondary (27010)
mongo04.tyun.cn	10.20.20.14	shard1 primary (27010)
mongo05.tyun.cn	10.20.20.53	shard1 secondary (27010)
mongo06.tyun.cn	10.20.20.61	shard1 secondary (27010)
mongo07.tyun.cn	10.20.20.62	shard2 primary (27010)
mongo08.tyun.cn	10.20.20.89	shard2 secondary (27010)
mongo09.tyun.cn	10.20.20.99	shard2 secondary (27010)

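If you do not have enough machines when following this walkthrough, you can run several roles on one machine (using different port numbers); one role per machine is not required.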

[Figure: environment topology]

Here we use static name resolution through /etc/hosts; if available, a DNS service can resolve the hostnames instead. The /etc/hosts file is as follows:

10.20.20.19 mongo01.tyun.cn cfg1.tyun.cn mongos1.tyun.cn
10.20.20.11 mongo02.tyun.cn cfg2.tyun.cn mongos2.tyun.cn
10.20.20.41 mongo03.tyun.cn cfg3.tyun.cn mongos3.tyun.cn
10.20.20.14 mongo04.tyun.cn
10.20.20.53 mongo05.tyun.cn
10.20.20.61 mongo06.tyun.cn
10.20.20.62 mongo07.tyun.cn
10.20.20.89 mongo08.tyun.cn
10.20.20.99 mongo09.tyun.cn

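Configure the Config Servers

01 Prepare the configuration files

Create the configuration file /etc/mongo-cfg.conf on each of the 3 config nodes, with the following contents:

# Configuration file for cfg1.tyun.cn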
(venv36) [root@mongo01 ~]# cat /etc/mongo-cfg.conf 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg1.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config


# Configuration file for cfg2.tyun.cn
(venv36) [root@mongo02 ~]# cat /etc/mongo-cfg.conf 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg2.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config


# Configuration file for cfg3.tyun.cn
(venv36) [root@mongo03 ~]# cat /etc/mongo-cfg.conf 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg3.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config
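
The three files above differ only in `net.bindIp`, so they could be generated from one template rather than edited by hand. The following is a minimal sketch of that idea; the function name `renderConfigServerConf` is hypothetical, and the settings are copied from the files shown above.

```javascript
// Render the mongod configuration for one config server.
// Only net.bindIp differs between the three files shown above.
function renderConfigServerConf(bindHost) {
  const lines = [
    "systemLog:",
    "  destination: file",
    "  logAppend: true",
    "  path: /var/log/mongodb/mongo-cfg.log",
    "storage:",
    "  dbPath: /var/lib/mongocfg",
    "  journal:",
    "    enabled: true",
    "  wiredTiger:",
    "    engineConfig:",
    "      cacheSizeGB: 1",
    "processManagement:",
    "  fork: true",
    "  pidFilePath: /var/run/mongodb/mongo-cfg.pid",
    "  timeZoneInfo: /usr/share/zoneinfo",
    "net:",
    "  port: 27000",
    `  bindIp: ${bindHost}`,
    "sharding:",
    "  clusterRole: configsvr",
    "replication:",
    "  replSetName: config",
  ];
  return lines.join("\n") + "\n";
}

for (const host of ["cfg1.tyun.cn", "cfg2.tyun.cn", "cfg3.tyun.cn"]) {
  console.log(`# ---- /etc/mongo-cfg.conf for ${host} ----`);
  console.log(renderConfigServerConf(host));
}
```

Generating the files this way keeps the replica set name and the cluster role identical on all three nodes, which they have to be.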

02 Start the Config Servers

Run the following command on each of the 3 config nodes:

[root@mongo01 ~]# systemctl start mongocfg


[root@mongo02 ~]# systemctl start mongocfg


[root@mongo03 ~]# systemctl start mongocfg

Check that the processes started successfully:

(venv36) [root@mongo01 ~]# ansible -i hosts 'cfg' -m shell -a "systemctl status mongocfg" |grep "Active: active (running)"
   Active: active (running) since Fri 2022-08-05 05:24:56 UTC; 1min 4s ago
   Active: active (running) since Fri 2022-08-05 05:25:25 UTC; 35s ago
   Active: active (running) since Fri 2022-08-05 05:25:36 UTC; 24s ago

03 Initialize the Config Servers

Log in to the first node. No user or password has been created yet, so you can log in without specifying a password.

(venv36) [root@mongo01 ~]# mongo cfg1.tyun.cn:27000
MongoDB shell version v4.4.15
connecting to: mongodb://cfg1.tyun.cn:27000/test?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("651fb6a5-9e7e-43f9-91ee-1ae6a2b3365f") }
MongoDB server version: 4.4.15
> 
> show dbs


> use test
switched to db test


> db.test.insert({a: 1})
WriteCommandError({
    "ok" : 0,
    "errmsg" : "command insert requires authentication",
    "code" : 13,
    "codeName" : "Unauthorized"
})

Apart from creating a user, nothing else is permitted. So the first step is to create a user and password:

> db.getSiblingDB("admin").createUser({user: "root", pwd: "root123", roles: [{role: "root", db: "admin" }]})

Next, initialize the Config Server replica set:

(venv36) [root@mongo01 ~]# mongo -u root -p  --host cfg1.tyun.cn:27000 --authenticationDatabase admin
MongoDB shell version v4.4.15
Enter password: 
connecting to: mongodb://cfg1.tyun.cn:27000/?authSource=admin&compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("a1827479-b741-4f8b-be49-5ca0be4852aa") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting: 
        2022-08-05T06:30:24.135+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).


        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.


        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
>
> rs.initiate({
    _id: "config",
    "members" : [
        {
            "_id": 0,
            "host" : "cfg1.tyun.cn:27000"
        },
        {
            "_id": 1,
            "host" : "cfg2.tyun.cn:27000"
        },
        {
            "_id": 2,
            "host" : "cfg3.tyun.cn:27000"
        }
    ]
});
{ "ok" : 1 }
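
The document passed to `rs.initiate()` follows a regular pattern, so it can be built from a host list. The sketch below is plain JavaScript; `buildReplSetConfig` is a hypothetical helper, and it only constructs the object without contacting any server.

```javascript
// Build the document passed to rs.initiate() from a replica set name
// and a list of "host:port" strings. Member _id values are 0, 1, 2, ...
function buildReplSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const cfg = buildReplSetConfig("config", [
  "cfg1.tyun.cn:27000",
  "cfg2.tyun.cn:27000",
  "cfg3.tyun.cn:27000",
]);
console.log(JSON.stringify(cfg, null, 2));
```

In the mongo shell the resulting object could then be passed as `rs.initiate(cfg)`. The same helper would work for the shard replica sets by changing the name and the host list.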

Wait about 10 seconds while the 3 Config Servers elect a primary.

Note how the prompt changes:

config:SECONDARY> 
config:PRIMARY> 
......
config:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB


config:PRIMARY> use admin
switched to db admin


config:PRIMARY> show users
{
    "_id" : "admin.admin",
    "userId" : UUID("0c0d5bc1-062c-4204-963f-bba842ffda7d"),
    "user" : "admin",
    "db" : "admin",
    "roles" : [
        {
            "role" : "dbAdminAnyDatabase",
            "db" : "admin"
        },
        {
            "role" : "userAdminAnyDatabase",
            "db" : "admin"
        }
    ],
    "mechanisms" : [
        "SCRAM-SHA-1",
        "SCRAM-SHA-256"
    ]
}
{
    "_id" : "admin.root",
    "userId" : UUID("aa54a433-e9a2-452b-bd1d-d6ef54f4a46e"),
    "user" : "root",
    "db" : "admin",
    "roles" : [
        {
            "role" : "root",
            "db" : "admin"
        }
    ],
    "mechanisms" : [
        "SCRAM-SHA-1",
        "SCRAM-SHA-256"
    ]
}

At this point, the Config Server configuration is complete.

Configure the Replica Sets

For Replica Set configuration, refer to the Replica Set chapter. The nodes of shard 1 are:

mongo04.tyun.cn:27010
mongo05.tyun.cn:27010
mongo06.tyun.cn:27010

Configure Mongos

01 Prepare the mongos configuration file

sharding:
  configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
net:
  bindIp: localhost,<hostname(s)|ip address(es)>

A relatively complete configuration file (using mongos1 as an example):

[root@mongo01 ~]# cat /etc/mongos.conf 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongos.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: mongos1.tyun.cn
# security:
#   authorization: enabled
#   keyFile: /etc/mongod.keyfile
sharding:
  configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000
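
The `sharding.configDB` value has the form `<replSetName>/<host1>,<host2>,...`. A small sketch of assembling it from its parts follows; `buildConfigDbString` is a hypothetical helper name, and the hosts are the ones used in this article.

```javascript
// Build the value of sharding.configDB: "<replSetName>/<host1>,<host2>,..."
function buildConfigDbString(replSetName, hosts) {
  if (hosts.length === 0) {
    throw new Error("at least one config server host is required");
  }
  return `${replSetName}/${hosts.join(",")}`;
}

const configDb = buildConfigDbString("config", [
  "cfg1.tyun.cn:27000",
  "cfg2.tyun.cn:27000",
  "cfg3.tyun.cn:27000",
]);
console.log(configDb);
// config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000
```

The replica set name before the slash has to match the `replSetName` configured on the config servers.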

02 Start mongos

[root@mongo01 ~]# mongos \
  --bind_ip mongos1.tyun.cn \
  --port 27017 \
  --logpath /var/log/mongodb/mongos.log \
  --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
  --fork


[root@mongo02 ~]# mongos \
  --bind_ip mongos2.tyun.cn \
  --port 27017 \
  --logpath /var/log/mongodb/mongos.log \
  --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
  --fork


[root@mongo03 ~]# mongos \
  --bind_ip mongos3.tyun.cn \
  --port 27017 \
  --logpath /var/log/mongodb/mongos.log \
  --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
  --fork

mongos can also be started from a configuration file:

[root@mongo01 ~]# cat /etc/mongos.conf 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongos.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: mongos1.tyun.cn
security:
  # authorization: enabled
  keyFile: /etc/mongo.keyfile
sharding:
  configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000

The start command is as follows:

[root@mongo01 ~]# mongos -f /etc/mongos.conf

Add shard1 to the Sharded Cluster

Add the first shard, shard1, to the cluster:

[root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
MongoDB shell version v4.4.15
connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f1ade2c4-c071-4e8a-9fbb-f1093e9d9753") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting: 
        2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
---
mongos> show dbs
admin   0.000GB
config  0.000GB
mongos> 
mongos> sh.addShard("shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010");
{
    "shardAdded" : "shard1",
    "ok" : 1,
    "operationTime" : Timestamp(1659691403, 8),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1659691403, 8),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}


mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
  }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
  active mongoses:
        "4.4.15" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }

Create a Sharded Collection

Next, we create a test database named test, create a collection named shard in it, and enable sharding.

mongos> sh.enableSharding("test");
{
    "ok" : 1,
    "operationTime" : Timestamp(1659755432, 7),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1659755432, 7),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}


mongos> sh.shardCollection("test.shard", {_id: 'hashed'});
{
    "collectionsharded" : "test.shard",
    "collectionUUID" : UUID("329f4308-bff9-453a-bec2-7f3a757d95dd"),
    "ok" : 1,
    "operationTime" : Timestamp(1659755452, 13),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1659755452, 13),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}


mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
  }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
  active mongoses:
        "4.4.15" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1    1024
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "test",  "primary" : "shard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"),  "lastMod" : 1 } }
                test.shard
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                shard1    2 // note this output
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard1 Timestamp(1, 0) 
                        { "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 1)

We can see that shard1 holds 2 chunks.

Insert test data:

mongos> use test
switched to db test


mongos> for (var i = 0; i < 100000; i++) {
    db.shard.insert({i: i});
}


mongos> db.shard.find().limit(10)
{ "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
{ "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
{ "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83d"), "i" : 6 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83e"), "i" : 7 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83f"), "i" : 8 }
{ "_id" : ObjectId("62eddc26f659b8344f42c840"), "i" : 9 }

At this point we can also log in to the shard1 replica set and look at the data (find the primary node and log in to it):

[root@mongo01 ~]# mongo --host mongo05.tyun.cn:27010
MongoDB shell version v4.4.15
connecting to: mongodb://mongo05.tyun.cn:27010/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("b14a4b9b-f6b9-48d5-980a-a7fd3bbf2d73") }
MongoDB server version: 4.4.15
---
shard1:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.004GB
test    0.006GB
shard1:PRIMARY> use test
switched to db test
shard1:PRIMARY> db.shard
db.shard
shard1:PRIMARY> db.shard.find().limit(6)
{ "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
{ "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
{ "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
shard1:PRIMARY>

Add shard2 to the Sharded Cluster

For Replica Set configuration, refer to the Replica Set chapter. The nodes of shard 2 are:

mongo07.tyun.cn:27010
mongo08.tyun.cn:27010
mongo09.tyun.cn:27010

Verify the shard2 replica set:

[root@mongo01 ~]# mongo --host mongo07.tyun.cn:27010


shard2:PRIMARY> rs.status()
{
    "set" : "shard2",
    "date" : ISODate("2022-08-06T03:31:26.564Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "majorityVoteCount" : 2,
    "writeMajorityCount" : 2,
    "votingMembersCount" : 3,
    "writableVotingMembersCount" : 3,
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1659756685, 1),
            "t" : NumberLong(1)
        },
        "lastCommittedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1659756685, 1),
            "t" : NumberLong(1)
        },
        "readConcernMajorityWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
        "appliedOpTime" : {
            "ts" : Timestamp(1659756685, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1659756685, 1),
            "t" : NumberLong(1)
        },
        "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
        "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z")
    },
    "lastStableRecoveryTimestamp" : Timestamp(1659756625, 4),
    "electionCandidateMetrics" : {
        "lastElectionReason" : "electionTimeout",
        "lastElectionDate" : ISODate("2022-08-06T03:30:25.877Z"),
        "electionTerm" : NumberLong(1),
        "lastCommittedOpTimeAtElection" : {
            "ts" : Timestamp(0, 0),
            "t" : NumberLong(-1)
        },
        "lastSeenOpTimeAtElection" : {
            "ts" : Timestamp(1659756615, 1),
            "t" : NumberLong(-1)
        },
        "numVotesNeeded" : 2,
        "priorityAtElection" : 1,
        "electionTimeoutMillis" : NumberLong(10000),
        "numCatchUpOps" : NumberLong(0),
        "newTermStartDate" : ISODate("2022-08-06T03:30:25.915Z"),
        "wMajorityWriteAvailabilityDate" : ISODate("2022-08-06T03:30:26.890Z")
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "mongo07.tyun.cn:27010",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 213,
            "optime" : {
                "ts" : Timestamp(1659756685, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2022-08-06T03:31:25Z"),
            "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1659756625, 1),
            "electionDate" : ISODate("2022-08-06T03:30:25Z"),
            "configVersion" : 1,
            "configTerm" : -1,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 1,
            "name" : "mongo08.tyun.cn:27010",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 71,
            "optime" : {
                "ts" : Timestamp(1659756675, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1659756675, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2022-08-06T03:31:15Z"),
            "optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
            "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
            "lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.933Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncSourceHost" : "mongo07.tyun.cn:27010",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 1,
            "configTerm" : -1
        },
        {
            "_id" : 2,
            "name" : "mongo09.tyun.cn:27010",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 71,
            "optime" : {
                "ts" : Timestamp(1659756675, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1659756675, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2022-08-06T03:31:15Z"),
            "optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
            "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
            "lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.872Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncSourceHost" : "mongo07.tyun.cn:27010",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 1,
            "configTerm" : -1
        }
    ],
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1659756685, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1659756685, 1)
}
shard2:PRIMARY>

Next, add shard2 to the sharded cluster (connect to any mongos):

[root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
MongoDB shell version v4.4.15
connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("1bb0a6ed-dad1-4440-95cb-2f60e0be506f") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting: 
        2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
---
mongos> 
mongos> sh.addShard("shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010");
{
    "shardAdded" : "shard2",
    "ok" : 1,
    "operationTime" : Timestamp(1659756859, 4),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1659756859, 4),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}


mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
  }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
        {  "_id" : "shard2",  "host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010",  "state" : 1 }
  active mongoses:
        "4.4.15" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                31 : Success
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1    994
                                shard2    30
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "test",  "primary" : "shard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"),  "lastMod" : 1 } }
                test.shard
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                shard1    1
                                shard2    1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard2 Timestamp(2, 0) 
                        { "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(2, 1) 
mongos>

The output shows that one of the 2 chunks on shard1 has been migrated to shard2; this is MongoDB's automatic balancing at work.

How many documents does each shard hold?

mongos> status = db.shard.stats()


// View the document count
mongos> status.shards.shard1.count
50184


// Check again after a while
mongos> status.shards.shard2.count
49816


// Compare the document counts of the two shards
mongos> status.shards.shard1.count - status.shards.shard2.count
368

Judging by the document counts on the two shards, the data is distributed roughly evenly.
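
The near-even split follows from hashing: sequential keys are scattered over the hash space, so each half of that space receives about half of the documents. The sketch below simulates this with a stand-in hash. The `mix32` function is the MurmurHash3 32-bit finalizer, used here only as an assumption for illustration; it is not the hash function MongoDB uses for hashed shard keys, and the loop counter stands in for the ObjectId values.

```javascript
// 32-bit integer mixer (the MurmurHash3 finalizer). It stands in for
// MongoDB's real hashed-index function, which is different.
function mix32(x) {
  let h = x | 0;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h | 0; // signed 32-bit result
}

// Two chunks split at 0, mirroring the [MinKey, 0) and [0, MaxKey)
// ranges in the sh.status() output: negative hashes go to shard2.
function shardFor(hash) {
  return hash < 0 ? "shard2" : "shard1";
}

const counts = { shard1: 0, shard2: 0 };
for (let i = 0; i < 100000; i++) {
  counts[shardFor(mix32(i))] += 1;
}
console.log(counts);
```

Running it prints two counts that are close to 50,000 each, comparable in spirit to the small difference observed above. A ranged shard key on a monotonically increasing value would instead send all new inserts to the last chunk.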

mongos> use admin
switched to db admin
mongos> db.runCommand({listShards: 1})
{
    "shards" : [
        {
            "_id" : "shard1",
            "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
            "state" : 1
        },
        {
            "_id" : "shard2",
            "host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010",
            "state" : 1
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660384940, 3),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660384940, 3),
        "signature" : {
            "hash" : BinData(0,"kAzOU7gYu5MWoNSYPEZanw1KYd4="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}

Remove a shard:

mongos> db.adminCommand( { removeShard: "shard2" } )
{
    "msg" : "draining started successfully",
    "state" : "started",
    "shard" : "shard2",
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [
        "testdb"
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660384982, 2),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660384982, 2),
        "signature" : {
            "hash" : BinData(0,"ToGrQJZSWqSfiFwe/Hop2eykOAM="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}

Check the draining status:

mongos> db.adminCommand( { removeShard: "shard2" } )
{
    "msg" : "draining ongoing",
    "state" : "ongoing", // in progress
    "remaining" : {
        "chunks" : NumberLong(406),  // chunks remaining
        "dbs" : NumberLong(1),
        "jumboChunks" : NumberLong(0)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [
        "testdb"
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660385198, 21),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660385198, 21),
        "signature" : {
            "hash" : BinData(0,"HVDmppA+MhUor9a72JKDjWErLKo="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}


// Check again
mongos> db.adminCommand( { removeShard: "shard2" } )
{
    "msg" : "draining ongoing",
    "state" : "ongoing",
    "remaining" : {
        "chunks" : NumberLong(345),  // here
        "dbs" : NumberLong(1),
        "jumboChunks" : NumberLong(0)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [
        "testdb"
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660385328, 3),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660385328, 3),
        "signature" : {
            "hash" : BinData(0,"Wi6BxDNErUjsHYTdVpvbiEyGUrw="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}


// Check again after a while
mongos> db.adminCommand( { removeShard: "shard2" } )
{
    "msg" : "draining ongoing",
    "state" : "ongoing",
    "remaining" : {
        "chunks" : NumberLong(87),  // here
        "dbs" : NumberLong(1),
        "jumboChunks" : NumberLong(0)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [
        "testdb"
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660385870, 3),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660385870, 6),
        "signature" : {
            "hash" : BinData(0,"R5LJzYTNv+s+aJaiJZVZ9arr+84="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}
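
The repeated status checks above can be wrapped in a small polling helper. This is a minimal sketch in plain JavaScript: `waitForDrain` is a hypothetical name, and the `poll` argument is injected so that the example runs without a cluster. In the mongo shell, `poll` could be `() => db.adminCommand({ removeShard: "shard2" })`. The simulated responses reuse the chunk counts from the transcript.

```javascript
// Poll a removeShard-style status function until draining completes.
// `poll` is any function returning an object with a `state` field and,
// while draining, a `remaining.chunks` field.
function waitForDrain(poll, maxPolls) {
  for (let attempt = 1; attempt <= maxPolls; attempt++) {
    const res = poll();
    if (res.state === "completed") {
      return { done: true, attempts: attempt };
    }
    const chunksLeft = res.remaining ? res.remaining.chunks : "unknown";
    console.log(`attempt ${attempt}: state=${res.state}, chunks left=${chunksLeft}`);
    // A real script would sleep between polls here.
  }
  return { done: false, attempts: maxPolls };
}

// Simulated responses using the chunk counts from the transcript above.
const fakeResponses = [
  { state: "started" },
  { state: "ongoing", remaining: { chunks: 406 } },
  { state: "ongoing", remaining: { chunks: 345 } },
  { state: "ongoing", remaining: { chunks: 87 } },
  { state: "completed" },
];
let cursor = 0;
const result = waitForDrain(() => fakeResponses[cursor++], 10);
console.log(result); // { done: true, attempts: 5 }
```

Note that draining does not reach `completed` on its own while `dbsToMove` is non-empty; the databases listed there still have to be moved or dropped.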

Move the database to another shard:

mongos> db.adminCommand( { movePrimary: "testdb", to: "shard0" })
{
    "ok" : 1,
    "operationTime" : Timestamp(1660386323, 42852),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660386323, 42852),
        "signature" : {
            "hash" : BinData(0,"wpJWCc5pzEghDEgRjXl9NiA9Gxs="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}


// Check the status again
mongos> db.adminCommand( { removeShard: "shard2" } )
{
    "msg" : "removeshard completed successfully",
    "state" : "completed",
    "shard" : "shard2",
    "ok" : 1,
    "operationTime" : Timestamp(1660386353, 3),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660386353, 3),
        "signature" : {
            "hash" : BinData(0,"EoqSZ6a4MbSrQcBHH6rVAI1DtyA="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}


mongos> db.runCommand({listShards: 1})
{
    "shards" : [
        {
            "_id" : "shard1",
            "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
            "state" : 1
        },
        {
            "_id" : "shard0",
            "host" : "shard0/mongo01.tyun.cn:27010,mongo02.tyun.cn:27010,mongo03.tyun.cn:27010",
            "state" : 1
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1660386367, 23),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1660386367, 23),
        "signature" : {
            "hash" : BinData(0,"yBy7UjBzOh1RIbm4fj/q+Docptg="),
            "keyId" : NumberLong("7128287226089177110")
        }
    }
}

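Summary

Sharding offers a great deal of flexibility.

However, some operations are subject to restrictions in a sharded environment.

The most important ones are highlighted in the following list:

01 The group() command does not work. Use aggregate() and the aggregation framework, or mapReduce(), instead. (group() was removed entirely in MongoDB 4.2.)

02 The db.eval() command does not work; for security reasons it should be disabled in most cases. (db.eval() was also removed in MongoDB 4.2.)

03 The $isolated option for update operations does not work; this feature is missing in sharded environments. The $isolated option of update() guarantees that, when several documents are updated at once, other readers and writers never see some documents with the new values while others still hold the old values. In an unsharded environment this is implemented by holding a global write lock and/or serializing operations onto a single thread, so that no other thread or operation can access the documents affected by the update(). This implementation performs poorly and allows no concurrency, which is why the $isolated operator is prohibited in sharded environments.

04 The $snapshot query operator is not supported. The $snapshot operator on a find() cursor prevents a document from appearing more than once in the results because it was moved to a different location on disk after an update. The $snapshot operator is expensive and usually not a hard requirement. The alternative is to use an index on a queried field whose key does not change during the query.

05 If a query does not include the shard key, an index cannot cover it. Results in a sharded environment will come from disk rather than from the index alone. The only exception is a query that filters only on the built-in _id field and returns only the _id field; in that case MongoDB can still cover the query with the built-in index.

06 update() and remove() operations behave differently. In a sharded environment, every update() and remove() operation must include the _id or the shard key of the documents to be affected; otherwise the mongos router has to broadcast the operation to every shard, which is very time-consuming.

07 Unique indexes across shards must contain the shard key as a prefix of the index. In other words, to enforce the uniqueness of documents across shards, the index has to follow the same data distribution that MongoDB uses for sharding.

08 The shard key must not exceed 512 bytes (this limit applies to MongoDB 4.2 and earlier; MongoDB 4.4 removed it). The shard key index must be either an ascending index on the shard key fields, optionally followed by other fields, or a hashed index on the shard key.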
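
The prefix rule in item 07 can be checked mechanically. The sketch below compares two key-pattern objects; `shardKeyIsPrefix` and the field names `userId` and `email` are hypothetical and chosen only for illustration.

```javascript
// Return true when the shard key fields are a prefix of the index key fields.
// Both arguments are key-pattern objects such as { userId: 1, email: 1 };
// field order is the insertion order of the object's keys.
function shardKeyIsPrefix(shardKey, indexKey) {
  const shardFields = Object.keys(shardKey);
  const indexFields = Object.keys(indexKey);
  if (shardFields.length > indexFields.length) {
    return false;
  }
  return shardFields.every((field, i) => indexFields[i] === field);
}

console.log(shardKeyIsPrefix({ userId: 1 }, { userId: 1, email: 1 })); // true
console.log(shardKeyIsPrefix({ userId: 1 }, { email: 1, userId: 1 })); // false
console.log(shardKeyIsPrefix({ userId: 1 }, { email: 1 }));            // false
```

Only the first pattern could back a unique index on a collection sharded by `userId`: every pair of documents that could collide is then guaranteed to be on the same shard, where a single index can detect the duplicate.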
