Unverified commit 65cc7b1d, authored by wade zhang, committed by GitHub

Merge pull request #14977 from taosdata/docs/wade-3.0

doc: english version of cluster chapter and some refine for chinese v…
@@ -6,9 +6,9 @@ title: Deployment
### Step 1
The FQDN of all hosts must be set up properly. All FQDNs need to be configured in the /etc/hosts file on each host. You must confirm that each FQDN can be reached from any other host; for example, you can verify this by using the `ping` command.
The command `hostname -f` can be executed to get the hostname of any host. The `ping <FQDN>` command can be executed on each host to check whether any other host is reachable from it. If any host is not reachable, the network configuration, such as /etc/hosts or DNS settings, needs to be checked and revised so that any two hosts can reach each other.
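For illustration, assuming the hosts h1.taosdata.com and h2.taosdata.com used later in this chapter and two hypothetical private addresses, the /etc/hosts file on every host would contain entries like:

```
192.168.0.1   h1.taosdata.com
192.168.0.2   h2.taosdata.com
```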
:::note
@@ -20,7 +20,7 @@ To get the hostname on any host, the command `hostname -f` can be executed. `pin
### Step 2
If any previous version of TDengine has been installed and configured on any host, the installation needs to be removed and the data cleaned up. For details about uninstalling, please refer to [Install and Uninstall](/operation/pkg-install). To clean up the data, use `rm -rf /var/lib/taos/*`, assuming `dataDir` is configured as `/var/lib/taos`.
:::note
@@ -54,22 +54,12 @@ serverPort 6030
For all the dnodes in a TDengine cluster, the parameters below must be configured exactly the same; any node whose configuration differs from the dnodes already in the cluster cannot join it.
| **#** | **Parameter** | **Definition** |
| ----- | -------------- | ------------------------------------------------------------- |
| 1 | statusInterval | The interval at which each dnode reports its status to mnode |
| 2 | timezone | Time Zone where the server is located |
| 3 | locale | Location code of the system |
| 4 | charset | Character set of the system |
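As a minimal sketch (the values shown are illustrative examples, not required settings), the corresponding section of `taos.cfg` must read identically on every dnode:

```c
// these four parameters must be identical on every dnode in the cluster
statusInterval        1
timezone              UTC-8
locale                en_US.UTF-8
charset               UTF-8
```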
## Start Cluster
@@ -77,19 +67,19 @@ In the following example we assume that first dnode has FQDN h1.taosdata.com and
### Start The First DNODE
Start the first dnode following the instructions in [Get Started](/get-started/). Then launch TDengine CLI `taos` and execute the command `show dnodes`; the output looks like the following example:
```
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
Server is Enterprise trial Edition, ver:3.0.0.0 and will never expire.
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | h1.taosdata.com:6030 | 0 | 1024 | ready | 2022-07-16 10:50:42.673 | |
Query OK, 1 rows affected (0.007984s)
taos>
```
@@ -100,7 +90,7 @@ From the above output, it is shown that the end point of the started dnode is "h
There are a few steps necessary to add other dnodes to the cluster.

Let's assume we are starting the second dnode with FQDN h2.taosdata.com. First, we make sure the configuration is correct.
```c
// firstEp is the end point to connect to when any dnode starts
firstEp               h1.taosdata.com:6030

// must be configured to the FQDN of the host where this dnode is launched
fqdn                  h2.taosdata.com

// the port used by the dnode, default is 6030
serverPort            6030
```
Second, start `taosd` as instructed in [Get Started](/get-started/).
Then, on the first dnode (h1.taosdata.com in our example), launch TDengine CLI `taos` and execute the following command to add the end point of the new dnode to the cluster. In the command, "fqdn:port" must be quoted using double quotes, as shown below.
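Assuming the second dnode keeps the default port 6030 as configured above, the statement is:

```sql
CREATE DNODE "h2.taosdata.com:6030";
```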
@@ -39,31 +39,25 @@
```sql
USE SOME_DATABASE;
SHOW VGROUPS;
```
The output is similar to:

```
taos> use db;
Database changed.
taos> show vgroups;
vgId | tables | status | onlines | v1_dnode | v1_status | compacting |
==========================================================================================
14 | 38000 | ready | 1 | 1 | leader | 0 |
15 | 38000 | ready | 1 | 1 | leader | 0 |
16 | 38000 | ready | 1 | 1 | leader | 0 |
17 | 38000 | ready | 1 | 1 | leader | 0 |
18 | 37001 | ready | 1 | 1 | leader | 0 |
19 | 37000 | ready | 1 | 1 | leader | 0 |
20 | 37000 | ready | 1 | 1 | leader | 0 |
21 | 37000 | ready | 1 | 1 | leader | 0 |
Query OK, 8 row(s) in set (0.001154s)
```
## Add DNODE
@@ -71,7 +65,7 @@ Launch TDengine CLI `taos` and execute the command below to add the end point of
```sql
CREATE DNODE "fqdn:port";
```
The example output is as below:
@@ -142,72 +136,3 @@ In the above example, when `show dnodes` is executed the first time, two dnodes
- dnodeID is allocated automatically and can't be manually modified. dnodeID is generated in ascending order without duplication.
:::
---
sidebar_label: High Availability
title: High Availability
---
## High Availability of Vnode
High availability of vnode can be achieved through replicas in TDengine.
A TDengine cluster can have multiple databases. Each database has a number of vnodes associated with it. A different number of replicas can be configured for each DB. When creating a database, the parameter `replica` is used to specify the number of replicas. The default value for `replica` is 1. Naturally, a single replica cannot guarantee high availability since if one node is down, the data service is unavailable. Note that the number of dnodes in the cluster must NOT be lower than the number of replicas set for any DB, otherwise the `create table` operation will fail with error "more dnodes are needed". The SQL statement below is used to create a database named "demo" with 3 replicas.
```sql
CREATE DATABASE demo replica 3;
```
The data in a DB is divided into multiple shards and stored in multiple vgroups. The number of vnodes in each vgroup is determined by the number of replicas set for the DB. The vnodes in each vgroup store exactly the same data. For the purpose of high availability, the vnodes in a vgroup must be located in different dnodes on different hosts. As long as over half of the vnodes in a vgroup are in an online state, the vgroup is able to provide data access. Otherwise the vgroup can't provide data access for reading or inserting data.
There may be data for multiple DBs in a dnode. When a dnode is down, multiple DBs may be affected. While in theory, the cluster will provide data access for reading or inserting data if over half the vnodes in vgroups are online, because of the possibly complex mapping between vnodes and dnodes, it is difficult to guarantee that the cluster will work properly if over half of the dnodes are online.
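Putting `replica` together with the `vgroups` parameter described in the Load Balance chapter, a hypothetical database with 4 vgroups and 3 replicas occupies 12 vnodes, which must be spread over at least 3 dnodes (a sketch assuming the 3.0 `CREATE DATABASE` options; the database name and numbers are illustrative):

```sql
CREATE DATABASE demo2 REPLICA 3 VGROUPS 4;
```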
## High Availability of Mnode
Each TDengine cluster is managed by `mnode`, which is a module of `taosd`. To achieve high availability of mnode, multiple mnodes can be configured in the system. When a TDengine cluster is started from scratch, there is only one `mnode`; you can then use the `create mnode` command to add more mnodes on other running dnodes.
```sql
SHOW MNODES;
```
The end point and role/status (leader, follower, candidate, offline) of all mnodes can be shown by the above command. When the first dnode is started in a cluster, there must be one mnode in this dnode. Without at least one mnode, the cluster cannot work.
Since TDengine 3.0.0, the RAFT protocol is used to guarantee high availability, so the number of mnodes should be 1 or 3.
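As a sketch of growing from one mnode to three, assuming the additional mnodes are to be placed on the dnodes with IDs 2 and 3, and that the 3.0 `CREATE MNODE ON DNODE` form of the `create mnode` command applies:

```sql
CREATE MNODE ON DNODE 2;
CREATE MNODE ON DNODE 3;
```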
---
sidebar_label: Load Balance
title: Load Balance
---
The load balancing in TDengine is mainly about balancing the processing of time-series data. TDengine employs a built-in hash algorithm to distribute all the tables and subtables of a database, together with their data, across all the vgroups that belong to that database. Each table or subtable can only be handled by a single vgroup, while each vgroup can process multiple tables or subtables.
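As a rough illustration of hash-based placement (a minimal sketch using plain modulo hashing, not TDengine's actual internal algorithm; the function name is made up for this example), a table name could be mapped to a vgroup like this:

```c
#include <stdint.h>

// Sketch only: hash the table name, then map the hash to one vgroup,
// so every table deterministically lands on exactly one vgroup.
static uint32_t vgroup_for_table(const char *tbname, uint32_t num_vgroups) {
    uint32_t h = 5381;                      // djb2 string hash
    for (const char *p = tbname; *p; ++p) {
        h = h * 33u + (uint8_t)*p;
    }
    return h % num_vgroups;                 // table name -> vgroup id
}
```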
The number of vgroups can be specified when creating a database, using the parameter `vgroups`.
```sql
create database db0 vgroups 100;
```
The proper value of `vgroups` depends on available system resources. Assuming there is only one database to be created in the system, the number of `vgroups` is determined by the resources available from all dnodes. In principle, more vgroups can be created if you have more CPU and memory. Disk I/O is another important factor to consider; once disk I/O becomes the bottleneck, more vgroups may degrade system performance significantly. If multiple databases are to be created in the system, the total number of `vgroups` of all the databases depends on the available system resources. Distribute vgroups among these databases carefully, considering the number of tables, the data writing frequency, and the size of each data row for all these databases. A recommended practice is to first choose a starting number for `vgroups`, for example double the number of CPU cores, then adjust and optimize the system configuration to find the best setting for `vgroups`, and finally distribute these vgroups among the databases.
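For example, following the rule of thumb above, a hypothetical cluster whose dnodes provide 48 CPU cores in total might start with twice that number of vgroups and tune from there:

```sql
create database db0 vgroups 96;
```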
Furthermore, TDengine distributes the vgroups of each database evenly among all dnodes. With replica 3 the distribution is even more complex, and TDengine tries its best to prevent any dnode from becoming a bottleneck.
TDengine uses the methods above to achieve load balance in a cluster, which in turn yields higher throughput.
Once load balance is achieved, operations such as deleting tables or dropping databases may make the load across dnodes unbalanced again; a rebalancing method will be provided in later versions. However, even without explicit rebalancing, TDengine will try its best to reach a new balance without manual intervention when a new database is created.
@@ -72,19 +72,22 @@ serverPort 6030
Follow the steps in Get Started to start the first dnode, for example h1.taosdata.com, then execute `taos` to launch the TDengine CLI and run the command "SHOW DNODES" from the shell, as shown below:
```
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
Server is Enterprise trial Edition, ver:3.0.0.0 and will never expire.
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | h1.taosdata.com:6030 | 0 | 1024 | ready | 2022-07-16 10:50:42.673 | |
Query OK, 1 rows affected (0.007984s)
taos>
```
From the above output, you can see that the End Point of the newly started dnode is h1.taosdata.com:6030, which is the firstEp of this new cluster.
@@ -98,7 +101,7 @@ taos>
```sql
CREATE DNODE "h2.taos.com:6030";
```
This adds the End Point of the new dnode (obtained in Step 4 of the preparation) to the cluster's EP list. "fqdn:port" must be enclosed in double quotes, otherwise an error occurs. Note that you should replace the example "h2.taos.com:6030" with the End Point of this new dnode.
@@ -72,10 +72,11 @@ CREATE DNODE "fqdn:port";
```
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | localhost:6030 | 100 | 1024 | ready | 2022-07-15 16:47:47.726 | |
2 | localhost:7030 | 0 | 1024 | ready | 2022-07-15 16:56:13.670 | |
Query OK, 2 rows affected (0.007031s)
```
The output shows that both dnodes are in ready status.
## Drop DNODE
@@ -85,14 +86,15 @@ Query OK, 2 rows affected (0.007031s)
```sql
DROP DNODE "fqdn:port";
```
or
```sql
DROP DNODE dnodeId;
```
A node can be specified either by "fqdn:port" or by its dnodeID. In the former, fqdn is the FQDN of the node to be dropped and port is its serving port; the dnodeID can be obtained via SHOW DNODES.
:::warning
Once a dnode is dropped, it cannot rejoin the cluster; the node must be redeployed (with its data folder cleared) to join again. The cluster migrates the data off the dnode before the `drop dnode` operation completes.
@@ -101,5 +103,3 @@ DROP DNODE dnodeId;
The dnodeID is assigned automatically by the cluster and must not be specified manually. It is generated in ascending order and never repeats.
:::
@@ -28,6 +28,6 @@ A TDengine cluster is managed by the mnode (a module of taosd, the management node)
```sql
SHOW MNODES;
```
This command lists the mnodes, showing the End Point of the dnode where each mnode resides and its role (leader, follower, candidate, offline). When the first dnode in a cluster starts, it must run an mnode instance, otherwise that dnode cannot work properly, because a system must have at least one mnode.
In TDengine 3.0 and later, data replication uses the RAFT protocol, so the number of mnodes should be set to 1 or 3.
@@ -2,17 +2,17 @@
title: Load Balance
---
Load balancing in TDengine mainly refers to balancing the processing of time-series data. TDengine uses a consistent hashing algorithm to distribute the data of all tables and subtables in a database evenly across all the vgroups belonging to that database. Each table or subtable can only be handled by one vgroup, while one vgroup may be responsible for multiple tables or subtables.
The number of vgroups can be specified when creating a database:
```sql
create database db0 vgroups 100;
```
How many vgroups are appropriate depends on system resources. Assuming only one database is planned in the system, the number of vgroups is determined by the resources available to all dnodes in the cluster. In principle, the more CPU and memory are available, the more vgroups can be created. But disk performance must also be considered: once disk performance reaches its limit, too many vgroups will instead drag down the performance of the whole system. If several databases will be created, the total number of vgroups of all databases depends on the amount of available system resources, and allocating vgroups among the databases requires weighing factors such as the number of tables, the write frequency, and the data volume of each database. In practice, it is recommended to first pick an initial number of vgroups based on the system resources, for example twice the total number of CPU cores, and use it as a starting point to find the optimal vgroups setting through testing; this is the total number of vgroups in the system. If there are multiple databases, allocate the vgroups among them according to each database's table count and data volume.
In addition, for any database, TDengine spreads its vgroups as evenly as possible across multiple dnodes. With multiple replicas (replica 3), this balanced distribution is especially complex, and TDengine's placement strategy tries to keep any single dnode from becoming a write bottleneck.
These measures achieve load balance across the whole TDengine cluster as far as possible, and load balance in turn improves the system's overall data processing capacity.