Unverified commit 4311b0e5, authored by sangshuduo, committed by GitHub

Docs/sangshuduo/td 13307 add desc to index (#10326)

* [TD-13307]<docs>: add description to docs index.

* fix the link of how to use taosBenchmark

* fix wrong links

* fix markdown format

* refine getting started section

* update

* fix install doc link

* fix chinese wording

* adjust section order

* adjust few format

* fix markdown format
Parent f7b580e0
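# How to Use taosBenchmark for Performance Testing

Since TDengine was open-sourced in July 2019, its innovative data modeling design, quick installation, easy-to-use programming interfaces, and powerful write and query performance have won over a large number of time-series data developers. Its write and query performance in particular often amazes users who are new to TDengine. To let users experience TDengine's high performance in the shortest possible time, we developed an application called taosBenchmark (formerly named taosdemo) for testing TDengine's write and query performance. With taosBenchmark, users can easily simulate a large number of devices generating massive amounts of data, and can flexibly control the number of columns, the data types, the out-of-order ratio, and the number of concurrent threads through its parameters.

Running taosBenchmark is simple. Download the [TDengine installation package](https://www.taosdata.com/cn/all-downloads/) or build the [TDengine source code](https://github.com/taosdata/TDengine) yourself; taosBenchmark can then be found and run in the installation directory or the build output directory.

The rest of this article explains how to use taosBenchmark and what to watch out for.

## Running an Insertion Test with taosBenchmark

Running the taosBenchmark command without any parameters produces the following output: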
```
$ taosBenchmark
......@@ -58,7 +57,9 @@ column[0]:FLOAT column[1]:INT column[2]:FLOAT
Press enter key to continue or Ctrl-C to stop
```
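This shows the parameters taosBenchmark will use for the upcoming data write. By default, with no command-line parameters, taosBenchmark simulates a data collection scenario typical of the power industry: smart meters reporting data. It creates a database named test and a super table named meters with the following schema: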
```
taos> describe test.meters;
Field | Type | Length | Note |
......@@ -71,7 +72,9 @@ taos> describe test.meters;
location | BINARY | 64 | TAG |
Query OK, 6 row(s) in set (0.002972s)
```
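After you press Enter, taosBenchmark creates the database test and the super table meters and, following TDengine's data modeling best practices, uses meters as a template to create 10,000 child tables, each representing a meter device that reports data independently.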
```
taos> use test;
Database changed.
......@@ -82,7 +85,9 @@ taos> show stables;
meters | 2021-08-27 11:21:01.209 | 4 | 2 | 10000 |
Query OK, 1 row(s) in set (0.001740s)
```
taosBenchmark then generates 10,000 simulated records for each meter device:
```
...
====thread[3] completed total inserted rows: 6250000, total affected rows: 6250000. 347626.22 records/second====
......@@ -99,9 +104,11 @@ Spent 18.0863 seconds to insert rows: 100000000, affected rows: 100000000 with 1
insert delay, avg: 28.64ms, max: 112.92ms, min: 9.35ms
```
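The figures above were measured on an ordinary PC server with 8 CPUs and 64 GB of memory. taosBenchmark inserted 100,000,000 (one hundred million) records in 18 seconds, an average of 5,529,049 records per second.

TDengine also provides a higher-performance parameter binding interface. Writing the same amount of data on the same hardware through the parameter binding interface (taosBenchmark -I stmt) gives the following result: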
```
...
......@@ -136,12 +143,13 @@ Spent 6.0257 seconds to insert rows: 100000000, affected rows: 100000000 with 16
insert delay, avg: 8.31ms, max: 860.12ms, min: 2.00ms
```
taosBenchmark inserted 100 million records in 6 seconds, an insertion rate of 16,595,590 records per second.
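
The throughput figures quoted for the two runs follow from the reported totals. A minimal Python sketch, using only the row count and elapsed times printed in the two runs (the variable names are illustrative, not part of taosBenchmark), recomputes records per second and the relative speedup of the stmt interface:

```python
# Recompute throughput from the totals reported by the two runs.
ROWS = 100_000_000          # rows inserted in each run

taosc_seconds = 18.0863     # default interface (taosc)
stmt_seconds = 6.0257       # parameter binding interface (-I stmt)

taosc_rate = ROWS / taosc_seconds
stmt_rate = ROWS / stmt_seconds
speedup = taosc_seconds / stmt_seconds

print(f"taosc: {taosc_rate:,.0f} records/second")
print(f"stmt:  {stmt_rate:,.0f} records/second")
print(f"stmt is about {speedup:.1f}x faster")
```

The recomputed rates differ from the quoted ones only in the last digits, presumably because the printed elapsed times are rounded.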
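Because taosBenchmark is so convenient to use, we have extended it with more features and support for more complex parameter settings, which makes it useful for preparing and validating sample data during rapid prototyping.

The full list of taosBenchmark command-line parameters can be displayed with taosBenchmark --help: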
```
$ taosBenchmark --help
......@@ -188,51 +196,70 @@ Report bugs to <support@taosdata.com>.
```
taosBenchmark's parameters are designed around the needs of data simulation. Some commonly used ones are described below.
```
-I, --interface=INTERFACE The interface (taosc, rest, and stmt) taosBenchmark uses. Default is 'taosc'.
```
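As mentioned above when comparing the performance of different interfaces, the -I parameter selects the interface. taosc, stmt, and rest are currently supported: taosc writes data with SQL statements, stmt writes data through the parameter binding interface, and rest writes data over the RESTful protocol.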
```
-T, --threads=NUMBER The number of threads. Default is 8.
```
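The -T parameter sets how many threads taosBenchmark uses to write data; multiple threads help squeeze as much processing power as possible out of the hardware.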
```
-b, --data-type=DATATYPE The data_type of columns, default: FLOAT, INT, FLOAT.
-w, --binwidth=WIDTH The width of data_type 'BINARY' or 'NCHAR'. Default is 64
-l, --columns=COLUMNS The number of columns per record. Demo mode by default is 3 (float, int, float). Max values is 4095
```
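As mentioned earlier, taosBenchmark creates a typical meter data collection scenario by default, in which each device reports three measurements: current, voltage, and phase. To define different measurements, use the -b parameter. TDengine supports many data types, including BOOL, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, BINARY, NCHAR, and TIMESTAMP. Passing -b a comma-separated list of types makes taosBenchmark create the corresponding super table and child tables and insert matching simulated data. The -w parameter sets the width of BINARY and NCHAR columns (64 by default). The -l parameter sets the total number of columns: the columns beyond those listed with -b are filled in as INT, which saves manual typing when there are many columns. Up to 4095 columns are supported.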
```
-r, --rec-per-req=NUMBER The number of records per request. Default is 30000.
```
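To reach TDengine's performance limits, write data with multiple clients, multiple threads, and multiple records per insert. The -r parameter sets how many records can be combined into a single write request; the default is 30,000. The effective number of combined records also depends on the client buffer size, which is currently 1 MB. If the records are wide, the maximum number of records per request can be calculated by dividing 1 MB by the column width (in bytes).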
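
The buffer limit described here can be estimated with a short calculation. The sketch below is only an approximation under stated assumptions: it takes "1 MB" to mean 1024 * 1024 bytes, treats the column width as the total width of one record, and ignores any overhead from the SQL text itself. The function names are hypothetical and not part of taosBenchmark.

```python
BUFFER_BYTES = 1024 * 1024   # assumed meaning of the 1 MB client buffer


def max_records_per_request(record_width_bytes: int) -> int:
    """Upper bound on records per request: buffer size divided by record width."""
    if record_width_bytes <= 0:
        raise ValueError("record width must be positive")
    return BUFFER_BYTES // record_width_bytes


def effective_records_per_request(requested: int, record_width_bytes: int) -> int:
    """The requested -r value, capped by what fits into the buffer."""
    return min(requested, max_records_per_request(record_width_bytes))


# A narrow record keeps the requested batch size; a wide record is capped.
print(effective_records_per_request(30_000, 16))
print(effective_records_per_request(30_000, 1024))
```

Under these assumptions, a record of 1024 bytes leaves room for far fewer records than the default of 30,000, so lowering -r or narrowing the columns is worth considering for wide tables.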
```
-t, --tables=NUMBER The number of tables. Default is 10000.
-n, --records=NUMBER The number of records per table. Default is 10000.
-M, --random The value of records generated are totally random. The default is to simulate power equipment senario.
```
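As mentioned earlier, taosBenchmark creates 10,000 tables by default and writes 10,000 records to each. The -t and -n parameters set the number of tables and the number of records per table. Without parameters, the generated data simulates a real scenario: current, voltage, and phase values with some jitter added, which shows TDengine's efficient data compression more realistically. To generate completely random data, use the -M parameter.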
```
-y, --answer-yes Default input yes for prompt.
```
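As shown above, before creating a database or inserting data taosBenchmark prints the list of parameters it is about to use, so that the user knows what will be written. To make automated testing easier, the -y parameter makes taosBenchmark start writing immediately after printing the parameters.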
```
-O, --disorder=NUMBER Insert order mode--0: In order, 1 ~ 50: disorder ratio. Default is in order.
-R, --disorder-range=NUMBER Out of order data's range, ms, default is 1000.
```
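In some scenarios data does not arrive strictly in time order but includes a certain proportion of out-of-order records, which TDengine also handles well. To simulate out-of-order writes, taosBenchmark provides the -O and -R parameters. Setting -O to 0 is the same as omitting it: data is written fully in order. A value from 1 to 50 is the percentage of out-of-order records in the data. -R is the range of the timestamp offset applied to out-of-order records, 1000 milliseconds by default. Note also that time-series data is uniquely identified by its timestamp, so an out-of-order record may end up with exactly the same timestamp as a record already written. Depending on the update value the database was created with, such a record is either discarded (update 0) or overwrites the existing data (update 1 or 2), so the total number of records may differ from the expected number.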
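
To see why out-of-order data can change the final record count, the following Python sketch generates timestamps with a given disorder ratio and range and counts how many distinct timestamps remain. It is an illustration only and not taosBenchmark's actual generator; the function name, the fixed seed, and the way offsets are drawn are all assumptions made for the example.

```python
import random


def simulate_timestamps(n, step_ms, disorder_ratio, disorder_range_ms, seed=0):
    """Return n timestamps (ms); about disorder_ratio percent are shifted into the past."""
    rng = random.Random(seed)
    timestamps = []
    for i in range(n):
        ts = i * step_ms
        if rng.randrange(100) < disorder_ratio:
            ts -= rng.randint(1, disorder_range_ms)   # late-arriving record
        timestamps.append(ts)
    return timestamps


ts = simulate_timestamps(n=10_000, step_ms=1, disorder_ratio=50, disorder_range_ms=1000)
distinct = len(set(ts))
print(f"generated {len(ts)} records, {distinct} distinct timestamps")
```

Records that share a timestamp are the ones that would be dropped or overwritten depending on the update setting, which is why the stored row count can fall below the number of generated records.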
```
-g, --debug Print debug info.
```
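If you are interested in how taosBenchmark writes data, or the result is not what you expected, use the -g parameter to make taosBenchmark print intermediate debugging information to the screen, or redirect it to a file with Linux shell redirection, to help locate the cause of the problem. When execution fails, taosBenchmark also prints the failed statement and the reason to the screen. Search for reason to find the error message returned by the TDengine server.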
```
-x, --aggr-func Test aggregation funtions after insertion.
```
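TDengine is not only very fast at insertion; thanks to its advanced database engine design, its query performance is also very strong. taosBenchmark provides the -x option, which runs common queries after insertion finishes and prints the time each one takes. Below are the results of common queries run on the server described above after inserting 100 million records.

As shown, select * fetching 100 million records (without printing them to the screen) takes only 1.26 seconds. Common aggregate functions over 100 million records usually take only twenty-odd milliseconds, and even the slowest, count, takes less than 40 milliseconds.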
```
taosBenchmark -I stmt -T 48 -y -x
...
......@@ -254,7 +281,9 @@ select min(current) took 0.025812 second(s)
select first(current) took 0.024105 second(s)
...
```
Besides the command line, taosBenchmark also accepts a JSON file as its parameter, which allows richer settings. A typical JSON file looks like this:
```
{
"filetype": "insert",
......@@ -263,17 +292,17 @@ select first(current) took 0.024105 second(s)
"port": 6030,
"user": "root",
"password": "taosdata",
"thread_count": 4,
"thread_count_create_tbl": 4,
"result_file": "./insert_res.txt",
"confirm_parameter_prompt": "no",
"insert_interval": 0,
"interlace_rows": 100,
"num_of_records_per_req": 100,
"databases": [{
"dbinfo": {
"name": "db",
"drop": "yes",
"replica": 1,
"days": 10,
"cache": 16,
......@@ -291,39 +320,41 @@ select first(current) took 0.024105 second(s)
},
"super_tables": [{
"name": "stb",
"child_table_exists":"no",
"childtable_count": 100,
"childtable_prefix": "stb_",
"auto_create_table": "no",
"batch_create_tbl_num": 5,
"data_source": "rand",
"insert_mode": "taosc",
"insert_rows": 100000,
"childtable_limit": 10,
"childtable_offset":100,
"interlace_rows": 0,
"insert_interval":0,
"max_sql_len": 1024000,
"disorder_ratio": 0,
"disorder_range": 1000,
"timestamp_step": 10,
"start_timestamp": "2020-10-01 00:00:00.000",
"sample_format": "csv",
"sample_file": "./sample.csv",
"tags_file": "",
"columns": [{"type": "INT"}, {"type": "DOUBLE", "count":10}, {"type": "BINARY", "len": 16, "count":3}, {"type": "BINARY", "len": 32, "count":6}],
"tags": [{"type": "TINYINT", "count":2}, {"type": "BINARY", "len": 16, "count":5}]
}]
}]
}
```
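For example, "thread_count" and "thread_count_create_tbl" set different numbers of threads for inserting data and for creating tables. Combining "child_table_exists", "childtable_limit", and "childtable_offset" lets multiple taosBenchmark processes (even on different machines) write simultaneously to different ranges of child tables of the same super table. "data_source" and "sample_file" can specify a csv file as the data source, which makes it possible to import existing data.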
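
Splitting one super table across several taosBenchmark processes comes down to giving each process a non-overlapping range of child tables. The Python sketch below computes such ranges. It assumes that "childtable_offset" is the index of the first child table a process writes to and that "childtable_limit" is the number of child tables it writes, which the description implies but does not spell out; the helper name is hypothetical.

```python
def split_child_tables(childtable_count, processes):
    """Split child tables 0..childtable_count-1 into contiguous (offset, limit) ranges."""
    base, extra = divmod(childtable_count, processes)
    ranges, offset = [], 0
    for i in range(processes):
        # The first `extra` processes take one additional table each.
        limit = base + (1 if i < extra else 0)
        ranges.append({"childtable_offset": offset, "childtable_limit": limit})
        offset += limit
    return ranges


# One settings fragment per taosBenchmark process.
for settings in split_child_tables(childtable_count=100, processes=3):
    print(settings)
```

Each fragment would be merged into the "super_tables" entry of that process's own JSON file, with the remaining settings kept identical across processes.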
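## Using taosBenchmark for Query and Subscription Testing

taosBenchmark can not only write data but also run query and subscription tests. A single taosBenchmark instance supports only one of these three functions at a time; the configuration file specifies which one to test.

A typical JSON file for query testing looks like this: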
```
{
"filetype": "query",
......@@ -363,7 +394,9 @@ taosBenchmark 不仅仅可以进行数据写入,也可以执行查询和订阅
}
}
```
The query-specific parameters in the JSON file have the following meanings:
```
"query_times": the number of times each type of query is run
"query_mode": the query interface. "taosc": call TDengine's C interface; "restful": use the RESTful interface. Optional; the default is "taosc".
......@@ -382,6 +415,7 @@ taosBenchmark 不仅仅可以进行数据写入,也可以执行查询和订阅
```
A typical JSON file for subscription testing looks like this:
```
{
"filetype":"subscribe",
......@@ -394,34 +428,36 @@ taosBenchmark 不仅仅可以进行数据写入,也可以执行查询和订阅
"confirm_parameter_prompt": "no",
"specified_table_query":
{
"concurrent":1,
"mode":"sync",
"interval":0,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from stb00_0 ;",
"result": "./subscribe_res0.txt"
}]
},
"super_table_query":
{
"stblname": "stb0",
"threads":1,
"mode":"sync",
"interval":10000,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from xxxx where ts > '2021-02-25 11:35:00.000' ;",
"result": "./subscribe_res1.txt"
}]
}
}
```
The subscription-specific parameters have the following meanings:
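```
"interval": the interval between subscription runs, in seconds. Optional; the default is 0.
"restart": whether to restart the subscription. "yes": start over if the subscription already exists; "no": continue the previous subscription. (Note that the executing user needs read and write permission on the dataDir directory.)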
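......@@ -429,16 +465,15 @@ taosBenchmark 不仅仅可以进行数据写入,也可以执行查询和订阅
"resubAfterConsume": used together with keepProgress; after the subscription has been consumed the given number of times, unsubscribe is called and the subscription is created again.
"result": the name of the file the query results are written to. Optional; the default is empty, meaning results are not written to a file. Note: the result files of different sql statements must not share a name, and the thread number is appended to the file name when the result file is generated.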
```
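## Conclusion

TDengine is a big data platform designed and optimized by Taos Data for IoT, connected vehicles, the industrial internet, IT operations, and similar scenarios. Thanks to the innovative data storage and query engine design in its database kernel, TDengine delivers performance far beyond comparable products. With SQL syntax support and connectors for many programming languages (currently Java, Python, Go, C#, NodeJS, Rust, and others), it is extremely easy to use and has zero learning cost. To meet operations needs, we also provide ecosystem tools for data migration and monitoring.

To make technical evaluation and stress testing easier for users who are new to TDengine, we have built a rich set of features into taosBenchmark. This article is only a brief introduction; taosBenchmark will continue to evolve and improve as new features are added to TDengine. Its code is fully open source on GitHub as part of TDengine. Suggestions and criticism about the use or implementation of taosBenchmark or TDengine are welcome on GitHub or in the Taos Data user group.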
## Appendix: Complete taosBenchmark Parameter Reference

taosBenchmark supports two ways of setting parameters: command-line parameters and a JSON configuration file.

I. Command-line parameters
......@@ -505,12 +540,12 @@ taosBenchmark支持两种配置参数的模式,一种是命令行参数,一
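--help: print the list of command-line parameters.

II. All parameters in the JSON configuration file

taosBenchmark supports testing three functions: insertion, query, and subscription. A single taosBenchmark instance supports only one of them at a time; the configuration file specifies which one to test.

1. JSON configuration file for insertion testing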
```
{
"filetype": "insert",
......@@ -519,17 +554,17 @@ taosBenchmark支持3种功能的测试,包括插入、查询、订阅。但一
"port": 6030,
"user": "root",
"password": "taosdata",
"thread_count": 4,
"thread_count_create_tbl": 4,
"result_file": "./insert_res.txt",
"confirm_parameter_prompt": "no",
"insert_interval": 0,
"interlace_rows": 100,
"num_of_records_per_req": 100,
"databases": [{
"dbinfo": {
"name": "db",
"drop": "yes",
"replica": 1,
"days": 10,
"cache": 16,
......@@ -547,27 +582,27 @@ taosBenchmark支持3种功能的测试,包括插入、查询、订阅。但一
},
"super_tables": [{
"name": "stb",
"child_table_exists":"no",
"childtable_count": 100,
"childtable_prefix": "stb_",
"auto_create_table": "no",
"batch_create_tbl_num": 5,
"data_source": "rand",
"insert_mode": "taosc",
"insert_rows": 100000,
"childtable_limit": 10,
"childtable_offset":100,
"interlace_rows": 0,
"insert_interval":0,
"max_sql_len": 1024000,
"disorder_ratio": 0,
"disorder_range": 1000,
"timestamp_step": 10,
"start_timestamp": "2020-10-01 00:00:00.000",
"sample_format": "csv",
"sample_file": "./sample.csv",
"use_sameple_ts": "no",
"tags_file": "",
"columns": [{"type": "INT"}, {"type": "DOUBLE", "count":10}, {"type": "BINARY", "len": 16, "count":3}, {"type": "BINARY", "len": 32, "count":6}],
"tags": [{"type": "TINYINT", "count":2}, {"type": "BINARY", "len": 16, "count":5}]
}]
......@@ -700,6 +735,7 @@ taosBenchmark支持3种功能的测试,包括插入、查询、订阅。但一
}]
2. JSON configuration file for query testing
```
{
"filetype": "query",
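......@@ -784,12 +820,12 @@ taosBenchmark支持3种功能的测试,包括插入、查询、订阅。但一
"result": the name of the file the query results are written to. Optional; the default is empty, meaning results are not written to a file.
Note: the result files of different sql statements must not share a name, and the thread number is appended to the file name when the result file is generated.
Query result display: if a query thread finishes a query more than 30 seconds after it started running, it prints the query count, elapsed time, and QPS once. When all queries have finished, the total query count and QPS are printed.
3. JSON configuration file for subscription testing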
```
{
"filetype":"subscribe",
......@@ -802,28 +838,28 @@ taosBenchmark支持3种功能的测试,包括插入、查询、订阅。但一
"confirm_parameter_prompt": "no",
"specified_table_query":
{
"concurrent":1,
"mode":"sync",
"interval":0,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from stb00_0 ;",
"result": "./subscribe_res0.txt"
}]
},
"super_table_query":
{
"stblname": "stb0",
"threads":1,
"mode":"sync",
"interval":10000,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from xxxx where ts > '2021-02-25 11:35:00.000' ;",
"result": "./subscribe_res1.txt"
}]
}
......
Since TDengine was open-sourced in July 2019, it has become popular among time-series database developers thanks to its innovative data modeling design, simple installation, easy-to-use programming interfaces, and powerful insertion and query performance. The insertion and query performance often astonishes users who are new to TDengine. To help users experience TDengine's performance and features in the shortest possible time, we developed an application called `taosBenchmark` (formerly named `taosdemo`) for testing TDengine's insertion and query performance. With it, users can easily simulate a large number of devices generating a very large amount of data, and can control the number of columns, the data types, the disorder ratio, and the number of concurrent threads through taosBenchmark parameters.
Running taosBenchmark is very simple. Download the [TDengine installation package](https://www.taosdata.com/cn/all-downloads/) or compile the [TDengine code](https://github.com/taosdata/TDengine) yourself. taosBenchmark can then be found and run in the installation directory or in the build output directory.
# To run an insertion test with taosBenchmark
Executing taosBenchmark without any parameters results in the following output.
```
$ taosBenchmark
......@@ -70,6 +70,7 @@ Query OK, 6 row(s) in set (0.002972s)
```
After you press Enter, taosBenchmark creates the database test and the super table meters. Following TDengine data modeling best practices, it then uses meters as a template to generate 10,000 sub-tables, each representing an individual meter device that reports data independently.
```
taos> use test;
Database changed.
......@@ -91,7 +92,9 @@ taos> show stables;
meters | 2021-08-27 11:21:01.209 | 4 | 2 | 10000 |
Query OK, 1 row(s) in set (0.001740s)
```
Then taosBenchmark generates 10,000 records for each meter device.
```
...
====thread[3] completed total inserted rows: 6250000, total affected rows: 6250000. 347626.22 records/second====
......@@ -108,9 +111,11 @@ Spent 18.0863 seconds to insert rows: 100000000, affected rows: 100000000 with 1
insert delay, avg: 28.64ms, max: 112.92ms, min: 9.35ms
```
The results above come from a real test on an ordinary PC server with 8 CPUs and 64 GB of RAM. taosBenchmark inserted 100,000,000 (100 million) records in 18 seconds, an average of 5,529,049 records per second.

TDengine also offers a parameter-binding interface for better performance. Writing the same amount of data on the same hardware through the parameter-binding interface (taosBenchmark -I stmt) gives the following results.
```
...
......@@ -145,12 +150,13 @@ Spent 6.0257 seconds to insert rows: 100000000, affected rows: 100000000 with 16
insert delay, avg: 8.31ms, max: 860.12ms, min: 2.00ms
```
taosBenchmark inserted 100 million records in 6 seconds, a much higher insertion rate of 16,595,590 records per second.
Because taosBenchmark is so easy to use, we have extended it with more features and support for more complex parameter settings, so that it can be used to prepare and validate sample data for rapid prototyping.
The complete list of taosBenchmark command-line arguments can be displayed via taosBenchmark --help as follows.
```
$ taosBenchmark --help
......@@ -197,52 +203,70 @@ Report bugs to <support@taosdata.com>.
```
taosBenchmark's parameters are designed to meet the needs of data simulation. A few commonly used parameters are described below.
```
-I, --interface=INTERFACE The interface (taosc, rest, and stmt) taosBenchmark uses. Default is 'taosc'.
```
As mentioned earlier when comparing the performance of the different interfaces, the -I parameter selects the interface; taosc, stmt, and rest are currently supported. taosc writes data with SQL statements, stmt writes data through the parameter-binding interface, and rest writes data over the RESTful protocol.
```
-T, --threads=NUMBER The number of threads. Default is 8.
```
The -T parameter sets how many threads taosBenchmark uses to write data, so that multiple threads can squeeze as much processing power out of the hardware as possible.
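
The per-thread line in the insertion output shown earlier can be tied back to the thread count with simple arithmetic. The Python sketch below uses only numbers printed in that run and assumes that every thread wrote the same number of rows and that tables are divided evenly among threads; the variable names are illustrative.

```python
TOTAL_ROWS = 100_000_000        # total rows reported by the run
ROWS_PER_THREAD = 6_250_000     # rows reported for thread[3]
TABLES = 10_000                 # default number of tables
RECORDS_PER_TABLE = 10_000      # default number of records per table

# Number of threads implied by the totals, assuming an even split.
threads, remainder = divmod(TOTAL_ROWS, ROWS_PER_THREAD)
tables_per_thread = TABLES // threads

print(f"threads: {threads} (remainder {remainder})")
print(f"tables per thread: {tables_per_thread}")
print(f"rows per thread: {tables_per_thread * RECORDS_PER_TABLE}")
```

Under the even-split assumption, the product of tables per thread and records per table reproduces the per-thread row count from the output.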
```
-b, --data-type=DATATYPE The data_type of columns, default: FLOAT, INT, FLOAT.
-w, --binwidth=WIDTH The width of data_type 'BINARY' or 'NCHAR'. Default is 64
-l, --columns=COLUMNS The number of columns per record. Demo mode by default is 3 (float, int, float). Max values is 4095
```
As mentioned earlier, taosBenchmark creates a typical meter data reporting scenario by default, with each device reporting three columns: current, voltage, and phase. To define different columns, use the -b parameter. TDengine supports the BOOL, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, BINARY, NCHAR, and TIMESTAMP data types, among others. Passing -b a comma-separated list of types makes taosBenchmark create the corresponding super table and sub-tables and insert matching simulated data. The -w parameter specifies the width of BINARY and NCHAR columns (default is 64). The -l parameter sets the total number of columns: columns beyond those specified with -b are added as INT columns, which saves manual input when the number of columns is particularly large. Up to 4095 columns are supported.
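
For scripted runs it can help to assemble the -b, -w, and -l options programmatically. The following Python sketch builds an argument list from those three options only; the helper name and the example column types are hypothetical, and the range check on the column count is this sketch's own sanity check based on the stated maximum of 4095.

```python
import shlex


def build_command(types, binary_width=None, total_columns=None):
    """Assemble a taosBenchmark argument list from -b, -w and -l settings."""
    cmd = ["taosBenchmark", "-b", ",".join(types)]
    if binary_width is not None:
        cmd += ["-w", str(binary_width)]
    if total_columns is not None:
        if not 1 <= total_columns <= 4095:
            raise ValueError("total columns must be between 1 and 4095")
        cmd += ["-l", str(total_columns)]
    return cmd


cmd = build_command(["INT", "DOUBLE", "BINARY"], binary_width=16, total_columns=10)
print(shlex.join(cmd))   # a shell-ready command line
```

The resulting list can be passed directly to `subprocess.run` on a machine where taosBenchmark is installed, which avoids quoting problems with the comma-separated type list.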
```
-r, --rec-per-req=NUMBER The number of records per request. Default is 30000.
```
To reach TDengine's performance limits, insert data using multiple clients, multiple threads, and batches of records per request. The -r parameter sets the number of records that can be combined into a single write request; the default is 30,000. The effective number of combined records also depends on the client buffer size, which is currently 1 MB. If the record's column width is large, the maximum number of combined records can be calculated by dividing 1 MB by the column width (in bytes).
```
-t, --tables=NUMBER The number of tables. Default is 10000.
-n, --records=NUMBER The number of records per table. Default is 10000.
-M, --random The value of records generated are totally random. The default is to simulate power equipment scenario.
```
As mentioned earlier, taosBenchmark creates 10,000 tables by default and writes 10,000 records to each table. The number of tables and the number of records per table can be set with -t and -n. The data generated by default, without parameters, simulates a real scenario: current, voltage, and phase values with some jitter, which shows TDengine's efficient data compression more realistically. To generate completely random data, pass the -M parameter.
```
-y, --answer-yes Default input yes for prompt.
```
As we can see above, taosBenchmark outputs a list of parameters for the upcoming operation by default before creating a database or inserting data, so that the user can know what data is about to be written before inserting. To facilitate automatic testing, the -y parameter allows taosBenchmark to write data immediately after outputting the parameters.
```
-O, --disorder=NUMBER Insert order mode--0: In order, 1 ~ 50: disorder ratio. Default is in order.
-R, --disorder-range=NUMBER Out of order data's range, ms, default is 1000.
```
In some scenarios, data does not arrive strictly in time order but contains a certain percentage of out-of-order records, which TDengine also handles well. To simulate out-of-order writes, taosBenchmark provides the -O and -R parameters. Setting -O to 0 is the same as omitting -O: data is written fully in order. A value from 1 to 50 is the percentage of out-of-order records in the data. The -R parameter is the range of the timestamp offset of the out-of-order records; the default is 1000 milliseconds. Also note that time-series data is uniquely identified by its timestamp, so an out-of-order record may get exactly the same timestamp as previously written data. Depending on the update value the database was created with, such a record is either discarded (update 0) or overwrites the existing data (update 1 or 2), so the total number of records may not match the expected number.
```
-g, --debug Print debug info.
```
If you are interested in the taosBenchmark insertion process or if the data insertion result is not as expected, you can use the -g parameter to make taosBenchmark print the debugging information in the process of the execution to the screen or import it to another file with the Linux redirect command to easily find the cause of the problem. In addition, taosBenchmark will also output the corresponding executed statements and debugging reasons to the screen after the execution fails. You can search the word "reason" to find the error reason information returned by the TDengine server.
```
-x, --aggr-func Test aggregation funtions after insertion.
```
TDengine is not only very powerful in insertion performance but also in query performance, thanks to its advanced database engine design. taosBenchmark provides the -x option, which runs common query operations after the data has been inserted and prints the time each query takes. The following are the results of common queries after inserting 100 million rows on the aforementioned server.

You can see that select * fetching 100 million rows (without printing them to the screen) takes only 1.26 seconds. Most common aggregate functions over 100 million records take only about 20 milliseconds, and even the slowest, count, takes less than 40 milliseconds.
```
taosBenchmark -I stmt -T 48 -y -x
...
......@@ -264,7 +288,9 @@ select min(current) took 0.025812 second(s)
select first(current) took 0.024105 second(s)
...
```
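
When comparing runs, it is convenient to turn the -x timing lines into numbers. The Python sketch below parses the two timing lines visible in the output above into milliseconds. The regular expression is written against that line format only, and the function name is hypothetical.

```python
import re

# Two lines copied from the -x output shown above.
OUTPUT = """\
select min(current) took 0.025812 second(s)
select first(current) took 0.024105 second(s)
"""

PATTERN = re.compile(
    r"^select\s+(?P<query>.+?)\s+took\s+(?P<seconds>[0-9.]+)\s+second\(s\)$"
)


def parse_timings(text):
    """Map each query expression to its elapsed time in milliseconds."""
    timings = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            timings[match["query"]] = float(match["seconds"]) * 1000.0
    return timings


for query, ms in parse_timings(OUTPUT).items():
    print(f"{query}: {ms:.3f} ms")
```

Lines that do not match the pattern, such as the `...` placeholders, are skipped, so the whole captured output can be passed in unchanged.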
In addition to the command line, taosBenchmark also accepts a JSON file as its input parameter, which provides a richer set of settings. A typical JSON file looks like this.
```
{
"filetype": "insert",
......@@ -273,17 +299,17 @@ In addition to the command line approach, taosBenchmark also supports take a JSO
"port": 6030,
"user": "root",
"password": "taosdata",
"thread_count": 4,
"thread_count_create_tbl": 4,
"result_file": "./insert_res.txt",
"confirm_parameter_prompt": "no",
"insert_interval": 0,
"interlace_rows": 100,
"num_of_records_per_req": 100,
"databases": [{
"dbinfo": {
"name": "db",
"drop": "yes",
"replica": 1,
"days": 10,
"cache": 16,
......@@ -301,39 +327,41 @@ In addition to the command line approach, taosBenchmark also supports take a JSO
},
"super_tables": [{
"name": "stb",
"child_table_exists":"no",
"childtable_count": 100,
"childtable_prefix": "stb_",
"auto_create_table": "no",
"batch_create_tbl_num": 5,
"data_source": "rand",
"insert_mode": "taosc",
"insert_rows": 100000,
"childtable_limit": 10,
"childtable_offset":100,
"interlace_rows": 0,
"insert_interval":0,
"max_sql_len": 1024000,
"disorder_ratio": 0,
"disorder_range": 1000,
"timestamp_step": 10,
"start_timestamp": "2020-10-01 00:00:00.000",
"sample_format": "csv",
"sample_file": "./sample.csv",
"tags_file": "",
"columns": [{"type": "INT"}, {"type": "DOUBLE", "count":10}, {"type": "BINARY", "len": 16, "count":3}, {"type": "BINARY", "len": 32, "count":6}],
"tags": [{"type": "TINYINT", "count":2}, {"type": "BINARY", "len": 16, "count":5}]
}]
}]
}
```
For example, "thread_count" and "thread_count_create_tbl" specify different numbers of threads for data insertion and for table creation. A combination of "child_table_exists", "childtable_limit", and "childtable_offset" lets multiple taosBenchmark processes (even on different computers) write to different ranges of child tables of the same super table at the same time. You can also import existing data by specifying a csv file as the data source with "data_source" and "sample_file".
# Use taosBenchmark for query and subscription testing
taosBenchmark can not only write data but also run query and subscription tests. However, a single taosBenchmark instance supports only one of these three functions; the configuration file specifies which one to test.
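
Because one configuration file selects exactly one function, a script that launches taosBenchmark can check the "filetype" key before starting a run. The Python sketch below does that for the three values that appear in this article's examples ("insert", "query", "subscribe"); the function name is hypothetical and the check is not part of taosBenchmark itself.

```python
import json

SUPPORTED_FILETYPES = {"insert", "query", "subscribe"}


def detect_function(config_text):
    """Return the single function a configuration selects via its "filetype" key."""
    config = json.loads(config_text)
    filetype = config.get("filetype")
    if filetype not in SUPPORTED_FILETYPES:
        raise ValueError(f"unsupported filetype: {filetype!r}")
    return filetype


print(detect_function('{"filetype": "query"}'))
```

A file with a missing or unknown "filetype" raises an error here instead of being passed on, which makes mistakes visible before a long benchmark starts.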
The following is the content of a typical query JSON example file.
```
{
"filetype": "query",
......@@ -373,7 +401,9 @@ The following is the content of a typical query JSON example file.
}
}
```
The following parameters are specific to the query in the JSON file.
```
"query_times": the number of queries per query type
"query_mode": query data interface. "taosc": call TDengine's C interface; "restful": use the RESTful interface. Optional; default is "taosc".
......@@ -392,6 +422,7 @@ The following parameters are specific to the query in the JSON file.
```
The following is a typical subscription JSON example file content.
```
{
"filetype":"subscribe",
......@@ -404,34 +435,36 @@ The following is a typical subscription JSON example file content.
"confirm_parameter_prompt": "no",
"specified_table_query":
{
"concurrent":1,
"mode":"sync",
"interval":0,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from stb00_0 ;",
"result": "./subscribe_res0.txt"
}]
},
"super_table_query":
{
"stblname": "stb0",
"threads":1,
"mode":"sync",
"interval":10000,
"restart":"yes",
"keepProgress":"yes",
"sqls": [
{
"sql": "select * from xxxx where ts > '2021-02-25 11:35:00.000' ;",
"result": "./subscribe_res1.txt"
}]
}
}
```
The following are the meanings of the parameters specific to the subscription function.
```
"interval": interval for executing subscriptions, in seconds. Optional, default is 0.
"restart": subscription restart. "yes": restart the subscription if it already exists; "no": continue the previous subscription. (Please note that the executing user needs read/write access to the dataDir directory.)
......@@ -439,11 +472,12 @@ The following are the meanings of the parameters specific to the subscription fu
"resubAfterConsume": Used in conjunction with keepProgress to call unsubscribe after the subscription has been consumed the appropriate number of times and to subscribe again.
"result": the name of the file to which the query result is written. Optional; default is empty, meaning the query result is not written to a file. Note: the result files of different sql statements must not share the same name, and the thread number is appended to the file name when the result file is generated.
```
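
The rule that result files must not share a name can be checked before a run. The Python sketch below collects the "result" values from a subscription configuration shaped like the example above and reports any name used more than once. It checks across both query sections, which may be stricter than required, and the function name is hypothetical.

```python
from collections import Counter


def duplicate_result_files(config):
    """Return result file names used by more than one sql entry."""
    names = []
    for section in ("specified_table_query", "super_table_query"):
        for entry in config.get(section, {}).get("sqls", []):
            if entry.get("result"):
                names.append(entry["result"])
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Trimmed-down version of the subscription example above.
config = {
    "filetype": "subscribe",
    "specified_table_query": {
        "sqls": [{"sql": "select * from stb00_0 ;", "result": "./subscribe_res0.txt"}]
    },
    "super_table_query": {
        "sqls": [{"sql": "select * from xxxx where ts > '2021-02-25 11:35:00.000' ;",
                  "result": "./subscribe_res1.txt"}]
    },
}
print(duplicate_result_files(config))
```

An empty list means every sql statement writes to its own file; any name in the list should be changed before running the test.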
# Conclusion
TDengine is a big data platform designed and optimized for IoT, Telematics, Industrial Internet, DevOps, etc. Thanks to the innovative data storage and query engine design in its database kernel, TDengine delivers performance that far exceeds similar products. With SQL syntax support and connectors for multiple programming languages (currently Java, Python, Go, C#, NodeJS, Rust, etc.), it is extremely easy to use and has zero learning cost. To meet operation and maintenance needs, we also provide data migration, monitoring, and other related ecosystem tools.
For users who are new to TDengine, we have developed rich features for taosBenchmark to facilitate technical evaluation and stress testing. This article is a brief introduction to taosBenchmark, which will continue to evolve and improve as new features are added to TDengine.

As part of TDengine, taosBenchmark's source code is fully open on GitHub. Suggestions or advice about the use or implementation of taosBenchmark or TDengine are welcome on GitHub or in the Taos Data user group.
......@@ -22,7 +22,7 @@ TDengine is very easy to install, from download to successful installation in ju
<ul id="server-packageList" class="package-list"></ul>
For detailed installation steps, please refer to [How to install/uninstall TDengine with installation package](https://www.taosdata.com/getting-started/install).
**Click [here](https://github.com/taosdata/TDengine/releases) for release notes.**
......@@ -57,7 +57,7 @@ To run taosdump, you need to install the TDengine server or TDengine client inst
If you want to contribute to TDengine, please visit [TDengine GitHub page](https://github.com/taosdata/TDengine) for detailed instructions on build and installation from the source code.
**To download other components, beta version, or early releases, please click [here](https://www.taosdata.com/en/all-downloads/).**
## <a class="anchor" id="start"></a>Quick Launch
......