---
title: taosBenchmark
sidebar_label: taosBenchmark
toc_max_heading_level: 4
description: "taosBenchmark (formerly taosdemo) is a tool for testing the performance of TDengine."
---

## Introduction

taosBenchmark (formerly taosdemo) is a tool for testing the performance of TDengine products. It can test the performance of TDengine's insert, query, and subscription functions and simulate large amounts of data generated by many devices. It gives flexible control over the number and types of databases and super tables, the number and types of tag columns and data columns, the number of sub-tables, the amount of data per sub-table, the time interval for inserting data, the number of working threads, whether and how to insert disordered data, and so on. For compatibility and for the convenience of past users, the installer provides taosdemo as a soft link to taosBenchmark.

## Installation

There are two ways to install taosBenchmark:

- Installing the official TDengine installer will automatically install taosBenchmark. Please refer to [TDengine installation](/operation/pkg-install) for details.

- Compile and install taos-tools separately. Please refer to the [taos-tools](https://github.com/taosdata/taos-tools) repository for details.

## Run

### Configuration and running methods

taosBenchmark must be run from an operating-system terminal. It supports two configuration methods: [command-line arguments](#command-line-argument-in-detailed) and a [JSON configuration file](#configuration-file-parameters-in-detailed). These two methods are mutually exclusive. Use `-f <json file>` to specify a configuration file. When controlling taosBenchmark with command-line arguments, use the other parameters for configuration and omit `-f`. In addition, taosBenchmark can be run without any parameters.

taosBenchmark supports complete performance testing of TDengine in three categories of functions: write, query, and subscribe. These three functions are mutually exclusive, and only one of them can be selected each time taosBenchmark runs. Note that the function type is not configurable with the command-line method, which can only test write performance. To test query or subscription performance, you must use the configuration file method and specify the function type via the `filetype` parameter in the configuration file.
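
The rule that `-f` cannot be combined with other command-line arguments can be checked before a run is started. The following is a minimal sketch, assuming a hypothetical helper named `check_taosbenchmark_args` that is not part of taosBenchmark; it only inspects the argument list and does not start any test.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; check_taosbenchmark_args is NOT part of taosBenchmark.
# Returns 0 if the argument list uses only one configuration method, 1 otherwise.
# Only the separate forms "-f <file>" and "--file <file>" are recognized here.
check_taosbenchmark_args() {
  local has_file=0 has_other=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -f|--file)
        has_file=1
        # Skip the JSON file name that follows -f, if there is one.
        if [ "$#" -gt 1 ]; then shift; fi
        ;;
      *)
        has_other=1
        ;;
    esac
    shift
  done
  if [ "$has_file" -eq 1 ] && [ "$has_other" -eq 1 ]; then
    echo "error: -f cannot be combined with other command-line arguments" >&2
    return 1
  fi
  return 0
}
```

A wrapper script could call `check_taosbenchmark_args "$@"` first and run `taosBenchmark "$@"` only when the check succeeds.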

**Make sure that the TDengine cluster is running correctly before running taosBenchmark.**

### Run without command-line arguments

Run the following command to quickly try taosBenchmark's write performance test of TDengine with the default configuration.

```bash
taosBenchmark
```

When run without parameters, taosBenchmark connects by default to the TDengine cluster specified in `/etc/taos`. It creates a database named `test`, a super table named `meters` in that database, and 10,000 tables under the super table, and writes 10,000 records to each table. Note that if a database named `test` already exists, this command deletes it first and creates a new database.

### Run with command-line configuration parameters

The `-f <json file>` argument cannot be used when taosBenchmark is controlled through command-line parameters; all configuration parameters must be specified on the command line. The following example tests write performance using the command-line approach.

```bash
taosBenchmark -I stmt -n 200 -t 100
```

Using the above command, `taosBenchmark` will create a database named `test`, create a super table `meters` in it, create 100 sub-tables in the super table and insert 200 records for each sub-table using parameter binding.
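
As a quick sanity check on the size of such a run, the total number of rows is the number of sub-tables multiplied by the number of records per sub-table. The sketch below uses only the options from the example above; the shell variable names are illustrative, and the command is printed rather than executed so that it can be reviewed first.

```bash
#!/usr/bin/env bash
# Illustrative variables; only -I, -n and -t come from the example above.
tables=100      # -t: number of sub-tables
records=200     # -n: records per sub-table
interface=stmt  # -I: insert mode (parameter binding)

# Total rows written by the run: sub-tables x records per sub-table.
total_rows=$(( tables * records ))

# Build the command as an array so it can be inspected before it is run.
cmd=(taosBenchmark -I "$interface" -n "$records" -t "$tables")

echo "command:    ${cmd[*]}"
echo "total rows: $total_rows"
```

Running `"${cmd[@]}"` afterwards would start the actual test.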

### Run with the configuration file

A sample configuration file is provided in the taosBenchmark installation package under `<install_directory>/examples/taosbenchmark-json`.

Use the following command to run taosBenchmark and control its behavior via a configuration file.

```bash
taosBenchmark -f <json file>
```

**Here are a few examples of configuration files:**

#### Insert scenario JSON configuration example

<details>
<summary>insert.json</summary>

```json
{{#include /taos-tools/example/insert.json}}
```

</details>

#### Query scenario JSON configuration example

<details>
<summary>query.json</summary>

```json
{{#include /taos-tools/example/query.json}}
```

</details>

#### Subscription JSON configuration example

<details>
<summary>subscribe.json</summary>

```json
{{#include /taos-tools/example/subscribe.json}}
```

</details>

## Command-line argument in detailed

- **-f/--file <json file\>** :
  Specify the configuration file to use. This file contains all parameters. Do not combine this parameter with other parameters on the command line. There is no default value.

- **-c/--config-dir <dir\>** :
  Specify the directory containing the TDengine cluster configuration file. The default path is `/etc/taos`.

- **-h/--host <host\>** :
  Specify the FQDN of the TDengine server to connect to. The default value is localhost.

- **-P/--port <port\>** :
  The port number of the TDengine server to connect to. The default value is 6030.

- **-I/--interface <insertMode\>** :
  Insert mode. Options are taosc, rest, stmt, sml, and sml-rest, corresponding to normal write, RESTful interface write, parameter binding interface write, schemaless interface write, and RESTful schemaless interface write (provided by taosAdapter). The default value is taosc.

- **-u/--user <user\>** :
  User name to connect to the TDengine server. Default is root.

- **-p/--password <passwd\>** :
  Password to connect to the TDengine server. Default is `taosdata`.

- **-o/--output <file\>** :
  Specify the path of the result output file. The default value is `./output.txt`.

- **-T/--thread <threadNum\>** :
  The number of threads to insert data. Default is 8.

- **-B/--interlace-rows <rowNum\>** :
  Enables interleaved insertion mode and specifies the number of rows of data to be inserted into each child table. Interleaved insertion mode means inserting the number of rows specified by this parameter into each sub-table and repeating the process until all sub-tables have been inserted. The default value is 0, i.e., data is inserted into one sub-table before the next sub-table is inserted.

- **-i/--insert-interval <timeInterval\>** :
  Specify the insert interval in `ms` for interleaved insert mode. The default value is 0. It only works if `-B/--interlace-rows` is greater than 0. That means that after inserting interlaced rows for each child table, the data insertion with multiple threads will wait for the interval specified by this value before proceeding to the next round of writes.

- **-r/--rec-per-req <rowNum\>** :
  The number of rows written to TDengine per request. The default value is 30000.

- **-t/--tables <tableNum\>** :
  Specify the number of sub-tables. The default is 10000.

- **-S/--timestampstep <stepLength\>** :
  Timestamp step for inserting data in each child table in ms, default is 1.

- **-n/--records <recordNum\>** :
  The number of records inserted into each sub-table. The default value is 10000.

- **-d/--database <dbName\>** :
  The name of the database used, the default value is `test`.

- **-b/--data-type <colType\>** :
  Specify the types of the data columns of the super table. If not set, the default is three columns of types FLOAT, INT, and FLOAT.

- **-l/--columns <colNum\>** :
  Specify the number of columns in the super table. If both this parameter and `-b/--data-type` are set, the resulting number of columns is the greater of the two. If the number given here is greater than the number of columns given by `-b/--data-type`, the unspecified columns default to INT. For example, `-l 5 -b float,double` results in the columns `FLOAT,DOUBLE,INT,INT,INT`. If the number given here is less than or equal to the number of columns given by `-b/--data-type`, the result is exactly the columns and types given by `-b/--data-type`. For example, `-l 3 -b float,double,float,bigint` results in the columns `FLOAT,DOUBLE,FLOAT,BIGINT`.

- **-A/--tag-type <tagType\>** :
  The tag column type of the super table. nchar and binary types can both set the length, for example:

```
taosBenchmark -A INT,DOUBLE,NCHAR,BINARY(16)
```

If no tag type is set, the default is two tags of types INT and BINARY(16).
Note: In some shells, such as bash, "()" needs to be escaped, so the above command should be written as:

```
taosBenchmark -A INT,DOUBLE,NCHAR,BINARY\(16\)
```

- **-w/--binwidth <length\>**:
  specify the default length for nchar and binary types. The default value is 64.

- **-m/--table-prefix <tablePrefix\>** :
  The prefix of the sub-table name, the default value is "d".

- **-E/--escape-character** :
  Switch parameter specifying whether to use escape characters in the super table and sub-table names. Not used by default.

- **-C/--chinese** :
  Switch specifying whether to use Unicode Chinese characters in nchar and binary. Not used by default.

- **-N/--normal-table** :
  This parameter indicates that taosBenchmark will create only normal tables instead of super tables. The default value is false. It can be used only when the insert mode is taosc, stmt, or rest.

- **-M/--random** :
  This parameter indicates writing data with random values. The default is false. If users use this parameter, taosBenchmark will generate the random values. For tag/data columns of numeric type, the value is a random value within the range of values of that type. For NCHAR and BINARY type tag columns/data columns, the value is the random string within the specified length range.

- **-x/--aggr-func** :
  Switch parameter that indicates whether to query aggregation functions after insertion. The default value is false.

- **-y/--answer-yes** :
  Switch parameter that requires the user to confirm at the prompt to continue. The default value is false.

- **-O/--disorder <Percentage\>** :
  Specify the percentage probability of disordered data, with a value range of [0,50]. The default is 0, i.e., there is no disordered data.

- **-R/--disorder-range <timeRange\>** :
  Specify the timestamp range for the disordered data. The resulting disordered timestamp is the ordered timestamp minus a random value in this range. Valid only if the percentage of disordered data specified by `-O/--disorder` is greater than 0.

- **-F/--prepare_rand <Num\>** :
  Specify the number of unique values in the generated random data. A value of 1 means that all data are equal. The default value is 10000.

- **-a/--replica <replicaNum\>** :
  Specify the number of replicas when creating the database. The default value is 1.

- **-V/--version** :
  Show version information only. Users should not use it with other parameters.

- **-?/--help** :
  Show help information and exit. Users should not use it with other parameters.
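
The interaction between `-l/--columns` and `-b/--data-type` described above can be sketched as a small shell function. This is only an illustration of the documented rule, not taosBenchmark's own code, and `resolve_columns` is a hypothetical helper name. It reproduces the two examples given for `-l/--columns`.

```bash
#!/usr/bin/env bash
# Sketch of the documented -l/-b rule; resolve_columns is a hypothetical helper,
# not a taosBenchmark function.
# Usage: resolve_columns <colNum> <comma-separated types>
resolve_columns() {
  local col_num=$1 types_csv=$2
  local -a types
  IFS=',' read -r -a types <<< "$types_csv"

  # Pad with INT until the requested number of columns is reached.
  while [ "${#types[@]}" -lt "$col_num" ]; do
    types+=("INT")
  done

  # Join with commas and upper-case the type names.
  local joined
  joined=$(IFS=','; echo "${types[*]}")
  echo "$joined" | tr '[:lower:]' '[:upper:]'
}

resolve_columns 5 float,double               # prints FLOAT,DOUBLE,INT,INT,INT
resolve_columns 3 float,double,float,bigint  # prints FLOAT,DOUBLE,FLOAT,BIGINT
```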

## Configuration file parameters in detailed

### General configuration parameters

The parameters listed in this section apply to all function modes.

- **filetype**: The function to be tested. Valid values are `insert`, `query`, and `subscribe`, corresponding to the insert, query, and subscribe functions, respectively. Only one of these can be specified in each configuration file.

- **cfgdir**: Specify the directory containing the TDengine cluster configuration file. The default path is `/etc/taos`.

- **host**: Specify the FQDN of the TDengine server to connect. The default value is `localhost`.

- **port**: The port number of the TDengine server to connect to, the default value is `6030`.

- **user**: The user name of the TDengine server to connect to, the default is `root`.

- **password**: The password to connect to the TDengine server, the default value is `taosdata`.
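
As an illustration, the general parameters can be written to a JSON file from the shell with their documented defaults. This is a sketch only: the output path and the shell variable names are assumptions, and the resulting file contains just the part shared by all function modes, so the scenario-specific parameters still have to be added before taosBenchmark can use it.

```bash
#!/usr/bin/env bash
# Writes the general parameters with their documented defaults to a JSON file.
# The output path and the shell variable names are illustrative assumptions.
# The file is NOT a complete configuration: scenario-specific sections are missing.
out="${TMPDIR:-/tmp}/taosbenchmark_general.json"

filetype="insert"     # one of: insert, query, subscribe
cfgdir="/etc/taos"
host="localhost"
port=6030
user="root"
password="taosdata"

cat > "$out" <<EOF
{
  "filetype": "$filetype",
  "cfgdir": "$cfgdir",
  "host": "$host",
  "port": $port,
  "user": "$user",
  "password": "$password"
}
EOF

cat "$out"
```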

### Insert scenario configuration parameters

`filetype` must be set to `insert` in the insertion scenario. See [General configuration parameters](#general-configuration-parameters).

#### Database related configuration parameters

The parameters related to database creation are configured in `dbinfo` in the JSON configuration file, as follows. These parameters correspond to the database parameters of the `create database` statement in TDengine.

- **name**: specify the name of the database.

- **drop**: indicate whether to delete the database before inserting. The default is true.

- **replica**: specify the number of replicas when creating the database.

- **days**: specify the time span for storing data in a single data file. The default is 10.

- **cache**: specify the size of the cache blocks in MB. The default value is 16.

- **blocks**: specify the number of cache blocks in each vnode. The default is 6.

- **precision**: specify the database time precision. The default value is "ms".

- **keep**: specify the number of days to keep the data. The default value is 3650.

- **minRows**: specify the minimum number of records in the file block. The default value is 100.

- **maxRows**: specify the maximum number of records in the file block. The default value is 4096.

- **comp**: specify the file compression level. The default value is 2.

- **walLevel** : specify WAL level, default is 1.

- **cacheLast**: indicate whether to allow the last record of each table to be kept in memory. The default value is 0. The value can be 0, 1, 2, or 3.

- **quorum**: specify the number of writing acknowledgments in multi-replica mode. The default value is 1.

- **fsync**: specify the interval of fsync in ms when users set WAL to 2. The default value is 3000.

- **update** : indicate whether to support data update, default value is 0, optional values are 0, 1, 2.

#### Super table related configuration parameters

The parameters for creating super tables are configured in `super_tables` in the json configuration file, as shown below.

- **name**: Super table name, mandatory, no default value.

- **child_table_exists** : Whether the child table already exists. The value can be "yes" or "no". The default value is "no".

- **child_table_count** : The number of child tables, the default value is 10.

- **child_table_prefix** : The prefix of the child table name, mandatory configuration item, no default value.

- **escape_character**: Specify whether the super table and child table names contain escape characters. The value can be "yes" or "no". The default is "no".

- **auto_create_table**: Takes effect only when insert_mode is taosc, rest, or stmt and childtable_exists is "no". "yes" means taosBenchmark will automatically create non-existent tables when inserting data; "no" means that taosBenchmark will create all tables before inserting.

- **batch_create_tbl_num** : The number of tables per batch when creating sub-tables. The default is 10. Note: the actual number of tables per batch may differ from this value. If the SQL statement to be executed is longer than the maximum supported length, it is automatically truncated and executed, and creation then continues.

- **data_source**: Specify the source of the generated data. Valid values are "rand" and "sample". The default is data randomly generated by taosBenchmark. When "sample" is used, taosBenchmark uses the data in the file specified by the `sample_file` parameter.

- **insert_mode**: Insertion mode. Options are taosc, rest, stmt, sml, and sml-rest, corresponding to normal write, RESTful interface write, parameter binding interface write, schemaless interface write, and RESTful schemaless interface write (provided by taosAdapter). The default value is taosc.

- **non_stop_mode**: Specify whether to keep writing. If "yes", insert_rows is disabled, and writing does not stop until the program is stopped with Ctrl + C. The default value is "no", i.e., taosBenchmark stops writing after the specified number of rows has been written. Note: insert_rows must still be configured as a non-zero positive integer even though it has no effect in continuous write mode.

- **line_protocol**: Insert data using line protocol. Only works when insert_mode is sml or sml-rest. The value can be `line`, `telnet`, or `json`.

- **tcp_transfer**: Communication protocol in telnet mode. It only takes effect when insert_mode is sml-rest and line_protocol is telnet. If not configured, the default protocol is http.

- **insert_rows** : The number of inserted rows per child table, default is 0.

- **childtable_offset**: Effective only if childtable_exists is yes. Specifies the offset when fetching the list of child tables from the super table, i.e., which child table to start from.

- **childtable_limit**: Effective only when childtable_exists is yes, specifies the upper limit for fetching the list of child tables from the super table.

- **interlace_rows**: Enables interleaved insertion mode and specifies the number of rows of data to be inserted into each child table at a time. Interleaved insertion mode means inserting the number of rows specified by this parameter into each sub-table and repeating the process until all sub-tables have been inserted. The default value is 0, i.e., data is inserted into one sub-table before the next sub-table is inserted.

- **insert_interval** : Specifies the insertion interval in ms for interleaved insertion mode. The default value is 0. It only works if `interlace_rows` is greater than 0. After inserting interlaced rows for each child table, the data insertion thread waits for the interval specified by this value before proceeding to the next round of writes.

- **partial_col_num**: If this value is a positive number n, only the first n columns are written; if n is 0, all columns are written. It takes effect only when insert_mode is taosc or rest.

- **disorder_ratio** : Specifies the percentage probability of disordered (i.e., out-of-order) data, in the value range [0,50]. The default is 0, which means there is no disordered data.

- **disorder_range** : Specifies the timestamp fallback range for the disordered data. The disordered timestamp is generated by subtracting a random value in this range from the timestamp that would be used in the ordered case. Valid only if the percentage of disordered data specified by `disorder_ratio` is greater than 0.

- **timestamp_step**: The timestamp step for inserting data in each child table, in units consistent with the `precision` of the database. For example, if the `precision` is milliseconds, the timestamp step is in milliseconds. The default value is 1.

- **start_timestamp** : The timestamp start value of each sub-table, the default value is now.

- **sample_format**: The type of the sample data file; for now only "csv" is supported.

- **sample_file**: Specify a CSV format file as the data source. It only works when data_source is `sample`. If the number of rows in the CSV file is less than or equal to prepare_rand, taosBenchmark reads the CSV file data cyclically until the number of rows equals prepare_rand; otherwise, taosBenchmark reads only prepare_rand rows. The final number of rows of data generated is the smaller of the two.

- **use_sample_ts**: Effective only when data_source is `sample`. Indicates whether the first column of the CSV file specified by sample_file is a timestamp column. Default is no. If set to yes, the first column of the CSV file is used as the `timestamp`. Since timestamps within the same sub-table cannot repeat, the amount of data generated then depends on the number of rows in the CSV file, and insert_rows is ignored.

- **tags_file** : Only works when insert_mode is taosc or rest. The final tag values depend on childtable_count. If the CSV file contains fewer rows of tag data than the given number of child tables, taosBenchmark reads the CSV file data cyclically until the number of child tables specified by childtable_count is reached. Otherwise, taosBenchmark reads only childtable_count rows of tag data. The final number of child tables generated is the smaller of the two.
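
The effect of `disorder_ratio` and `disorder_range` on generated timestamps can be sketched as follows. This is an illustration of the documented rule rather than taosBenchmark's implementation: `next_timestamp` is a hypothetical helper, the start timestamp is arbitrary, and the choice of drawing the random offset from 1 to `disorder_range` inclusive is an assumption.

```bash
#!/usr/bin/env bash
# Sketch of the documented disorder rule; not taosBenchmark's actual code.
# Usage: next_timestamp <ordered_ts> <disorder_ratio> <disorder_range>
# With a probability of disorder_ratio percent, prints ordered_ts minus a random
# value in [1, disorder_range] (assumed bounds); otherwise prints ordered_ts.
next_timestamp() {
  local ordered_ts=$1 ratio=$2 range=$3
  if [ "$range" -gt 0 ] && [ $(( RANDOM % 100 )) -lt "$ratio" ]; then
    echo $(( ordered_ts - (RANDOM % range) - 1 ))
  else
    echo "$ordered_ts"
  fi
}

# Ten rows from an arbitrary start, timestamp_step 1, 50% disorder, range 1000.
start=1600000000000
for i in $(seq 0 9); do
  next_timestamp $(( start + i )) 50 1000
done
```

Each run prints a different sequence; roughly half of the ten values fall behind their ordered timestamp by at most 1000.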

#### Tag and Data Column Configuration Parameters

The configuration parameters for the data columns and tag columns of a super table are set in `columns` and `tag` in `super_tables`, respectively.

- **type**: Specify the column type. For optional values, please refer to the data types supported by TDengine.
  Note: The JSON data type is special and can only be used for tags. When the JSON type is used as a tag, it must be the only tag. In that case, `count` and `len` represent the number of key-value pairs within the JSON tag and the length of the value of each key-value pair, respectively. The value is a string by default.

- **len**: Specifies the length of this data type, valid for NCHAR, BINARY, and JSON data types. If this parameter is configured for other data types, a value of 0 means that the column is always written with a null value; if it is not 0, it is ignored.

- **count**: Specifies the number of consecutive occurrences of the column type, e.g., "count": 4096 generates 4096 columns of the specified type.

- **name** : The name of the column. If used together with count, e.g., "name": "current", "count": 3, the names of the 3 columns are current, current_2, and current_3.

- **min**: The minimum value of the column/tag of the data type.

- **max**: The maximum value of the column/tag of the data type.

- **values**: The set of values for the nchar/binary column/tag, from which values are chosen randomly.
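
The naming rule for `name` combined with `count` can be illustrated with a short shell function. `expand_column_names` is a hypothetical helper written for this sketch, which simply follows the `current`, `current_2`, `current_3` pattern described above.

```bash
#!/usr/bin/env bash
# Sketch of the documented naming rule for "name" combined with "count";
# expand_column_names is a hypothetical helper, not a taosBenchmark function.
# Usage: expand_column_names <name> <count>
expand_column_names() {
  local name=$1 count=$2 i
  local -a names=("$name")          # the first column keeps the plain name
  for (( i = 2; i <= count; i++ )); do
    names+=("${name}_${i}")         # later columns get a numeric suffix
  done
  echo "${names[*]}"
}

expand_column_names current 3   # prints: current current_2 current_3
```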

#### Insertion behavior configuration parameters

- **thread_count**: specify the number of threads to insert data. Default is 8.

- **create_table_thread_count** : The number of threads used to create tables. The default is 8.

- **connection_pool_size** : The number of pre-established connections to the TDengine server. If not configured, it is the same as the number of threads specified.

- **result_file** : The path to the result output file. The default value is `./output.txt`.

- **confirm_parameter_prompt**: The switch parameter requires the user to confirm after the prompt to continue. The default value is false.

- **interlace_rows**: Enables interleaved insertion mode and specifies the number of rows of data to be inserted into each child table at a time. Interleaved insertion mode means inserting the number of rows specified by this parameter into each sub-table and repeating the process until all sub-tables are inserted. The default value is 0, which means that data will be inserted into the following child table only after data is inserted into one child table.
  This parameter can also be configured in `super_tables`, and if so, the configuration in `super_tables` takes precedence and overrides the global setting.

- **insert_interval** :
  Specifies the insertion interval in ms for interleaved insertion mode. The default value is 0. It only works if `interlace_rows` is greater than 0. After inserting interlaced rows for each child table, the data insertion thread waits for the interval specified by this value before proceeding to the next round of writes.
  This parameter can also be configured in `super_tables`, and if so, the configuration in `super_tables` takes precedence and overrides the global setting.

- **num_of_records_per_req** :
  The number of rows of data to be written per request to TDengine, the default value is 30000. When it is set too large, the TDengine client driver will return the corresponding error message, so you need to lower the setting of this parameter to meet the writing requirements.

- **prepare_rand**: The number of unique values in the generated random data. A value of 1 means that all data are the same. The default value is 10000.
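
The write order produced by interleaved insertion mode can be visualized with a small shell sketch. It only prints the order in which a single thread would visit the sub-tables and does not connect to TDengine; `interlace_order` is a hypothetical helper, and the `d` table prefix is the command-line default used for illustration.

```bash
#!/usr/bin/env bash
# Sketch of the write order in interleaved insertion mode for one thread.
# It only prints the order; interlace_order is a hypothetical helper.
# Usage: interlace_order <tables> <rows_per_table> <interlace_rows>
interlace_order() {
  local tables=$1 rows=$2 interlace=$3
  local t start end
  # interlace_rows = 0: finish one sub-table before moving to the next.
  if [ "$interlace" -eq 0 ]; then
    interlace=$rows
  fi
  for (( start = 0; start < rows; start += interlace )); do
    end=$(( start + interlace - 1 ))
    if [ "$end" -ge "$rows" ]; then
      end=$(( rows - 1 ))
    fi
    for (( t = 0; t < tables; t++ )); do
      echo "d${t}: rows ${start}-${end}"
    done
    # A real run would wait insert_interval ms here before the next round.
  done
}

interlace_order 2 4 2
```

With two sub-tables, four rows each, and `interlace_rows` set to 2, the sketch prints `d0: rows 0-1`, `d1: rows 0-1`, `d0: rows 2-3`, `d1: rows 2-3`, one per line.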

### Query scenario configuration parameters

`filetype` must be set to `query` in the query scenario. See [General configuration parameters](#general-configuration-parameters) for details of this parameter and other general parameters.

#### Configuration parameters for executing the specified query statement

The configuration parameters for querying the sub-tables or the normal tables are set in `specified_table_query`.

- **query_interval** : The query interval in seconds, the default value is 0.

- **threads**: The number of threads to execute the query SQL, the default value is 1.

- **sqls**:
  - **sql**: The SQL command to be executed.
  - **result**: The file to save the query result. If it is not specified, taosBenchmark will not save the result.

#### Configuration parameters of query super table

The configuration parameters of the super table query are set in `super_table_query`.

- **stblname**: Specify the name of the super table to be queried, required.

- **query_interval** : The query interval in seconds, the default value is 0.

- **threads**: The number of threads to execute the query SQL, the default value is 1.

- **sqls** : The default value is 1.
  - **sql**: The SQL command to be executed. For a super table query, keep "xxxx" in the SQL command. The program automatically replaces it with all the sub-table names of the super table.
  - **result**: The file to save the query result. If not specified, taosBenchmark will not save the result.
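
The "xxxx" placeholder substitution can be pictured with a few lines of shell. This is a sketch under the assumption that the statement is expanded once per sub-table; the sub-table names and the SQL text are illustrative, and `expand_sql` is a hypothetical helper rather than part of taosBenchmark.

```bash
#!/usr/bin/env bash
# Sketch of the "xxxx" placeholder substitution for super table queries.
# The sub-table names and the SQL text are illustrative assumptions.
sql_template='select count(*) from xxxx'
sub_tables=(d0 d1 d2)

# Usage: expand_sql <template> <sub-table>...
expand_sql() {
  local template=$1 table
  shift
  for table in "$@"; do
    # Replace every occurrence of the placeholder with the sub-table name.
    echo "${template//xxxx/$table}"
  done
}

expand_sql "$sql_template" "${sub_tables[@]}"
```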

### Subscription scenario configuration parameters

`filetype` must be set to `subscribe` in the subscription scenario. See [General configuration parameters](#general-configuration-parameters) for details of this and other general parameters.

#### Configuration parameters for executing the specified subscription statement

The configuration parameters for subscribing to a sub-table or a normal table are set in `specified_table_query`.

- **threads**: The number of threads to execute SQL, default is 1.

- **interval**: The time interval to execute the subscription, in seconds, default is 0.

- **restart** : "yes" means start a new subscription, "no" means continue the previous subscription, the default value is "no".

- **keepProgress**: "yes" means keep the progress of the subscription, "no" means don't keep it, and the default value is "no".

- **resubAfterConsume**: "yes" means cancel the previous subscription and then subscribe again, "no" means continue the previous subscription, and the default value is "no".

- **sqls** : The default value is "no".
  - **sql** : The SQL command to be executed, required.
  - **result** : The file to save the query result. If not specified, the result is not saved.

#### Configuration parameters for subscribing to super tables

The configuration parameters for subscribing to a super table are set in `super_table_query`.

- **stblname**: The name of the super table to subscribe.

- **threads**: The number of threads to execute SQL, default is 1.

- **interval**: The time interval to execute the subscription, in seconds, default is 0.

- **restart** : "yes" means start a new subscription, "no" means continue the previous subscription, the default value is "no".

- **keepProgress**: "yes" means keep the progress of the subscription, "no" means don't keep it, and the default value is "no".

- **resubAfterConsume**: "yes" means cancel the previous subscription and then subscribe again, "no" means continue the previous subscription, and the default value is "no".

- **sqls** : The default value is "no".
  - **sql**: The SQL command to be executed, required. For a super table query, keep "xxxx" in the SQL command; the program automatically replaces it with all the sub-table names of the super table.
  - **result**: The file to save the query result. If not specified, the result is not saved.