Since TDengine was open sourced in July 2019, it has gained a lot of popularity among time-series database developers with its innovative data modeling design, simple installation method, easy programming interface, and powerful data insertion and query performance. The insertion and querying performance is often astonishing to users who are new to TDengine. To help users experience TDengine's high performance and functionality in the shortest time, we developed an application called taosdemo for insertion and querying performance testing of TDengine. With taosdemo, users can easily simulate a scenario in which a large number of devices generate a very large amount of data, and can easily control the number of columns, data types, disorder ratio, and number of concurrent threads through taosdemo's customizable parameters.
Running taosdemo is very simple. Just download the TDengine installation package (https://www.taosdata.com/cn/all-downloads/) or compile the TDengine code yourself (https://github.com/taosdata/TDengine). taosdemo can then be found and run in the installation directory or in the build output directory.
...
@@ -221,7 +221,7 @@ To reach TDengine performance limits, data insertion can be executed by using mu
```
-t, --tables=NUMBER The number of tables. Default is 10000.
-n, --records=NUMBER The number of records per table. Default is 10000.
-M, --random The values of generated records are totally random. The default is to simulate a power equipment scenario.
```
As mentioned earlier, taosdemo creates 10,000 tables by default, and 10,000 records are written into each table. The number of tables and the number of records per table can be set with -t and -n. The data generated by default, without parameters, simulates a real scenario: current and voltage phase values with a certain jitter, which shows TDengine's efficient data compression ability more realistically. If you need completely random data instead, pass the -M parameter.
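For example, the following invocation is a sketch (the table and record counts are purely illustrative) that creates 1,000 tables and writes 100,000 completely random records into each, using the -t, -n, and -M options described above:
```
taosdemo -t 1000 -n 100000 -M
```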
...
@@ -193,7 +193,7 @@ A complete TDengine system runs on one or more physical nodes. Logically, it inc
**Redirection**: Whether it is a dnode or TAOSC, the connection to the mnode must be initiated first. However, the mnode is automatically created and maintained by the system, so the user does not know which dnode is running the mnode. TDengine only requires a connection to any working dnode in the system. Because every running dnode maintains the currently running mnode EP List, a dnode that receives a connection request from a newly started dnode or TAOSC, and is not an mnode itself, replies with the mnode EP List. After receiving this list, TAOSC or the newly started dnode tries to establish the connection again. When the mnode EP List changes, each data node quickly obtains the latest list through messaging interaction among nodes and notifies TAOSC.
### A Typical Data Writing Process
To explain the relationship between vnode, mnode, TAOSC and application and their respective roles, the following is an analysis of a typical data writing process.
...
@@ -244,7 +244,7 @@ The meta data of each table (including schema, tags, etc.) is also stored in vno
### Data Partitioning
In addition to vnode sharding, TDengine partitions the time-series data by time range. Each data file contains time-series data for only one time range, whose length is determined by the DB configuration parameter `“days”`. Partitioning by time range also makes it convenient to implement the data retention policy efficiently: as soon as a data file exceeds the specified number of days (system configuration parameter `“keep”`), it is automatically deleted. Moreover, different time ranges can be stored in different paths and on different storage media, which facilitates tiered storage: cold and hot data can be kept on different media to reduce the storage cost.
In general, **TDengine splits big data by vnode and time range in two dimensions** to manage the data efficiently with horizontal scalability.
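As a sketch of how the two parameters are set, both can be specified when creating a database (the database name and values here are illustrative):
```mysql
CREATE DATABASE power DAYS 10 KEEP 3650;
```
With this setting, each data file covers 10 days of data, and data older than 3,650 days is deleted automatically.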
...
@@ -182,7 +182,7 @@ In the output plugins section, add the [[outputs.http]] configuration:
In the agent section:
- hostname: The machine name used to distinguish different collection devices; it must be unique.
- metric_batch_size: 100, the maximum number of records per batch that Telegraf is allowed to write. Increasing this number reduces how frequently Telegraf sends requests. A sketch of the section follows this list.
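A minimal sketch of the corresponding agent section (the hostname value is an illustrative placeholder):
```
[agent]
  hostname = "devops-agent-01"   # must be unique among collection devices
  metric_batch_size = 100        # max records per batch written by Telegraf
```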
For information on how to use Telegraf to collect data, and for more about Telegraf in general, please refer to the official Telegraf [documentation](https://docs.influxdata.com/telegraf/v1.11/).
For example, in the TAOS shell, the records with voltage > 215 are queried from table d1001, sorted in descending order by timestamp, and only two records are output.
```mysql
taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
```
Its second parameter decides whether to keep the subscription's progress information on the client. If this parameter is **false** (zero), the subscription can only be restarted from scratch the next time `taos_subscribe` is called, no matter what its `restart` parameter is. In addition, progress information is saved in the directory {DataDir}/subscribe/. Each subscription has a file with the same name as its `topic`; deleting that file also causes the corresponding subscription to start anew the next time it is created.
After introducing the code, let's take a look at the actual running effect. For example:
Similar to a standard table creation SQL statement, but you need to specify the name and type of the TAGS fields.
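As a sketch, a super table for the power-meter scenario simulated by taosdemo might be created as follows (the table, column, and tag names are illustrative):
```mysql
CREATE TABLE meters (ts TIMESTAMP, current FLOAT, voltage INT, phase FLOAT) TAGS (location BINARY(64), groupId INT);
```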
Note:
...
@@ -673,7 +673,7 @@ Query OK, 1 row(s) in set (0.001091s)
```mysql
SELECT * FROM tb1 WHERE ts >= NOW - 1h;
```
- Look up table tb1 from 2018-06-01 08:00:00.000 to 2018-06-02 08:00:00.000 for records whose col3 string ends in 'nny', with the results in descending order of timestamp:
```mysql
SELECT * FROM tb1 WHERE ts > '2018-06-01 08:00:00.000' AND ts <= '2018-06-02 08:00:00.000' AND col3 LIKE '%nny' ORDER BY ts DESC;
```
...
@@ -782,7 +782,7 @@ TDengine supports aggregations over data, they are listed below:
Function: return the sum of a column in a table/STable.
Return Data Type: long integer INT64 or DOUBLE.
Applicable Fields: All types except timestamp, binary, nchar, bool.
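For example, a sketch assuming the hypothetical meters super table above with a current column:
```mysql
SELECT SUM(current) FROM meters;
```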
...
@@ -1196,7 +1196,7 @@ SELECT function_list FROM stb_name
- FILL statement specifies a filling mode for when data is missing in a certain interval. Applicable filling modes include the following (see the sketch after this list):
  1. Do not fill: NONE (default filling mode).
  2. VALUE filling: Fixed value filling, where the filled value needs to be specified. For example: fill (VALUE, 1.23).
  3. NULL filling: Fill the data with NULL. For example: fill (NULL).
  4. PREV filling: Filling data with the previous non-NULL value. For example: fill (PREV).
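As a sketch, assuming a hypothetical meters super table with a voltage column, the following computes 10-minute averages over one hour and fills empty intervals with the previous non-NULL value:
```mysql
SELECT AVG(voltage) FROM meters WHERE ts >= '2018-06-01 08:00:00.000' AND ts <= '2018-06-01 09:00:00.000' INTERVAL(10m) FILL(PREV);
```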
...
@@ -28,7 +28,7 @@ The overall system architecture of a typical DevOps application scenario is show
In this application scenario, there are Agent tools deployed in the application environment to collect machine metrics, network metrics, and application metrics; data collectors to aggregate the information collected by the agents; systems for persistent data storage and management; and tools for visualizing monitoring data (e.g., Grafana).
Among them, the Agents deployed on application nodes are responsible for providing operational metrics from different sources to collectd/StatsD; collectd/StatsD is responsible for pushing the aggregated data to the OpenTSDB cluster system, and Grafana is then used to visualize the data.
### 2. Migration Service
...
@@ -127,11 +127,11 @@ On the one hand, TDengine requires a strict schema definition for its incoming d
Now let's assume a DevOps scenario where we use collectd to collect base metrics of devices, including memory, swap, disk, etc. The schema in OpenTSDB is as follows:
| No. | metric | value | type | tag1 | tag2 | tag3 | tag4 | tag5 |