@@ -162,7 +162,7 @@ To better understand the data model using metri, tags, super table and subtable,
## Database
A database is a collection of tables. TDengine allows a running instance to have multiple databases, and each database can be configured with different storage policies. The [characteristics of time-series data](https://www.taosdata.com/blog/2019/07/09/86.html) from different data collection points may be different. Characteristics include collection frequency, retention policy and others which determine how you create and configure the database. For e.g. days to keep, number of replicas, data block size, whether data updates are allowed and other configurable parameters would be determined by the characteristics of your data and your business requirements. In order for TDengine to work with maximum efficiency in various scenarios, TDengine recommends that STables with different data characteristics be created in different databases.
A database is a collection of tables. TDengine allows a running instance to have multiple databases, and each database can be configured with different storage policies. The [characteristics of time-series data](https://tdengine.com/tsdb/characteristics-of-time-series-data/) from different data collection points may be different. Characteristics include collection frequency, retention policy and others which determine how you create and configure the database. For e.g. days to keep, number of replicas, data block size, whether data updates are allowed and other configurable parameters would be determined by the characteristics of your data and your business requirements. In order for TDengine to work with maximum efficiency in various scenarios, TDengine recommends that STables with different data characteristics be created in different databases.
In a database, there can be one or more STables, but a STable belongs to only one database. All tables owned by a STable are stored in only one database.
@@ -279,6 +279,6 @@ Prior to establishing connection, please make sure TDengine is already running a
</Tabs>
:::tip
If the connection fails, in most cases it's caused by improper configuration for FQDN or firewall. Please refer to the section "Unable to establish connection" in [FAQ](https://docs.taosdata.com/train-faq/faq).
If the connection fails, in most cases it's caused by improper configuration for FQDN or firewall. Please refer to the section "Unable to establish connection" in [FAQ](https://docs.tdengine.com/train-faq/faq).
@@ -39,18 +39,18 @@ To get the hostname on any host, the command `hostname -f` can be executed.
On the physical machine running the application, ping the dnode that is running taosd. If the dnode is not accessible, the application cannot connect to taosd. In this case, verify the DNS and hosts settings on the physical node running the application.
The end point of each dnode is the output hostname and port, such as h1.taosdata.com:6030.
The end point of each dnode is the output hostname and port, such as h1.tdengine.com:6030.
### Step 5
Modify the TDengine configuration file `/etc/taos/taos.cfg` on each node. Assuming the first dnode of TDengine cluster is "h1.taosdata.com:6030", its `taos.cfg` is configured as following.
Modify the TDengine configuration file `/etc/taos/taos.cfg` on each node. Assuming the first dnode of TDengine cluster is "h1.tdengine.com:6030", its `taos.cfg` is configured as following.
```c
// firstEp is the end point to connect to when any dnode starts
firstEph1.taosdata.com:6030
firstEph1.tdengine.com:6030
// must be configured to the FQDN of the host where the dnode is launched
fqdnh1.taosdata.com
fqdnh1.tdengine.com
// the port used by the dnode, default is 6030
serverPort6030
...
...
@@ -76,13 +76,13 @@ The first dnode can be started following the instructions in [Get Started](/get-
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
This adds the end point of the new dnode (from Step 4) into the end point list of the cluster. In the command "fqdn:port" should be quoted using double quotes. Change `"h2.taos.com:6030"` to the end point of your new dnode.
Then on the first dnode h1.taosdata.com, execute `show dnodes` in `taos`
Then on the first dnode h1.tdengine.com, execute `show dnodes` in `taos`
This section explains the syntax of SQL to perform operations on databases, tables and STables, insert data, select data and use functions. We also provide some tips that can be used in TDengine SQL. If you have previous experience with SQL this section will be fairly easy to understand. If you do not have previous experience with SQL, you'll come to appreciate the simplicity and power of SQL. TDengine SQL has been enhanced in version 3.0, and the query engine has been rearchitected. For information about how TDengine SQL has changed, see [Changes in TDengine 3.0](../taos-sql/changes).
...
...
@@ -15,7 +15,7 @@ Syntax Specifications used in this chapter:
- | means one of a few options, excluding | itself.
- … means the item prior to it can be repeated multiple times.
To better demonstrate the syntax, usage and rules of TAOS SQL, hereinafter it's assumed that there is a data set of data from electric meters. Each meter collects 3 data measurements: current, voltage, phase. The data model is shown below:
To better demonstrate the syntax, usage and rules of TDengine SQL, hereinafter it's assumed that there is a data set of data from electric meters. Each meter collects 3 data measurements: current, voltage, phase. The data model is shown below:
@@ -18,12 +18,12 @@ If the TDengine server is already installed, it can be verified as follows:
The following example is in an Ubuntu environment and uses the `curl` tool to verify that the REST interface is working. Note that the `curl` tool may need to be installed in your environment.
The following example lists all databases on the host h1.taosdata.com. To use it in your environment, replace `h1.taosdata.com` and `6041` (the default port) with the actual running TDengine service FQDN and port number.
The following example lists all databases on the host h1.tdengine.com. To use it in your environment, replace `h1.tdengine.com` and `6041` (the default port) with the actual running TDengine service FQDN and port number.
@@ -133,8 +133,6 @@ The configuration parameters in the URL are as follows:
- batchfetch: true: pulls result sets in batches when executing queries; false: pulls result sets row by row. The default value is true. Enabling batch pulling and obtaining a batch of data can improve query performance when the query data volume is large.
- batchErrorIgnore:true: When executing statement executeBatch, if there is a SQL execution failure in the middle, the following SQL will continue to be executed. false: No more statements after the failed SQL are executed. The default value is: false.
For more information about JDBC native connections, see [Video Tutorial](https://www.taosdata.com/blog/2020/11/11/1955.html).
**Connect using the TDengine client-driven configuration file **
When you use a JDBC native connection to connect to a TDengine cluster, you can use the TDengine client driver configuration file to specify parameters such as `firstEp` and `secondEp` of the cluster in the configuration file as below:
`Taos` is an ADO.NET connector for TDengine, supporting Linux and Windows platforms. Community contributor `Maikebing@@maikebing contributes the connector`. Please refer to:
Since the TDengine client driver is written in C, using the native connection requires loading the client driver shared library file, which is usually included in the TDengine installer. You can install either standard TDengine server installation package or [TDengine client installation package](/get-started/). For Windows development, you need to install the corresponding [Windows client](https://www.taosdata.com/cn/all-downloads/#TDengine-Windows-Client) for TDengine.
Since the TDengine client driver is written in C, using the native connection requires loading the client driver shared library file, which is usually included in the TDengine installer. You can install either standard TDengine server installation package or [TDengine client installation package](/get-started/). For Windows development, you need to install the corresponding Windows client, please refer to [Install TDengine](../../get-started/package).
- libtaos.so: After successful installation of TDengine on a Linux system, the dependent Linux version of the client driver `libtaos.so` file will be automatically linked to `/usr/lib/libtaos.so`, which is included in the Linux scannable path and does not need to be specified separately.
- taos.dll: After installing the client on Windows, the dependent Windows version of the client driver taos.dll file will be automatically copied to the system default search path C:/Windows/System32, again without the need to specify it separately.
@@ -263,7 +263,7 @@ Once the import is complete, the full page view of TDinsight is shown below.
## TDinsight dashboard details
The TDinsight dashboard is designed to provide the usage and status of TDengine-related resources [dnodes, mnodes, vnodes](https://www.taosdata.com/cn/documentation/architecture#cluster) or databases.
The TDinsight dashboard is designed to provide the usage and status of TDengine-related resources [dnodes, mnodes, vnodes](../../taos-sql/node/) or databases.
In IoT applications, data is collected for many purposes such as intelligent control, business analysis, device monitoring and so on. Due to changes in business or functional requirements or changes in device hardware, the application logic and even the data collected may change. Schemaless writing automatically creates storage structures for your data as it is being written to TDengine, so that you do not need to create supertables in advance. When necessary, schemaless writing
...
...
@@ -25,7 +25,7 @@ where:
- measurement will be used as the data table name. It will be separated from tag_set by a comma.
-`tag_set` will be used as tags, with format like `<tag_key>=<tag_value>,<tag_key>=<tag_value>` Enter a space between `tag_set` and `field_set`.
-`field_set`will be used as data columns, with format like `<field_key>=<field_value>,<field_key>=<field_value>` Enter a space between `field_set` and `timestamp`.
-`timestamp`is the primary key timestamp corresponding to this row of data
-`timestamp` is the primary key timestamp corresponding to this row of data
All data in tag_set is automatically converted to the NCHAR data type and does not require double quotes (").
...
...
@@ -36,14 +36,14 @@ In the schemaless writing data line protocol, each data item in the field_set ne
- Spaces, equal signs (=), commas (,), and double quotes (") need to be escaped with a backslash (\\) in front. (All refer to the ASCII character)
- Numeric types will be distinguished from data types by the suffix.
-`t`, `T`, `true`, `True`, `TRUE`, `f`, `F`, `false`, and `False` will be handled directly as BOOL types.
...
...
@@ -61,7 +61,7 @@ Note that if the wrong case is used when describing the data type suffix, or if
Schemaless writes process row data according to the following principles.
1. You can use the following rules to generate the subtable names: first, combine the measurement name and the key and value of the label into the next string:
1. You can use the following rules to generate the subtable names: first, combine the measurement name and the key and value of the label into the next string:
@@ -82,7 +82,7 @@ You can configure smlChildTableName to specify table names, for example, `smlChi
:::tip
All processing logic of schemaless will still follow TDengine's underlying restrictions on data structures, such as the total length of each row of data cannot exceed
16KB. See [TAOS SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
16KB. See [TDengine SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
:::
...
...
@@ -90,23 +90,23 @@ All processing logic of schemaless will still follow TDengine's underlying restr
Three specified modes are supported in the schemaless writing process, as follows:
In OpenTSDB file and JSON protocol modes, the precision of the timestamp is determined from its length in the standard OpenTSDB manner. User input is ignored.
Note: The table schema is based on the blog [(In Chinese) Data Transfer, Storage, Presentation, EMQX + TDengine Build MQTT IoT Data Visualization Platform](https://www.taosdata.com/blog/2020/08/04/1722.html) as an example. Subsequent operations are carried out with this blog scenario too. Please modify it according to your actual application scenario.
## Configuring EMQX Rules
Since the configuration interface of EMQX differs from version to version, here is v4.4.5 as an example. For other versions, please refer to the corresponding official documentation.
...
...
@@ -137,5 +135,5 @@ Use the TDengine CLI program to log in and query the appropriate databases and t
![TDengine Database EMQX result in taos](./emqx/check-result-in-taos.webp)
Please refer to the [TDengine official documentation](https://docs.taosdata.com/) for more details on how to use TDengine.
Please refer to the [TDengine official documentation](https://docs.tdengine.com/) for more details on how to use TDengine.
EMQX Please refer to the [EMQX official documentation](https://www.emqx.io/docs/en/v4.4/rule/rule-engine.html) for details on how to use EMQX.
A complete TDengine system runs on one or more physical nodes. Logically, it includes data node (dnode), TDengine client driver (TAOSC) and application (app). There are one or more data nodes in the system, which form a cluster. The application interacts with the TDengine cluster through TAOSC's API. The following is a brief introduction to each logical unit.
...
...
@@ -38,15 +39,16 @@ A complete TDengine system runs on one or more physical nodes. Logically, it inc
**Cluster external connection**: TDengine cluster can accommodate a single, multiple or even thousands of data nodes. The application only needs to initiate a connection to any data node in the cluster. The network parameter required for connection is the End Point (FQDN plus configured port number) of a data node. When starting the application taos through CLI, the FQDN of the data node can be specified through the option `-h`, and the configured port number can be specified through `-p`. If the port is not configured, the system configuration parameter “serverPort” of TDengine will be adopted.
**Inter-cluster communication**: Data nodes connect with each other through TCP/UDP. When a data node starts, it will obtain the EP information of the dnode where the mnode is located, and then establish a connection with the mnode in the system to exchange information. There are three steps to obtain EP information of the mnode:
**Inter-cluster communication**: Data nodes connect with each other through TCP/UDP. When a data node starts, it will obtain the EP information of the dnode where the mnode is located, and then establish a connection with the mnode in the system to exchange information. There are three steps to obtain EP information of the mnode:
1. Check whether the mnodeEpList file exists, if it does not exist or cannot be opened normally to obtain EP information of the mnode, skip to the second step;
1. Check whether the mnodeEpList file exists, if it does not exist or cannot be opened normally to obtain EP information of the mnode, skip to the second step;
2. Check the system configuration file taos.cfg to obtain node configuration parameters “firstEp” and “secondEp” (the node specified by these two parameters can be a normal node without mnode, in this case, the node will try to redirect to the mnode node when connected). If these two configuration parameters do not exist or do not exist in taos.cfg, or are invalid, skip to the third step;
3. Set your own EP as a mnode EP and run it independently. After obtaining the mnode EP list, the data node initiates the connection. It will successfully join the working cluster after connection. If not successful, it will try the next item in the mnode EP list. If all attempts are made, but the connection still fails, sleep for a few seconds before trying again.
**The choice of MNODE**: TDengine logically has a management node, but there is no separate execution code. The server-side only has one set of execution code, taosd. So which data node will be the management node? This is determined automatically by the system without any manual intervention. The principle is as follows: when a data node starts, it will check its End Point and compare it with the obtained mnode EP List. If its EP exists in it, the data node shall start the mnode module and become a mnode. If your own EP is not in the mnode EP List, the mnode module will not start. During the system operation, due to load balancing, downtime and other reasons, mnode may migrate to the new dnode, totally transparently and without manual intervention. The modification of configuration parameters is the decision made by mnode itself according to resources usage.
**Add new data nodes:** After the system has a data node, it has become a working system. There are two steps to add a new node into the cluster.
**Add new data nodes:** After the system has a data node, it has become a working system. There are two steps to add a new node into the cluster.
- Step1: Connect to the existing working data node using TDengine CLI, and then add the End Point of the new data node with the command "create dnode"
- Step 2: In the system configuration parameter file taos.cfg of the new data node, set the “firstEp” and “secondEp” parameters to the EP of any two data nodes in the existing cluster. Please refer to the user tutorial for detailed steps. In this way, the cluster will be established step by step.
...
...
@@ -57,6 +59,7 @@ A complete TDengine system runs on one or more physical nodes. Logically, it inc
To explain the relationship between vnode, mnode, TAOSC and application and their respective roles, the following is an analysis of a typical data writing process.
![typical process of TDengine Database](message.webp)
<center> Figure 2: Typical process of TDengine </center>
1. Application initiates a request to insert data through JDBC, ODBC, or other APIs.
...
...
@@ -121,16 +124,17 @@ The load balancing process does not require any manual intervention, and it is t
If a database has N replicas, a virtual node group has N virtual nodes. But only one is the Leader and all others are slaves. When the application writes a new record to system, only the Leader vnode can accept the writing request. If a follower vnode receives a writing request, the system will notifies TAOSC to redirect.
<center> Figure 3: TDengine Leader writing process </center>
1. Leader vnode receives the application data insertion request, verifies, and moves to next step;
2. If the system configuration parameter `“walLevel”` is greater than 0, vnode will write the original request packet into database log file WAL. If walLevel is set to 2 and fsync is set to 0, TDengine will make WAL data written immediately to ensure that even system goes down, all data can be recovered from database log file;
3.If there are multiple replicas, vnode will forward data packet to follower vnodes in the same virtual node group, and the forwarded packet has a version number with data;
3. If there are multiple replicas, vnode will forward data packet to follower vnodes in the same virtual node group, and the forwarded packet has a version number with data;
4. Write into memory and add the record to “skip list”;
5. Leader vnode returns a confirmation message to the application, indicating a successful write.
6. If any of Step 2, 3 or 4 fails, the error will directly return to the application.
...
...
@@ -140,6 +144,7 @@ Leader Vnode uses a writing process as follows:
For a follower vnode, the write process as follows:
<center> Figure 4: TDengine Follower Writing Process </center>
1. Follower vnode receives a data insertion request forwarded by Leader vnode;
...
...
@@ -212,6 +217,7 @@ When data is written to disk, the system decideswhether to compress the data bas
By default, TDengine saves all data in /var/lib/taos directory, and the data files of each vnode are saved in a different directory under this directory. In order to expand the storage space, minimize the bottleneck of file reading and improve the data throughput rate, TDengine can configure the system parameter “dataDir” to allow multiple mounted hard disks to be used by system at the same time. In addition, TDengine also provides the function of tiered data storage, i.e. storage on different storage media according to the time stamps of data files. For example, the latest data is stored on SSD, the data older than a week is stored on local hard disk, and data older than four weeks is stored on network storage device. This reduces storage costs and ensures efficient data access. The movement of data on different storage media is automatically done by the system and is completely transparent to applications. Tiered storage of data is also configured through the system parameter “dataDir”.
dataDir format is as follows:
```
dataDir data_path [tier_level]
```
...
...
@@ -270,6 +276,7 @@ For the data collected by device D1001, the number of records per hour is counte
TDengine creates a separate table for each data collection point, but in practical applications, it is often necessary to aggregate data from different data collection points. In order to perform aggregation operations efficiently, TDengine introduces the concept of STable (super table). STable is used to represent a specific type of data collection point. It is a table set containing multiple tables. The schema of each table in the set is the same, but each table has its own static tag. There can be multiple tags which can be added, deleted and modified at any time. Applications can aggregate or statistically operate on all or a subset of tables under a STABLE by specifying tag filters. This greatly simplifies the development of applications. The process is shown in the following figure:
![TDengine Database Diagram of multi-table aggregation query](multi_tables.webp)
<center> Figure 5: Diagram of multi-table aggregation query </center>
1. Application sends a query condition to system;
...
...
@@ -279,9 +286,8 @@ TDengine creates a separate table for each data collection point, but in practic
5. Each vnode first finds the set of tables within its own node that meet the tag filters from memory, then scans the stored time-series data, completes corresponding aggregation calculations, and returns result to TAOSC;
6. TAOSC finally aggregates the results returned by multiple data nodes and send them back to application.
Since TDengine stores tag data and time-series data separately in vnode, by filtering tag data in memory, the set of tables that need to participate in aggregation operation is first found, which reduces the volume of data to be scanned and improves aggregation speed. At the same time, because the data is distributed in multiple vnodes/dnodes, the aggregation operation is carried out concurrently in multiple vnodes, which further improves the aggregation speed. Aggregation functions for ordinary tables and most operations are applicable to STables. The syntax is exactly the same. Please see TAOS SQL for details.
Since TDengine stores tag data and time-series data separately in vnode, by filtering tag data in memory, the set of tables that need to participate in aggregation operation is first found, which reduces the volume of data to be scanned and improves aggregation speed. At the same time, because the data is distributed in multiple vnodes/dnodes, the aggregation operation is carried out concurrently in multiple vnodes, which further improves the aggregation speed. Aggregation functions for ordinary tables and most operations are applicable to STables. The syntax is exactly the same. Please see TDengine SQL for details.
### Precomputation
In order to effectively improve the performance of query processing, based-on the unchangeable feature of IoT data, statistical information of data stored in data block is recorded in the head of data block, including max value, min value, and sum. We call it a precomputing unit. If the query processing involves all the data of a whole data block, the pre-calculated results are directly used, and no need to read the data block contents at all. Since the amount of pre-calculated data is much smaller than the actual size of data block stored on disk, for query processing with disk IO as bottleneck, the use of pre-calculated results can greatly reduce the pressure of reading IO and accelerate the query process. The precomputation mechanism is similar to the BRIN (Block Range Index) of PostgreSQL.
@@ -41,7 +41,7 @@ The agents deployed in the application nodes are responsible for providing opera
-**TDengine installation and deployment**
First of all, please install TDengine. Download the latest stable version of TDengine from the official website and install it. For help with using various installation packages, please refer to the blog ["Installation and Uninstallation of TDengine Multiple Installation Packages"](https://www.taosdata.com/blog/2019/08/09/566.html).
First of all, please install TDengine. Download the latest stable version of TDengine from the official website and install it. For help with using various installation packages, please refer to [Install TDengine](../../get-started/package)
Note that once the installation is complete, do not start the `taosd` service before properly configuring the parameters.
...
...
@@ -51,7 +51,7 @@ TDengine version 2.4 and later version includes `taosAdapter`. taosAdapter is a
Users can flexibly deploy taosAdapter instances, based on their requirements, to improve data writing throughput and provide guarantees for data writes in different application scenarios.
Through taosAdapter, users can directly write the data collected by `collectd` or `StatsD` to TDengine to achieve easy, convenient and seamless migration in application scenarios. taosAdapter also supports Telegraf, Icinga, TCollector, and node_exporter data. For more details, please refer to [taosAdapter](/reference/taosadapter/).
Through taosAdapter, users can directly write the data collected by `collectd` or `StatsD` to TDengine to achieve easy, convenient and seamless migration in application scenarios. taosAdapter also supports Telegraf, Icinga, TCollector, and node_exporter data. For more details, please refer to [taosAdapter](../../reference/taosadapter/).
If using collectd, modify the configuration file in its default location `/etc/collectd/collectd.conf` to point to the IP address and port of the node where to deploy taosAdapter. For example, assuming the taosAdapter IP address is 192.168.1.130 and port 6046, configure it as follows.
...
...
@@ -411,7 +411,7 @@ TDengine provides a wealth of help documents to explain many aspects of cluster
### Cluster Deployment
The first is TDengine installation. Download the latest stable version of TDengine from the official website, and install it. Please refer to the blog ["Installation and Uninstallation of Various Installation Packages of TDengine"](https://www.taosdata.com/blog/2019/08/09/566.html) for the various installation package formats.
The first is TDengine installation. Download the latest stable version of TDengine from the official website, and install it. Please refer to [Install TDengine](../../get-started/package) for more details.
Note that once the installation is complete, do not immediately start the `taosd` service, but start it after correctly configuring the parameters.