未验证 提交 c150b2fb 编写于 作者: E Elias Soong 提交者: GitHub

Merge pull request #7780 from taosdata/docs/endoc

correct some English writing errors
# TDengine Documentation
TDengine is a highly efficient platform to store, query, and analyze time-series data. It is specially designed and optimized for IoT, Internet of Vehicles, Industrial IoT, IT Infrastructure and Application Monitoring, etc. It works like a relational database, such as MySQL, but you are strongly encouraged to read through the following documentation before you experience it, especially the Data Model and Data Modeling sections. In addition to this document, you should also download and read our technology white paper. For the older TDengine version 1.6 documentation, please click here.
TDengine is a highly efficient platform to store, query, and analyze time-series data. It is specially designed and optimized for IoT, Internet of Vehicles, Industrial IoT, IT Infrastructure and Application Monitoring, etc. It works like a relational database, such as MySQL, but you are strongly encouraged to read through the following documentation before you experience it, especially the Data Modeling sections. In addition to this document, you should also download and read the technology white paper. For the older TDengine version 1.6 documentation, please click [here](https://www.taosdata.com/en/documentation16/).
## [TDengine Introduction](/evaluation)
......@@ -10,28 +10,41 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
## [Getting Started](/getting-started)
* [Quickly Install](/getting-started#install): install via source code/package / Docker within seconds
- [Easy to Launch](/getting-started#start): start / stop TDengine with systemctl
- [Command-line](/getting-started#console) : an easy way to access TDengine server
- [Experience Lightning Speed](/getting-started#demo): running a demo, inserting/querying data to experience faster speed
- [List of Supported Platforms](/getting-started#platforms): a list of platforms supported by TDengine server and client
- [Deploy to Kubernetes](https://taosdata.github.io/TDengine-Operator/en/index.html):a detailed guide for TDengine deployment in Kubernetes environment
* [Quick Install](/getting-started#install): install via source code/package / Docker within seconds
* [Quick Launch](/getting-started#start): start / stop TDengine quickly with systemctl
* [Command-line](/getting-started#console) : an easy way to access TDengine server
* [Experience Lightning Speed](/getting-started#demo): running a demo, inserting/querying data to experience faster speed
* [List of Supported Platforms](/getting-started#platforms): a list of platforms supported by TDengine server and client
* [Deploy to Kubernetes](https://taosdata.github.io/TDengine-Operator/en/index.html):a detailed guide for TDengine deployment in Kubernetes environment
## [Overall Architecture](/architecture)
- [Data Model](/architecture#model): relational database model, but one table for one device with static tags
- [Cluster and Primary Logical Unit](/architecture#cluster): Take advantage of NoSQL, support scale-out and high-reliability
- [Storage Model and Data Partitioning/Sharding](/architecture#sharding): tag data will be separated from time-series data, segmented by vnode and time
- [Data Writing and Replication Process](/architecture#replication): records received are written to WAL, cached, with acknowledgement is sent back to client, while supporting multi-replicas
- [Data Model](/architecture#model): relational database model, but one table for one data collection point with static tags
- [Cluster and Primary Logical Unit](/architecture#cluster): Take advantage of NoSQL architecture, high availability and horizontal scalability
- [Storage Model and Data Partitioning/Sharding](/architecture#sharding): tag data is separated from time-series data, sharded by vnodes and partitioned by time
- [Data Writing and Replication Process](/architecture#replication): records received are written to WAL, cached, with acknowledgement sent back to client, while supporting data replications
- [Caching and Persistence](/architecture#persistence): latest records are cached in memory, but are written in columnar format with an ultra-high compression ratio
- [Data Query](/architecture#query): support various functions, time-axis aggregation, interpolation, and multi-table aggregation
- [Data Query](/architecture#query): support various SQL functions, downsampling, interpolation, and multi-table aggregation
## [Data Modeling](/model)
- [Create a Database](/model#create-db): create a database for all data collection points with similar features
- [Create a Database](/model#create-db): create a database for all data collection points with similar data characteristics
- [Create a Super Table(STable)](/model#create-stable): create a STable for all data collection points with the same type
- [Create a Table](/model#create-table): use STable as the template, to create a table for each data collecting point
- [Create a Table](/model#create-table): use STable as the template to create a table for each data collecting point
## [Efficient Data Ingestion](/insert)
- [Data Writing via SQL](/insert#sql): write one or multiple records into one or multiple tables via SQL insert command
- [Data Writing via Prometheus](/insert#prometheus): Configure Prometheus to write data directly without any code
- [Data Writing via Telegraf](/insert#telegraf): Configure Telegraf to write collected data directly without any code
- [Data Writing via EMQ X](/insert#emq): Configure EMQ X to write MQTT data directly without any code
- [Data Writing via HiveMQ Broker](/insert#hivemq): Configure HiveMQ to write MQTT data directly without any code
## [Efficient Data Querying](/queries)
- [Major Features](/queries#queries): support various standard query functions, setting filter conditions, and querying per time segment
- [Multi-table Aggregation](/queries#aggregation): use STable and set tag filter conditions to perform efficient aggregation
- [Downsampling](/queries#sampling): aggregate data in successive time windows, support interpolation
## [TAOS SQL](/taos-sql)
......@@ -40,27 +53,13 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
- [Table Management](/taos-sql#table): add, drop, check, alter tables
- [STable Management](/taos-sql#super-table): add, drop, check, alter STables
- [Tag Management](/taos-sql#tags): add, drop, alter tags
- [Inserting Records](/taos-sql#insert): support to write single/multiple items per table, multiple items across tables, and support to write historical data
- [Inserting Records](/taos-sql#insert): write single/multiple records a table, multiple records across tables, and historical data
- [Data Query](/taos-sql#select): support time segment, value filtering, sorting, manual paging of query results, etc
- [SQL Function](/taos-sql#functions): support various aggregation functions, selection functions, and calculation functions, such as avg, min, diff, etc
- [Time Dimensions Aggregation](/taos-sql#aggregation): aggregate and reduce the dimension after cutting table data by time segment
- [Cutting and Aggregation](/taos-sql#aggregation): aggregate and reduce the dimension after cutting table data by time segment
- [Boundary Restrictions](/taos-sql#limitation): restrictions for the library, table, SQL, and others
- [Error Code](/taos-sql/error-code): TDengine 2.0 error codes and corresponding decimal codes
## [Efficient Data Ingestion](/insert)
- [SQL Ingestion](/insert#sql): write one or multiple records into one or multiple tables via SQL insert command
- [Prometheus Ingestion](/insert#prometheus): Configure Prometheus to write data directly without any code
- [Telegraf Ingestion](/insert#telegraf): Configure Telegraf to write collected data directly without any code
- [EMQ X Broker](/insert#emq): Configure EMQ X to write MQTT data directly without any code
- [HiveMQ Broker](/insert#hivemq): Configure HiveMQ to write MQTT data directly without any code
## [Efficient Data Querying](/queries)
- [Main Query Features](/queries#queries): support various standard functions, setting filter conditions, and querying per time segment
- [Multi-table Aggregation Query](/queries#aggregation): use STable and set tag filter conditions to perform efficient aggregation queries
- [Downsampling to Query Value](/queries#sampling): aggregate data in successive time windows, support interpolation
## [Advanced Features](/advanced-features)
- [Continuous Query](/advanced-features#continuous-query): Based on sliding windows, the data stream is automatically queried and calculated at regular intervals
......@@ -88,12 +87,12 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
## [Installation and Management of TDengine Cluster](/cluster)
- [Preparation](/cluster#prepare): important considerations before deploying TDengine for production usage
- [Create Your First Node](/cluster#node-one): simple to follow the quick setup
- [Preparation](/cluster#prepare): important steps before deploying TDengine for production usage
- [Create the First Node](/cluster#node-one): just follow the steps in quick start
- [Create Subsequent Nodes](/cluster#node-other): configure taos.cfg for new nodes to add more to the existing cluster
- [Node Management](/cluster#management): add, delete, and check nodes in the cluster
- [High-availability of Vnode](/cluster#high-availability): implement high-availability of Vnode through multi-replicas
- [Mnode Management](/cluster#mnode): automatic system creation without any manual intervention
- [High-availability of Vnode](/cluster#high-availability): implement high-availability of Vnode through replicas
- [Mnode Management](/cluster#mnode): mnodes are created automatically without any manual intervention
- [Load Balancing](/cluster#load-balancing): automatically performed once the number of nodes or load changes
- [Offline Node Processing](/cluster#offline): any node that offline for more than a certain period will be removed from the cluster
- [Arbitrator](/cluster#arbitrator): used in the case of an even number of replicas to prevent split-brain
......@@ -108,27 +107,14 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
- [Export Data](/administrator#export): export data either from TDengine shell or from the taosdump tool
- [System Monitor](/administrator#status): monitor the system connections, queries, streaming calculation, logs, and events
- [File Directory Structure](/administrator#directories): directories where TDengine data files and configuration files located
- [Parameter Restrictions and Reserved Keywords](/administrator#keywords): TDengine’s list of parameter restrictions and reserved keywords
## TDengine Technical Design
- [System Module]: taosd functions and modules partitioning
- [Data Replication]: support real-time synchronous/asynchronous replication, to ensure high-availability of the system
- [Technical Blog](https://www.taosdata.com/cn/blog/?categories=3): More technical analysis and architecture design articles
## Common Tools
- [TDengine sample import tools](https://www.taosdata.com/blog/2020/01/18/1166.html)
- [TDengine performance comparison test tools](https://www.taosdata.com/blog/2020/01/18/1166.html)
- [Use TDengine visually through IDEA Database Management Tool](https://www.taosdata.com/blog/2020/08/27/1767.html)
- [Parameter Limitss and Reserved Keywords](/administrator#keywords): TDengine’s list of parameter limits and reserved keywords
## Performance: TDengine vs Others
- [Performance: TDengine vs InfluxDB with InfluxDB’s open-source performance testing tool](https://www.taosdata.com/blog/2020/01/13/1105.html)
- [Performance: TDengine vs OpenTSDB](https://www.taosdata.com/blog/2019/08/21/621.html)
- [Performance: TDengine vs Cassandra](https://www.taosdata.com/blog/2019/08/14/573.html)
- [Performance: TDengine vs InfluxDB](https://www.taosdata.com/blog/2019/07/19/419.html)
- [Performance Test Reports of TDengine vs InfluxDB/OpenTSDB/Cassandra/MySQL/ClickHouse](https://www.taosdata.com/downloads/TDengine_Testing_Report_cn.pdf)
- [Performance: TDengine vs OpenTSDB](https://www.taosdata.com/blog/2019/09/12/710.html)
- [Performance: TDengine vs Cassandra](https://www.taosdata.com/blog/2019/09/12/708.html)
- [Performance: TDengine vs InfluxDB](https://www.taosdata.com/blog/2019/09/12/706.html)
- [Performance Test Reports of TDengine vs InfluxDB/OpenTSDB/Cassandra/MySQL/ClickHouse](https://www.taosdata.com/downloads/TDengine_Testing_Report_en.pdf)
## More on IoT Big Data
......@@ -136,7 +122,8 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
- [Features and Functions of IoT Big Data platforms](https://www.taosdata.com/blog/2019/07/29/542.html)
- [Why don’t General Big Data Platforms Fit IoT Scenarios?](https://www.taosdata.com/blog/2019/07/09/why-does-the-general-big-data-platform-not-fit-iot-data-processing/)
- [Why TDengine is the best choice for IoT, Internet of Vehicles, and Industry Internet Big Data platforms?](https://www.taosdata.com/blog/2019/07/09/why-tdengine-is-the-best-choice-for-iot-big-data-processing/)
- [Technical Blog](https://www.taosdata.com/cn/blog/?categories=3): More technical analysis and architecture design articles
## FAQ
- [FAQ: Common questions and answers](/faq)
- [FAQ: Common questions and answers](/faq)
\ No newline at end of file
......@@ -2,18 +2,18 @@
## <a class="anchor" id="intro"></a> About TDengine
TDengine is an innovative Big Data processing product launched by Taos Data in the face of the fast-growing Internet of Things (IoT) Big Data market and technical challenges. It does not rely on any third-party software, nor does it optimize or package any open-source database or stream computing product. Instead, it is a product independently developed after absorbing the advantages of many traditional relational databases, NoSQL databases, stream computing engines, message queues, and other software. TDengine has its own unique Big Data processing advantages in time-series space.
TDengine is an innovative Big Data processing product launched by TAOS Data in the face of the fast-growing Internet of Things (IoT) Big Data market and technical challenges. It does not rely on any third-party software, nor does it optimize or package any open-source database or stream computing product. Instead, it is a product independently developed after absorbing the advantages of many traditional relational databases, NoSQL databases, stream computing engines, message queues, and other software. TDengine has its own unique Big Data processing advantages in time-series space.
One of the modules of TDengine is the time-series database. However, in addition to this, to reduce the complexity of research and development and the difficulty of system operation, TDengine also provides functions such as caching, message queuing, subscription, stream computing, etc. TDengine provides a full-stack technical solution for the processing of IoT and Industrial Internet BigData. It is an efficient and easy-to-use IoT Big Data platform. Compared with typical Big Data platforms such as Hadoop, TDengine has the following distinct characteristics:
- **Performance improvement over 10 times**: An innovative data storage structure is defined, with each single core can process at least 20,000 requests per second, insert millions of data points, and read more than 10 million data points, which is more than 10 times faster than other existing general database.
- **Reduce the cost of hardware or cloud services to 1/5**: Due to its ultra-performance, TDengine’s computing resources consumption is less than 1/5 of other common Big Data solutions; through columnar storage and advanced compression algorithms, the storage consumption is less than 1/10 of other general databases.
- **Full-stack time-series data processing engine**: Integrate database, message queue, cache, stream computing, and other functions, and the applications do not need to integrate with software such as Kafka/Redis/HBase/Spark/HDFS, thus greatly reducing the complexity cost of application development and maintenance.
- **Powerful analysis functions**: Data from ten years ago or one second ago, can all be queried based on a specified time range. Data can be aggregated on a timeline or multiple devices. Ad-hoc queries can be made at any time through Shell, Python, R, and MATLAB.
- **Seamless connection with third-party tools**: Integration with Telegraf, Grafana, EMQ, HiveMQ, Prometheus, MATLAB, R, etc. without even one single line of code. OPC, Hadoop, Spark, etc. will be supported in the future, and more BI tools will be seamlessly connected to.
- **Highly Available and Horizontal Scalable **: With the distributed architecture and consistency algorithm, via multi-replication and clustering features, TDengine ensures high availability and horizontal scalability to support the mission-critical applications.
- **Zero operation cost & zero learning cost**: Installing clusters is simple and quick, with real-time backup built-in, and no need to split libraries or tables. Similar to standard SQL, TDengine can support RESTful, Python/Java/C/C++/C#/Go/Node.js, and similar to MySQL with zero learning cost.
- **Core is Open Sourced:** Except some auxiliary features, the core of TDengine is open sourced. Enterprise won't be locked by the database anymore. Ecosystem is more strong, product is more stable, and developer communities are more active.
With TDengine, the total cost of ownership of typical IoT, Internet of Vehicles, and Industrial Internet Big Data platforms can be greatly reduced. However, it should be pointed out that due to making full use of the characteristics of IoT time-series data, TDengine cannot be used to process general data from web crawlers, microblogs, WeChat, e-commerce, ERP, CRM, and other sources.
With TDengine, the total cost of ownership of typical IoT, Internet of Vehicles, and Industrial Internet Big Data platforms can be greatly reduced. However, since it makes full use of the characteristics of IoT time-series data, TDengine cannot be used to process general data from web crawlers, microblogs, WeChat, e-commerce, ERP, CRM, and other sources.
![TDengine Technology Ecosystem](page://images/eco_system.png)
......@@ -62,4 +62,4 @@ From the perspective of data sources, designers can analyze the applicability of
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require system with high-reliability | | | √ | TDengine has a very robust and reliable system architecture to implement simple and convenient daily operation with streamlined experiences for operators, thus human errors and accidents are eliminated to the greatest extent. |
| Require controllable operation learning cost | | | √ | As above. |
| Require abundant talent supply | √ | | | As a new-generation product, it’s still difficult to find talents with TDengine experiences from market. However, the learning cost is low. As the vendor, we also provide extensive operation training and counselling services. |
| Require abundant talent supply | √ | | | As a new-generation product, it’s still difficult to find talents with TDengine experiences from market. However, the learning cost is low. As the vendor, we also provide extensive operation training and counselling services. |
\ No newline at end of file
......@@ -2,7 +2,7 @@
## <a class="anchor" id="install"></a>Quick Install
TDengine software consists of 3 components: server, client, and alarm module. At the moment, TDengine server only runs on Linux (Windows, mac OS and more OS supports will come soon), but client can run on either Windows or Linux. TDengine client can be installed and run on Windows or Linux. Applications based-on any OSes can all connect to server taosd via a RESTful interface. About CPU, TDengine supports X64/ARM64/MIPS64/Alpha64, and ARM32、RISC-V, other more CPU architectures will be supported soon. You can set up and install TDengine server either from the [source code](https://www.taosdata.com/en/getting-started/#Install-from-Source) or the [packages](https://www.taosdata.com/en/getting-started/#Install-from-Package).
TDengine software consists of 3 parts: server, client, and alarm module. At the moment, TDengine server only runs on Linux (Windows, mac OS and more OS supports will come soon), but client can run on either Windows or Linux. TDengine client can be installed and run on Windows or Linux. Applications based-on any OSes can all connect to server taosd via a RESTful interface. About CPU, TDengine supports X64/ARM64/MIPS64/Alpha64, and ARM32、RISC-V, other more CPU architectures will be supported soon. You can set up and install TDengine server either from the [source code](https://www.taosdata.com/en/getting-started/#Install-from-Source) or the [packages](https://www.taosdata.com/en/getting-started/#Install-from-Package).
### <a class="anchor" id="source-install"></a>Install from Source
......@@ -16,7 +16,7 @@ Please refer to the detailed operation in [Quickly experience TDengine through D
### <a class="anchor" id="package-install"></a>Install from Package
It’s extremely easy to install for TDengine, which takes only a few seconds from downloaded to successful installed. The server installation package includes clients and connectors. We provide 3 installation packages, which you can choose according to actual needs:
Three different packages for TDengine server are provided, please pick up the one you like. (Lite packages only have execution files and connector of C/C++, but standard packages support connectors of nearly all programming languages.) Beta version has more features, but we suggest you to install stable version for production or testing.
Click [here](https://www.taosdata.com/en/getting-started/#Install-from-Package) to download the install package.
......@@ -131,7 +131,7 @@ After starting the TDengine server, you can execute the command `taosdemo` in th
$ taosdemo
```
Using this command, a STable named `meters` will be created in the database `test` There are 10k tables under this stable, named from `t0` to `t9999`. In each table there are 100k rows of records, each row with columns (`f1`, `f2` and `f3`. The timestamp is from "2017-07-14 10:40:00 000" to "2017-07-14 10:41:39 999". Each table also has tags `areaid` and `loc`: `areaid` is set from 1 to 10, `loc` is set to "beijing" or "shanghai".
Using this command, a STable named `meters` will be created in the database `test`. There are 10k tables under this STable, named from `t0` to `t9999`. In each table there are 100k rows of records, each row with columns (`f1`, `f2` and `f3`. The timestamp is from "2017-07-14 10:40:00 000" to "2017-07-14 10:41:39 999". Each table also has tags `areaid` and `loc`: `areaid` is set from 1 to 10, `loc` is set to "beijing" or "shanghai".
It takes about 10 minutes to execute this command. Once finished, 1 billion rows of records will be inserted.
......@@ -201,7 +201,7 @@ Note: ● has been verified by official tests; ○ has been verified by unoffici
List of platforms supported by TDengine client and connectors
At the moment, TDengine connectors can support a wide range of platforms, including hardware platforms such as X64/X86/ARM64/ARM32/MIPS/Alpha, and development environments such as Linux/Win64/Win32.
At the moment, TDengine connectors can support a wide range of platforms, including hardware platforms such as X64/X86/ARM64/ARM32/MIPS/Alpha, and operating system such as Linux/Win64/Win32.
Comparison matrix as following:
......@@ -218,4 +218,4 @@ Comparison matrix as following:
Note: ● has been verified by official tests; ○ has been verified by unofficial tests.
Please visit [Connectors](https://www.taosdata.com/en/documentation/connector) section for more detailed information.
Please visit Connectors section for more detailed information.
\ No newline at end of file
......@@ -2,17 +2,15 @@
TDengine adopts a relational data model, so we need to build the "database" and "table". Therefore, for a specific application scenario, it is necessary to consider the design of the database, STable and ordinary table. This section does not discuss detailed syntax rules, but only concepts.
Please watch the [video tutorial](https://www.taosdata.com/blog/2020/11/11/1945.html) for data modeling.
## <a class="anchor" id="create-db"></a> Create a Database
Different types of data collection points often have different data characteristics, including frequency of data collecting, length of data retention time, number of replicas, size of data blocks, whether to update data or not, and so on. To ensure TDengine working with great efficiency in various scenarios, TDengine suggests creating tables with different data characteristics in different databases, because each database can be configured with different storage strategies. When creating a database, in addition to SQL standard options, the application can also specify a variety of parameters such as retention duration, number of replicas, number of memory blocks, time accuracy, max and min number of records in a file block, whether it is compressed or not, and number of days a data file will be overwritten. For example:
Different types of data collection points often have different data characteristics, including data sampling rate, length of data retention time, number of replicas, size of data blocks, whether to update data or not, and so on. To ensure TDengine working with great efficiency in various scenarios, TDengine suggests creating tables with different data characteristics in different databases, because each database can be configured with different storage strategies. When creating a database, in addition to SQL standard options, the application can also specify a variety of parameters such as retention duration, number of replicas, number of memory blocks, time resolution, max and min number of records in a file block, whether it is compressed or not, and number of days covered by a data file. For example:
```mysql
CREATE DATABASE power KEEP 365 DAYS 10 BLOCKS 6 UPDATE 1;
```
The above statement will create a database named “power”. The data of this database will be kept for 365 days (it will be automatically deleted 365 days later), one data file created per 10 days, and the number of memory blocks is 4 for data updating. For detailed syntax and parameters, please refer to [Data Management section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#management).
The above statement will create a database named “power”. The data of this database will be kept for 365 days (data will be automatically deleted 365 days later), one data file will be created per 10 days, the number of memory blocks is 4, and data updating is allowed. For detailed syntax and parameters, please refer to [Data Management section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#management).
After the database created, please use SQL command USE to switch to the new database, for example:
......@@ -20,13 +18,12 @@ After the database created, please use SQL command USE to switch to the new data
USE power;
```
Replace the database operating in the current connection with “power”, otherwise, before operating on a specific table, you need to use "database name. table name" to specify the name of database to use.
Specify the database operating in the current connection with “power”, otherwise, before operating on a specific table, you need to use "database-name.table-name" to specify the name of database to use.
**Note:**
- Any table or STable belongs to a database. Before creating a table, a database must be created first.
- Tables in two different databases cannot be JOIN.
- You need to specify a timestamp when creating and inserting records, or querying history records.
## <a class="anchor" id="create-stable"></a> Create a STable
......@@ -38,11 +35,11 @@ CREATE STABLE meters (ts timestamp, current float, voltage int, phase float) TAG
**Note:** The STABLE keyword in this instruction needs to be written as TABLE in versions before 2.0.15.
Just like creating an ordinary table, you need to provide the table name (‘meters’ in the example) and the table structure Schema, that is, the definition of data columns. The first column must be a timestamp (‘ts’ in the example), the other columns are the physical metrics collected (current, volume, phase in the example), and the data types can be int, float, string, etc. In addition, you need to provide the schema of the tag (location, groupId in the example), and the data types of the tag can be int, float, string and so on. Static attributes of collection points can often be used as tags, such as geographic location of collection points, device model, device group ID, administrator ID, etc. The schema of the tag can be added, deleted and modified afterwards. Please refer to the [STable Management section of TAOS SQL](https://www.taosdata.com/cn/documentation/taos-sql#super-table) for specific definitions and details.
Just like creating an ordinary table, you need to provide the table name (‘meters’ in the example) and the table structure Schema, that is, the definition of data columns. The first column must be a timestamp (‘ts’ in the example), the other columns are the physical metrics collected (current, volume, phase in the example), and the data types can be int, float, string, etc. In addition, you need to provide the schema of the tag (location, groupId in the example), and the data types of the tag can be int, float, string and so on. Static attributes of data collection points can often be used as tags, such as geographic location of collection points, device model, device group ID, administrator ID, etc. The schema of the tags can be added, deleted and modified afterwards. Please refer to the [STable Management section of TAOS SQL](https://www.taosdata.com/cn/documentation/taos-sql#super-table) for specific definitions and details.
Each type of data collection point needs an established STable, so an IoT system often has multiple STables. For the power grid, we need to build a STable for smart meters, transformers, buses, switches, etc. For IoT, a device may have multiple data collection points (for example, a fan for wind-driven generator, some collection points capture parameters such as current and voltage, and some capture environmental parameters such as temperature, humidity and wind direction). In this case, multiple STables need to be established for corresponding types of devices. All collected physical metrics contained in one and the same STable must be collected at the same time (with a consistent timestamp).
A STable must be created for each type of data collection point, so an IoT system often has multiple STables. For the power grid, we need to build a STable for smart meters, a STable for transformers, a STable for buses, a STable for switches, etc. For IoT, a device may have multiple data collection points (for example, a fan for wind-driven generator, one data collection point captures metrics such as current and voltage, and one data collection point captures environmental parameters such as temperature, humidity and wind direction). In this case, multiple STables need to be established for corresponding types of devices. All metrics contained in a STable must be collected at the same time (with the same timestamp).
A STable allows up to 1024 columns. If the number of physical metrics collected at a collection point exceeds 1024, multiple STables need to be built to process them. A system can have multiple DBs, and a DB can have one or more STables.
A STable allows up to 1024 columns. If the number of metrics collected at a data collection point exceeds 1024, multiple STables need to be built to process them. A system can have multiple DBs, and a DB can have one or more STables.
## <a class="anchor" id="create-table"></a> Create a Table
......@@ -54,22 +51,23 @@ CREATE TABLE d1001 USING meters TAGS ("Beijing.Chaoyang", 2);
Where d1001 is the table name, meters is the name of the STable, followed by the specific tag value of tag Location as "Beijing.Chaoyang", and the specific tag value of tag groupId 2. Although the tag value needs to be specified when creating the table, it can be modified afterwards. Please refer to the [Table Management section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#table) for details.
**Note: ** At present, TDengine does not technically restrict the use of a STable of a database (dbA) as a template to create a sub-table of another database (dbB). This usage will be prohibited later, and it is not recommended to use this method to create a table.
**Note: ** At present, TDengine does not technically restrict the use of a STable of a database (dbA) as a template to create a sub-table of another database (dbB). This usage will be prohibited later, and it is not recommended to use this way to create a table.
TDengine suggests to use the globally unique ID of data collection point as a table name (such as device serial number). However, in some scenarios, there is no unique ID, and multiple IDs can be combined into a unique ID. It is not recommended to use a unique ID as tag value.
**Automatic table creating** : In some special scenarios, user is not sure whether the table of a certain data collection point exists when writing data. In this case, the non-existent table can be created by using automatic table building syntax when writing data. If the table already exists, no new table will be created. For example:
**Automatic table creating** : In some special scenarios, user is not sure whether the table of a certain data collection point exists when writing data. In this case, the non-existent table can be created by using automatic table creating syntax when writing data. If the table already exists, no new table will be created. For example:
```mysql
INSERT INTO d1001 USING METERS TAGS ("Beijng.Chaoyang", 2) VALUES (now, 10.2, 219, 0.32);
```
The SQL statement above inserts records (now, 10.2, 219, 0.32) into table d1001. If table d1001 has not been created yet, the STable meters is used as the template to automatically create it, and the tag value "Beijing.Chaoyang", 2 is marked at the same time.
The SQL statement above inserts records (now, 10.2, 219, 0.32) into table d1001. If table d1001 has not been created yet, the STable meters is used as the template to create it automatically, and the tag value "Beijing.Chaoyang", 2 is set at the same time.
For detailed syntax of automatic table building, please refer to the "[Automatic Table Creation When Inserting Records](https://www.taosdata.com/en/documentation/taos-sql#auto_create_table)" section.
## Multi-column Model vs Single-column Model
TDengine supports multi-column model. As long as physical metrics are collected simultaneously by a data collection point (with a consistent timestamp), these metrics can be placed in a STable as different columns. However, there is also an extreme design, a single-column model, in which each collected physical metric is set up separately, so each type of physical metrics is set up separately with a STable. For example, create 3 Stables, one each for current, voltage and phase.
TDengine supports multi-column model. As long as metrics are collected simultaneously by a data collection point (with the same timestamp), these metrics can be placed in a STable as different columns. However, there is also an extreme design, a single-column model, in which a STable is created for each metric. For smart meter example, we need to create 3 Stables, one for current, one for voltage and one for phase.
TDengine recommends using multi-column model as much as possible because of higher insertion and storage efficiency. However, for some scenarios, types of collected metrics often change. In this case, if multi-column model is adopted, the schema definition of STable needs to be modified frequently and the application becomes complicated. To avoid that, single-column model is recommended.
TDengine recommends using multi-column model as much as possible because of higher insertion and storage efficiency. However, for some scenarios, types of collected metrics often change. In this case, if multi-column model is adopted, the structure definition of STable needs to be frequently modified so make the application complicated. To avoid that, single-column model is recommended.
# Efficient Data Writing
TDengine supports multiple interfaces to write data, including SQL, Prometheus, Telegraf, EMQ MQTT Broker, HiveMQ Broker, CSV file, etc. Kafka, OPC and other interfaces will be provided in the future. Data can be inserted in a single piece or in batches, data from one or multiple data collection points can be inserted at the same time. TDengine supports multi-thread insertion, nonsequential data insertion, and also historical data insertion.
TDengine supports multiple ways to write data, including SQL, Prometheus, Telegraf, EMQ MQTT Broker, HiveMQ Broker, CSV file, etc. Kafka, OPC and other interfaces will be provided in the future. Data can be inserted in one single record or in batches, data from one or multiple data collection points can be inserted at the same time. TDengine supports multi-thread insertion, out-of-order data insertion, and also historical data insertion.
## <a class="anchor" id="sql"></a> SQL Writing
## <a class="anchor" id="sql"></a> Data Writing via SQL
Applications insert data by executing SQL insert statements through C/C++, JDBC, GO, C#, or Python Connector, and users can manually enter SQL insert statements to insert data through TAOS Shell. For example, the following insert writes a record to table d1001:
......@@ -10,13 +10,13 @@ Applications insert data by executing SQL insert statements through C/C++, JDBC,
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31);
```
TDengine supports writing multiple records at a time. For example, the following command writes two records to table d1001:
TDengine supports writing multiple records in a single statement. For example, the following command writes two records to table d1001:
```mysql
INSERT INTO d1001 VALUES (1538548684000, 10.2, 220, 0.23) (1538548696650, 10.3, 218, 0.25);
```
TDengine also supports writing data to multiple tables at a time. For example, the following command writes two records to d1001 and one record to d1002:
TDengine also supports writing data to multiple tables in a single statement. For example, the following command writes two records to d1001 and one record to d1002:
```mysql
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31) (1538548695000, 12.6, 218, 0.33) d1002 VALUES (1538548696800, 12.3, 221, 0.31);
......@@ -26,22 +26,22 @@ For the SQL INSERT Grammar, please refer to [Taos SQL insert](https://www.taosd
**Tips:**
- To improve writing efficiency, batch writing is required. The more records written in a batch, the higher the insertion efficiency. However, a record cannot exceed 16K, and the total length of an SQL statement cannot exceed 64K (it can be configured by parameter maxSQLLength, and the maximum can be configured to 1M).
- TDengine supports multi-thread parallel writing. To further improve writing speed, a client needs to open more than 20 threads to write parallelly. However, after the number of threads reaches a certain threshold, it cannot be increased or even become decreased, because too much frequent thread switching brings extra overhead.
- For a same table, if the timestamp of a newly inserted record already exists, (no database was created using UPDATE 1) the new record will be discarded as default, that is, the timestamp must be unique in a table. If an application automatically generates records, it is very likely that the generated timestamps will be the same, so the number of records successfully inserted will be smaller than the number of records the application try to insert. If you use UPDATE 1 option when creating a database, inserting a new record with the same timestamp will overwrite the original record.
- To improve writing efficiency, batch writing is required. The more records written in a batch, the higher the insertion efficiency. However, a record size cannot exceed 16K, and the total length of an SQL statement cannot exceed 64K (it can be configured by parameter maxSQLLength, and the maximum can be configured to 1M).
- TDengine supports multi-thread parallel writing. To further improve writing speed, a client needs to open more than 20 threads to write parallelly. However, after the number of threads reaches a certain threshold, it cannot be increased or even become decreased, because too much thread switching brings extra overhead.
- For the same table, if the timestamp of a newly inserted record already exists, the new record will be discarded as default (database option update = 0), that is, the timestamp must be unique in a table. If an application automatically generates records, it is very likely that the generated timestamps will be the same, so the number of records successfully inserted will be smaller than the number of records the application try to insert. If you use UPDATE 1 option when creating a database, inserting a new record with the same timestamp will overwrite the original record.
- The timestamp of written data must be greater than the current time minus the time of configuration parameter keep. If keep is configured for 3650 days, data older than 3650 days cannot be written. The timestamp for writing data cannot be greater than the current time plus configuration parameter days. If days is configured to 2, data 2 days later than the current time cannot be written.
## <a class="anchor" id="prometheus"></a> Direct Writing of Prometheus
## <a class="anchor" id="prometheus"></a> Data Writing via Prometheus
As a graduate project of Cloud Native Computing Foundation, [Prometheus](https://www.prometheus.io/) is widely used in the field of performance monitoring and K8S performance monitoring. TDengine provides a simple tool [Bailongma](https://github.com/taosdata/Bailongma), which only needs to be simply configured in Prometheus without any code, and can directly write the data collected by Prometheus into TDengine, then automatically create databases and related table entries in TDengine according to rules. Blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html), which is an example of using bailongma to write Prometheus and Telegraf data into TDengine.
### Compile blm_prometheus From Source
Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from github, then compile and generate an executable file using Golang language compiler. Before you start compiling, you need to complete following prepares:
Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from github, then compile and generate an executable file using Golang language compiler. Before you start compiling, you need to prepare:
- A server running Linux OS
- Golang version 1.10 and higher installed
- An appropriated TDengine version. Because the client dynamic link library of TDengine is used, it is necessary to install the same version of TDengine as the server-side; for example, if the server version is TDengine 2.0. 0, ensure install the same version on the linux server where bailongma is located (can be on the same server as TDengine, or on a different server)
- Since the client dynamic link library of TDengine is used, it is necessary to install the same version of TDengine as the server-side. For example, if the server version is TDengine 2.0. 0, ensure install the same version on the linux server where bailongma is located (can be on the same server as TDengine, or on a different server)
Bailongma project has a folder, blm_prometheus, which holds the prometheus writing API. The compiling process is as follows:
......@@ -134,7 +134,7 @@ The format of generated data by Prometheus is as follows:
}
```
Where apiserver_request_latencies_bucket is the name of the time-series data collected by prometheus, and the tag of the time-series data is in the following {}. blm_prometheus automatically creates a STable in TDengine with the name of the time series data, and converts the tag in {} into the tag value of TDengine, with Timestamp as the timestamp and value as the value of the time-series data. Therefore, in the client of TDEngine, you can check whether this data was successfully written through the following instruction.
Where apiserver_request_latencies_bucket is the name of the time-series data collected by prometheus, and the tag of the time-series data is in the following {}. blm_prometheus automatically creates a STable in TDengine with the name of the time series data, and converts the tag in {} into the tag value of TDengine, with Timestamp as the timestamp and value as the value of the time-series data. Therefore, in the client of TDengine, you can check whether this data was successfully written through the following instruction.
```mysql
use prometheus;
......@@ -144,7 +144,7 @@ select * from apiserver_request_latencies_bucket;
## <a class="anchor" id="telegraf"></a> Direct Writing of Telegraf
## <a class="anchor" id="telegraf"></a> Data Writing via Telegraf
[Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) is a popular open source tool for IT operation data collection. TDengine provides a simple tool [Bailongma](https://github.com/taosdata/Bailongma), which only needs to be simply configured in Telegraf without any code, and can directly write the data collected by Telegraf into TDengine, then automatically create databases and related table entries in TDengine according to rules. Blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html), which is an example of using bailongma to write Prometheus and Telegraf data into TDengine.
......@@ -271,12 +271,12 @@ select * from cpu;
MQTT is a popular data transmission protocol in the IoT. TDengine can easily access the data received by MQTT Broker and write it to TDengine.
## <a class="anchor" id="emq"></a> Direct Writing of EMQ Broker
## <a class="anchor" id="emq"></a> Data Writing via EMQ Broker
[EMQ](https://github.com/emqx/emqx) is an open source MQTT Broker software, with no need of coding, only to use "rules" in EMQ Dashboard for simple configuration, and MQTT data can be directly written into TDengine. EMQ X supports storing data to the TDengine by sending it to a Web service, and also provides a native TDengine driver on Enterprise Edition for direct data store. Please refer to [EMQ official documents](https://docs.emqx.io/broker/latest/cn/rule/rule-example.html#%E4%BF%9D%E5%AD%98%E6%95%B0%E6%8D%AE%E5%88%B0-tdengine) for more details.
## <a class="anchor" id="hivemq"></a> Direct Writing of HiveMQ Broker
## <a class="anchor" id="hivemq"></a> Data Writing via HiveMQ Broker
[HiveMQ](https://www.hivemq.com/) is an MQTT agent that provides Free Personal and Enterprise Edition versions. It is mainly used for enterprises, emerging machine-to-machine(M2M) communication and internal transmission to meet scalability, easy management and security features. HiveMQ provides an open source plug-in development kit. You can store data to TDengine via HiveMQ extension-TDengine. Refer to the [HiveMQ extension-TDengine documentation](https://github.com/huskar-t/hivemq-tdengine-extension/blob/b62a26ecc164a310104df57691691b237e091c89/README.md) for more details.
[HiveMQ](https://www.hivemq.com/) is an MQTT agent that provides Free Personal and Enterprise Edition versions. It is mainly used for enterprises, emerging machine-to-machine(M2M) communication and internal transmission to meet scalability, easy management and security features. HiveMQ provides an open source plug-in development kit. You can store data to TDengine via HiveMQ extension-TDengine. Refer to the [HiveMQ extension-TDengine documentation](https://github.com/huskar-t/hivemq-tdengine-extension/blob/b62a26ecc164a310104df57691691b237e091c89/README.md) for more details.
\ No newline at end of file
......@@ -28,7 +28,7 @@ For specific query syntax, please see the [Data Query section of TAOS SQL](https
## <a class="anchor" id="aggregation"></a> Multi-table Aggregation Query
In an IoT scenario, there are often multiple data collection points in a same type. TDengine uses the concept of STable to describe a certain type of data collection point, and an ordinary table to describe a specific data collection point. At the same time, TDengine uses tags to describe the statical attributes of data collection points. A given data collection point has a specific tag value. By specifying the filters of tags, TDengine provides an efficient method to aggregate and query the sub-tables of STables (data collection points of a certain type). Aggregation functions and most operations on ordinary tables are applicable to STables, and the syntax is exactly the same.
In an IoT scenario, there are often multiple data collection points in a same type. TDengine uses the concept of STable to describe a certain type of data collection point, and an ordinary table to describe a specific data collection point. At the same time, TDengine uses tags to describe the static attributes of data collection points. A given data collection point has a specific tag value. By specifying the filters of tags, TDengine provides an efficient method to aggregate and query the sub-tables of STables (data collection points of a certain type). Aggregation functions and most operations on ordinary tables are applicable to STables, and the syntax is exactly the same.
**Example 1**: In TAOS Shell, look up the average voltages collected by all smart meters in Beijing and group them by location
......@@ -55,7 +55,7 @@ TDengine only allows aggregation queries between tables belonging to a same STab
## <a class="anchor" id="sampling"></a> Down Sampling Query, Interpolation
In a scenario of IoT, it is often necessary to aggregate the collected data by intervals through down sampling. TDengine provides a simple keyword interval, which makes query operations according to time windows extremely simple. For example, the current values collected by smart meter d1001 are summed every 10 seconds.
In a scenario of IoT, it is often necessary to aggregate the collected data by intervals through down sampling. TDengine provides a simple keyword `interval`, which makes query operations according to time windows extremely simple. For example, the current values collected by smart meter d1001 are summed every 10 seconds.
```mysql
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
......@@ -94,6 +94,6 @@ taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
Query OK, 5 row(s) in set (0.001521s)
```
In a scenario of IoT, it is difficult to synchronize the time stamp of collected data at each point, but many analysis algorithms (such as FFT) need to align the collected data strictly at equal intervals of time. In many systems, it’s required to write their own programs to process, but the down sampling operation of TDengine can be easily solved. If there is no collected data in an interval, TDengine also provides interpolation calculation function.
In IoT scenario, it is difficult to synchronize the time stamp of collected data at each point, but many analysis algorithms (such as FFT) need to align the collected data strictly at equal intervals of time. In many systems, it’s required to write their own programs to process, but the down sampling operation of TDengine can be used to solve the problem easily. If there is no collected data in an interval, TDengine also provides interpolation calculation function.
For details of syntax rules, please refer to the [Time-dimension Aggregation section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#aggregation).
\ No newline at end of file
......@@ -9,8 +9,8 @@ Continuous query of TDengine adopts time-driven mode, which can be defined direc
The continuous query provided by TDengine differs from the time window calculation in ordinary stream computing in the following ways:
- Unlike the real-time feedback calculated results of stream computing, continuous query only starts calculation after the time window is closed. For example, if the time period is 1 day, the results of that day will only be generated after 23:59:59.
- If a history record is written to the time interval that has been calculated, the continuous query will not recalculate and will not push the results to the user again. For the mode of writing back to TDengine, the existing calculated results will not be updated.
- Using the mode of continuous query pushing results, the server does not cache the client's calculation status, nor does it provide Exactly-Once semantic guarantee. If the user's application side crashed, the continuous query pulled up again would only recalculate the latest complete time window from the time pulled up again. If writeback mode is used, TDengine can ensure the validity and continuity of data writeback.
- If a history record is written to the time interval that has been calculated, the continuous query will not re-calculate and will not push the new results to the user again.
- TDengine server does not cache or save the client's status, nor does it provide Exactly-Once semantic guarantee. If the application crashes, the continuous query will be pull up again and starting time must be provided by the application.
### How to use continuous query
......@@ -29,7 +29,7 @@ We already know that the average voltage of these meters can be counted with one
select avg(voltage) from meters interval(1m) sliding(30s);
```
Every time this statement is executed, all data will be recalculated. If you need to execute every 30 seconds to incrementally calculate the data of the latest minute, you can improve the above statement as following, using a different `startTime` each time and executing it regularly:
Every time this statement is executed, all data will be re-calculated. If you need to execute every 30 seconds to incrementally calculate the data of the latest minute, you can improve the above statement as following, using a different `startTime` each time and executing it regularly:
```sql
select avg(voltage) from meters where ts > {startTime} interval(1m) sliding(30s);
......@@ -65,7 +65,7 @@ It should be noted that now in the above example refers to the time when continu
### Manage the Continuous Query
Users can view all continuous queries running in the system through the show streams command in the console, and can kill the corresponding continuous queries through the kill stream command. Subsequent versions will provide more finer-grained and convenient continuous query management commands.
Users can view all continuous queries running in the system through the `show streams` command in the console, and can kill the corresponding continuous queries through the `kill stream` command. Subsequent versions will provide more finer-grained and convenient continuous query management commands.
## <a class="anchor" id="subscribe"></a> Publisher/Subscriber
......@@ -101,7 +101,7 @@ Another method is to query the STable. In this way, no matter how many meters th
select * from meters where ts > {last_timestamp} and current > 10;
```
However, how to choose `last_timestamp` has become a new problem. Because, on the one hand, the time of data generation (the data timestamp) and the time of data storage are generally not the same, and sometimes the deviation is still very large; On the other hand, the time when the data of different meters arrive at TDengine will also vary. Therefore, if we use the timestamp of the data from the slowest meter as `last_timestamp` in the query, we may repeatedly read the data of other meters; If the timestamp of the fastest meter is used, the data of other meters may be missed.
However, how to choose `last_timestamp` has become a new problem. Because, on the one hand, the time of data generation (the data timestamp) and the time of data writing are generally not the same, and sometimes the deviation is still very large; On the other hand, the time when the data of different meters arrive at TDengine will also vary. Therefore, if we use the timestamp of the data from the slowest meter as `last_timestamp` in the query, we may repeatedly read the data of other meters; If the timestamp of the fastest meter is used, the data of other meters may be missed.
The subscription function of TDengine provides a thorough solution to the above problem.
......@@ -357,4 +357,4 @@ This SQL statement will obtain the last recorded voltage value of all smart mete
In scenarios of TDengine, alarm monitoring is a common requirement. Conceptually, it requires the program to filter out data that meet certain conditions from the data of the latest period of time, and calculate a result according to a defined formula based on these data. When the result meets certain conditions and lasts for a certain period of time, it will notify the user in some form.
In order to meet the needs of users for alarm monitoring, TDengine provides this function in the form of an independent module. For its installation and use, please refer to the blog [How to Use TDengine for Alarm Monitoring](https://www.taosdata.com/blog/2020/04/14/1438.html).
In order to meet the needs of users for alarm monitoring, TDengine provides this function in the form of an independent module. For its installation and use, please refer to the blog [How to Use TDengine for Alarm Monitoring](https://www.taosdata.com/blog/2020/04/14/1438.html).
\ No newline at end of file
......@@ -179,7 +179,7 @@ Clean up the running environment and call this API before the application exits.
- `int taos_options(TSDB_OPTION option, const void * arg, ...)`
Set client options, currently only time zone setting (`_TSDB_OPTIONTIMEZONE`) and encoding setting (`_TSDB_OPTIONLOCALE`) are supported. The time zone and encoding default to the current operating system settings.
Set client options, currently only time zone setting (_TSDB_OPTIONTIMEZONE) and encoding setting (_TSDB_OPTIONLOCALE) are supported. The time zone and encoding default to the current operating system settings.
- `char *taos_get_client_info()`
......
......@@ -2,7 +2,7 @@
## <a class="anchor" id="grafana"></a> Grafana
TDengine can quickly integrate with [Grafana](https://www.grafana.com/), an open source data visualization system, to build a data monitoring and alarming system. The whole process does not require any code to write. The contents of the data table in TDengine can be visually showed on DashBoard.
TDengine can be quickly integrated with [Grafana](https://www.grafana.com/), an open source data visualization system, to build a data monitoring and alarming system. The whole process does not require any code to write. The contents of the data table in TDengine can be visually showed on DashBoard.
### Install Grafana
......
# TDengine Cluster Management
Multiple TDengine servers, that is, multiple running instances of taosd, can form a cluster to ensure the highly reliable operation of TDengine and provide scale-out features. To understand cluster management in TDengine 2.0, it is necessary to understand the basic concepts of clustering. Please refer to the chapter "Overall Architecture of TDengine 2.0". And before installing the cluster, please follow the chapter ["Getting started"](https://www.taosdata.com/en/documentation/getting-started/) to install and experience the single node function.
Multiple TDengine servers, that is, multiple running instances of taosd, can form a cluster to ensure the highly reliable operation of TDengine and provide scale-out features. To understand cluster management in TDengine 2.0, it is necessary to understand the basic concepts of clustering. Please refer to the chapter "Overall Architecture of TDengine 2.0". And before installing the cluster, please follow the chapter ["Getting started"](https://www.taosdata.com/en/documentation/getting-started/) to install and experience the single node TDengine.
Each data node of the cluster is uniquely identified by End Point, which is composed of FQDN (Fully Qualified Domain Name) plus Port, such as [h1.taosdata.com](http://h1.taosdata.com/):6030. The general FQDN is the hostname of the server, which can be obtained through the Linux command `hostname -f` (how to configure FQDN, please refer to: [All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)). Port is the external service port number of this data node. The default is 6030, but it can be modified by configuring the parameter serverPort in taos.cfg. A physical node may be configured with multiple hostnames, and TDengine will automatically get the first one, but it can also be specified through the configuration parameter fqdn in taos.cfg. If you are accustomed to direct IP address access, you can set the parameter fqdn to the IP address of this node.
Each data node of the cluster is uniquely identified by End Point, which is composed of FQDN (Fully Qualified Domain Name) plus Port, such as [h1.taosdata.com](http://h1.taosdata.com/):6030. The general FQDN is the hostname of the server, which can be obtained through the Linux command `hostname -f` (how to configure FQDN, please refer to: [All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)). Port is the external service port number of this data node. The default is 6030, but it can be modified by configuring the parameter serverPort in taos.cfg. A physical node may be configured with multiple hostnames, and TDengine will automatically get the first one, but it can also be specified through the configuration parameter `fqdn` in taos.cfg. If you want to access via direct IP address, you can set the parameter `fqdn` to the IP address of this node.
The cluster management of TDengine is extremely simple. Except for manual intervention in adding and deleting nodes, all other tasks are completed automatically, thus minimizing the workload of operation. This chapter describes the operations of cluster management in detail.
......@@ -12,7 +12,7 @@ Please refer to the [video tutorial](https://www.taosdata.com/blog/2020/11/11/19
**Step 0:** Plan FQDN of all physical nodes in the cluster, and add the planned FQDN to /etc/hostname of each physical node respectively; modify the /etc/hosts of each physical node, and add the corresponding IP and FQDN of all cluster physical nodes. [If DNS is deployed, contact your network administrator to configure it on DNS]
**Step 1:** If the physical nodes have previous test data, installed with version 1. x, or installed with other versions of TDengine, please delete it first and drop all data. For specific steps, please refer to the blog "[Installation and Uninstallation of Various Packages of TDengine](https://www.taosdata.com/blog/2019/08/09/566.html)"
**Step 1:** If the physical nodes have previous test data, installed with version 1. x, or installed with other versions of TDengine, please backup all data, then delete it and drop all data. For specific steps, please refer to the blog "[Installation and Uninstallation of Various Packages of TDengine](https://www.taosdata.com/blog/2019/08/09/566.html)"
**Note 1:** Because the information of FQDN will be written into a file, if FQDN has not been configured or changed before, and TDengine has been started, be sure to clean up the previous data (`rm -rf /var/lib/taos/*`)on the premise of ensuring that the data is useless or backed up;
......@@ -136,7 +136,7 @@ Execute the CLI program taos, log in to the TDengine system using the root accou
DROP DNODE "fqdn:port";
```
Where fqdn is the FQDN of the deleted node, and port is the port number of its external server.
Where fqdn is the FQDN of the deleted node, and port is the port number.
<font color=green>**【Note】**</font>
......@@ -185,7 +185,7 @@ Because of the introduction of vnode, it is impossible to simply draw a conclusi
TDengine cluster is managed by mnode (a module of taosd, management node). In order to ensure the high-availability of mnode, multiple mnode replicas can be configured. The number of replicas is determined by system configuration parameter numOfMnodes, and the effective range is 1-3. In order to ensure the strong consistency of metadata, mnode replicas are duplicated synchronously.
A cluster has multiple data node dnodes, but a dnode runs at most one mnode instance. In the case of multiple dnodes, which dnode can be used as an mnode? This is automatically specified by the system according to the resource situation on the whole. User can execute the following command in the console of TDengine through the CLI program taos:
A cluster has multiple data node dnodes, but a dnode runs at most one mnode instance. In the case of multiple dnodes, which dnode can be used as an mnode? This is automatically selected by the system based on the resource on the whole. User can execute the following command in the console of TDengine through the CLI program taos:
```
SHOW MNODES;
......@@ -213,7 +213,7 @@ When the above three situations occur, the system will start a load computing of
If a data node is offline, the TDengine cluster will automatically detect it. There are two detailed situations:
- If the data node is offline for more than a certain period of time (configuration parameter offlineThreshold in taos.cfg controls the duration), the system will automatically delete the data node, generate system alarm information and trigger the load balancing process. If the deleted data node is online again, it will not be able to join the cluster, and the system administrator will need to add it to the cluster again.
- If the data node is offline for more than a certain period of time (configuration parameter `offlineThreshold` in taos.cfg controls the duration), the system will automatically delete the data node, generate system alarm information and trigger the load balancing process. If the deleted data node is online again, it will not be able to join the cluster, and the system administrator will need to add it to the cluster again.
- After offline, the system will automatically start the data recovery process if it goes online again within the duration of offlineThreshold. After the data is fully recovered, the node will start to work normally.
**Note:** If each data node belonging to a virtual node group (including mnode group) is in offline or unsynced state, Master can only be elected after all data nodes in the virtual node group are online and can exchange status information, and the virtual node group can serve externally. For example, the whole cluster has 3 data nodes with 3 replicas. If all 3 data nodes go down and then 2 data nodes restart, it will not work. Only when all 3 data nodes restart successfully can serve externally again.
......@@ -229,7 +229,7 @@ The name of the executable for Arbitrator is tarbitrator. The executable has alm
1. Click [Package Download](https://www.taosdata.com/cn/all-downloads/), and in the TDengine Arbitrator Linux section, select the appropriate version to download and install.
2. The command line parameter -p of this application can specify the port number of its external service, and the default is 6042.
2. The command line parameter -p of this application can specify the port number of its service, and the default is 6042.
3. Modify the configuration file of each taosd instance, and set parameter arbitrator to the End Point corresponding to the tarbitrator in taos.cfg. (If this parameter is configured, when the number of replicas is even, the system will automatically connect the configured Arbitrator. If the number of replicas is odd, even if the Arbitrator is configured, the system will not establish a connection.)
4. The Arbitrator configured in the configuration file will appear in the return result of instruction `SHOW DNODES`; the value of the corresponding role column will be "arb".
......@@ -22,8 +22,8 @@ If there is plenty of memory, the configuration of Blocks can be increased so th
CPU requirements depend on the following two aspects:
- **Data insertion** TDengine single core can handle at least 10,000 insertion requests per second. Each insertion request can take multiple records, and inserting one record at a time is almost the same as inserting 10 records in computing resources consuming. Therefore, the larger the number of inserts, the higher the insertion efficiency. If an insert request has more than 200 records, a single core can insert 1 million records per second. However, the faster the insertion speed, the higher the requirement for front-end data collection, because records need to be cached and then inserted in batches.
- **Query requirements** TDengine to provide efficient queries, but the queries in each scenario vary greatly and the query frequency too, making it difficult to give objective figures. Users need to write some query statements for their own scenes to determine.
- **Data insertion**: TDengine single core can handle at least 10,000 insertion requests per second. Each insertion request can take multiple records, and inserting one record at a time is almost the same as inserting 10 records in computing resources consuming. Therefore, the larger the number of records per insert, the higher the insertion efficiency. If an insert request has more than 200 records, a single core can insert 1 million records per second. However, the faster the insertion speed, the higher the requirement for front-end data collection, because records need to be cached and then inserted in batches.
- **Query**: TDengine provides efficient queries, but the queries in each scenario vary greatly and the query frequency too, making it difficult to give objective figures. Users need to write some query statements for their own scenes to estimate.
Therefore, only for data insertion, CPU can be estimated, but the computing resources consumed by query cannot be that clear. In the actual operation, it is not recommended to make CPU utilization rate over 50%. After that, new nodes need to be added to bring more computing resources.
......@@ -78,7 +78,7 @@ When the nodes in TDengine cluster are deployed on different physical machines a
## <a class="anchor" id="config"></a> Server-side Configuration
The background service of TDengine system is provided by taosd, and the configuration parameters can be modified in the configuration file taos.cfg to meet the requirements of different scenarios. The default location of the configuration file is the /etc/taos directory, which can be specified by executing the parameter -c from the taosd command line. Such as taosd-c/home/user, to specify that the configuration file is located in the /home/user directory.
The background service of TDengine system is provided by taosd, and the configuration parameters can be modified in the configuration file taos.cfg to meet the requirements of different scenarios. The default location of the configuration file is the /etc/taos directory, which can be specified by executing the parameter `-c` from the taosd command line. Such as `taosd -c /home/user`, to specify that the configuration file is located in the /home/user directory.
You can also use “-C” to show the current server configuration parameters:
......@@ -88,14 +88,14 @@ taosd -C
Only some important configuration parameters are listed below. For more parameters, please refer to the instructions in the configuration file. Please refer to the previous chapters for detailed introduction and function of each parameter, and the default of these parameters is working and generally does not need to be set. **Note: After the configuration is modified, \*taosd service\* needs to be restarted to take effect.**
- firstEp: end point of the first dnode in the actively connected cluster when taosd starts, the default value is localhost: 6030.
- fqdn: FQDN of the data node, which defaults to the first hostname configured by the operating system. If you are accustomed to IP address access, you can set it to the IP address of the node.
- firstEp: end point of the first dnode which will be connected in the cluster when taosd starts, the default value is localhost: 6030.
- fqdn: FQDN of the data node, which defaults to the first hostname configured by the operating system. If you want to access via IP address directly, you can set it to the IP address of the node.
- serverPort: the port number of the external service after taosd started, the default value is 6030.
- httpPort: the port number used by the RESTful service to which all HTTP requests (TCP) require a query/write request. The default value is 6041.
- dataDir: the data file directory to which all data files will be written. [Default:/var/lib/taos](http://default/var/lib/taos).
- logDir: the log file directory to which the running log files of the client and server will be written. [Default:/var/log/taos](http://default/var/log/taos).
- arbitrator: the end point of the arbiter in the system; the default value is null.
- role: optional role for dnode. 0-any; it can be used as an mnode and to allocate vnodes; 1-mgmt; It can only be an mnode, but not to allocate vnodes; 2-dnode; caannot be an mnode, only vnode can be allocated
- arbitrator: the end point of the arbitrator in the system; the default value is null.
- role: optional role for dnode. 0-any; it can be used as an mnode and to allocate vnodes; 1-mgmt; It can only be an mnode, but not to allocate vnodes; 2-dnode; cannot be an mnode, only vnode can be allocated
- debugFlage: run the log switch. 131 (output error and warning logs), 135 (output error, warning, and debug logs), 143 (output error, warning, debug, and trace logs). Default value: 131 or 135 (different modules have different default values).
- numOfLogLines: the maximum number of lines allowed for a single log file. Default: 10,000,000 lines.
- logKeepDays: the maximum retention time of the log file. When it is greater than 0, the log file will be renamed to taosdlog.xxx, where xxx is the timestamp of the last modification of the log file in seconds. Default: 0 days.
......@@ -161,18 +161,18 @@ For example:
## <a class="anchor" id="client"></a> Client Configuration
The foreground interactive client application of TDengine system is taos and application driver, which shares the same configuration file taos.cfg with taosd. When running taos, use the parameter -c to specify the configuration file directory, such as taos-c/home/cfg, which means using the parameters in the taos.cfg configuration file under the /home/cfg/ directory. The default directory is /etc/taos. For more information on how to use taos, see the help information taos --help. This section mainly describes the parameters used by the taos client application in the configuration file taos.cfg.
The foreground interactive client application of TDengine system is taos and application driver, which shares the same configuration file taos.cfg with taosd. When running taos, use the parameter `-c` to specify the configuration file directory, such as `taos -c /home/cfg`, which means using the parameters in the taos.cfg configuration file under the /home/cfg/ directory. The default directory is /etc/taos. For more information on how to use taos, see the help information `taos --help`. This section mainly describes the parameters used by the taos client application in the configuration file taos.cfg.
**Versions after 2.0. 10.0 support the following parameters on command line to display the current client configuration parameters**
```bash
taos -C taos --dump-config
taos -C or taos --dump-config
```
Client configuration parameters:
- firstEp: end point of the first taosd instance in the actively connected cluster when taos is started, the default value is localhost: 6030.
- secondEp: when taos starts, if not impossible to connect to firstEp, it will try to connect to secondEp.
- secondEp: when taos starts, if unable to connect to firstEp, it will try to connect to secondEp.
- locale
Default value: obtained dynamically from the system. If the automatic acquisition fails, user needs to set it in the configuration file or through API
......@@ -493,4 +493,4 @@ At the moment, TDengine has nearly 200 internal reserved keywords, which cannot
| CONCAT | GLOB | METRICS | SET | VIEW |
| CONFIGS | GRANTS | MIN | SHOW | WAVG |
| CONFLICT | GROUP | MINUS | SLASH | WHERE |
| CONNECTION | | | | |
| CONNECTION | | | | |
\ No newline at end of file
# TAOS SQL
TDengine provides a SQL-style language, TAOS SQL, to insert or query data, and support other common tips. To finish this document, you should have some understanding about SQL.
TDengine provides a SQL-style language, TAOS SQL, to insert or query data. To read through this document, you should have some basic understanding about SQL.
TAOS SQL is the main tool for users to write and query data to TDengine. TAOS SQL provides a style and mode similar to standard SQL to facilitate users to get started quickly. Strictly speaking, TAOS SQL is not and does not attempt to provide SQL standard syntax. In addition, since TDengine does not provide deletion function for temporal structured data, the relevant function of data deletion is non-existent in TAO SQL.
TAOS SQL is the main way for users to write and query data to TDengine. TAOS SQL is similar to standard SQL to facilitate users to get started quickly. Strictly speaking, TAOS SQL is not and does not attempt to provide SQL standard syntax. In addition, since TDengine does not provide deletion function for time-series data, the relevant function of data deletion is non-existent in TAO SQL.
Let’s take a look at the conventions used for syntax descriptions.
......@@ -127,7 +127,7 @@ Note:
ALTER DATABASE db_name CACHELAST 0;
```
CACHELAST parameter controls whether last_row of the data subtable is cached in memory. The default value is 0, and the value range is [0, 1]. Where 0 means not enabled and 1 means enabled. (supported from version 2.0. 11)
**Tips**: After all the above parameters are modified, show databases can be used to confirm whether the modification is successful.
- **Show all databases in system**
......@@ -138,14 +138,17 @@ Note:
## <a class="anchor" id="table"></a> Table Management
- Create a table
Note:
- **Create a table**
1. The first field must be a timestamp, and system will set it as the primary key;
2. The max length of table name is 192;
3. The length of each row of the table cannot exceed 16k characters;
4. Sub-table names can only consist of letters, numbers, and underscores, and cannot begin with numbers
5. If the data type binary or nchar is used, the maximum number of bytes should be specified, such as binary (20), which means 20 bytes;
```mysql
CREATE TABLE [IF NOT EXISTS] tb_name (timestamp_field_name TIMESTAMP, field1_name data_type1 [, field2_name data_type2 ...]);
```
Note:
1. The first field must be a timestamp, and system will set it as the primary key;
2. The max length of table name is 192;
3. The length of each row of the table cannot exceed 16k characters;
4. Sub-table names can only consist of letters, numbers, and underscores, and cannot begin with numbers
5. If the data type binary or nchar is used, the maximum number of bytes should be specified, such as binary (20), which means 20 bytes;
- **Create a table via STable**
......@@ -171,10 +174,10 @@ Note:
Note:
1. The method of batch creating tables requires that the data table must use STable as a template.
2. On the premise of not exceeding the length limit of SQL statements, it is suggested that the number of tables in a single statement should be controlled between 1000 and 3000, which will obtain an ideal speed of table building.
2. On the premise of not exceeding the length limit of SQL statements, it is suggested that the number of tables in a single statement should be controlled between 1000 and 3000, which will obtain an ideal speed of table creating.
- **Drop a table**
```mysql
DROP TABLE [IF EXISTS] tb_name;
```
......@@ -218,7 +221,7 @@ Note:
## <a class="anchor" id="super-table"></a> STable Management
Note: In 2.0. 15.0 and later versions, STABLE reserved words are supported. That is, in the instruction description later in this section, the three instructions of CREATE, DROP and ALTER need to write TABLE instead of STABLE in the old version as the reserved word.
Note: In 2.0.15.0 and later versions, STABLE reserved words are supported. That is, in the instruction description later in this section, the three instructions of CREATE, DROP and ALTER need to write TABLE instead of STABLE in the old version as the reserved word.
- **Create a STable**
......@@ -290,7 +293,7 @@ Note: In 2.0. 15.0 and later versions, STABLE reserved words are supported. That
Modify a tag name of STable. After modifying, all sub-tables under the STable will automatically update the new tag name.
- **Modify a tag value of sub-table**
```mysql
ALTER TABLE tb_name SET TAG tag_name=new_tag_value;
```
......@@ -306,7 +309,7 @@ Note: In 2.0. 15.0 and later versions, STABLE reserved words are supported. That
Insert a record into table tb_name.
- **Insert a record with data corresponding to a given column**
```mysql
INSERT INTO tb_name (field1_name, ...) VALUES (field1_value1, ...);
```
......@@ -320,14 +323,14 @@ Note: In 2.0. 15.0 and later versions, STABLE reserved words are supported. That
Insert multiple records into table tb_name.
- **Insert multiple records into a given column**
```mysql
INSERT INTO tb_name (field1_name, ...) VALUES (field1_value1, ...) (field1_value2, ...) ...;
```
Insert multiple records into a given column of table tb_name.
- **Insert multiple records into multiple tables**
```mysql
INSERT INTO tb1_name VALUES (field1_value1, ...) (field1_value2, ...) ...
tb2_name VALUES (field1_value1, ...) (field1_value2, ...) ...;
......@@ -421,7 +424,7 @@ taos> SELECT * FROM d1001;
Query OK, 3 row(s) in set (0.001165s)
```
For Stables, wildcards contain *tag columns*.
For STables, wildcards contain *tag columns*.
```mysql
taos> SELECT * FROM meters;
......@@ -720,7 +723,7 @@ TDengine supports aggregations over data, they are listed below:
================================================
9 | 9 |
Query OK, 1 row(s) in set (0.004475s)
taos> SELECT COUNT(*), COUNT(voltage) FROM d1001;
count(*) | count(voltage) |
================================================
......@@ -758,7 +761,7 @@ TDengine supports aggregations over data, they are listed below:
```
- **TWA**
```mysql
SELECT TWA(field_name) FROM tb_name WHERE clause;
```
......@@ -799,7 +802,7 @@ TDengine supports aggregations over data, they are listed below:
================================================================================
35.200000763 | 658 | 0.950000018 |
Query OK, 1 row(s) in set (0.000980s)
```
```
- **STDDEV**
......@@ -896,7 +899,7 @@ TDengine supports aggregations over data, they are listed below:
======================================
13.40000 | 223 |
Query OK, 1 row(s) in set (0.001123s)
taos> SELECT MAX(current), MAX(voltage) FROM d1001;
max(current) | max(voltage) |
======================================
......@@ -937,8 +940,6 @@ TDengine supports aggregations over data, they are listed below:
Query OK, 1 row(s) in set (0.001023s)
```
-
- **LAST**
```mysql
......@@ -972,7 +973,7 @@ TDengine supports aggregations over data, they are listed below:
```
- **TOP**
```mysql
SELECT TOP(field_name, K) FROM { tb_name | stb_name } [WHERE clause];
```
......@@ -1029,7 +1030,7 @@ TDengine supports aggregations over data, they are listed below:
2018-10-03 14:38:15.000 | 218 |
2018-10-03 14:38:16.650 | 218 |
Query OK, 2 row(s) in set (0.001332s)
taos> SELECT BOTTOM(current, 2) FROM d1001;
ts | bottom(current, 2) |
=================================================
......@@ -1092,7 +1093,7 @@ TDengine supports aggregations over data, they are listed below:
=======================
12.30000 |
Query OK, 1 row(s) in set (0.001238s)
taos> SELECT LAST_ROW(current) FROM d1002;
last_row(current) |
=======================
......@@ -1146,7 +1147,7 @@ TDengine supports aggregations over data, they are listed below:
============================
5.000000000 |
Query OK, 1 row(s) in set (0.001792s)
taos> SELECT SPREAD(voltage) FROM d1001;
spread(voltage) |
============================
......@@ -1172,7 +1173,7 @@ TDengine supports aggregations over data, they are listed below:
## <a class="anchor" id="aggregation"></a> Time-dimension Aggregation
TDengine supports aggregating by intervals. Data in a table can partitioned by intervals and aggregated to generate results. For example, a temperature sensor collects data once per second, but the average temperature needs to be queried every 10 minutes. This aggregation is suitable for down sample operation, and the syntax is as follows:
TDengine supports aggregating by intervals (time range). Data in a table can partitioned by intervals and aggregated to generate results. For example, a temperature sensor collects data once per second, but the average temperature needs to be queried every 10 minutes. This aggregation is suitable for down sample operation, and the syntax is as follows:
```mysql
SELECT function_list FROM tb_name
......@@ -1235,12 +1236,12 @@ SELECT AVG(current), MAX(current), LEASTSQUARES(current, start_val, step_val), P
**Restrictions on group by**
TAOS SQL supports group by operation on tags, tbnames and ordinary columns, required that only one column and whichhas less than 100,000 unique values.
TAOS SQL supports group by operation on tags, tbnames and ordinary columns, required that only one column and which has less than 100,000 unique values.
**Restrictions on join operation**
TAOS SQL supports join columns of two tables by Primary Key timestamp between them, and does not support four operations after tables aggregated for the time being.
TAOS SQL supports join columns of two tables by Primary Key timestamp between them, and does not support four arithmetic operations after tables aggregated for the time being.
**Availability of is no null**
Is not null supports all types of columns. Non-null expression is < > "" and only applies to columns of non-numeric types.
Is not null supports all types of columns. Non-null expression is < > "" and only applies to columns of non-numeric types.
\ No newline at end of file
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册