提交 e76883e2 编写于 作者: Y yihaoDeng

merge develop

......@@ -15,7 +15,7 @@ TDengine是一个高效的存储、查询、分析时序大数据的平台,专
* [命令行程序TAOS](/getting-started#console):访问TDengine的简便方式
* [极速体验](/getting-started#demo):运行示例程序,快速体验高效的数据插入、查询
* [支持平台列表](/getting-started#platforms):TDengine服务器和客户端支持的平台列表
* [Kubenetes部署](https://taosdata.github.io/TDengine-Operator/zh/index.html):TDengine在Kubenetes环境进行部署的详细说明
* [Kubernetes部署](https://taosdata.github.io/TDengine-Operator/zh/index.html):TDengine在Kubernetes环境进行部署的详细说明
## [整体架构](/architecture)
......
......@@ -899,7 +899,7 @@ go env -w GOPROXY=https://goproxy.io,direct
Node.js连接器支持的系统有:
| **CPU类型** | x64(64bit) | | | aarch64 | aarch32 |
|**CPU类型** | x64(64bit) | | | aarch64 | aarch32 |
| ------------ | ------------ | -------- | -------- | -------- | -------- |
| **OS类型** | Linux | Win64 | Win32 | Linux | Linux |
| **支持与否** | **支持** | **支持** | **支持** | **支持** | **支持** |
......
......@@ -444,7 +444,7 @@ TDengine的所有可执行文件默认存放在 _/usr/local/taos/bin_ 目录下
- 数据库名:不能包含“.”以及特殊字符,不能超过 32 个字符
- 表名:不能包含“.”以及特殊字符,与所属数据库名一起,不能超过 192 个字符
- 表的列名:不能包含特殊字符,不能超过 64 个字符
- 数据库名、表名、列名,都不能以数字开头
- 数据库名、表名、列名,都不能以数字开头,合法的可用字符集是“英文字符、数字和下划线”
- 表的列数:不能超过 1024 列
- 记录的最大长度:包括时间戳 8 byte,不能超过 16KB(每个 BINARY/NCHAR 类型的列还会额外占用 2 个 byte 的存储位置)
- 单条 SQL 语句默认最大字符串长度:65480 byte
......
......@@ -16,6 +16,7 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
- [Command-line](/getting-started#console) : an easy way to access TDengine server
- [Experience Lightning Speed](/getting-started#demo): running a demo, inserting/querying data to experience faster speed
- [List of Supported Platforms](/getting-started#platforms): a list of platforms supported by TDengine server and client
- [Deploy to Kubernetes](https://taosdata.github.io/TDengine-Operator/en/index.html):a detailed guide for TDengine deployment in Kubernetes environment
## [Overall Architecture](/architecture)
......@@ -28,7 +29,7 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
## [Data Modeling](/model)
- [Create a Library](/model#create-db): create a library for all data collection points with similar features
- [Create a Database](/model#create-db): create a database for all data collection points with similar features
- [Create a Super Table(STable)](/model#create-stable): create a STable for all data collection points with the same type
- [Create a Table](/model#create-table): use STable as the template, to create a table for each data collecting point
......@@ -70,7 +71,7 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
## [Connector](/connector)
- [C/C++ Connector](/connector#c-cpp): primary method to connect to TDengine server through libtaos client library
- [Java Connector(JDBC)](/connector/java): driver for connecting to the server from Java applications using the JDBC API
- [Java Connector(JDBC)]: driver for connecting to the server from Java applications using the JDBC API
- [Python Connector](/connector#python): driver for connecting to TDengine server from Python applications
- [RESTful Connector](/connector#restful): a simple way to interact with TDengine via HTTP
- [Go Connector](/connector#go): driver for connecting to TDengine server from Go applications
......@@ -111,8 +112,8 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
## TDengine Technical Design
- [System Module](/architecture/taosd): taosd functions and modules partitioning
- [Data Replication](/architecture/replica): support real-time synchronous/asynchronous replication, to ensure high-availability of the system
- [System Module]: taosd functions and modules partitioning
- [Data Replication]: support real-time synchronous/asynchronous replication, to ensure high-availability of the system
- [Technical Blog](https://www.taosdata.com/cn/blog/?categories=3): More technical analysis and architecture design articles
## Common Tools
......@@ -121,7 +122,7 @@ TDengine is a highly efficient platform to store, query, and analyze time-series
- [TDengine performance comparison test tools](https://www.taosdata.com/blog/2020/01/18/1166.html)
- [Use TDengine visually through IDEA Database Management Tool](https://www.taosdata.com/blog/2020/08/27/1767.html)
## [Performance: TDengine vs Others
## Performance: TDengine vs Others
- [Performance: TDengine vs InfluxDB with InfluxDB’s open-source performance testing tool](https://www.taosdata.com/blog/2020/01/13/1105.html)
- [Performance: TDengine vs OpenTSDB](https://www.taosdata.com/blog/2019/08/21/621.html)
......
......@@ -43,26 +43,23 @@ From the perspective of data sources, designers can analyze the applicability of
### System Function Requirements
| | | | | |
| ----------------------------------------------------- | ------------------ | -------------------- | ------------------- | ------------------------------------------------------------ |
| **System Function Requirements** | **Not Applicable** | **Might Applicable** | **Very Applicable** | **Description** |
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require completed data processing algorithms built-in | | √ | | TDengine implements various general data processing algorithms, but has not properly handled all requirements of different industries, so special types of processing shall be processed at the application level. |
| Require a huge amount of crosstab queries | | √ | | This type of processing should be handled more by relational database systems, or TDengine and relational database systems should fit together to implement system functions. |
### System Performance Requirements
| | | | | |
| -------------------------------------------- | ------------------ | -------------------- | ------------------- | ------------------------------------------------------------ |
| **System Performance Requirements** | **Not Applicable** | **Might Applicable** | **Very Applicable** | **Description** |
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require larger total processing capacity | | | √ | TDengine’s cluster functions can easily improve processing capacity via multi-server-cooperating. |
| Require high-speed data processing | | | √ | TDengine’s storage and data processing are designed to be optimized for IoT, can generally improve the processing speed by multiple times than other similar products. |
| Require fast processing of fine-grained data | | | √ | TDengine has achieved the same level of performance with relational and NoSQL data processing systems. |
### System Maintenance Requirements
| | | | | |
| -------------------------------------------- | ------------------ | -------------------- | ------------------- | ------------------------------------------------------------ |
| **System Maintenance Requirements** | **Not Applicable** | **Might Applicable** | **Very Applicable** | **Description** |
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require system with high-reliability | | | √ | TDengine has a very robust and reliable system architecture to implement simple and convenient daily operation with streamlined experiences for operators, thus human errors and accidents are eliminated to the greatest extent. |
| Require controllable operation learning cost | | | √ | As above. |
| Require abundant talent supply | √ | | | As a new-generation product, it’s still difficult to find talents with TDengine experiences from market. However, the learning cost is low. As the vendor, we also provide extensive operation training and counselling services. |
......@@ -25,21 +25,13 @@ For more about installation process, please refer [TDengine Installation Package
After installation, you can start the TDengine service by the `systemctl` command.
```bash
```
$ systemctl start taosd
```
```
Then check if the service is working now.
```bash
```
$ systemctl status taosd
```
```
If the service is running successfully, you can play around through TDengine shell `taos`.
......@@ -50,12 +42,10 @@ If the service is running successfully, you can play around through TDengine she
- To get better product feedback and improve our solution, TDegnine will collect basic usage information, but you can modify the configuration parameter **telemetryReporting** in the system configuration file taos.cfg, and set it to 0 to turn it off.
- TDegnine uses FQDN (usually hostname) as the node ID. In order to ensure normal operation, you need to set hostname for the server running taosd, and configure DNS service or hosts file for the machine running client application, to ensure the FQDN can be resolved.
- TDengine supports installation on Linux systems with[ systemd ](https://en.wikipedia.org/wiki/Systemd)as the process service management, and uses `which systemctl` command to detect whether `systemd` packages exist in the system:
- ```bash
```
- $ which systemctl
- ```
```bash
$ which systemctl
```
If `systemd` is not supported in the system, TDengine service can also be launched via `/usr/local/taos/bin/taosd` manually.
......@@ -64,28 +54,18 @@ If `systemd` is not supported in the system, TDengine service can also be launch
To launch TDengine shell, the command line interface, in a Linux terminal, type:
```bash
```
$ taos
```
```
The welcome message is printed if the shell connects to TDengine server successfully, otherwise, an error message will be printed (refer to our [FAQ](https://www.taosdata.com/en/faq) page for troubleshooting the connection error). The TDengine shell prompt is:
```cmd
```
taos>
```
```
In the TDengine shell, you can create databases, create tables and insert/query data with SQL. Each query command ends with a semicolon. It works like MySQL, for example:
```mysql
```
create database demo;
use demo;
......@@ -107,8 +87,6 @@ ts | speed |
19-07-15 01:00:00.000| 20|
Query OK, 2 row(s) in set (0.001700s)
```
```
Besides the SQL commands, the system administrator can check system status, add or delete accounts, and manage the servers.
......@@ -127,11 +105,7 @@ You can configure command parameters to change how TDengine shell executes. Some
Examples:
```bash
```
$ taos -h 192.168.0.1 -s "use db; show tables;"
```
```
### Run SQL Command Scripts
......@@ -139,11 +113,7 @@ $ taos -h 192.168.0.1 -s "use db; show tables;"
Inside TDengine shell, you can run SQL scripts in a file with source command.
```mysql
```
taos> source <filename>;
```
```
### Shell Tips
......@@ -158,11 +128,7 @@ taos> source <filename>;
After starting the TDengine server, you can execute the command `taosdemo` in the Linux terminal.
```bash
```
$ taosdemo
```
```
Using this command, a STable named `meters` will be created in the database `test` There are 10k tables under this stable, named from `t0` to `t9999`. In each table there are 100k rows of records, each row with columns (`f1`, `f2` and `f3`. The timestamp is from "2017-07-14 10:40:00 000" to "2017-07-14 10:41:39 999". Each table also has tags `areaid` and `loc`: `areaid` is set from 1 to 10, `loc` is set to "beijing" or "shanghai".
......@@ -174,51 +140,31 @@ In the TDengine client, enter sql query commands and then experience our lightni
- query total rows of records:
```mysql
```
taos> select count(*) from test.meters;
```
```
- query average, max and min of the total 1 billion records:
```mysql
```
taos> select avg(f1), max(f2), min(f3) from test.meters;
```
```
- query the number of records where loc="beijing":
```mysql
```
taos> select count(*) from test.meters where loc="beijing";
```
```
- query the average, max and min of total records where areaid=10:
```mysql
```
taos> select avg(f1), max(f2), min(f3) from test.meters where areaid=10;
```
```
- query the average, max, min from table t10 when aggregating over every 10s:
```mysql
```
taos> select avg(f1), max(f2), min(f3) from test.t10 interval(10s);
```
```
**Note**: you can run command `taosdemo` with many options, like number of tables, rows of records and so on. To know more about these options, you can execute `taosdemo --help` and then take a try using different options.
......@@ -274,4 +220,4 @@ Comparison matrix as following:
Note: ● has been verified by official tests; ○ has been verified by unofficial tests.
Please visit [Connectors](https://www.taosdata.com/cn/documentation/connector) section for more detailed information.
\ No newline at end of file
Please visit [Connectors](https://www.taosdata.com/en/documentation/connector) section for more detailed information.
此差异已折叠。
此差异已折叠。
......@@ -7,31 +7,19 @@ TDengine supports multiple interfaces to write data, including SQL, Prometheus,
Applications insert data by executing SQL insert statements through C/C + +, JDBC, GO, or Python Connector, and users can manually enter SQL insert statements to insert data through TAOS Shell. For example, the following insert writes a record to table d1001:
```mysql
```
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31);
```
```
TDengine supports writing multiple records at a time. For example, the following command writes two records to table d1001:
```mysql
```
INSERT INTO d1001 VALUES (1538548684000, 10.2, 220, 0.23) (1538548696650, 10.3, 218, 0.25);
```
```
TDengine also supports writing data to multiple tables at a time. For example, the following command writes two records to d1001 and one record to d1002:
```mysql
```
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31) (1538548695000, 12.6, 218, 0.33) d1002 VALUES (1538548696800, 12.3, 221, 0.31);
```
```
For the SQL INSERT Grammar, please refer to [Taos SQL insert](https://www.taosdata.com/en/documentation/taos-sql#insert)
......@@ -58,13 +46,9 @@ Users need to download the source code of [Bailongma](https://github.com/taosdat
Bailongma project has a folder, blm_prometheus, which holds the prometheus writing API. The compiling process is as follows:
```bash
```
cd blm_prometheus
go build
```
```
If everything goes well, an executable of blm_prometheus will be generated in the corresponding directory.
......@@ -134,8 +118,6 @@ remote_write:
The format of generated data by Prometheus is as follows:
```json
{
Timestamp: 1576466279341,
Value: 37.000000,
......
......@@ -2,6 +2,8 @@
TDengine provides many connectors for development, including C/C++, JAVA, Python, RESTful, Go, Node.JS, etc.
![image-connector](page://images/connector.png)
At present, TDengine connectors support a wide range of platforms, including hardware platforms such as X64/X86/ARM64/ARM32/MIPS/Alpha, and development environments such as Linux/Win64/Win32. The comparison matrix is as follows:
| **CPU** | **X64 64bit** | **X64 64bit** | **X64 64bit** | **X86 32bit** | **ARM64** | **ARM32** | **MIPS Godson** | **Alpha Whenwei** | **X64 TimecomTech** |
......@@ -32,9 +34,9 @@ The server should already have the TDengine server package installed. The connec
**1. Download from TAOS Data website(https://www.taosdata.com/cn/all-downloads/)**
- X64 hardware environment: TDengine-client-2.x.x.x-Linux-x64.tar.gz
- ARM64 hardware environment: TDengine-client-2.x.x.x-Linux-aarch64.tar.gz
- ARM32 hardware environment: TDengine-client-2.x.x.x-Linux-aarch32.tar.gz
* X64 hardware environment: TDengine-client-2.x.x.x-Linux-x64.tar.gz
* ARM64 hardware environment: TDengine-client-2.x.x.x-Linux-aarch64.tar.gz
* ARM32 hardware environment: TDengine-client-2.x.x.x-Linux-aarch32.tar.gz
**2. Unzip the package**
......@@ -68,10 +70,10 @@ Edit the taos.cfg file (default path/etc/taos/taos.cfg) and change firstEP to En
**Windows x64/x86**
- **1. Download from TAOS Data website(https://www.taosdata.com/cn/all-downloads/)**
**1. Download from TAOS Data website(https://www.taosdata.com/cn/all-downloads/)**
- X64 hardware environment: TDengine-client-2.X.X.X-Windows-x64.exe
- X86 hardware environment: TDengine-client-2.X.X.X-Windows-x86.exe
* X64 hardware environment: TDengine-client-2.X.X.X-Windows-x64.exe
* X86 hardware environment: TDengine-client-2.X.X.X-Windows-x86.exe
**2. Execute installation, select default vales as prompted to complete**
......@@ -187,11 +189,11 @@ Get version information of the client.
Create a database connection and initialize the connection context. The parameters that need to be provided by user include:
- - host: FQDN used by TDengine to manage the master node
- user: User name
- pass: Password
- db: Database name. If user does not provide it, it can be connected normally, means user can create a new database through this connection. If user provides a database name, means the user has created the database and the database is used by default
- port: Port number
* host: FQDN used by TDengine to manage the master node
* user: User name
* pass: Password
* db: Database name. If user does not provide it, it can be connected normally, means user can create a new database through this connection. If user provides a database name, means the user has created the database and the database is used by default
* port: Port number
A null return value indicates a failure. The application needs to save the returned parameters for subsequent API calls.
......@@ -278,20 +280,18 @@ Asynchronous APIs all need applications to provide corresponding callback functi
Asynchronous APIs have relatively high requirements for users, who can selectively use them according to specific application scenarios. Here are three important asynchronous APIs:
- `void taos_query_a(TAOS *taos, const char *sql, void (*fp)(void *param, TAOS_RES *, int code), void *param);`
Execute SQL statement asynchronously.
Execute SQL statement asynchronously.
- - taos: The database connection returned by calling `taos_connect`
- sql: The SQL statement needed to execute
- fp: User-defined callback function, whose third parameter `code` is used to indicate whether the operation is successful, `0` for success, and negative number for failure (call `taos_errstr` to get the reason for failure). When defining the callback function, it mainly handles the second parameter `TAOS_RES *`, which is the result set returned by the query
- param:the parameter for the callback
* taos: The database connection returned by calling `taos_connect`
* sql: The SQL statement needed to execute
* fp: User-defined callback function, whose third parameter `code` is used to indicate whether the operation is successful, `0` for success, and negative number for failure (call `taos_errstr` to get the reason for failure). When defining the callback function, it mainly handles the second parameter `TAOS_RES *`, which is the result set returned by the query
* param:the parameter for the callback
- `void taos_fetch_rows_a(TAOS_RES *res, void (*fp)(void *param, TAOS_RES *, int numOfRows), void *param);`
Get the result set of asynchronous queries in batch, which can only be used with `taos_query_a`. Within:
Get the result set of asynchronous queries in batch, which can only be used with `taos_query_a`. Within:
- - res: The result set returned when backcall `taos_query_a`
- fp: Callback function. Its parameter `param` is a user-definable parameter construct passed to the callback function; `numOfRows` is the number of rows of data obtained (not a function of the entire query result set). In the callback function, applications can get each row of the batch records by calling `taos_fetch_rows` forward iteration. After reading all the records in a block, the application needs to continue calling `taos_fetch_rows_a` in the callback function to obtain the next batch of records for processing until the number of records returned (`numOfRows`) is zero (the result is returned) or the number of records is negative (the query fails).
* res: The result set returned when backcall `taos_query_a`
* fp: Callback function. Its parameter `param` is a user-definable parameter construct passed to the callback function; `numOfRows` is the number of rows of data obtained (not a function of the entire query result set). In the callback function, applications can get each row of the batch records by calling `taos_fetch_rows` forward iteration. After reading all the records in a block, the application needs to continue calling `taos_fetch_rows_a` in the callback function to obtain the next batch of records for processing until the number of records returned (`numOfRows`) is zero (the result is returned) or the number of records is negative (the query fails).
The asynchronous APIs of TDengine all use non-blocking calling mode. Applications can use multithreading to open multiple tables at the same time, and can query or insert to each open table at the same time. It should be pointed out that the **application client must ensure that the operation on the same table is completely serialized**, that is, when the insertion or query operation on the same table is not completed (when no result returned), the second insertion or query operation cannot be performed.
......@@ -349,12 +349,12 @@ TDengine provides time-driven real-time stream computing APIs. You can perform v
This API is used to create data streams where:
- - taos: Database connection established
- sql: SQL query statement (query statement only)
- fp: user-defined callback function pointer. After each stream computing is completed, TDengine passes the query result (TAOS_ROW), query status (TAOS_RES), and user-defined parameters (PARAM) to the callback function. In the callback function, the user can use taos_num_fields to obtain the number of columns in the result set, and taos_fetch_fields to obtain the type of data in each column of the result set.
- stime: The time when stream computing starts. If it is 0, it means starting from now. If it is not zero, it means starting from the specified time (the number of milliseconds from 1970/1/1 UTC time).
- param: It is a parameter provided by the application for callback. During callback, the parameter is provided to the application
- callback: The second callback function is called when the continuous query stop automatically.
* taos: Database connection established
* sql: SQL query statement (query statement only)
* fp: user-defined callback function pointer. After each stream computing is completed, TDengine passes the query result (TAOS_ROW), query status (TAOS_RES), and user-defined parameters (PARAM) to the callback function. In the callback function, the user can use taos_num_fields to obtain the number of columns in the result set, and taos_fetch_fields to obtain the type of data in each column of the result set.
* stime: The time when stream computing starts. If it is 0, it means starting from now. If it is not zero, it means starting from the specified time (the number of milliseconds from 1970/1/1 UTC time).
* param: It is a parameter provided by the application for callback. During callback, the parameter is provided to the application
* callback: The second callback function is called when the continuous query stop automatically.
The return value is NULL, indicating creation failed; the return value is not NULL, indicating creation successful.
......@@ -370,22 +370,22 @@ The subscription API currently supports subscribing to one or more tables and co
This function is for starting the subscription service, returning the subscription object in case of success, and NULL in case of failure. Its parameters are:
- - taos: Database connection established
- Restart: If the subscription already exists, do you want to start over or continue with the previous subscription
- Topic: Subject (that is, name) of the subscription. This parameter is the unique identification of the subscription
- sql: The query statement subscribed. This statement can only be a select statement. It should only query the original data, and can only query the data in positive time sequence
- fp: The callback function when the query result is received (the function prototype will be introduced later). It is only used when calling asynchronously, and this parameter should be passed to NULL when calling synchronously
- param: The additional parameter when calling the callback function, which is passed to the callback function as it is by the system API without any processing
- interval: Polling period in milliseconds. During asynchronous call, the callback function will be called periodically according to this parameter; In order to avoid affecting system performance, it is not recommended to set this parameter too small; When calling synchronously, if the interval between two calls to taos_consume is less than this period, the API will block until the interval exceeds this period.
* taos: Database connection established
* Restart: If the subscription already exists, do you want to start over or continue with the previous subscription
* Topic: Subject (that is, name) of the subscription. This parameter is the unique identification of the subscription
* sql: The query statement subscribed. This statement can only be a select statement. It should only query the original data, and can only query the data in positive time sequence
* fp: The callback function when the query result is received (the function prototype will be introduced later). It is only used when calling asynchronously, and this parameter should be passed to NULL when calling synchronously
* param: The additional parameter when calling the callback function, which is passed to the callback function as it is by the system API without any processing
* interval: Polling period in milliseconds. During asynchronous call, the callback function will be called periodically according to this parameter; In order to avoid affecting system performance, it is not recommended to set this parameter too small; When calling synchronously, if the interval between two calls to taos_consume is less than this period, the API will block until the interval exceeds this period.
- `typedef void (*TAOS_SUBSCRIBE_CALLBACK)(TAOS_SUB* tsub, TAOS_RES *res, void* param, int code)`
In asynchronous mode, the prototype of the callback function has the following parameters:
- - tsub: Subscription object
- res: Query the result set. Note that there may be no records in the result set
- param: Additional parameters supplied by the client when `taos_subscribe` is called
- code: Error code
* tsub: Subscription object
* res: Query the result set. Note that there may be no records in the result set
* param: Additional parameters supplied by the client when `taos_subscribe` is called
* code: Error code
- `TAOS_RES *taos_consume(TAOS_SUB *tsub)`
......@@ -621,16 +621,16 @@ Description:
Column types in column_meta:
- 1:BOOL
- 2:TINYINT
- 3:SMALLINT
- 4:INT
- 5:BIGINT
- 6:FLOAT
- 7:DOUBLE
- 8:BINARY
- 9:TIMESTAMP
- 10:NCHAR
* 1:BOOL
* 2:TINYINT
* 3:SMALLINT
* 4:INT
* 5:BIGINT
* 6:FLOAT
* 7:DOUBLE
* 8:BINARY
* 9:TIMESTAMP
* 10:NCHAR
### Custom authorization code
......@@ -868,7 +868,7 @@ This API is used to open DB and return an object of type \* DB. Generally, DRIVE
The Node.js connector supports the following systems:
| **CPU Type** | **x64****(****64bit****)** | | | **aarch64** | **aarch32** |
| **CPU Type** | x64(64bit) | | | aarch64 | aarch32 |
| -------------------- | ---------------------------- | ------- | ------- | ----------- | ----------- |
| **OS Type** | Linux | Win64 | Win32 | Linux | Linux |
| **Supported or Not** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** |
......
# Installation and Management of TDengine Cluster
# TDengine Cluster Management
Multiple TDengine servers, that is, multiple running instances of taosd, can form a cluster to ensure the highly reliable operation of TDengine and provide scale-out features. To understand cluster management in TDengine 2.0, it is necessary to understand the basic concepts of clustering. Please refer to the chapter "Overall Architecture of TDengine 2.0". And before installing the cluster, please follow the chapter ["Getting started"](https://www.taosdata.com/en/documentation/getting-started/) to install and experience the single node function.
......
......@@ -174,93 +174,78 @@ Client configuration parameters:
- firstEp: end point of the first taosd instance in the actively connected cluster when taos is started, the default value is localhost: 6030.
- secondEp: when taos starts, if not impossible to connect to firstEp, it will try to connect to secondEp.
- locale
Default value: obtained dynamically from the system. If the automatic acquisition fails, user needs to set it in the configuration file or through API
Default value: obtained dynamically from the system. If the automatic acquisition fails, user needs to set it in the configuration file or through API
TDengine provides a special field type nchar for storing non-ASCII encoded wide characters such as Chinese, Japanese and Korean. The data written to the nchar field will be uniformly encoded in UCS4-LE format and sent to the server. It should be noted that the correctness of coding is guaranteed by the client. Therefore, if users want to normally use nchar fields to store non-ASCII characters such as Chinese, Japanese, Korean, etc., it’s needed to set the encoding format of the client correctly.
TDengine provides a special field type nchar for storing non-ASCII encoded wide characters such as Chinese, Japanese and Korean. The data written to the nchar field will be uniformly encoded in UCS4-LE format and sent to the server. It should be noted that the correctness of coding is guaranteed by the client. Therefore, if users want to normally use nchar fields to store non-ASCII characters such as Chinese, Japanese, Korean, etc., it’s needed to set the encoding format of the client correctly.
The characters inputted by the client are all in the current default coding format of the operating system, mostly UTF-8 on Linux systems, and some Chinese system codes may be GB18030 or GBK, etc. The default encoding in the docker environment is POSIX. In the Chinese versions of Windows system, the code is CP936. The client needs to ensure that the character set it uses is correctly set, that is, the current encoded character set of the operating system running by the client, in order to ensure that the data in nchar is correctly converted into UCS4-LE encoding format.
The characters inputted by the client are all in the current default coding format of the operating system, mostly UTF-8 on Linux systems, and some Chinese system codes may be GB18030 or GBK, etc. The default encoding in the docker environment is POSIX. In the Chinese versions of Windows system, the code is CP936. The client needs to ensure that the character set it uses is correctly set, that is, the current encoded character set of the operating system running by the client, in order to ensure that the data in nchar is correctly converted into UCS4-LE encoding format.
The naming rules of locale in Linux are: < language > _ < region >. < character set coding >, such as: zh_CN.UTF-8, zh stands for Chinese, CN stands for mainland region, and UTF-8 stands for character set. Character set encoding provides a description of encoding transformations for clients to correctly parse local strings. Linux system and Mac OSX system can determine the character encoding of the system by setting locale. Because the locale used by Windows is not the POSIX standard locale format, another configuration parameter charset is needed to specify the character encoding under Windows. You can also use charset to specify character encoding in Linux systems.
The naming rules of locale in Linux are: < language > _ < region >. < character set coding >, such as: zh_CN.UTF-8, zh stands for Chinese, CN stands for mainland region, and UTF-8 stands for character set. Character set encoding provides a description of encoding transformations for clients to correctly parse local strings. Linux system and Mac OSX system can determine the character encoding of the system by setting locale. Because the locale used by Windows is not the POSIX standard locale format, another configuration parameter charset is needed to specify the character encoding under Windows. You can also use charset to specify character encoding in Linux systems.
- charset
Default value: obtained dynamically from the system. If the automatic acquisition fails, user needs to set it in the configuration file or through API
Default value: obtained dynamically from the system. If the automatic acquisition fails, user needs to set it in the configuration file or through API
If charset is not set in the configuration file, in Linux system, when taos starts up, it automatically reads the current locale information of the system, and parses and extracts the charset encoding format from the locale information. If the automatic reading of locale information fails, an attempt is made to read the charset configuration, and if the reading of the charset configuration also fails, the startup process is interrupted.
If charset is not set in the configuration file, in Linux system, when taos starts up, it automatically reads the current locale information of the system, and parses and extracts the charset encoding format from the locale information. If the automatic reading of locale information fails, an attempt is made to read the charset configuration, and if the reading of the charset configuration also fails, the startup process is interrupted.
In Linux system, locale information contains character encoding information, so it is unnecessary to set charset separately after setting locale of Linux system correctly. For example:
In Linux system, locale information contains character encoding information, so it is unnecessary to set charset separately after setting locale of Linux system correctly. For example:
```
locale zh_CN.UTF-8
```
- On Windows systems, the current system encoding cannot be obtained from locale. If string encoding information cannot be read from the configuration file, taos defaults to CP936. It is equivalent to adding the following to the configuration file:
On Windows systems, the current system encoding cannot be obtained from locale. If string encoding information cannot be read from the configuration file, taos defaults to CP936. It is equivalent to adding the following to the configuration file:
```
charset CP936
```
If you need to adjust the character encoding, check the encoding used by the current operating system and set it correctly in the configuration file.
- If you need to adjust the character encoding, check the encoding used by the current operating system and set it correctly in the configuration file.
In Linux systems, if user sets both locale and charset encoding charset, and the locale and charset are inconsistent, the value set later will override the value set earlier.
```
In Linux systems, if user sets both locale and charset encoding charset, and the locale and charset are inconsistent, the value set later will override the value set earlier.
```
locale zh_CN.UTF-8
charset GBK
```
- The valid value for charset is GBK.
```
The valid value for charset is GBK.
And the valid value for charset is UTF-8.
And the valid value for charset is UTF-8.
The configuration parameters of log are exactly the same as those of server.
The configuration parameters of log are exactly the same as those of server.
- timezone
Default value: get the current time zone option dynamically from the system
The time zone in which the client runs the system. In order to deal with the problem of data writing and query in multiple time zones, TDengine uses Unix Timestamp to record and store timestamps. The characteristics of UNIX timestamps determine that the generated timestamps are consistent at any time regardless of any time zone. It should be noted that UNIX timestamps are converted and recorded on the client side. In order to ensure that other forms of time on the client are converted into the correct Unix timestamp, the correct time zone needs to be set.
Default value: get the current time zone option dynamically from the system
In Linux system, the client will automatically read the time zone information set by the system. Users can also set time zones in profiles in a number of ways. For example:
The time zone in which the client runs the system. In order to deal with the problem of data writing and query in multiple time zones, TDengine uses Unix Timestamp to record and store timestamps. The characteristics of UNIX timestamps determine that the generated timestamps are consistent at any time regardless of any time zone. It should be noted that UNIX timestamps are converted and recorded on the client side. In order to ensure that other forms of time on the client are converted into the correct Unix timestamp, the correct time zone needs to be set.
In Linux system, the client will automatically read the time zone information set by the system. Users can also set time zones in profiles in a number of ways. For example:
```
timezone UTC-8
timezone GMT-8
timezone Asia/Shanghai
```
- All above are legal to set the format of the East Eight Zone.
All above are legal to set the format of the East Eight Zone.
The setting of time zone affects the content of non-Unix timestamp (timestamp string, parsing of keyword now) in query and writing SQL statements. For example:
The setting of time zone affects the content of non-Unix timestamp (timestamp string, parsing of keyword now) in query and writing SQL statements. For example:
```sql
SELECT count(*) FROM table_name WHERE TS<'2019-04-11 12:01:08';
```
- In East Eight Zone, the SQL statement is equivalent to
In East Eight Zone, the SQL statement is equivalent to
```sql
SELECT count(*) FROM table_name WHERE TS<1554955268000;
```
-
In the UTC time zone, the SQL statement is equivalent to
In the UTC time zone, the SQL statement is equivalent to
```sql
SELECT count(*) FROM table_name WHERE TS<1554984068000;
```
In order to avoid the uncertainty caused by using string time format, Unix timestamp can also be used directly. In addition, timestamp strings with time zones can also be used in SQL statements, such as: timestamp strings in RFC3339 format, 2013-04-12T15: 52: 01.123 +08:00, or ISO-8601 format timestamp strings 2013-04-12T15: 52: 01.123 +0800. The conversion of the above two strings into Unix timestamps is not affected by the time zone in which the system is located.
-
In order to avoid the uncertainty caused by using string time format, Unix timestamp can also be used directly. In addition, timestamp strings with time zones can also be used in SQL statements, such as: timestamp strings in RFC3339 format, 2013-04-12T15: 52: 01.123 +08:00, or ISO-8601 format timestamp strings 2013-04-12T15: 52: 01.123 +0800. The conversion of the above two strings into Unix timestamps is not affected by the time zone in which the system is located.
When starting taos, you can also specify an end point for an instance of taosd from the command line, otherwise read from taos.cfg.
When starting taos, you can also specify an end point for an instance of taosd from the command line, otherwise read from taos.cfg.
- maxBinaryDisplayWidth
The upper limit of the display width of binary and nchar fields in a shell, beyond which parts will be hidden. Default: 30. You can modify this option dynamically in the shell with the command set max_binary_display_width nn.
The upper limit of the display width of binary and nchar fields in a shell, beyond which parts will be hidden. Default: 30. You can modify this option dynamically in the shell with the command set max_binary_display_width nn.
## <a class="anchor" id="user"></a>User Management
......
# Connect with other tools
## Telegraf
TDengine is easy to integrate with [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/), an open-source server agent for collecting and sending metrics and events, without more development.
### Install Telegraf
At present, TDengine supports Telegraf newer than version 1.7.4. Users can go to the [download link] and choose the proper package to install on your system.
### Configure Telegraf
Telegraf is configured by changing items in the configuration file */etc/telegraf/telegraf.conf*.
In **output plugins** section,add _[[outputs.http]]_ iterm:
- _url_: http://ip:6020/telegraf/udb, in which _ip_ is the IP address of any node in TDengine cluster. Port 6020 is the RESTful APT port used by TDengine. _udb_ is the name of the database to save data, which needs to create beforehand.
- _method_: "POST"
- _username_: username to login TDengine
- _password_: password to login TDengine
- _data_format_: "json"
- _json_timestamp_units_: "1ms"
In **agent** part:
- hostname: used to distinguish different machines. Need to be unique.
- metric_batch_size: 30,the maximum number of records allowed to write in Telegraf. The larger the value is, the less frequent requests are sent. For TDengine, the value should be less than 50.
Please refer to the [Telegraf docs](https://docs.influxdata.com/telegraf/v1.11/) for more information.
## Grafana
[Grafana] is an open-source system for time-series data display. It is easy to integrate TDengine and Grafana to build a monitor system. Data saved in TDengine can be fetched and shown on the Grafana dashboard.
### Install Grafana
For now, TDengine only supports Grafana newer than version 5.2.4. Users can go to the [Grafana download page] for the proper package to download.
### Configure Grafana
TDengine Grafana plugin is in the _/usr/local/taos/connector/grafana_ directory.
Taking Centos 7.2 as an example, just copy TDengine directory to _/var/lib/grafana/plugins_ directory and restart Grafana.
### Use Grafana
Users can log in the Grafana server (username/password:admin/admin) through localhost:3000 to configure TDengine as the data source. As is shown in the picture below, TDengine as a data source option is shown in the box:
![img](../assets/clip_image001.png)
When choosing TDengine as the data source, the Host in HTTP configuration should be configured as the IP address of any node of a TDengine cluster. The port should be set as 6020. For example, when TDengine and Grafana are on the same machine, it should be configured as _http://localhost:6020.
Besides, users also should set the username and password used to log into TDengine. Then click _Save&Test_ button to save.
![img](../assets/clip_image001-2474914.png)
Then, TDengine as a data source should show in the Grafana data source list.
![img](../assets/clip_image001-2474939.png)
Then, users can create Dashboards in Grafana using TDengine as the data source:
![img](../assets/clip_image001-2474961.png)
Click _Add Query_ button to add a query and input the SQL command you want to run in the _INPUT SQL_ text box. The SQL command should expect a two-row, multi-column result, such as _SELECT count(*) FROM sys.cpu WHERE ts>=from and ts<​to interval(interval)_, in which, _from_, _to_ and _inteval_ are TDengine inner variables representing query time range and time interval.
_ALIAS BY_ field is to set the query alias. Click _GENERATE SQL_ to send the command to TDengine:
![img](../assets/clip_image001-2474987.png)
Please refer to the [Grafana official document] for more information about Grafana.
## Matlab
Matlab can connect to and retrieve data from TDengine by TDengine JDBC Driver.
### MatLab and TDengine JDBC adaptation
Several steps are required to adapt Matlab to TDengine. Taking adapting Matlab2017a on Windows10 as an example:
1. Copy the file _JDBCDriver-1.0.0-dist.jar_ in TDengine package to the directory _${matlab_root}\MATLAB\R2017a\java\jar\toolbox_
2. Copy the file _taos.lib_ in TDengine package to _${matlab_ root _dir}\MATLAB\R2017a\lib\win64_
3. Add the .jar package just copied to the Matlab classpath. Append the line below as the end of the file of _${matlab_ root _dir}\MATLAB\R2017a\toolbox\local\classpath.txt_
`$matlabroot/java/jar/toolbox/JDBCDriver-1.0.0-dist.jar`
4. Create a file called _javalibrarypath.txt_ in directory _${user_home}\AppData\Roaming\MathWorks\MATLAB\R2017a\_, and add the _taos.dll_ path in the file. For example, if the file _taos.dll_ is in the directory of _C:\Windows\System32_,then add the following line in file *javalibrarypath.txt*:
`C:\Windows\System32`
### TDengine operations in Matlab
After correct configuration, open Matlab:
- build a connection:
`conn = database(‘db’, ‘root’, ‘taosdata’, ‘com.taosdata.jdbc.TSDBDriver’, ‘jdbc:TSDB://127.0.0.1:0/’)`
- Query:
`sql0 = [‘select * from tb’]`
`data = select(conn, sql0);`
- Insert a record:
`sql1 = [‘insert into tb values (now, 1)’]`
`exec(conn, sql1)`
Please refer to the file _examples\Matlab\TDengineDemo.m_ for more information.
## R
Users can use R language to access the TDengine server with the JDBC interface. At first, install JDBC package in R:
```R
install.packages('rJDBC', repos='http://cran.us.r-project.org')
```
Then use _library_ function to load the package:
```R
library('RJDBC')
```
Then load the TDengine JDBC driver:
```R
drv<-JDBC("com.taosdata.jdbc.TSDBDriver","JDBCDriver-1.0.0-dist.jar", identifier.quote="\"")
```
If succeed, no error message will display. Then use the following command to try a database connection:
```R
conn<-dbConnect(drv,"jdbc:TSDB://192.168.0.1:0/?user=root&password=taosdata","root","taosdata")
```
Please replace the IP address in the command above to the correct one. If no error message is shown, then the connection is established successfully. TDengine supports below functions in _RJDBC_ package:
- _dbWriteTable(conn, "test", iris, overwrite=FALSE, append=TRUE)_: write the data in a data frame _iris_ to the table _test_ in the TDengine server. Parameter _overwrite_ must be _false_. _append_ must be _TRUE_ and the schema of the data frame _iris_ should be the same as the table _test_.
- _dbGetQuery(conn, "select count(*) from test")_: run a query command
- _dbSendUpdate(conn, "use db")_: run any non-query command.
- _dbReadTable(conn, "test"_): read all the data in table _test_
- _dbDisconnect(conn)_: close a connection
- _dbRemoveTable(conn, "test")_: remove table _test_
Below functions are **not supported** currently:
- _dbExistsTable(conn, "test")_: if talbe _test_ exists
- _dbListTables(conn)_: list all tables in the connection
[Telegraf]: www.taosdata.com
[download link]: https://portal.influxdata.com/downloads
[Telegraf document]: www.taosdata.com
[Grafana]: https://grafana.com
[Grafana download page]: https://grafana.com/grafana/download
[Grafana official document]: https://grafana.com/docs/
# TaosData Contributor License Agreement
This TaosData Contributor License Agreement (CLA) applies to any contribution you make to any TaosData projects. If you are representing your employing organization to sign this agreement, please warrant that you have the authority to grant the agreement.
## Terms
**"TaosData"**, **"we"**, **"our"** and **"us"** means TaosData, inc.
**"You"** and **"your"** means you or the organization you are on behalf of to sign this agreement.
**"Contribution"** means any original work you, or the organization you represent submit to TaosData for any project in any manner.
## Copyright License
All rights of your Contribution submitted to TaosData in any manner are granted to TaosData and recipients of software distributed by TaosData. You waive any rights that my affect our ownership of the copyright and grant to us a perpetual, worldwide, transferable, non-exclusive, no-charge, royalty-free, irrevocable, and sublicensable license to use, reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute Contributions and any derivative work created based on a Contribution.
## Patent License
With respect to any patents you own or that you can license without payment to any third party, you grant to us and to any recipient of software distributed by us, a perpetual, worldwide, transferable, non-exclusive, no-charge, royalty-free, irrevocable patent license to make, have make, use, sell, offer to sell, import, and otherwise transfer the Contribution in whole or in part, alone or included in any product under any patent you own, or license from a third party, that is necessarily infringed by the Contribution or by combination of the Contribution with any Work.
## Your Representations and Warranties
You represent and warrant that:
- the Contribution you submit is an original work that you can legally grant the rights set out in this agreement.
- the Contribution you submit and licenses you granted does not and will not, infringe the rights of any third party.
- you are not aware of any pending or threatened claims, suits, actions, or charges pertaining to the contributions. You also warrant to notify TaosData immediately if you become aware of any such actual or potential claims, suits, actions, allegations or charges.
## Support
You are not obligated to support your Contribution except you volunteer to provide support. If you want, you can provide for a fee.
**I agree and accept on behalf of myself and behalf of my organization:**
\ No newline at end of file
#Documentation
TDengine is a highly efficient platform to store, query, and analyze time-series data. It works like a relational database, but you are strongly suggested to read through the following documentation before you experience it.
##Getting Started
- Quick Start: download, install and experience TDengine in a few seconds
- TDengine Shell: command-line interface to access TDengine server
- Major Features: insert/query, aggregation, cache, pub/sub, continuous query
## Data Model and Architecture
- Data Model: relational database model, but one table for one device with static tags
- Architecture: Management Module, Data Module, Client Module
- Writing Process: records recieved are written to WAL, cache, then ack is sent back to client
- Data Storage: records are sharded in the time range, and stored column by column
##TAOS SQL
- Data Types: support timestamp, int, float, double, binary, nchar, bool, and other types
- Database Management: add, drop, check databases
- Table Management: add, drop, check, alter tables
- Inserting Records: insert one or more records into tables, historical records can be imported
- Data Query: query data with time range and filter conditions, support limit/offset
- SQL Functions: support aggregation, selector, transformation functions
- Downsampling: aggregate data in successive time windows, support interpolation
##STable: Super Table
- What is a Super Table: an innovated way to aggregate tables
- Create a STable: it is like creating a standard table, but with tags defined
- Create a Table via STable: use STable as the template, with tags specified
- Aggregate Tables via STable: group tables together by specifying the tags filter condition
- Create Table Automatically: create tables automatically with a STable as a template
- Management of STables: create/delete/alter super table just like standard tables
- Management of Tags: add/delete/alter tags on super tables or tables
##Advanced Features
- Continuous Query: query executed by TDengine periodically with a sliding window
- Publisher/Subscriber: subscribe to the newly arrived data like a typical messaging system
- Caching: the newly arrived data of each device/table will always be cached
##Connector
- C/C++ Connector: primary method to connect to the server through libtaos client library
- Java Connector: driver for connecting to the server from Java applications using the JDBC API
- Python Connector: driver for connecting to the server from Python applications
- RESTful Connector: a simple way to interact with TDengine via HTTP
- Go Connector: driver for connecting to the server from Go applications
- Node.js Connector: driver for connecting to the server from node applications
##Connections with Other Tools
- Telegraf: pass the collected DevOps metrics to TDengine
- Grafana: query the data saved in TDengine and visualize them
- Matlab: access TDengine server from Matlab via JDBC
- R: access TDengine server from R via JDBC
##Administrator
- Directory and Files: files and directories related with TDengine
- Configuration on Server: customize IP port, cache size, file block size and other settings
- Configuration on Client: customize locale, default user and others
- User Management: add/delete users, change passwords
- Import Data: import data into TDengine from either script or CSV file
- Export Data: export data either from TDengine shell or from tool taosdump
- Management of Connections, Streams, Queries: check or kill the connections, queries
- System Monitor: collect the system metric, and log important operations
##More on System Architecture
- Storage Design: column-based storage with optimization on time-series data
- Query Design: an efficient way to query time-series data
- Technical blogs to delve into the inside of TDengine
## More on IoT Big Data
- [Characteristics of IoT Big Data](https://www.taosdata.com/blog/2019/07/09/characteristics-of-iot-big-data/)
- [Why don’t General Big Data Platforms Fit IoT Scenarios?](https://www.taosdata.com/blog/2019/07/09/why-does-the-general-big-data-platform-not-fit-iot-data-processing/)
- [Why TDengine is the Best Choice for IoT Big Data Processing?](https://www.taosdata.com/blog/2019/07/09/why-tdengine-is-the-best-choice-for-iot-big-data-processing/)
##Tutorials & FAQ
- <a href='https://www.taosdata.com/en/faq'>FAQ</a>: a list of frequently asked questions and answers
- <a href='https://www.taosdata.com/en/blog/?categories=4'>Use cases</a>: a few typical cases to explain how to use TDengine in IoT platform
#Getting Started
## Quick Start
At the moment, TDengine only runs on Linux. You can set up and install it either from the <a href='#Install-from-Source'>source code</a> or the <a href='#Install-from-Package'>packages</a>. It takes only a few seconds from download to run it successfully.
### Install from Source
Please visit our [github page](https://github.com/taosdata/TDengine) for instructions on installation from the source code.
### Install from Package
Three different packages are provided, please pick up the one you like.
<ul id='packageList'>
<li><a id='tdengine-rpm' style='color:var(--b2)'>TDengine RPM package (1.5M)</a></li>
<li><a id='tdengine-deb' style='color:var(--b2)'>TDengine DEB package (1.7M)</a></li>
<li><a id='tdengine-tar' style='color:var(--b2)'>TDengine Tarball (3.0M)</a></li>
</ul>
For the time being, TDengine only supports installation on Linux systems using [`systemd`](https://en.wikipedia.org/wiki/Systemd) as the service manager. To check if your system has *systemd* package, use the _which systemctl_ command.
```cmd
which systemctl
```
If the `systemd` package is not found, please [install from source code](#Install-from-Source).
### Running TDengine
After installation, start the TDengine service by the `systemctl` command.
```cmd
systemctl start taosd
```
Then check if the server is working now.
```cmd
systemctl status taosd
```
If the service is running successfully, you can play around through TDengine shell `taos`, the command line interface tool located in directory /usr/local/bin/taos
**Note: The _systemctl_ command needs the root privilege. Use _sudo_ if you are not the _root_ user.**
##TDengine Shell
To launch TDengine shell, the command line interface, in a Linux terminal, type:
```cmd
taos
```
The welcome message is printed if the shell connects to TDengine server successfully, otherwise, an error message will be printed (refer to our [FAQ](../faq) page for troubleshooting the connection error). The TDengine shell prompt is:
```cmd
taos>
```
In the TDengine shell, you can create databases, create tables and insert/query data with SQL. Each query command ends with a semicolon. It works like MySQL, for example:
```mysql
create database db;
use db;
create table t (ts timestamp, cdata int);
insert into t values ('2019-07-15 10:00:00', 10);
insert into t values ('2019-07-15 10:01:05', 20);
select * from t;
ts | speed |
===================================
19-07-15 10:00:00.000| 10|
19-07-15 10:01:05.000| 20|
Query OK, 2 row(s) in set (0.001700s)
```
Besides the SQL commands, the system administrator can check system status, add or delete accounts, and manage the servers.
###Shell Command Line Parameters
You can run `taos` command with command line options to fit your needs. Some frequently used options are listed below:
- -c, --config-dir: set the configuration directory. It is _/etc/taos_ by default
- -h, --host: set the IP address of the server it will connect to, Default is localhost
- -s, --commands: set the command to run without entering the shell
- -u, -- user: user name to connect to server. Default is root
- -p, --password: password. Default is 'taosdata'
- -?, --help: get a full list of supported options
Examples:
```cmd
taos -h 192.168.0.1 -s "use db; show tables;"
```
###Run Batch Commands
Inside TDengine shell, you can run batch commands in a file with *source* command.
```
taos> source <filename>;
```
### Tips
- Use up/down arrow key to check the command history
- To change the default password, use "`alter user`" command
- ctrl+c to interrupt any queries
- To clean the cached schema of tables or STables, execute command `RESET QUERY CACHE`
## Major Features
The core functionality of TDengine is the time-series database. To reduce the development and management complexity, and to improve the system efficiency further, TDengine also provides caching, pub/sub messaging system, and stream computing functionalities. It provides a full stack for IoT big data platform. The detailed features are listed below:
- SQL like query language used to insert or explore data
- C/C++, Java(JDBC), Python, Go, RESTful, and Node.JS interfaces for development
- Ad hoc queries/analysis via Python/R/Matlab or TDengine shell
- Continuous queries to support sliding-window based stream computing
- Super table to aggregate multiple time-streams efficiently with flexibility
- Aggregation over a time window on one or multiple time-streams
- Built-in messaging system to support publisher/subscriber model
- Built-in cache for each time stream to make latest data available as fast as light speed
- Transparent handling of historical data and real-time data
- Integrating with Telegraf, Grafana and other tools seamlessly
- A set of tools or configuration to manage TDengine
For enterprise edition, TDengine provides more advanced features below:
- Linear scalability to deliver higher capacity/throughput
- High availability to guarantee the carrier-grade service
- Built-in replication between nodes which may span multiple geographical sites
- Multi-tier storage to make historical data management simpler and cost-effective
- Web-based management tools and other tools to make maintenance simpler
TDengine is specially designed and optimized for time-series data processing in IoT, connected cars, Industrial IoT, IT infrastructure and application monitoring, and other scenarios. Compared with other solutions, it is 10x faster on insert/query speed. With a single-core machine, over 20K requestes can be processed, millions data points can be ingested, and over 10 million data points can be retrieved in a second. Via column-based storage and tuned compression algorithm for different data types, less than 1/10 storage space is required.
## Explore More on TDengine
Please read through the whole <a href='../documentation'>documentation</a> to learn more about TDengine.
# TDengine System Architecture
## Storage Design
TDengine data mainly include **metadata** and **data** that we will introduce in the following sections.
### Metadata Storage
Metadata include the information of databases, tables, etc. Metadata files are saved in _/var/lib/taos/mgmt/_ directory by default. The directory tree is as below:
```
/var/lib/taos/
+--mgmt/
+--db.db
+--meters.db
+--user.db
+--vgroups.db
```
A metadata structure (database, table, etc.) is saved as a record in a metadata file. All metadata files are appended only, and even a drop operation adds a deletion record at the end of the file.
### Data storage
Data in TDengine are sharded according to the time range. Data of tables in the same vnode in a certain time range are saved in the same filegroup, such as files v0f1804*. This sharding strategy can effectively improve data searching speed. By default, a group of files contains data in 10 days, which can be configured by *daysPerFile* in the configuration file or by *DAYS* keyword in *CREATE DATABASE* clause. Data in files are blockwised. A data block only contains one table's data. Records in the same data block are sorted according to the primary timestamp, which helps to improve the compression rate and save storage. The compression algorithms used in TDengine include simple8B, delta-of-delta, RLE, LZ4, etc.
By default, TDengine data are saved in */var/lib/taos/data/* directory. _/var/lib/taos/tsdb/_ directory contains vnode informations and data file linkes.
```
/var/lib/taos/
+--tsdb/
| +--vnode0
| +--meterObj.v0
| +--db/
| +--v0f1804.head->/var/lib/taos/data/vnode0/v0f1804.head1
| +--v0f1804.data->/var/lib/taos/data/vnode0/v0f1804.data
| +--v0f1804.last->/var/lib/taos/data/vnode0/v0f1804.last1
| +--v0f1805.head->/var/lib/taos/data/vnode0/v0f1805.head1
| +--v0f1805.data->/var/lib/taos/data/vnode0/v0f1805.data
| +--v0f1805.last->/var/lib/taos/data/vnode0/v0f1805.last1
| :
+--data/
+--vnode0/
+--v0f1804.head1
+--v0f1804.data
+--v0f1804.last1
+--v0f1805.head1
+--v0f1805.data
+--v0f1805.last1
:
```
#### meterObj file
There are only one meterObj file in a vnode. Informations bout the vnode, such as created time, configuration information, vnode statistic informations are saved in this file. It has the structure like below:
```
<start_of_file>
[file_header]
[table_record1_offset&length]
[table_record2_offset&length]
...
[table_recordN_offset&length]
[table_record1]
[table_record2]
...
[table_recordN]
<end_of_file>
```
The file header takes 512 bytes, which mainly contains informations about the vnode. Each table record is the representation of a table on disk.
#### head file
The _head_ files contain the index of data blocks in the _data_ file. The inner organization is as below:
```
<start_of_file>
[file_header]
[table1_offset]
[table2_offset]
...
[tableN_offset]
[table1_index_block]
[table2_index_block]
...
[tableN_index_block]
<end_of_file>
```
The table offset array in the _head_ file saves the information about the offsets of each table index block. Indices on data blocks in the same table are saved continuously. This also makes it efficient to load data indices on the same table. The data index block has a structure like:
```
[index_block_info]
[block1_index]
[block2_index]
...
[blockN_index]
```
The index block info part contains the information about the index block such as the number of index blocks, etc. Each block index corresponds to a real data block in the _data_ file or _last_ file. Information about the location of the real data block, the primary timestamp range of the data block, etc. are all saved in the block index part. The block indices are sorted in ascending order according to the primary timestamp. So we can apply algorithms such as the binary search on the data to efficiently search blocks according to time.
#### data file
The _data_ files store the real data block. They are append-only. The organization is as:
```
<start_of_file>
[file_header]
[block1]
[block2]
...
[blockN]
<end_of_file>
```
A data block in _data_ files only belongs to a table in the vnode and the records in a data block are sorted in ascending order according to the primary timestamp key. Data blocks are column-oriented. Data in the same column are stored contiguously, which improves reading speed and compression rate because of their similarity. A data block has the following organization:
```
[column1_info]
[column2_info]
...
[columnN_info]
[column1_data]
[column2_data]
...
[columnN_data]
```
The column info part includes information about column types, column compression algorithm, column data offset and length in the _data_ file, etc. Besides, pre-calculated results of the column data in the block are also in the column info part, which helps to improve reading speed by avoiding loading data block necessarily.
#### last file
To avoid storage fragment and to import query speed and compression rate, TDengine introduces an extra file, the _last_ file. When the number of records in a data block is lower than a threshold, TDengine will flush the block to the _last_ file for temporary storage. When new data comes, the data in the _last_ file will be merged with the new data and form a larger data block and written to the _data_ file. The organization of the _last_ file is similar to the _data_ file.
### Summary
The innovation in architecture and storage design of TDengine improves resource usage. On the one hand, the virtualization makes it easy to distribute resources between different vnodes and for future scaling. On the other hand, sorted and column-oriented storage makes TDengine have a great advantage in writing, querying and compression.
## Query Design
#### Introduction
TDengine provides a variety of query functions for both tables and super tables. In addition to regular aggregate queries, it also provides time window based query and statistical aggregation for time series data. TDengine's query processing requires the client app, management node, and data node to work together. The functions and modules involved in query processing included in each component are as follows:
Client (Client App). The client development kit, embed in a client application, consists of TAOS SQL parser and query executor, the second-stage aggregator (Result Merger), continuous query manager and other major functional modules. The SQL parser is responsible for parsing and verifying the SQL statement and converting it into an abstract syntax tree. The query executor is responsible for transforming the abstract syntax tree into the query execution logic and creates the metadata query according to the query condition of the SQL statement. Since TAOS SQL does not currently include complex nested queries and pipeline query processing mechanism, there is no longer need for query plan optimization and physical query plan conversions. The second-stage aggregator is responsible for performing the aggregation of the independent results returned by query involved data nodes at the client side to generate final results. The continuous query manager is dedicated to managing the continuous queries created by users, including issuing fixed-interval query requests and writing the results back to TDengine or returning to the client application as needed. Also, the client is also responsible for retrying after the query fails, canceling the query request, and maintaining the connection heartbeat and reporting the query status to the management node.
Management Node. The management node keeps the metadata of all the data of the entire cluster system, provides the metadata of the data required for the query from the client node, and divides the query request according to the load condition of the cluster. The super table contains information about all the tables created according to the super table, so the query processor (Query Executor) of the management node is responsible for the query processing of the tags of tables and returns the table information satisfying the tag query. Besides, the management node maintains the query status of the cluster in the Query Status Manager component, in which the metadata of all queries that are currently executing are temporarily stored in-memory buffer. When the client issues *show queries* command to management node, current running queries information is returned to the client.
Data Node. The data node, responsible for storing all data of the database, consists of query executor, query processing scheduler, query task queue, and other related components. Once the query requests from the client received, they are put into query task queue and waiting to be processed by query executor. The query executor extracts the query request from the query task queue and invokes the query optimizer to perform the basic optimization for the query execution plan. And then query executor scans the qualified data blocks in both cache and disk to obtain qualified data and return the calculated results. Besides, the data node also needs to respond to management information and commands from the management node. For example, after the *kill query* received from the management node, the query task needs to be stopped immediately.
<center> <img src="../assets/fig1.png"> </center>
<center>Fig 1. System query processing architecture diagram (only query related components)</center>
#### Query Process Design
The client, the management node, and the data node cooperate to complete the entire query processing of TDengine. Let's take a concrete SQL query as an example to illustrate the whole query processing flow. The SQL statement is to query on super table *FOO_SUPER_TABLE* to get the total number of records generated on January 12, 2019, from the table, of which TAG_LOC equals to 'beijing'. The SQL statement is as follows:
```sql
SELECT COUNT(*)
FROM FOO_SUPER_TABLE
WHERE TAG_LOC = 'beijing' AND TS >= '2019-01-12 00:00:00' AND TS < '2019-01-13 00:00:00'
```
First, the client invokes the TAOS SQL parser to parse and validate the SQL statement, then generates a syntax tree, and extracts the object of the query - the super table *FOO_SUPER_TABLE*, and then the parser sends requests with filtering information (TAG_LOC='beijing') to management node to get the corresponding metadata about *FOO_SUPER_TABLE*.
Once the management node receives the request for metadata acquisition, first finds the super table *FOO_SUPER_TABLE* basic information, and then applies the query condition (TAG_LOC='beijing') to filter all the related tables created according to it. And finally, the query executor returns the metadata information that satisfies the query request to the client.
After the client obtains the metadata information of *FOO_SUPER_TABLE*, the query executor initiates a query request with timestamp range filtering condition (TS >= '2019- 01-12 00:00:00' AND TS < '2019-01-13 00:00:00') to all nodes that hold the corresponding data according to the information about data distribution in metadata.
The data node receives the query sent from the client, converts it into an internal structure and puts it into the query task queue to be executed by query executor after optimizing the execution plan. When the query result is obtained, the query result is returned to the client. It should be noted that the data nodes perform the query process independently of each other, and rely solely on their data and content for processing.
When all data nodes involved in the query return results, the client aggregates the result sets from each data node. In this case, all results are accumulated to generate the final query result. The second stage of aggregation is not always required for all queries. For example, a column selection query does not require a second-stage aggregation at all.
#### REST Query Process
In addition to C/C++, Python, and JDBC interface, TDengine also provides a REST interface based on the HTTP protocol, which is different from using the client application programming interface. When the user uses the REST interface, all the query processing is completed on the server-side, and the user's application is not involved in query processing anymore. After the query processing is completed, the result is returned to the client through the HTTP JSON string.
<center> <img src="../assets/fig2.png"> </center>
<center>Fig. 2 REST query architecture</center>
When a client uses an HTTP-based REST query interface, the client first establishes a connection with the HTTP connector at the data node and then uses the token to ensure the reliability of the request through the REST signature mechanism. For the data node, after receiving the request, the HTTP connector invokes the embedded client program to initiate a query processing, and then the embedded client parses the SQL statement from the HTTP connector and requests the management node to get metadata as needed. After that, the embedded client sends query requests to the same data node or other nodes in the cluster and aggregates the calculation results on demand. Finally, you also need to convert the result of the query into a JSON format string and return it to the client via an HTTP response. After the HTTP connector receives the request SQL, the subsequent process processing is completely consistent with the query processing using the client application development kit.
It should be noted that during the entire processing, the client application is no longer involved in, and is only responsible for sending SQL requests through the HTTP protocol and receiving the results in JSON format. Besides, each data node is embedded with an HTTP connector and a client, so any data node in the cluster received requests from a client, the data node can initiate the query and return the result to the client through the HTTP protocol, with transfer the request to other data nodes.
#### Technology
Because TDengine stores data and tags value separately, the tag value is kept in the management node and directly associated with each table instead of records, resulting in a great reduction of the data storage. Therefore, the tag value can be managed by a fully in-memory structure. First, the filtering of the tag data can drastically reduce the data size involved in the second phase of the query. The query processing for the data is performed at the data node. TDengine takes advantage of the immutable characteristics of IoT data by calculating the maximum, minimum, and other statistics of the data in one data block on each saved data block, to effectively improve the performance of query processing. If the query process involves all the data of the entire data block, the pre-computed result is used directly, and the content of the data block is no longer needed. Since the size of disk space required to store the pre-computation result is much smaller than the size of the specific data, the pre-computation result can greatly reduce the disk IO and speed up the query processing.
TDengine employs column-oriented data storage techniques. When the data block is involved to be loaded from the disk for calculation, only the required column is read according to the query condition, and the read overhead can be minimized. The data of one column is stored in a contiguous memory block and therefore can make full use of the CPU L2 cache to greatly speed up the data scanning. Besides, TDengine utilizes the eagerly responding mechanism and returns a partial result before the complete result is acquired. For example, when the first batch of results is obtained, the data node immediately returns it directly to the client in case of a column select query.
\ No newline at end of file
# 超级表STable:多表聚合
TDengine要求每个数据采集点单独建表,这样能极大提高数据的插入/查询性能,但是导致系统中表的数量猛增,让应用对表的维护以及聚合、统计操作难度加大。为降低应用的开发难度,TDengine引入了超级表STable (Super Table)的概念。
## 什么是超级表
STable是同一类型数据采集点的抽象,是同类型采集实例的集合,包含多张数据结构一样的子表。每个STable为其子表定义了表结构和一组标签:表结构即表中记录的数据列及其数据类型;标签名和数据类型由STable定义,标签值记录着每个子表的静态信息,用以对子表进行分组过滤。子表本质上就是普通的表,由一个时间戳主键和若干个数据列组成,每行记录着具体的数据,数据查询操作与普通表完全相同;但子表与普通表的区别在于每个子表从属于一张超级表,并带有一组由STable定义的标签值。每种类型的采集设备可以定义一个STable。数据模型定义表的每列数据的类型,如温度、压力、电压、电流、GPS实时位置等,而标签信息属于Meta Data,如采集设备的序列号、型号、位置等,是静态的,是表的元数据。用户在创建表(数据采集点)时指定STable(采集类型)外,还可以指定标签的值,也可事后增加或修改。
TDengine扩展标准SQL语法用于定义STable,使用关键词tags指定标签信息。语法如下:
```mysql
CREATE TABLE <stable_name> (<field_name> TIMESTAMP, field_name1 field_type,…) TAGS(tag_name tag_type, …)
```
其中tag_name是标签名,tag_type是标签的数据类型。标签可以使用时间戳之外的其他TDengine支持的数据类型,标签的个数最多为6个,名字不能与系统关键词相同,也不能与其他列名相同。如:
```mysql
create table thermometer (ts timestamp, degree float)
tags (location binary(20), type int)
```
上述SQL创建了一个名为thermometer的STable,带有标签location和标签type。
为某个采集点创建表时,可以指定其所属的STable以及标签的值,语法如下:
```mysql
CREATE TABLE <tb_name> USING <stb_name> TAGS (tag_value1,...)
```
沿用上面温度计的例子,使用超级表thermometer建立单个温度计数据表的语句如下:
```mysql
create table t1 using thermometer tags (‘beijing’, 10)
```
上述SQL以thermometer为模板,创建了名为t1的表,这张表的Schema就是thermometer的Schema,但标签location值为‘beijing’,标签type值为10。
用户可以使用一个STable创建数量无上限的具有不同标签的表,从这个意义上理解,STable就是若干具有相同数据模型,不同标签的表的集合。与普通表一样,用户可以创建、删除、查看超级表STable,大部分适用于普通表的查询操作都可运用到STable上,包括各种聚合和投影选择函数。除此之外,可以设置标签的过滤条件,仅对STbale中部分表进行聚合查询,大大简化应用的开发。
TDengine对表的主键(时间戳)建立索引,暂时不提供针对数据模型中其他采集量(比如温度、压力值)的索引。每个数据采集点会采集若干数据记录,但每个采集点的标签仅仅是一条记录,因此数据标签在存储上没有冗余,且整体数据规模有限。TDengine将标签数据与采集的动态数据完全分离存储,而且针对STable的标签建立了高性能内存索引结构,为标签提供全方位的快速操作支持。用户可按照需求对其进行增删改查(Create,Retrieve,Update,Delete,CRUD)操作。
STable从属于库,一个STable只属于一个库,但一个库可以有一到多个STable, 一个STable可有多个子表。
## 超级表管理
- 创建超级表
```mysql
CREATE TABLE <stable_name> (<field_name> TIMESTAMP, field_name1 field_type,…) TAGS(tag_name tag_type, …)
```
与创建表的SQL语法相似。但需指定TAGS字段的名称和类型。
说明:
1. TAGS列总长度不能超过512 bytes;
2. TAGS列的数据类型不能是timestamp和nchar类型;
3. TAGS列名不能与其他列名相同;
4. TAGS列名不能为预留关键字.
- 显示已创建的超级表
```mysql
show stables;
```
查看数据库内全部STable,及其相关信息,包括STable的名称、创建时间、列数量、标签(TAG)数量、通过该STable建表的数量。
- 删除超级表
```mysql
DROP TABLE <stable_name>
```
Note: 删除STable不会级联删除通过STable创建的表;相反删除STable时要求通过该STable创建的表都已经被删除。
- 查看属于某STable并满足查询条件的表
```mysql
SELECT TBNAME,[TAG_NAME,…] FROM <stable_name> WHERE <tag_name> <[=|=<|>=|<>] values..> ([AND|OR] …)
```
查看属于某STable并满足查询条件的表。说明:TBNAME为关键词,显示通过STable建立的子表表名,查询过程中可以使用针对标签的条件。
```mysql
SELECT COUNT(TBNAME) FROM <stable_name> WHERE <tag_name> <[=|=<|>=|<>] values..> ([AND|OR] …)
```
统计属于某个STable并满足查询条件的子表的数量
## 写数据时自动建子表
在某些特殊场景中,用户在写数据时并不确定某个设备的表是否存在,此时可使用自动建表语法来实现写入数据时里用超级表定义的表结构自动创建不存在的子表,若该表已存在则不会建立新表。注意:自动建表语句只能自动建立子表而不能建立超级表,这就要求超级表已经被事先定义好。自动建表语法跟insert/import语法非常相似,唯一区别是语句中增加了超级表和标签信息。具体语法如下:
```mysql
INSERT INTO <tb_name> USING <stb_name> TAGS (<tag1_value>, ...) VALUES (field_value, ...) (field_value, ...) ...;
```
向表tb_name中插入一条或多条记录,如果tb_name这张表不存在,则会用超级表stb_name定义的表结构以及用户指定的标签值(即tag1_value…)来创建名为tb_name新表,并将用户指定的值写入表中。如果tb_name已经存在,则建表过程会被忽略,系统也不会检查tb_name的标签是否与用户指定的标签值一致,也即不会更新已存在表的标签。
```mysql
INSERT INTO <tb1_name> USING <stb1_name> TAGS (<tag1_value1>, ...) VALUES (<field1_value1>, ...) (<field1_value2>, ...) ... <tb_name2> USING <stb_name2> TAGS(<tag1_value2>, ...) VALUES (<field1_value1>, ...) ...;
```
向多张表tb1_name,tb2_name等插入一条或多条记录,并分别指定各自的超级表进行自动建表。
## STable中TAG管理
除了更新标签的值的操作是针对子表进行,其他所有的标签操作(添加标签、删除标签等)均只能作用于STable,不能对单个子表操作。对STable添加标签以后,依托于该STable建立的所有表将自动增加了一个标签,对于数值型的标签,新增加的标签的默认值是0.
- 添加新的标签
```mysql
ALTER TABLE <stable_name> ADD TAG <new_tag_name> <TYPE>
```
为STable增加一个新的标签,并指定新标签的类型。标签总数不能超过6个。
- 删除标签
```mysql
ALTER TABLE <stable_name> DROP TAG <tag_name>
```
删除超级表的一个标签,从超级表删除某个标签后,该超级表下的所有子表也会自动删除该标签。
说明:第一列标签不能删除,至少需要为STable保留一个标签。
- 修改标签名
```mysql
ALTER TABLE <stable_name> CHANGE TAG <old_tag_name> <new_tag_name>
```
修改超级表的标签名,从超级表修改某个标签名后,该超级表下的所有子表也会自动更新该标签名。
- 修改子表的标签值
```mysql
ALTER TABLE <table_name> SET TAG <tag_name>=<new_tag_value>
```
## STable多表聚合
针对所有的通过STable创建的子表进行多表聚合查询,支持按照全部的TAG值进行条件过滤,并可将结果按照TAGS中的值进行聚合,暂不支持针对binary类型的模糊匹配过滤。语法如下:
```mysql
SELECT function<field_name>,…
FROM <stable_name>
WHERE <tag_name> <[=|<=|>=|<>] values..> ([AND|OR] …)
INTERVAL (<interval> [, offset])
GROUP BY <tag_name>, <tag_name>…
ORDER BY <tag_name> <asc|desc>
SLIMIT <group_limit>
SOFFSET <group_offset>
LIMIT <record_limit>
OFFSET <record_offset>
```
**说明**
超级表聚合查询,TDengine目前支持以下聚合\选择函数:sum、count、avg、first、last、min、max、top、bottom,以及针对全部或部分列的投影操作,使用方式与单表查询的计算过程相同。暂不支持其他类型的聚合计算和四则运算。当前所有的函数及计算过程均不支持嵌套的方式进行执行。
不使用GROUP BY的查询将会对超级表下所有满足筛选条件的表按时间进行聚合,结果输出默认是按照时间戳单调递增输出,用户可以使用ORDER BY _c0 ASC|DESC选择查询结果时间戳的升降排序;使用GROUP BY <tag_name> 的聚合查询会按照tags进行分组,并对每个组内的数据分别进行聚合,输出结果为各个组的聚合结果,组间的排序可以由ORDER BY <tag_name> 语句指定,每个分组内部,时间序列是单调递增的。
使用SLIMIT/SOFFSET语句指定组间分页,即指定结果集中输出的最大组数以及对组起始的位置。使用LIMIT/OFFSET语句指定组内分页,即指定结果集中每个组内最多输出多少条记录以及记录起始的位置。
## STable使用示例
以温度传感器采集时序数据作为例,示范STable的使用。 在这个例子中,对每个温度计都会建立一张表,表名为温度计的ID,温度计读数的时刻记为ts,采集的值记为degree。通过tags给每个采集器打上不同的标签,其中记录温度计的地区和类型,以方便我们后面的查询。所有温度计的采集量都一样,因此我们用STable来定义表结构。
###定义STable表结构并使用它创建子表
创建STable语句如下:
```mysql
CREATE TABLE thermometer (ts timestamp, degree double)
TAGS(location binary(20), type int)
```
假设有北京,天津和上海三个地区的采集器共4个,温度采集器有3种类型,我们就可以对每个采集器建表如下:
```mysql
CREATE TABLE therm1 USING thermometer TAGS (’beijing’, 1);
CREATE TABLE therm2 USING thermometer TAGS (’beijing’, 2);
CREATE TABLE therm3 USING thermometer TAGS (’tianjin’, 1);
CREATE TABLE therm4 USING thermometer TAGS (’shanghai’, 3);
```
其中therm1,therm2,therm3,therm4是超级表thermometer四个具体的子表,也即普通的Table。以therm1为例,它表示采集器therm1的数据,表结构完全由thermometer定义,标签location=”beijing”, type=1表示therm1的地区是北京,类型是第1类的温度计。
###写入数据
注意,写入数据时不能直接对STable操作,而是要对每张子表进行操作。我们分别向四张表therm1,therm2, therm3, therm4写入一条数据,写入语句如下:
```mysql
INSERT INTO therm1 VALUES (’2018-01-01 00:00:00.000’, 20);
INSERT INTO therm2 VALUES (’2018-01-01 00:00:00.000’, 21);
INSERT INTO therm3 VALUES (’2018-01-01 00:00:00.000’, 24);
INSERT INTO therm4 VALUES (’2018-01-01 00:00:00.000’, 23);
```
###按标签聚合查询
查询位于北京(beijing)和天津(tianjing)两个地区的温度传感器采样值的数量count(*)、平均温度avg(degree)、最高温度max(degree)、最低温度min(degree),并将结果按所处地域(location)和传感器类型(type)进行聚合。
```mysql
SELECT COUNT(*), AVG(degree), MAX(degree), MIN(degree)
FROM thermometer
WHERE location=’beijing’ or location=’tianjing’
GROUP BY location, type
```
###按时间周期聚合查询
查询仅位于北京以外地区的温度传感器最近24小时(24h)采样值的数量count(*)、平均温度avg(degree)、最高温度max(degree)和最低温度min(degree),将采集结果按照10分钟为周期进行聚合,并将结果按所处地域(location)和传感器类型(type)再次进行聚合。
```mysql
SELECT COUNT(*), AVG(degree), MAX(degree), MIN(degree)
FROM thermometer
WHERE name<>’beijing’ and ts>=now-1d
INTERVAL(10M)
GROUP BY location, type
```
\ No newline at end of file
# STable: Super Table
"One Table for One Device" design can improve the insert/query performance significantly for a single device. But it has a side effect, the aggregation of multiple tables becomes hard. To reduce the complexity and improve the efficiency, TDengine introduced a new concept: STable (Super Table).
## What is a Super Table
STable is an abstract and a template for a type of device. A STable contains a set of devices (tables) that have the same schema or data structure. Besides the shared schema, a STable has a set of tags, like the model, serial number and so on. Tags are used to record the static attributes for the devices and are used to group a set of devices (tables) for aggregation. Tags are metadata of a table and can be added, deleted or changed.
TDengine does not save tags as a part of the data points collected. Instead, tags are saved as metadata. Each table has a set of tags. To improve query performance, tags are all cached and indexed. One table can only belong to one STable, but one STable may contain many tables.
Like a table, you can create, show, delete and describe STables. Most query operations on tables can be applied to STable too, including the aggregation and selector functions. For queries on a STable, if no tags filter, the operations are applied to all the tables created via this STable. If there is a tag filter, the operations are applied only to a subset of the tables which satisfy the tag filter conditions. It will be very convenient to use tags to put devices into different groups for aggregation.
##Create a STable
Similiar to creating a standard table, syntax is:
```mysql
CREATE TABLE <stable_name> (<field_name> TIMESTAMP, field_name1 field_type,…) TAGS(tag_name tag_type, …)
```
New keyword "tags" is introduced, where tag_name is the tag name, and tag_type is the associated data type.
Note:
1. The bytes of all tags together shall be less than 512
2. Tag's data type can not be time stamp or nchar
3. Tag name shall be different from the field name
4. Tag name shall not be the same as system keywords
5. Maximum number of tags is 6
For example:
```mysql
create table thermometer (ts timestamp, degree float)
tags (location binary(20), type int)
```
The above statement creates a STable thermometer with two tag "location" and "type"
##Create a Table via STable
To create a table for a device, you can use a STable as its template and assign the tag values. The syntax is:
```mysql
CREATE TABLE <tb_name> USING <stb_name> TAGS (tag_value1,...)
```
You can create any number of tables via a STable, and each table may have different tag values. For example, you create five tables via STable thermometer below:
```mysql
create table t1 using thermometer tags (‘beijing’, 10);
create table t2 using thermometer tags (‘beijing’, 20);
create table t3 using thermometer tags (‘shanghai’, 10);
create table t4 using thermometer tags (‘shanghai’, 20);
create table t5 using thermometer tags (‘new york’, 10);
```
## Aggregate Tables via STable
You can group a set of tables together by specifying the tags filter condition, then apply the aggregation operations. The result set can be grouped and ordered based on tag value. Syntax is:
```mysql
SELECT function<field_name>,…
FROM <stable_name>
WHERE <tag_name> <[=|<=|>=|<>] values..> ([AND|OR] …)
INTERVAL (<time range>)
GROUP BY <tag_name>, <tag_name>…
ORDER BY <tag_name> <asc|desc>
SLIMIT <group_limit>
SOFFSET <group_offset>
LIMIT <record_limit>
OFFSET <record_offset>
```
For the time being, STable supports only the following aggregation/selection functions: *sum, count, avg, first, last, min, max, top, bottom*, and the projection operations, the same syntax as a standard table. Arithmetic operations are not supported, embedded queries not either.
*INTERVAL* is used for the aggregation over a time range.
If *GROUP BY* is not used, the aggregation is applied to all the selected tables, and the result set is output in ascending order of the timestamp, but you can use "*ORDER BY _c0 ASC|DESC*" to specify the order you like.
If *GROUP BY <tag_name>* is used, the aggregation is applied to groups based on tags. Each group is aggregated independently. Result set is a group of aggregation results. The group order is decided by *ORDER BY <tag_name>*. Inside each group, the result set is in the ascending order of the time stamp.
*SLIMIT/SOFFSET* are used to limit the number of groups and starting group number.
*LIMIT/OFFSET* are used to limit the number of records in a group and the starting rows.
###Example 1:
Check the average, maximum, and minimum temperatures of Beijing and Shanghai, and group the result set by location and type. The SQL statement shall be:
```mysql
SELECT COUNT(*), AVG(degree), MAX(degree), MIN(degree)
FROM thermometer
WHERE location=’beijing’ or location=’tianjing’
GROUP BY location, type
```
### Example 2:
List the number of records, average, maximum, and minimum temperature every 10 minutes for the past 24 hours for all the thermometers located in Beijing with type 10. The SQL statement shall be:
```mysql
SELECT COUNT(*), AVG(degree), MAX(degree), MIN(degree)
FROM thermometer
WHERE name=’beijing’ and type=10 and ts>=now-1d
INTERVAL(10M)
```
## Create Table Automatically
Insert operation will fail if the table is not created yet. But for STable, TDengine can create the table automatically if the application provides the STable name, table name and tags' value when inserting data points. The syntax is:
```mysql
INSERT INTO <tb_name> USING <stb_name> TAGS (<tag1_value>, ...) VALUES (field_value, ...) (field_value, ...) ... <tb_name2> USING <stb_name2> TAGS(<tag1_value2>, ...) VALUES (<field1_value1>, ...) ...;
```
When inserting data points into table tb_name, the system will check if table tb_name is created or not. If it is already created, the data points will be inserted as usual. But if the table is not created yet, the system will create the table tb_bame using STable stb_name as the template with the tags. Multiple tables can be specified in the SQL statement.
## Management of STables
After you can create a STable, you can describe, delete, change STables. This section lists all the supported operations.
### Show STables in current DB
```mysql
show stables;
```
It lists all STables in current DB, including the name, created time, number of fileds, number of tags, and number of tables which are created via this STable.
### Describe a STable
```mysql
DESCRIBE <stable_name>
```
It lists the STable's schema and tags
### Drop a STable
```mysql
DROP TABLE <stable_name>
```
To delete a STable, all the tables created via this STable shall be deleted first, otherwise, it will fail.
### List the Associated Tables of a STable
```mysql
SELECT TBNAME,[TAG_NAME,…] FROM <stable_name> WHERE <tag_name> <[=|=<|>=|<>] values..> ([AND|OR] …)
```
It will list all the tables which satisfy the tag filter conditions. The tables are all created from this specific STable. TBNAME is a new keyword introduced, it is the table name associated with the STable.
```mysql
SELECT COUNT(TBNAME) FROM <stable_name> WHERE <tag_name> <[=|=<|>=|<>] values..> ([AND|OR] …)
```
The above SQL statement will list the number of tables in a STable, which satisfy the filter condition.
## Management of Tags
You can add, delete and change the tags for a STable, and you can change the tag value of a table. The SQL commands are listed below.
###Add a Tag
```mysql
ALTER TABLE <stable_name> ADD TAG <new_tag_name> <TYPE>
```
It adds a new tag to the STable with a data type. The maximum number of tags is 6.
###Drop a Tag
```mysql
ALTER TABLE <stable_name> DROP TAG <tag_name>
```
It drops a tag from a STable. The first tag could not be deleted, and there must be at least one tag.
###Change a Tag's Name
```mysql
ALTER TABLE <stable_name> CHANGE TAG <old_tag_name> <new_tag_name>
```
It changes the name of a tag from old to new.
###Change the Tag's Value
```mysql
ALTER TABLE <table_name> SET TAG <tag_name>=<new_tag_value>
```
It changes a table's tag value to a new one.
#Administrator
## Directory and Files
After TDengine is installed, by default, the following directories will be created:
| Directory/File | Description |
| ---------------------- | :------------------------------ |
| /etc/taos/taos.cfg | TDengine configuration file |
| /usr/local/taos/driver | TDengine dynamic link library |
| /var/lib/taos | TDengine default data directory |
| /var/log/taos | TDengine default log directory |
| /usr/local/taos/bin. | TDengine executables |
### Executables
All TDengine executables are located at _/usr/local/taos/bin_ , including:
- `taosd`:TDengine server
- `taos`: TDengine Shell, the command line interface.
- `taosdump`:TDengine data export tool
- `rmtaos`: a script to uninstall TDengine
You can change the data directory and log directory setting through the system configuration file
## Configuration on Server
`taosd` is running on the server side, you can change the system configuration file taos.cfg to customize its behavior. By default, taos.cfg is located at /etc/taos, but you can specify the path to configuration file via the command line parameter -c. For example: `taosd -c /home/user` means the configuration file will be read from directory /home/user.
This section lists only the most important configuration parameters. Please check taos.cfg to find all the configurable parameters. **Note: to make your new configurations work, you have to restart taosd after you change taos.cfg**.
- mgmtShellPort: TCP and UDP port between client and TDengine mgmt (default: 6030). Note: 5 successive UDP ports (6030-6034) starting from this number will be used.
- vnodeShellPort: TCP and UDP port between client and TDengine vnode (default: 6035). Note: 5 successive UDP ports (6035-6039) starting from this number will be used.
- httpPort: TCP port for RESTful service (default: 6020)
- dataDir: data directory, default is /var/lib/taos
- maxUsers: maximum number of users allowed
- maxDbs: maximum number of databases allowed
- maxTables: maximum number of tables allowed
- enableMonitor: turn on/off system monitoring, 0: off, 1: on
- logDir: log directory, default is /var/log/taos
- numOfLogLines: maximum number of lines in the log file
- debugFlag: log level, 131: only error and warnings, 135: all
In different scenarios, data characteristics are different. For example, the retention policy, data sampling period, record size, the number of devices, and data compression may be different. To gain the best performance, you can change the following configurations related to storage:
- days: number of days to cover for a data file
- keep: number of days to keep the data
- rows: number of rows of records in a block in data file.
- comp: compression algorithm, 0: off, 1: standard; 2: maximum compression
- ctime: period (seconds) to flush data to disk
- clog: flag to turn on/off Write Ahead Log, 0: off, 1: on
- tables: maximum number of tables allowed in a vnode
- cache: cache block size (bytes)
- tblocks: maximum number of cache blocks for a table
- abloks: average number of cache blocks for a table
- precision: timestamp precision, us: microsecond ms: millisecond, default is ms
For an application, there may be multiple data scenarios. The best design is to put all data with the same characteristics into one database. One application may have multiple databases, and every database has its own configuration to maximize the system performance. You can specify the above configurations related to storage when you create a database. For example:
```mysql
CREATE DATABASE demo DAYS 10 CACHE 16000 ROWS 2000
```
The above SQL statement will create a database demo, with 10 days for each data file, 16000 bytes for a cache block, and 2000 rows in a file block.
The configuration provided when creating a database will overwrite the configuration in taos.cfg.
## Configuration on Client
*taos* is the TDengine shell and is a client that connects to taosd. TDengine uses the same configuration file taos.cfg for the client, with default location at /etc/taos. You can change it by specifying command line parameter -c when you run taos. For example, *taos -c /home/user*, it will read the configuration file taos.cfg from directory /home/user.
The parameters related to client configuration are listed below:
- masterIP: IP address of TDengine server
- charset: character set, default is the system . For data type nchar, TDengine uses unicode to store the data. Thus, the client needs to tell its character set.
- locale: system language setting
- defaultUser: default login user, default is root
- defaultPass: default password, default is taosdata
For TCP/UDP port, and system debug/log configuration, it is the same as the server side.
For server IP, user name, password, you can always specify them in the command line when you run taos. If they are not specified, they will be read from the taos.cfg
## User Management
System administrator (user root) can add, remove a user, or change the password from the TDengine shell. Commands are listed below:
Create a user, password shall be quoted with the single quote.
```mysql
CREATE USER user_name PASS ‘password’
```
Remove a user
```mysql
DROP USER user_name
```
Change the password for a user
```mysql
ALTER USER user_name PASS ‘password’
```
List all users
```mysql
SHOW USERS
```
## Import Data
Inside the TDengine shell, you can import data into TDengine from either a script or CSV file
**Import from Script**
```
source <filename>
```
Inside the file, you can put all SQL statements there. Each SQL statement has a line. If a line starts with "#", it means comments, it will be skipped. The system will execute the SQL statements line by line automatically until the ends
**Import from CSV**
```mysql
insert into tb1 file 'path/data.csv'
```
CSV file contains records for only one table, and the data structure shall be the same as the defined schema for the table. The header of CSV file shall be removed.
For example, the following is a sub-table d1001:
```mysql
taos> DESCRIBE d1001
Field | Type | Length | Note |
=================================================================================
ts | TIMESTAMP | 8 | |
current | FLOAT | 4 | |
voltage | INT | 4 | |
phase | FLOAT | 4 | |
location | BINARY | 64 | TAG |
groupid | INT | 4 | TAG |
```
The data format in data.csv like this:
```csv
'2018-10-04 06:38:05.000',10.30000,219,0.31000
'2018-10-05 06:38:15.000',12.60000,218,0.33000
'2018-10-06 06:38:16.800',13.30000,221,0.32000
'2018-10-07 06:38:05.000',13.30000,219,0.33000
'2018-10-08 06:38:05.000',14.30000,219,0.34000
'2018-10-09 06:38:05.000',15.30000,219,0.35000
'2018-10-10 06:38:05.000',16.30000,219,0.31000
'2018-10-11 06:38:05.000',17.30000,219,0.32000
'2018-10-12 06:38:05.000',18.30000,219,0.31000
```
then data can be imported into database by this cmd:
```
taos> insert into d1001 file '~/data.csv';
Query OK, 9 row(s) affected (0.004763s)
```
## Export Data
You can export data either from TDengine shell or from tool taosdump.
**Export from TDengine Shell**
```mysql
select * from <tb_name> >> data.csv
```
The above SQL statement will dump the query result set into data.csv file.
**Export Using taosdump**
TDengine provides a data dumping tool taosdump. You can choose to dump a database, a table, all data or data only a time range, even only the metadata. For example:
- Export one or more tables in a DB: taosdump [OPTION…] dbname tbname …
- Export one or more DBs: taosdump [OPTION…] --databases dbname…
- Export all DBs (excluding system DB): taosdump [OPTION…] --all-databases
run *taosdump —help* to get a full list of the options
## Management of Connections, Streams, Queries
The system administrator can check, kill the ongoing connections, streams, or queries.
```
SHOW CONNECTIONS
```
It lists all connections. The first column shows connection-id from the client.
```
KILL CONNECTION <connection-id>
```
It kills the connection, where connection-id is the number of the first column showed by "SHOW CONNECTIONS".
```
SHOW QUERIES
```
It shows the ongoing queries. The first column shows the connection-id:query-no, where connection-id is the connection from the client, and id assigned by the system
```
KILL QUERY <query-id>
```
It kills the query, where query-id is the connection-id:query-no showed by "SHOW QUERIES". You can copy and paste it.
```
SHOW STREAMS
```
It shows the continuous queries. The first column shows the connection-id:stream-no, where connection-id is the connection from the client, and id assigned by the system.
```
KILL STREAM <stream-id>
```
It kills the continuous query, where stream-id is the connection-id:stream-no showed by "SHOW STREAMS". You can copy and paste it.
## System Monitor
TDengine runs a system monitor in the background. Once it is started, it will create a database sys automatically. System monitor collects the metric like CPU, memory, network, disk, number of requests periodically, and writes them into database sys. Also, TDengine will log all important actions, like login, logout, create database, drop database and so on, and write them into database sys.
You can check all the saved monitor information from database sys. By default, system monitor is turned on. But you can turn it off by changing the parameter in the configuration file.
#Advanced Features
##Continuous Query
Continuous Query is a query executed by TDengine periodically with a sliding window, it is a simplified stream computing driven by timers, not by events. Continuous query can be applied to a table or a STable, and the result set can be passed to the application directly via call back function, or written into a new table in TDengine. The query is always executed on a specified time window (window size is specified by parameter interval), and this window slides forward while time flows (the sliding period is specified by parameter sliding).
Continuous query is defined by TAOS SQL, there is nothing special. One of the best applications is downsampling. Once it is defined, at the end of each cycle, the system will execute the query, pass the result to the application or write it to a database.
If historical data pints are inserted into the stream, the query won't be re-executed, and the result set won't be updated. If the result set is passed to the application, the application needs to keep the status of continuous query, the server won't maintain it. If application re-starts, it needs to decide the time where the stream computing shall be started.
####How to use continuous query
- Pass result set to application
Application shall use API taos_stream (details in connector section) to start the stream computing. Inside the API, the SQL syntax is:
```sql
SELECT aggregation FROM [table_name | stable_name]
INTERVAL(window_size) SLIDING(period)
```
where the new keyword INTERVAL specifies the window size, and SLIDING specifies the sliding period. If parameter sliding is not specified, the sliding period will be the same as window size. The minimum window size is 10ms. The sliding period shall not be larger than the window size. If you set a value larger than the window size, the system will adjust it to window size automatically.
For example:
```sql
SELECT COUNT(*) FROM FOO_TABLE
INTERVAL(1M) SLIDING(30S)
```
The above SQL statement will count the number of records for the past 1-minute window every 30 seconds.
- Save the result into a database
If you want to save the result set of stream computing into a new table, the SQL shall be:
```sql
CREATE TABLE table_name AS
SELECT aggregation from [table_name | stable_name]
INTERVAL(window_size) SLIDING(period)
```
Also, you can set the time range to execute the continuous query. If no range is specified, the continuous query will be executed forever. For example, the following continuous query will be executed from now and will stop in one hour.
```sql
CREATE TABLE QUERY_RES AS
SELECT COUNT(*) FROM FOO_TABLE
WHERE TS > NOW AND TS <= NOW + 1H
INTERVAL(1M) SLIDING(30S)
```
###Manage the Continuous Query
Inside TDengine shell, you can use the command "show streams" to list the ongoing continuous queries, the command "kill stream" to kill a specific continuous query.
If you drop a table generated by the continuous query, the query will be removed too.
##Publisher/Subscriber
Time series data is a sequence of data points over time. Inside a table, the data points are stored in order of timestamp. Also, there is a data retention policy, the data points will be removed once their lifetime is passed. From another view, a table in DTengine is just a standard message queue.
To reduce the development complexity and improve data consistency, TDengine provides the pub/sub functionality. To publish a message, you simply insert a record into a table. Compared with popular messaging tool Kafka, you subscribe to a table or a SQL query statement, instead of a topic. Once new data points arrive, TDengine will notify the application. The process is just like Kafka.
The detailed API will be introduced in the [connectors](https://www.taosdata.com/en/documentation/advanced-features/) section.
##Caching
TDengine allocates a fixed-size buffer in memory, the newly arrived data will be written into the buffer first. Every device or table gets one or more memory blocks. For typical IoT scenarios, the hot data shall always be newly arrived data, they are more important for timely analysis. Based on this observation, TDengine manages the cache blocks in First-In-First-Out strategy. If no enough space in the buffer, the oldest data will be saved into hard disk first, then be overwritten by newly arrived data. TDengine also guarantees every device can keep at least one block of data in the buffer.
By this design, the application can retrieve the latest data from each device super-fast, since they are all available in memory. You can use last or last_row function to return the last data record. If the super table is used, it can be used to return the last data records of all or a subset of devices. For example, to retrieve the latest temperature from thermometers in located Beijing, execute the following SQL
```mysql
select last(*) from thermometers where location=’beijing’
```
By this design, caching tool, like Redis, is not needed in the system. It will reduce the complexity of the system.
TDengine creates one or more virtual nodes(vnode) in each data node. Each vnode contains data for multiple tables and has its own buffer. The buffer of a vnode is fully separated from the buffer of another vnode, not shared. But the tables in a vnode share the same buffer.
System configuration parameter cacheBlockSize configures the cache block size in bytes, and another parameter cacheNumOfBlocks configures the number of cache blocks. The total memory for the buffer of a vnode is $cacheBlockSize \times cacheNumOfBlocks$. Another system parameter numOfBlocksPerMeter configures the maximum number of cache blocks a table can use. When you create a database, you can specify these parameters.
\ No newline at end of file
#FAQ
#### 1. How to upgrade TDengine from 1.X versions to 2.X and above versions?
Version 2.X is a complete refactoring of the previous version, and configuration files and data files are incompatible. Be sure to do the following before upgrading:
1. Delete the configuration file, and execute <code>sudo rm -rf /etc/taos/taos</code>
2. Delete the log file, and execute <code>sudo rm -rf /var/log/taos </code>
3. ENSURE THAT YOUR DATAS ARE NO LONGER NEEDED! Delete the data file, and execute <code>sudo rm -rf /var/lib/taos </code>
4. Enjoy the latest stable version of TDengine
5. If the data needs to be migrated or the data file is corrupted, please contact the official technical support team for assistance
#### 2. When encoutered with the error "Unable to establish connection", what can I do?
The client may encounter connection errors. Please follow the steps below for troubleshooting:
1. Make sure that the client and server version Numbers are exactly the same, and that the open source community and Enterprise versions are not mixed.
2. On the server side, execute `systemctl status taosd` to check the status of *taosd* service. If *taosd* is not running, start it and retry connecting.
3. Make sure you have used the correct server IP address to connect to.
4. Ping the server. If no response is received, check your network connection.
5. Check the firewall setting, make sure the TCP/UDP ports from 6030-6039 are enabled.
6. For JDBC, ODBC, Python, Go connections on Linux, make sure the native library *libtaos.so* are located at /usr/local/lib/taos, and /usr/local/lib/taos is in the *LD_LIBRARY_PATH*.
7. For JDBC, ODBC, Python, Go connections on Windows, make sure *driver/c/taos.dll* is in the system search path (or you can copy taos.dll into *C:\Windows\System32*)
8. If the above steps can not help, try the network diagnostic tool *nc* to check if TCP/UDP port works
check UDP port:`nc -vuz {hostIP} {port} `
check TCP port on server: `nc -l {port}`
check TCP port on client: ` nc {hostIP} {port}`
#### 3. Why I get "Invalid SQL" error when a query is syntactically correct?
If you are sure your query has correct syntax, please check the length of the SQL string. Before version 2.0, it shall be less than 64KB.
#### 4. Does TDengine support validation queries?
For the time being, TDengine does not have a specific set of validation queries. However, TDengine comes with a system monitoring database named 'sys', which can usually be used as a validation query object.
#### 5. Can I delete or update a record that has been written into TDengine?
The answer is NO. The design of TDengine is based on the assumption that records are generated by the connected devices, you won't be allowed to change it. But TDengine provides a retention policy, the data records will be removed once their lifetime is passed.
#### 6. What is the most efficient way to write data to TDengine?
TDengine supports several different writing regimes. The most efficient way to write data to TDengine is to use batch inserting. For details on batch insertion syntax, please refer to [Taos SQL](../documentation/taos-sql)
此差异已折叠。
此差异已折叠。
......@@ -48,9 +48,8 @@ typedef struct SDnodeObj {
int32_t dnodeId;
int32_t openVnodes;
int64_t createdTime;
int32_t resever0; // from dnode status msg, config information
int64_t lastAccess;
int32_t customScore; // config by user
uint32_t lastAccess;
uint16_t numOfCores; // from dnode status msg
uint16_t dnodePort;
char dnodeFqdn[TSDB_FQDN_LEN];
......
......@@ -78,7 +78,7 @@ void mnodeUpdateDnode(SDnodeObj *pDnode);
int32_t mnodeDropDnode(SDnodeObj *pDnode, void *pMsg);
int32_t mnodeCompactDnodes();
extern int32_t tsAccessSquence;
extern int64_t tsAccessSquence;
#ifdef __cplusplus
}
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册