提交 97052fcc 编写于 作者: arielyangpan's avatar arielyangpan

docs: change TAOS SQL to TDengine SQL

上级 8c69f698
---
title: TDengine SQL
description: "The syntax supported by TDengine SQL "
description: 'The syntax supported by TDengine SQL '
---
This section explains the syntax of SQL to perform operations on databases, tables and STables, insert data, select data and use functions. We also provide some tips that can be used in TDengine SQL. If you have previous experience with SQL this section will be fairly easy to understand. If you do not have previous experience with SQL, you'll come to appreciate the simplicity and power of SQL. TDengine SQL has been enhanced in version 3.0, and the query engine has been rearchitected. For information about how TDengine SQL has changed, see [Changes in TDengine 3.0](../taos-sql/changes).
......@@ -15,7 +15,7 @@ Syntax Specifications used in this chapter:
- | means one of a few options, excluding | itself.
- … means the item prior to it can be repeated multiple times.
To better demonstrate the syntax, usage and rules of TAOS SQL, hereinafter it's assumed that there is a data set of data from electric meters. Each meter collects 3 data measurements: current, voltage, phase. The data model is shown below:
To better demonstrate the syntax, usage and rules of TDengine SQL, hereinafter it's assumed that there is a data set of data from electric meters. Each meter collects 3 data measurements: current, voltage, phase. The data model is shown below:
```
taos> DESCRIBE meters;
......
---
title: Schemaless Writing
description: "The Schemaless write method eliminates the need to create super tables/sub tables in advance and automatically creates the storage structure corresponding to the data, as it is written to the interface."
description: 'The Schemaless write method eliminates the need to create super tables/sub tables in advance and automatically creates the storage structure corresponding to the data, as it is written to the interface.'
---
In IoT applications, data is collected for many purposes such as intelligent control, business analysis, device monitoring and so on. Due to changes in business or functional requirements or changes in device hardware, the application logic and even the data collected may change. Schemaless writing automatically creates storage structures for your data as it is being written to TDengine, so that you do not need to create supertables in advance. When necessary, schemaless writing
......@@ -25,7 +25,7 @@ where:
- measurement will be used as the data table name. It will be separated from tag_set by a comma.
- `tag_set` will be used as tags, with format like `<tag_key>=<tag_value>,<tag_key>=<tag_value>` Enter a space between `tag_set` and `field_set`.
- `field_set`will be used as data columns, with format like `<field_key>=<field_value>,<field_key>=<field_value>` Enter a space between `field_set` and `timestamp`.
- `timestamp` is the primary key timestamp corresponding to this row of data
- `timestamp` is the primary key timestamp corresponding to this row of data
All data in tag_set is automatically converted to the NCHAR data type and does not require double quotes (").
......@@ -36,14 +36,14 @@ In the schemaless writing data line protocol, each data item in the field_set ne
- Spaces, equal signs (=), commas (,), and double quotes (") need to be escaped with a backslash (\\) in front. (All refer to the ASCII character)
- Numeric types will be distinguished from data types by the suffix.
| **Serial number** | **Postfix** | **Mapping type** | **Size (bytes)** |
| -------- | -------- | ------------ | -------------- |
| 1 | None or f64 | double | 8 |
| 2 | f32 | float | 4 |
| 3 | i8/u8 | TinyInt/UTinyInt | 1 |
| 4 | i16/u16 | SmallInt/USmallInt | 2 |
| 5 | i32/u32 | Int/UInt | 4 |
| 6 | i64/i/u64/u | BigInt/BigInt/UBigInt/UBigInt | 8 |
| **Serial number** | **Postfix** | **Mapping type** | **Size (bytes)** |
| ----------------- | ----------- | ----------------------------- | ---------------- |
| 1 | None or f64 | double | 8 |
| 2 | f32 | float | 4 |
| 3 | i8/u8 | TinyInt/UTinyInt | 1 |
| 4 | i16/u16 | SmallInt/USmallInt | 2 |
| 5 | i32/u32 | Int/UInt | 4 |
| 6 | i64/i/u64/u | BigInt/BigInt/UBigInt/UBigInt | 8 |
- `t`, `T`, `true`, `True`, `TRUE`, `f`, `F`, `false`, and `False` will be handled directly as BOOL types.
......@@ -61,14 +61,14 @@ Note that if the wrong case is used when describing the data type suffix, or if
Schemaless writes process row data according to the following principles.
1. You can use the following rules to generate the subtable names: first, combine the measurement name and the key and value of the label into the next string:
1. You can use the following rules to generate the subtable names: first, combine the measurement name and the key and value of the label into the next string:
```json
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
Note that tag_key1, tag_key2 are not the original order of the tags entered by the user but the result of using the tag names in ascending order of the strings. Therefore, tag_key1 is not the first tag entered in the line protocol.
The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t_" is a fixed prefix that every table generated by this mapping relationship has.
Note that tag*key1, tag_key2 are not the original order of the tags entered by the user but the result of using the tag names in ascending order of the strings. Therefore, tag_key1 is not the first tag entered in the line protocol.
The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t*" is a fixed prefix that every table generated by this mapping relationship has.
You can configure smlChildTableName to specify table names, for example, `smlChildTableName=tname`. You can insert `st,tname=cpul,t1=4 c1=3 1626006833639000000` and the cpu1 table will be automatically created. Note that if multiple rows have the same tname but different tag_set values, the tag_set of the first row is used to create the table and the others are ignored.
2. If the super table obtained by parsing the line protocol does not exist, this super table is created.
......@@ -82,7 +82,7 @@ You can configure smlChildTableName to specify table names, for example, `smlChi
:::tip
All processing logic of schemaless will still follow TDengine's underlying restrictions on data structures, such as the total length of each row of data cannot exceed
16KB. See [TAOS SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
16KB. See [TDengine SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
:::
......@@ -90,23 +90,23 @@ All processing logic of schemaless will still follow TDengine's underlying restr
Three specified modes are supported in the schemaless writing process, as follows:
| **Serial** | **Value** | **Description** |
| -------- | ------------------- | ------------------------------- |
| 1 | SML_LINE_PROTOCOL | InfluxDB Line Protocol |
| 2 | SML_TELNET_PROTOCOL | OpenTSDB file protocol |
| 3 | SML_JSON_PROTOCOL | OpenTSDB JSON protocol |
| **Serial** | **Value** | **Description** |
| ---------- | ------------------- | ---------------------- |
| 1 | SML_LINE_PROTOCOL | InfluxDB Line Protocol |
| 2 | SML_TELNET_PROTOCOL | OpenTSDB file protocol |
| 3 | SML_JSON_PROTOCOL | OpenTSDB JSON protocol |
In InfluxDB line protocol mode, you must specify the precision of the input timestamp. Valid precisions are described in the following table.
| **No.** | **Precision** | **Description** |
| -------- | --------------------------------- | -------------- |
| 1 | TSDB_SML_TIMESTAMP_NOT_CONFIGURED | Not defined (invalid) |
| 2 | TSDB_SML_TIMESTAMP_HOURS | Hours |
| 3 | TSDB_SML_TIMESTAMP_MINUTES | Minutes |
| 4 | TSDB_SML_TIMESTAMP_SECONDS | Seconds |
| 5 | TSDB_SML_TIMESTAMP_MILLI_SECONDS | Milliseconds |
| 6 | TSDB_SML_TIMESTAMP_MICRO_SECONDS | Microseconds |
| 7 | TSDB_SML_TIMESTAMP_NANO_SECONDS | Nanoseconds |
| **No.** | **Precision** | **Description** |
| ------- | --------------------------------- | --------------------- |
| 1 | TSDB_SML_TIMESTAMP_NOT_CONFIGURED | Not defined (invalid) |
| 2 | TSDB_SML_TIMESTAMP_HOURS | Hours |
| 3 | TSDB_SML_TIMESTAMP_MINUTES | Minutes |
| 4 | TSDB_SML_TIMESTAMP_SECONDS | Seconds |
| 5 | TSDB_SML_TIMESTAMP_MILLI_SECONDS | Milliseconds |
| 6 | TSDB_SML_TIMESTAMP_MICRO_SECONDS | Microseconds |
| 7 | TSDB_SML_TIMESTAMP_NANO_SECONDS | Nanoseconds |
In OpenTSDB file and JSON protocol modes, the precision of the timestamp is determined from its length in the standard OpenTSDB manner. User input is ignored.
......
......@@ -12,6 +12,7 @@ The design of TDengine is based on the assumption that any hardware or software
Logical structure diagram of TDengine's distributed architecture is as follows:
![TDengine Database architecture diagram](structure.webp)
<center> Figure 1: TDengine architecture diagram </center>
A complete TDengine system runs on one or more physical nodes. Logically, it includes data node (dnode), TDengine client driver (TAOSC) and application (app). There are one or more data nodes in the system, which form a cluster. The application interacts with the TDengine cluster through TAOSC's API. The following is a brief introduction to each logical unit.
......@@ -38,15 +39,16 @@ A complete TDengine system runs on one or more physical nodes. Logically, it inc
**Cluster external connection**: TDengine cluster can accommodate a single, multiple or even thousands of data nodes. The application only needs to initiate a connection to any data node in the cluster. The network parameter required for connection is the End Point (FQDN plus configured port number) of a data node. When starting the application taos through CLI, the FQDN of the data node can be specified through the option `-h`, and the configured port number can be specified through `-p`. If the port is not configured, the system configuration parameter “serverPort” of TDengine will be adopted.
**Inter-cluster communication**: Data nodes connect with each other through TCP/UDP. When a data node starts, it will obtain the EP information of the dnode where the mnode is located, and then establish a connection with the mnode in the system to exchange information. There are three steps to obtain EP information of the mnode:
**Inter-cluster communication**: Data nodes connect with each other through TCP/UDP. When a data node starts, it will obtain the EP information of the dnode where the mnode is located, and then establish a connection with the mnode in the system to exchange information. There are three steps to obtain EP information of the mnode:
1. Check whether the mnodeEpList file exists, if it does not exist or cannot be opened normally to obtain EP information of the mnode, skip to the second step;
1. Check whether the mnodeEpList file exists, if it does not exist or cannot be opened normally to obtain EP information of the mnode, skip to the second step;
2. Check the system configuration file taos.cfg to obtain node configuration parameters “firstEp” and “secondEp” (the node specified by these two parameters can be a normal node without mnode, in this case, the node will try to redirect to the mnode node when connected). If these two configuration parameters do not exist or do not exist in taos.cfg, or are invalid, skip to the third step;
3. Set your own EP as a mnode EP and run it independently. After obtaining the mnode EP list, the data node initiates the connection. It will successfully join the working cluster after connection. If not successful, it will try the next item in the mnode EP list. If all attempts are made, but the connection still fails, sleep for a few seconds before trying again.
**The choice of MNODE**: TDengine logically has a management node, but there is no separate execution code. The server-side only has one set of execution code, taosd. So which data node will be the management node? This is determined automatically by the system without any manual intervention. The principle is as follows: when a data node starts, it will check its End Point and compare it with the obtained mnode EP List. If its EP exists in it, the data node shall start the mnode module and become a mnode. If your own EP is not in the mnode EP List, the mnode module will not start. During the system operation, due to load balancing, downtime and other reasons, mnode may migrate to the new dnode, totally transparently and without manual intervention. The modification of configuration parameters is the decision made by mnode itself according to resources usage.
**Add new data nodes:** After the system has a data node, it has become a working system. There are two steps to add a new node into the cluster.
**Add new data nodes:** After the system has a data node, it has become a working system. There are two steps to add a new node into the cluster.
- Step1: Connect to the existing working data node using TDengine CLI, and then add the End Point of the new data node with the command "create dnode"
- Step 2: In the system configuration parameter file taos.cfg of the new data node, set the “firstEp” and “secondEp” parameters to the EP of any two data nodes in the existing cluster. Please refer to the user tutorial for detailed steps. In this way, the cluster will be established step by step.
......@@ -57,6 +59,7 @@ A complete TDengine system runs on one or more physical nodes. Logically, it inc
To explain the relationship between vnode, mnode, TAOSC and application and their respective roles, the following is an analysis of a typical data writing process.
![typical process of TDengine Database](message.webp)
<center> Figure 2: Typical process of TDengine </center>
1. Application initiates a request to insert data through JDBC, ODBC, or other APIs.
......@@ -121,16 +124,17 @@ The load balancing process does not require any manual intervention, and it is t
If a database has N replicas, a virtual node group has N virtual nodes. But only one is the Leader and all others are slaves. When the application writes a new record to system, only the Leader vnode can accept the writing request. If a follower vnode receives a writing request, the system will notifies TAOSC to redirect.
### Leader vnode Writing Process
### Leader vnode Writing Process
Leader Vnode uses a writing process as follows:
![TDengine Database Leader Writing Process](write_master.webp)
<center> Figure 3: TDengine Leader writing process </center>
1. Leader vnode receives the application data insertion request, verifies, and moves to next step;
2. If the system configuration parameter `“walLevel”` is greater than 0, vnode will write the original request packet into database log file WAL. If walLevel is set to 2 and fsync is set to 0, TDengine will make WAL data written immediately to ensure that even system goes down, all data can be recovered from database log file;
3. If there are multiple replicas, vnode will forward data packet to follower vnodes in the same virtual node group, and the forwarded packet has a version number with data;
3. If there are multiple replicas, vnode will forward data packet to follower vnodes in the same virtual node group, and the forwarded packet has a version number with data;
4. Write into memory and add the record to “skip list”;
5. Leader vnode returns a confirmation message to the application, indicating a successful write.
6. If any of Step 2, 3 or 4 fails, the error will directly return to the application.
......@@ -140,6 +144,7 @@ Leader Vnode uses a writing process as follows:
For a follower vnode, the write process as follows:
![TDengine Database Follower Writing Process](write_slave.webp)
<center> Figure 4: TDengine Follower Writing Process </center>
1. Follower vnode receives a data insertion request forwarded by Leader vnode;
......@@ -212,6 +217,7 @@ When data is written to disk, the system decideswhether to compress the data bas
By default, TDengine saves all data in /var/lib/taos directory, and the data files of each vnode are saved in a different directory under this directory. In order to expand the storage space, minimize the bottleneck of file reading and improve the data throughput rate, TDengine can configure the system parameter “dataDir” to allow multiple mounted hard disks to be used by system at the same time. In addition, TDengine also provides the function of tiered data storage, i.e. storage on different storage media according to the time stamps of data files. For example, the latest data is stored on SSD, the data older than a week is stored on local hard disk, and data older than four weeks is stored on network storage device. This reduces storage costs and ensures efficient data access. The movement of data on different storage media is automatically done by the system and is completely transparent to applications. Tiered storage of data is also configured through the system parameter “dataDir”.
dataDir format is as follows:
```
dataDir data_path [tier_level]
```
......@@ -270,6 +276,7 @@ For the data collected by device D1001, the number of records per hour is counte
TDengine creates a separate table for each data collection point, but in practical applications, it is often necessary to aggregate data from different data collection points. In order to perform aggregation operations efficiently, TDengine introduces the concept of STable (super table). STable is used to represent a specific type of data collection point. It is a table set containing multiple tables. The schema of each table in the set is the same, but each table has its own static tag. There can be multiple tags which can be added, deleted and modified at any time. Applications can aggregate or statistically operate on all or a subset of tables under a STABLE by specifying tag filters. This greatly simplifies the development of applications. The process is shown in the following figure:
![TDengine Database Diagram of multi-table aggregation query](multi_tables.webp)
<center> Figure 5: Diagram of multi-table aggregation query </center>
1. Application sends a query condition to system;
......@@ -279,9 +286,8 @@ TDengine creates a separate table for each data collection point, but in practic
5. Each vnode first finds the set of tables within its own node that meet the tag filters from memory, then scans the stored time-series data, completes corresponding aggregation calculations, and returns result to TAOSC;
6. TAOSC finally aggregates the results returned by multiple data nodes and send them back to application.
Since TDengine stores tag data and time-series data separately in vnode, by filtering tag data in memory, the set of tables that need to participate in aggregation operation is first found, which reduces the volume of data to be scanned and improves aggregation speed. At the same time, because the data is distributed in multiple vnodes/dnodes, the aggregation operation is carried out concurrently in multiple vnodes, which further improves the aggregation speed. Aggregation functions for ordinary tables and most operations are applicable to STables. The syntax is exactly the same. Please see TAOS SQL for details.
Since TDengine stores tag data and time-series data separately in vnode, by filtering tag data in memory, the set of tables that need to participate in aggregation operation is first found, which reduces the volume of data to be scanned and improves aggregation speed. At the same time, because the data is distributed in multiple vnodes/dnodes, the aggregation operation is carried out concurrently in multiple vnodes, which further improves the aggregation speed. Aggregation functions for ordinary tables and most operations are applicable to STables. The syntax is exactly the same. Please see TDengine SQL for details.
### Precomputation
In order to effectively improve the performance of query processing, based-on the unchangeable feature of IoT data, statistical information of data stored in data block is recorded in the head of data block, including max value, min value, and sum. We call it a precomputing unit. If the query processing involves all the data of a whole data block, the pre-calculated results are directly used, and no need to read the data block contents at all. Since the amount of pre-calculated data is much smaller than the actual size of data block stored on disk, for query processing with disk IO as bottleneck, the use of pre-calculated results can greatly reduce the pressure of reading IO and accelerate the query process. The precomputation mechanism is similar to the BRIN (Block Range Index) of PostgreSQL.
......@@ -41,7 +41,7 @@ USE power;
CREATE STABLE meters (ts timestamp, current float, voltage int, phase float) TAGS (location binary(64), groupId int);
```
与创建普通表一样,创建超级表时,需要提供表名(示例中为 meters),表结构 Schema,即数据列的定义。第一列必须为时间戳(示例中为 ts),其他列为采集的物理量(示例中为 current, voltage, phase),数据类型可以为整型、浮点型、字符串等。除此之外,还需要提供标签的 schema (示例中为 location, groupId),标签的数据类型可以为整型、浮点型、字符串等。采集点的静态属性往往可以作为标签,比如采集点的地理位置、设备型号、设备组 ID、管理员 ID 等等。标签的 schema 可以事后增加、删除、修改。具体定义以及细节请见 [TAOS SQL 的超级表管理](/taos-sql/stable) 章节。
与创建普通表一样,创建超级表时,需要提供表名(示例中为 meters),表结构 Schema,即数据列的定义。第一列必须为时间戳(示例中为 ts),其他列为采集的物理量(示例中为 current, voltage, phase),数据类型可以为整型、浮点型、字符串等。除此之外,还需要提供标签的 schema (示例中为 location, groupId),标签的数据类型可以为整型、浮点型、字符串等。采集点的静态属性往往可以作为标签,比如采集点的地理位置、设备型号、设备组 ID、管理员 ID 等等。标签的 schema 可以事后增加、删除、修改。具体定义以及细节请见 [TDengine SQL 的超级表管理](/taos-sql/stable) 章节。
每一种类型的数据采集点需要建立一个超级表,因此一个物联网系统,往往会有多个超级表。对于电网,我们就需要对智能电表、变压器、母线、开关等都建立一个超级表。在物联网中,一个设备就可能有多个数据采集点(比如一台风力发电的风机,有的采集点采集电流、电压等电参数,有的采集点采集温度、湿度、风向等环境参数),这个时候,对这一类型的设备,需要建立多张超级表。
......@@ -55,7 +55,7 @@ TDengine 对每个数据采集点需要独立建表。与标准的关系型数
CREATE TABLE d1001 USING meters TAGS ("California.SanFrancisco", 2);
```
其中 d1001 是表名,meters 是超级表的表名,后面紧跟标签 Location 的具体标签值 "California.SanFrancisco",标签 groupId 的具体标签值 2。虽然在创建表时,需要指定标签值,但可以事后修改。详细细则请见 [TAOS SQL 的表管理](/taos-sql/table) 章节。
其中 d1001 是表名,meters 是超级表的表名,后面紧跟标签 Location 的具体标签值 "California.SanFrancisco",标签 groupId 的具体标签值 2。虽然在创建表时,需要指定标签值,但可以事后修改。详细细则请见 [TDengine SQL 的表管理](/taos-sql/table) 章节。
TDengine 建议将数据采集点的全局唯一 ID 作为表名(比如设备序列号)。但对于有的场景,并没有唯一的 ID,可以将多个 ID 组合成一个唯一的 ID。不建议将具有唯一性的 ID 作为标签值。
......
......@@ -26,6 +26,7 @@ import PhpStmt from "./_php_stmt.mdx";
应用通过连接器执行 INSERT 语句来插入数据,用户还可以通过 TDengine CLI,手动输入 INSERT 语句插入数据。
### 一次写入一条
下面这条 INSERT 就将一条记录写入到表 d1001 中:
```sql
......@@ -48,7 +49,7 @@ TDengine 也支持一次向多个表写入数据,比如下面这条命令就
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31) (1538548695000, 12.6, 218, 0.33) d1002 VALUES (1538548696800, 12.3, 221, 0.31);
```
详细的 SQL INSERT 语法规则参考 [TAOS SQL 的数据写入](/taos-sql/insert)。
详细的 SQL INSERT 语法规则参考 [TDengine SQL 的数据写入](/taos-sql/insert)。
:::info
......@@ -134,4 +135,3 @@ TDengine 也提供了支持参数绑定的 Prepare API,与 MySQL 类似,这
<PhpStmt />
</TabItem>
</Tabs>
......@@ -44,7 +44,7 @@ Query OK, 2 row(s) in set (0.001100s)
为满足物联网场景的需求,TDengine 支持几个特殊的函数,比如 twa(时间加权平均),spread (最大值与最小值的差),last_row(最后一条记录)等,更多与物联网场景相关的函数将添加进来。
具体的查询语法请看 [TAOS SQL 的数据查询](../../taos-sql/select) 章节。
具体的查询语法请看 [TDengine SQL 的数据查询](../../taos-sql/select) 章节。
## 多表聚合查询
......@@ -75,7 +75,7 @@ taos> SELECT count(*), max(current) FROM meters where groupId = 2;
Query OK, 1 row(s) in set (0.002136s)
```
在 [TAOS SQL 的数据查询](../../taos-sql/select) 一章,查询类操作都会注明是否支持超级表。
在 [TDengine SQL 的数据查询](../../taos-sql/select) 一章,查询类操作都会注明是否支持超级表。
## 降采样查询、插值
......@@ -123,7 +123,7 @@ Query OK, 6 rows in database (0.005515s)
如果一个时间间隔里,没有采集的数据,TDengine 还提供插值计算的功能。
语法规则细节请见 [TAOS SQL 的按时间窗口切分聚合](../../taos-sql/distinguished) 章节。
语法规则细节请见 [TDengine SQL 的按时间窗口切分聚合](../../taos-sql/distinguished) 章节。
## 示例代码
......
---
title: TAOS SQL
description: "TAOS SQL 支持的语法规则、主要查询功能、支持的 SQL 查询函数,以及常用技巧等内容"
title: TDengine SQL
description: 'TDengine SQL 支持的语法规则、主要查询功能、支持的 SQL 查询函数,以及常用技巧等内容'
---
本文档说明 TAOS SQL 支持的语法规则、主要查询功能、支持的 SQL 查询函数,以及常用技巧等内容。阅读本文档需要读者具有基本的 SQL 语言的基础。TDengine 3.0 版本相比 2.x 版本做了大量改进和优化,特别是查询引擎进行了彻底的重构,因此 SQL 语法相比 2.x 版本有很多变更。详细的变更内容请见 [3.0 版本语法变更](/taos-sql/changes) 章节
本文档说明 TDengine SQL 支持的语法规则、主要查询功能、支持的 SQL 查询函数,以及常用技巧等内容。阅读本文档需要读者具有基本的 SQL 语言的基础。TDengine 3.0 版本相比 2.x 版本做了大量改进和优化,特别是查询引擎进行了彻底的重构,因此 SQL 语法相比 2.x 版本有很多变更。详细的变更内容请见 [3.0 版本语法变更](/taos-sql/changes) 章节
TAOS SQL 是用户对 TDengine 进行数据写入和查询的主要工具。TAOS SQL 提供标准的 SQL 语法,并针对时序数据和业务的特点优化和新增了许多语法和功能。TAOS SQL 语句的最大长度为 1M。TAOS SQL 不支持关键字的缩写,例如 DELETE 不能缩写为 DEL。
TDengine SQL 是用户对 TDengine 进行数据写入和查询的主要工具。TDengine SQL 提供标准的 SQL 语法,并针对时序数据和业务的特点优化和新增了许多语法和功能。TDengine SQL 语句的最大长度为 1M。TDengine SQL 不支持关键字的缩写,例如 DELETE 不能缩写为 DEL。
本章节 SQL 语法遵循如下约定:
......
......@@ -3,7 +3,7 @@ title: Schemaless 写入
description: 'Schemaless 写入方式,可以免于预先创建超级表/子表的步骤,随着数据写入接口能够自动创建与数据对应的存储结构'
---
在物联网应用中,常会采集比较多的数据项,用于实现智能控制、业务分析、设备监控等。由于应用逻辑的版本升级,或者设备自身的硬件调整等原因,数据采集项就有可能比较频繁地出现变动。为了在这种情况下方便地完成数据记录工作,TDengine提供调用 Schemaless 写入方式,可以免于预先创建超级表/子表的步骤,随着数据写入接口能够自动创建与数据对应的存储结构。并且在必要时,Schemaless
在物联网应用中,常会采集比较多的数据项,用于实现智能控制、业务分析、设备监控等。由于应用逻辑的版本升级,或者设备自身的硬件调整等原因,数据采集项就有可能比较频繁地出现变动。为了在这种情况下方便地完成数据记录工作,TDengine 提供调用 Schemaless 写入方式,可以免于预先创建超级表/子表的步骤,随着数据写入接口能够自动创建与数据对应的存储结构。并且在必要时,Schemaless
将自动增加必要的数据列,保证用户写入的数据可以被正确存储。
无模式写入方式建立的超级表及其对应的子表与通过 SQL 直接建立的超级表和子表完全没有区别,你也可以通过,SQL 语句直接向其中写入数据。需要注意的是,通过无模式写入方式建立的表,其表名是基于标签值按照固定的映射规则生成,所以无法明确地进行表意,缺乏可读性。
......@@ -36,14 +36,14 @@ tag_set 中的所有的数据自动转化为 nchar 数据类型,并不需要
- 对空格、等号(=)、逗号(,)、双引号("),前面需要使用反斜杠(\)进行转义。(都指的是英文半角符号)
- 数值类型将通过后缀来区分数据类型:
| **序号** | **后缀** | **映射类型** | **大小(字节)** |
| -------- | -------- | ------------ | -------------- |
| 1 | 无或 f64 | double | 8 |
| 2 | f32 | float | 4 |
| 3 | i8/u8 | TinyInt/UTinyInt | 1 |
| 4 | i16/u16 | SmallInt/USmallInt | 2 |
| 5 | i32/u32 | Int/UInt | 4 |
| 6 | i64/i/u64/u | BigInt/BigInt/UBigInt/UBigInt | 8 |
| **序号** | **后缀** | **映射类型** | **大小(字节)** |
| -------- | ----------- | ----------------------------- | -------------- |
| 1 | 无或 f64 | double | 8 |
| 2 | f32 | float | 4 |
| 3 | i8/u8 | TinyInt/UTinyInt | 1 |
| 4 | i16/u16 | SmallInt/USmallInt | 2 |
| 5 | i32/u32 | Int/UInt | 4 |
| 6 | i64/i/u64/u | BigInt/BigInt/UBigInt/UBigInt | 8 |
- t, T, true, True, TRUE, f, F, false, False 将直接作为 BOOL 型来处理。
......@@ -67,9 +67,9 @@ st,t1=3,t2=4,t3=t3 c1=3i64,c3="passit",c2=false,c4=4f64 1626006833639000000
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
需要注意的是,这里的 tag_key1, tag_key2 并不是用户输入的标签的原始顺序,而是使用了标签名称按照字符串升序排列后的结果。所以,tag_key1 并不是在行协议中输入的第一个标签。
排列完成以后计算该字符串的 MD5 散列值 "md5_val"。然后将计算的结果与字符串组合生成表名:“t_md5_val”。其中的 “t_” 是固定的前缀,每个通过该映射关系自动生成的表都具有该前缀。
为了让用户可以指定生成的表名,可以通过配置smlChildTableName来指定(比如 配置smlChildTableName=tname 插入数据为st,tname=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为cpu1,注意如果多行数据tname相同,但是后面的tag_set不同,则使用第一次自动建表时指定的tag_set,其他的会忽略)。
需要注意的是,这里的 tag*key1, tag_key2 并不是用户输入的标签的原始顺序,而是使用了标签名称按照字符串升序排列后的结果。所以,tag_key1 并不是在行协议中输入的第一个标签。
排列完成以后计算该字符串的 MD5 散列值 "md5_val"。然后将计算的结果与字符串组合生成表名:“t_md5_val”。其中的 “t*” 是固定的前缀,每个通过该映射关系自动生成的表都具有该前缀。
为了让用户可以指定生成的表名,可以通过配置 smlChildTableName 来指定(比如 配置 smlChildTableName=tname 插入数据为 st,tname=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为 cpu1,注意如果多行数据 tname 相同,但是后面的 tag_set 不同,则使用第一次自动建表时指定的 tag_set,其他的会忽略)。
2. 如果解析行协议获得的超级表不存在,则会创建这个超级表(不建议手动创建超级表,不然插入数据可能异常)。
3. 如果解析行协议获得子表不存在,则 Schemaless 会按照步骤 1 或 2 确定的子表名来创建子表。
......@@ -78,11 +78,11 @@ st,t1=3,t2=4,t3=t3 c1=3i64,c3="passit",c2=false,c4=4f64 1626006833639000000
NULL。
6. 对 BINARY 或 NCHAR 列,如果数据行中所提供值的长度超出了列类型的限制,自动增加该列允许存储的字符长度上限(只增不减),以保证数据的完整保存。
7. 整个处理过程中遇到的错误会中断写入过程,并返回错误代码。
8. 为了提高写入的效率,默认假设同一个超级表中field_set的顺序是一样的(第一条数据包含所有的field,后面的数据按照这个顺序),如果顺序不一样,需要配置参数smlDataFormat为false,否则,数据写入按照相同顺序写入,库中数据会异常。
8. 为了提高写入的效率,默认假设同一个超级表中 field_set 的顺序是一样的(第一条数据包含所有的 field,后面的数据按照这个顺序),如果顺序不一样,需要配置参数 smlDataFormat 为 false,否则,数据写入按照相同顺序写入,库中数据会异常。
:::tip
无模式所有的处理逻辑,仍会遵循 TDengine 对数据结构的底层限制,例如每行数据的总长度不能超过
16KB。这方面的具体限制约束请参见 [TAOS SQL 边界限制](/taos-sql/limit)
16KB。这方面的具体限制约束请参见 [TDengine SQL 边界限制](/taos-sql/limit)
:::
......
......@@ -288,7 +288,7 @@ TDengine 对每个数据采集点单独建表,但在实际应用中经常需
7. vnode 返回本节点的查询计算结果;
8. qnode 完成多节点数据聚合后将最终查询结果返回给客户端;
由于 TDengine 在 vnode 内将标签数据与时序数据分离存储,通过在内存里过滤标签数据,先找到需要参与聚合操作的表的集合,将需要扫描的数据集大幅减少,大幅提升聚合计算速度。同时,由于数据分布在多个 vnode/dnode,聚合计算操作在多个 vnode 里并发进行,又进一步提升了聚合的速度。 对普通表的聚合函数以及绝大部分操作都适用于超级表,语法完全一样,细节请看 TAOS SQL。
由于 TDengine 在 vnode 内将标签数据与时序数据分离存储,通过在内存里过滤标签数据,先找到需要参与聚合操作的表的集合,将需要扫描的数据集大幅减少,大幅提升聚合计算速度。同时,由于数据分布在多个 vnode/dnode,聚合计算操作在多个 vnode 里并发进行,又进一步提升了聚合的速度。 对普通表的聚合函数以及绝大部分操作都适用于超级表,语法完全一样,细节请看 TDengine SQL。
### 预计算
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册