@@ -52,14 +52,14 @@ For more details about `INSERT` please refer to [INSERT](/taos-sql/insert).
:::info
- Inserting in batch can gain better performance. Normally, the higher the batch size, the better the performance. Please be noted each single row can't exceed 16K bytes and each single SQL statement can't exceed 1M bytes.
- Inserting with multiple threads can gain better performance too. However, depending on the system resources on the application side and the server side, with the number of inserting threads grows to a specific point, the performance may drop instead of growing. The proper number of threads need to be tested in a specific environment to find the best number.
- Inserting in batches can improve performance. Normally, the higher the batch size, the better the performance. Please note that a single row can't exceed 16K bytes and each SQL statement can't exceed 1MB.
- Inserting with multiple threads can also improve performance. However, depending on the system resources on the application side and the server side, when the number of inserting threads grows beyond a specific point the performance may drop instead of improving. The proper number of threads needs to be tested in a specific environment to find the best number.
:::
:::warning
- If the timestamp for the row to be inserted already exists in the table, the behavior depends on the value of parameter `UPDATE`. If it's set to 0 (also the default value), the row will be discarded. If it's set to 1, the new values will override the old values for the same row.
- If the timestamp for the row to be inserted already exists in the table, the behavior depends on the value of parameter `UPDATE`. If it's set to 0 (the default value), the row will be discarded. If it's set to 1, the new values will override the old values for the same row.
- The timestamp to be inserted must be newer than the timestamp of subtracting current time by the parameter `KEEP`. If `KEEP` is set to 3650 days, then the data older than 3650 days ago can't be inserted. The timestamp to be inserted can't be newer than the timestamp of current time plus parameter `DAYS`. If `DAYS` is set to 2, the data newer than 2 days later can't be inserted.
:::
...
...
@@ -95,13 +95,13 @@ For more details about `INSERT` please refer to [INSERT](/taos-sql/insert).
:::note
1. With either native connection or REST connection, the above samples can work well.
2. Please be noted that `use db` can't be used with REST connection because REST connection is stateless, so in the samples `dbName.tbName` is used to specify the table name.
2. Please note that `use db` can't be used with a REST connection because REST connections are stateless, so in the samples `dbName.tbName` is used to specify the table name.
:::
### Insert with Parameter Binding
TDengine also provides Prepare API that support parameter binding. Similar to MySQL, only `?` can be used in these APIs to represent the parameters to bind. From version 2.1.1.0 and 2.1.2.0, parameter binding support for inserting data has been improved significantly to improve the insert performance by avoiding the cost of parsing SQL statements.
TDengine also provides API support for parameter binding. Similar to MySQL, only `?` can be used in these APIs to represent the parameters to bind. From version 2.1.1.0 and 2.1.2.0, parameter binding support for inserting data has improved significantly to improve the insert performance by avoiding the cost of parsing SQL statements.
Parameter binding is available only with native connection.
@@ -15,16 +15,16 @@ import CTelnet from "./_c_opts_telnet.mdx";
## Introduction
A single line of text is used in OpenTSDB line protocol to represent one row of data. OpenTSDB employs single column data model, so one line can only contains single data column. There can be multiple tags. Each line contains 4 parts as below:
A single line of text is used in OpenTSDB line protocol to represent one row of data. OpenTSDB employs single column data model, so one line can only contain a single data column. There can be multiple tags. Each line contains 4 parts as below:
- `timestamp` is the timestamp of current row of data. The time precision will be determined automatically based on the length of the timestamp. second and millisecond time precision are supported.\
- `metric` will be used as the STable name.
- `timestamp` is the timestamp of current row of data. The time precision will be determined automatically based on the length of the timestamp. Second and millisecond time precision are supported.
- `value` is a metric which must be a numeric value, the corresponding column name is "value".
- The last part is tag sets separated by space, all tags will be converted to nchar type automatically.
- The last part is the tag set separated by spaces, all tags will be converted to nchar type automatically.
TDengine supports multiple protocols of inserting data, including SQL, InfluxDB Line protocol, OpenTSDB Telnet protocol, OpenTSDB JSON protocol. Data can be inserted row by row, or in batch. Data from one or more collecting points can be inserted simultaneously. In the meantime, data can be inserted with multiple threads, out of order data and historical data can be inserted too. InfluxDB Line protocol, OpenTSDB Telnet protocol and OpenTSDB JSON protocol are the 3 kinds of schemaless insert protocols supported by TDengine. It's not necessary to create stable and table in advance if using schemaless protocols, and the schemas can be adjusted automatically according to the data to be inserted.
TDengine supports multiple protocols of inserting data, including SQL, InfluxDB Line protocol, OpenTSDB Telnet protocol, and OpenTSDB JSON protocol. Data can be inserted row by row, or in batches. Data from one or more collection points can be inserted simultaneously. Data can be inserted with multiple threads, and out of order data and historical data can be inserted as well. InfluxDB Line protocol, OpenTSDB Telnet protocol and OpenTSDB JSON protocol are the 3 kinds of schemaless insert protocols supported by TDengine. It's not necessary to create STables and tables in advance if using schemaless protocols, and the schemas can be adjusted automatically based on the data being inserted.
```mdx-code-block
import DocCardList from '@theme/DocCardList';
import {useCurrentSidebarCategory} from '@docusaurus/theme-common';
@@ -20,7 +20,7 @@ import CAsync from "./_c_async.mdx";
## Introduction
SQL is used by TDengine as the query language. Application programs can send SQL statements to TDengine through REST API or connectors. TDengine CLI `taos` can also be used to execute SQL Ad-Hoc query. Here is the list of major query functionalities supported by TDengine:
SQL is used by TDengine as the query language. Application programs can send SQL statements to TDengine through REST API or connectors. TDengine CLI `taos` can also be used to execute SQL Ad-Hoc queries. Here is the list of major query functionalities supported by TDengine:
- Query on single column or multiple columns
- Filter on tags or data columns:>, <, =, <\>, like
...
...
@@ -31,7 +31,7 @@ SQL is used by TDengine as the query language. Application programs can send SQL
For example, below SQL statement can be executed in TDengine CLI `taos` to select the rows whose voltage column is bigger than 215 and limit the output to only 2 rows.
For example, the SQL statement below can be executed in TDengine CLI `taos` to select the rows whose voltage column is bigger than 215 and limit the output to only 2 rows.
```sql
select * from d1001 where voltage > 215 order by ts desc limit 2;
...
...
@@ -46,15 +46,15 @@ taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
Query OK, 2 row(s) in set (0.001100s)
```
To meet the requirements in many use cases, some special functions have been added in TDengine, for example `twa` (Time Weighted Average), `spared` (The difference between the maximum and the minimum), `last_row` (the last row), more and more functions will be added to better perform in many use cases. Furthermore, continuous query is also supported in TDengine.
To meet the requirements of many use cases, some special functions have been added in TDengine, for example `twa` (Time Weighted Average), `spared` (The difference between the maximum and the minimum), and `last_row` (the last row). Furthermore, continuous query is also supported in TDengine.
For detailed query syntax please refer to [Select](/taos-sql/select).
## Aggregation among Tables
In many use cases, there are always multiple kinds of data collection points. A new concept, called STable (abbreviated for super table), is used in TDengine to represent a kind of data collection points, and a table is used to represent a specific data collection point. Tags are used by TDengine to represent the static properties of data collection points. A specific data collection point has its own values for static properties. By specifying filter conditions on tags, aggregation can be performed efficiently among all the subtables created via the same STable, i.e. same kind of data collection points, can be. Aggregate functions applicable for tables can be used directly on STables, syntax is exactly same.
In many use cases, there are always multiple kinds of data collection points. A new concept, called STable (abbreviated for super table), is used in TDengine to represent a kind of data collection point, and a subtable is used to represent a specific data collection point. Tags are used by TDengine to represent the static properties of data collection points. A specific data collection point has its own values for static properties. By specifying filter conditions on tags, aggregation can be performed efficiently among all the subtables created via the same STable, i.e. same kind of data collection points. Aggregate functions applicable for tables can be used directly on STables, the syntax is exactly the same.
In summary, for a STable, its subtables can be aggregated by a simple query on STable, it's kind of join operation. But tables belong to different STables could not be aggregated.
In summary, for a STable, its subtables can be aggregated by a simple query on the STable, it's a kind of join operation. But tables belong to different STables can not be aggregated.
### Example 1
...
...
@@ -81,11 +81,11 @@ taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now -
Query OK, 1 row(s) in set (0.002136s)
```
Join query is allowed between only the tables of same STable. In [Select](/taos-sql/select), all query operations are marked as whether it supports STable or not.
Join queries are only allowed between the subtables of the same STable. In [Select](/taos-sql/select), all query operations are marked as to whether they supports STables or not.
## Down Sampling and Interpolation
In IoT use cases, down sampling is widely used to aggregate the data by time range. `INTERVAL` keyword in TDengine can be used to simplify the query by time window. For example, below SQL statement can be used to get the sum of current every 10 seconds from meters table d1001.
In IoT use cases, down sampling is widely used to aggregate the data by time range. The `INTERVAL` keyword in TDengine can be used to simplify the query by time window. For example, the SQL statement below can be used to get the sum of current every 10 seconds from meters table d1001.
```
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
...
...
@@ -96,7 +96,7 @@ taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
Query OK, 2 row(s) in set (0.000883s)
```
Down sampling can also be used for STable. For example, below SQL statement can be used to get the sum of current from all meters in BeiJing.
Down sampling can also be used for STable. For example, the below SQL statement can be used to get the sum of current from all meters in BeiJing.
```
taos> SELECT SUM(current) FROM meters where location like "Beijing%" INTERVAL(1s);
...
...
@@ -110,7 +110,7 @@ taos> SELECT SUM(current) FROM meters where location like "Beijing%" INTERVAL(1s
Query OK, 5 row(s) in set (0.001538s)
```
Down sampling also supports time offset. For example, below SQL statement can be used to get the sum of current from all meters but each time window must start at the boundary of 500 milliseconds.
Down sampling also supports time offset. For example, the below SQL statement can be used to get the sum of current from all meters but each time window must start at the boundary of 500 milliseconds.
```
taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
In many use cases, it's hard to align the timestamp of the data collected by each collection point. However, a lot of algorithms like FFT require the data to be aligned with same time interval and application programs have to handle by themselves in many systems. In TDengine, it's easy to achieve the alignment using down sampling.
In many use cases, it's hard to align the timestamp of the data collected by each collection point. However, a lot of algorithms like FFT require the data to be aligned with same time interval and application programs have to handle this by themselves. In TDengine, it's easy to achieve the alignment using down sampling.
Interpolation can be performed in TDengine if there is no data in a time range.
...
...
@@ -162,16 +162,16 @@ In the section describing [Insert](/develop/insert-data/sql-writing), a database
:::note
1. With either REST connection or native connection, the above sample code work well.
2. Please be noted that `use db` can't be used in case of REST connection because it's stateless.
1. With either REST connection or native connection, the above sample code works well.
2. Please note that `use db` can't be used in case of REST connection because it's stateless.
:::
### Asynchronous Query
Besides synchronous query, asynchronous query API is also provided by TDengine to insert or query data more efficiently. With similar hardware and software environment, async API is 2~4 times faster than sync APIs. Async API works in non-blocking mode, which means an operation can be returned without finishing so that the calling thread can switch to other works to improve the performance of the whole application system. Async APIs perform especially better in case of poor network.
Besides synchronous queries, an asynchronous query API is also provided by TDengine to insert or query data more efficiently. With a similar hardware and software environment, the async API is 2~4 times faster than sync APIs. Async API works in non-blocking mode, which means an operation can be returned without finishing so that the calling thread can switch to other works to improve the performance of the whole application system. Async APIs perform especially better in the case of poor networks.
Please be noted that async query can only be used with native connection.
Please note that async query can only be used with a native connection.