@@ -13,7 +13,7 @@ TDengine greatly improves the efficiency of data ingestion, querying and storage
...
@@ -13,7 +13,7 @@ TDengine greatly improves the efficiency of data ingestion, querying and storage
If you are a developer, please read the [“Developer Guide”](./develop) carefully. This section introduces the database connection, data modeling, data ingestion, query, continuous query, cache, data subscription, user-defined functions, and other functionality in detail. Sample code is provided for a variety of programming languages. In most cases, you can just copy and paste the sample code, make a few changes to accommodate your application, and it will work.
If you are a developer, please read the [“Developer Guide”](./develop) carefully. This section introduces the database connection, data modeling, data ingestion, query, continuous query, cache, data subscription, user-defined functions, and other functionality in detail. Sample code is provided for a variety of programming languages. In most cases, you can just copy and paste the sample code, make a few changes to accommodate your application, and it will work.
We live in the era of big data, and scale-up is unable to meet the growing needs of business. Any modern data system must have the ability to scale out, and clustering has become an indispensable feature of big data systems. Not only did the TDengine team develop the cluster feature, but also decided to open source this important feature. To learn how to deploy, manage and maintain a TDengine cluster please refer to ["cluster"](./cluster).
We live in the era of big data, and scale-up is unable to meet the growing needs of business. Any modern data system must have the ability to scale out, and clustering has become an indispensable feature of big data systems. Not only did the TDengine team develop the cluster feature, but also decided to open source this important feature. To learn how to deploy, manage and maintain a TDengine cluster please refer to ["cluster deployment"](../deployment).
TDengine uses ubiquitious SQL as its query language, which greatly reduces learning costs and migration costs. In addition to the standard SQL, TDengine has extensions to better support time series data analysis. These extensions include functions such as roll up, interpolation and time weighted average, among many others. The ["SQL Reference"](./taos-sql) chapter describes the SQL syntax in detail, and lists the various supported commands and functions.
TDengine uses ubiquitious SQL as its query language, which greatly reduces learning costs and migration costs. In addition to the standard SQL, TDengine has extensions to better support time series data analysis. These extensions include functions such as roll up, interpolation and time weighted average, among many others. The ["SQL Reference"](./taos-sql) chapter describes the SQL syntax in detail, and lists the various supported commands and functions.
TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. Its code, including its cluster feature is open source under GNU AGPL v3.0. Besides the database engine, it provides [caching](/develop/cache), [stream processing](/develop/continuous-query), [data subscription](/develop/subscribe) and other functionalities to reduce the system complexity and cost of development and operation.
TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. Its code, including its cluster feature is open source under GNU AGPL v3.0. Besides the database engine, it provides [caching](../develop/cache), [stream processing](../develop/stream), [data subscription](../develop/tmq) and other functionalities to reduce the system complexity and cost of development and operation.
This section introduces the major features, competitive advantages, typical use-cases and benchmarks to help you get a high level overview of TDengine.
This section introduces the major features, competitive advantages, typical use-cases and benchmarks to help you get a high level overview of TDengine.
...
@@ -16,9 +16,9 @@ The major features are listed below:
...
@@ -16,9 +16,9 @@ The major features are listed below:
3. Support for [all kinds of queries](/develop/query-data), including aggregation, nested query, downsampling, interpolation and others.
3. Support for [all kinds of queries](/develop/query-data), including aggregation, nested query, downsampling, interpolation and others.
4. Support for [user defined functions](/develop/udf).
4. Support for [user defined functions](/develop/udf).
5. Support for [caching](/develop/cache). TDengine always saves the last data point in cache, so Redis is not needed in some scenarios.
5. Support for [caching](/develop/cache). TDengine always saves the last data point in cache, so Redis is not needed in some scenarios.
6. Support for [continuous query](/develop/continuous-query).
6. Support for [continuous query](../develop/stream).
7. Support for [data subscription](/develop/subscribe) with the capability to specify filter conditions.
7. Support for [data subscription](../develop/tmq) with the capability to specify filter conditions.
8. Support for [cluster](/cluster/), with the capability of increasing processing power by adding more nodes. High availability is supported by replication.
8. Support for [cluster](../deployment/), with the capability of increasing processing power by adding more nodes. High availability is supported by replication.
9. Provides an interactive [command-line interface](/reference/taos-shell) for management, maintenance and ad-hoc queries.
9. Provides an interactive [command-line interface](/reference/taos-shell) for management, maintenance and ad-hoc queries.
10. Provides many ways to [import](/operation/import) and [export](/operation/export) data.
10. Provides many ways to [import](/operation/import) and [export](/operation/export) data.
11. Provides [monitoring](/operation/monitor) on running instances of TDengine.
11. Provides [monitoring](/operation/monitor) on running instances of TDengine.
...
@@ -43,7 +43,7 @@ By making full use of [characteristics of time series data](https://tdengine.com
...
@@ -43,7 +43,7 @@ By making full use of [characteristics of time series data](https://tdengine.com
-**Easy Data Analytics**: Through super tables, storage and compute separation, data partitioning by time interval, pre-computation and other means, TDengine makes it easy to explore, format, and get access to data in a highly efficient way.
-**Easy Data Analytics**: Through super tables, storage and compute separation, data partitioning by time interval, pre-computation and other means, TDengine makes it easy to explore, format, and get access to data in a highly efficient way.
-**Open Source**: TDengine’s core modules, including cluster feature, are all available under open source licenses. It has gathered 18.8k stars on GitHub. There is an active developer community, and over 139k running instances worldwide.
-**Open Source**: TDengine’s core modules, including cluster feature, are all available under open source licenses. It has gathered over 19k stars on GitHub. There is an active developer community, and over 140k running instances worldwide.
With TDengine, the total cost of ownership of your time-series data platform can be greatly reduced. 1: With its superior performance, the computing and storage resources are reduced significantly;2: With SQL support, it can be seamlessly integrated with many third party tools, and learning costs/migration costs are reduced significantly;3: With its simplified solution and nearly zero management, the operation and maintenance costs are reduced significantly.
With TDengine, the total cost of ownership of your time-series data platform can be greatly reduced. 1: With its superior performance, the computing and storage resources are reduced significantly;2: With SQL support, it can be seamlessly integrated with many third party tools, and learning costs/migration costs are reduced significantly;3: With its simplified solution and nearly zero management, the operation and maintenance costs are reduced significantly.
@@ -31,17 +31,6 @@ You can now access TDengine or run other Linux commands.
...
@@ -31,17 +31,6 @@ You can now access TDengine or run other Linux commands.
Note: For information about installing docker, see the [official documentation](https://docs.docker.com/get-docker/).
Note: For information about installing docker, see the [official documentation](https://docs.docker.com/get-docker/).
## Open the TDengine CLI
On the container, run the following command to open the TDengine CLI:
```
$ taos
taos>
```
## Insert Data into TDengine
## Insert Data into TDengine
You can use the `taosBenchmark` tool included with TDengine to write test data into your deployment.
You can use the `taosBenchmark` tool included with TDengine to write test data into your deployment.
...
@@ -59,39 +48,51 @@ To do so, run the following command:
...
@@ -59,39 +48,51 @@ To do so, run the following command:
You can customize the test deployment that taosBenchmark creates by specifying command-line parameters. For information about command-line parameters, run the `taosBenchmark --help` command. For more information about taosBenchmark, see [taosBenchmark](/reference/taosbenchmark).
You can customize the test deployment that taosBenchmark creates by specifying command-line parameters. For information about command-line parameters, run the `taosBenchmark --help` command. For more information about taosBenchmark, see [taosBenchmark](/reference/taosbenchmark).
## Open the TDengine CLI
On the container, run the following command to open the TDengine CLI:
```
$ taos
taos>
```
## Query Data in TDengine
## Query Data in TDengine
After using taosBenchmark to create your test deployment, you can run queries in the TDengine CLI to test its performance. For example:
After using taosBenchmark to create your test deployment, you can run queries in the TDengine CLI to test its performance. For example:
Query the number of rows in the `meters` supertable:
From the TDengine CLI query the number of rows in the `meters` supertable:
```sql
```sql
taos>selectcount(*)fromtest.meters;
selectcount(*)fromtest.meters;
```
```
Query the average, maximum, and minimum values of all 100 million rows of data:
Query the average, maximum, and minimum values of all 100 million rows of data:
In the query above you are selecting the first timestamp (ts) in the interval, another way of selecting this would be _wstart which will give the start of the time window. For more information about windowed queries, see [Time-Series Extensions](../../taos-sql/distinguished/).
@@ -72,6 +72,9 @@ Users will be prompted to enter some configuration information when install.sh i
...
@@ -72,6 +72,9 @@ Users will be prompted to enter some configuration information when install.sh i
1. Download the Windows installation package.
1. Download the Windows installation package.
<PkgListV3type={3}/>
<PkgListV3type={3}/>
2. Run the downloaded package to install TDengine.
2. Run the downloaded package to install TDengine.
:::info
TDengine only supports Windows Server 2016/2019 and windows 10/11 system versions on the windows platform.
:::
</TabItem>
</TabItem>
<TabItemvalue="apt-get"label="apt-get">
<TabItemvalue="apt-get"label="apt-get">
...
@@ -172,6 +175,20 @@ After the installation is complete, run `C:\TDengine\taosd.exe` to start TDengin
...
@@ -172,6 +175,20 @@ After the installation is complete, run `C:\TDengine\taosd.exe` to start TDengin
</TabItem>
</TabItem>
</Tabs>
</Tabs>
## Test data insert performance
After your TDengine Server is running normally, you can run the taosBenchmark utility to test its performance:
```bash
taosBenchmark
```
This command creates the `meters` supertable in the `test` database. In the `meters` supertable, it then creates 10,000 subtables named `d0` to `d9999`. Each table has 10,000 rows and each row has four columns: `ts`, `current`, `voltage`, and `phase`. The timestamps of the data in these columns range from 2017-07-14 10:40:00 000 to 2017-07-14 10:40:09 999. Each table is randomly assigned a `groupId` tag from 1 to 10 and a `location` tag of either `Campbell`, `Cupertino`, `Los Angeles`, `Mountain View`, `Palo Alto`, `San Diego`, `San Francisco`, `San Jose`, `Santa Clara` or `Sunnyvale`.
The `taosBenchmark` command creates a deployment with 100 million data points that you can use for testing purposes. The time required to create the deployment depends on your hardware. On most modern servers, the deployment is created in less than a minute.
You can customize the test deployment that taosBenchmark creates by specifying command-line parameters. For information about command-line parameters, run the `taosBenchmark --help` command. For more information about taosBenchmark, see [taosBenchmark](../../reference/taosbenchmark).
## Command Line Interface
## Command Line Interface
You can use the TDengine CLI to monitor your TDengine deployment and execute ad hoc queries. To open the CLI, run the following command:
You can use the TDengine CLI to monitor your TDengine deployment and execute ad hoc queries. To open the CLI, run the following command:
...
@@ -203,51 +220,38 @@ Query OK, 2 row(s) in set (0.003128s)
...
@@ -203,51 +220,38 @@ Query OK, 2 row(s) in set (0.003128s)
```
```
You can also can monitor the deployment status, add and remove user accounts, and manage running instances. You can run the TDengine CLI on either Linux or Windows machines. For more information, see [TDengine CLI](../../reference/taos-shell/).
You can also can monitor the deployment status, add and remove user accounts, and manage running instances. You can run the TDengine CLI on either Linux or Windows machines. For more information, see [TDengine CLI](../../reference/taos-shell/).
## Test data insert performance
After your TDengine Server is running normally, you can run the taosBenchmark utility to test its performance:
```bash
taosBenchmark
```
This command creates the `meters` supertable in the `test` database. In the `meters` supertable, it then creates 10,000 subtables named `d0` to `d9999`. Each table has 10,000 rows and each row has four columns: `ts`, `current`, `voltage`, and `phase`. The timestamps of the data in these columns range from 2017-07-14 10:40:00 000 to 2017-07-14 10:40:09 999. Each table is randomly assigned a `groupId` tag from 1 to ten and a `location` tag of either `California.SanFrancisco` or `California.LosAngeles`.
The `taosBenchmark` command creates a deployment with 100 million data points that you can use for testing purposes. The time required to create the deployment depends on your hardware. On most modern servers, the deployment is created in less than a minute.
You can customize the test deployment that taosBenchmark creates by specifying command-line parameters. For information about command-line parameters, run the `taosBenchmark --help` command. For more information about taosBenchmark, see [taosBenchmark](../../reference/taosbenchmark).
## Test data query performance
## Test data query performance
After using taosBenchmark to create your test deployment, you can run queries in the TDengine CLI to test its performance:
After using taosBenchmark to create your test deployment, you can run queries in the TDengine CLI to test its performance:
Query the number of rows in the `meters` supertable:
From the TDengine CLI query the number of rows in the `meters` supertable:
```sql
```sql
taos>selectcount(*)fromtest.meters;
selectcount(*)fromtest.meters;
```
```
Query the average, maximum, and minimum values of all 100 million rows of data:
Query the average, maximum, and minimum values of all 100 million rows of data:
In the query above you are selecting the first timestamp (ts) in the interval, another way of selecting this would be _wstart which will give the start of the time window. For more information about windowed queries, see [Time-Series Extensions](../../taos-sql/distinguished/).
description: "Continuous query is a query that's executed automatically at a predefined frequency to provide aggregate query capability by time window. It is essentially simplified, time driven, stream computing."
title: "Continuous Query"
---
A continuous query is a query that's executed automatically at a predefined frequency to provide aggregate query capability by time window. It is essentially simplified, time driven, stream computing. A continuous query can be performed on a table or STable in TDengine. The results of a continuous query can be pushed to clients or written back to TDengine. Each query is executed on a time window, which moves forward with time. The size of time window and the forward sliding time need to be specified with parameter `INTERVAL` and `SLIDING` respectively.
A continuous query in TDengine is time driven, and can be defined using TAOS SQL directly without any extra operations. With a continuous query, the result can be generated based on a time window to achieve down sampling of the original data. Once a continuous query is defined using TAOS SQL, the query is automatically executed at the end of each time window and the result is pushed back to clients or written to TDengine.
There are some differences between continuous query in TDengine and time window computation in stream computing:
- The computation is performed and the result is returned in real time in stream computing, but the computation in continuous query is only started when a time window closes. For example, if the time window is 1 day, then the result will only be generated at 23:59:59.
- If a historical data row is written in to a time window for which the computation has already finished, the computation will not be performed again and the result will not be pushed to client applications again. If the results have already been written into TDengine, they will not be updated.
- In continuous query, if the result is pushed to a client, the client status is not cached on the server side and Exactly-once is not guaranteed by the server. If the client program crashes, a new time window will be generated from the time where the continuous query is restarted. If the result is written into TDengine, the data written into TDengine can be guaranteed as valid and continuous.
INTERVAL: The time window for which continuous query is performed
SLIDING: The time step for which the time window moves forward each time
## How to Use
In this section the use case of meters will be used to introduce how to use continuous query. Assume the STable and subtables have been created using the SQL statements below.
```sql
create table meters (ts timestamp, current float, voltage int, phase float) tags (location binary(64), groupId int);
create table D1001 using meters tags ("California.SanFrancisco", 2);
create table D1002 using meters tags ("California.LosAngeles", 2);
```
The SQL statement below retrieves the average voltage for a one minute time window, with each time window moving forward by 30 seconds.
```sql
select avg(voltage) from meters interval(1m) sliding(30s);
```
Whenever the above SQL statement is executed, all the existing data will be computed again. If the computation needs to be performed every 30 seconds automatically to compute on the data in the past one minute, the above SQL statement needs to be revised as below, in which `{startTime}` stands for the beginning timestamp in the latest time window.
```sql
select avg(voltage) from meters where ts > {startTime} interval(1m) sliding(30s);
```
An easier way to achieve this is to prepend `create table {tableName} as` before the `select`.
```sql
create table avg_vol as select avg(voltage) from meters interval(1m) sliding(30s);
```
A table named as `avg_vol` will be created automatically, then every 30 seconds the `select` statement will be executed automatically on the data in the past 1 minute, i.e. the latest time window, and the result is written into table `avg_vol`. The client program just needs to query from table `avg_vol`. For example:
Please note that the minimum allowed time window is 10 milliseconds, and there is no upper limit.
It's possible to specify the start and end time of a continuous query. If the start time is not specified, the timestamp of the first row will be considered as the start time; if the end time is not specified, the continuous query will be performed indefinitely, otherwise it will be terminated once the end time is reached. For example, the continuous query in the SQL statement below will be started from now and terminated one hour later.
```sql
create table avg_vol as select avg(voltage) from meters where ts > now and ts <= now + 1h interval(1m) sliding(30s);
```
`now` in the above SQL statement stands for the time when the continuous query is created, not the time when the computation is actually performed. To avoid the trouble caused by a delay in receiving data as much as possible, the actual computation in a continuous query is started after a little delay. That means, once a time window closes, the computation is not started immediately. Normally, the result are available after a little time, normally within one minute, after the time window closes.
## How to Manage
`show streams` command can be used in the TDengine CLI `taos` to show all the continuous queries in the system, and `kill stream` can be used to terminate a continuous query.
description: "Lightweight service for data subscription and publishing. Time series data inserted into TDengine continuously can be pushed automatically to subscribing clients."
title: Data Subscription
---
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
import Java from "./_sub_java.mdx";
import Python from "./_sub_python.mdx";
import Go from "./_sub_go.mdx";
import Rust from "./_sub_rust.mdx";
import Node from "./_sub_node.mdx";
import CSharp from "./_sub_cs.mdx";
import CDemo from "./_sub_c.mdx";
## Introduction
Due to the nature of time series data, data insertion into TDengine is similar to data publishing in message queues. Data is stored in ascending order of timestamp inside TDengine, and so each table in TDengine can essentially be considered as a message queue.
A lightweight service for data subscription and publishing is built into TDengine. With the API provided by TDengine, client programs can use `select` statements to subscribe to data from one or more tables. The subscription and state maintenance is performed on the client side. The client programs poll the server to check whether there is new data, and if so the new data will be pushed back to the client side. If the client program is restarted, where to start retrieving new data is up to the client side.
There are 3 major APIs related to subscription provided in the TDengine client driver.
```c
taos_subscribe
taos_consume
taos_unsubscribe
```
For more details about these APIs please refer to [C/C++ Connector](/reference/connector/cpp). Their usage will be introduced below using the use case of meters, in which the schema of STable and subtables from the previous section [Continuous Query](/develop/continuous-query) are used. Full sample code can be found [here](https://github.com/taosdata/TDengine/blob/master/examples/c/subscribe.c).
If we want to get a notification and take some actions if the current exceeds a threshold, like 10A, from some meters, there are two ways:
The first way is to query each sub table and record the last timestamp matching the criteria. Then after some time, query the data later than the recorded timestamp, and repeat this process. The SQL statements for this way are as below.
```sql
select * from D1001 where ts > {last_timestamp1} and current > 10;
select * from D1002 where ts > {last_timestamp2} and current > 10;
...
```
The above way works, but the problem is that the number of `select` statements increases with the number of meters. Additionally, the performance of both client side and server side will be unacceptable once the number of meters grows to a big enough number.
A better way is to query on the STable, only one `select` is enough regardless of the number of meters, like below:
```sql
select * from meters where ts > {last_timestamp} and current > 10;
```
However, this presents a new problem in how to choose `last_timestamp`. First, the timestamp when the data is generated is different from the timestamp when the data is inserted into the database, sometimes the difference between them may be very big. Second, the time when the data from different meters arrives at the database may be different too. If the timestamp of the "slowest" meter is used as `last_timestamp` in the query, the data from other meters may be selected repeatedly; but if the timestamp of the "fastest" meter is used as `last_timestamp`, some data from other meters may be missed.
All the problems mentioned above can be resolved easily using the subscription functionality provided by TDengine.
The first step is to create subscription using `taos_subscribe`.
```c
TAOS_SUB* tsub = NULL;
if (async) {
// create an asynchronous subscription, the callback function will be called every 1s
The subscription in TDengine can be either synchronous or asynchronous. In the above sample code, the value of variable `async` is determined from the CLI input, then it's used to create either an async or sync subscription. Sync subscription means the client program needs to invoke `taos_consume` to retrieve data, and async subscription means another thread created by `taos_subscribe` internally invokes `taos_consume` to retrieve data and pass the data to `subscribe_callback` for processing. `subscribe_callback` is a callback function provided by the client program. You should not perform time consuming operations in the callback function.
The parameter `taos` is an established connection. Nothing special needs to be done for thread safety for synchronous subscription. For asynchronous subscription, the taos_subscribe function should be called exclusively by the current thread, to avoid unpredictable errors.
The parameter `sql` is a `select` statement in which the `where` clause can be used to specify filter conditions. In our example, we can subscribe to the records in which the current exceeds 10A, with the following SQL statement:
```sql
select * from meters where current > 10;
```
Please note that, all the data will be processed because no start time is specified. If we only want to process data for the past day, a time related condition can be added:
```sql
select * from meters where ts > now - 1d and current > 10;
```
The parameter `topic` is the name of the subscription. The client application must guarantee that the name is unique. However, it doesn't have to be globally unique because subscription is implemented in the APIs on the client side.
If the subscription named as `topic` doesn't exist, the parameter `restart` will be ignored. If the subscription named as `topic` has been created before by the client program, when the client program is restarted with the subscription named `topic`, parameter `restart` is used to determine whether to retrieve data from the beginning or from the last point where the subscription was broken.
If the value of `restart` is **true** (i.e. a non-zero value), data will be retrieved from the beginning. If it is **false** (i.e. zero), the data already consumed before will not be processed again.
The last parameter of `taos_subscribe` is the polling interval in units of millisecond. In sync mode, if the time difference between two continuous invocations to `taos_consume` is smaller than the interval specified by `taos_subscribe`, `taos_consume` will be blocked until the interval is reached. In async mode, this interval is the minimum interval between two invocations to the call back function.
The second to last parameter of `taos_subscribe` is used to pass arguments to the call back function. `taos_subscribe` doesn't process this parameter and simply passes it to the call back function. This parameter is simply ignored in sync mode.
After a subscription is created, its data can be consumed and processed. Shown below is the sample code to consume data in sync mode, in the else condition of `if (async)`.
```c
if (async) {
getchar();
} else while(1) {
TAOS_RES* res = taos_consume(tsub);
if (res == NULL) {
printf("failed to consume data.");
break;
} else {
print_result(res, blockFetch);
getchar();
}
}
```
In the above sample code in the else condition, there is an infinite loop. Each time carriage return is entered `taos_consume` is invoked. The return value of `taos_consume` is the selected result set. In the above sample, `print_result` is used to simplify the printing of the result set. It is similar to `taos_use_result`. Below is the implementation of `print_result`.
```c
void print_result(TAOS_RES* res, int blockFetch) {
TAOS_ROW row = NULL;
int num_fields = taos_num_fields(res);
TAOS_FIELD* fields = taos_fetch_fields(res);
int nRows = 0;
if (blockFetch) {
nRows = taos_fetch_block(res, &row);
for (int i = 0; i < nRows; i++) {
char temp[256];
taos_print_row(temp, row + i, fields, num_fields);
puts(temp);
}
} else {
while ((row = taos_fetch_row(res))) {
char temp[256];
taos_print_row(temp, row, fields, num_fields);
puts(temp);
nRows++;
}
}
printf("%d rows consumed.\n", nRows);
}
```
In the above code `taos_print_row` is used to process the data consumed. All matching rows are printed.
In async mode, consuming data is simpler as shown below.
```c
void subscribe_callback(TAOS_SUB* tsub, TAOS_RES *res, void* param, int code) {
print_result(res, *(int*)param);
}
```
`taos_unsubscribe` can be invoked to terminate a subscription.
```c
taos_unsubscribe(tsub, keep);
```
The second parameter `keep` is used to specify whether to keep the subscription progress on the client sde. If it is **false**, i.e. **0**, then subscription will be restarted from beginning regardless of the `restart` parameter's value when `taos_subscribe` is invoked again. The subscription progress information is stored in _{DataDir}/subscribe/_ , under which there is a file with the same name as `topic` for each subscription(Note: The default value of `DataDir` in the `taos.cfg` file is **/var/lib/taos/**. However, **/var/lib/taos/** does not exist on the Windows server. So you need to change the `DataDir` value to the corresponding existing directory."), the subscription will be restarted from the beginning if the corresponding progress file is removed.
Now let's see the effect of the above sample code, assuming below prerequisites have been done.
- The sample code has been downloaded to local system
- TDengine has been installed and launched properly on same system
- The database, STable, and subtables required in the sample code are ready
Launch the command below in the directory where the sample code resides to compile and start the program.
```bash
make
./subscribe -sql='select * from meters where current > 10;'
```
After the program is started, open another terminal and launch TDengine CLI `taos`, then use the below SQL commands to insert a row whose current is 12A into table **D1001**.
```sql
use test;
insert into D1001 values(now, 12, 220, 1);
```
Then, this row of data will be shown by the example program on the first terminal because its current exceeds 10A. More data can be inserted for you to observe the output of the example program.
## Examples
The example program below demonstrates how to subscribe, using connectors, to data rows in which current exceeds 10A.
### Prepare Data
```bash
# create database "power"
taos> create database power;
# use "power" as the database in following operations
taos> use power;
# create super table "meters"
taos> create table meters(ts timestamp, current float, voltage int, phase int) tags(location binary(64), groupId int);
# create tabes using the schema defined by super table "meters"
taos> create table d1001 using meters tags ("California.SanFrancisco", 2);
taos> create table d1002 using meters tags ("California.LoSangeles", 2);
The FQDN of all hosts must be setup properly. All FQDNs need to be configured in the /etc/hosts file on each host. You must confirm that each FQDN can be accessed from any other host, you can do this by using the `ping` command.
The command `hostname -f` can be executed to get the hostname on any host. `ping <FQDN>` command can be executed on each host to check whether any other host is accessible from it. If any host is not accessible, the network configuration, like /etc/hosts or DNS configuration, needs to be checked and revised, to make any two hosts accessible to each other.
:::note
- The host where the client program runs also needs to be configured properly for FQDN, to make sure all hosts for client or server can be accessed from any other. In other words, the hosts where the client is running are also considered as a part of the cluster.
- Please ensure that your firewall rules do not block TCP/UDP on ports 6030-6042 on all hosts in the cluster.
:::
### Step 2
If any previous version of TDengine has been installed and configured on any host, the installation needs to be removed and the data needs to be cleaned up. For details about uninstalling please refer to [Install and Uninstall](/operation/pkg-install). To clean up the data, please use `rm -rf /var/lib/taos/\*` assuming the `dataDir` is configured as `/var/lib/taos`.
:::note
As a best practice, before cleaning up any data files or directories, please ensure that your data has been backed up correctly, if required by your data integrity, backup, security, or other standard operating protocols (SOP).
:::
### Step 3
Now it's time to install TDengine on all hosts but without starting `taosd`. Note that the versions on all hosts should be same. If you are prompted to input the existing TDengine cluster, simply press carriage return to ignore the prompt. `install.sh -e no` can also be used to disable this prompt. For details please refer to [Install and Uninstall](/operation/pkg-install).
### Step 4
Now each physical node (referred to, hereinafter, as `dnode` which is an abbreviation for "data node") of TDengine needs to be configured properly. Please note that one dnode doesn't stand for one host. Multiple TDengine dnodes can be started on a single host as long as they are configured properly without conflicting. More specifically each instance of the configuration file `taos.cfg` stands for a dnode. Assuming the first dnode of TDengine cluster is "h1.taosdata.com:6030", its `taos.cfg` is configured as following.
```c
// firstEp is the end point to connect to when any dnode starts
firstEph1.taosdata.com:6030
// must be configured to the FQDN of the host where the dnode is launched
fqdnh1.taosdata.com
// the port used by the dnode, default is 6030
serverPort6030
// only necessary when replica is configured to an even number
#arbitrator ha.taosdata.com:6042
```
`firstEp` and `fqdn` must be configured properly. In `taos.cfg` of all dnodes in TDengine cluster, `firstEp` must be configured to point to same address, i.e. the first dnode of the cluster. `fqdn` and `serverPort` compose the address of each node itself. If you want to start multiple TDengine dnodes on a single host, please make sure all other configurations like `dataDir`, `logDir`, and other resources related parameters are not conflicting.
For all the dnodes in a TDengine cluster, the below parameters must be configured exactly the same, any node whose configuration is different from dnodes already in the cluster can't join the cluster.
| 1 | statusInterval | The time interval for which dnode reports its status to mnode |
| 2 | timezone | Time Zone where the server is located |
| 3 | locale | Location code of the system |
| 4 | charset | Character set of the system |
## Start Cluster
In the following example we assume that first dnode has FQDN h1.taosdata.com and the second dnode has FQDN h2.taosdata.com.
### Start The First DNODE
Start the first dnode following the instructions in [Get Started](/get-started/). Then launch TDengine CLI `taos` and execute command `show dnodes`, the output is as following for example:
```
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
Server is Enterprise trial Edition, ver:3.0.0.0 and will never expire.
taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | note |
From the above output, it is shown that the end point of the started dnode is "h1.taosdata.com:6030", which is the `firstEp` of the cluster.
### Start Other DNODEs
There are a few steps necessary to add other dnodes in the cluster.
Let's assume we are starting the second dnode with FQDN, h2.taosdata.com. Firstly we make sure the configuration is correct.
```c
// firstEp is the end point to connect to when any dnode starts
firstEph1.taosdata.com:6030
// must be configured to the FQDN of the host where the dnode is launched
fqdnh2.taosdata.com
// the port used by the dnode, default is 6030
serverPort6030
```
Secondly, we can start `taosd` as instructed in [Get Started](/get-started/).
Then, on the first dnode i.e. h1.taosdata.com in our example, use TDengine CLI `taos` to execute the following command to add the end point of the dnode in the cluster. In the command "fqdn:port" should be quoted using double quotes.
```sql
CREATEDNODE"h2.taos.com:6030";
```
Then on the first dnode h1.taosdata.com, execute `show dnodes` in `taos` to show whether the second dnode has been added in the cluster successfully or not.
```sql
SHOWDNODES;
```
If the status of the newly added dnode is offline, please check:
- Whether the `taosd` process is running properly or not
- In the log file `taosdlog.0` to see whether the fqdn and port are correct
The above process can be repeated to add more dnodes in the cluster.
The previous section, [Deployment],(/cluster/deploy) showed you how to deploy and start a cluster from scratch. Once a cluster is ready, the status of dnode(s) in the cluster can be shown at any time. Dnodes can be managed from the TDengine CLI. New dnode(s) can be added to scale out the cluster, an existing dnode can be removed and you can even perform load balancing manually, if necessary.
:::note
All the commands introduced in this chapter must be run in the TDengine CLI - `taos`. Note that sometimes it is necessary to use root privilege.
:::
## Show DNODEs
The below command can be executed in TDengine CLI `taos` to list all dnodes in the cluster, including ID, end point (fqdn:port), status (ready, offline), number of vnodes, number of free vnodes and so on. We recommend executing this command after adding or removing a dnode.
```sql
SHOWDNODES;
```
Below is the example output of this command.
```
taos> show dnodes;
id | end_point | vnodes | cores | status | role | create_time | offline reason |
To utilize system resources efficiently and provide scalability, data sharding is required. The data of each database is divided into multiple shards and stored in multiple vnodes. These vnodes may be located on different dnodes. One way of scaling out is to add more vnodes on dnodes. Each vnode can only be used for a single DB, but one DB can have multiple vnodes. The allocation of vnode is scheduled automatically by mnode based on system resources of the dnodes.
Launch TDengine CLI `taos` and execute below command:
Launch TDengine CLI `taos` and execute the command below to add the end point of a new dnode into the EPI (end point) list of the cluster. "fqdn:port" must be quoted using double quotes.
```sql
CREATE DNODE "fqdn:port";
````
The example output is as below:
```
taos> create dnode "localhost:7030";
Query OK, 0 of 0 row(s) in database (0.008203s)
taos> show dnodes;
id | end_point | vnodes | cores | status | role | create_time | offline reason |
2 | localhost:7030 | 0 | 0 | offline | any | 2022-04-19 08:11:42.158 | status not received |
Query OK, 2 row(s) in set (0.001017s)
```
It can be seen that the status of the new dnode is "offline". Once the dnode is started and connects to the firstEp of the cluster, you can execute the command again and get the example output below. As can be seen, both dnodes are in "ready" status.
```
taos> show dnodes;
id | end_point | vnodes | cores | status | role | create_time | offline reason |
Launch TDengine CLI `taos` and execute the command below to drop or remove a dnode from the cluster. In the command, you can get `dnodeId` from `show dnodes`.
```sql
DROPDNODE"fqdn:port";
```
or
```sql
DROPDNODEdnodeId;
```
The example output is below:
```
taos> show dnodes;
id | end_point | vnodes | cores | status | role | create_time | offline reason |
In the above example, when `show dnodes` is executed the first time, two dnodes are shown. After `drop dnode 2` is executed, you can execute `show dnodes` again and it can be seen that only the dnode with ID 1 is still in the cluster.
:::note
- Once a dnode is dropped, it can't rejoin the cluster. To rejoin, the dnode needs to deployed again after cleaning up the data directory. Before dropping a dnode, the data belonging to the dnode MUST be migrated/backed up according to your data retention, data security or other SOPs.
- Please note that `drop dnode` is different from stopping `taosd` process. `drop dnode` just removes the dnode out of TDengine cluster. Only after a dnode is dropped, can the corresponding `taosd` process be stopped.
- Once a dnode is dropped, other dnodes in the cluster will be notified of the drop and will not accept the request from the dropped dnode.
- dnodeID is allocated automatically and can't be manually modified. dnodeID is generated in ascending order without duplication.
TDengine has a native distributed design and provides the ability to scale out. A few nodes can form a TDengine cluster. If you need higher processing power, you just need to add more nodes into the cluster. TDengine uses virtual node technology to virtualize a node into multiple virtual nodes to achieve load balancing. At the same time, TDengine can group virtual nodes on different nodes into virtual node groups, and use the replication mechanism to ensure the high availability of the system. The cluster feature of TDengine is completely open source.
This chapter mainly introduces cluster deployment, maintenance, and how to achieve high availability and load balancing.
```mdx-code-block
import DocCardList from '@theme/DocCardList';
import {useCurrentSidebarCategory} from '@docusaurus/theme-common';
Enter FQDN:port (like h1.taosdata.com:6030) of an existing TDengine cluster node to join
OR leave it blank to build one:
Enter your email address for priority support or enter empty to skip:
Created symlink from /etc/systemd/system/multi-user.target.wants/taosd.service to /etc/systemd/system/taosd.service.
To configure TDengine : edit /etc/taos/taos.cfg
To start TDengine : sudo systemctl start taosd
To access TDengine : taos -h centos7 to login into TDengine server
TDengine is installed successfully!
```
</TabItem>
<TabItemlabel="Install tar.gz"value="tarinst">
1. Download the tar.gz package, for example TDengine-server-3.0.0.0-Linux-x64.tar.gz;
2. In the directory where the package is located, first decompress the file, then switch to the sub-directory generated in decompressing, i.e. "TDengine-enterprise-server-3.0.0.0/" in this example, and execute the `install.sh` script.
```bash
$ tar xvzf TDengine-enterprise-server-3.0.0.0-Linux-x64.tar.gz
drwxrwxr-x 4 ubuntu ubuntu 4096 Feb 22 09:30 TDengine-enterprise-server-3.0.0.0/
-rw-rw-r-- 1 ubuntu ubuntu 44852544 Feb 22 09:31 TDengine-enterprise-server-3.0.0.0-Linux-x64.tar.gz
$ cd TDengine-enterprise-server-3.0.0.0/
$ ll
total 40784
drwxrwxr-x 4 ubuntu ubuntu 4096 Feb 22 09:30 ./
drwxrwxr-x 3 ubuntu ubuntu 4096 Feb 22 09:31 ../
drwxrwxr-x 2 ubuntu ubuntu 4096 Feb 22 09:30 driver/
drwxrwxr-x 10 ubuntu ubuntu 4096 Feb 22 09:30 examples/
-rwxrwxr-x 1 ubuntu ubuntu 33294 Feb 22 09:30 install.sh*
-rw-rw-r-- 1 ubuntu ubuntu 41704288 Feb 22 09:30 taos.tar.gz
$ sudo ./install.sh
Start to update TDengine...
Created symlink /etc/systemd/system/multi-user.target.wants/taosd.service → /etc/systemd/system/taosd.service.
Nginx for TDengine is updated successfully!
To configure TDengine : edit /etc/taos/taos.cfg
To configure Taos Adapter (if has) : edit /etc/taos/taosadapter.toml
To start TDengine : sudo systemctl start taosd
To access TDengine : use taos -h ubuntu-1804 in shell OR from http://127.0.0.1:6060
TDengine is updated successfully!
Install taoskeeper as a standalone service
taoskeeper is installed, enable it by `systemctl enable taoskeeper`
```
:::info
Users will be prompted to enter some configuration information when install.sh is executing. The interactive mode can be disabled by executing `./install.sh -e no`. `./install.sh -h` can show all parameters with detailed explanation.
:::
</TabItem>
</Tabs>
:::note
When installing on the first node in the cluster, at the "Enter FQDN:" prompt, nothing needs to be provided. When installing on subsequent nodes, at the "Enter FQDN:" prompt, you must enter the end point of the first dnode in the cluster if it is already up. You can also just ignore it and configure it later after installation is finished.
A system operator can use the TDengine CLI to show connections, ongoing queries, stream computing, and can close connections or stop ongoing query tasks or stream computing.
## Show Connections
```sql
SHOWCONNECTIONS;
```
One column of the output of the above SQL command is "ip:port", which is the end point of the client.
## Force Close Connections
```sql
KILLCONNECTION<connection-id>;
```
In the above SQL command, `connection-id` is from the first column of the output of `SHOW CONNECTIONS`.
## Show Ongoing Queries
```sql
SHOWQUERIES;
```
The first column of the output is query ID, which is composed of the corresponding connection ID and the sequence number of the current query task started on this connection. The format is "connection-id:query-no".
## Force Close Queries
```sql
KILLQUERY<query-id>;
```
In the above SQL command, `query-id` is from the first column of the output of `SHOW QUERIES `.
## Show Continuous Query
```sql
SHOWSTREAMS;
```
The first column of the output is stream ID, which is composed of the connection ID and the sequence number of the current stream started on this connection. The format is "connection-id:stream-no".
## Force Close Continuous Query
```sql
KILLSTREAM<stream-id>;
```
The above SQL command, `stream-id` is from the first column of the output of `SHOW STREAMS`.
@@ -419,11 +419,11 @@ Note that once the installation is complete, do not immediately start the `taosd
...
@@ -419,11 +419,11 @@ Note that once the installation is complete, do not immediately start the `taosd
To ensure that the system can obtain the necessary information for regular operation. Please set the following vital parameters correctly on the server:
To ensure that the system can obtain the necessary information for regular operation. Please set the following vital parameters correctly on the server:
FQDN, firstEp, secondEP, dataDir, logDir, tmpDir, serverPort. For the specific meaning and setting requirements of each parameter, please refer to the document "[TDengine Cluster Installation and Management](/cluster/)"
FQDN, firstEp, secondEP, dataDir, logDir, tmpDir, serverPort. For the specific meaning and setting requirements of each parameter, please refer to the document "[TDengine Cluster Deployment](../../deployment)"
Follow the same steps to set parameters on the other nodes, start the taosd service, and then add Dnodes to the cluster.
Follow the same steps to set parameters on the other nodes, start the taosd service, and then add Dnodes to the cluster.
Finally, start `taos` and execute the `show dnodes` command. If you can see all the nodes that have joined the cluster, the cluster building process was successfully completed. For specific operation procedures and precautions, please refer to the document "[TDengine Cluster Installation and Management](/cluster/)".
Finally, start `taos` and execute the `show dnodes` command. If you can see all the nodes that have joined the cluster, the cluster building process was successfully completed. For specific operation procedures and precautions, please refer to the document "[TDengine Cluster Deployment](../../deployment)".
## Appendix 4: Super Table Names
## Appendix 4: Super Table Names
...
@@ -431,5 +431,5 @@ Since OpenTSDB's metric name has a dot (".") in it, for example, a metric with a
...
@@ -431,5 +431,5 @@ Since OpenTSDB's metric name has a dot (".") in it, for example, a metric with a
## Appendix 5: Reference Articles
## Appendix 5: Reference Articles
1.[Using TDengine + collectd/StatsD + Grafana to quickly build an IT operation and maintenance monitoring system](/application/collectd/)
1.[Using TDengine + collectd/StatsD + Grafana to quickly build an IT operation and maintenance monitoring system](../collectd/)
2.[Write collected data directly to TDengine through collectd](/third-party/collectd/)
2.[Write collected data directly to TDengine through collectd](../collectd/)
@@ -74,6 +74,7 @@ sql explain analyze verbose true select ts from tb1 where f1 > 0;
...
@@ -74,6 +74,7 @@ sql explain analyze verbose true select ts from tb1 where f1 > 0;
sql explain analyze verbose true select f1 from st1 where f1 > 0 and ts > '2020-10-31 00:00:00' and ts < '2021-10-31 00:00:00';
sql explain analyze verbose true select f1 from st1 where f1 > 0 and ts > '2020-10-31 00:00:00' and ts < '2021-10-31 00:00:00';
sql explain analyze verbose true select * from information_schema.ins_stables where db_name='db2';
sql explain analyze verbose true select * from information_schema.ins_stables where db_name='db2';
sql explain analyze verbose true select * from (select min(f1),count(*) a from st1 where f1 > 0) where a < 0;
sql explain analyze verbose true select * from (select min(f1),count(*) a from st1 where f1 > 0) where a < 0;
sql explain analyze verbose true select count(f1) from st1 group by tbname;
#not pass case
#not pass case
#sql explain verbose true select count(*),sum(f1) as aa from tb1 where (f1 > 0 or f1 < -1) and ts > '2020-10-31 00:00:00' and ts < '2021-10-31 00:00:00' order by aa;
#sql explain verbose true select count(*),sum(f1) as aa from tb1 where (f1 > 0 or f1 < -1) and ts > '2020-10-31 00:00:00' and ts < '2021-10-31 00:00:00' order by aa;