未验证 提交 52d33843 编写于 作者: H haojun Liao 提交者: GitHub

Merge branch 'develop' into feature/query

......@@ -131,10 +131,10 @@ cmake .. -DCPUTYPE=mips64 && cmake --build .
### On Windows platform
If you use the Visual Studio 2013, please open a command window by executing "cmd.exe".
Please specify "x86_amd64" for 64 bits Windows or specify "x86" is for 32 bits Windows when you execute vcvarsall.bat.
Please specify "amd64" for 64 bits Windows or specify "x86" is for 32 bits Windows when you execute vcvarsall.bat.
```cmd
mkdir debug && cd debug
"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" < x86_amd64 | x86 >
"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" < amd64 | x86 >
cmake .. -G "NMake Makefiles"
nmake
```
......
......@@ -122,10 +122,14 @@ IF (TD_LINUX)
ADD_DEFINITIONS(-D_TD_NINGSI_60)
MESSAGE(STATUS "set ningsi macro to true")
ENDIF ()
SET(DEBUG_FLAGS "-O0 -g3 -DDEBUG")
IF (TD_MEMORY_SANITIZER)
SET(DEBUG_FLAGS "-fsanitize=address -fsanitize=undefined -fno-sanitize-recover=all -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow -fno-sanitize=null -fno-sanitize=alignment -static-libasan -O0 -g3 -DDEBUG")
ELSE ()
SET(DEBUG_FLAGS "-O0 -g3 -DDEBUG")
ENDIF ()
SET(RELEASE_FLAGS "-O3 -Wno-error")
IF (${COVER} MATCHES "true")
MESSAGE(STATUS "Test coverage mode, add extra flags")
SET(GCC_COVERAGE_COMPILE_FLAGS "-fprofile-arcs -ftest-coverage")
......@@ -144,7 +148,11 @@ IF (TD_DARWIN_64)
ADD_DEFINITIONS(-DUSE_LIBICONV)
MESSAGE(STATUS "darwin64 is defined")
SET(COMMON_FLAGS "-std=gnu99 -Wall -Werror -Wno-missing-braces -fPIC -msse4.2 -D_FILE_OFFSET_BITS=64 -D_LARGE_FILE")
SET(DEBUG_FLAGS "-O0 -g3 -DDEBUG")
IF (TD_MEMORY_SANITIZER)
SET(DEBUG_FLAGS "-fsanitize=address -fsanitize=undefined -fno-sanitize-recover=all -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow -fno-sanitize=null -fno-sanitize=alignment -O0 -g3 -DDEBUG")
ELSE ()
SET(DEBUG_FLAGS "-O0 -g3 -DDEBUG")
ENDIF ()
SET(RELEASE_FLAGS "-Og")
INCLUDE_DIRECTORIES(${TD_COMMUNITY_DIR}/deps/cJson/inc)
INCLUDE_DIRECTORIES(${TD_COMMUNITY_DIR}/deps/lz4/inc)
......@@ -162,7 +170,14 @@ IF (TD_WINDOWS)
IF (MSVC AND (MSVC_VERSION GREATER_EQUAL 1900))
SET(COMMON_FLAGS "${COMMON_FLAGS} /Wv:18")
ENDIF ()
SET(DEBUG_FLAGS "/fsanitize=thread /fsanitize=leak /fsanitize=memory /fsanitize=undefined /fsanitize=hwaddress /Zi /W3 /GL")
IF (TD_MEMORY_SANITIZER)
MESSAGE("memory sanitizer detected as true")
SET(DEBUG_FLAGS "/fsanitize=address /Zi /W3 /GL")
ELSE ()
MESSAGE("memory sanitizer detected as false")
SET(DEBUG_FLAGS "/Zi /W3 /GL")
ENDIF ()
SET(RELEASE_FLAGS "/W0 /O3 /GL")
ENDIF ()
......@@ -171,7 +186,7 @@ IF (TD_WINDOWS)
INCLUDE_DIRECTORIES(${TD_COMMUNITY_DIR}/deps/regex)
INCLUDE_DIRECTORIES(${TD_COMMUNITY_DIR}/deps/wepoll/inc)
INCLUDE_DIRECTORIES(${TD_COMMUNITY_DIR}/deps/MsvcLibX/include)
ENDIF ()
ENDIF ()
IF (TD_WINDOWS_64)
ADD_DEFINITIONS(-D_M_X64)
......
......@@ -83,3 +83,8 @@ SET(TD_BUILD_JDBC TRUE)
IF (${BUILD_JDBC} MATCHES "false")
SET(TD_BUILD_JDBC FALSE)
ENDIF ()
SET(TD_MEMORY_SANITIZER FALSE)
IF (${MEMORY_SANITIZER} MATCHES "true")
SET(TD_MEMORY_SANITIZER TRUE)
ENDIF ()
......@@ -15,6 +15,7 @@ TDengine是一个高效的存储、查询、分析时序大数据的平台,专
* [命令行程序TAOS](/getting-started#console):访问TDengine的简便方式
* [极速体验](/getting-started#demo):运行示例程序,快速体验高效的数据插入、查询
* [支持平台列表](/getting-started#platforms):TDengine服务器和客户端支持的平台列表
* [Kubernetes部署](https://taosdata.github.io/TDengine-Operator/zh/index.html):TDengine在Kubernetes环境进行部署的详细说明
## [整体架构](/architecture)
......
......@@ -176,7 +176,7 @@ TDengine 分布式架构的逻辑结构图如下:
**通讯方式:**TDengine系统的各个数据节点之间,以及应用驱动与各数据节点之间的通讯是通过TCP/UDP进行的。因为考虑到物联网场景,数据写入的包一般不大,因此TDengine 除采用TCP做传输之外,还采用UDP方式,因为UDP 更加高效,而且不受连接数的限制。TDengine实现了自己的超时、重传、确认等机制,以确保UDP的可靠传输。对于数据量不到15K的数据包,采取UDP的方式进行传输,超过15K的,或者是查询类的操作,自动采取TCP的方式进行传输。同时,TDengine根据配置和数据包,会自动对数据进行压缩/解压缩,数字签名/认证等处理。对于数据节点之间的数据复制,只采用TCP方式进行数据传输。
**FQDN配置**:一个数据节点有一个或多个FQDN,可以在系统配置文件taos.cfg通过参数“fqdn"进行指定,如果没有指定,系统将自动获取计算机的hostname作为其FQDN。如果节点没有配置FQDN,可以直接将该节点的配置参数fqdn设置为它的IP地址。但不建议使用IP,因为IP地址可变,一旦变化,将让集群无法正常工作。一个数据节点的EP(End Point)由FQDN + Port组成。采用FQDN,需要保证DNS服务正常工作,或者在节点以及应用所在的节点配置好hosts文件。
**FQDN配置**:一个数据节点有一个或多个FQDN,可以在系统配置文件taos.cfg通过参数“fqdn"进行指定,如果没有指定,系统将自动获取计算机的hostname作为其FQDN。如果节点没有配置FQDN,可以直接将该节点的配置参数fqdn设置为它的IP地址。但不建议使用IP,因为IP地址可变,一旦变化,将让集群无法正常工作。一个数据节点的EP(End Point)由FQDN + Port组成。采用FQDN,需要保证DNS服务正常工作,或者在节点以及应用所在的节点配置好hosts文件。另外,这个参数值的长度需要控制在 96 个字符以内。
**端口配置:**一个数据节点对外的端口由TDengine的系统配置参数serverPort决定,对集群内部通讯的端口是serverPort+5。集群内数据节点之间的数据复制操作还占有一个TCP端口,是serverPort+10. 为支持多线程高效的处理UDP数据,每个对内和对外的UDP连接,都需要占用5个连续的端口。因此一个数据节点总的端口范围为serverPort到serverPort + 10,总共11个TCP/UDP端口。(另外还可能有 RESTful、Arbitrator 所使用的端口,那样的话就一共是 13 个。)使用时,需要确保防火墙将这些端口打开,以备使用。每个数据节点可以配置不同的serverPort。(详细的端口情况请参见 [TDengine 2.0 端口说明](https://www.taosdata.com/cn/documentation/faq#port)
......
......@@ -325,10 +325,12 @@ for (int i = 0; i < numOfRows; i++){
}
s.setString(2, s2, 10);
// AddBatch 之后,可以再设定新的表名、TAGS、VALUES 取值,这样就能实现一次执行向多个数据表写入
// AddBatch 之后,缓存并未清空。为避免混乱,并不推荐在 ExecuteBatch 之前再次绑定新一批的数据
s.columnDataAddBatch();
// 执行语句:
// 执行绑定数据后的语句:
s.columnDataExecuteBatch();
// 执行语句后清空缓存。在清空之后,可以复用当前的对象,绑定新的一批数据(可以是新表名、新 TAGS 值、新 VALUES 值):
s.columnDataClearBatch();
// 执行完毕,释放资源:
s.columnDataCloseBatch();
```
......
......@@ -307,6 +307,8 @@ TDengine的异步API均采用非阻塞调用模式。应用程序可以用多线
8. 调用 `taos_stmt_execute` 执行已经准备好的批处理指令;
9. 执行完毕,调用 `taos_stmt_close` 释放所有资源。
说明:如果 `taos_stmt_execute` 执行成功,假如不需要改变 SQL 语句的话,那么是可以复用 `taos_stmt_prepare` 的解析结果,直接进行第 3~6 步绑定新数据的。但如果执行出错,那么并不建议继续在当前的环境上下文下继续工作,而是建议释放资源,然后从 `taos_stmt_init` 步骤重新开始。
除 C/C++ 语言外,TDengine 的 Java 语言 JNI Connector 也提供参数绑定接口支持,具体请另外参见:[参数绑定接口的 Java 用法](https://www.taosdata.com/cn/documentation/connector/java#stmt-java)
接口相关的具体函数如下(也可以参考 [apitest.c](https://github.com/taosdata/TDengine/blob/develop/tests/examples/c/apitest.c) 文件中使用对应函数的方式):
......@@ -378,6 +380,11 @@ typedef struct TAOS_MULTI_BIND {
执行完毕,释放所有资源。
- `char * taos_stmt_errstr(TAOS_STMT *stmt)`
(2.1.3.0 版本新增)
用于在其他 stmt API 返回错误(返回错误码或空指针)时获取错误信息。
### 连续查询接口
TDengine提供时间驱动的实时流式计算API。可以每隔一指定的时间段,对一张或多张数据库的表(数据流)进行各种实时聚合计算操作。操作简单,仅有打开、关闭流的API。具体如下:
......@@ -892,7 +899,7 @@ go env -w GOPROXY=https://goproxy.io,direct
Node.js连接器支持的系统有:
| **CPU类型** | x64(64bit) | | | aarch64 | aarch32 |
|**CPU类型** | x64(64bit) | | | aarch64 | aarch32 |
| ------------ | ------------ | -------- | -------- | -------- | -------- |
| **OS类型** | Linux | Win64 | Win32 | Linux | Linux |
| **支持与否** | **支持** | **支持** | **支持** | **支持** | **支持** |
......
......@@ -99,7 +99,7 @@ taosd -C
下面仅仅列出一些重要的配置参数,更多的参数请看配置文件里的说明。各个参数的详细介绍及作用请看前述章节,而且这些参数的缺省配置都是工作的,一般无需设置。**注意:配置修改后,需要重启*taosd*服务才能生效。**
- firstEp: taosd启动时,主动连接的集群中首个dnode的end point, 默认值为localhost:6030。
- fqdn:数据节点的FQDN,缺省为操作系统配置的第一个hostname。如果习惯IP地址访问,可设置为该节点的IP地址。
- fqdn:数据节点的FQDN,缺省为操作系统配置的第一个hostname。如果习惯IP地址访问,可设置为该节点的IP地址。这个参数值的长度需要控制在 96 个字符以内。
- serverPort:taosd启动后,对外服务的端口号,默认值为6030。(RESTful服务使用的端口号是在此基础上+11,即默认值为6041。)
- dataDir: 数据文件目录,所有的数据文件都将写入该目录。默认值:/var/lib/taos。
- logDir:日志文件目录,客户端和服务器的运行日志文件将写入该目录。默认值:/var/log/taos。
......@@ -444,7 +444,7 @@ TDengine的所有可执行文件默认存放在 _/usr/local/taos/bin_ 目录下
- 数据库名:不能包含“.”以及特殊字符,不能超过 32 个字符
- 表名:不能包含“.”以及特殊字符,与所属数据库名一起,不能超过 192 个字符
- 表的列名:不能包含特殊字符,不能超过 64 个字符
- 数据库名、表名、列名,都不能以数字开头
- 数据库名、表名、列名,都不能以数字开头,合法的可用字符集是“英文字符、数字和下划线”
- 表的列数:不能超过 1024 列
- 记录的最大长度:包括时间戳 8 byte,不能超过 16KB(每个 BINARY/NCHAR 类型的列还会额外占用 2 个 byte 的存储位置)
- 单条 SQL 语句默认最大字符串长度:65480 byte
......
......@@ -681,9 +681,10 @@ Query OK, 1 row(s) in set (0.001091s)
| % | match with any char sequences | **`binary`** **`nchar`** |
| _ | match with a single char | **`binary`** **`nchar`** |
1. 同时进行多个字段的范围过滤,需要使用关键词 AND 来连接不同的查询条件,暂不支持 OR 连接的不同列之间的查询过滤条件。
2. 针对单一字段的过滤,如果是时间过滤条件,则一条语句中只支持设定一个;但针对其他的(普通)列或标签列,则可以使用 `OR` 关键字进行组合条件的查询过滤。例如:((value > 20 AND value < 30) OR (value < 12)) 。
3. 从 2.0.17 版本开始,条件过滤开始支持 BETWEEN AND 语法,例如 `WHERE col2 BETWEEN 1.5 AND 3.25` 表示查询条件为“1.5 ≤ col2 ≤ 3.25”。
1. <> 算子也可以写为 != ,请注意,这个算子不能用于数据表第一列的 timestamp 字段。
2. 同时进行多个字段的范围过滤,需要使用关键词 AND 来连接不同的查询条件,暂不支持 OR 连接的不同列之间的查询过滤条件。
3. 针对单一字段的过滤,如果是时间过滤条件,则一条语句中只支持设定一个;但针对其他的(普通)列或标签列,则可以使用 `OR` 关键字进行组合条件的查询过滤。例如:((value > 20 AND value < 30) OR (value < 12)) 。
4. 从 2.0.17 版本开始,条件过滤开始支持 BETWEEN AND 语法,例如 `WHERE col2 BETWEEN 1.5 AND 3.25` 表示查询条件为“1.5 ≤ col2 ≤ 3.25”。
<!--
<a class="anchor" id="having"></a>
......@@ -1178,9 +1179,9 @@ TDengine支持针对数据的聚合查询。提供支持的聚合和选择函数
应用字段:不能应用在timestamp、binary、nchar、bool类型字段。
适用于:**表**。
适用于:**表、(超级表)**。
说明:输出结果行数是范围内总行数减一,第一行没有结果输出。
说明:输出结果行数是范围内总行数减一,第一行没有结果输出。从 2.1.3.0 版本开始,DIFF 函数可以在由 GROUP BY 划分出单独时间线的情况下用于超级表(也即 GROUP BY tbname)。
示例:
```mysql
......
# TDengine Documentation
TDengine is a highly efficient platform to store, query, and analyze time-series data. It is specially designed and optimized for IoT, Internet of Vehicles, Industrial IoT, IT Infrastructure and Application Monitoring, etc. It works like a relational database, such as MySQL, but you are strongly encouraged to read through the following documentation before you experience it, especially the Data Model and Data Modeling sections. In addition to this document, you should also download and read our technology white paper. For the older TDengine version 1.6 documentation, please click here.
## [TDengine Introduction](/evaluation)
* [TDengine Introduction and Features](/evaluation#intro)
* [TDengine Use Scenes](/evaluation#scenes)
* [TDengine Performance Metrics and Verification]((/evaluation#))
## [Getting Started](/getting-started)
* [Quickly Install](/getting-started#install): install via source code/package / Docker within seconds
- [Easy to Launch](/getting-started#start): start / stop TDengine with systemctl
- [Command-line](/getting-started#console) : an easy way to access TDengine server
- [Experience Lightning Speed](/getting-started#demo): running a demo, inserting/querying data to experience faster speed
- [List of Supported Platforms](/getting-started#platforms): a list of platforms supported by TDengine server and client
- [Deploy to Kubernetes](https://taosdata.github.io/TDengine-Operator/en/index.html):a detailed guide for TDengine deployment in Kubernetes environment
## [Overall Architecture](/architecture)
- [Data Model](/architecture#model): relational database model, but one table for one device with static tags
- [Cluster and Primary Logical Unit](/architecture#cluster): Take advantage of NoSQL, support scale-out and high-reliability
- [Storage Model and Data Partitioning/Sharding](/architecture#sharding): tag data will be separated from time-series data, segmented by vnode and time
- [Data Writing and Replication Process](/architecture#replication): records received are written to WAL, cached, with acknowledgement is sent back to client, while supporting multi-replicas
- [Caching and Persistence](/architecture#persistence): latest records are cached in memory, but are written in columnar format with an ultra-high compression ratio
- [Data Query](/architecture#query): support various functions, time-axis aggregation, interpolation, and multi-table aggregation
## [Data Modeling](/model)
- [Create a Database](/model#create-db): create a database for all data collection points with similar features
- [Create a Super Table(STable)](/model#create-stable): create a STable for all data collection points with the same type
- [Create a Table](/model#create-table): use STable as the template, to create a table for each data collecting point
## [TAOS SQL](/taos-sql)
- [Data Types](/taos-sql#data-type): support timestamp, int, float, nchar, bool, and other types
- [Database Management](/taos-sql#management): add, drop, check databases
- [Table Management](/taos-sql#table): add, drop, check, alter tables
- [STable Management](/taos-sql#super-table): add, drop, check, alter STables
- [Tag Management](/taos-sql#tags): add, drop, alter tags
- [Inserting Records](/taos-sql#insert): support to write single/multiple items per table, multiple items across tables, and support to write historical data
- [Data Query](/taos-sql#select): support time segment, value filtering, sorting, manual paging of query results, etc
- [SQL Function](/taos-sql#functions): support various aggregation functions, selection functions, and calculation functions, such as avg, min, diff, etc
- [Time Dimensions Aggregation](/taos-sql#aggregation): aggregate and reduce the dimension after cutting table data by time segment
- [Boundary Restrictions](/taos-sql#limitation): restrictions for the library, table, SQL, and others
- [Error Code](/taos-sql/error-code): TDengine 2.0 error codes and corresponding decimal codes
## [Efficient Data Ingestion](/insert)
- [SQL Ingestion](/insert#sql): write one or multiple records into one or multiple tables via SQL insert command
- [Prometheus Ingestion](/insert#prometheus): Configure Prometheus to write data directly without any code
- [Telegraf Ingestion](/insert#telegraf): Configure Telegraf to write collected data directly without any code
- [EMQ X Broker](/insert#emq): Configure EMQ X to write MQTT data directly without any code
- [HiveMQ Broker](/insert#hivemq): Configure HiveMQ to write MQTT data directly without any code
## [Efficient Data Querying](/queries)
- [Main Query Features](/queries#queries): support various standard functions, setting filter conditions, and querying per time segment
- [Multi-table Aggregation Query](/queries#aggregation): use STable and set tag filter conditions to perform efficient aggregation queries
- [Downsampling to Query Value](/queries#sampling): aggregate data in successive time windows, support interpolation
## [Advanced Features](/advanced-features)
- [Continuous Query](/advanced-features#continuous-query): Based on sliding windows, the data stream is automatically queried and calculated at regular intervals
- [Data Publisher/Subscriber](/advanced-features#subscribe): subscribe to the newly arrived data like a typical messaging system
- [Cache](/advanced-features#cache): the newly arrived data of each device/table will always be cached
- [Alarm Monitoring](/advanced-features#alert): automatically monitor out-of-threshold data, and actively push it based-on configuration rules
## [Connector](/connector)
- [C/C++ Connector](/connector#c-cpp): primary method to connect to TDengine server through libtaos client library
- [Java Connector(JDBC)]: driver for connecting to the server from Java applications using the JDBC API
- [Python Connector](/connector#python): driver for connecting to TDengine server from Python applications
- [RESTful Connector](/connector#restful): a simple way to interact with TDengine via HTTP
- [Go Connector](/connector#go): driver for connecting to TDengine server from Go applications
- [Node.js Connector](/connector#nodejs): driver for connecting to TDengine server from Node.js applications
- [C# Connector](/connector#csharp): driver for connecting to TDengine server from C# applications
- [Windows Client](https://www.taosdata.com/blog/2019/07/26/514.html): compile your own Windows client, which is required by various connectors on the Windows environment
## [Connections with Other Tools](/connections)
- [Grafana](/connections#grafana): query the data saved in TDengine and provide visualization
- [Matlab](/connections#matlab): access data stored in TDengine server via JDBC configured within Matlab
- [R](/connections#r): access data stored in TDengine server via JDBC configured within R
- [IDEA Database](https://www.taosdata.com/blog/2020/08/27/1767.html): use TDengine visually through IDEA Database Management Tool
## [Installation and Management of TDengine Cluster](/cluster)
- [Preparation](/cluster#prepare): important considerations before deploying TDengine for production usage
- [Create Your First Node](/cluster#node-one): simple to follow the quick setup
- [Create Subsequent Nodes](/cluster#node-other): configure taos.cfg for new nodes to add more to the existing cluster
- [Node Management](/cluster#management): add, delete, and check nodes in the cluster
- [High-availability of Vnode](/cluster#high-availability): implement high-availability of Vnode through multi-replicas
- [Mnode Management](/cluster#mnode): automatic system creation without any manual intervention
- [Load Balancing](/cluster#load-balancing): automatically performed once the number of nodes or load changes
- [Offline Node Processing](/cluster#offline): any node that offline for more than a certain period will be removed from the cluster
- [Arbitrator](/cluster#arbitrator): used in the case of an even number of replicas to prevent split-brain
## [TDengine Operation and Maintenance](/administrator)
- [Capacity Planning](/administrator#planning): Estimating hardware resources based on scenarios
- [Fault Tolerance and Disaster Recovery](/administrator#tolerance): set the correct WAL and number of data replicas
- [System Configuration](/administrator#config): port, cache size, file block size, and other system configurations
- [User Management](/administrator#user): add/delete TDengine users, modify user password
- [Import Data](/administrator#import): import data into TDengine from either script or CSV file
- [Export Data](/administrator#export): export data either from TDengine shell or from the taosdump tool
- [System Monitor](/administrator#status): monitor the system connections, queries, streaming calculation, logs, and events
- [File Directory Structure](/administrator#directories): directories where TDengine data files and configuration files located
- [Parameter Restrictions and Reserved Keywords](/administrator#keywords): TDengine’s list of parameter restrictions and reserved keywords
## TDengine Technical Design
- [System Module]: taosd functions and modules partitioning
- [Data Replication]: support real-time synchronous/asynchronous replication, to ensure high-availability of the system
- [Technical Blog](https://www.taosdata.com/cn/blog/?categories=3): More technical analysis and architecture design articles
## Common Tools
- [TDengine sample import tools](https://www.taosdata.com/blog/2020/01/18/1166.html)
- [TDengine performance comparison test tools](https://www.taosdata.com/blog/2020/01/18/1166.html)
- [Use TDengine visually through IDEA Database Management Tool](https://www.taosdata.com/blog/2020/08/27/1767.html)
## Performance: TDengine vs Others
- [Performance: TDengine vs InfluxDB with InfluxDB’s open-source performance testing tool](https://www.taosdata.com/blog/2020/01/13/1105.html)
- [Performance: TDengine vs OpenTSDB](https://www.taosdata.com/blog/2019/08/21/621.html)
- [Performance: TDengine vs Cassandra](https://www.taosdata.com/blog/2019/08/14/573.html)
- [Performance: TDengine vs InfluxDB](https://www.taosdata.com/blog/2019/07/19/419.html)
- [Performance Test Reports of TDengine vs InfluxDB/OpenTSDB/Cassandra/MySQL/ClickHouse](https://www.taosdata.com/downloads/TDengine_Testing_Report_cn.pdf)
## More on IoT Big Data
- [Characteristics of IoT and Industry Internet Big Data](https://www.taosdata.com/blog/2019/07/09/characteristics-of-iot-big-data/)
- [Features and Functions of IoT Big Data platforms](https://www.taosdata.com/blog/2019/07/29/542.html)
- [Why don’t General Big Data Platforms Fit IoT Scenarios?](https://www.taosdata.com/blog/2019/07/09/why-does-the-general-big-data-platform-not-fit-iot-data-processing/)
- [Why TDengine is the best choice for IoT, Internet of Vehicles, and Industry Internet Big Data platforms?](https://www.taosdata.com/blog/2019/07/09/why-tdengine-is-the-best-choice-for-iot-big-data-processing/)
## FAQ
- [FAQ: Common questions and answers](/faq)
# TDengine Introduction
## <a class="anchor" id="intro"></a> About TDengine
TDengine is an innovative Big Data processing product launched by Taos Data in the face of the fast-growing Internet of Things (IoT) Big Data market and technical challenges. It does not rely on any third-party software, nor does it optimize or package any open-source database or stream computing product. Instead, it is a product independently developed after absorbing the advantages of many traditional relational databases, NoSQL databases, stream computing engines, message queues, and other software. TDengine has its own unique Big Data processing advantages in time-series space.
One of the modules of TDengine is the time-series database. However, in addition to this, to reduce the complexity of research and development and the difficulty of system operation, TDengine also provides functions such as caching, message queuing, subscription, stream computing, etc. TDengine provides a full-stack technical solution for the processing of IoT and Industrial Internet BigData. It is an efficient and easy-to-use IoT Big Data platform. Compared with typical Big Data platforms such as Hadoop, TDengine has the following distinct characteristics:
- **Performance improvement over 10 times**: An innovative data storage structure is defined, with each single core can process at least 20,000 requests per second, insert millions of data points, and read more than 10 million data points, which is more than 10 times faster than other existing general database.
- **Reduce the cost of hardware or cloud services to 1/5**: Due to its ultra-performance, TDengine’s computing resources consumption is less than 1/5 of other common Big Data solutions; through columnar storage and advanced compression algorithms, the storage consumption is less than 1/10 of other general databases.
- **Full-stack time-series data processing engine**: Integrate database, message queue, cache, stream computing, and other functions, and the applications do not need to integrate with software such as Kafka/Redis/HBase/Spark/HDFS, thus greatly reducing the complexity cost of application development and maintenance.
- **Powerful analysis functions**: Data from ten years ago or one second ago, can all be queried based on a specified time range. Data can be aggregated on a timeline or multiple devices. Ad-hoc queries can be made at any time through Shell, Python, R, and Matlab.
- **Seamless connection with third-party tools**: Integration with Telegraf, Grafana, EMQ, HiveMQ, Prometheus, Matlab, R, etc. without even one single line of code. OPC, Hadoop, Spark, etc. will be supported in the future, and more BI tools will be seamlessly connected to.
- **Zero operation cost & zero learning cost**: Installing clusters is simple and quick, with real-time backup built-in, and no need to split libraries or tables. Similar to standard SQL, TDengine can support RESTful, Python/Java/C/C + +/C#/Go/Node.js, and similar to MySQL with zero learning cost.
With TDengine, the total cost of ownership of typical IoT, Internet of Vehicles, and Industrial Internet Big Data platforms can be greatly reduced. However, it should be pointed out that due to making full use of the characteristics of IoT time-series data, TDengine cannot be used to process general data from web crawlers, microblogs, WeChat, e-commerce, ERP, CRM, and other sources.
![TDengine Technology Ecosystem](page://images/eco_system.png)
<center>Figure 1. TDengine Technology Ecosystem</center>
## <a class="anchor" id="scenes"></a>Overall Scenarios of TDengine
As an IoT Big Data platform, the typical application scenarios of TDengine are mainly presented in the IoT category, with users having a certain amount of data. The following sections of this document are mainly aimed at IoT-relevant systems. Other systems, such as CRM, ERP, etc., are beyond the scope of this article.
### Characteristics and Requirements of Data Sources
From the perspective of data sources, designers can analyze the applicability of TDengine in target application systems as following.
| **Data Source Characteristics and Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| -------------------------------------------------------- | ------------------ | ----------------------- | ------------------- | :----------------------------------------------------------- |
| A huge amount of total data | | | √ | TDengine provides excellent scale-out functions in terms of capacity, and has a storage structure matching high compression ratio to achieve the best storage efficiency in the industry. |
| Data input velocity is occasionally or continuously huge | | | √ | TDengine's performance is much higher than other similar products. It can continuously process a large amount of input data in the same hardware environment, and provide a performance evaluation tool that can easily run in the user environment. |
| A huge amount of data sources | | | √ | TDengine is designed to include optimizations specifically for a huge amount of data sources, such as data writing and querying, which is especially suitable for efficiently processing massive (tens of millions or more) data sources. |
### System Architecture Requirements
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require a simple and reliable system architecture | | | √ | TDengine's system architecture is very simple and reliable, with its own message queue, cache, stream computing, monitoring and other functions, and no need to integrate any additional third-party products. |
| Require fault-tolerance and high-reliability | | | √ | TDengine has cluster functions to automatically provide high-reliability functions such as fault tolerance and disaster recovery. |
| Standardization specifications | | | √ | TDengine uses standard SQL language to provide main functions and follow standardization specifications. |
### System Function Requirements
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require completed data processing algorithms built-in | | √ | | TDengine implements various general data processing algorithms, but has not properly handled all requirements of different industries, so special types of processing shall be processed at the application level. |
| Require a huge amount of crosstab queries | | √ | | This type of processing should be handled more by relational database systems, or TDengine and relational database systems should fit together to implement system functions. |
### System Performance Requirements
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require larger total processing capacity | | | √ | TDengine’s cluster functions can easily improve processing capacity via multi-server-cooperating. |
| Require high-speed data processing | | | √ | TDengine’s storage and data processing are designed to be optimized for IoT, can generally improve the processing speed by multiple times than other similar products. |
| Require fast processing of fine-grained data | | | √ | TDengine has achieved the same level of performance with relational and NoSQL data processing systems. |
### System Maintenance Requirements
| **System Architecture Requirements** | **Not Applicable** | **Might Be Applicable** | **Very Applicable** | **Description** |
| ------------------------------------------------- | ------------------ | ----------------------- | ------------------- | ------------------------------------------------------------ |
| Require system with high-reliability | | | √ | TDengine has a very robust and reliable system architecture to implement simple and convenient daily operation with streamlined experiences for operators, thus human errors and accidents are eliminated to the greatest extent. |
| Require controllable operation learning cost | | | √ | As above. |
| Require abundant talent supply | √ | | | As a new-generation product, it’s still difficult to find talents with TDengine experiences from market. However, the learning cost is low. As the vendor, we also provide extensive operation training and counselling services. |
# Quick Start
## <a class="anchor" id="install"></a>Quick Install
TDegnine software consists of 3 parts: server, client, and alarm module. At the moment, TDengine server only runs on Linux (Windows, mac OS and more OS supports will come soon), but client can run on either Windows or Linux. TDengine client can be installed and run on Windows or Linux. Applications based-on any OSes can all connect to server taosd via a RESTful interface. About CPU, TDegnine supports X64/ARM64/MIPS64/Alpha64, and ARM32、RISC-V, other more CPU architectures will be supported soon. You can set up and install TDengine server either from the [source code](https://www.taosdata.com/en/getting-started/#Install-from-Source) or the [packages](https://www.taosdata.com/en/getting-started/#Install-from-Package).
### <a class="anchor" id="source-install"></a>Install from Source
Please visit our [TDengine github page](https://github.com/taosdata/TDengine) for instructions on installation from the source code.
### Install from Docker Container
Please visit our [TDengine Official Docker Image: Distribution, Downloading, and Usage](https://www.taosdata.com/blog/2020/05/13/1509.html).
### <a class="anchor" id="package-install"></a>Install from Package
It’s extremely easy to install for TDegnine, which takes only a few seconds from downloaded to successful installed. The server installation package includes clients and connectors. We provide 3 installation packages, which you can choose according to actual needs:
Click [here](https://www.taosdata.com/cn/getting-started/#%E9%80%9A%E8%BF%87%E5%AE%89%E8%A3%85%E5%8C%85%E5%AE%89%E8%A3%85) to download the install package.
For more about installation process, please refer [TDengine Installation Packages: Install and Uninstall](https://www.taosdata.com/blog/2019/08/09/566.html), and [Video Tutorials](https://www.taosdata.com/blog/2020/11/11/1941.html).
## <a class="anchor" id="start"></a>Quick Launch
After installation, you can start the TDengine service by the `systemctl` command.
```bash
$ systemctl start taosd
```
Then check if the service is working now.
```bash
$ systemctl status taosd
```
If the service is running successfully, you can play around through TDengine shell `taos`.
**Note:**
- The `systemctl` command needs the **root** privilege. Use **sudo** if you are not the **root** user.
- To get better product feedback and improve our solution, TDegnine will collect basic usage information, but you can modify the configuration parameter **telemetryReporting** in the system configuration file taos.cfg, and set it to 0 to turn it off.
- TDegnine uses FQDN (usually hostname) as the node ID. In order to ensure normal operation, you need to set hostname for the server running taosd, and configure DNS service or hosts file for the machine running client application, to ensure the FQDN can be resolved.
- TDengine supports installation on Linux systems with[ systemd ](https://en.wikipedia.org/wiki/Systemd)as the process service management, and uses `which systemctl` command to detect whether `systemd` packages exist in the system:
```bash
$ which systemctl
```
If `systemd` is not supported in the system, TDengine service can also be launched via `/usr/local/taos/bin/taosd` manually.
## <a class="anchor" id="console"></a>TDengine Shell Command Line
To launch TDengine shell, the command line interface, in a Linux terminal, type:
```bash
$ taos
```
The welcome message is printed if the shell connects to TDengine server successfully, otherwise, an error message will be printed (refer to our [FAQ](https://www.taosdata.com/en/faq) page for troubleshooting the connection error). The TDengine shell prompt is:
```cmd
taos>
```
In the TDengine shell, you can create databases, create tables and insert/query data with SQL. Each query command ends with a semicolon. It works like MySQL, for example:
```mysql
create database demo;
use demo;
create table t (ts timestamp, speed int);
insert into t values ('2019-07-15 00:00:00', 10);
insert into t values ('2019-07-15 01:00:00', 20);
select * from t;
ts | speed |
===================================
19-07-15 00:00:00.000| 10|
19-07-15 01:00:00.000| 20|
Query OK, 2 row(s) in set (0.001700s)
```
Besides the SQL commands, the system administrator can check system status, add or delete accounts, and manage the servers.
### Shell Command Line Parameters
You can configure command parameters to change how TDengine shell executes. Some frequently used options are listed below:
- -c, --config-dir: set the configuration directory. It is */etc/taos* by default.
- -h, --host: set the IP address of the server it will connect to. Default is localhost.
- -s, --commands: set the command to run without entering the shell.
- -u, -- user: user name to connect to server. Default is root.
- -p, --password: password. Default is 'taosdata'.
- -?, --help: get a full list of supported options.
Examples:
```bash
$ taos -h 192.168.0.1 -s "use db; show tables;"
```
### Run SQL Command Scripts
Inside TDengine shell, you can run SQL scripts in a file with source command.
```mysql
taos> source <filename>;
```
### Shell Tips
- Use up/down arrow key to check the command history
- To change the default password, use "alter user" command
- Use ctrl+c to interrupt any queries
- To clean the schema of local cached tables, execute command `RESET QUERY CACHE`
## <a class="anchor" id="demo"></a>Experience TDengine’s Lightning Speed
After starting the TDengine server, you can execute the command `taosdemo` in the Linux terminal.
```bash
$ taosdemo
```
Using this command, a STable named `meters` will be created in the database `test` There are 10k tables under this stable, named from `t0` to `t9999`. In each table there are 100k rows of records, each row with columns (`f1`, `f2` and `f3`. The timestamp is from "2017-07-14 10:40:00 000" to "2017-07-14 10:41:39 999". Each table also has tags `areaid` and `loc`: `areaid` is set from 1 to 10, `loc` is set to "beijing" or "shanghai".
It takes about 10 minutes to execute this command. Once finished, 1 billion rows of records will be inserted.
In the TDengine client, enter sql query commands and then experience our lightning query speed.
- query total rows of records:
```mysql
taos> select count(*) from test.meters;
```
- query average, max and min of the total 1 billion records:
```mysql
taos> select avg(f1), max(f2), min(f3) from test.meters;
```
- query the number of records where loc="beijing":
```mysql
taos> select count(*) from test.meters where loc="beijing";
```
- query the average, max and min of total records where areaid=10:
```mysql
taos> select avg(f1), max(f2), min(f3) from test.meters where areaid=10;
```
- query the average, max, min from table t10 when aggregating over every 10s:
```mysql
taos> select avg(f1), max(f2), min(f3) from test.t10 interval(10s);
```
**Note**: you can run command `taosdemo` with many options, like number of tables, rows of records and so on. To know more about these options, you can execute `taosdemo --help` and then take a try using different options.
## Client and Alarm Module
If your client and server running on different machines, please install the client separately. Linux and Windows packages are provided:
- TDengine-client-2.0.10.0-Linux-x64.tar.gz(3.0M)
- TDengine-client-2.0.10.0-Windows-x64.exe(2.8M)
- TDengine-client-2.0.10.0-Windows-x86.exe(2.8M)
Linux package of Alarm Module is as following (please refer [How to Use Alarm Module](https://github.com/taosdata/TDengine/blob/master/alert/README_cn.md)):
- TDengine-alert-2.0.10.0-Linux-x64.tar.gz (8.1M)
## <a class="anchor" id="platforms"></a>List of Supported Platforms
List of platforms supported by TDengine server
| | **CentOS 6/7/8** | **Ubuntu 16/18/20** | **Other Linux** | UnionTech UOS | NeoKylin | LINX V60/V80 |
| ------------------ | ---------------- | ------------------- | --------------- | ------------- | -------- | ------------ |
| X64 | ● | ● | | ○ | ● | ● |
| Raspberry ARM32 | | ● | ● | | | |
| Loongson MIPS64 | | | ● | | | |
| Kunpeng ARM64 | | ○ | ○ | | ● | |
| SWCPU Alpha64 | | | ○ | ● | | |
| FT ARM64 | | ○Ubuntu Kylin | | | | |
| Hygon X64 | ● | ● | ● | ○ | ● | ● |
| Rockchip ARM64/32 | | | ○ | | | |
| Allwinner ARM64/32 | | | ○ | | | |
| Actions ARM64/32 | | | ○ | | | |
| TI ARM32 | | | ○ | | | |
Note: ● has been verified by official tests; ○ has been verified by unofficial tests.
List of platforms supported by TDengine client and connectors
At the moment, TDengine connectors can support a wide range of platforms, including hardware platforms such as X64/X86/ARM64/ARM32/MIPS/Alpha, and development environments such as Linux/Win64/Win32.
Comparison matrix as following:
| **CPU** | **X64 64bit** | | | **X86 32bit** | **ARM64** | **ARM32** | **MIPS Godson** | **Alpha Shenwei** | **X64 TimecomTech** |
| ----------- | ------------- | --------- | --------- | ------------- | --------- | --------- | --------------- | ----------------- | ------------------- |
| **OS** | **Linux** | **Win64** | **Win32** | **Win32** | **Linux** | **Linux** | **Linux** | **Linux** | **Linux** |
| **C/C++** | ● | ● | ● | ○ | ● | ● | ● | ● | ● |
| **JDBC** | ● | ● | ● | ○ | ● | ● | ● | ● | ● |
| **Python** | ● | ● | ● | ○ | ● | ● | ● | -- | ● |
| **Go** | ● | ● | ● | ○ | ● | ● | ○ | -- | -- |
| **NodeJs** | ● | ● | ○ | ○ | ● | ● | ○ | -- | -- |
| **C#** | ○ | ● | ● | ○ | ○ | ○ | ○ | -- | -- |
| **RESTful** | ● | ● | ● | ● | ● | ● | ● | ● | ● |
Note: ● has been verified by official tests; ○ has been verified by unofficial tests.
Please visit [Connectors](https://www.taosdata.com/en/documentation/connector) section for more detailed information.
此差异已折叠。
# Data Modeling
TDengine adopts a relational data model, so we need to build the "database" and "table". Therefore, for a specific application scenario, it is necessary to consider the design of the database, STable and ordinary table. This section does not discuss detailed syntax rules, but only concepts.
Please watch the [video tutorial](https://www.taosdata.com/blog/2020/11/11/1945.html) for data modeling.
## <a class="anchor" id="create-db"></a> Create a Database
Different types of data collection points often have different data characteristics, including frequency of data collecting, length of data retention time, number of replicas, size of data blocks, whether to update data or not, and so on. To ensure TDengine working with great efficiency in various scenarios, TDengine suggests creating tables with different data characteristics in different databases, because each database can be configured with different storage strategies. When creating a database, in addition to SQL standard options, the application can also specify a variety of parameters such as retention duration, number of replicas, number of memory blocks, time accuracy, max and min number of records in a file block, whether it is compressed or not, and number of days a data file will be overwritten. For example:
```mysql
CREATE DATABASE power KEEP 365 DAYS 10 BLOCKS 4 UPDATE 1;
```
The above statement will create a database named “power”. The data of this database will be kept for 365 days (it will be automatically deleted 365 days later), one data file created per 10 days, and the number of memory blocks is 4 for data updating. For detailed syntax and parameters, please refer to [Data Management section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#management).
After the database created, please use SQL command USE to switch to the new database, for example:
```mysql
USE power;
```
Replace the database operating in the current connection with “power”, otherwise, before operating on a specific table, you need to use "database name. table name" to specify the name of database to use.
**Note:**
- Any table or STable belongs to a database. Before creating a table, a database must be created first.
- Tables in two different databases cannot be JOIN.
## <a class="anchor" id="create-stable"></a> Create a STable
An IoT system often has many types of devices, such as smart meters, transformers, buses, switches, etc. for power grids. In order to facilitate aggregation among multiple tables, using TDengine, it is necessary to create a STable for each type of data collection point. Taking the smart meter in Table 1 as an example, you can use the following SQL command to create a STable:
```mysql
CREATE STABLE meters (ts timestamp, current float, voltage int, phase float) TAGS (location binary(64), groupdId int);
```
**Note:** The STABLE keyword in this instruction needs to be written as TABLE in versions before 2.0.15.
Just like creating an ordinary table, you need to provide the table name (‘meters’ in the example) and the table structure Schema, that is, the definition of data columns. The first column must be a timestamp (‘ts’ in the example), the other columns are the physical metrics collected (current, volume, phase in the example), and the data types can be int, float, string, etc. In addition, you need to provide the schema of the tag (location, groupId in the example), and the data types of the tag can be int, float, string and so on. Static attributes of collection points can often be used as tags, such as geographic location of collection points, device model, device group ID, administrator ID, etc. The schema of the tag can be added, deleted and modified afterwards. Please refer to the [STable Management section of TAOS SQL](https://www.taosdata.com/cn/documentation/taos-sql#super-table) for specific definitions and details.
Each type of data collection point needs an established STable, so an IoT system often has multiple STables. For the power grid, we need to build a STable for smart meters, transformers, buses, switches, etc. For IoT, a device may have multiple data collection points (for example, a fan for wind-driven generator, some collection points capture parameters such as current and voltage, and some capture environmental parameters such as temperature, humidity and wind direction). In this case, multiple STables need to be established for corresponding types of devices. All collected physical metrics contained in one and the same STable must be collected at the same time (with a consistent timestamp).
A STable allows up to 1024 columns. If the number of physical metrics collected at a collection point exceeds 1024, multiple STables need to be built to process them. A system can have multiple DBs, and a DB can have one or more STables.
## <a class="anchor" id="create-table"></a> Create a Table
TDengine builds a table independently for each data collection point. Similar to standard relational data, one table has a table name, Schema, but in addition, it can also carry one or more tags. When creating, you need to use the STable as a template and specify the specific value of the tag. Taking the smart meter in Table 1 as an example, the following SQL command can be used to build the table:
```mysql
CREATE TABLE d1001 USING meters TAGS ("Beijing.Chaoyang", 2);
```
Where d1001 is the table name, meters is the name of the STable, followed by the specific tag value of tag Location as "Beijing.Chaoyang", and the specific tag value of tag groupId 2. Although the tag value needs to be specified when creating the table, it can be modified afterwards. Please refer to the [Table Management section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#table) for details.
**Note: ** At present, TDengine does not technically restrict the use of a STable of a database (dbA) as a template to create a sub-table of another database (dbB). This usage will be prohibited later, and it is not recommended to use this method to create a table.
TDengine suggests to use the globally unique ID of data collection point as a table name (such as device serial number). However, in some scenarios, there is no unique ID, and multiple IDs can be combined into a unique ID. It is not recommended to use a unique ID as tag value.
**Automatic table creating** : In some special scenarios, user is not sure whether the table of a certain data collection point exists when writing data. In this case, the non-existent table can be created by using automatic table building syntax when writing data. If the table already exists, no new table will be created. For example:
```mysql
INSERT INTO d1001 USING METERS TAGS ("Beijng.Chaoyang", 2) VALUES (now, 10.2, 219, 0.32);
```
The SQL statement above inserts records (now, 10.2, 219, 0.32) into table d1001. If table d1001 has not been created yet, the STable meters is used as the template to automatically create it, and the tag value "Beijing.Chaoyang", 2 is marked at the same time.
For detailed syntax of automatic table building, please refer to the "[Automatic Table Creation When Inserting Records](https://www.taosdata.com/en/documentation/taos-sql#auto_create_table)" section.
## Multi-column Model vs Single-column Model
TDengine supports multi-column model. As long as physical metrics are collected simultaneously by a data collection point (with a consistent timestamp), these metrics can be placed in a STable as different columns. However, there is also an extreme design, a single-column model, in which each collected physical metric is set up separately, so each type of physical metrics is set up separately with a STable. For example, create 3 Stables, one each for current, voltage and phase.
TDengine recommends using multi-column model as much as possible because of higher insertion and storage efficiency. However, for some scenarios, types of collected metrics often change. In this case, if multi-column model is adopted, the structure definition of STable needs to be frequently modified so make the application complicated. To avoid that, single-column model is recommended.
# Efficient Data Writing
TDengine supports multiple interfaces to write data, including SQL, Prometheus, Telegraf, EMQ MQTT Broker, HiveMQ Broker, CSV file, etc. Kafka, OPC and other interfaces will be provided in the future. Data can be inserted in a single piece or in batches, data from one or multiple data collection points can be inserted at the same time. TDengine supports multi-thread insertion, nonsequential data insertion, and also historical data insertion.
## <a class="anchor" id="sql"></a> SQL Writing
Applications insert data by executing SQL insert statements through C/C + +, JDBC, GO, or Python Connector, and users can manually enter SQL insert statements to insert data through TAOS Shell. For example, the following insert writes a record to table d1001:
```mysql
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31);
```
TDengine supports writing multiple records at a time. For example, the following command writes two records to table d1001:
```mysql
INSERT INTO d1001 VALUES (1538548684000, 10.2, 220, 0.23) (1538548696650, 10.3, 218, 0.25);
```
TDengine also supports writing data to multiple tables at a time. For example, the following command writes two records to d1001 and one record to d1002:
```mysql
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31) (1538548695000, 12.6, 218, 0.33) d1002 VALUES (1538548696800, 12.3, 221, 0.31);
```
For the SQL INSERT Grammar, please refer to [Taos SQL insert](https://www.taosdata.com/en/documentation/taos-sql#insert)
**Tips:**
- To improve writing efficiency, batch writing is required. The more records written in a batch, the higher the insertion efficiency. However, a record cannot exceed 16K, and the total length of an SQL statement cannot exceed 64K (it can be configured by parameter maxSQLLength, and the maximum can be configured to 1M).
- TDengine supports multi-thread parallel writing. To further improve writing speed, a client needs to open more than 20 threads to write parallelly. However, after the number of threads reaches a certain threshold, it cannot be increased or even become decreased, because too much frequent thread switching brings extra overhead.
- For a same table, if the timestamp of a newly inserted record already exists, (no database was created using UPDATE 1) the new record will be discarded as default, that is, the timestamp must be unique in a table. If an application automatically generates records, it is very likely that the generated timestamps will be the same, so the number of records successfully inserted will be smaller than the number of records the application try to insert. If you use UPDATE 1 option when creating a database, inserting a new record with the same timestamp will overwrite the original record.
- The timestamp of written data must be greater than the current time minus the time of configuration parameter keep. If keep is configured for 3650 days, data older than 3650 days cannot be written. The timestamp for writing data cannot be greater than the current time plus configuration parameter days. If days is configured to 2, data 2 days later than the current time cannot be written.
## <a class="anchor" id="prometheus"></a> Direct Writing of Prometheus
As a graduate project of Cloud Native Computing Foundation, [Prometheus](https://www.prometheus.io/) is widely used in the field of performance monitoring and K8S performance monitoring. TDengine provides a simple tool [Bailongma](https://github.com/taosdata/Bailongma), which only needs to be simply configured in Prometheus without any code, and can directly write the data collected by Prometheus into TDengine, then automatically create databases and related table entries in TDengine according to rules. Blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html), which is an example of using bailongma to write Prometheus and Telegraf data into TDengine.
### Compile blm_prometheus From Source
Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from github, then compile and generate an executable file using Golang language compiler. Before you start compiling, you need to complete following prepares:
- A server running Linux OS
- Golang version 1.10 and higher installed
- An appropriated TDengine version. Because the client dynamic link library of TDengine is used, it is necessary to install the same version of TDengine as the server-side; for example, if the server version is TDengine 2.0. 0, ensure install the same version on the linux server where bailongma is located (can be on the same server as TDengine, or on a different server)
Bailongma project has a folder, blm_prometheus, which holds the prometheus writing API. The compiling process is as follows:
```bash
cd blm_prometheus
go build
```
If everything goes well, an executable of blm_prometheus will be generated in the corresponding directory.
### Install Prometheus
Download and install as the instruction of Prometheus official website. [Download Address](https://prometheus.io/download/)
### Configure Prometheus
Read the Prometheus [configuration document](https://prometheus.io/docs/prometheus/latest/configuration/configuration/) and add following configurations in the section of Prometheus configuration file
- url: The URL provided by bailongma API service, refer to the blm_prometheus startup example section below
After Prometheus launched, you can check whether data is written successfully through query taos client.
### Launch blm_prometheus
blm_prometheus has following options that you can configure when you launch blm_prometheus.
```sh
--tdengine-name
If TDengine is installed on a server with a domain name, you can also access the TDengine by configuring the domain name of it. In K8S environment, it can be configured as the service name that TDengine runs
--batch-size
blm_prometheus assembles the received prometheus data into a TDengine writing request. This parameter controls the number of data pieces carried in a writing request sent to TDengine at a time.
--dbname
Set a name for the database created in TDengine, blm_prometheus will automatically create a database named dbname in TDengine, and the default value is prometheus.
--dbuser
Set the user name to access TDengine, the default value is'root '
--dbpassword
Set the password to access TDengine, the default value is'taosdata '
--port
The port number blm_prometheus used to serve prometheus.
```
### Example
Launch an API service for blm_prometheus with the following command:
```bash
./blm_prometheus -port 8088
```
Assuming that the IP address of the server where blm_prometheus located is "10.1.2. 3", the URL shall be added to the configuration file of Prometheus as:
remote_write:
\- url: "http://10.1.2.3:8088/receive"
### Query written data of prometheus
The format of generated data by Prometheus is as follows:
```json
{
Timestamp: 1576466279341,
Value: 37.000000,
apiserver_request_latencies_bucket {
component="apiserver",
instance="192.168.99.116:8443",
job="kubernetes-apiservers",
le="125000",
resource="persistentvolumes", s
cope="cluster",
verb="LIST",
version=“v1"
}
}
```
Where apiserver_request_latencies_bucket is the name of the time-series data collected by prometheus, and the tag of the time-series data is in the following {}. blm_prometheus automatically creates a STable in TDengine with the name of the time series data, and converts the tag in {} into the tag value of TDengine, with Timestamp as the timestamp and value as the value of the time-series data. Therefore, in the client of TDEngine, you can check whether this data was successfully written through the following instruction.
```mysql
use prometheus;
select * from apiserver_request_latencies_bucket;
```
## <a class="anchor" id="telegraf"></a> Direct Writing of Telegraf
[Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) is a popular open source tool for IT operation data collection. TDengine provides a simple tool [Bailongma](https://github.com/taosdata/Bailongma), which only needs to be simply configured in Telegraf without any code, and can directly write the data collected by Telegraf into TDengine, then automatically create databases and related table entries in TDengine according to rules. Blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html), which is an example of using bailongma to write Prometheus and Telegraf data into TDengine.
### Compile blm_telegraf From Source Code
Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from github, then compile and generate an executable file using Golang language compiler. Before you start compiling, you need to complete following prepares:
- A server running Linux OS
- Golang version 1.10 and higher installed
- An appropriated TDengine version. Because the client dynamic link library of TDengine is used, it is necessary to install the same version of TDengine as the server-side; for example, if the server version is TDengine 2.0. 0, ensure install the same version on the linux server where bailongma is located (can be on the same server as TDengine, or on a different server)
Bailongma project has a folder, blm_telegraf, which holds the Telegraf writing API. The compiling process is as follows:
```bash
cd blm_telegraf
go build
```
If everything goes well, an executable of blm_telegraf will be generated in the corresponding directory.
### Install Telegraf
At the moment, TDengine supports Telegraf version 1.7. 4 and above. Users can download the installation package on Telegraf's website according to your current operating system. The download address is as follows: https://portal.influxdata.com/downloads
### Configure Telegraf
Modify the TDengine-related configurations in the Telegraf configuration file /etc/telegraf/telegraf.conf.
In the output plugins section, add the [[outputs.http]] configuration:
- url: The URL provided by bailongma API service, please refer to the example section below
- data_format: "json"
- json_timestamp_units: "1ms"
In agent section:
- hostname: The machine name that distinguishes different collection devices, and it is necessary to ensure its uniqueness
- metric_batch_size: 100, which is the max number of records per batch wriiten by Telegraf allowed. Increasing the number can reduce the request sending frequency of Telegraf.
For information on how to use Telegraf to collect data and more about using Telegraf, please refer to the official [document](https://docs.influxdata.com/telegraf/v1.11/) of Telegraf.
### Launch blm_telegraf
blm_telegraf has following options, which can be set to tune configurations of blm_telegraf when launching.
```sh
--host
The ip address of TDengine server, default is null
--batch-size
blm_prometheus assembles the received telegraf data into a TDengine writing request. This parameter controls the number of data pieces carried in a writing request sent to TDengine at a time.
--dbname
Set a name for the database created in TDengine, blm_telegraf will automatically create a database named dbname in TDengine, and the default value is prometheus.
--dbuser
Set the user name to access TDengine, the default value is 'root '
--dbpassword
Set the password to access TDengine, the default value is'taosdata '
--port
The port number blm_telegraf used to serve Telegraf.
```
### Example
Launch an API service for blm_telegraf with the following command
```bash
./blm_telegraf -host 127.0.0.1 -port 8089
```
Assuming that the IP address of the server where blm_telegraf located is "10.1.2. 3", the URL shall be added to the configuration file of telegraf as:
```yaml
url = "http://10.1.2.3:8089/telegraf"
```
### Query written data of telegraf
The format of generated data by telegraf is as follows:
```json
{
"fields": {
"usage_guest": 0,
"usage_guest_nice": 0,
"usage_idle": 89.7897897897898,
"usage_iowait": 0,
"usage_irq": 0,
"usage_nice": 0,
"usage_softirq": 0,
"usage_steal": 0,
"usage_system": 5.405405405405405,
"usage_user": 4.804804804804805
},
"name": "cpu",
"tags": {
"cpu": "cpu2",
"host": "bogon"
},
"timestamp": 1576464360
}
```
Where the name field is the name of the time-series data collected by telegraf, and the tag field is the tag of the time-series data. blm_telegraf automatically creates a STable in TDengine with the name of the time series data, and converts the tag field into the tag value of TDengine, with Timestamp as the timestamp and fields values as the value of the time-series data. Therefore, in the client of TDEngine, you can check whether this data was successfully written through the following instruction.
```mysql
use telegraf;
select * from cpu;
```
MQTT is a popular data transmission protocol in the IoT. TDengine can easily access the data received by MQTT Broker and write it to TDengine.
## <a class="anchor" id="emq"></a> Direct Writing of EMQ Broker
[EMQ](https://github.com/emqx/emqx) is an open source MQTT Broker software, with no need of coding, only to use "rules" in EMQ Dashboard for simple configuration, and MQTT data can be directly written into TDengine. EMQ X supports storing data to the TDengine by sending it to a Web service, and also provides a native TDengine driver on Enterprise Edition for direct data store. Please refer to [EMQ official documents](https://docs.emqx.io/broker/latest/cn/rule/rule-example.html#%E4%BF%9D%E5%AD%98%E6%95%B0%E6%8D%AE%E5%88%B0-tdengine) for more details.
## <a class="anchor" id="hivemq"></a> Direct Writing of HiveMQ Broker
[HiveMQ](https://www.hivemq.com/) is an MQTT agent that provides Free Personal and Enterprise Edition versions. It is mainly used for enterprises, emerging machine-to-machine(M2M) communication and internal transmission to meet scalability, easy management and security features. HiveMQ provides an open source plug-in development kit. You can store data to TDengine via HiveMQ extension-TDengine. Refer to the [HiveMQ extension-TDengine documentation](https://github.com/huskar-t/hivemq-tdengine-extension/blob/b62a26ecc164a310104df57691691b237e091c89/README.md) for more details.
# Efficient Data Querying
## <a class="anchor" id="queries"></a> Main Query Features
TDengine uses SQL as the query language. Applications can send SQL statements through C/C + +, Java, Go, Python connectors, and users can manually execute SQL Ad-Hoc Query through the Command Line Interface (CLI) tool TAOS Shell provided by TDengine. TDengine supports the following query functions:
- Single-column and multi-column data query
- Multiple filters for tags and numeric values: >, <, =, < >, like, etc
- Group by, Order by, Limit/Offset of aggregation results
- Four operations for numeric columns and aggregation results
- Time stamp aligned join query (implicit join) operations
- Multiple aggregation/calculation functions: count, max, min, avg, sum, twa, stddev, leastsquares, top, bottom, first, last, percentile, apercentile, last_row, spread, diff, etc
For example, in TAOS shell, the records with vlotage > 215 are queried from table d1001, sorted in descending order by timestamps, and only two records are outputted.
```mysql
taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
ts | current | voltage | phase |
======================================================================================
2018-10-03 14:38:16.800 | 12.30000 | 221 | 0.31000 |
2018-10-03 14:38:15.000 | 12.60000 | 218 | 0.33000 |
Query OK, 2 row(s) in set (0.001100s)
```
In order to meet the needs of an IoT scenario, TDengine supports several special functions, such as twa (time weighted average), spread (difference between maximum and minimum), last_row (last record), etc. More functions related to IoT scenarios will be added. TDengine also supports continuous queries.
For specific query syntax, please see the [Data Query section of TAOS SQL](https://www.taosdata.com/cn/documentation/taos-sql#select).
## <a class="anchor" id="aggregation"></a> Multi-table Aggregation Query
In an IoT scenario, there are often multiple data collection points in a same type. TDengine uses the concept of STable to describe a certain type of data collection point, and an ordinary table to describe a specific data collection point. At the same time, TDengine uses tags to describe the statical attributes of data collection points. A given data collection point has a specific tag value. By specifying the filters of tags, TDengine provides an efficient method to aggregate and query the sub-tables of STables (data collection points of a certain type). Aggregation functions and most operations on ordinary tables are applicable to STables, and the syntax is exactly the same.
**Example 1**: In TAOS Shell, look up the average voltages collected by all smart meters in Beijing and group them by location
```mysql
taos> SELECT AVG(voltage) FROM meters GROUP BY location;
avg(voltage) | location |
=============================================================
222.000000000 | Beijing.Haidian |
219.200000000 | Beijing.Chaoyang |
Query OK, 2 row(s) in set (0.002136s)
```
**Example 2**: In TAOS Shell, look up the number of records with groupId 2 in the past 24 hours, check the maximum current of all smart meters
```mysql
taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now - 24h;
cunt(*) | max(current) |
==================================
5 | 13.4 |
Query OK, 1 row(s) in set (0.002136s)
```
TDengine only allows aggregation queries between tables belonging to a same STable, means aggregation queries between different STables are not supported. In the Data Query section of TAOS SQL, query class operations will all be indicated that whether STables are supported.
## <a class="anchor" id="sampling"></a> Down Sampling Query, Interpolation
In a scenario of IoT, it is often necessary to aggregate the collected data by intervals through down sampling. TDengine provides a simple keyword interval, which makes query operations according to time windows extremely simple. For example, the current values collected by smart meter d1001 are summed every 10 seconds.
```mysql
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
ts | sum(current) |
======================================================
2018-10-03 14:38:00.000 | 10.300000191 |
2018-10-03 14:38:10.000 | 24.900000572 |
Query OK, 2 row(s) in set (0.000883s)
```
The down sampling operation is also applicable to STables, such as summing the current values collected by all smart meters in Beijing every second.
```mysql
taos> SELECT SUM(current) FROM meters where location like "Beijing%" INTERVAL(1s);
ts | sum(current) |
======================================================
2018-10-03 14:38:04.000 | 10.199999809 |
2018-10-03 14:38:05.000 | 32.900000572 |
2018-10-03 14:38:06.000 | 11.500000000 |
2018-10-03 14:38:15.000 | 12.600000381 |
2018-10-03 14:38:16.000 | 36.000000000 |
Query OK, 5 row(s) in set (0.001538s)
```
The down sampling operation also supports time offset, such as summing the current values collected by all smart meters every second, but requires each time window to start from 500 milliseconds.
```mysql
taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
ts | sum(current) |
======================================================
2018-10-03 14:38:04.500 | 11.189999809 |
2018-10-03 14:38:05.500 | 31.900000572 |
2018-10-03 14:38:06.500 | 11.600000000 |
2018-10-03 14:38:15.500 | 12.300000381 |
2018-10-03 14:38:16.500 | 35.000000000 |
Query OK, 5 row(s) in set (0.001521s)
```
In a scenario of IoT, it is difficult to synchronize the time stamp of collected data at each point, but many analysis algorithms (such as FFT) need to align the collected data strictly at equal intervals of time. In many systems, it’s required to write their own programs to process, but the down sampling operation of TDengine can be easily solved. If there is no collected data in an interval, TDengine also provides interpolation calculation function.
For details of syntax rules, please refer to the [Time-dimension Aggregation section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#aggregation).
\ No newline at end of file
此差异已折叠。
此差异已折叠。
# Connections with Other Tools
## <a class="anchor" id="grafana"></a> Grafana
TDengine can quickly integrate with [Grafana](https://www.grafana.com/), an open source data visualization system, to build a data monitoring and alarming system. The whole process does not require any code to write. The contents of the data table in TDengine can be visually showed on DashBoard.
### Install Grafana
TDengine currently supports Grafana 5.2.4 and above. You can download and install the package from Grafana website according to the current operating system. The download address is as follows:
https://grafana.com/grafana/download.
### Configure Grafana
TDengine Grafana plugin is in the /usr/local/taos/connector/grafanaplugin directory.
Taking Centos 7.2 as an example, just copy grafanaplugin directory to /var/lib/grafana/plugins directory and restart Grafana.
```bash
sudo cp -rf /usr/local/taos/connector/grafanaplugin /var/lib/grafana/plugins/tdengine
```
### Use Grafana
#### Configure data source
You can log in the Grafana server (username/password:admin/admin) through localhost:3000, and add data sources through `Configuration -> Data Sources` on the left panel, as shown in the following figure:
![img](page://images/connections/add_datasource1.jpg)
Click `Add data source` to enter the Add Data Source page, and enter TDengine in the query box to select Add, as shown in the following figure:
![img](page://images/connections/add_datasource2.jpg)
Enter the data source configuration page and modify the corresponding configuration according to the default prompt:
![img](page://images/connections/add_datasource3.jpg)
- Host: IP address of any server in TDengine cluster and port number of TDengine RESTful interface (6041), default [http://localhost:6041](http://localhost:6041/)
- User: TDengine username.
- Password: TDengine user password.
Click `Save & Test` to test. Success will be prompted as follows:
![img](page://images/connections/add_datasource4.jpg)
#### Create Dashboard
Go back to the home to create Dashboard, and click `Add Query` to enter the panel query page:
![img](page://images/connections/create_dashboard1.jpg)
As shown in the figure above, select the TDengine data source in Query, and enter the corresponding sql in the query box below to query. Details are as follows:
- INPUT SQL: Enter the statement to query (the result set of the SQL statement should be two columns and multiple rows), for example: `select avg(mem_system) from log.dn where ts >= $from and ts < $to interval($interval)` , where `from`, `to` and `interval` are built-in variables of the TDengine plug-in, representing the query range and time interval obtained from the Grafana plug-in panel. In addition to built-in variables, it is also supported to use custom template variables.
- ALIAS BY: You can set alias for the current queries.
- GENERATE SQL: Clicking this button will automatically replace the corresponding variable and generate the final statement to execute.
According to the default prompt, query the average system memory usage at the specified interval of the server where the current TDengine deployed in as follows:
![img](page://images/connections/create_dashboard2.jpg)
> Please refer to Grafana [documents](https://grafana.com/docs/) for how to use Grafana to create the corresponding monitoring interface and for more about Grafana usage.
#### Import Dashboard
A `tdengine-grafana.json` importable dashboard is provided under the Grafana plug-in directory/usr/local/taos/connector/grafana/tdengine/dashboard/.
Click the `Import` button on the left panel and upload the `tdengine-grafana.json` file:
![img](page://images/connections/import_dashboard1.jpg)
You can see as follows after Dashboard imported.
![img](page://images/connections/import_dashboard2.jpg)
## <a class="anchor" id="matlab"></a> Matlab
MatLab can access data to the local workspace by connecting directly to the TDengine via the JDBC Driver provided in the installation package.
### JDBC Interface Adaptation of MatLab
Several steps are required to adapt Matlab to TDengine. Taking adapting Matlab2017a on Windows10 as an example:
- Copy the file JDBCDriver-1.0.0-dist.ja*r* in TDengine package to the directory ${matlab_root}\MATLAB\R2017a\java\jar\toolbox
- Copy the file taos.lib in TDengine package to ${matlab root dir}\MATLAB\R2017a\lib\win64
- Add the .jar package just copied to the Matlab classpath. Append the line below as the end of the file of ${matlab root dir}\MATLAB\R2017a\toolbox\local\classpath.txt
- ```
$matlabroot/java/jar/toolbox/JDBCDriver-1.0.0-dist.jar
```
- Create a file called javalibrarypath.txt in directory ${user_home}\AppData\Roaming\MathWorks\MATLAB\R2017a_, and add the _taos.dll path in the file. For example, if the file taos.dll is in the directory of C:\Windows\System32,then add the following line in file javalibrarypath.txt:
- ```
C:\Windows\System32
```
- ### Connect to TDengine in MatLab to get data
After the above configured successfully, open MatLab.
- Create a connection:
```matlab
conn = database(db, root, taosdata, com.taosdata.jdbc.TSDBDriver, jdbc:TSDB://127.0.0.1:0/)
```
* Make a query:
```matlab
sql0 = [select * from tb]
data = select(conn, sql0);
```
* Insert a record:
```matlab
sql1 = [insert into tb values (now, 1)]
exec(conn, sql1)
```
For more detailed examples, please refer to the examples\Matlab\TDEngineDemo.m file in the package.
## <a class="anchor" id="r"></a> R
R language supports connection to the TDengine database through the JDBC interface. First, you need to install the JDBC package of R language. Launch the R language environment, and then execute the following command to install the JDBC support library for R language:
```R
install.packages('RJDBC', repos='http://cran.us.r-project.org')
```
After installed, load the RJDBC package by executing `library('RJDBC')` command.
Then load the TDengine JDBC driver:
```R
drv<-JDBC("com.taosdata.jdbc.TSDBDriver","JDBCDriver-2.0.0-dist.jar", identifier.quote="\"")
```
If succeed, no error message will display. Then use the following command to try a database connection:
```R
conn<-dbConnect(drv,"jdbc:TSDB://192.168.0.1:0/?user=root&password=taosdata","root","taosdata")
```
Please replace the IP address in the command above to the correct one. If no error message is shown, then the connection is established successfully, otherwise the connection command needs to be adjusted according to the error prompt. TDengine supports below functions in *RJDBC* package:
- `dbWriteTable(conn, "test", iris, overwrite=FALSE, append=TRUE)`: Write the data in a data frame iris to the table test in the TDengine server. Parameter overwrite must be false. append must be TRUE and the schema of the data frame iris should be the same as the table test.
- `dbGetQuery(conn, "select count(*) from test")`: run a query command
- `dbSendUpdate (conn, "use db")`: Execute any non-query sql statement. For example, `dbSendUpdate (conn, "use db")`, write data `dbSendUpdate (conn, "insert into t1 values (now, 99)")`, and the like.
- `dbReadTable(conn, "test")`: read all the data in table test
- `dbDisconnect(conn)`: close a connection
- `dbRemoveTable(conn, "test")`: remove table test
The functions below are not supported currently:
- `dbExistsTable(conn, "test")`: if table test exists
- `dbListTables(conn)`: list all tables in the connection
\ No newline at end of file
此差异已折叠。
此差异已折叠。
此差异已折叠。
# FAQ
Tutorials & FAQ
## 0.How to report an issue?
If the contents in FAQ cannot help you and you need the technical support and assistance of TDengine team, please package the contents in the following two directories:
1./var/log/taos (if default path has not been modified)
2./etc/taos
Provide the necessary description of the problem, including the version information of TDengine used, the platform environment information, the execution operation of the problem, the characterization of the problem and the approximate time, and submit the Issue on [GitHub](https://github.com/taosdata/TDengine).
To ensure that there is enough debug information, if the problem can be repeated, please modify the/etc/taos/taos.cfg file, add a line of "debugFlag 135" at the end (without quotation marks themselves), then restart taosd, repeat the problem, and then submit. You can also temporarily set the log level of taosd through the following SQL statement.
```
alter dnode <dnode_id> debugFlag 135;
```
However, when the system is running normally, please set debugFlag to 131, otherwise a large amount of log information will be generated and the system efficiency will be reduced.
## 1.What should I pay attention to when upgrading TDengine from older versions to 2.0 and above? ☆☆☆
Version 2.0 is a complete refactoring of the previous version, and the configuration and data files are incompatible. Be sure to do the following before upgrading:
1. Delete the configuration file, execute sudo rm `-rf /etc/taos/taos.cfg`
2. Delete the log file, execute `sudo rm -rf /var/log/taos/`
3. By ensuring that the data is no longer needed, delete the data file and execute `sudo rm -rf /var/lib/taos/`
4. Install the latest stable version of TDengine
5. If you need to migrate data or the data file is corrupted, please contact the official technical support team of TAOS Data to assist
## 2. When encoutered with the error " Unable to establish connection " in Windows, what can I do?
See the [technical blog](https://www.taosdata.com/blog/2019/12/03/jdbcdriver%E6%89%BE%E4%B8%8D%E5%88%B0%E5%8A%A8%E6%80%81%E9%93%BE%E6%8E%A5%E5%BA%93/) for this issue.
## 3. Why I get “more dnodes are needed” when create a table?
See the [technical blog](https://www.taosdata.com/blog/2019/12/03/%E5%88%9B%E5%BB%BA%E6%95%B0%E6%8D%AE%E8%A1%A8%E6%97%B6%E6%8F%90%E7%A4%BAmore-dnodes-are-needed/) for this issue.
## 4. How do I generate a core file when TDengine crashes?
See the [technical blog](https://www.taosdata.com/blog/2019/12/06/tdengine-crash%E6%97%B6%E7%94%9F%E6%88%90core%E6%96%87%E4%BB%B6%E7%9A%84%E6%96%B9%E6%B3%95/) for this issue.
## 5. What should I do if I encounter an error "Unable to establish connection"?
When the client encountered a connection failure, please follow the following steps to check:
1. Check your network environment
2. - Cloud server: Check whether the security group of the cloud server opens access to TCP/UDP ports 6030-6042
- Local virtual machine: Check whether the network can be pinged, and try to avoid using localhost as hostname
- Corporate server: If you are in a NAT network environment, be sure to check whether the server can return messages to the client
2. Make sure that the client and server version numbers are exactly the same, and the open source Community Edition and Enterprise Edition cannot be mixed.
3. On the server, execute systemctl status taosd to check the running status of *taosd*. If not running, start *taosd*.
4. Verify that the correct server FQDN (Fully Qualified Domain Name, which is available by executing the Linux command hostname-f on the server) is specified when the client connects. FQDN configuration reference: "[All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)".
5. Ping the server FQDN. If there is no response, please check your network, DNS settings, or the system hosts file of the computer where the client is located.
6. Check the firewall settings (Ubuntu uses ufw status, CentOS uses firewall-cmd-list-port) to confirm that TCP/UDP ports 6030-6042 are open.
7. For JDBC (ODBC, Python, Go and other interfaces are similar) connections on Linux, make sure that libtaos.so is in the directory /usr/local/taos/driver, and /usr/local/taos/driver is in the system library function search path LD_LIBRARY_PATH.
8. For JDBC, ODBC, Python, Go, etc. connections on Windows, make sure that C:\ TDengine\ driver\ taos.dll is in your system library function search directory (it is recommended that taos.dll be placed in the directory C:\ Windows\ System32)
9. If the connection issue still exist
1. - On Linux system, please use the command line tool nc to determine whether the TCP and UDP connections on the specified ports are unobstructed. Check whether the UDP port connection works: nc -vuz {hostIP} {port} Check whether the server-side TCP port connection works: nc -l {port}Check whether the client-side TCP port connection works: nc {hostIP} {port}
- Windows systems use the PowerShell command Net-TestConnection-ComputerName {fqdn} Port {port} to detect whether the service-segment port is accessed
10. You can also use the built-in network connectivity detection function of taos program to verify whether the specified port connection between the server and the client is unobstructed (including TCP and UDP): [TDengine's Built-in Network Detection Tool Use Guide](https://www.taosdata.com/blog/2020/09/08/1816.html).
## 6.What to do if I encounter an error "Unexpected generic error in RPC" or "TDengine error: Unable to resolve FQDN"?
This error occurs because the client or data node cannot parse the FQDN (Fully Qualified Domain Name). For TAOS shell or client applications, check the following:
1. Please verify whether the FQDN of the connected server is correct. FQDN configuration reference: "[All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)".
2. If the network is configured with a DNS server, check that it is working properly.
3. If the network does not have a DNS server configured, check the hosts file of the machine where the client is located to see if the FQDN is configured and has the correct IP address.
4. If the network configuration is OK, from the machine where the client is located, you need to be able to ping the connected FQDN, otherwise the client cannot connect to the server
## 7.Although the syntax is corrected, why do I still get the “Invalid SQL" error?
If you confirm that the syntax is correct, for versions older than 2.0, please check whether the SQL statement length exceeds 64K. If it does, this error will also be returned.
## 8. Are “validation queries” supported?
The TDengine does not yet have a dedicated set of validation queries. However, it is recommended to use the database "log" monitored by the system.
## 9. Can I delete or update a record?
TDengine does not support the deletion function at present, and may support it in the future according to user requirements.
Starting from 2.0. 8.0, TDengine supports the function of updating written data. Using the update function requires using UPDATE 1 parameter when creating the database, and then you can use INSERT INTO command to update the same timestamp data that has been written. UPDATE parameter does not support ALTER DATABASE command modification. Without a database created using UPDATE 1 parameter, writing data with the same timestamp will not modify the previous data with no error reported.
It should also be noted that when UPDATE is set to 0, the data with the same timestamp sent later will be discarded directly, but no error will be reported, and will still be included in affected rows (so the return information of INSERT instruction cannot be used for timestamp duplicate checking). The main reason for this design is that TDengine regards the written data as a stream. Regardless of whether the timestamp conflicts or not, TDengine believes that the original device that generates the data actually generates such data. The UPDATE parameter only controls how such stream data should be processed when persistence-when UPDATE is 0, it means that the data written first overwrites the data written later; When UPDATE is 1, it means that the data written later overwrites the data written first. How to choose this coverage relationship depends on whether the data generated first or later is expected in the subsequent use and statistics compile.
## 10. How to create a table with more than 1024 columns?
Using version 2.0 and above, 1024 columns are supported by default; for older versions, TDengine allowed the creation of a table with a maximum of 250 columns. However, if the limit is exceeded, it is recommended to logically split this wide table into several small ones according to the data characteristics.
## 11. What is the most effective way to write data?
Insert in batches. Each write statement can insert multiple records into one or multiple tables at the same time.
## 12. What is the most effective way to write data? How to solve the problem that Chinese characters in nchar inserted under Windows systems are parsed into messy code?
If there are Chinese characters in nchar data under Windows, please first confirm that the region of the system is set to China (which can be set in the Control Panel), then the taos client in cmd should already support it normally; If you are developing Java applications in an IDE, such as Eclipse and Intellij, please confirm that the file code in the IDE is GBK (this is the default coding type of Java), and then initialize the configuration of the client when generating the Connection. The specific statement is as follows:
```JAVA
Class.forName("com.taosdata.jdbc.TSDBDriver");
Properties properties = new Properties();
properties.setProperty(TSDBDriver.LOCALE_KEY, "UTF-8");
Connection = DriverManager.getConnection(url, properties);
```
## 13. JDBC error: the excluded SQL is not a DML or a DDL?
Please update to the latest JDBC driver.
```xml
<dependency>
<groupId>com.taosdata.jdbc</groupId>
<artifactId>taos-jdbcdriver</artifactId>
<version>2.0.27</version>
</dependency>
```
## 14. taos connect failed, reason: invalid timestamp.
The common reason is that the server time and client time are not calibrated, which can be calibrated by synchronizing with the time server (use ntpdate command under Linux, and select automatic synchronization in the Windows time setting).
## 15. Incomplete display of table name
Due to the limited display width of taos shell in the terminal, it is possible that a relatively long table name is not displayed completely. If relevant operations are carried out according to the displayed incomplete table name, a Table does not exist error will occur. The workaround can be by modifying the setting option maxBinaryDisplayWidth in the taos.cfg file, or directly entering the command `set max_binary_display_width 100`. Or, use the \\G parameter at the end of the command to adjust how the results are displayed.
## 16. How to migrate data?
TDengine uniquely identifies a machine according to hostname. When moving data files from machine A to machine B, pay attention to the following three points:
- For versions 2.0. 0.0 to 2.0. 6. x, reconfigure machine B's hostname to machine A's.
- For 2.0. 7.0 and later versions, go to/var/lib/taos/dnode, repair the FQDN corresponding to dnodeId of dnodeEps.json, and restart. Make sure this file is identical for all machines.
- The storage structures of versions 1. x and 2. x are incompatible, and it is necessary to use migration tools or your own application to export and import data.
## 17. How to temporarily adjust the log level in command line program taos?
For the convenience of debugging, since version 2.0. 16, command line program taos gets two new instructions related to logging:
```mysql
ALTER LOCAL flag_name flag_value;
```
This means that under the current command line program, modify the loglevel of a specific module (only valid for the current command line program, if taos is restarted, it needs to be reset):
- The values of flag_name can be: debugFlag, cDebugFlag, tmrDebugFlag, uDebugFlag, rpcDebugFlag
- Flag_value values can be: 131 (output error and alarm logs), 135 (output error, alarm, and debug logs), 143 (output error, alarm, debug, and trace logs)
```mysql
ALTER LOCAL RESETLOG;
```
This means wiping up all client-generated log files on the machine.
# Connect with other tools
## Telegraf
TDengine is easy to integrate with [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/), an open-source server agent for collecting and sending metrics and events, without more development.
### Install Telegraf
At present, TDengine supports Telegraf newer than version 1.7.4. Users can go to the [download link] and choose the proper package to install on your system.
### Configure Telegraf
Telegraf is configured by changing items in the configuration file */etc/telegraf/telegraf.conf*.
In **output plugins** section,add _[[outputs.http]]_ iterm:
- _url_: http://ip:6020/telegraf/udb, in which _ip_ is the IP address of any node in TDengine cluster. Port 6020 is the RESTful APT port used by TDengine. _udb_ is the name of the database to save data, which needs to create beforehand.
- _method_: "POST"
- _username_: username to login TDengine
- _password_: password to login TDengine
- _data_format_: "json"
- _json_timestamp_units_: "1ms"
In **agent** part:
- hostname: used to distinguish different machines. Need to be unique.
- metric_batch_size: 30,the maximum number of records allowed to write in Telegraf. The larger the value is, the less frequent requests are sent. For TDengine, the value should be less than 50.
Please refer to the [Telegraf docs](https://docs.influxdata.com/telegraf/v1.11/) for more information.
## Grafana
[Grafana] is an open-source system for time-series data display. It is easy to integrate TDengine and Grafana to build a monitor system. Data saved in TDengine can be fetched and shown on the Grafana dashboard.
### Install Grafana
For now, TDengine only supports Grafana newer than version 5.2.4. Users can go to the [Grafana download page] for the proper package to download.
### Configure Grafana
TDengine Grafana plugin is in the _/usr/local/taos/connector/grafana_ directory.
Taking Centos 7.2 as an example, just copy TDengine directory to _/var/lib/grafana/plugins_ directory and restart Grafana.
### Use Grafana
Users can log in the Grafana server (username/password:admin/admin) through localhost:3000 to configure TDengine as the data source. As is shown in the picture below, TDengine as a data source option is shown in the box:
![img](../assets/clip_image001.png)
When choosing TDengine as the data source, the Host in HTTP configuration should be configured as the IP address of any node of a TDengine cluster. The port should be set as 6020. For example, when TDengine and Grafana are on the same machine, it should be configured as _http://localhost:6020.
Besides, users also should set the username and password used to log into TDengine. Then click _Save&Test_ button to save.
![img](../assets/clip_image001-2474914.png)
Then, TDengine as a data source should show in the Grafana data source list.
![img](../assets/clip_image001-2474939.png)
Then, users can create Dashboards in Grafana using TDengine as the data source:
![img](../assets/clip_image001-2474961.png)
Click _Add Query_ button to add a query and input the SQL command you want to run in the _INPUT SQL_ text box. The SQL command should expect a two-row, multi-column result, such as _SELECT count(*) FROM sys.cpu WHERE ts>=from and ts<​to interval(interval)_, in which, _from_, _to_ and _inteval_ are TDengine inner variables representing query time range and time interval.
_ALIAS BY_ field is to set the query alias. Click _GENERATE SQL_ to send the command to TDengine:
![img](../assets/clip_image001-2474987.png)
Please refer to the [Grafana official document] for more information about Grafana.
## Matlab
Matlab can connect to and retrieve data from TDengine by TDengine JDBC Driver.
### MatLab and TDengine JDBC adaptation
Several steps are required to adapt Matlab to TDengine. Taking adapting Matlab2017a on Windows10 as an example:
1. Copy the file _JDBCDriver-1.0.0-dist.jar_ in TDengine package to the directory _${matlab_root}\MATLAB\R2017a\java\jar\toolbox_
2. Copy the file _taos.lib_ in TDengine package to _${matlab_ root _dir}\MATLAB\R2017a\lib\win64_
3. Add the .jar package just copied to the Matlab classpath. Append the line below as the end of the file of _${matlab_ root _dir}\MATLAB\R2017a\toolbox\local\classpath.txt_
`$matlabroot/java/jar/toolbox/JDBCDriver-1.0.0-dist.jar`
4. Create a file called _javalibrarypath.txt_ in directory _${user_home}\AppData\Roaming\MathWorks\MATLAB\R2017a\_, and add the _taos.dll_ path in the file. For example, if the file _taos.dll_ is in the directory of _C:\Windows\System32_,then add the following line in file *javalibrarypath.txt*:
`C:\Windows\System32`
### TDengine operations in Matlab
After correct configuration, open Matlab:
- build a connection:
`conn = database(‘db’, ‘root’, ‘taosdata’, ‘com.taosdata.jdbc.TSDBDriver’, ‘jdbc:TSDB://127.0.0.1:0/’)`
- Query:
`sql0 = [‘select * from tb’]`
`data = select(conn, sql0);`
- Insert a record:
`sql1 = [‘insert into tb values (now, 1)’]`
`exec(conn, sql1)`
Please refer to the file _examples\Matlab\TDengineDemo.m_ for more information.
## R
Users can use R language to access the TDengine server with the JDBC interface. At first, install JDBC package in R:
```R
install.packages('rJDBC', repos='http://cran.us.r-project.org')
```
Then use _library_ function to load the package:
```R
library('RJDBC')
```
Then load the TDengine JDBC driver:
```R
drv<-JDBC("com.taosdata.jdbc.TSDBDriver","JDBCDriver-1.0.0-dist.jar", identifier.quote="\"")
```
If succeed, no error message will display. Then use the following command to try a database connection:
```R
conn<-dbConnect(drv,"jdbc:TSDB://192.168.0.1:0/?user=root&password=taosdata","root","taosdata")
```
Please replace the IP address in the command above to the correct one. If no error message is shown, then the connection is established successfully. TDengine supports below functions in _RJDBC_ package:
- _dbWriteTable(conn, "test", iris, overwrite=FALSE, append=TRUE)_: write the data in a data frame _iris_ to the table _test_ in the TDengine server. Parameter _overwrite_ must be _false_. _append_ must be _TRUE_ and the schema of the data frame _iris_ should be the same as the table _test_.
- _dbGetQuery(conn, "select count(*) from test")_: run a query command
- _dbSendUpdate(conn, "use db")_: run any non-query command.
- _dbReadTable(conn, "test"_): read all the data in table _test_
- _dbDisconnect(conn)_: close a connection
- _dbRemoveTable(conn, "test")_: remove table _test_
Below functions are **not supported** currently:
- _dbExistsTable(conn, "test")_: if talbe _test_ exists
- _dbListTables(conn)_: list all tables in the connection
[Telegraf]: www.taosdata.com
[download link]: https://portal.influxdata.com/downloads
[Telegraf document]: www.taosdata.com
[Grafana]: https://grafana.com
[Grafana download page]: https://grafana.com/grafana/download
[Grafana official document]: https://grafana.com/docs/
# TaosData Contributor License Agreement
This TaosData Contributor License Agreement (CLA) applies to any contribution you make to any TaosData projects. If you are representing your employing organization to sign this agreement, please warrant that you have the authority to grant the agreement.
## Terms
**"TaosData"**, **"we"**, **"our"** and **"us"** means TaosData, inc.
**"You"** and **"your"** means you or the organization you are on behalf of to sign this agreement.
**"Contribution"** means any original work you, or the organization you represent submit to TaosData for any project in any manner.
## Copyright License
All rights of your Contribution submitted to TaosData in any manner are granted to TaosData and recipients of software distributed by TaosData. You waive any rights that my affect our ownership of the copyright and grant to us a perpetual, worldwide, transferable, non-exclusive, no-charge, royalty-free, irrevocable, and sublicensable license to use, reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute Contributions and any derivative work created based on a Contribution.
## Patent License
With respect to any patents you own or that you can license without payment to any third party, you grant to us and to any recipient of software distributed by us, a perpetual, worldwide, transferable, non-exclusive, no-charge, royalty-free, irrevocable patent license to make, have make, use, sell, offer to sell, import, and otherwise transfer the Contribution in whole or in part, alone or included in any product under any patent you own, or license from a third party, that is necessarily infringed by the Contribution or by combination of the Contribution with any Work.
## Your Representations and Warranties
You represent and warrant that:
- the Contribution you submit is an original work that you can legally grant the rights set out in this agreement.
- the Contribution you submit and licenses you granted does not and will not, infringe the rights of any third party.
- you are not aware of any pending or threatened claims, suits, actions, or charges pertaining to the contributions. You also warrant to notify TaosData immediately if you become aware of any such actual or potential claims, suits, actions, allegations or charges.
## Support
You are not obligated to support your Contribution except you volunteer to provide support. If you want, you can provide for a fee.
**I agree and accept on behalf of myself and behalf of my organization:**
\ No newline at end of file
#Documentation
TDengine is a highly efficient platform to store, query, and analyze time-series data. It works like a relational database, but you are strongly suggested to read through the following documentation before you experience it.
##Getting Started
- Quick Start: download, install and experience TDengine in a few seconds
- TDengine Shell: command-line interface to access TDengine server
- Major Features: insert/query, aggregation, cache, pub/sub, continuous query
## Data Model and Architecture
- Data Model: relational database model, but one table for one device with static tags
- Architecture: Management Module, Data Module, Client Module
- Writing Process: records recieved are written to WAL, cache, then ack is sent back to client
- Data Storage: records are sharded in the time range, and stored column by column
##TAOS SQL
- Data Types: support timestamp, int, float, double, binary, nchar, bool, and other types
- Database Management: add, drop, check databases
- Table Management: add, drop, check, alter tables
- Inserting Records: insert one or more records into tables, historical records can be imported
- Data Query: query data with time range and filter conditions, support limit/offset
- SQL Functions: support aggregation, selector, transformation functions
- Downsampling: aggregate data in successive time windows, support interpolation
##STable: Super Table
- What is a Super Table: an innovated way to aggregate tables
- Create a STable: it is like creating a standard table, but with tags defined
- Create a Table via STable: use STable as the template, with tags specified
- Aggregate Tables via STable: group tables together by specifying the tags filter condition
- Create Table Automatically: create tables automatically with a STable as a template
- Management of STables: create/delete/alter super table just like standard tables
- Management of Tags: add/delete/alter tags on super tables or tables
##Advanced Features
- Continuous Query: query executed by TDengine periodically with a sliding window
- Publisher/Subscriber: subscribe to the newly arrived data like a typical messaging system
- Caching: the newly arrived data of each device/table will always be cached
##Connector
- C/C++ Connector: primary method to connect to the server through libtaos client library
- Java Connector: driver for connecting to the server from Java applications using the JDBC API
- Python Connector: driver for connecting to the server from Python applications
- RESTful Connector: a simple way to interact with TDengine via HTTP
- Go Connector: driver for connecting to the server from Go applications
- Node.js Connector: driver for connecting to the server from node applications
##Connections with Other Tools
- Telegraf: pass the collected DevOps metrics to TDengine
- Grafana: query the data saved in TDengine and visualize them
- Matlab: access TDengine server from Matlab via JDBC
- R: access TDengine server from R via JDBC
##Administrator
- Directory and Files: files and directories related with TDengine
- Configuration on Server: customize IP port, cache size, file block size and other settings
- Configuration on Client: customize locale, default user and others
- User Management: add/delete users, change passwords
- Import Data: import data into TDengine from either script or CSV file
- Export Data: export data either from TDengine shell or from tool taosdump
- Management of Connections, Streams, Queries: check or kill the connections, queries
- System Monitor: collect the system metric, and log important operations
##More on System Architecture
- Storage Design: column-based storage with optimization on time-series data
- Query Design: an efficient way to query time-series data
- Technical blogs to delve into the inside of TDengine
## More on IoT Big Data
- [Characteristics of IoT Big Data](https://www.taosdata.com/blog/2019/07/09/characteristics-of-iot-big-data/)
- [Why don’t General Big Data Platforms Fit IoT Scenarios?](https://www.taosdata.com/blog/2019/07/09/why-does-the-general-big-data-platform-not-fit-iot-data-processing/)
- [Why TDengine is the Best Choice for IoT Big Data Processing?](https://www.taosdata.com/blog/2019/07/09/why-tdengine-is-the-best-choice-for-iot-big-data-processing/)
##Tutorials & FAQ
- <a href='https://www.taosdata.com/en/faq'>FAQ</a>: a list of frequently asked questions and answers
- <a href='https://www.taosdata.com/en/blog/?categories=4'>Use cases</a>: a few typical cases to explain how to use TDengine in IoT platform
#Getting Started
## Quick Start
At the moment, TDengine only runs on Linux. You can set up and install it either from the <a href='#Install-from-Source'>source code</a> or the <a href='#Install-from-Package'>packages</a>. It takes only a few seconds from download to run it successfully.
### Install from Source
Please visit our [github page](https://github.com/taosdata/TDengine) for instructions on installation from the source code.
### Install from Package
Three different packages are provided, please pick up the one you like.
<ul id='packageList'>
<li><a id='tdengine-rpm' style='color:var(--b2)'>TDengine RPM package (1.5M)</a></li>
<li><a id='tdengine-deb' style='color:var(--b2)'>TDengine DEB package (1.7M)</a></li>
<li><a id='tdengine-tar' style='color:var(--b2)'>TDengine Tarball (3.0M)</a></li>
</ul>
For the time being, TDengine only supports installation on Linux systems using [`systemd`](https://en.wikipedia.org/wiki/Systemd) as the service manager. To check if your system has *systemd* package, use the _which systemctl_ command.
```cmd
which systemctl
```
If the `systemd` package is not found, please [install from source code](#Install-from-Source).
### Running TDengine
After installation, start the TDengine service by the `systemctl` command.
```cmd
systemctl start taosd
```
Then check if the server is working now.
```cmd
systemctl status taosd
```
If the service is running successfully, you can play around through TDengine shell `taos`, the command line interface tool located in directory /usr/local/bin/taos
**Note: The _systemctl_ command needs the root privilege. Use _sudo_ if you are not the _root_ user.**
##TDengine Shell
To launch TDengine shell, the command line interface, in a Linux terminal, type:
```cmd
taos
```
The welcome message is printed if the shell connects to TDengine server successfully, otherwise, an error message will be printed (refer to our [FAQ](../faq) page for troubleshooting the connection error). The TDengine shell prompt is:
```cmd
taos>
```
In the TDengine shell, you can create databases, create tables and insert/query data with SQL. Each query command ends with a semicolon. It works like MySQL, for example:
```mysql
create database db;
use db;
create table t (ts timestamp, cdata int);
insert into t values ('2019-07-15 10:00:00', 10);
insert into t values ('2019-07-15 10:01:05', 20);
select * from t;
ts | speed |
===================================
19-07-15 10:00:00.000| 10|
19-07-15 10:01:05.000| 20|
Query OK, 2 row(s) in set (0.001700s)
```
Besides the SQL commands, the system administrator can check system status, add or delete accounts, and manage the servers.
###Shell Command Line Parameters
You can run `taos` command with command line options to fit your needs. Some frequently used options are listed below:
- -c, --config-dir: set the configuration directory. It is _/etc/taos_ by default
- -h, --host: set the IP address of the server it will connect to, Default is localhost
- -s, --commands: set the command to run without entering the shell
- -u, -- user: user name to connect to server. Default is root
- -p, --password: password. Default is 'taosdata'
- -?, --help: get a full list of supported options
Examples:
```cmd
taos -h 192.168.0.1 -s "use db; show tables;"
```
###Run Batch Commands
Inside TDengine shell, you can run batch commands in a file with *source* command.
```
taos> source <filename>;
```
### Tips
- Use up/down arrow key to check the command history
- To change the default password, use "`alter user`" command
- ctrl+c to interrupt any queries
- To clean the cached schema of tables or STables, execute command `RESET QUERY CACHE`
## Major Features
The core functionality of TDengine is the time-series database. To reduce the development and management complexity, and to improve the system efficiency further, TDengine also provides caching, pub/sub messaging system, and stream computing functionalities. It provides a full stack for IoT big data platform. The detailed features are listed below:
- SQL like query language used to insert or explore data
- C/C++, Java(JDBC), Python, Go, RESTful, and Node.JS interfaces for development
- Ad hoc queries/analysis via Python/R/Matlab or TDengine shell
- Continuous queries to support sliding-window based stream computing
- Super table to aggregate multiple time-streams efficiently with flexibility
- Aggregation over a time window on one or multiple time-streams
- Built-in messaging system to support publisher/subscriber model
- Built-in cache for each time stream to make latest data available as fast as light speed
- Transparent handling of historical data and real-time data
- Integrating with Telegraf, Grafana and other tools seamlessly
- A set of tools or configuration to manage TDengine
For enterprise edition, TDengine provides more advanced features below:
- Linear scalability to deliver higher capacity/throughput
- High availability to guarantee the carrier-grade service
- Built-in replication between nodes which may span multiple geographical sites
- Multi-tier storage to make historical data management simpler and cost-effective
- Web-based management tools and other tools to make maintenance simpler
TDengine is specially designed and optimized for time-series data processing in IoT, connected cars, Industrial IoT, IT infrastructure and application monitoring, and other scenarios. Compared with other solutions, it is 10x faster on insert/query speed. With a single-core machine, over 20K requestes can be processed, millions data points can be ingested, and over 10 million data points can be retrieved in a second. Via column-based storage and tuned compression algorithm for different data types, less than 1/10 storage space is required.
## Explore More on TDengine
Please read through the whole <a href='../documentation'>documentation</a> to learn more about TDengine.
# TDengine System Architecture
## Storage Design
TDengine data mainly include **metadata** and **data** that we will introduce in the following sections.
### Metadata Storage
Metadata include the information of databases, tables, etc. Metadata files are saved in _/var/lib/taos/mgmt/_ directory by default. The directory tree is as below:
```
/var/lib/taos/
+--mgmt/
+--db.db
+--meters.db
+--user.db
+--vgroups.db
```
A metadata structure (database, table, etc.) is saved as a record in a metadata file. All metadata files are appended only, and even a drop operation adds a deletion record at the end of the file.
### Data storage
Data in TDengine are sharded according to the time range. Data of tables in the same vnode in a certain time range are saved in the same filegroup, such as files v0f1804*. This sharding strategy can effectively improve data searching speed. By default, a group of files contains data in 10 days, which can be configured by *daysPerFile* in the configuration file or by *DAYS* keyword in *CREATE DATABASE* clause. Data in files are blockwised. A data block only contains one table's data. Records in the same data block are sorted according to the primary timestamp, which helps to improve the compression rate and save storage. The compression algorithms used in TDengine include simple8B, delta-of-delta, RLE, LZ4, etc.
By default, TDengine data are saved in */var/lib/taos/data/* directory. _/var/lib/taos/tsdb/_ directory contains vnode informations and data file linkes.
```
/var/lib/taos/
+--tsdb/
| +--vnode0
| +--meterObj.v0
| +--db/
| +--v0f1804.head->/var/lib/taos/data/vnode0/v0f1804.head1
| +--v0f1804.data->/var/lib/taos/data/vnode0/v0f1804.data
| +--v0f1804.last->/var/lib/taos/data/vnode0/v0f1804.last1
| +--v0f1805.head->/var/lib/taos/data/vnode0/v0f1805.head1
| +--v0f1805.data->/var/lib/taos/data/vnode0/v0f1805.data
| +--v0f1805.last->/var/lib/taos/data/vnode0/v0f1805.last1
| :
+--data/
+--vnode0/
+--v0f1804.head1
+--v0f1804.data
+--v0f1804.last1
+--v0f1805.head1
+--v0f1805.data
+--v0f1805.last1
:
```
#### meterObj file
There are only one meterObj file in a vnode. Informations bout the vnode, such as created time, configuration information, vnode statistic informations are saved in this file. It has the structure like below:
```
<start_of_file>
[file_header]
[table_record1_offset&length]
[table_record2_offset&length]
...
[table_recordN_offset&length]
[table_record1]
[table_record2]
...
[table_recordN]
<end_of_file>
```
The file header takes 512 bytes, which mainly contains informations about the vnode. Each table record is the representation of a table on disk.
#### head file
The _head_ files contain the index of data blocks in the _data_ file. The inner organization is as below:
```
<start_of_file>
[file_header]
[table1_offset]
[table2_offset]
...
[tableN_offset]
[table1_index_block]
[table2_index_block]
...
[tableN_index_block]
<end_of_file>
```
The table offset array in the _head_ file saves the information about the offsets of each table index block. Indices on data blocks in the same table are saved continuously. This also makes it efficient to load data indices on the same table. The data index block has a structure like:
```
[index_block_info]
[block1_index]
[block2_index]
...
[blockN_index]
```
The index block info part contains the information about the index block such as the number of index blocks, etc. Each block index corresponds to a real data block in the _data_ file or _last_ file. Information about the location of the real data block, the primary timestamp range of the data block, etc. are all saved in the block index part. The block indices are sorted in ascending order according to the primary timestamp. So we can apply algorithms such as the binary search on the data to efficiently search blocks according to time.
#### data file
The _data_ files store the real data block. They are append-only. The organization is as:
```
<start_of_file>
[file_header]
[block1]
[block2]
...
[blockN]
<end_of_file>
```
A data block in _data_ files only belongs to a table in the vnode and the records in a data block are sorted in ascending order according to the primary timestamp key. Data blocks are column-oriented. Data in the same column are stored contiguously, which improves reading speed and compression rate because of their similarity. A data block has the following organization:
```
[column1_info]
[column2_info]
...
[columnN_info]
[column1_data]
[column2_data]
...
[columnN_data]
```
The column info part includes information about column types, column compression algorithm, column data offset and length in the _data_ file, etc. Besides, pre-calculated results of the column data in the block are also in the column info part, which helps to improve reading speed by avoiding loading data block necessarily.
#### last file
To avoid storage fragment and to import query speed and compression rate, TDengine introduces an extra file, the _last_ file. When the number of records in a data block is lower than a threshold, TDengine will flush the block to the _last_ file for temporary storage. When new data comes, the data in the _last_ file will be merged with the new data and form a larger data block and written to the _data_ file. The organization of the _last_ file is similar to the _data_ file.
### Summary
The innovation in architecture and storage design of TDengine improves resource usage. On the one hand, the virtualization makes it easy to distribute resources between different vnodes and for future scaling. On the other hand, sorted and column-oriented storage makes TDengine have a great advantage in writing, querying and compression.
## Query Design
#### Introduction
TDengine provides a variety of query functions for both tables and super tables. In addition to regular aggregate queries, it also provides time window based query and statistical aggregation for time series data. TDengine's query processing requires the client app, management node, and data node to work together. The functions and modules involved in query processing included in each component are as follows:
Client (Client App). The client development kit, embed in a client application, consists of TAOS SQL parser and query executor, the second-stage aggregator (Result Merger), continuous query manager and other major functional modules. The SQL parser is responsible for parsing and verifying the SQL statement and converting it into an abstract syntax tree. The query executor is responsible for transforming the abstract syntax tree into the query execution logic and creates the metadata query according to the query condition of the SQL statement. Since TAOS SQL does not currently include complex nested queries and pipeline query processing mechanism, there is no longer need for query plan optimization and physical query plan conversions. The second-stage aggregator is responsible for performing the aggregation of the independent results returned by query involved data nodes at the client side to generate final results. The continuous query manager is dedicated to managing the continuous queries created by users, including issuing fixed-interval query requests and writing the results back to TDengine or returning to the client application as needed. Also, the client is also responsible for retrying after the query fails, canceling the query request, and maintaining the connection heartbeat and reporting the query status to the management node.
Management Node. The management node keeps the metadata of all the data of the entire cluster system, provides the metadata of the data required for the query from the client node, and divides the query request according to the load condition of the cluster. The super table contains information about all the tables created according to the super table, so the query processor (Query Executor) of the management node is responsible for the query processing of the tags of tables and returns the table information satisfying the tag query. Besides, the management node maintains the query status of the cluster in the Query Status Manager component, in which the metadata of all queries that are currently executing are temporarily stored in-memory buffer. When the client issues *show queries* command to management node, current running queries information is returned to the client.
Data Node. The data node, responsible for storing all data of the database, consists of query executor, query processing scheduler, query task queue, and other related components. Once the query requests from the client received, they are put into query task queue and waiting to be processed by query executor. The query executor extracts the query request from the query task queue and invokes the query optimizer to perform the basic optimization for the query execution plan. And then query executor scans the qualified data blocks in both cache and disk to obtain qualified data and return the calculated results. Besides, the data node also needs to respond to management information and commands from the management node. For example, after the *kill query* received from the management node, the query task needs to be stopped immediately.
<center> <img src="../assets/fig1.png"> </center>
<center>Fig 1. System query processing architecture diagram (only query related components)</center>
#### Query Process Design
The client, the management node, and the data node cooperate to complete the entire query processing of TDengine. Let's take a concrete SQL query as an example to illustrate the whole query processing flow. The SQL statement is to query on super table *FOO_SUPER_TABLE* to get the total number of records generated on January 12, 2019, from the table, of which TAG_LOC equals to 'beijing'. The SQL statement is as follows:
```sql
SELECT COUNT(*)
FROM FOO_SUPER_TABLE
WHERE TAG_LOC = 'beijing' AND TS >= '2019-01-12 00:00:00' AND TS < '2019-01-13 00:00:00'
```
First, the client invokes the TAOS SQL parser to parse and validate the SQL statement, then generates a syntax tree, and extracts the object of the query - the super table *FOO_SUPER_TABLE*, and then the parser sends requests with filtering information (TAG_LOC='beijing') to management node to get the corresponding metadata about *FOO_SUPER_TABLE*.
Once the management node receives the request for metadata acquisition, first finds the super table *FOO_SUPER_TABLE* basic information, and then applies the query condition (TAG_LOC='beijing') to filter all the related tables created according to it. And finally, the query executor returns the metadata information that satisfies the query request to the client.
After the client obtains the metadata information of *FOO_SUPER_TABLE*, the query executor initiates a query request with timestamp range filtering condition (TS >= '2019- 01-12 00:00:00' AND TS < '2019-01-13 00:00:00') to all nodes that hold the corresponding data according to the information about data distribution in metadata.
The data node receives the query sent from the client, converts it into an internal structure and puts it into the query task queue to be executed by query executor after optimizing the execution plan. When the query result is obtained, the query result is returned to the client. It should be noted that the data nodes perform the query process independently of each other, and rely solely on their data and content for processing.
When all data nodes involved in the query return results, the client aggregates the result sets from each data node. In this case, all results are accumulated to generate the final query result. The second stage of aggregation is not always required for all queries. For example, a column selection query does not require a second-stage aggregation at all.
#### REST Query Process
In addition to C/C++, Python, and JDBC interface, TDengine also provides a REST interface based on the HTTP protocol, which is different from using the client application programming interface. When the user uses the REST interface, all the query processing is completed on the server-side, and the user's application is not involved in query processing anymore. After the query processing is completed, the result is returned to the client through the HTTP JSON string.
<center> <img src="../assets/fig2.png"> </center>
<center>Fig. 2 REST query architecture</center>
When a client uses an HTTP-based REST query interface, the client first establishes a connection with the HTTP connector at the data node and then uses the token to ensure the reliability of the request through the REST signature mechanism. For the data node, after receiving the request, the HTTP connector invokes the embedded client program to initiate a query processing, and then the embedded client parses the SQL statement from the HTTP connector and requests the management node to get metadata as needed. After that, the embedded client sends query requests to the same data node or other nodes in the cluster and aggregates the calculation results on demand. Finally, you also need to convert the result of the query into a JSON format string and return it to the client via an HTTP response. After the HTTP connector receives the request SQL, the subsequent process processing is completely consistent with the query processing using the client application development kit.
It should be noted that during the entire processing, the client application is no longer involved in, and is only responsible for sending SQL requests through the HTTP protocol and receiving the results in JSON format. Besides, each data node is embedded with an HTTP connector and a client, so any data node in the cluster received requests from a client, the data node can initiate the query and return the result to the client through the HTTP protocol, with transfer the request to other data nodes.
#### Technology
Because TDengine stores data and tags value separately, the tag value is kept in the management node and directly associated with each table instead of records, resulting in a great reduction of the data storage. Therefore, the tag value can be managed by a fully in-memory structure. First, the filtering of the tag data can drastically reduce the data size involved in the second phase of the query. The query processing for the data is performed at the data node. TDengine takes advantage of the immutable characteristics of IoT data by calculating the maximum, minimum, and other statistics of the data in one data block on each saved data block, to effectively improve the performance of query processing. If the query process involves all the data of the entire data block, the pre-computed result is used directly, and the content of the data block is no longer needed. Since the size of disk space required to store the pre-computation result is much smaller than the size of the specific data, the pre-computation result can greatly reduce the disk IO and speed up the query processing.
TDengine employs column-oriented data storage techniques. When the data block is involved to be loaded from the disk for calculation, only the required column is read according to the query condition, and the read overhead can be minimized. The data of one column is stored in a contiguous memory block and therefore can make full use of the CPU L2 cache to greatly speed up the data scanning. Besides, TDengine utilizes the eagerly responding mechanism and returns a partial result before the complete result is acquired. For example, when the first batch of results is obtained, the data node immediately returns it directly to the client in case of a column select query.
\ No newline at end of file
#Advanced Features
##Continuous Query
Continuous Query is a query executed by TDengine periodically with a sliding window, it is a simplified stream computing driven by timers, not by events. Continuous query can be applied to a table or a STable, and the result set can be passed to the application directly via call back function, or written into a new table in TDengine. The query is always executed on a specified time window (window size is specified by parameter interval), and this window slides forward while time flows (the sliding period is specified by parameter sliding).
Continuous query is defined by TAOS SQL, there is nothing special. One of the best applications is downsampling. Once it is defined, at the end of each cycle, the system will execute the query, pass the result to the application or write it to a database.
If historical data pints are inserted into the stream, the query won't be re-executed, and the result set won't be updated. If the result set is passed to the application, the application needs to keep the status of continuous query, the server won't maintain it. If application re-starts, it needs to decide the time where the stream computing shall be started.
####How to use continuous query
- Pass result set to application
Application shall use API taos_stream (details in connector section) to start the stream computing. Inside the API, the SQL syntax is:
```sql
SELECT aggregation FROM [table_name | stable_name]
INTERVAL(window_size) SLIDING(period)
```
where the new keyword INTERVAL specifies the window size, and SLIDING specifies the sliding period. If parameter sliding is not specified, the sliding period will be the same as window size. The minimum window size is 10ms. The sliding period shall not be larger than the window size. If you set a value larger than the window size, the system will adjust it to window size automatically.
For example:
```sql
SELECT COUNT(*) FROM FOO_TABLE
INTERVAL(1M) SLIDING(30S)
```
The above SQL statement will count the number of records for the past 1-minute window every 30 seconds.
- Save the result into a database
If you want to save the result set of stream computing into a new table, the SQL shall be:
```sql
CREATE TABLE table_name AS
SELECT aggregation from [table_name | stable_name]
INTERVAL(window_size) SLIDING(period)
```
Also, you can set the time range to execute the continuous query. If no range is specified, the continuous query will be executed forever. For example, the following continuous query will be executed from now and will stop in one hour.
```sql
CREATE TABLE QUERY_RES AS
SELECT COUNT(*) FROM FOO_TABLE
WHERE TS > NOW AND TS <= NOW + 1H
INTERVAL(1M) SLIDING(30S)
```
###Manage the Continuous Query
Inside TDengine shell, you can use the command "show streams" to list the ongoing continuous queries, the command "kill stream" to kill a specific continuous query.
If you drop a table generated by the continuous query, the query will be removed too.
##Publisher/Subscriber
Time series data is a sequence of data points over time. Inside a table, the data points are stored in order of timestamp. Also, there is a data retention policy, the data points will be removed once their lifetime is passed. From another view, a table in DTengine is just a standard message queue.
To reduce the development complexity and improve data consistency, TDengine provides the pub/sub functionality. To publish a message, you simply insert a record into a table. Compared with popular messaging tool Kafka, you subscribe to a table or a SQL query statement, instead of a topic. Once new data points arrive, TDengine will notify the application. The process is just like Kafka.
The detailed API will be introduced in the [connectors](https://www.taosdata.com/en/documentation/advanced-features/) section.
##Caching
TDengine allocates a fixed-size buffer in memory, the newly arrived data will be written into the buffer first. Every device or table gets one or more memory blocks. For typical IoT scenarios, the hot data shall always be newly arrived data, they are more important for timely analysis. Based on this observation, TDengine manages the cache blocks in First-In-First-Out strategy. If no enough space in the buffer, the oldest data will be saved into hard disk first, then be overwritten by newly arrived data. TDengine also guarantees every device can keep at least one block of data in the buffer.
By this design, the application can retrieve the latest data from each device super-fast, since they are all available in memory. You can use last or last_row function to return the last data record. If the super table is used, it can be used to return the last data records of all or a subset of devices. For example, to retrieve the latest temperature from thermometers in located Beijing, execute the following SQL
```mysql
select last(*) from thermometers where location=’beijing’
```
By this design, caching tool, like Redis, is not needed in the system. It will reduce the complexity of the system.
TDengine creates one or more virtual nodes(vnode) in each data node. Each vnode contains data for multiple tables and has its own buffer. The buffer of a vnode is fully separated from the buffer of another vnode, not shared. But the tables in a vnode share the same buffer.
System configuration parameter cacheBlockSize configures the cache block size in bytes, and another parameter cacheNumOfBlocks configures the number of cache blocks. The total memory for the buffer of a vnode is $cacheBlockSize \times cacheNumOfBlocks$. Another system parameter numOfBlocksPerMeter configures the maximum number of cache blocks a table can use. When you create a database, you can specify these parameters.
\ No newline at end of file
#FAQ
#### 1. How to upgrade TDengine from 1.X versions to 2.X and above versions?
Version 2.X is a complete refactoring of the previous version, and configuration files and data files are incompatible. Be sure to do the following before upgrading:
1. Delete the configuration file, and execute <code>sudo rm -rf /etc/taos/taos</code>
2. Delete the log file, and execute <code>sudo rm -rf /var/log/taos </code>
3. ENSURE THAT YOUR DATAS ARE NO LONGER NEEDED! Delete the data file, and execute <code>sudo rm -rf /var/lib/taos </code>
4. Enjoy the latest stable version of TDengine
5. If the data needs to be migrated or the data file is corrupted, please contact the official technical support team for assistance
#### 2. When encoutered with the error "Unable to establish connection", what can I do?
The client may encounter connection errors. Please follow the steps below for troubleshooting:
1. Make sure that the client and server version Numbers are exactly the same, and that the open source community and Enterprise versions are not mixed.
2. On the server side, execute `systemctl status taosd` to check the status of *taosd* service. If *taosd* is not running, start it and retry connecting.
3. Make sure you have used the correct server IP address to connect to.
4. Ping the server. If no response is received, check your network connection.
5. Check the firewall setting, make sure the TCP/UDP ports from 6030-6039 are enabled.
6. For JDBC, ODBC, Python, Go connections on Linux, make sure the native library *libtaos.so* are located at /usr/local/lib/taos, and /usr/local/lib/taos is in the *LD_LIBRARY_PATH*.
7. For JDBC, ODBC, Python, Go connections on Windows, make sure *driver/c/taos.dll* is in the system search path (or you can copy taos.dll into *C:\Windows\System32*)
8. If the above steps can not help, try the network diagnostic tool *nc* to check if TCP/UDP port works
check UDP port:`nc -vuz {hostIP} {port} `
check TCP port on server: `nc -l {port}`
check TCP port on client: ` nc {hostIP} {port}`
#### 3. Why I get "Invalid SQL" error when a query is syntactically correct?
If you are sure your query has correct syntax, please check the length of the SQL string. Before version 2.0, it shall be less than 64KB.
#### 4. Does TDengine support validation queries?
For the time being, TDengine does not have a specific set of validation queries. However, TDengine comes with a system monitoring database named 'sys', which can usually be used as a validation query object.
#### 5. Can I delete or update a record that has been written into TDengine?
The answer is NO. The design of TDengine is based on the assumption that records are generated by the connected devices, you won't be allowed to change it. But TDengine provides a retention policy, the data records will be removed once their lifetime is passed.
#### 6. What is the most efficient way to write data to TDengine?
TDengine supports several different writing regimes. The most efficient way to write data to TDengine is to use batch inserting. For details on batch insertion syntax, please refer to [Taos SQL](../documentation/taos-sql)
......@@ -220,10 +220,6 @@ int32_t bnAllocVnodes(SVgObj *pVgroup) {
}
static bool bnCheckVgroupReady(SVgObj *pVgroup, SVnodeGid *pRmVnode) {
if (pVgroup->lbTime + 5 * tsStatusInterval > tsAccessSquence) {
return false;
}
int32_t rmVnodeVer = 0;
for (int32_t i = 0; i < pVgroup->numOfVnodes; ++i) {
SVnodeGid *pVnode = pVgroup->vnodeGid + i;
......@@ -405,7 +401,7 @@ void bnReset() {
if (pDnode == NULL) break;
// while master change, should reset dnode to offline
mInfo("dnode:%d set access:%d to 0", pDnode->dnodeId, pDnode->lastAccess);
mInfo("dnode:%d set access:%" PRId64 " to 0", pDnode->dnodeId, pDnode->lastAccess);
pDnode->lastAccess = 0;
if (pDnode->status != TAOS_DN_STATUS_DROPPING) {
pDnode->status = TAOS_DN_STATUS_OFFLINE;
......@@ -499,7 +495,7 @@ static bool bnMontiorDropping() {
if (dnodeIsMasterEp(pDnode->dnodeEp)) continue;
if (mnodeGetDnodesNum() <= 1) continue;
mLInfo("dnode:%d, set to removing state for it offline:%d seconds", pDnode->dnodeId,
mLInfo("dnode:%d, set to removing state for it offline:%" PRId64 " seconds", pDnode->dnodeId,
tsAccessSquence - pDnode->lastAccess);
pDnode->status = TAOS_DN_STATUS_DROPPING;
......@@ -574,8 +570,8 @@ void bnCheckStatus() {
if (pDnode->status != TAOS_DN_STATUS_DROPPING && pDnode->status != TAOS_DN_STATUS_OFFLINE) {
pDnode->status = TAOS_DN_STATUS_OFFLINE;
pDnode->offlineReason = TAOS_DN_OFF_STATUS_MSG_TIMEOUT;
mInfo("dnode:%d, set to offline state, access seq:%d last seq:%d laststat:%d", pDnode->dnodeId, tsAccessSquence,
pDnode->lastAccess, pDnode->status);
mInfo("dnode:%d, set to offline state, access seq:%" PRId64 " last seq:%" PRId64 " laststat:%d", pDnode->dnodeId,
tsAccessSquence, pDnode->lastAccess, pDnode->status);
bnSetVgroupOffline(pDnode);
bnStartTimer(3000);
}
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册