From d5578540ea177626b95c1063742ac938008e6580 Mon Sep 17 00:00:00 2001
From: gccgdb1234
Date: Fri, 13 May 2022 12:54:22 +0800
Subject: [PATCH] docs: English version develop chapter

---
 .../03-insert-data/02-influxdb-line.mdx    |   2 +-
 docs-en/04-develop/04-query-data/index.mdx |   2 +-
 docs-en/04-develop/05-continuous-query.mdx |  47 ++--
 docs-en/04-develop/06-subscribe.mdx        | 108 ++++-----
 docs-en/04-develop/07-cache.md             |  18 +-
 docs-en/04-develop/08-udf.md               | 206 +++++++++---------
 6 files changed, 197 insertions(+), 186 deletions(-)

diff --git a/docs-en/04-develop/03-insert-data/02-influxdb-line.mdx b/docs-en/04-develop/03-insert-data/02-influxdb-line.mdx
index c22dc0b59d..b5ea308803 100644
--- a/docs-en/04-develop/03-insert-data/02-influxdb-line.mdx
+++ b/docs-en/04-develop/03-insert-data/02-influxdb-line.mdx
@@ -1,6 +1,6 @@
---
sidebar_label: InfluxDB Line Protocol
-title: Insert with InfluxDB Line Protocol
+title: InfluxDB Line Protocol
---

import Tabs from "@theme/Tabs";

diff --git a/docs-en/04-develop/04-query-data/index.mdx b/docs-en/04-develop/04-query-data/index.mdx
index 2815e57bea..641dc23cb8 100644
--- a/docs-en/04-develop/04-query-data/index.mdx
+++ b/docs-en/04-develop/04-query-data/index.mdx
@@ -133,7 +133,7 @@ For more details please refer to [Aggregate by Window](/taos-sql/interval).

### Query

-In [Insert](/develop/insert-data/sql-writing), a database named `power` is created and some data are inserted into stable `meters`. Below sample code demonstrates how to query the data in this stable.
+In the section describing [Insert](/develop/insert-data/sql-writing), a database named `power` is created and some data are inserted into the STable `meters`. The sample code below demonstrates how to query the data in this STable.

diff --git a/docs-en/04-develop/05-continuous-query.mdx b/docs-en/04-develop/05-continuous-query.mdx
index 2fd1b3cc75..f0250d9cf2 100644
--- a/docs-en/04-develop/05-continuous-query.mdx
+++ b/docs-en/04-develop/05-continuous-query.mdx
@@ -1,20 +1,20 @@
---
-sidebar_label: 连续查询
-description: "连续查询是一个按照预设频率自动执行的查询功能,提供按照时间窗口的聚合查询能力,是一种简化的时间驱动流式计算。"
-title: "连续查询(Continuous Query)"
+sidebar_label: Continuous Query
+description: "A continuous query is a query that is executed automatically at a predefined frequency to provide aggregation over time windows; it is essentially simplified, time-driven stream computing."
+title: "Continuous Query"
---

-连续查询是 TDengine 定期自动执行的查询,采用滑动窗口的方式进行计算,是一种简化的时间驱动的流式计算。针对库中的表或超级表,TDengine 可提供定期自动执行的连续查询,用户可让 TDengine 推送查询的结果,也可以将结果再写回到 TDengine 中。每次执行的查询是一个时间窗口,时间窗口随着时间流动向前滑动。在定义连续查询的时候需要指定时间窗口(time window, 参数 interval)大小和每次前向增量时间(forward sliding times, 参数 sliding)。
+A continuous query is a query that is executed automatically at a predefined frequency to provide aggregation over time windows; it is essentially simplified, time-driven stream computing. A continuous query can be performed on a table or an STable, and its result can be pushed to the client or written back to TDengine. Each execution covers one time window, and the window slides forward as time goes by. When defining a continuous query, the size of the time window and the forward sliding step must be specified with the parameters `INTERVAL` and `SLIDING` respectively.

-TDengine 的连续查询采用时间驱动模式,可以直接使用 TAOS SQL 进行定义,不需要额外的操作。使用连续查询,可以方便快捷地按照时间窗口生成结果,从而对原始采集数据进行降采样(down sampling)。用户通过 TAOS SQL 定义连续查询以后,TDengine 自动在最后的一个完整的时间周期末端拉起查询,并将计算获得的结果推送给用户或者写回 TDengine。
+Continuous queries in TDengine are time driven and can be defined directly in TAOS SQL without any extra operations.
With a continuous query, results can be generated per time window to achieve down sampling of the original data. Once a continuous query is defined in TAOS SQL, it is executed automatically at the end of each time window and its result is pushed to the client or written back to TDengine.

-TDengine 提供的连续查询与普通流计算中的时间窗口计算具有以下区别:
+A continuous query in TDengine differs from the time window computation in stream computing in the following ways:

- 不同于流计算的实时反馈计算结果,连续查询只在时间窗口关闭以后才开始计算。例如时间周期是 1 天,那么当天的结果只会在 23:59:59 以后才会生成。
- 如果有历史记录写入到已经计算完成的时间区间,连续查询并不会重新进行计算,也不会重新将结果推送给用户。对于写回 TDengine 的模式,也不会更新已经存在的计算结果。
- 使用连续查询推送结果的模式,服务端并不缓存客户端计算状态,也不提供 Exactly-Once 的语义保证。如果用户的应用端崩溃,再次拉起的连续查询将只会从再次拉起的时间开始重新计算最近的一个完整的时间窗口。如果使用写回模式,TDengine 可确保数据写回的有效性和连续性。

- In stream computing the result is computed and fed back in real time, while a continuous query only starts computing after a time window closes. For example, if the time window is 1 day, the result for that day is only generated after 23:59:59.
- If a historical data row is written into a time window whose computation has already finished, the computation is not performed again and the result is not pushed to the client again. In write-back mode, the result already stored in TDengine is not updated either.
- When results are pushed to the client, the server neither caches the client's computation state nor provides exactly-once semantics. If the client program crashes, the restarted continuous query recomputes only the most recent complete time window, starting from the restart time. In write-back mode, TDengine guarantees the validity and continuity of the written results.

-## 连续查询语法
+## Syntax

```sql
[CREATE TABLE AS] SELECT select_expr [, select_expr ...]
    FROM {tb_name | stb_name}
    [WHERE where_condition]
    [INTERVAL(interval_val [, interval_offset]) [SLIDING sliding_val]]

```

-INTERVAL: 连续查询作用的时间窗口
+INTERVAL: the time window over which the continuous query is computed

-SLIDING: 连续查询的时间窗口向前滑动的时间间隔
+SLIDING: the step by which the time window slides forward each time

-## 使用连续查询
+## How to Use

-下面以智能电表场景为例介绍连续查询的具体使用方法。假设我们通过下列 SQL 语句创建了超级表和子表:
+This section uses the smart meters scenario to show how to use continuous queries. Assume the STable and subtables have been created with the SQL statements below.

```sql
create table meters (ts timestamp, current float, voltage int, phase float) tags (location binary(64), groupId int);
create table D1001 using meters tags ("Beijing.Chaoyang", 2);
create table D1002 using meters tags ("Beijing.Haidian", 2);
-...
```

-可以通过下面这条 SQL 语句以一分钟为时间窗口、30 秒为前向增量统计这些电表的平均电压。
+The average voltage of these meters, over one-minute time windows sliding forward every 30 seconds, can be computed with the SQL statement below.

```sql
select avg(voltage) from meters interval(1m) sliding(30s);
```

-每次执行这条语句,都会重新计算所有数据。 如果需要每隔 30 秒执行一次来增量计算最近一分钟的数据,可以把上面的语句改进成下面的样子,每次使用不同的 `startTime` 并定期执行:
+Whenever the above statement is executed, all the existing data is computed again. If instead the computation should run automatically every 30 seconds and incrementally cover only the last minute of data, the statement needs to be revised as below and executed periodically with a different `{startTime}` each time, where `{startTime}` stands for the start of the latest time window.
```sql
select avg(voltage) from meters where ts > {startTime} interval(1m) sliding(30s);
```

-这样做没有问题,但 TDengine 提供了更简单的方法,只要在最初的查询语句前面加上 `create table {tableName} as` 就可以了,例如:
+An easier way to achieve the same purpose is to prepend `create table {tableName} as` to the original `select` statement.

```sql
create table avg_vol as select avg(voltage) from meters interval(1m) sliding(30s);
```

-会自动创建一个名为 `avg_vol` 的新表,然后每隔 30 秒,TDengine 会增量执行 `as` 后面的 SQL 语句,并将查询结果写入这个表中,用户程序后续只要从 `avg_vol` 中查询数据即可。例如:
+A table named `avg_vol` is created automatically. Every 30 seconds, TDengine incrementally executes the `select` statement after `as` on the data of the latest time window, i.e. the past one minute, and writes the result into `avg_vol`; the client program then simply queries table `avg_vol`. For example:

```sql
taos> select * from avg_vol;
 2020-07-29 13:39:00.000 | 223.0800000 |
```

-需要注意,查询时间窗口的最小值是 10 毫秒,没有时间窗口范围的上限。
+Please note that the minimum allowed time window is 10 milliseconds, and there is no upper limit.

-此外,TDengine 还支持用户指定连续查询的起止时间。如果不输入开始时间,连续查询将从第一条原始数据所在的时间窗口开始;如果没有输入结束时间,连续查询将永久运行;如果用户指定了结束时间,连续查询在系统时间达到指定的时间以后停止运行。比如使用下面的 SQL 创建的连续查询将运行一小时,之后会自动停止。
+It's also possible to specify the start and end time of a continuous query. If the start time is not specified, the query starts from the time window containing the first original row; if the end time is not specified, the continuous query runs forever; if the end time is specified, the query stops once the system time reaches it. For example, the continuous query created by the SQL statement below starts now and terminates one hour later.

```sql
create table avg_vol as select avg(voltage) from meters where ts > now and ts <= now + 1h interval(1m) sliding(30s);
```

-需要说明的是,上面例子中的 `now` 是指创建连续查询的时间,而不是查询执行的时间,否则,查询就无法自动停止了。另外,为了尽量避免原始数据延迟写入导致的问题,TDengine 中连续查询的计算有一定的延迟。也就是说,一个时间窗口过去后,TDengine 并不会立即计算这个窗口的数据,所以要稍等一会(一般不会超过 1 分钟)才能查到计算结果。
+`now` in the above SQL statement refers to the time when the continuous query is created, not the time when it is executed; otherwise the query could never stop automatically. In addition, to minimize problems caused by late-arriving data, the computation in a continuous query is started with a slight delay. That means, once a time window closes, TDengine does not compute its data immediately; the result usually becomes available a short while, normally within one minute, after the window closes.

-## 管理连续查询
+## How to Manage

-用户可在控制台中通过 `show streams` 命令来查看系统中全部运行的连续查询,并可以通过 `kill stream` 命令杀掉对应的连续查询。后续版本会提供更细粒度和便捷的连续查询管理命令。
+The `show streams` command can be used in the TDengine CLI `taos` to show all continuous queries running in the system, and the `kill stream` command can be used to terminate a continuous query.

diff --git a/docs-en/04-develop/06-subscribe.mdx b/docs-en/04-develop/06-subscribe.mdx
index d471c114e8..f80667032b 100644
--- a/docs-en/04-develop/06-subscribe.mdx
+++ b/docs-en/04-develop/06-subscribe.mdx
@@ -1,7 +1,7 @@
---
-sidebar_label: 数据订阅
-description: "轻量级的数据订阅与推送服务。连续写入到 TDengine 中的时序数据能够被自动推送到订阅客户端。"
-title: 数据订阅
+sidebar_label: Subscription
+description: "A lightweight data subscription and push service. Time series data continuously written into TDengine is automatically pushed to subscribing clients."
+title: Data Subscription
---

import Tabs from "@theme/Tabs";
@@ -14,13 +14,13 @@ import Node from "./_sub_node.mdx";
import CSharp from "./_sub_cs.mdx";
import CDemo from "./_sub_c.mdx";

-基于数据天然的时间序列特性,TDengine 的数据写入(insert)与消息系统的数据发布(pub)逻辑上一致,均可视为系统中插入一条带时间戳的新记录。同时,TDengine 在内部严格按照数据时间序列单调递增的方式保存数据。本质上来说,TDengine 中每一张表均可视为一个标准的消息队列。
+## Introduction

-TDengine 内嵌支持轻量级的消息订阅与推送服务。使用系统提供的 API,用户可使用普通查询语句订阅数据库中的一张或多张表。订阅的逻辑和操作状态的维护均是由客户端完成,客户端定时轮询服务器是否有新的记录到达,有新的记录到达就会将结果反馈到客户。
+Due to the time series nature of the data, inserting a row into TDengine is logically equivalent to publishing a message in a message queue: both can be regarded as inserting a new record with a timestamp into the system. Because data is stored inside TDengine strictly in ascending order of timestamp, each table in TDengine can essentially be regarded as a standard message queue.

-TDengine 的订阅与推送服务的状态是由客户端维持,TDengine 服务端并不维持。因此如果应用重启,从哪个时间点开始获取最新数据,由应用决定。
+A lightweight data subscription and push service is built into TDengine. With the API provided by TDengine, client programs can use a `select` statement to subscribe to data from one or more tables. The subscription logic and state maintenance are performed on the client side: the client program polls the server periodically to check whether new records have arrived, and new records are then pushed to the client. If the client program is restarted, it is up to the client side to decide from which point in time to start retrieving new data.

-TDengine 的 API 中,与订阅相关的主要有以下三个:
+There are three major APIs related to subscription in the TDengine client driver.

```c
taos_subscribe
@@ -28,9 +28,11 @@ taos_consume
taos_unsubscribe
```

-这些 API 的文档请见 [C/C++ Connector](/reference/connector/cpp),下面仍以智能电表场景为例介绍一下它们的具体用法(超级表和子表结构请参考上一节“连续查询”),完整的示例代码可以在 [这里](https://github.com/taosdata/TDengine/blob/master/examples/c/subscribe.c) 找到。
+For details about these APIs please refer to [C/C++ Connector](/reference/connector/cpp). Their usage is introduced below using the smart meters use case; for the schema of the STable and subtables please refer to the previous section, "Continuous Query". The full sample code can be found [here](https://github.com/taosdata/TDengine/blob/master/examples/c/subscribe.c).

-如果我们希望当某个电表的电流超过一定限制(比如 10A)后能得到通知并进行一些处理, 有两种方法:一是分别对每张子表进行查询,每次查询后记录最后一条数据的时间戳,后续只查询这个时间戳之后的数据:
+If we want to be notified and take action whenever the current of any meter exceeds a threshold, such as 10A, there are two ways.

+The first way is to query each subtable separately, record the timestamp of the last matching row after each query, and then query only for data newer than that timestamp, repeating this process. The SQL statements for this approach are as below.

```sql
select * from D1001 where ts > {last_timestamp1} and current > 10;
@@ -38,19 +40,19 @@ select * from D1002 where ts > {last_timestamp2} and current > 10;
...
```

-这确实可行,但随着电表数量的增加,查询数量也会增加,客户端和服务端的性能都会受到影响,当电表数增长到一定的程度,系统就无法承受了。
+This works, but the number of `select` statements grows with the number of meters, hurting the performance of both the client side and the server side; once the number of meters is large enough, the load becomes unacceptable, as the sketch below illustrates.
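To make the cost concrete, below is a minimal sketch of the per-table polling loop using the C connector. The established connection `taos`, the helper name `poll_one_table`, and the hard-coded column layout are illustrative assumptions rather than part of the official sample code.

```c
#include <inttypes.h>
#include <stdio.h>
#include "taos.h" // TDengine C client header

// Hypothetical helper: poll one subtable once and return the newest
// timestamp seen, so the caller can pass it back in as last_ts next round.
static int64_t poll_one_table(TAOS *taos, const char *table, int64_t last_ts) {
  char sql[256];
  snprintf(sql, sizeof(sql),
           "select ts, current from %s where ts > %" PRId64 " and current > 10;",
           table, last_ts);
  TAOS_RES *res = taos_query(taos, sql);
  if (taos_errno(res) != 0) {        // query failed; keep the old timestamp
    taos_free_result(res);
    return last_ts;
  }
  TAOS_ROW row;
  while ((row = taos_fetch_row(res)) != NULL) {
    int64_t ts = *(int64_t *)row[0]; // column 0 is the timestamp
    if (ts > last_ts) last_ts = ts;  // remember the newest row consumed
    // ... notify the application about this over-threshold row here ...
  }
  taos_free_result(res);
  return last_ts;
}
```

Every meter requires one such call per polling round, so the load grows linearly with the number of meters, which is exactly the problem the subscription feature described next avoids.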
-另一种方法是对超级表进行查询。这样,无论有多少电表,都只需一次查询:
+A better way is to query the STable: regardless of the number of meters, only one `select` statement is needed.

```sql
select * from meters where ts > {last_timestamp} and current > 10;
```

-但是,如何选择 `last_timestamp` 就成了一个新的问题。因为,一方面数据的产生时间(也就是数据时间戳)和数据入库的时间一般并不相同,有时偏差还很大;另一方面,不同电表的数据到达 TDengine 的时间也会有差异。所以,如果我们在查询中使用最慢的那台电表的数据的时间戳作为 `last_timestamp`,就可能重复读入其它电表的数据;如果使用最快的电表的时间戳,其它电表的数据就可能被漏掉。
+However, choosing `last_timestamp` then becomes a new problem. Firstly, the time when the data is generated (i.e. its timestamp) usually differs from the time when it is inserted into the database, sometimes by a lot. Secondly, data from different meters may arrive at TDengine at different times. If the timestamp of the "slowest" meter is used as `last_timestamp`, data from other meters may be read repeatedly; if the timestamp of the "fastest" meter is used, data from other meters may be missed.

-TDengine 的订阅功能为上面这个问题提供了一个彻底的解决方案。
+All the problems above can be resolved thoroughly by the subscription feature of TDengine.

-首先是使用 `taos_subscribe` 创建订阅:
+The first step is to create a subscription using `taos_subscribe`.

```c
TAOS_SUB* tsub = NULL;
if (async) {
@@ -63,31 +65,31 @@ if (async) {
}
```

-TDengine 中的订阅既可以是同步的,也可以是异步的,上面的代码会根据从命令行获取的参数 `async` 的值来决定使用哪种方式。这里,同步的意思是用户程序要直接调用 `taos_consume` 来拉取数据,而异步则由 API 在内部的另一个线程中调用 `taos_consume`,然后把拉取到的数据交给回调函数 `subscribe_callback`去处理。(注意,`subscribe_callback` 中不宜做较为耗时的操作,否则有可能导致客户端阻塞等不可控的问题。)
+A subscription in TDengine can be either synchronous or asynchronous; in the sample code above, the value of the variable `async`, taken from the command line, determines which kind is created. Synchronous means the client program invokes `taos_consume` directly to retrieve data, while asynchronous means another thread, created internally by `taos_subscribe`, invokes `taos_consume` and passes the retrieved data to the callback function `subscribe_callback` for processing. `subscribe_callback` is provided by the client program; note that time-consuming operations should be avoided in it, otherwise the client may block or behave unpredictably.

-参数 `taos` 是一个已经建立好的数据库连接,在同步模式下无特殊要求。但在异步模式下,需要注意它不会被其它线程使用,否则可能导致不可预计的错误,因为回调函数在 API 的内部线程中被调用,而 TDengine 的部分 API 不是线程安全的。
+The parameter `taos` is an established database connection. It requires nothing special in synchronous mode, but in asynchronous mode care must be taken that it is not used by any other thread; otherwise unpredictable errors may occur, because the callback function is invoked from an internal thread of the API and some TDengine APIs are not thread-safe.

-参数 `sql` 是查询语句,可以在其中使用 where 子句指定过滤条件。在我们的例子中,如果只想订阅电流超过 10A 时的数据,可以这样写:
+The parameter `sql` is a `select` statement whose `where` clause can specify filter conditions. In our example, to subscribe only to data whose current exceeds 10A, the statement can be written as below.

```sql
select * from meters where current > 10;
```

-注意,这里没有指定起始时间,所以会读到所有时间的数据。如果只想从一天前的数据开始订阅,而不需要更早的历史数据,可以再加上一个时间条件:
+Please note that no start time is specified here, so data of all times will be read. To subscribe only to data starting from one day ago and skip older history, add a time condition:

```sql
select * from meters where ts > now - 1d and current > 10;
```

-订阅的 `topic` 实际上是它的名字,因为订阅功能是在客户端 API 中实现的,所以没必要保证它全局唯一,但需要它在一台客户端机器上唯一。
+The parameter `topic` is the name of the subscription. Because the subscription feature is implemented in the client-side API, the name doesn't need to be globally unique, but it must be unique on one client machine.
-如果名为 `topic` 的订阅不存在,参数 `restart` 没有意义;但如果用户程序创建这个订阅后退出,当它再次启动并重新使用这个 `topic` 时,`restart` 就会被用于决定是从头开始读取数据,还是接续上次的位置进行读取。本例中,如果 `restart` 是 **true**(非零值),用户程序肯定会读到所有数据。但如果这个订阅之前就存在了,并且已经读取了一部分数据,且 `restart` 是 **false**(**0**),用户程序就不会读到之前已经读取的数据了。
+If the subscription named `topic` doesn't exist, the parameter `restart` is meaningless. But if the subscription was created before by a client program that then exited, when the program restarts and reuses this `topic`, `restart` determines whether to read data from the beginning or to continue from where the subscription left off. In this example, if `restart` is **true** (i.e. a non-zero value), the client program will definitely read all data again; if the subscription already exists and some data has been consumed, and `restart` is **false** (i.e. zero), the data already consumed will not be read again.

-`taos_subscribe`的最后一个参数是以毫秒为单位的轮询周期。在同步模式下,如果前后两次调用 `taos_consume` 的时间间隔小于此时间,`taos_consume` 会阻塞,直到间隔超过此时间。异步模式下,这个时间是两次调用回调函数的最小时间间隔。
+The last parameter of `taos_subscribe` is the polling interval in milliseconds. In synchronous mode, if the time between two consecutive invocations of `taos_consume` is shorter than this interval, `taos_consume` blocks until the interval has elapsed. In asynchronous mode, it is the minimum interval between two invocations of the callback function.

-`taos_subscribe` 的倒数第二个参数用于用户程序向回调函数传递附加参数,订阅 API 不对其做任何处理,只原样传递给回调函数。此参数在同步模式下无意义。
+The second-to-last parameter of `taos_subscribe` is used by the client program to pass additional arguments to the callback function; the subscription API does not process it and simply passes it through unchanged. It is meaningless in synchronous mode.

-订阅创建以后,就可以消费其数据了,同步模式下,示例代码是下面的 else 部分:
+After the subscription is created, its data can be consumed and processed. The sample code for consuming data in synchronous mode is the `else` branch of `if (async)` below.

```c
if (async) {
@@ -104,7 +106,7 @@ if (async) {
}
```

-这里是一个 **while** 循环,用户每按一次回车键就调用一次 `taos_consume`,而 `taos_consume` 的返回值是查询到的结果集,与 `taos_use_result` 完全相同,例子中使用这个结果集的代码是函数 `print_result`:
+This is a **while** loop in which `taos_consume` is invoked each time the user presses Enter. The return value of `taos_consume` is the query result set, exactly the same as that of `taos_use_result`; in the example, the code that processes this result set is the function `print_result`, shown below.

```c
void print_result(TAOS_RES* res, int blockFetch) {
@@ -131,7 +133,9 @@ void print_result(TAOS_RES* res, int blockFetch) {
}
```

-其中的 `taos_print_row` 用于处理订阅到数据,在我们的例子中,它会打印出所有符合条件的记录。而异步模式下,消费订阅到的数据则显得更为简单:
+In the code above, `taos_print_row` is used to process the subscribed data; in our example, it prints all matching rows.

+In asynchronous mode, consuming the subscribed data is even simpler:

```c
void subscribe_callback(TAOS_SUB* tsub, TAOS_RES *res, void* param, int code) {
@@ -139,44 +143,43 @@ void subscribe_callback(TAOS_SUB* tsub, TAOS_RES *res, void* param, int code) {
}
```

-当要结束一次数据订阅时,需要调用 `taos_unsubscribe`:
+To terminate a subscription, invoke `taos_unsubscribe`:

```c
taos_unsubscribe(tsub, keep);
```

-其第二个参数,用于决定是否在客户端保留订阅的进度信息。如果这个参数是**false**(**0**),那无论下次调用 `taos_subscribe` 时的 `restart` 参数是什么,订阅都只能重新开始。另外,进度信息的保存位置是 _{DataDir}/subscribe/_ 这个目录下,每个订阅有一个与其 `topic` 同名的文件,删掉某个文件,同样会导致下次创建其对应的订阅时只能重新开始。
+The second parameter `keep` specifies whether to keep the subscription progress on the client side. If it is **false**, i.e.
**0**, the subscription can only restart from the beginning, regardless of the `restart` parameter's value when `taos_subscribe` is invoked again. The progress information is stored under _{DataDir}/subscribe/_, where each subscription has a file named after its `topic`; removing a subscription's progress file likewise forces it to restart from the beginning the next time it is created.

-代码介绍完毕,我们来看一下实际的运行效果。假设:
+Now let's see how the sample code behaves in practice, assuming the prerequisites below are met.

- 示例代码已经下载到本地
- TDengine 也已经在同一台机器上安装好
- 示例所需的数据库、超级表、子表已经全部创建好

- The sample code has been downloaded to the local system
- TDengine has been installed and launched properly on the same system
- The database, STable, and subtables required by the sample code have been created

-则可以在示例代码所在目录执行以下命令来编译并启动示例程序:
+Then the commands below can be executed in the directory containing the sample code to compile and start the program.

```bash
make
./subscribe -sql='select * from meters where current > 10;'
```

-示例程序启动后,打开另一个终端窗口,启动 TDengine CLI 向 **D1001** 插入一条电流为 12A 的数据:
+After the program starts, open another terminal window, launch the TDengine CLI `taos`, and use the SQL statements below to insert a row whose current is 12A into table **D1001**.

```sql
use test;
insert into D1001 values(now, 12, 220, 1);
```

-这时,因为电流超过了 10A,您应该可以看到示例程序将它输出到了屏幕上。您可以继续插入一些数据观察示例程序的输出。
+Because its current exceeds 10A, this row will then be printed by the example program on the first terminal. More rows can be inserted to observe the program's output.

-## 示例程序
+## Examples

-下面的示例程序展示是如何使用连接器订阅所有电流超过 10A 的记录。
+The example programs below demonstrate how to use the connectors to subscribe to all data rows whose current exceeds 10A.

-### 准备数据
+### Prepare Data

```bash
# create database "power"
taos> create database power;
# use "power" as the database in following operations
taos> use power;
@@ -200,20 +203,21 @@ taos> select * from meters where current > 10;
 2020-08-15 12:20:00.000 | 12.20000 | 220 | 1 | Beijing.Chaoyang | 2 |
Query OK, 5 row(s) in set (0.004896s)
```

-### 示例代码
+### Example Programs

{/* */}

{/*
@@ -222,13 +226,13 @@
*/}

-### 运行示例程序
+### Run the Examples

-示例程序会先消费符合查询条件的所有历史数据:
+The example programs first consume all historical data matching the criteria.

```bash
ts: 1597464000000 current: 12.0 voltage: 220 phase: 1 location: Beijing.Chaoyang groupid : 2
@@ -238,7 +242,7 @@ ts: 1597464600000 current: 10.3 voltage: 220 phase: 1 location: Beijing.Haidian
ts: 1597465200000 current: 11.2 voltage: 220 phase: 1 location: Beijing.Haidian groupid : 2
```

-接着,使用 TDengine CLI 向表中新增一条数据:
+Next, use the TDengine CLI to insert a new row.

```
# taos
@@ -246,7 +250,7 @@ taos> use power;
taos> insert into d1001 values(now, 12.4, 220, 1);
```

-因为这条数据的电流大于 10A,示例程序会将其消费:
+Because the current in the inserted row exceeds 10A, the row will be consumed by the example program.
```
ts: 1651146662805 current: 12.4 voltage: 220 phase: 1 location: Beijing.Chaoyang groupid: 2
```

diff --git a/docs-en/04-develop/07-cache.md b/docs-en/04-develop/07-cache.md
index fd31335310..3148d84abe 100644
--- a/docs-en/04-develop/07-cache.md
+++ b/docs-en/04-develop/07-cache.md
@@ -1,21 +1,19 @@
---
-sidebar_label: 缓存
-title: 缓存
-description: "提供写驱动的缓存管理机制,将每个表最近写入的一条记录持续保存在缓存中,可以提供高性能的最近状态查询。"
+sidebar_label: Cache
+title: Cache
+description: "The latest row of each table is kept in cache to provide high-performance queries of the latest state."
---

-TDengine 采用时间驱动缓存管理策略(First-In-First-Out,FIFO),又称为写驱动的缓存管理机制。这种策略有别于读驱动的数据缓存模式(Least-Recent-Used,LRU),直接将最近写入的数据保存在系统的缓存中。当缓存达到临界值的时候,将最早的数据批量写入磁盘。一般意义上来说,对于物联网数据的使用,用户最为关心最近产生的数据,即当前状态。TDengine 充分利用了这一特性,将最近到达的(当前状态)数据保存在缓存中。
+The cache management policy in TDengine is First-In-First-Out (FIFO), also known as a write-driven cache management policy, as opposed to read-driven cache management, i.e. Least-Recently-Used (LRU). The policy simply keeps the most recently written data in cache and flushes the oldest data to disk in batches when cache usage reaches a threshold. In IoT use cases, what users care about most is the most recently generated data, i.e. the current state; the cache policy in TDengine takes full advantage of this characteristic of IoT data.

-TDengine 通过查询函数向用户提供毫秒级的数据获取能力。直接将最近到达的数据保存在缓存中,可以更加快速地响应用户针对最近一条或一批数据的查询分析,整体上提供更快的数据库查询响应能力。从这个意义上来说,可通过设置合适的配置参数将 TDengine 作为数据缓存来使用,而不需要再部署额外的缓存系统,可有效地简化系统架构,降低运维的成本。需要注意的是,TDengine 重启以后系统的缓存将被清空,之前缓存的数据均会被批量写入磁盘,缓存的数据将不会像专门的 key-value 缓存系统再将之前缓存的数据重新加载到缓存中。
+Caching the latest data makes it retrievable in milliseconds. In this sense, a properly configured TDengine can be used as a caching system without deploying a separate one, which simplifies the system architecture and reduces operating costs. Be aware that the cache is emptied when TDengine restarts: unlike a dedicated key-value caching system, TDengine does not reload previously cached data from disk into the cache.

-TDengine 分配固定大小的内存空间作为缓存空间,缓存空间可根据应用的需求和硬件资源配置。通过适当的设置缓存空间,TDengine 可以提供极高性能的写入和查询的支持。TDengine 中每个虚拟节点(virtual node)创建时分配独立的缓存池。每个虚拟节点管理自己的缓存池,不同虚拟节点间不共享缓存池。每个虚拟节点内部所属的全部表共享该虚拟节点的缓存池。
+The memory used by the TDengine cache is fixed in size and can be configured according to application requirements and system resources. An independent memory pool is allocated for and managed by each vnode (virtual node); memory pools are not shared between vnodes, and all the tables belonging to a vnode share that vnode's cache.

-TDengine 将内存池按块划分进行管理,数据在内存块里是以行(row)的形式存储。一个 vnode 的内存池是在 vnode 创建时按块分配好,而且每个内存块按照先进先出的原则进行管理。在创建内存池时,块的大小由系统配置参数 cache 决定;每个 vnode 中内存块的数目则由配置参数 blocks 决定。因此对于一个 vnode,总的内存大小为:`cache * blocks`。一个 cache block 需要保证每张表能存储至少几十条以上记录,才会有效率。
+A memory pool is divided into blocks; data is stored in row format in memory and each block follows the FIFO policy. The size of each block is determined by the configuration parameter `cache`, and the number of blocks per vnode by `blocks`, so the total cache size of a vnode is `cache * blocks`. For efficiency, each block should be big enough to hold at least tens of rows per table.

-你可以通过函数 last_row() 快速获取一张表或一张超级表的最后一条记录,这样很便于在大屏显示各设备的实时状态或采集值。例如:
+The `last_row` function can be used to quickly retrieve the last row of a table or an STable, which is convenient for showing the current state of devices on a monitoring screen. For example, the SQL statement below retrieves the latest voltage of all meters in the Chaoyang district of Beijing.
```sql
select last_row(voltage) from meters where location='Beijing.Chaoyang';
```
-
-该 SQL 语句将获取所有位于北京朝阳区的电表最后记录的电压值。
diff --git a/docs-en/04-develop/08-udf.md b/docs-en/04-develop/08-udf.md
index 09681650db..893eba80bb 100644
--- a/docs-en/04-develop/08-udf.md
+++ b/docs-en/04-develop/08-udf.md
@@ -1,180 +1,190 @@
---
-sidebar_label: 用户定义函数
-title: UDF(用户定义函数)
-description: "支持用户编码的聚合函数和标量函数,在查询中嵌入并使用用户定义函数,拓展查询的能力和功能。"
+sidebar_label: UDF
+title: User Defined Functions
+description: "Scalar functions and aggregate functions developed by users can be utilized by the query framework to expand the query capability"
---

-在有些应用场景中,应用逻辑需要的查询无法直接使用系统内置的函数来表示。利用 UDF 功能,TDengine 可以插入用户编写的处理代码并在查询中使用它们,就能够很方便地解决特殊应用场景中的使用需求。 UDF 通常以数据表中的一列数据做为输入,同时支持以嵌套子查询的结果作为输入。
+In some use cases, the query capability required by application programs can't be achieved directly with builtin functions. With UDF, user-developed functions can be plugged into TDengine and used in queries to meet such special requirements. A UDF normally takes one column of data as input, but the result of a nested subquery can also be used as input.

-从 2.2.0.0 版本开始,TDengine 支持通过 C/C++ 语言进行 UDF 定义。接下来结合示例讲解 UDF 的使用方法。
+Since version 2.2.0.0, TDengine supports UDFs programmed in C/C++. Their usage is explained below with examples.

-用户可以通过 UDF 实现两类函数: 标量函数 和 聚合函数。
+Two kinds of functions can be implemented as UDF: scalar functions and aggregate functions.

-## 用 C/C++ 语言来定义 UDF
+## Define UDF

-### 标量函数
+### Scalar Function

-用户可以按照下列函数模板定义自己的标量计算函数
+The function template below can be used to define your own scalar function.

- `void udfNormalFunc(char* data, short itype, short ibytes, int numOfRows, long long* ts, char* dataOutput, char* interBuf, char* tsOutput, int* numOfOutput, short otype, short obytes, SUdfInit* buf)`
-
- 其中 udfNormalFunc 是函数名的占位符,以上述模板实现的函数对行数据块进行标量计算,其参数项是固定的,用于按照约束完成与引擎之间的数据交换。
+`void udfNormalFunc(char* data, short itype, short ibytes, int numOfRows, long long* ts, char* dataOutput, char* interBuf, char* tsOutput, int* numOfOutput, short otype, short obytes, SUdfInit* buf)`

-- udfNormalFunc 中各参数的具体含义是:
-  - data:输入数据。
-  - itype:输入数据的类型。这里采用的是短整型表示法,与各种数据类型对应的值可以参见 [column_meta 中的列类型说明](/reference/rest-api/)。例如 4 用于表示 INT 型。
-  - iBytes:输入数据中每个值会占用的字节数。
-  - numOfRows:输入数据的总行数。
-  - ts:主键时间戳在输入中的列数据(只读)。
-  - dataOutput:输出数据的缓冲区,缓冲区大小为用户指定的输出类型大小 \* numOfRows。
-  - interBuf:中间计算结果的缓冲区,大小为用户在创建 UDF 时指定的 BUFSIZE 大小。通常用于计算中间结果与最终结果不一致时使用,由引擎负责分配与释放。
-  - tsOutput:主键时间戳在输出时的列数据,如果非空可用于输出结果对应的时间戳。
-  - numOfOutput:输出结果的个数(行数)。
-  - oType:输出数据的类型。取值含义与 itype 参数一致。
-  - oBytes:输出数据中每个值占用的字节数。
-  - buf:用于在 UDF 与引擎间的状态控制信息传递块。
+`udfNormalFunc` is a placeholder for the function name; a function implemented from this template performs scalar computation on data blocks. Its parameters are fixed and control the data exchange between the UDF and the query engine.

+- Definitions of the parameters:

+  - data: the input data
+  - itype: the type of the input data, expressed as a short integer; the values corresponding to each data type are listed in [type definition in column_meta](/reference/rest-api/), for example 4 represents INT
+  - iBytes: the number of bytes consumed by each value in the input data
+  - oType: the type of the output data, with the same meaning as `itype`
+  - oBytes: the number of bytes consumed by each value in the output data
+  - numOfRows: the total number of rows in the input data
+  - ts: the column of primary key timestamps corresponding to the input data (read only)
+  - dataOutput: the buffer for output data; its total size is `oBytes * numOfRows`
+  - interBuf: the buffer for intermediate results, whose size is the `BUFSIZE` specified when creating the UDF; it is normally used when the intermediate result differs from the final result, and is allocated and freed by the engine
+  - tsOutput: the column of primary key timestamps for the output; if not NULL, it can be used to output the timestamps corresponding to the results
+  - numOfOutput: the number of output values (rows)
+  - buf: a block used for state exchange between the UDF and the engine

- [add_one.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/add_one.c) 是结构最简单的 UDF 实现,也即上面定义的 udfNormalFunc 函数的一个具体实现。其功能为:对传入的一个数据列(可能因 WHERE 子句进行了筛选)中的每一项,都输出 +1 之后的值,并且要求输入的列数据类型为 INT。
+  [add_one.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/add_one.c) is the simplest UDF implementation, i.e. a concrete instance of the `udfNormalFunc` template above. For each value in the input column (which may have been filtered by a `where` clause) it outputs the value plus one; the input column must be of type INT.

-### 聚合函数
+### Aggregate Function

-用户可以按照如下函数模板定义自己的聚合函数。
+The function template below can be used to define your own aggregate function.

`void abs_max_merge(char* data, int32_t numOfRows, char* dataOutput, int32_t* numOfOutput, SUdfInit* buf)`

-其中 udfMergeFunc 是函数名的占位符,以上述模板实现的函数用于对计算中间结果进行聚合,只有针对超级表的聚合查询才需要调用该函数。其中各参数的具体含义是:
+`udfMergeFunc` is a placeholder for the function name (instantiated above as `abs_max_merge`); a function implemented from this template aggregates intermediate results and is only invoked in aggregate queries on STables.

- - data:udfNormalFunc 的输出数据数组,如果使用了 interBuf 那么 data 就是 interBuf 的数组。
- - numOfRows:data 中数据的行数。
- - dataOutput:输出数据的缓冲区,大小等于一条最终结果的大小。如果此时输出还不是最终结果,可以选择输出到 interBuf 中即 data 中。
- - numOfOutput:输出结果的个数(行数)。
- - buf:用于在 UDF 与引擎间的状态控制信息传递块。
+Definitions of the parameters:

+- data: the array of outputs from `udfNormalFunc`; if `interBuf` is used, `data` is an array of `interBuf`
+- numOfRows: the number of rows in `data`
+- dataOutput: the buffer for output data, whose size equals that of one final result; if the output is not yet final, it can be written to `interBuf`, i.e. `data`
+- numOfOutput: the number of output values (rows)
+- buf: a block used for state exchange between the UDF and the engine

-[abs_max.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/abs_max.c) 实现的是一个聚合函数,功能是对一组数据按绝对值取最大值。
+[abs_max.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/abs_max.c) is a user-defined aggregate function that gets the maximum of the absolute values of a column.

-其计算过程为:与所在查询语句相关的数据会被分为多个行数据块,对每个行数据块调用 udfNormalFunc(在本例的实现代码中,实际函数名是 `abs_max`)来生成每个子表的中间结果,再将子表的中间结果调用 udfMergeFunc(本例中,其实际的函数名是 `abs_max_merge`)进行聚合,生成超级表的最终聚合结果或中间结果。聚合查询最后还会通过 udfFinalizeFunc(本例中,其实际的函数名是 `abs_max_finalize`)再把超级表的中间结果处理为最终结果,最终结果只能含 0 或 1 条结果数据。
+The internal processing works like this: the data affected by the `select` statement is divided into multiple row blocks, and `udfNormalFunc` (`abs_max` in this example) is invoked on each row block to generate the intermediate result of each subtable; then `udfMergeFunc` (`abs_max_merge` in this example) is invoked on the subtables' intermediate results to generate the final or intermediate result of the STable. Finally, `udfFinalizeFunc` (`abs_max_finalize` in this example) processes the intermediate result of the STable into the final result, which contains either 0 or 1 row.

-其他典型场景,如协方差的计算,也可通过定义聚合 UDF 的方式实现。
+Other typical scenarios, such as computing covariance, can also be implemented with aggregate UDFs.

-### 最终计算
+### Finalize

-用户可以按下面的函数模板实现自己的函数对计算结果进行最终计算,通常用于有 interBuf 使用的场景。
+The function template below can be used to finalize the result of your own UDF; it is normally needed when `interBuf` is used.

`void abs_max_finalize(char* dataOutput, char* interBuf, int* numOfOutput, SUdfInit* buf)`

-其中 udfFinalizeFunc 是函数名的占位符 ,其中各参数的具体含义是:
- - dataOutput:输出数据的缓冲区。
- - interBuf:中间结算结果缓冲区,可作为输入。
- - numOfOutput:输出数据的个数,对聚合函数来说只能是 0 或者 1。
- - buf:用于在 UDF 与引擎间的状态控制信息传递块。
+`udfFinalizeFunc` is a placeholder for the function name (instantiated above as `abs_max_finalize`); the parameters are defined as below:

+- dataOutput: the buffer for output data
+- interBuf: the buffer for intermediate results, which can serve as input to the final computation
+- numOfOutput: the number of output values, which can only be 0 or 1 for an aggregate function
+- buf: a block used for state exchange between the UDF and the engine

-## UDF 实现方式的规则总结
+## UDF Conventions

-三类 UDF 函数: udfNormalFunc、udfMergeFunc、udfFinalizeFunc ,其函数名约定使用相同的前缀,此前缀即 udfNormalFunc 的实际函数名,也即 udfNormalFunc 函数不需要在实际函数名后添加后缀;而udfMergeFunc 的函数名要加上后缀 `_merge`、udfFinalizeFunc 的函数名要加上后缀 `_finalize`,这是 UDF 实现规则的一部分,系统会按照这些函数名后缀来调用相应功能。
+The three kinds of UDF functions, i.e. udfNormalFunc, udfMergeFunc, and udfFinalizeFunc, must share the same name prefix, which is the actual name of udfNormalFunc; that is, udfNormalFunc needs no suffix, while udfMergeFunc is the prefix followed by `_merge` and udfFinalizeFunc is the prefix followed by `_finalize`. This naming convention is part of the UDF framework; TDengine relies on these suffixes to invoke the corresponding functions.

-根据 UDF 函数类型的不同,用户所要实现的功能函数也不同:
+The functions to implement depend on the kind of UDF:

-- 标量函数:UDF 中需实现 udfNormalFunc。
-- 聚合函数:UDF 中需实现 udfNormalFunc、udfMergeFunc(对超级表查询)、udfFinalizeFunc。
+- Scalar function: udfNormalFunc is required.
+- Aggregate function: udfNormalFunc, udfMergeFunc (for queries on STables), and udfFinalizeFunc are required.

-:::note
-如果对应的函数不需要具体的功能,也需要实现一个空函数。
-:::

+To be more accurate, suppose we want to implement a UDF named "foo": for a scalar function, only `foo` needs to be implemented; for an aggregate function, `foo`, `foo_merge`, and `foo_finalize` are all needed. Even if one of the three functions is not actually necessary, it must still be implemented, as an empty function.

-## 编译 UDF
+## Compile UDF

-用户定义函数的 C 语言源代码无法直接被 TDengine 系统使用,而是需要先编译为 动态链接库,之后才能载入 TDengine 系统。
+The C source code of a UDF can't be utilized by TDengine directly; it must first be compiled into a dynamically linked library, which can then be loaded into TDengine.

-例如,按照上一章节描述的规则准备好了用户定义函数的源代码 add_one.c,以 Linux 为例可以执行如下指令编译得到动态链接库文件:
+For example, the UDF `add_one.c` mentioned in the previous sections can be compiled into a DLL with the command below in a Linux shell.

```bash
gcc -g -O0 -fPIC -shared add_one.c -o add_one.so
```

-这样就准备好了动态链接库 add_one.so 文件,可以供后文创建 UDF 时使用了。为了保证可靠的系统运行,编译器 GCC 推荐使用 7.5 及以上版本。
+The generated DLL file `add_one.so` can be used later when creating the UDF. It's recommended to use GCC 7.5 or later.

-## 在系统中管理和使用 UDF
+## Create and Use UDF

-### 创建 UDF
+### Create UDF

-用户可以通过 SQL 指令在系统中加载客户端所在主机上的 UDF 函数库(不能通过 RESTful 接口或 HTTP 管理界面来进行这一过程)。一旦创建成功,则当前 TDengine 集群的所有用户都可以在 SQL 指令中使用这些函数。UDF 存储在系统的 MNode 节点上,因此即使重启 TDengine 系统,已经创建的 UDF 也仍然可用。
+A SQL command can be executed on the host where the generated UDF DLL resides to load the UDF into TDengine; this operation cannot be done through the REST interface or web console. Once created, all users of the current TDengine cluster can use these functions in their SQL statements. UDFs are stored on the MNode of TDengine, so UDFs that have been created remain available after TDengine restarts.

-在创建 UDF 时,需要区分标量函数和聚合函数。如果创建时声明了错误的函数类别,则可能导致通过 SQL 指令调用函数时出错。此外, UDF 支持输入与输出类型不一致,用户需要保证输入数据类型与 UDF 程序匹配,UDF 输出数据类型与 OUTPUTTYPE 匹配。
+When creating a UDF, it must be declared as either a scalar function or an aggregate function; declaring the wrong category may cause errors when the function is invoked in SQL. In addition, the input and output types of a UDF don't have to be the same, but the input data type must match the UDF implementation and the output data type must match the `OUTPUTTYPE` in the definition.

-- 创建标量函数
+- Create Scalar Function

```sql
CREATE FUNCTION ids(X) AS ids(Y) OUTPUTTYPE typename(Z) [ BUFSIZE B ];
```

- - ids(X):标量函数未来在 SQL 指令中被调用时的函数名,必须与函数实现中 udfNormalFunc 的实际名称一致;
- - ids(Y):包含 UDF 函数实现的动态链接库的库文件绝对路径(指的是库文件在当前客户端所在主机上的保存路径,通常是指向一个 .so 文件),这个路径需要用英文单引号或英文双引号括起来;
- - typename(Z):此函数计算结果的数据类型,与上文中 udfNormalFunc 的 itype 参数不同,这里不是使用数字表示法,而是直接写类型名称即可;
- - B:中间计算结果的缓冲区大小,单位是字节,最小 0,最大 512,如果不使用可以不设置。
+- ids(X): the name by which the function will be invoked in SQL statements; it must match the actual name of `udfNormalFunc` in the implementation
+- ids(Y): the absolute path (on the current client host) of the DLL file containing the UDF implementation, usually a .so file; the path must be quoted with single or double quotes
+- typename(Z): the data type of the function's result; unlike the `itype` parameter of `udfNormalFunc`, it is written as the literal type name rather than a numeric type code
+- B: the size of the intermediate result buffer in bytes, from 0 to 512; it can be omitted if not used

- 例如,如下语句可以把 add_one.so 创建为系统中可用的 UDF:
+For example, the SQL statement below creates a UDF from `add_one.so`.

```sql
CREATE FUNCTION add_one AS "/home/taos/udf_example/add_one.so" OUTPUTTYPE INT;
```

-- 创建聚合函数:
+- Create Aggregate Function

```sql
CREATE AGGREGATE FUNCTION ids(X) AS ids(Y) OUTPUTTYPE typename(Z) [ BUFSIZE B ];
```

- - ids(X):聚合函数未来在 SQL 指令中被调用时的函数名,必须与函数实现中 udfNormalFunc 的实际名称一致;
- - ids(Y):包含 UDF 函数实现的动态链接库的库文件绝对路径(指的是库文件在当前客户端所在主机上的保存路径,通常是指向一个 .so 文件),这个路径需要用英文单引号或英文双引号括起来;
- - typename(Z):此函数计算结果的数据类型,与上文中 udfNormalFunc 的 itype 参数不同,这里不是使用数字表示法,而是直接写类型名称即可;
- - B:中间计算结果的缓冲区大小,单位是字节,最小 0,最大 512,如果不使用可以不设置。
+- ids(X): the name by which the function will be invoked in SQL statements; it must match the actual name of `udfNormalFunc` in the implementation
+- ids(Y): the absolute path (on the current client host) of the DLL file containing the UDF implementation, usually a .so file; the path must be quoted with single or double quotes
+- typename(Z): the data type of the function's result; unlike the `itype` parameter of `udfNormalFunc`, it is written as the literal type name rather than a numeric type code
+- B: the size of the intermediate result buffer in bytes, from 0 to 512; it can be omitted if not used

- 关于中间计算结果的使用,可以参考示例程序[demo.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/demo.c)
+For details about how to use the intermediate result, please refer to the example program [demo.c](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/demo.c).

- 例如,如下语句可以把 demo.so 创建为系统中可用的 UDF:
+For example, the SQL statement below creates a UDF from `demo.so`.

```sql
CREATE AGGREGATE FUNCTION demo AS "/home/taos/udf_example/demo.so" OUTPUTTYPE DOUBLE bufsize 14;
```

-### 管理 UDF
+### Manage UDF

-- 删除指定名称的用户定义函数:
+- Delete UDF

```
DROP FUNCTION ids(X);
```

-- ids(X):此参数的含义与 CREATE 指令中的 ids(X) 参数一致,也即要删除的函数的名字,例如
+- ids(X): same as in the `CREATE FUNCTION` statement, i.e. the name of the function to be deleted, for example:

```sql
DROP FUNCTION add_one;
```

-- 显示系统中当前可用的所有 UDF:
+- Show Available UDFs

```sql
SHOW FUNCTIONS;
```

-### 调用 UDF
+### Use UDF

-在 SQL 指令中,可以直接以在系统中创建 UDF 时赋予的函数名来调用用户定义函数。例如:
+The function name given when creating a UDF can be used directly in SQL statements, just like builtin functions. For example:

```sql
SELECT X(c) FROM table/stable;
```

-表示对名为 c 的数据列调用名为 X 的用户定义函数。SQL 指令中用户定义函数可以配合 WHERE 等查询特性来使用。
+The SQL statement above invokes the UDF named X on column c. UDFs can be combined with query features such as `where` clauses.

-## UDF 的一些使用限制
+## Restrictions for UDF

-在当前版本下,使用 UDF 存在如下这些限制:
+In the current version there are some restrictions on UDF:

-1. 在创建和调用 UDF 时,服务端和客户端都只支持 Linux 操作系统;
-2. UDF 不能与系统内建的 SQL 函数混合使用,暂不支持在一条 SQL 语句中使用多个不同名的 UDF ;
-3. UDF 只支持以单个数据列作为输入;
-4. UDF 只要创建成功,就会被持久化存储到 MNode 节点中;
-5. 无法通过 RESTful 接口来创建 UDF;
-6. UDF 在 SQL 中定义的函数名,必须与 .so 库文件实现中的接口函数名前缀保持一致,也即必须是 udfNormalFunc 的名称,而且不可与 TDengine 中已有的内建 SQL 函数重名。
+1. Both the client side and the server side must run Linux to create and invoke UDFs
+2. UDFs can't be mixed with builtin functions in the same statement
+3. Only one UDF can be used in a single SQL statement
+4. A UDF supports only a single column as input
+5. Once created successfully, a UDF is persisted in the MNode of TDengine
+6. UDFs can't be created through the REST interface
+7. The function name used when creating a UDF in SQL must be consistent with the function name defined in the DLL, i.e. the name of `udfNormalFunc`
+8. The name of a UDF must not conflict with any builtin function

-## 示例代码
+## Examples

-### 标量函数示例 [add_one](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/add_one.c)
+### Scalar function example [add_one](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/add_one.c)
add_one.c @@ -185,7 +195,7 @@ SELECT X(c) FROM table/stable;
-### 向量函数示例 [abs_max](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/abs_max.c) +### Aggregate function example [abs_max](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/abs_max.c)
abs_max.c @@ -196,7 +206,7 @@ SELECT X(c) FROM table/stable;
-### 使用中间计算结果示例 [demo](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/demo.c)
+### Example of using intermediate results [demo](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/demo.c)
demo.c -- GitLab