未验证 提交 598e1a8b 编写于 作者: W wade zhang 提交者: GitHub

Merge pull request #21865 from taosdata/fix/sunpeng/version-history-for-python-connector

docs: add version history for python connector
......@@ -33,7 +33,7 @@ The below SQL statement is used to insert one row into table "d1001".
INSERT INTO d1001 VALUES (ts1, 10.3, 219, 0.31);
```
`ts1` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
### Insert Multiple Rows
......@@ -43,7 +43,7 @@ Multiple rows can be inserted in a single SQL statement. The example below inser
INSERT INTO d1001 VALUES (ts2, 10.2, 220, 0.23) (ts2, 10.3, 218, 0.25);
```
`ts1` and `ts2` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1` and `ts2` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
### Insert into Multiple Tables
......@@ -53,7 +53,7 @@ Data can be inserted into multiple tables in the same SQL statement. The example
INSERT INTO d1001 VALUES (ts1, 10.3, 219, 0.31) (ts2, 12.6, 218, 0.33) d1002 VALUES (ts3, 12.3, 221, 0.31);
```
`ts1`, `ts2` and `ts3` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1`, `ts2` and `ts3` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
For more details about `INSERT` please refer to [INSERT](/taos-sql/insert).
......
......@@ -17,7 +17,7 @@ When you create a user-defined function, you must implement standard interface f
- For aggregate functions, implement the `aggfn_start`, `aggfn`, and `aggfn_finish` interface functions.
- To initialize your function, implement the `udf_init` function. To terminate your function, implement the `udf_destroy` function.
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be `_start`, `_finish`, `_init`, and `_destroy`, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
### Implementing a Scalar Function in C
The implementation of a scalar function is described as follows:
......@@ -318,7 +318,7 @@ The implementation of a scalar UDF is described as follows:
def process(input: datablock) -> tuple[output_type]:
```
Description: this function prcesses datablock, which is the input; you can use datablock.data(row, col) to access the python object at location(row,col); the output is a tuple object consisted of objects of type outputtype
Description: this function processes datablock, which is the input; you can use datablock.data(row, col) to access the python object at location(row,col); the output is a tuple object consisted of objects of type outputtype
#### Aggregate UDF Interface
......@@ -356,7 +356,7 @@ def process(input: datablock) -> tuple[output_type]:
# return tuple object consisted of object of type outputtype
```
Note:process() must be implemeted, init() and destroy() must be defined too but they can do nothing.
Note:process() must be implemented, init() and destroy() must be defined too but they can do nothing.
#### Aggregate Template
......@@ -377,7 +377,7 @@ def finish(buf: bytes) -> output_type:
#return obj of type outputtype
```
Note: aggregate UDF requires init(), destroy(), start(), reduce() and finish() to be impemented. start() generates the initial result in buffer, then the input data is divided into multiple row data blocks, reduce() is invoked for each data block `inputs` and intermediate `buf`, finally finish() is invoked to generate final result from the intermediate result `buf`.
Note: aggregate UDF requires init(), destroy(), start(), reduce() and finish() to be implemented. start() generates the initial result in buffer, then the input data is divided into multiple row data blocks, reduce() is invoked for each data block `inputs` and intermediate `buf`, finally finish() is invoked to generate final result from the intermediate result `buf`.
### Data Mapping between TDengine SQL and Python UDF
......@@ -559,7 +559,7 @@ Note: Prior to TDengine 3.0.5.0 (excluding), updating a UDF requires to restart
#### Sample 3: UDF with n arguments
A UDF which accepts n intergers, likee (x1, x2, ..., xn) and output the sum of the product of each value and its sequence number: 1 * x1 + 2 * x2 + ... + n * xn. If there is `null` in the input, then the result is `null`. The difference from sample 1 is that it can accept any number of columns as input and process each column. Assume the program is written in /root/udf/nsum.py:
A UDF which accepts n integers, likee (x1, x2, ..., xn) and output the sum of the product of each value and its sequence number: 1 * x1 + 2 * x2 + ... + n * xn. If there is `null` in the input, then the result is `null`. The difference from sample 1 is that it can accept any number of columns as input and process each column. Assume the program is written in /root/udf/nsum.py:
```python
def init():
......@@ -607,7 +607,7 @@ Query OK, 4 row(s) in set (0.010653s)
#### Sample 4: Utilize 3rd party package
A UDF which accepts a timestamp and output the next closed Sunday. This sample requires to use third party package `moment`, you need to install it firslty.
A UDF which accepts a timestamp and output the next closed Sunday. This sample requires to use third party package `moment`, you need to install it firstly.
```shell
pip3 install moment
......@@ -701,7 +701,7 @@ Query OK, 4 row(s) in set (1.011474s)
#### Sample 5: Aggregate Function
An aggregate function which calculates the difference of the maximum and the minimum in a column. An aggregate funnction takes multiple rows as input and output only one data. The execution process of an aggregate UDF is like map-reduce, the framework divides the input into multiple parts, each mapper processes one block and the reducer aggregates the result of the mappers. The reduce() of Python UDF has the functionality of both map() and reduce(). The reduce() takes two arguments: the data to be processed; and the result of other tasks executing reduce(). For exmaple, assume the code is in `/root/udf/myspread.py`.
An aggregate function which calculates the difference of the maximum and the minimum in a column. An aggregate funnction takes multiple rows as input and output only one data. The execution process of an aggregate UDF is like map-reduce, the framework divides the input into multiple parts, each mapper processes one block and the reducer aggregates the result of the mappers. The reduce() of Python UDF has the functionality of both map() and reduce(). The reduce() takes two arguments: the data to be processed; and the result of other tasks executing reduce(). For example, assume the code is in `/root/udf/myspread.py`.
```python
import io
......@@ -755,7 +755,7 @@ In this example, we implemented an aggregate function, and added some logging.
2. log() is the function for logging, it converts the input object to string and output with an end of line
3. destroy() closes the log file \
4. start() returns the initial buffer for storing the intermediate result
5. reduce() processes each daa block and aggregates the result
5. reduce() processes each data block and aggregates the result
6. finish() converts the final buffer() to final result\
Create the UDF.
......
......@@ -672,7 +672,7 @@ If you input a specific column, the number of non-null values in the column is r
ELAPSED(ts_primary_key [, time_unit])
```
**Description**: `elapsed` function can be used to calculate the continuous time length in which there is valid data. If it's used with `INTERVAL` clause, the returned result is the calculated time length within each time window. If it's used without `INTERVAL` caluse, the returned result is the calculated time length within the specified time range. Please be noted that the return value of `elapsed` is the number of `time_unit` in the calculated time length.
**Description**: `elapsed` function can be used to calculate the continuous time length in which there is valid data. If it's used with `INTERVAL` clause, the returned result is the calculated time length within each time window. If it's used without `INTERVAL` clause, the returned result is the calculated time length within the specified time range. Please be noted that the return value of `elapsed` is the number of `time_unit` in the calculated time length.
**Return value type**: Double if the input value is not NULL;
......
......@@ -21,7 +21,7 @@ part_list can be any scalar expression, such as a column, constant, scalar funct
A PARTITION BY clause is processed as follows:
- The PARTITION BY clause must occur after the WHERE clause
- The PARTITION BY caluse partitions the data according to the specified dimensions, then perform computation on each partition. The performed computation is determined by the rest of the statement - a window clause, GROUP BY clause, or SELECT clause.
- The PARTITION BY clause partitions the data according to the specified dimensions, then perform computation on each partition. The performed computation is determined by the rest of the statement - a window clause, GROUP BY clause, or SELECT clause.
- The PARTITION BY clause can be used together with a window clause or GROUP BY clause. In this case, the window or GROUP BY clause takes effect on every partition. For example, the following statement partitions the table by the location tag, performs downsampling over a 10 minute window, and returns the maximum value:
```sql
......
......@@ -24,6 +24,16 @@ The source code for the Python connector is hosted on [GitHub](https://github.co
We recommend using the latest version of `taospy`, regardless of the version of TDengine.
|Python Connector Version|major changes|
|:-------------------:|:----:|
|2.7.9|support for getting assignment and seek function on subscription|
|2.7.8|add `execute_many` method|
|Python Websocket Connector Version|major changes|
|:----------------------------:|:-----:|
|0.2.5|1. support for getting assignment and seek function on subscription <br/> 2. support schemaless <br/> 3. support STMT|
|0.2.4|support `unsubscribe` on subscription|
## Handling Exceptions
There are 4 types of exception in python connector.
......
......@@ -17,7 +17,7 @@ TDengine 支持通过 C/Python 语言进行 UDF 定义。接下来结合示例
- 聚合函数需要实现聚合接口函数 aggfn_start , aggfn , aggfn_finish。
- 如果需要初始化,实现 udf_init;如果需要清理工作,实现udf_destroy。
接口函数的名称是 UDF 名称,或者是 UDF 名称和特定后缀(_start, _finish, _init, _destroy)的连接。列表中的scalarfn,aggfn, udf需要替换成udf函数名。
接口函数的名称是 UDF 名称,或者是 UDF 名称和特定后缀(`_start`, `_finish`, `_init`, `_destroy`)的连接。列表中的scalarfn,aggfn, udf需要替换成udf函数名。
### 用 C 语言实现标量函数
标量函数实现模板如下
......
......@@ -25,6 +25,16 @@ Python 连接器的源码托管在 [GitHub](https://github.com/taosdata/taos-con
无论使用什么版本的 TDengine 都建议使用最新版本的 `taospy`。
|Python Connector 版本|主要变化|
|:-------------------:|:----:|
|2.7.9|数据订阅支持获取消费进度和重置消费进度|
|2.7.8|新增 `execute_many`|
|Python Websocket Connector 版本|主要变化|
|:----------------------------:|:-----:|
|0.2.5|1. 数据订阅支持获取消费进度和重置消费进度 <br/> 2. 支持 schemaless <br/> 3. 支持 STMT|
|0.2.4|数据订阅新增取消订阅方法|
## 处理异常
Python 连接器可能会产生 4 种异常:
......@@ -549,7 +559,7 @@ consumer = Consumer({"group.id": "local", "td.connect.ip": "127.0.0.1"})
#### 订阅 topics
Comsumer API 的 `subscribe` 方法用于订阅 topics,consumer 支持同时订阅多个 topic。
Consumer API 的 `subscribe` 方法用于订阅 topics,consumer 支持同时订阅多个 topic。
```python
consumer.subscribe(['topic1', 'topic2'])
......@@ -631,7 +641,7 @@ consumer = taosws.(conf={"group.id": "local", "td.connect.websocket.scheme": "ws
#### 订阅 topics
Comsumer API 的 `subscribe` 方法用于订阅 topics,consumer 支持同时订阅多个 topic。
Consumer API 的 `subscribe` 方法用于订阅 topics,consumer 支持同时订阅多个 topic。
```python
consumer.subscribe(['topic1', 'topic2'])
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册