@@ -10,7 +10,7 @@ User-defined functions can be scalar functions or aggregate functions. Scalar fu
...
@@ -10,7 +10,7 @@ User-defined functions can be scalar functions or aggregate functions. Scalar fu
TDengine supports user-defined functions written in C or Python. This document describes the usage of user-defined functions.
TDengine supports user-defined functions written in C or Python. This document describes the usage of user-defined functions.
# Implement a UDF in C
## Implement a UDF in C
When you create a user-defined function, you must implement standard interface functions:
When you create a user-defined function, you must implement standard interface functions:
- For scalar functions, implement the `scalarfn` interface function.
- For scalar functions, implement the `scalarfn` interface function.
...
@@ -19,7 +19,7 @@ When you create a user-defined function, you must implement standard interface f
...
@@ -19,7 +19,7 @@ When you create a user-defined function, you must implement standard interface f
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
## Implementing a Scalar Function in C
### Implementing a Scalar Function in C
The implementation of a scalar function is described as follows:
The implementation of a scalar function is described as follows:
```c
```c
#include "taos.h"
#include "taos.h"
...
@@ -102,7 +102,7 @@ int32_t aggfn_destroy() {
...
@@ -102,7 +102,7 @@ int32_t aggfn_destroy() {
```
```
Replace `aggfn` with the name of your function.
Replace `aggfn` with the name of your function.
## C UDF Interface Functions
### UDF Interface Definition in C
There are strict naming conventions for interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
There are strict naming conventions for interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
...
@@ -110,8 +110,7 @@ Interface functions return a value that indicates whether the operation was succ
...
@@ -110,8 +110,7 @@ Interface functions return a value that indicates whether the operation was succ
For information about the parameters for interface functions, see Data Model.
For information about the parameters for interface functions, see Data Model.
Replace `scalarfn` with the name of your function. This function performs scalar calculations on data blocks. You set the output value through the `resultColumn` structure parameter.
Replace `scalarfn` with the name of your function. This function performs scalar calculations on data blocks. You set the output value through the `resultColumn` structure parameter.
...
@@ -120,7 +119,7 @@ The parameters in the function are defined as follows:
...
@@ -120,7 +119,7 @@ The parameters in the function are defined as follows:
- inputDataBlock: The data block to input.
- inputDataBlock: The data block to input.
- resultColumn: The column to output.
- resultColumn: The column to output.
### Interfaces for C UDF Aggregate Functions
#### Aggregate Interface
`int32_t aggfn_start(SUdfInterBuf *interBuf)`
`int32_t aggfn_start(SUdfInterBuf *interBuf)`
...
@@ -137,7 +136,7 @@ The parameters in the function are defined as follows:
...
@@ -137,7 +136,7 @@ The parameters in the function are defined as follows:
- result: The final result.
- result: The final result.
### C UDF Initializing and Terminating User-Defined Functions
#### Initialization and Cleanup Interface
`int32_t udf_init()`
`int32_t udf_init()`
`int32_t udf_destroy()`
`int32_t udf_destroy()`
...
@@ -145,7 +144,7 @@ The parameters in the function are defined as follows:
...
@@ -145,7 +144,7 @@ The parameters in the function are defined as follows:
Replace `udf` with the name of your function. `udf_init` initializes the function and `udf_destroy` terminates it. Both are optional: omit `udf_init` if your function needs no initialization, and omit `udf_destroy` if it needs no cleanup.
Replace `udf` with the name of your function. `udf_init` initializes the function and `udf_destroy` terminates it. Both are optional: omit `udf_init` if your function needs no initialization, and omit `udf_destroy` if it needs no cleanup.
## Data Structure of C User-Defined Functions
### Data Structures for UDF in C
```c
```c
typedef struct SUdfColumnMeta {
typedef struct SUdfColumnMeta {
  int16_t type;
  int16_t type;
...
@@ -203,7 +202,7 @@ The data structure is described as follows:
...
@@ -203,7 +202,7 @@ The data structure is described as follows:
Additional functions are defined in `taosudf.h` to make it easier to work with these structures.
Additional functions are defined in `taosudf.h` to make it easier to work with these structures.
## Compile C UDF
### Compiling C UDF
To use your user-defined function in TDengine, first compile it into a shared library.
To use your user-defined function in TDengine, first compile it into a shared library.
The bit_and function implements bitwise AND for multiple columns. If there is only one column, the column is returned. The bit_and function ignores null values.
The bit_and function implements bitwise AND for multiple columns. If there is only one column, the column is returned. The bit_and function ignores null values.
...
@@ -230,7 +229,7 @@ The bit_and function implements bitwise addition for multiple columns. If there
...
@@ -230,7 +229,7 @@ The bit_and function implements bitwise addition for multiple columns. If there
</details>
</details>
### C UDF Sample aggregate function 1: [l2norm](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/l2norm.c)
#### Aggregate function 1: [l2norm](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/l2norm.c)
The l2norm function finds the L2 norm (Euclidean norm) of all data in the input column. It squares each value, sums the squares, and takes the square root of the sum.
The l2norm function finds the L2 norm (Euclidean norm) of all data in the input column. It squares each value, sums the squares, and takes the square root of the sum.
...
@@ -243,7 +242,7 @@ The l2norm function finds the second-order norm for all data in the input column
...
@@ -243,7 +242,7 @@ The l2norm function finds the second-order norm for all data in the input column
</details>
</details>
### C UDF Sample aggregate function 2: [max_vol](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/max_vol.c)
#### Aggregate function 2: [max_vol](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/max_vol.c)
Given several voltage columns as input, the max_vol function returns a string that concatenates the deviceId column, the row number and column number of the maximum voltage, and the maximum voltage itself.
Given several voltage columns as input, the max_vol function returns a string that concatenates the deviceId column, the row number and column number of the maximum voltage, and the maximum voltage itself.
...
@@ -269,14 +268,14 @@ select max_vol(vol1,vol2,vol3,deviceid) from battery;
...
@@ -269,14 +268,14 @@ select max_vol(vol1,vol2,vol3,deviceid) from battery;
</details>
</details>
#Implement a UDF in Python
## Implement a UDF in Python
To implement a UDF in Python, implement the following interface functions:
To implement a UDF in Python, implement the following interface functions:
- Implement the `process` function for a scalar UDF.
- Implement the `process` function for a scalar UDF.
- Implement the `start`, `reduce`, and `finish` functions for an aggregate UDF.
- Implement the `start`, `reduce`, and `finish` functions for an aggregate UDF.
- Implement `init` for initialization and `destroy` for termination.
- Implement `init` for initialization and `destroy` for termination.
## Implement a Scalar UDF in Python
### Implement a Scalar UDF in Python
The implementation of a scalar UDF is described as follows:
The implementation of a scalar UDF is described as follows:
- `input` is a data block, a two-dimensional matrix-like object whose `data(row, col)` method returns the Python object at position (`row`, `col`).
- `input` is a data block, a two-dimensional matrix-like object whose `data(row, col)` method returns the Python object at position (`row`, `col`).
- The function returns a Python tuple in which each item is a Python object of type `output_type`.
- The function returns a Python tuple in which each item is a Python object of type `output_type`.
- Finally, the `finish` function is called on the intermediate result `buf` and outputs zero or one value of type `output_type`.
- Finally, the `finish` function is called on the intermediate result `buf` and outputs zero or one value of type `output_type`.
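The aggregate lifecycle described above can be sketched as follows. This is a minimal illustration, not the reference implementation: `FakeBlock` is a hypothetical stand-in for the data block that TDengine passes in, the `shape()` method and the `reduce(inputs, buf)` signature are assumptions not stated in this section, and using `pickle` to serialize the intermediate result and returning `None` when there is no value are illustrative choices.

```python
import pickle


class FakeBlock:
    """Hypothetical stand-in for the data block passed to a Python UDF."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        # Assumed helper: returns (row_count, column_count).
        row_count = len(self._rows)
        col_count = len(self._rows[0]) if row_count else 0
        return row_count, col_count

    def data(self, row, col):
        # Returns the Python object at (row, col); SQL NULL is None.
        return self._rows[row][col]


def start():
    # Initial intermediate result: no maximum seen yet.
    return pickle.dumps(None)


def reduce(inputs, buf):
    # Fold one data block into the intermediate result.
    current = pickle.loads(buf)
    rows, cols = inputs.shape()
    for i in range(rows):
        for j in range(cols):
            value = inputs.data(i, j)
            if value is None:
                continue
            if current is None or value > current:
                current = value
    return pickle.dumps(current)


def finish(buf):
    # Produce the final value, or None if every input was NULL.
    return pickle.loads(buf)
```

In this sketch `start` runs once, `reduce` runs once per data block, and `finish` converts the accumulated state into the single output value.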
### Python UDF Initialization and Termination
#### Initialization and Cleanup Interface
```Python
```Python
def init()
def init()
def destroy()
def destroy()
```
```
Implement `init` for initialization and `destroy` for termination.
Implement `init` for initialization and `destroy` for termination.
## TDengine SQL data type and Python UDF Data Type Mapping Table
### Data Mapping between TDengine SQL and Python UDF
The following table describes the mapping between TDengine SQL data types and Python UDF data types. A `NULL` value of any TDengine SQL type is mapped to `None` in Python.
The following table describes the mapping between TDengine SQL data types and Python UDF data types. A `NULL` value of any TDengine SQL type is mapped to `None` in Python.
...
@@ -354,7 +353,7 @@ The following table describes the mapping between TDengine SQL data type and Pyt
...
@@ -354,7 +353,7 @@ The following table describes the mapping between TDengine SQL data type and Pyt
|TIMESTAMP | int |
|TIMESTAMP | int |
|JSON and other types | Not Supported |
|JSON and other types | Not Supported |
## Python UDF Installation
### Installing Python UDF
1. Install the Python package `taospyudf`, which executes Python UDFs.
1. Install the Python package `taospyudf`, which executes Python UDFs.
```bash
```bash
sudo pip install taospyudf
sudo pip install taospyudf
...
@@ -362,8 +361,8 @@ ldconfig
...
@@ -362,8 +361,8 @@ ldconfig
```
```
2. If the Python UDF needs PYTHONPATH to find Python packages when it executes, add the PYTHONPATH contents to the `udfdLdLibPath` variable in the `taos.cfg` configuration file.
2. If the Python UDF needs PYTHONPATH to find Python packages when it executes, add the PYTHONPATH contents to the `udfdLdLibPath` variable in the `taos.cfg` configuration file.
## Python UDF Sample Code
### Python UDF Sample Code
### Python UDF Scalar Function Sample Code [pybitand](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/pybitand.py)
#### Scalar Function [pybitand](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/pybitand.py)
The `pybitand` function implements bitwise AND for multiple columns. If there is only one column, the column is returned. The `pybitand` function ignores null values.
The `pybitand` function implements bitwise AND for multiple columns. If there is only one column, the column is returned. The `pybitand` function ignores null values.
...
@@ -376,7 +375,7 @@ The `pybitand` function implements bitwise addition for multiple columns. If the
...
@@ -376,7 +375,7 @@ The `pybitand` function implements bitwise addition for multiple columns. If the
</details>
</details>
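As a rough illustration of the behavior described for `pybitand`, the sketch below ANDs the non-null values in each row. It follows the prose in this section rather than the linked source file, so details may differ from it. `FakeBlock` is a hypothetical stand-in for the real data block, and the `shape()` method is an assumption.

```python
class FakeBlock:
    """Hypothetical stand-in for the data block passed to a Python UDF."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        # Assumed helper: returns (row_count, column_count).
        row_count = len(self._rows)
        col_count = len(self._rows[0]) if row_count else 0
        return row_count, col_count

    def data(self, row, col):
        # Returns the Python object at (row, col); SQL NULL is None.
        return self._rows[row][col]


def process(block):
    # Scalar UDF: one output value per input row.
    rows, cols = block.shape()
    results = []
    for i in range(rows):
        acc = None
        for j in range(cols):
            value = block.data(i, j)
            if value is None:
                continue  # ignore null values, as the description states
            acc = value if acc is None else acc & value
        results.append(acc)  # stays None if the whole row was NULL
    return tuple(results)
```

With a single input column, each row's value passes through unchanged, which matches the one-column case described above.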
### Python UDF Aggregate Function Sample Code [pyl2norm](https://github.com/taosdata/TDengine/blob/develop/tests/script/sh/pyl2norm.py)
#### Aggregate Function [pyl2norm](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/pyl2norm.py)
The `pyl2norm` function finds the L2 norm (Euclidean norm) of all data in the input column. It squares each value, sums the squares, and takes the square root of the sum.
The `pyl2norm` function finds the L2 norm (Euclidean norm) of all data in the input column. It squares each value, sums the squares, and takes the square root of the sum.
<details>
<details>
...
@@ -389,4 +388,4 @@ The `pyl2norm` function finds the second-order norm for all data in the input co
...
@@ -389,4 +388,4 @@ The `pyl2norm` function finds the second-order norm for all data in the input co
</details>
</details>
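The computation described for `pyl2norm` (square, sum, square root) could look roughly like the sketch below. It is written from the prose description, not copied from the linked file. `FakeBlock` is a hypothetical stand-in for the data block, the `shape()` method and the `reduce(inputs, buf)` signature are assumptions, and skipping `None` values and using `pickle` for the intermediate buffer are illustrative choices.

```python
import math
import pickle


class FakeBlock:
    """Hypothetical stand-in for the data block passed to a Python UDF."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        # Assumed helper: returns (row_count, column_count).
        row_count = len(self._rows)
        col_count = len(self._rows[0]) if row_count else 0
        return row_count, col_count

    def data(self, row, col):
        return self._rows[row][col]


def start():
    # Intermediate result: running sum of squares, initially zero.
    return pickle.dumps(0.0)


def reduce(inputs, buf):
    # Add the squares of this block's non-null values to the running sum.
    sum_squares = pickle.loads(buf)
    rows, cols = inputs.shape()
    for i in range(rows):
        for j in range(cols):
            value = inputs.data(i, j)
            if value is not None:
                sum_squares += value * value
    return pickle.dumps(sum_squares)


def finish(buf):
    # The L2 norm is the square root of the accumulated sum of squares.
    return math.sqrt(pickle.loads(buf))
```

Only the sum of squares is kept between calls, so the intermediate buffer stays a fixed size however many rows are processed.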
## Manage and Use User-Defined Functions
## Manage and Use User-Defined Functions
You can add UDF to TDengine before using it in SQL queries. For more information, see [User-Defined Functions](../12-taos-sql/26-udf.md).
You must add a UDF to TDengine before using it in SQL queries. For more information about managing and invoking UDFs, see [Manage and Use UDF](../12-taos-sql/26-udf.md).