@@ -33,7 +33,7 @@ The below SQL statement is used to insert one row into table "d1001".
...
@@ -33,7 +33,7 @@ The below SQL statement is used to insert one row into table "d1001".
INSERT INTO d1001 VALUES (ts1, 10.3, 219, 0.31);
INSERT INTO d1001 VALUES (ts1, 10.3, 219, 0.31);
```
```
`ts1` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1` is a Unix timestamp. Only timestamps later than the current time minus the configured KEEP value are allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
### Insert Multiple Rows
### Insert Multiple Rows
...
@@ -43,7 +43,7 @@ Multiple rows can be inserted in a single SQL statement. The example below inser
...
@@ -43,7 +43,7 @@ Multiple rows can be inserted in a single SQL statement. The example below inser
`ts1` and `ts2` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1` and `ts2` are Unix timestamps. Only timestamps later than the current time minus the configured KEEP value are allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
### Insert into Multiple Tables
### Insert into Multiple Tables
...
@@ -53,7 +53,7 @@ Data can be inserted into multiple tables in the same SQL statement. The example
...
@@ -53,7 +53,7 @@ Data can be inserted into multiple tables in the same SQL statement. The example
`ts1`, `ts2` and `ts3` is Unix timestamp, the timestamps which is larger than the difference between current time and KEEP in config is only allowed. For further detial, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
`ts1`, `ts2` and `ts3` are Unix timestamps. Only timestamps later than the current time minus the configured KEEP value are allowed. For further detail, refer to [TDengine SQL insert timestamp section](/taos-sql/insert).
For more details about `INSERT` please refer to [INSERT](/taos-sql/insert).
For more details about `INSERT` please refer to [INSERT](/taos-sql/insert).
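The timestamp rule and the multi-row, multi-table forms described above can be sketched in Python. This is a minimal illustration, not the client API: it assumes millisecond timestamps and a KEEP value expressed in days, and it assumes that row groups and additional tables are separated by spaces within one statement. The helper names `within_keep` and `build_insert`, the table `d1002`, and all sample values other than the `d1001` row are hypothetical.

```python
import time

MS_PER_DAY = 24 * 60 * 60 * 1000


def within_keep(ts_ms, keep_days, now_ms=None):
    """Return True if ts_ms is later than (current time - KEEP).

    Assumes millisecond timestamps and KEEP given in days.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return ts_ms > now_ms - keep_days * MS_PER_DAY


def build_insert(rows_by_table):
    """Build one INSERT statement covering several tables and rows.

    rows_by_table maps a table name to a list of row tuples.
    Only numeric values are handled; strings are not quoted or escaped.
    """
    parts = []
    for table, rows in rows_by_table.items():
        values = " ".join(
            "(" + ", ".join(str(v) for v in row) + ")" for row in rows
        )
        parts.append(f"{table} VALUES {values}")
    return "INSERT INTO " + " ".join(parts) + ";"


now_ms = 1_700_000_000_000  # fixed "current time" so the example is repeatable
ts1 = now_ms - 1000
sql = build_insert({
    "d1001": [(ts1, 10.3, 219, 0.31), (ts1 + 1000, 12.6, 218, 0.33)],
    "d1002": [(ts1, 11.8, 221, 0.28)],  # hypothetical second table
})
print(sql)
print(within_keep(ts1, keep_days=1, now_ms=now_ms))
```

Checking `within_keep` on the client before sending a statement is only a convenience; the server remains the authority on which timestamps it accepts.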
@@ -17,7 +17,7 @@ When you create a user-defined function, you must implement standard interface f
...
@@ -17,7 +17,7 @@ When you create a user-defined function, you must implement standard interface f
- For aggregate functions, implement the `aggfn_start`, `aggfn`, and `aggfn_finish` interface functions.
- For aggregate functions, implement the `aggfn_start`, `aggfn`, and `aggfn_finish` interface functions.
- To initialize your function, implement the `udf_init` function. To terminate your function, implement the `udf_destroy` function.
- To initialize your function, implement the `udf_init` function. To terminate your function, implement the `udf_destroy` function.
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name\>_start, <udf-name\>_finish, <udf-name\>_init, and <udf-name\>_destroy, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
There are strict naming conventions for these interface functions. The names of the start, finish, init, and destroy interfaces must be `<udf-name>_start`, `<udf-name>_finish`, `<udf-name>_init`, and `<udf-name>_destroy`, respectively. Replace `scalarfn`, `aggfn`, and `udf` with the name of your user-defined function.
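As a quick illustration of the naming convention, the following Python sketch lists the interface function names that go with a given UDF name. The helper `required_interfaces` and the example names `bit_and` and `l2norm` are hypothetical and exist only for this illustration.

```python
def required_interfaces(udf_name, aggregate=False):
    """List the interface function names that follow from a UDF name."""
    names = []
    if aggregate:
        # Aggregate functions: start, per-block, and finish interfaces.
        names += [f"{udf_name}_start", udf_name, f"{udf_name}_finish"]
    else:
        # Scalar functions: a single interface named after the UDF.
        names.append(udf_name)
    # Initialization and termination interfaces.
    names += [f"{udf_name}_init", f"{udf_name}_destroy"]
    return names


print(required_interfaces("bit_and"))
print(required_interfaces("l2norm", aggregate=True))
```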
### Implementing a Scalar Function in C
### Implementing a Scalar Function in C
The implementation of a scalar function is described as follows:
The implementation of a scalar function is described as follows:
...
@@ -318,7 +318,7 @@ The implementation of a scalar UDF is described as follows:
...
@@ -318,7 +318,7 @@ The implementation of a scalar UDF is described as follows:
Description: this function prcesses datablock, which is the input; you can use datablock.data(row, col) to access the python object at location(row,col); the output is a tuple object consisted of objects of type outputtype
Description: this function processes datablock, which is the input; you can use datablock.data(row, col) to access the Python object at location (row, col); the output is a tuple object consisting of objects of type outputtype
Note: aggregate UDF requires init(), destroy(), start(), reduce() and finish() to be impemented. start() generates the initial result in buffer, then the input data is divided into multiple row data blocks, reduce() is invoked for each data block `inputs` and intermediate `buf`, finally finish() is invoked to generate final result from the intermediate result `buf`.
Note: an aggregate UDF requires init(), destroy(), start(), reduce() and finish() to be implemented. start() generates the initial result in a buffer. The input data is then divided into multiple row data blocks, and reduce() is invoked for each data block `inputs` together with the intermediate result `buf`. Finally, finish() is invoked to generate the final result from the intermediate result `buf`.
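The call sequence in the note can be sketched as a small simulation that runs outside TDengine. It is a sketch under stated assumptions: `FakeBlock` is a hypothetical stand-in for the data block object, its `shape()` method is an assumption (only `data(row, col)` is described above), the intermediate buffer is assumed to be bytes produced with `pickle`, and `init()` and `destroy()` are omitted because this example needs no setup or cleanup. The aggregate computed here is a mean of non-null values.

```python
import pickle


class FakeBlock:
    """Hypothetical stand-in for the data block passed to a UDF."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        # (number of rows, number of columns); assumed helper.
        return len(self._rows), len(self._rows[0]) if self._rows else 0

    def data(self, row, col):
        return self._rows[row][col]


def start():
    # Initial intermediate result: running total and count of values.
    return pickle.dumps((0, 0))


def reduce(inputs, buf):
    # Merge one data block into the intermediate result.
    total, count = pickle.loads(buf)
    rows, _ = inputs.shape()
    for i in range(rows):
        value = inputs.data(i, 0)
        if value is None:
            continue  # skip null values
        total += value
        count += 1
    return pickle.dumps((total, count))


def finish(buf):
    # Turn the intermediate result into the final result.
    total, count = pickle.loads(buf)
    return None if count == 0 else total / count


# Simulated framework: the input is split into blocks, reduce() runs once
# per block, and finish() runs once at the end.
blocks = [FakeBlock([[3], [9]]), FakeBlock([[None], [1]]), FakeBlock([[5]])]
buf = start()
for block in blocks:
    buf = reduce(block, buf)
result = finish(buf)
print(result)  # 4.5
```

Keeping every piece of state inside `buf` is what lets the framework call `reduce()` block by block without the function remembering anything between calls.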
### Data Mapping between TDengine SQL and Python UDF
### Data Mapping between TDengine SQL and Python UDF
...
@@ -559,7 +559,7 @@ Note: Prior to TDengine 3.0.5.0 (excluding), updating a UDF requires to restart
...
@@ -559,7 +559,7 @@ Note: Prior to TDengine 3.0.5.0 (excluding), updating a UDF requires to restart
#### Sample 3: UDF with n arguments
#### Sample 3: UDF with n arguments
A UDF which accepts n intergers, likee (x1, x2, ..., xn) and output the sum of the product of each value and its sequence number: 1 * x1 + 2 * x2 + ... + n * xn. If there is `null` in the input, then the result is `null`. The difference from sample 1 is that it can accept any number of columns as input and process each column. Assume the program is written in /root/udf/nsum.py:
A UDF which accepts n integers, like (x1, x2, ..., xn), and outputs the sum of the product of each value and its sequence number: 1 * x1 + 2 * x2 + ... + n * xn. If there is `null` in the input, then the result is `null`. The difference from sample 1 is that it can accept any number of columns as input and process each column. Assume the program is written in /root/udf/nsum.py:
```python
```python
def init():
def init():
...
@@ -607,7 +607,7 @@ Query OK, 4 row(s) in set (0.010653s)
...
@@ -607,7 +607,7 @@ Query OK, 4 row(s) in set (0.010653s)
#### Sample 4: Utilize 3rd party package
#### Sample 4: Utilize 3rd party package
A UDF which accepts a timestamp and output the next closed Sunday. This sample requires to use third party package `moment`, you need to install it firslty.
A UDF which accepts a timestamp and outputs the next closest Sunday. This sample requires the third-party package `moment`, which must be installed first.
```shell
```shell
pip3 install moment
pip3 install moment
...
@@ -701,7 +701,7 @@ Query OK, 4 row(s) in set (1.011474s)
...
@@ -701,7 +701,7 @@ Query OK, 4 row(s) in set (1.011474s)
#### Sample 5: Aggregate Function
#### Sample 5: Aggregate Function
An aggregate function which calculates the difference of the maximum and the minimum in a column. An aggregate funnction takes multiple rows as input and output only one data. The execution process of an aggregate UDF is like map-reduce, the framework divides the input into multiple parts, each mapper processes one block and the reducer aggregates the result of the mappers. The reduce() of Python UDF has the functionality of both map() and reduce(). The reduce() takes two arguments: the data to be processed; and the result of other tasks executing reduce(). For exmaple, assume the code is in `/root/udf/myspread.py`.
An aggregate function which calculates the difference between the maximum and the minimum in a column. An aggregate function takes multiple rows as input and outputs a single value. The execution process of an aggregate UDF is like map-reduce: the framework divides the input into multiple parts, each mapper processes one block, and the reducer aggregates the results of the mappers. The reduce() of a Python UDF has the functionality of both map() and reduce(). It takes two arguments: the data to be processed, and the result of other tasks executing reduce(). For example, assume the code is in `/root/udf/myspread.py`.
```python
```python
import io
import io
...
@@ -755,7 +755,7 @@ In this example, we implemented an aggregate function, and added some logging.
...
@@ -755,7 +755,7 @@ In this example, we implemented an aggregate function, and added some logging.
2. log() is the function for logging, it converts the input object to string and output with an end of line
2. log() is the function for logging, it converts the input object to string and output with an end of line
3. destroy() closes the log file \
3. destroy() closes the log file \
4. start() returns the initial buffer for storing the intermediate result
4. start() returns the initial buffer for storing the intermediate result
5. reduce() processes each daa block and aggregates the result
5. reduce() processes each data block and aggregates the result
6. finish() converts the final buffer() to final result\
6. finish() converts the final buffer() to final result\
@@ -672,7 +672,7 @@ If you input a specific column, the number of non-null values in the column is r
...
@@ -672,7 +672,7 @@ If you input a specific column, the number of non-null values in the column is r
ELAPSED(ts_primary_key[,time_unit])
ELAPSED(ts_primary_key[,time_unit])
```
```
**Description**: `elapsed` function can be used to calculate the continuous time length in which there is valid data. If it's used with `INTERVAL` clause, the returned result is the calculated time length within each time window. If it's used without `INTERVAL` caluse, the returned result is the calculated time length within the specified time range. Please be noted that the return value of `elapsed` is the number of `time_unit` in the calculated time length.
**Description**: The `elapsed` function calculates the continuous time length in which there is valid data. If it is used with an `INTERVAL` clause, the returned result is the calculated time length within each time window. If it is used without an `INTERVAL` clause, the returned result is the calculated time length within the specified time range. Note that the return value of `elapsed` is the number of `time_unit` in the calculated time length.
**Return value type**: Double if the input value is not NULL;
**Return value type**: Double if the input value is not NULL;
@@ -21,7 +21,7 @@ part_list can be any scalar expression, such as a column, constant, scalar funct
...
@@ -21,7 +21,7 @@ part_list can be any scalar expression, such as a column, constant, scalar funct
A PARTITION BY clause is processed as follows:
A PARTITION BY clause is processed as follows:
- The PARTITION BY clause must occur after the WHERE clause
- The PARTITION BY clause must occur after the WHERE clause
- The PARTITION BY caluse partitions the data according to the specified dimensions, then perform computation on each partition. The performed computation is determined by the rest of the statement - a window clause, GROUP BY clause, or SELECT clause.
- The PARTITION BY clause partitions the data according to the specified dimensions, then performs computation on each partition. The performed computation is determined by the rest of the statement - a window clause, GROUP BY clause, or SELECT clause.
- The PARTITION BY clause can be used together with a window clause or GROUP BY clause. In this case, the window or GROUP BY clause takes effect on every partition. For example, the following statement partitions the table by the location tag, performs downsampling over a 10 minute window, and returns the maximum value:
- The PARTITION BY clause can be used together with a window clause or GROUP BY clause. In this case, the window or GROUP BY clause takes effect on every partition. For example, the following statement partitions the table by the location tag, performs downsampling over a 10 minute window, and returns the maximum value: