提交 721593ec 编写于 作者: X Xiaoyu Wang

docs: query data doc adjust

上级 6248d57d
......@@ -9,7 +9,7 @@ The data model employed by TDengine is similar to that of a relational database.
The [characteristics of time-series data](https://www.taosdata.com/blog/2019/07/09/86.html) from different data collection points may be different. Characteristics include collection frequency, retention policy and others which determine how you create and configure the database. For e.g. days to keep, number of replicas, data block size, whether data updates are allowed and other configurable parameters would be determined by the characteristics of your data and your business requirements. For TDengine to operate with the best performance, we strongly recommend that you create and configure different databases for data with different characteristics. This allows you, for example, to set up different storage and retention policies. When creating a database, there are a lot of parameters that can be configured such as, the days to keep data, the number of replicas, the number of memory blocks, time precision, the minimum and maximum number of rows in each data block, whether compression is enabled, the time range of the data in single data file and so on. Below is an example of the SQL statement to create a database.
```sql
CREATE DATABASE power KEEP 365 DURATION 10 BUFFER 16 VGROUPS 100 WAL 1;
CREATE DATABASE power KEEP 365 DURATION 10 BUFFER 16 WAL_LEVEL 1;
```
In the above SQL statement:
......
......@@ -61,20 +61,20 @@ In summary, records across subtables can be aggregated by a simple query on thei
In TDengine CLI `taos`, use the SQL below to get the average voltage of all the meters in California grouped by location.
```
taos> SELECT AVG(voltage) FROM meters GROUP BY location;
avg(voltage) | location |
=============================================================
222.000000000 | California.LosAngeles |
219.200000000 | California.SanFrancisco |
Query OK, 2 row(s) in set (0.002136s)
taos> SELECT AVG(voltage), location FROM meters GROUP BY location;
avg(voltage) | location |
===============================================================================================
219.200000000 | California.SanFrancisco |
221.666666667 | California.LosAngeles |
Query OK, 2 rows in database (0.005995s)
```
### Example 2
In TDengine CLI `taos`, use the SQL below to get the number of rows and the maximum current in the past 24 hours from meters whose groupId is 2.
In TDengine CLI `taos`, use the SQL below to get the number of rows and the maximum current from meters whose groupId is 2.
```
taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now - 24h;
taos> SELECT count(*), max(current) FROM meters where groupId = 2;
count(*) | max(current) |
==================================
5 | 13.4 |
......@@ -88,40 +88,41 @@ Join queries are only allowed between subtables of the same STable. In [Select](
In IoT use cases, down sampling is widely used to aggregate data by time range. The `INTERVAL` keyword in TDengine can be used to simplify the query by time window. For example, the SQL statement below can be used to get the sum of current every 10 seconds from meters table d1001.
```
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
ts | sum(current) |
taos> SELECT _wstart, sum(current) FROM d1001 INTERVAL(10s);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:00.000 | 10.300000191 |
2018-10-03 14:38:10.000 | 24.900000572 |
Query OK, 2 row(s) in set (0.000883s)
Query OK, 2 rows in database (0.003139s)
```
Down sampling can also be used for STable. For example, the below SQL statement can be used to get the sum of current from all meters in California.
```
taos> SELECT SUM(current) FROM meters where location like "California%" INTERVAL(1s);
ts | sum(current) |
taos> SELECT _wstart, SUM(current) FROM meters where location like "California%" INTERVAL(1s);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:04.000 | 10.199999809 |
2018-10-03 14:38:05.000 | 32.900000572 |
2018-10-03 14:38:05.000 | 23.699999809 |
2018-10-03 14:38:06.000 | 11.500000000 |
2018-10-03 14:38:15.000 | 12.600000381 |
2018-10-03 14:38:16.000 | 36.000000000 |
Query OK, 5 row(s) in set (0.001538s)
2018-10-03 14:38:16.000 | 34.400000572 |
Query OK, 5 rows in database (0.007413s)
```
Down sampling also supports time offset. For example, the below SQL statement can be used to get the sum of current from all meters but each time window must start at the boundary of 500 milliseconds.
```
taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
ts | sum(current) |
taos> SELECT _wstart, SUM(current) FROM meters INTERVAL(1s, 500a);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:04.500 | 11.189999809 |
2018-10-03 14:38:05.500 | 31.900000572 |
2018-10-03 14:38:06.500 | 11.600000000 |
2018-10-03 14:38:15.500 | 12.300000381 |
2018-10-03 14:38:16.500 | 35.000000000 |
Query OK, 5 row(s) in set (0.001521s)
2018-10-03 14:38:03.500 | 10.199999809 |
2018-10-03 14:38:04.500 | 10.300000191 |
2018-10-03 14:38:05.500 | 13.399999619 |
2018-10-03 14:38:06.500 | 11.500000000 |
2018-10-03 14:38:14.500 | 12.600000381 |
2018-10-03 14:38:16.500 | 34.400000572 |
Query OK, 6 rows in database (0.005515s)
```
In many use cases, it's hard to align the timestamp of the data collected by each collection point. However, a lot of algorithms like FFT require the data to be aligned with same time interval and application programs have to handle this by themselves. In TDengine, it's easy to achieve the alignment using down sampling.
......
......@@ -54,20 +54,20 @@ Query OK, 2 row(s) in set (0.001100s)
在 TAOS Shell,查找加利福尼亚州所有智能电表采集的电压平均值,并按照 location 分组。
```
taos> SELECT AVG(voltage) FROM meters GROUP BY location;
avg(voltage) | location |
=============================================================
222.000000000 | California.LosAngeles |
219.200000000 | California.SanFrancisco |
Query OK, 2 row(s) in set (0.002136s)
taos> SELECT AVG(voltage), location FROM meters GROUP BY location;
avg(voltage) | location |
===============================================================================================
219.200000000 | California.SanFrancisco |
221.666666667 | California.LosAngeles |
Query OK, 2 rows in database (0.005995s)
```
### 示例二
在 TAOS shell, 查找 groupId 为 2 的所有智能电表过去 24 小时的记录条数,电流的最大值。
在 TAOS shell, 查找 groupId 为 2 的所有智能电表的记录条数,电流的最大值。
```
taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now - 24h;
taos> SELECT count(*), max(current) FROM meters where groupId = 2;
cunt(*) | max(current) |
==================================
5 | 13.4 |
......@@ -81,40 +81,41 @@ Query OK, 1 row(s) in set (0.002136s)
物联网场景里,经常需要通过降采样(down sampling)将采集的数据按时间段进行聚合。TDengine 提供了一个简便的关键词 interval 让按照时间窗口的查询操作变得极为简单。比如,将智能电表 d1001 采集的电流值每 10 秒钟求和
```
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
ts | sum(current) |
taos> SELECT _wstart, sum(current) FROM d1001 INTERVAL(10s);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:00.000 | 10.300000191 |
2018-10-03 14:38:10.000 | 24.900000572 |
Query OK, 2 row(s) in set (0.000883s)
Query OK, 2 rows in database (0.003139s)
```
降采样操作也适用于超级表,比如:将加利福尼亚州所有智能电表采集的电流值每秒钟求和
```
taos> SELECT SUM(current) FROM meters where location like "California%" INTERVAL(1s);
ts | sum(current) |
taos> SELECT _wstart, SUM(current) FROM meters where location like "California%" INTERVAL(1s);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:04.000 | 10.199999809 |
2018-10-03 14:38:05.000 | 32.900000572 |
2018-10-03 14:38:05.000 | 23.699999809 |
2018-10-03 14:38:06.000 | 11.500000000 |
2018-10-03 14:38:15.000 | 12.600000381 |
2018-10-03 14:38:16.000 | 36.000000000 |
Query OK, 5 row(s) in set (0.001538s)
2018-10-03 14:38:16.000 | 34.400000572 |
Query OK, 5 rows in database (0.007413s)
```
降采样操作也支持时间偏移,比如:将所有智能电表采集的电流值每秒钟求和,但要求每个时间窗口从 500 毫秒开始
```
taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
ts | sum(current) |
taos> SELECT _wstart, SUM(current) FROM meters INTERVAL(1s, 500a);
_wstart | sum(current) |
======================================================
2018-10-03 14:38:04.500 | 11.189999809 |
2018-10-03 14:38:05.500 | 31.900000572 |
2018-10-03 14:38:06.500 | 11.600000000 |
2018-10-03 14:38:15.500 | 12.300000381 |
2018-10-03 14:38:16.500 | 35.000000000 |
Query OK, 5 row(s) in set (0.001521s)
2018-10-03 14:38:03.500 | 10.199999809 |
2018-10-03 14:38:04.500 | 10.300000191 |
2018-10-03 14:38:05.500 | 13.399999619 |
2018-10-03 14:38:06.500 | 11.500000000 |
2018-10-03 14:38:14.500 | 12.600000381 |
2018-10-03 14:38:16.500 | 34.400000572 |
Query OK, 6 rows in database (0.005515s)
```
物联网场景里,每个数据采集点采集数据的时间是难同步的,但很多分析算法(比如 FFT)需要把采集的数据严格按照时间等间隔的对齐,在很多系统里,需要应用自己写程序来处理,但使用 TDengine 的降采样操作就轻松解决。
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册