index.mdx 8.7 KB
Newer Older
1
---
2
title: Query Data
D
danielclow 已提交
3
description: This document describes how to query data in TDengine and how to perform synchronous and asynchronous queries using connectors.
4 5 6 7 8 9 10 11 12 13 14
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
import JavaQuery from "./_java.mdx";
import PyQuery from "./_py.mdx";
import GoQuery from "./_go.mdx";
import RustQuery from "./_rust.mdx";
import NodeQuery from "./_js.mdx";
import CsQuery from "./_cs.mdx";
import CQuery from "./_c.mdx";
15
import PhpQuery from "./_php.mdx";
16 17 18 19 20 21 22
import PyAsync from "./_py_async.mdx";
import NodeAsync from "./_js_async.mdx";
import CsAsync from "./_cs_async.mdx";
import CAsync from "./_c_async.mdx";

## Introduction

23
SQL is used by TDengine as its query language. Application programs can send SQL statements to TDengine through REST API or connectors. TDengine's CLI `taos` can also be used to execute ad hoc SQL queries. Here is the list of major query functionalities supported by TDengine:
24 25

- Query on single column or multiple columns
26
- Filter on tags or data columns: >, <, =, <\>, like
27 28
- Grouping of results: `Group By` - Sorting of results: `Order By` - Limit the number of results: `Limit/Offset`
- Windowed aggregate queries for time windows (interval), session windows (session), and state windows (state_window) 
29 30 31 32
- Arithmetic on columns of numeric types or aggregate results
- Join query with timestamp alignment
- Aggregate functions: count, max, min, avg, sum, twa, stddev, leastsquares, top, bottom, first, last, percentile, apercentile, last_row, spread, diff

33
For example, the SQL statement below can be executed in TDengine CLI `taos` to select records with voltage greater than 215 and limit the output to only 2 rows.
34 35 36 37 38 39 40 41 42 43

```sql
taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
           ts            |       current        |   voltage   |        phase         |
======================================================================================
 2018-10-03 14:38:16.800 |             12.30000 |         221 |              0.31000 |
 2018-10-03 14:38:15.000 |             12.60000 |         218 |              0.33000 |
Query OK, 2 row(s) in set (0.001100s)
```

44
To meet the requirements of varied use cases, some special functions have been added in TDengine. Some examples are `twa` (Time Weighted Average), `spread` (The difference between the maximum and the minimum), and `last_row` (the last row).
45

G
gccgdb1234 已提交
46
For detailed query syntax, see [Select](../../taos-sql/select).
47 48 49

## Aggregation among Tables

50
In most use cases, there are always multiple kinds of data collection points. A new concept, called STable (abbreviation for super table), is used in TDengine to represent one type of data collection point, and a subtable is used to represent a specific data collection point of that type. Tags are used by TDengine to represent the static properties of data collection points. A specific data collection point has its own values for static properties. By specifying filter conditions on tags, aggregation can be performed efficiently among all the subtables created via the same STable, i.e. same type of data collection points. Aggregate functions applicable for tables can be used directly on STables; the syntax is exactly the same.
51 52 53

### Example 1

54
In TDengine CLI `taos`, use the SQL below to get the average voltage of all the meters in California grouped by location.
55 56

```
X
Xiaoyu Wang 已提交
57 58 59 60 61 62
taos> SELECT AVG(voltage), location FROM meters GROUP BY location;
       avg(voltage)        |                             location                             |
===============================================================================================
             219.200000000 | California.SanFrancisco                                          |
             221.666666667 | California.LosAngeles                                            |
Query OK, 2 rows in database (0.005995s)
63 64 65 66
```

### Example 2

X
Xiaoyu Wang 已提交
67
In TDengine CLI `taos`, use the SQL below to get the number of rows and the maximum current from meters whose groupId is 2.
68 69

```
X
Xiaoyu Wang 已提交
70
taos> SELECT count(*), max(current) FROM meters where groupId = 2;
71
     count(*)  |    max(current)  |
72 73 74 75 76
==================================
            5 |             13.4 |
Query OK, 1 row(s) in set (0.002136s)
```

G
gccgdb1234 已提交
77
In [Select](../../taos-sql/select), all query operations are marked as to whether they support STables or not.
78 79 80

## Down Sampling and Interpolation

81
In IoT use cases, down sampling is widely used to aggregate data by time range. The `INTERVAL` keyword in TDengine can be used to simplify the query by time window. For example, the SQL statement below can be used to get the sum of current every 10 seconds from meters table d1001.
82 83

```
X
Xiaoyu Wang 已提交
84 85
taos> SELECT _wstart, sum(current) FROM d1001 INTERVAL(10s);
         _wstart         |       sum(current)        |
86 87 88
======================================================
 2018-10-03 14:38:00.000 |              10.300000191 |
 2018-10-03 14:38:10.000 |              24.900000572 |
X
Xiaoyu Wang 已提交
89
Query OK, 2 rows in database (0.003139s)
90 91
```

G
gccgdb1234 已提交
92
Down sampling can also be used for STable. For example, the below SQL statement can be used to get the sum of current from all meters in California.
93 94

```
X
Xiaoyu Wang 已提交
95 96
taos> SELECT _wstart, SUM(current) FROM meters where location like "California%" INTERVAL(1s);
         _wstart         |       sum(current)        |
97 98
======================================================
 2018-10-03 14:38:04.000 |              10.199999809 |
X
Xiaoyu Wang 已提交
99
 2018-10-03 14:38:05.000 |              23.699999809 |
100 101
 2018-10-03 14:38:06.000 |              11.500000000 |
 2018-10-03 14:38:15.000 |              12.600000381 |
X
Xiaoyu Wang 已提交
102 103
 2018-10-03 14:38:16.000 |              34.400000572 |
Query OK, 5 rows in database (0.007413s)
104 105
```

S
Sean Ely 已提交
106
Down sampling also supports time offset. For example, the below SQL statement can be used to get the sum of current from all meters but each time window must start at the boundary of 500 milliseconds.
107 108

```
X
Xiaoyu Wang 已提交
109 110
taos> SELECT _wstart, SUM(current) FROM meters INTERVAL(1s, 500a);
         _wstart         |       sum(current)        |
111
======================================================
X
Xiaoyu Wang 已提交
112 113 114 115 116 117 118
 2018-10-03 14:38:03.500 |              10.199999809 |
 2018-10-03 14:38:04.500 |              10.300000191 |
 2018-10-03 14:38:05.500 |              13.399999619 |
 2018-10-03 14:38:06.500 |              11.500000000 |
 2018-10-03 14:38:14.500 |              12.600000381 |
 2018-10-03 14:38:16.500 |              34.400000572 |
Query OK, 6 rows in database (0.005515s)
119 120
```

S
Sean Ely 已提交
121
In many use cases, it's hard to align the timestamp of the data collected by each collection point. However, a lot of algorithms like FFT require the data to be aligned with same time interval and application programs have to handle this by themselves. In TDengine, it's easy to achieve the alignment using down sampling.
122 123 124

Interpolation can be performed in TDengine if there is no data in a time range.

G
gccgdb1234 已提交
125
For more information, see [Aggregate by Window](../../taos-sql/distinguished).
126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154

## Examples

### Query

In the section describing [Insert](/develop/insert-data/sql-writing), a database named `power` is created and some data are inserted into STable `meters`. Below sample code demonstrates how to query the data in this STable.

<Tabs defaultValue="java" groupId="lang">
  <TabItem label="Java" value="java">
    <JavaQuery />
  </TabItem>
  <TabItem label="Python" value="python">
    <PyQuery />
  </TabItem>
  <TabItem label="Go" value="go">
    <GoQuery />
  </TabItem>
  <TabItem label="Rust" value="rust">
    <RustQuery />
  </TabItem>
  <TabItem label="Node.js" value="nodejs">
    <NodeQuery />
  </TabItem>
  <TabItem label="C#" value="csharp">
    <CsQuery />
  </TabItem>
  <TabItem label="C" value="c">
    <CQuery />
  </TabItem>
155 156 157
  <TabItem label="PHP" value="php">
    <PhpQuery />
  </TabItem>
158 159 160 161
</Tabs>

:::note

S
Sean Ely 已提交
162
1. With either REST connection or native connection, the above sample code works well.
163
2. Please note that `use db` can't be used in case of REST connection because it's stateless. You can specify the database name by either the REST endpoint's parameter or <db_name>.<table_name> in the SQL command.
164 165 166 167 168

:::

### Asynchronous Query

169
Besides synchronous queries, an asynchronous query API is also provided by TDengine to insert or query data more efficiently. With a similar hardware and software environment, the async API is 2~4 times faster than sync APIs. Async API works in non-blocking mode, which means an operation can be returned without finishing so that the calling thread can switch to other work to improve the performance of the whole application system. Async APIs perform especially better in the case of poor networks.
170

S
Sean Ely 已提交
171
Please note that async query can only be used with a native connection.
172 173 174 175 176 177 178 179 180 181 182 183

<Tabs defaultValue="python" groupId="lang">
  <TabItem label="Python" value="python">
    <PyAsync />
  </TabItem>
  <TabItem label="C#" value="csharp">
    <CsAsync />
  </TabItem>
  <TabItem label="C" value="c">
    <CAsync />
  </TabItem>
</Tabs>