06-select.md 18.1 KB
Newer Older
1
---
2
sidebar_label: Select
3 4 5 6 7
title: Select
---

## Syntax

8 9 10 11 12 13
```sql
SELECT {DATABASE() | CLIENT_VERSION() | SERVER_VERSION() | SERVER_STATUS() | NOW() | TODAY() | TIMEZONE()}

SELECT [DISTINCT] select_list
    from_clause
    [WHERE condition]
D
dapan1121 已提交
14
    [partition_by_clause]
15 16 17
    [window_clause]
    [group_by_clause]
    [order_by_clasue]
18 19
    [SLIMIT limit_val [SOFFSET offset_val]]
    [LIMIT limit_val [OFFSET offset_val]]
20
    [>> export_file]
21

22 23
select_list:
    select_expr [, select_expr] ...
24

25 26 27 28 29 30 31
select_expr: {
    *
  | query_name.*
  | [schema_name.] {table_name | view_name} .*
  | t_alias.*
  | expr [[AS] c_alias]
}
32

33 34 35 36 37 38 39 40 41 42 43 44 45
from_clause: {
    table_reference [, table_reference] ...
  | join_clause [, join_clause] ...
}

table_reference:
    table_expr t_alias

table_expr: {
    table_name
  | view_name
  | ( subquery )
}
46

47 48
join_clause:
    table_reference [INNER] JOIN table_reference ON condition
49

50 51 52 53 54
window_clause: {
    SESSION(ts_col, tol_val)
  | STATE_WINDOW(col)
  | INTERVAL(interval_val [, interval_offset]) [SLIDING (sliding_val)] [WATERMARK(watermark_val)] [FILL(fill_mod_and_val)]

D
dapan1121 已提交
55 56 57
partition_by_clause:
    PARTITION BY expr [, expr] ... 

58 59 60 61 62 63 64 65
group_by_clause:
    GROUP BY expr [, expr] ... HAVING condition

order_by_clasue:
    ORDER BY order_expr [, order_expr] ...

order_expr:
    {expr | position | c_alias} [DESC | ASC] [NULLS FIRST | NULLS LAST]
66
```
67 68 69 70 71

## Lists

A query can be performed on some or all columns. Data and tag columns can all be included in the SELECT list.

72
### Wildcards
73

W
wade zhang 已提交
74
You can use an asterisk (\*) as a wildcard character to indicate all columns. For normal tables or sub-tables, the asterisk indicates only data columns. For supertables, tag columns are also included when using asterisk (\*).
75 76 77

```sql
SELECT * FROM d1001;
78 79
```

80
You can use a table name as a prefix before an asterisk. For example, the following SQL statements both return all columns from the d1001 table:
81

82
```sql
83 84 85 86
SELECT * FROM d1001;
SELECT d1001.* FROM d1001;
```

87
However, in a JOIN query, using a table name prefix with an asterisk returns different results. In this case, querying * returns all data in all columns in all tables (not including tags), whereas using a table name prefix returns all data in all columns in the specified table only.
88

89 90 91
```sql
SELECT * FROM d1001, d1003 WHERE d1001.ts=d1003.ts;
SELECT d1001.* FROM d1001,d1003 WHERE d1001.ts = d1003.ts;
92 93
```

94
The first of the preceding SQL statements returns all columns from the d1001 and d1003 tables, but the second of the preceding SQL statements returns all columns from the d1001 table only.
95

96 97
With regard to the other SQL functions that support wildcards, the differences are as follows:
`count(*)` only returns one column. `first`, `last`, and `last_row` return all columns.
98

99
### Tag Columns
100

101 102 103 104
You can query tag columns in supertables and subtables and receive results in the same way as querying data columns.

```sql
SELECT location, groupid, current FROM d1001 LIMIT 2;
105 106
```

107
### Distinct Values
108

109
The DISTINCT keyword returns only values that are different over one or more columns. You can use the DISTINCT keyword with tag columns and data columns.
110

111
The following SQL statement returns distinct values from a tag column:
112

113 114 115
```sql
SELECT DISTINCT tag_name [, tag_name ...] FROM stb_name;
```
116

117
The following SQL statement returns distinct values from a data column:
118 119 120 121 122 123 124

```sql
SELECT DISTINCT col_name [, col_name ...] FROM tb_name;
```

:::info

125 126
1. Configuration parameter `maxNumOfDistinctRes` in `taos.cfg` is used to control the number of rows to output. The minimum configurable value is 100,000, the maximum configurable value is 100,000,000, the default value is 1,000,000. If the actual number of rows exceeds the value of this parameter, only the number of rows specified by this parameter will be output.
2. It can't be guaranteed that the results selected by using `DISTINCT` on columns of `FLOAT` or `DOUBLE` are exactly unique because of the precision errors in floating point numbers.
127 128 129

:::

130
### Column Names
131

132
When using `SELECT`, the column names in the result set will be the same as that in the select clause if `AS` is not used. `AS` can be used to rename the column names in the result set. For example:
133

134
```sql
135 136 137 138 139
taos> SELECT ts, ts AS primary_key_ts FROM d1001;
```

`AS` can't be used together with `first(*)`, `last(*)`, or `last_row(*)`.

140 141
### Pseudocolumns

142 143
**Pseudocolumn:** A pseudo-column behaves like a table column but is not actually stored in the table. You can select from pseudo-columns, but you cannot insert, update, or delete their values. A pseudo-column is also similar to a function without arguments. This section describes these pseudo-columns:

144 145
**TBNAME**
The TBNAME pseudocolumn in a supertable contains the names of subtables within the supertable.
146

147
The following SQL statement returns all unique subtable names and locations within the meters supertable:
148

149 150 151
```mysql
SELECT DISTINCT TBNAME, location FROM meters;
```
152

153
Use the `INS_TAGS` system table in `INFORMATION_SCHEMA` to query the information for subtables in a supertable. For example, the following statement returns the name and tag values for each subtable in the `meters` supertable.
154

155 156
```mysql
SELECT table_name, tag_name, tag_type, tag_value FROM information_schema.ins_tags WHERE stable_name='meters';
157 158
```

159
The following SQL statement returns the number of subtables within the meters supertable.
160

161 162
```mysql
SELECT COUNT(*) FROM (SELECT DISTINCT TBNAME FROM meters);
163 164
```

165
In the preceding two statements, only tags can be used as filtering conditions in the WHERE clause. For example:
166

167
**\_QSTART and \_QEND**
168

169
The \_QSTART and \_QEND pseudocolumns contain the beginning and end of the time range of a query. If the WHERE clause in a statement does not contain valid timestamps, the time range is equal to [-2^63, 2^63 - 1].
170

171
The \_QSTART and \_QEND pseudocolumns cannot be used in a WHERE clause.
172

S
Sean Ely 已提交
173 174
**\_WSTART, \_WEND, and \_WDURATION**

175
The \_WSTART, \_WEND, and \_WDURATION pseudocolumns indicate the beginning, end, and duration of a window.
176

177
These pseudocolumns can be used only in time window-based aggregations and must occur after the aggregation clause.
178

179
**\_c0 and \_ROWTS**
180

181 182 183 184
In TDengine, the first column of all tables must be a timestamp. This column is the primary key of the table. The \_c0 and \_ROWTS pseudocolumns both represent the values of this column. These pseudocolumns enable greater flexibility and standardization. For example, you can use functions such as MAX and MIN with these pseudocolumns.

```sql
select _rowts, max(current) from meters;
185 186
```

G
Ganlin Zhao 已提交
187 188
**\_IROWTS**

G
Ganlin Zhao 已提交
189
The \_IROWTS pseudocolumn can only be used with INTERP function. This pseudocolumn can be used to retrieve the corresponding timestamp column associated with the interpolation results.
G
Ganlin Zhao 已提交
190 191 192 193 194

```sql
select _irowts, interp(current) from meters range('2020-01-01 10:00:00', '2020-01-01 10:30:00') every(1s) fill(linear);
```

195
## Query Objects
196

197 198
`FROM` can be followed by a number of tables or super tables, or can be followed by a sub-query.
If no database is specified as current database in use, table names must be preceded with database name, for example, `power.d1001`.
199

200
You can perform INNER JOIN statements based on the primary key. The following conditions apply:
201

202 203 204 205 206 207 208
1. You can use FROM table list or an explicit JOIN clause.
2. For standard tables and subtables, you must specify an ON condition and the condition must be equivalent to the primary key.
3. For supertables, the ON condition must be equivalent to the primary key. In addition, the tag columns of the tables on which the INNER JOIN is performed must have a one-to-one relationship. You cannot specify an OR condition.
4. The tables that are included in a JOIN clause must be of the same type (supertable, standard table, or subtable).
5. You can include subqueries before and after the JOIN keyword.
6. You cannot include more than ten tables in a JOIN clause.
7. You cannot include a FILL clause and a JOIN clause in the same statement.
209

210
## GROUP BY
211

212
If you use a GROUP BY clause, the SELECT list can only include the following items:
213

214 215 216 217
1. Constants
2. Aggregate functions
3. Expressions that are consistent with the expression following the GROUP BY clause
4. Expressions that include the preceding expression
218

219
The GROUP BY clause groups each row of data by the value of the expression following the clause and returns a combined result for each group.
220

221
The expressions in a GROUP BY clause can include any column in any table or view. It is not necessary that the expressions appear in the SELECT list.
222

223
The GROUP BY clause does not guarantee that the results are ordered. If you want to ensure that grouped data is ordered, use the ORDER BY clause.
224 225


226
## PARTITION BY
227

228
The PARTITION BY clause is a TDengine-specific extension to standard SQL. This clause partitions data based on the part_list and performs computations per partition.
229

230
For more information, see TDengine Extensions.
231

232
## ORDER BY
233

234
The ORDER BY keyword orders query results. If you do not include an ORDER BY clause in a query, the order of the results can be inconsistent.
235

236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254
You can specify integers after ORDER BY to indicate the order in which you want the items in the SELECT list to be displayed. For example, 1 indicates the first item in the select list.

You can specify ASC for ascending order or DESC for descending order.

You can also use the NULLS keyword to specify the position of null values. Ascending order uses NULLS LAST by default. Descending order uses NULLS FIRST by default.

## LIMIT

The LIMIT keyword controls the number of results that are displayed. You can also use the OFFSET keyword to specify the result to display first. `LIMIT` and `OFFSET` are executed after `ORDER BY` in the query execution. You can include an offset in a LIMIT clause. For example, LIMIT 5 OFFSET 2 can also be written LIMIT 2, 5. Both of these clauses display the third through the seventh results.

In a statement that includes a PARTITON BY clause, the LIMIT keyword is performed on each partition, not on the entire set of results.

## SLIMIT

The SLIMIT keyword is used with a PARTITION BY clause to control the number of partitions that are displayed. You can include an offset in a SLIMIT clause. For example, SLIMIT 5 OFFSET 2 can also be written LIMIT 2, 5. Both of these clauses display the third through the seventh partitions.

Note: If you include an ORDER BY clause, only one partition can be displayed.

## Special Query
255

256
Some special query functions can be invoked without `FROM` sub-clause.
257

陶建辉(Jeff)'s avatar
陶建辉(Jeff) 已提交
258
### Obtain Current Database
259 260 261 262 263

The following SQL statement returns the current database. If a database has not been specified on login or with the `USE` command, a null value is returned.

```sql
SELECT DATABASE();
264 265
```

266
### Obtain Current Version
267

268 269 270
```sql
SELECT CLIENT_VERSION();
SELECT SERVER_VERSION();
271 272
```

陶建辉(Jeff)'s avatar
陶建辉(Jeff) 已提交
273
### Obtain Server Status
274 275 276 277 278

The following SQL statement returns the status of the TDengine server. An integer indicates that the server is running normally. An error code indicates that an error has occurred. This statement can also detect whether a connection pool or third-party tool is connected to TDengine properly. By using this statement, you can ensure that connections in a pool are not lost due to an incorrect heartbeat detection statement.

```sql
SELECT SERVER_STATUS();
279 280
```

281
### Obtain Current Time
282

283 284 285
```sql
SELECT NOW();
```
286

287
### Obtain Current Date
288

289 290 291
```sql
SELECT TODAY();
```
292

293
### Obtain Current Time Zone
294

295 296 297
```sql
SELECT TIMEZONE();
```
298 299 300 301 302

## Regular Expression

### Syntax

303
```txt
304 305 306 307 308
WHERE (column|tbname) **match/MATCH/nmatch/NMATCH** _regex_
```

### Specification

309
TDengine supports POSIX regular expression syntax. For more information, see [Regular Expressions](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html).
310 311 312

### Restrictions

313
Regular expression filtering is supported only on table names (TBNAME), BINARY tags, and NCHAR tags. Regular expression filtering cannot be performed on data columns.
314

315
A regular expression string cannot exceed 128 bytes. You can configure this value by modifying the maxRegexStringLen parameter on the TDengine Client. The modified value takes effect when the client is restarted.
316

X
Xiaoyu Wang 已提交
317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336
## CASE Expressions

### Syntax

```txt
CASE value WHEN compare_value THEN result [WHEN compare_value THEN result ...] [ELSE result] END
CASE WHEN condition THEN result [WHEN condition THEN result ...] [ELSE result] END
```

### Description
CASE expressions let you use IF ... THEN ... ELSE logic in SQL statements without having to invoke procedures. 

The first CASE syntax returns the `result` for the first `value`=`compare_value` comparison that is true. 

The second syntax returns the `result` for the first `condition` that is true. 

If no comparison or condition is true, the result after ELSE is returned, or NULL if there is no ELSE part.

The return type of the CASE expression is the result type of the first WHEN WHEN part, and the result type of the other WHEN WHEN parts and ELSE parts can be converted to them, otherwise TDengine will report an error.

337 338 339 340 341 342 343 344 345 346 347 348 349 350
### Examples

A device has three status codes to display its status. The statements are as follows:

```sql
SELECT CASE dev_status WHEN 1 THEN 'Running' WHEN 2 THEN 'Warning' WHEN 3 THEN 'Downtime' ELSE 'Unknown' END FROM dev_table;
```

The average voltage value of the smart meter is counted. When the voltage is less than 200 or more than 250, it is considered that the statistics is wrong, and the value is corrected to 220. The statement is as follows:

```sql
SELECT AVG(CASE WHEN voltage < 200 or voltage > 250 THEN 220 ELSE voltage END) FROM meters;
```

351 352
## JOIN

353
TDengine supports natural joins between supertables, between standard tables, and between subqueries. The difference between natural joins and inner joins is that natural joins require that the fields being joined in the supertables or standard tables must have the same name. Data or tag columns must be joined with the equivalent column in another table.
354

355
For standard tables, only the timestamp (primary key) can be used in join operations. For example:
356 357 358 359 360 361 362

```sql
SELECT *
FROM temp_tb_1 t1, pressure_tb_1 t2
WHERE t1.ts = t2.ts
```

363
For supertables, tags as well as timestamps can be used in join operations. For example:
364 365 366

```sql
SELECT *
367
FROM temp_stable t1, temp_stable t2
368 369 370
WHERE t1.ts = t2.ts AND t1.deviceid = t2.deviceid AND t1.status=0;
```

371
Similarly, join operations can be performed on the result sets of multiple subqueries.
372 373 374

:::note

375 376 377 378 379 380 381 382
The following restriction apply to JOIN statements:

- The number of tables or supertables in a single join operation cannot exceed 10.
- `FILL` cannot be used in a JOIN statement.
- Arithmetic operations cannot be performed on the result sets of join operation.
- `GROUP BY` is not allowed on a segment of the tables that participate in a join operation.
- `OR` cannot be used in the conditions for join operation
- Join operation can be performed only on tags or timestamps. You cannot perform a join operation on data columns.
383 384 385 386 387

:::

## Nested Query

388
Nested query is also called sub query. This means that in a single SQL statement the result of inner query can be used as the data source of the outer query.
389

S
Sean Ely 已提交
390
From 2.2.0.0, unassociated sub query can be used in the `FROM` clause. Unassociated means the sub query doesn't use the parameters in the parent query. More specifically, in the `tb_name_list` of `SELECT` statement, an independent SELECT statement can be used. So a complete nested query looks like:
391

392
```
393 394 395 396 397
SELECT ... FROM (SELECT ... FROM ...) ...;
```

:::info

G
gccgdb1234 已提交
398
- The result of a nested query is returned as a virtual table used by the outer query. It's recommended to give an alias to this table for the convenience of using it in the outer query.
399
- JOIN operation is allowed between tables/STables inside both inner and outer queries. Join operation can be performed on the result set of the inner query.
G
gccgdb1234 已提交
400
- The features that can be used in the inner query are the same as those that can be used in a non-nested query.
401 402
  - `ORDER BY` inside the inner query is unnecessary and will slow down the query performance significantly. It is best to avoid the use of `ORDER BY` inside the inner query.
- Compared to the non-nested query, the functionality that can be used in the outer query has the following restrictions:
403
  - Functions
G
gccgdb1234 已提交
404
    - If the result set returned by the inner query doesn't contain timestamp column, then functions relying on timestamp can't be used in the outer query, like INTERP,DERIVATIVE, IRATE, LAST_ROW, FIRST, LAST, TWA, STATEDURATION, TAIL, UNIQUE.
W
wade zhang 已提交
405
    - If the result set returned by the inner query are not sorted in order by timestamp, then functions relying on data ordered by timestamp can't be used in the outer query, like LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE. 
G
gccgdb1234 已提交
406
    - Functions that need to scan the data twice can't be used in the outer query, like PERCENTILE.
407 408 409 410 411

:::

## UNION ALL

412
```txt title=Syntax
413 414 415 416 417
SELECT ...
UNION ALL SELECT ...
[UNION ALL SELECT ...]
```

418
TDengine supports the `UNION ALL` operation. `UNION ALL` operator can be used to combine the result set from multiple select statements as long as the result set of these select statements have exactly the same columns. `UNION ALL` doesn't remove redundant rows from multiple result sets. In a single SQL statement, at most 100 `UNION ALL` can be supported.
419 420 421 422 423

### Examples

table `tb1` is created using below SQL statement:

424
```
425 426 427 428 429
CREATE TABLE tb1 (ts TIMESTAMP, col1 INT, col2 FLOAT, col3 BINARY(50));
```

The rows in the past one hour in `tb1` can be selected using below SQL statement:

430
```
431 432 433 434 435
SELECT * FROM tb1 WHERE ts >= NOW - 1h;
```

The rows between 2018-06-01 08:00:00.000 and 2018-06-02 08:00:00.000 and col3 ends with 'nny' can be selected in the descending order of timestamp using below SQL statement:

436
```
437 438 439 440 441
SELECT * FROM tb1 WHERE ts > '2018-06-01 08:00:00.000' AND ts <= '2018-06-02 08:00:00.000' AND col3 LIKE '%nny' ORDER BY ts DESC;
```

The sum of col1 and col2 for rows later than 2018-06-01 08:00:00.000 and whose col2 is bigger than 1.2 can be selected and renamed as "complex", while only 10 rows are output from the 5th row, by below SQL statement:

442
```
443 444 445
SELECT (col1 + col2) AS 'complex' FROM tb1 WHERE ts > '2018-06-01 08:00:00.000' AND col2 > 1.2 LIMIT 10 OFFSET 5;
```

446
The rows in the past 10 minutes and whose col2 is bigger than 3.14 are selected and output to the result file `/home/testoutput.csv` with below SQL statement:
447

448
```
449
SELECT COUNT(*) FROM tb1 WHERE ts >= NOW - 10m AND col2 > 3.14 >> /home/testoutput.csv;
450
```