index.mdx 7.5 KB
Newer Older
D
dingbo 已提交
1
---
G
gccgdb1234 已提交
2
title: Data Model
D
dingbo 已提交
3 4
---

S
Sean Ely 已提交
5
The data model employed by TDengine is similar to a relational database, you need to create databases and tables. Design the data model based on your own application scenarios and you should design the STable (abbreviation for super table) schema to fit your data. This chapter will explain the big picture without getting into syntax details.
D
dingbo 已提交
6

G
gccgdb1234 已提交
7
## Create Database
D
dingbo 已提交
8

S
Sean Ely 已提交
9
The characteristics of data from different data collection points may be different, such as collection frequency, days to keep, number of replicas, data block size, whether it's allowed to update data, etc. For TDengine to operate with the best performance, it's strongly suggested to put the data with different characteristics into different databases because different storage policies can be set for each database. When creating a database, there are a lot of parameters that can be configured, such as the days to keep data, the number of replicas, the number of memory blocks, time precision, the minimum and maximum number of rows in each data block, compress or not, the time range of the data in single data file, etc. Below is an example of the SQL statement for creating a database.
D
dingbo 已提交
10 11 12 13 14

```sql
CREATE DATABASE power KEEP 365 DAYS 10 BLOCKS 6 UPDATE 1;
```

G
gccgdb1234 已提交
15
In the above SQL statement, a database named "power" will be created, the data in it will be kept for 365 days, which means the data older than 365 days will be deleted automatically, a new data file will be created every 10 days, the number of memory blocks is 6, data is allowed to be updated. For more details please refer to [Database](/taos-sql/database).
D
dingbo 已提交
16

S
Sean Ely 已提交
17
After creating a database, the current database in use can be switched using SQL command `USE`, for example below SQL statement switches the current database to `power`. Without the current database specified, table name must be preceded with the corresponding database name.
D
dingbo 已提交
18 19 20 21 22 23 24

```sql
USE power;
```

:::note

G
gccgdb1234 已提交
25
- Any table or STable must belong to a database. To create a table or STable, the database it belongs to must be ready.
S
Sean Ely 已提交
26
- JOIN operations can't be performed on tables from two different databases.
G
gccgdb1234 已提交
27
- Timestamp needs to be specified when inserting rows or querying historical rows.
D
dingbo 已提交
28 29 30

:::

G
gccgdb1234 已提交
31
## Create STable
D
dingbo 已提交
32

S
Sean Ely 已提交
33
In a time-series application, there may be multiple kinds of data collection points. For example, in the electrical power system there are meters, transformers, bus bars, switches, etc. For easy and efficient aggregation of multiple tables, one STable needs to be created for each kind of data collection point. For example, for the meters in [table 1](/tdinternal/arch#model_table1), the below SQL statement can be used to create the super table.
D
dingbo 已提交
34 35

```sql
G
gccgdb1234 已提交
36
CREATE STable meters (ts timestamp, current float, voltage int, phase float) TAGS (location binary(64), groupId int);
D
dingbo 已提交
37 38 39
```

:::note
G
gccgdb1234 已提交
40
If you are using versions prior to 2.0.15, the `STable` keyword needs to be replaced with `TABLE`.
G
gccgdb1234 已提交
41

D
dingbo 已提交
42 43
:::

S
Sean Ely 已提交
44
Similar to creating a regular table, when creating a STable, the name and schema need to be provided. In the STable schema, the first column must be timestamp (like ts in the example), and the other columns (like current, voltage and phase in the example) are the data collected. The column type can be integer, float, double, string ,etc. Besides, the schema for tags need to be provided, like location and groupId in the example. The tag type can be integer, float, string, etc. The static properties of a data collection point can be defined as tags, like the location, device type, device group ID, manager ID, etc. Tags in the schema can be added, removed or updated. Please refer to [STable](/taos-sql/stable) for more details.
D
dingbo 已提交
45

S
Sean Ely 已提交
46
For each kind of data collection point, a corresponding STable must be created. There may be many STables in an application. For electrical power system, we need to create a STable respectively for meters, transformers, busbars, switches. There may be multiple kinds of data collection points on a single device, for example there may be one data collection point for electrical data like current and voltage and another point for environmental data like temperature, humidity and wind direction, multiple STables are required for such kind of device.
D
dingbo 已提交
47

S
Sean Ely 已提交
48
At most 4096 (or 1024 prior to version 2.1.7.0) columns are allowed in a STable. If there are more than 4096 of metrics to be collected for a data collection point, multiple STables are required. There can be multiple databases in a system, while one or more STables can exist in a database.
D
dingbo 已提交
49

G
gccgdb1234 已提交
50
## Create Table
D
dingbo 已提交
51

G
gccgdb1234 已提交
52
A specific table needs to be created for each data collection point. Similar to RDBMS, table name and schema are required to create a table. Beside, one or more tags can be created for each table. To create a table, a STable needs to be used as template and the values need to be specified for the tags. For example, for the meters in [Table 1](/tdinternal/arch#model_table1), the table can be created using below SQL statement.
D
dingbo 已提交
53 54 55 56 57

```sql
CREATE TABLE d1001 USING meters TAGS ("Beijing.Chaoyang", 2);
```

陶建辉(Jeff)'s avatar
陶建辉(Jeff) 已提交
58
In the above SQL statement, "d1001" is the table name, "meters" is the STable name, followed by the value of tag "Location" and the value of tag "groupId", which are "Beijing.Chaoyang" and "2" respectively in the example. The tag values can be updated after the table is created. Please refer to [Tables](/taos-sql/table) for details.
D
dingbo 已提交
59

S
Sean Ely 已提交
60
In TDengine system, it's recommended to create a table for a data collection point via STable. A table created via STable is called subtable in some parts of the TDengine documentation. All SQL commands applied on regular tables can be applied on subtables.
D
dingbo 已提交
61 62

:::warning
G
gccgdb1234 已提交
63
It's not recommended to create a table in a database while using a STable from another database as template.
D
dingbo 已提交
64

G
gccgdb1234 已提交
65
:::tip
G
gccgdb1234 已提交
66
It's suggested to use the global unique ID of a data collection point as the table name, for example the device serial number. If there isn't such a unique ID, multiple IDs that are not global unique can be combined to form a global unique ID. It's not recommended to use a global unique ID as tag value.
D
dingbo 已提交
67

陶建辉(Jeff)'s avatar
陶建辉(Jeff) 已提交
68
## Create Table Automatically
D
dingbo 已提交
69

S
Sean Ely 已提交
70
In some circumstances, it's unknown whether the table already exists when inserting rows. The table can be created automatically using the SQL statement below, and nothing will happen if the table already exist.
D
dingbo 已提交
71 72 73 74 75

```sql
INSERT INTO d1001 USING meters TAGS ("Beijng.Chaoyang", 2) VALUES (now, 10.2, 219, 0.32);
```

G
gccgdb1234 已提交
76
In the above SQL statement, a row with value `(now, 10.2, 219, 0.32)` will be inserted into table "d1001". If table "d1001" doesn't exist, it will be created automatically using STable "meters" as template with tag value `"Beijing.Chaoyang", 2`.
D
dingbo 已提交
77

G
gccgdb1234 已提交
78
For more details please refer to [Create Table Automatically](/taos-sql/insert#automatically-create-table-when-inserting).
D
dingbo 已提交
79

G
gccgdb1234 已提交
80
## Single Column vs Multiple Column
D
dingbo 已提交
81

S
Sean Ely 已提交
82
A multiple columns data model is supported in TDengine. As long as multiple metrics are collected by the same data collection point at the same time, i.e. the timestamp are identical, these metrics can be put in a single STable as columns. However, there is another kind of design, i.e. single column data model, a table is created for each metric, which means a STable is required for each kind of metric. For example, 3 STables are required for current, voltage and phase.
D
dingbo 已提交
83

S
Sean Ely 已提交
84
It's recommended to use a multiple column data model as much as possible because it's better in the performance of inserting or querying rows. In some cases, however, the metrics to be collected vary frequently and correspondingly the STable schema needs to be changed frequently too. In such case, it's more convenient to use single column data model.