02-database.md 11.3 KB
Newer Older
1 2
---
title: Database
D
danielclow 已提交
3 4
sidebar_label: Database
description: This document describes how to create and perform operations on databases.
5 6
---

7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
## Create a Database

```sql
CREATE DATABASE [IF NOT EXISTS] db_name [database_options]
 
database_options:
    database_option ...
 
database_option: {
    BUFFER value
  | CACHEMODEL {'none' | 'last_row' | 'last_value' | 'both'}
  | CACHESIZE value
  | COMP {0 | 1 | 2}
  | DURATION value
  | WAL_FSYNC_PERIOD value
  | MAXROWS value
  | MINROWS value
  | KEEP value
  | PAGES value
  | PAGESIZE  value
  | PRECISION {'ms' | 'us' | 'ns'}
  | REPLICA value
  | RETENTIONS ingestion_duration:keep_duration ...
  | WAL_LEVEL {1 | 2}
  | VGROUPS value
  | SINGLE_STABLE {0 | 1}
33
  | STT_TRIGGER value
34 35
  | TABLE_PREFIX value
  | TABLE_SUFFIX value
36
  | TSDB_PAGESIZE value
37 38
  | WAL_RETENTION_PERIOD value
  | WAL_RETENTION_SIZE value
39
  | WAL_ROLL_PERIOD value
40 41 42 43 44 45
  | WAL_SEGMENT_SIZE value
}
```

## Parameters

46
- BUFFER: specifies the size (in MB) of the write buffer for each vnode. Enter a value between 3 and 16384. The default value is 256.
47 48 49 50 51 52 53 54 55 56 57 58 59 60
- CACHEMODEL: specifies how the latest data in subtables is stored in the cache. The default value is none.
  - none: The latest data is not cached.
  - last_row: The last row of each subtable is cached. This option significantly improves the performance of the LAST_ROW function.
  - last_value: The last non-null value of each column in each subtable is cached. This option significantly improves the performance of the LAST function under normal circumstances, such as statements including the WHERE, ORDER BY, GROUP BY, and INTERVAL keywords.
  - both: The last row of each subtable and the last non-null value of each column in each subtable are cached.
- CACHESIZE: specifies the amount (in MB) of memory used for subtable caching on each vnode. Enter a value between 1 and 65536. The default value is 1.
- COMP: specifies how databases are compressed. The default value is 2.
  - 0: Compression is disabled.
  - 1: One-pass compression is enabled.
  - 2: Two-pass compression is enabled.
- DURATION: specifies the time period contained in each data file. After the time specified by this parameter has elapsed, TDengine creates a new data file to store incoming data. You can use m (minutes), h (hours), and d (days) as the unit, for example DURATION 100h or DURATION 10d. If you do not include a unit, d is used by default.
- WAL_FSYNC_PERIOD: specifies the interval (in milliseconds) at which data is written from the WAL to disk. This parameter takes effect only when the WAL parameter is set to 2. The default value is 3000. Enter a value between 0 and 180000. The value 0 indicates that incoming data is immediately written to disk.
- MAXROWS: specifies the maximum number of rows recorded in a block. The default value is 4096.
- MINROWS: specifies the minimum number of rows recorded in a block. The default value is 100.
C
Cary Xu 已提交
61
- KEEP: specifies the time for which data is retained. Enter a value between 1 and 365000. The default value is 3650. The value of the KEEP parameter must be greater than or equal to the value of the DURATION parameter. TDengine automatically deletes data that is older than the value of the KEEP parameter. You can use m (minutes), h (hours), and d (days) as the unit, for example KEEP 100h or KEEP 10d. If you do not include a unit, d is used by default. The Enterprise Edition supports [Tiered Storage](https://docs.tdengine.com/tdinternal/arch/#tiered-storage) function, thus multiple KEEP values (comma separated and up to 3 values supported, and meet keep 0 <= keep 1 <= keep 2, e.g. KEEP 100h,100d,3650d) are supported; the Community Edition does not support Tiered Storage function (although multiple keep values are configured, they do not take effect, only the maximum keep value is used as KEEP).
62 63 64 65 66 67 68 69 70 71 72 73
- PAGES: specifies the number of pages in the metadata storage engine cache on each vnode. Enter a value greater than or equal to 64. The default value is 256. The space occupied by metadata storage on each vnode is equal to the product of the values of the PAGESIZE and PAGES parameters. The space occupied by default is 1 MB.
- PAGESIZE: specifies the size (in KB) of each page in the metadata storage engine cache on each vnode. The default value is 4. Enter a value between 1 and 16384.
- PRECISION: specifies the precision at which a database records timestamps. Enter ms for milliseconds, us for microseconds, or ns for nanoseconds. The default value is ms.
- REPLICA: specifies the number of replicas that are made of the database. Enter 1 or 3. The default value is 1. The value of the REPLICA parameter cannot exceed the number of dnodes in the cluster.
- RETENTIONS: specifies the retention period for data aggregated at various intervals. For example, RETENTIONS 15s:7d,1m:21d,15m:50d indicates that data aggregated every 15 seconds is retained for 7 days, data aggregated every 1 minute is retained for 21 days, and data aggregated every 15 minutes is retained for 50 days. You must enter three aggregation intervals and corresponding retention periods.
- WAL_LEVEL: specifies whether fsync is enabled. The default value is 1.
  - 1: WAL is enabled but fsync is disabled.
  - 2: WAL and fsync are both enabled.
- VGROUPS: specifies the initial number of vgroups when a database is created.
- SINGLE_STABLE: specifies whether the database can contain more than one supertable.
  - 0: The database can contain multiple supertables.
  - 1: The database can contain only one supertable.
74
- STT_TRIGGER: specifies the number of file merges triggered by flushed files. The default is 8, ranging from 1 to 16. For high-frequency scenarios with few tables, it is recommended to use the default configuration or a smaller value for this parameter; For multi-table low-frequency scenarios, it is recommended to configure this parameter with a larger value.
75 76
- TABLE_PREFIX: The prefix in the table name that is ignored when distributing a table to a vgroup when it's a positive number, or only the prefix is used when distributing a table to a vgroup, the default value is 0; For example, if the table name v30001, then "0001" is used if TSDB_PREFIX is set to 2 but "v3" is used if TSDB_PREFIX is set to -2; It can help you to control the distribution of tables.
- TABLE_SUFFIX: The suffix in the table name that is ignored when distributing a table to a vgroup when it's a positive number, or only the suffix is used when distributing a table to a vgroup, the default value is 0; For example, if the table name v30001, then "v300" is used if TSDB_SUFFIX is set to 2 but "01" is used if TSDB_SUFFIX is set to -2; It can help you to control the distribution of tables. 
77
- TSDB_PAGESIZE: The page size of the data storage engine in a vnode. The unit is KB. The default is 4 KB. The range is 1 to 16384, that is, 1 KB to 16 MB.
78 79
- WAL_RETENTION_PERIOD: specifies the maximum time of which WAL files are to be kept for consumption. This parameter is used for data subscription. Enter a time in seconds. The default value 0. A value of 0 indicates that WAL files are not required to keep for consumption. Alter it with a proper value at first to create topics.
- WAL_RETENTION_SIZE: specifies the maximum total size of which WAL files are to be kept for consumption. This parameter is used for data subscription. Enter a size in KB. The default value is 0. A value of 0 indicates that the total size of WAL files to keep for consumption has no upper limit.
80 81
- WAL_ROLL_PERIOD: specifies the time after which WAL files are rotated. After this period elapses, a new WAL file is created. The default value is 0. A value of 0 indicates that a new WAL file is created only after TSDB data in memory are flushed to disk.
- WAL_SEGMENT_SIZE: specifies the maximum size of a WAL file. After the current WAL file reaches this size, a new WAL file is created. The default value is 0. A value of 0 indicates that a new WAL file is created only after TSDB data in memory are flushed to disk.
82 83 84 85 86 87 88 89 90 91
### Example Statement

```sql
create database if not exists db vgroups 10 buffer 10

```

The preceding SQL statement creates a database named db that has 10 vgroups and whose vnodes have a 10 MB cache.

### Specify the Database in Use
92 93

```
94
USE db_name;
95 96
```

97
The preceding SQL statement switches to the specified database. (If you connect to TDengine over the REST API, this statement does not take effect.)
98

99
## Drop a Database
100 101

```
102
DROP DATABASE [IF EXISTS] db_name
103 104
```

105
The preceding SQL statement deletes the specified database. This statement will delete all tables in the database and destroy all vgroups associated with it. Exercise caution when using this statement.
106

107
## Change Database Configuration
108

109 110
```sql
ALTER DATABASE db_name [alter_database_options]
111

112 113
alter_database_options:
    alter_database_option ...
114

115 116 117
alter_database_option: {
    CACHEMODEL {'none' | 'last_row' | 'last_value' | 'both'}
  | CACHESIZE value
118 119 120 121
  | BUFFER value
  | PAGES value
  | REPLICA value
  | STT_TRIGGER value
122 123 124
  | WAL_LEVEL value
  | WAL_FSYNC_PERIOD value
  | KEEP value
125 126
  | WAL_RETENTION_PERIOD value
  | WAL_RETENTION_SIZE value
127
}
128 129
```

G
gccgdb1234 已提交
130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
###  ALTER CACHESIZE

The command of changing database configuration parameters is easy to use, but it's hard to determine whether a parameter is proper or not. In this section we will describe how to determine whether cachesize is big enough.

1. How to check cachesize?

You can use  `select * from information_schema.ins_databases;` to get the value of cachesize.

2. How to check cacheload?

You can use `show <db_name>.vgroups;` to check the value of cacheload.

3. Determine whether cachesize is big engough

If the value of `cacheload` is very close to the value of `cachesize`, then it's very probably that `cachesize` is too small. If the value of `cacheload` is much smaller than the value of `cachesize`, then `cachesize` is big enough. You can use this simple principle to determine. Depending on how much memory is available in your system, you can choose to double `cachesize` or incrase it by even 5 or more times.

146
:::note
147
Other parameters cannot be modified after the database has been created.
148 149 150

:::

151
## View Databases
152

153
### View All Databases
154 155

```
156
SHOW DATABASES;
157 158
```

159
### View the CREATE Statement for a Database
160 161

```
162
SHOW CREATE DATABASE db_name;
163 164
```

165
The preceding SQL statement can be used in migration scenarios. This command can be used to get the CREATE statement, which can be used in another TDengine instance to create the exact same database.
166

167
### View Database Configuration
168

169
```sql
W
wade zhang 已提交
170
SELECT * FROM INFORMATION_SCHEMA.INS_DATABASES WHERE NAME='DBNAME' \G;
171 172
```

173
The preceding SQL statement shows the value of each parameter for the specified database. One value is displayed per line.
174

175
## Delete Expired Data
176

177 178
```sql
TRIM DATABASE db_name;
179 180
```

181
The preceding SQL statement deletes data that has expired and orders the remaining data in accordance with the storage configuration.
182

X
Xiaoyu Wang 已提交
183 184 185 186 187 188 189 190
## Flush Data

```sql
FLUSH DATABASE db_name;
```

Flush data from memory onto disk. Before shutting down a node, executing this command can avoid data restore after restarting and speed up the startup process.

191 192 193 194 195 196 197 198 199 200 201 202 203 204 205
## Redistribute Vgroup

```sql
REDISTRIBUTE VGROUP vgroup_no DNODE dnode_id1 [DNODE dnode_id2] [DNODE dnode_id3]
```

Adjust the distribution of vnodes in the vgroup according to the given list of dnodes. 

## Balance Vgroup

```sql
BALANCE VGROUP
```

Automatically adjusts the distribution of vnodes in all vgroups of the cluster, which is equivalent to load balancing the data of the cluster at the vnode level.