---
toc_priority: 33
toc_title: ReplacingMergeTree
---

# ReplacingMergeTree {#replacingmergetree}

The engine differs from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree) in that it removes duplicate entries with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md) value (`ORDER BY` table section, not `PRIMARY KEY`).

Data deduplication occurs only during a merge. Merges run in the background at an unknown time, so you can’t plan for them, and some of the data may remain unprocessed. Although you can run an unscheduled merge using the `OPTIMIZE` query, don’t rely on it, because the `OPTIMIZE` query reads and writes a large amount of data.

Thus, `ReplacingMergeTree` is suitable for clearing out duplicate data in the background in order to save space, but it doesn’t guarantee the absence of duplicates.
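
The following is a minimal sketch of how duplicates can stay visible until a merge happens. The table name `visits_dedup` and its columns are hypothetical and exist only for illustration. The sketch assumes a non-replicated table with default settings, so that two identical inserts produce two separate parts.

``` sql
-- Hypothetical table, used only to illustrate the behavior described above.
CREATE TABLE visits_dedup
(
    user_id UInt64,
    url String
) ENGINE = ReplacingMergeTree
ORDER BY (user_id, url);

-- Two separate inserts create two parts holding the same sorting key value.
INSERT INTO visits_dedup VALUES (1, 'https://example.com/');
INSERT INTO visits_dedup VALUES (1, 'https://example.com/');

-- Until a background merge combines the parts, both rows can be returned.
SELECT count() FROM visits_dedup;

-- Requests an unscheduled merge. This rewrites the data, so it is costly on large tables.
OPTIMIZE TABLE visits_dedup FINAL;

-- After the merge, one row per sorting key value is expected to remain.
SELECT count() FROM visits_dedup;
```

Because the timing of background merges is not under your control, queries that need exact results should not assume that the first `SELECT` and the second `SELECT` return the same count.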

## Creating a Table {#creating-a-table}

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = ReplacingMergeTree([ver])
[PARTITION BY expr]
[ORDER BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...]
```

For a description of the query parameters, see the [statement description](../../../sql-reference/statements/create/table.md).

!!! note "Attention"
    Uniqueness of rows is determined by the `ORDER BY` table section, not `PRIMARY KEY`.
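
As an illustration of the note above, here is a sketch of a table whose `PRIMARY KEY` is a prefix of its `ORDER BY` expression. The table name `events_latest` and its columns are hypothetical.

``` sql
-- Hypothetical table: the primary key (tenant_id) is shorter than the sorting key.
CREATE TABLE events_latest
(
    tenant_id UInt32,
    event_id UInt64,
    payload String
) ENGINE = ReplacingMergeTree
ORDER BY (tenant_id, event_id)
PRIMARY KEY tenant_id;

-- Same primary key, different sorting key values: both rows are kept after a merge.
INSERT INTO events_latest VALUES (1, 10, 'a'), (1, 11, 'b');

-- Same sorting key value as an existing row (1, 10): only one of the two
-- rows with (tenant_id, event_id) = (1, 10) is expected to survive a merge.
INSERT INTO events_latest VALUES (1, 10, 'c');
```

Rows are compared by the full `(tenant_id, event_id)` tuple, so sharing only `tenant_id` does not make two rows duplicates.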

**ReplacingMergeTree Parameters**

-   `ver` — column with version. Type `UInt*`, `Date` or `DateTime`. Optional parameter.

    When merging, `ReplacingMergeTree` keeps only one row out of all the rows with the same sorting key:

    -   The last in the selection, if `ver` is not set. A selection is a set of rows in a set of parts participating in the merge. The most recently created part (the last insert) will be the last one in the selection. Thus, after deduplication, the very last row from the most recent insert will remain for each unique sorting key.
    -   With the maximum version, if `ver` is specified.
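
The two rules above can be sketched with a versioned table. The table name `user_profiles` and its columns are hypothetical; `updated_at` plays the role of `ver`.

``` sql
-- Hypothetical table: updated_at (DateTime) is passed as the version column.
CREATE TABLE user_profiles
(
    user_id UInt64,
    name String,
    updated_at DateTime
) ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- The row with the larger version is deliberately inserted first.
INSERT INTO user_profiles VALUES (42, 'new name', '2021-01-02 00:00:00');
INSERT INTO user_profiles VALUES (42, 'old name', '2021-01-01 00:00:00');

-- Request a merge so that the replacement rule is applied.
OPTIMIZE TABLE user_profiles FINAL;

-- The surviving row for user_id = 42 is expected to be the one with the
-- maximum updated_at ('new name'), regardless of insert order.
SELECT * FROM user_profiles;
```

If the engine were declared as `ReplacingMergeTree` without `ver`, the row from the most recent insert ('old name' in this sketch) would be expected to remain instead.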

**Query clauses**

When creating a `ReplacingMergeTree` table, the same [clauses](../../../engines/table-engines/mergetree-family/mergetree.md) are required as when creating a `MergeTree` table.

<details markdown="1">

<summary>Deprecated Method for Creating a Table</summary>

!!! attention "Attention"
    Do not use this method in new projects and, if possible, switch old projects to the method described above.

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE [=] ReplacingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity, [ver])
```

All of the parameters except `ver` have the same meaning as in `MergeTree`.

-   `ver` - column with the version. Optional parameter. For a description, see the text above.

</details>

[Original article](https://clickhouse.tech/docs/en/operations/table_engines/replacingmergetree/) <!--hide-->