DOCAPI-7430: EN review, RU translation. MergeTree INDEX bloom filter docs. (#7025)

* Update mergetree.md (#38)
* DOCAPI-7430: RU translation.

6db4cb81 · BayoNet · alexey-milovidov · 5b38a7f4 · 6db4cb81 · 6db4cb81
Showing 2 changed files with 23 additions and 13 deletions

docs/en/operations/table_engines/mergetree.md +7 -7

docs/ru/operations/table_engines/mergetree.md +16 -6

--- a/docs/en/operations/table_engines/mergetree.md
+++ b/docs/en/operations/table_engines/mergetree.md
@@ -47,7 +47,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]

 For a description of parameters, see the [CREATE query description](../../query_language/create.md).

-!!! note "Note"
+!!!note "Note"
    `INDEX` is an experimental feature, see [Data Skipping Indexes](#table_engine-mergetree-data_skipping-indexes).

 ### Query Clauses
@@ -288,24 +288,24 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234

 - `ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`

-    Stores a [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that contains all ngrams from a block of data. Works only with strings. Can be used for optimization of `equals`, `like` and `in` expressions.
+    Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that contains all ngrams from a block of data. Works only with strings. Can be used for optimization of `equals`, `like` and `in` expressions.

    - `n` — ngram size,
    - `size_of_bloom_filter_in_bytes` — Bloom filter size in bytes (you can use large values here, for example, 256 or 512, because it can be compressed well).
-    - `number_of_hash_functions` — The number of hash functions used in the bloom filter.
-    - `random_seed` — The seed for bloom filter hash functions.
+    - `number_of_hash_functions` — The number of hash functions used in the Bloom filter.
+    - `random_seed` — The seed for Bloom filter hash functions.

 - `tokenbf_v1(size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`

    The same as `ngrambf_v1`, but stores tokens instead of ngrams. Tokens are sequences separated by non-alphanumeric characters.

- `bloom_filter([false_positive])` — Stores [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) for the specified columns.
+- `bloom_filter([false_positive])` — Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) for the specified columns.

-    The `false_positive` optional parameter is the probability of false positive response from the filter. Possible values: (0, 1). Default value: 0.025.
+    The optional `false_positive` parameter is the probability of receiving a false positive response from the filter. Possible values: (0, 1). Default value: 0.025.

    Supported data types: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`.

-    Supported for the following functions: [equals](../../query_language/functions/comparison_functions.md), [notEquals](../../query_language/functions/comparison_functions.md), [in](../../query_language/functions/in_functions.md), [notIn](../../query_language/functions/in_functions.md).
+    The following functions can use it: [equals](../../query_language/functions/comparison_functions.md), [notEquals](../../query_language/functions/comparison_functions.md), [in](../../query_language/functions/in_functions.md), [notIn](../../query_language/functions/in_functions.md).

 ```sql
 INDEX sample_index (u64 * length(s)) TYPE minmax GRANULARITY 4

--- a/docs/ru/operations/table_engines/mergetree.md
+++ b/docs/ru/operations/table_engines/mergetree.md
@@ -44,7 +44,10 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
 [SETTINGS name=value, ...]
 ```

-Описание параметров запроса смотрите в [описании запроса](../../query_language/create.md).
+Описание параметров смотрите в [описании запроса CREATE](../../query_language/create.md).
+
+!!!note "Note"
+    `INDEX` — экспериментальная возможность, смотрите [Индексы пропуска данных](#table_engine-mergetree-data_skipping-indexes).

 ### Секции запроса

@@ -244,7 +247,7 @@ ClickHouse не может использовать индекс, если зн

 ClickHouse использует эту логику не только для последовательностей дней месяца, но и для любого частично-монотонного первичного ключа.

-### Дополнительные индексы (Экспериментальная функциональность)
+### Индексы пропуска данных (экспериментальная функциональность) {#table_engine-mergetree-data_skipping-indexes}

 Для использования требуется установить настройку `allow_experimental_data_skipping_indices` в 1. (запустить `SET allow_experimental_data_skipping_indices = 1`).

@@ -282,11 +285,18 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234

 #### Доступные индексы

-* `minmax`
-Хранит минимум и максимум выражения (если выражение - `tuple`, то для каждого элемента `tuple`), используя их для пропуска блоков аналогично первичному ключу.
+- `minmax` — Хранит минимум и максимум выражения (если выражение - `tuple`, то для каждого элемента `tuple`), используя их для пропуска блоков аналогично первичному ключу.
+
+- `set(max_rows)` — Хранит уникальные значения выражения на блоке в количестве не более `max_rows` (если `max_rows = 0`, то ограничений нет), используя их для пропуска блоков, оценивая выполнимость `WHERE` выражения на хранимых данных.
+
+- `bloom_filter([false_positive])` — [фильтр Блума](https://en.wikipedia.org/wiki/Bloom_filter) для указанных стоблцов.
+
+    Необязательный параметр `false_positive` — это вероятность получения ложноположительного срабатывания. Возможные значения: (0, 1). Значение по умолчанию: 0.025.
+
+    Поддержанные типы данных: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`.
+
+    Фильтром могут пользоваться функции: [equals](../../query_language/functions/comparison_functions.md), [notEquals](../../query_language/functions/comparison_functions.md), [in](../../query_language/functions/in_functions.md), [notIn](../../query_language/functions/in_functions.md).

-* `set(max_rows)`
-Хранит уникальные значения выражения на блоке в количестве не более `max_rows` (если `max_rows = 0`, то ограничений нет), используя их для пропуска блоков, оценивая выполнимость `WHERE` выражения на хранимых данных.

 **Примеры**
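
Reviewer note: the `bloom_filter([false_positive])` entry documented in this change might be exercised roughly as follows. This is a minimal sketch, not taken from the patched docs: the table name `example_events`, the index name `url_bf`, the columns, and the `0.01` false-positive rate are all hypothetical. It assumes the experimental setting mentioned in the diff has to be enabled first.

```sql
-- Hypothetical names and values, for illustration only.
-- Data skipping indexes are experimental per the docs in this diff.
SET allow_experimental_data_skipping_indices = 1;

CREATE TABLE example_events
(
    event_date Date,
    user_id UInt64,
    url String,
    -- false_positive must lie in (0, 1); the documented default is 0.025.
    INDEX url_bf url TYPE bloom_filter(0.01) GRANULARITY 4
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);

-- equals and in are among the functions the docs list as able to use the index.
SELECT count() FROM example_events WHERE url = 'https://example.com/';
SELECT count() FROM example_events WHERE url IN ('https://example.com/a', 'https://example.com/b');
```

A lower `false_positive` value should let the index skip more blocks at the cost of a larger filter. Whether a particular query actually benefits depends on the data distribution, so treat this as a starting point for experimentation.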