diff --git a/docs/en/operations/settings/query_complexity.md b/docs/en/operations/settings/query_complexity.md
index a52644616d3efda4f3d5a030e791eefdfe40ea94..2deddc9a80c6d571b9de3f4e6b722349b60d288c 100644
--- a/docs/en/operations/settings/query_complexity.md
+++ b/docs/en/operations/settings/query_complexity.md
@@ -194,4 +194,55 @@ Maximum number of bytes (uncompressed data) that can be passed to a remote serve
 
 What to do when the amount of data exceeds one of the limits: 'throw' or 'break'. By default, throw.
 
+## max_rows_in_join {#settings-max_rows_in_join}
+
+Limits the number of rows in the hash table that is used when joining tables.
+
+This setting applies to [SELECT ... JOIN](../../query_language/select.md#select-join) operations and to the [Join](../table_engines/join.md) table engine.
+
+ClickHouse can take different actions when the limit is reached. Use the [join_overflow_mode](#settings-join_overflow_mode) setting to choose the action.
+
+Possible values:
+
+- Positive integers.
+- 0. Memory control is disabled.
+
+Default value: 0.
+
+## max_bytes_in_join {#settings-max_bytes_in_join}
+
+Limits the size, in bytes, of the hash table that is used when joining tables.
+
+This setting applies to [SELECT ... JOIN](../../query_language/select.md#select-join) operations and to the [Join](../table_engines/join.md) table engine.
+
+ClickHouse can take different actions when the limit is reached. Use the [join_overflow_mode](#settings-join_overflow_mode) setting to choose the action.
+
+Possible values:
+
+- Positive integers.
+- 0. Memory control is disabled.
+
+Default value: 0.
+
+## join_overflow_mode {#settings-join_overflow_mode}
+
+Defines the action that ClickHouse performs when one of the join limits listed below is reached.
+
+Controlled limits:
+
+- [max_bytes_in_join](#settings-max_bytes_in_join)
+- [max_rows_in_join](#settings-max_rows_in_join)
+
+Possible values:
+
+- `THROW` — ClickHouse throws an exception and stops the operation.
+- `BREAK` — ClickHouse stops the operation without throwing an exception.
+
+Default value: `THROW`.
+
+**See Also**
+
+- [SELECT ... JOIN](../../query_language/select.md#select-join)
+- [Join engine](../table_engines/join.md)
+
 [Original article](https://clickhouse.yandex/docs/en/operations/settings/query_complexity/)
diff --git a/docs/en/query_language/select.md b/docs/en/query_language/select.md
index 418b726d8c17958d31cc6cf9c7426580b0a3d422..ec100bbdc20134f931064aed64f4102d30e35c65 100644
--- a/docs/en/query_language/select.md
+++ b/docs/en/query_language/select.md
@@ -428,7 +428,6 @@ Joins the data in the normal [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)
 !!! info "Note"
     Not related to [ARRAY JOIN](#select-array-join-clause).
 
-
 ``` sql
 SELECT
 FROM
@@ -465,8 +464,6 @@ Be careful when using `GLOBAL`. For more information, see the section [Distribut
 
 **Usage Recommendations**
 
-All columns that are not needed for the `JOIN` are deleted from the subquery.
-
 When running a `JOIN`, there is no optimization of the order of execution in relation to other stages of the query. The join (a search in the right table) is run before filtering in `WHERE` and before aggregation. In order to explicitly set the processing order, we recommend running a `JOIN` subquery with a subquery.
 
 Example:
@@ -524,6 +521,15 @@ Among the various types of `JOIN`, the most efficient is `ANY LEFT JOIN`, then `
 
 If you need a `JOIN` for joining with dimension tables (these are relatively small tables that contain dimension properties, such as names for advertising campaigns), a `JOIN` might not be very convenient due to the bulky syntax and the fact that the right table is re-accessed for every query. For such cases, there is an "external dictionaries" feature that you should use instead of `JOIN`. For more information, see the section [External dictionaries](dicts/external_dicts.md).
 
+**Memory limitations**
+
+ClickHouse uses the [hash join](https://en.wikipedia.org/wiki/Hash_join) algorithm. ClickHouse takes the `` and creates a hash table for it in RAM. If you need to restrict the memory consumption of a join operation, use the following settings:
+
+- [max_rows_in_join](../operations/settings/query_complexity.md#settings-max_rows_in_join) — Limits the number of rows in the hash table.
+- [max_bytes_in_join](../operations/settings/query_complexity.md#settings-max_bytes_in_join) — Limits the size of the hash table.
+
+When one of these limits is reached, ClickHouse acts as the [join_overflow_mode](../operations/settings/query_complexity.md#settings-join_overflow_mode) setting instructs.
+
 #### Processing of Empty or NULL Cells
 
 While joining tables, the empty cells may appear. The setting [join_use_nulls](../operations/settings/settings.md#settings-join_use_nulls) define how ClickHouse fills these cells.
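
As a rough illustration of the join memory limits documented in this patch, the session sketch below caps the join hash table at 1000 rows and then runs a join whose right-hand side is far larger. The `numbers` table function, the aliases `l`, `r` and `k`, and the specific limit values are illustrative assumptions and are not part of the patch. The lowercase mode value follows the `'throw'` / `'break'` spelling used for the other overflow settings in the same file.

``` sql
-- Cap the hash table built for the right-hand side of a JOIN (illustrative value).
SET max_rows_in_join = 1000;

-- Ask ClickHouse to raise an exception when the cap is hit.
SET join_overflow_mode = 'throw';

-- The right-hand subquery produces 1,000,000 rows, well above the cap,
-- so building the hash table should hit the limit.
SELECT count()
FROM (SELECT number AS k FROM numbers(10)) AS l
ANY LEFT JOIN (SELECT number AS k FROM numbers(1000000)) AS r
USING (k);
```

With `join_overflow_mode = 'throw'`, the query should fail with an exception instead of returning a count. Setting the mode to `'break'` instead should, per the description in the patch, stop the operation without raising an exception. A byte-based cap can be sketched the same way by setting `max_bytes_in_join` in place of `max_rows_in_join`.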