diff --git a/docs/en/engines/table-engines/integrations/embedded-rocksdb.md b/docs/en/engines/table-engines/integrations/embedded-rocksdb.md index 6e864751cc3389e458e9787eabc5a4b741239383..e9e069933e527adc98494b768d9ccf08a41365d8 100644 --- a/docs/en/engines/table-engines/integrations/embedded-rocksdb.md +++ b/docs/en/engines/table-engines/integrations/embedded-rocksdb.md @@ -39,4 +39,4 @@ ENGINE = EmbeddedRocksDB PRIMARY KEY key ``` -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/embedded-rocksdb/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/embedded-rocksdb/) diff --git a/docs/en/engines/table-engines/integrations/hdfs.md b/docs/en/engines/table-engines/integrations/hdfs.md index 5c36e3f1c21f4d50d6b8db1c203fda45b0985a46..0782efe8e72be7173a76328fc85512f2be15d9a0 100644 --- a/docs/en/engines/table-engines/integrations/hdfs.md +++ b/docs/en/engines/table-engines/integrations/hdfs.md @@ -5,7 +5,7 @@ toc_title: HDFS # HDFS {#table_engines-hdfs} -This engine provides integration with [Apache Hadoop](https://en.wikipedia.org/wiki/Apache_Hadoop) ecosystem by allowing to manage data on [HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)via ClickHouse. This engine is similar +This engine provides integration with the [Apache Hadoop](https://en.wikipedia.org/wiki/Apache_Hadoop) ecosystem by allowing you to manage data on [HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) via ClickHouse. This engine is similar to the [File](../../../engines/table-engines/special/file.md#table_engines-file) and [URL](../../../engines/table-engines/special/url.md#table_engines-url) engines, but provides Hadoop-specific features. ## Usage {#usage} @@ -174,7 +174,7 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us | dfs\_domain\_socket\_path | "" | -[HDFS Configuration Reference ](https://hawq.apache.org/docs/userguide/2.3.0.0-incubating/reference/HDFSConfigurationParameterReference.html) might explain some parameters. +[HDFS Configuration Reference](https://hawq.apache.org/docs/userguide/2.3.0.0-incubating/reference/HDFSConfigurationParameterReference.html) might explain some parameters.
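For concreteness, a minimal sketch of what such a configuration could look like in the server config file; it mirrors the Kerberos example added to the Russian translation of this page later in this patch, so the keytab path and principals are placeholder values, and any libhdfs3 parameter from the table above can be set the same way:

``` xml
<!-- Global settings, applied to every table using the HDFS engine -->
<hdfs>
    <hadoop_kerberos_keytab>/tmp/keytab/clickhouse.keytab</hadoop_kerberos_keytab>
    <hadoop_kerberos_principal>clickuser@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal>
    <hadoop_security_authentication>kerberos</hadoop_security_authentication>
</hdfs>

<!-- User-level settings, applied on top of the global ones for user "root" -->
<hdfs_root>
    <hadoop_kerberos_principal>root@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal>
</hdfs_root>
```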
#### ClickHouse extras {#clickhouse-extras} @@ -185,7 +185,6 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us |hadoop\_kerberos\_kinit\_command | kinit | #### Limitations {#limitations} - * hadoop\_security\_kerberos\_ticket\_cache\_path can be global only, not user specific ## Kerberos support {#kerberos-support} @@ -207,4 +206,4 @@ If hadoop\_kerberos\_keytab, hadoop\_kerberos\_principal or hadoop\_kerberos\_ki - [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns) -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/hdfs/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/hdfs/) diff --git a/docs/en/engines/table-engines/integrations/index.md b/docs/en/engines/table-engines/integrations/index.md index 288c9c3cd56a2e2b673e7256ab3ab5355a71254f..28f38375448b36f43658c9aff30a39b712ad1a8a 100644 --- a/docs/en/engines/table-engines/integrations/index.md +++ b/docs/en/engines/table-engines/integrations/index.md @@ -18,3 +18,6 @@ List of supported integrations: - [Kafka](../../../engines/table-engines/integrations/kafka.md) - [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md) - [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md) +- [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md) + +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/) diff --git a/docs/en/engines/table-engines/integrations/jdbc.md b/docs/en/engines/table-engines/integrations/jdbc.md index 2144be9f1e325bd372dcaeadc10f3572180f64f2..edbc5d3ed3eefc8eb0774233f847273fd527715e 100644 --- a/docs/en/engines/table-engines/integrations/jdbc.md +++ b/docs/en/engines/table-engines/integrations/jdbc.md @@ -85,4 +85,4 @@ FROM jdbc_table - [JDBC table function](../../../sql-reference/table-functions/jdbc.md). 
-[Original article](https://clickhouse.tech/docs/en/operations/table_engines/jdbc/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/jdbc/) diff --git a/docs/en/engines/table-engines/integrations/kafka.md b/docs/en/engines/table-engines/integrations/kafka.md index fb1df62bb1511b4ef67cac486dd1649e0f743a71..1b3aaa4b5695e900d4cb4247681c93fc2be1e943 100644 --- a/docs/en/engines/table-engines/integrations/kafka.md +++ b/docs/en/engines/table-engines/integrations/kafka.md @@ -194,4 +194,4 @@ Example: - [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns) - [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size) -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/kafka/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/kafka/) diff --git a/docs/en/engines/table-engines/integrations/mongodb.md b/docs/en/engines/table-engines/integrations/mongodb.md index e648a13b5e0ba386c0b8f17b4f3d9abf3e2047e9..2fee27ce80db816f576a837178b92930af036042 100644 --- a/docs/en/engines/table-engines/integrations/mongodb.md +++ b/docs/en/engines/table-engines/integrations/mongodb.md @@ -54,4 +54,4 @@ SELECT COUNT() FROM mongo_table; └─────────┘ ``` -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/integrations/mongodb/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/mongodb/) diff --git a/docs/en/engines/table-engines/integrations/mysql.md b/docs/en/engines/table-engines/integrations/mysql.md index 2ea8ea959582048d7b78ccfc8362ef8dcf17d0ba..8b7caa12c914eb93a4a045193d133026c699c843 100644 --- a/docs/en/engines/table-engines/integrations/mysql.md +++ b/docs/en/engines/table-engines/integrations/mysql.md @@ -101,4 +101,4 @@ SELECT * FROM mysql_table - [The ‘mysql’ table function](../../../sql-reference/table-functions/mysql.md) - [Using MySQL as a source of external dictionary](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-mysql) -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/mysql/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/mysql/) diff --git a/docs/en/engines/table-engines/integrations/odbc.md b/docs/en/engines/table-engines/integrations/odbc.md index 8083d644deb4b79a530426989c9f02171b5e622a..99efd8700886b6314623c7821c1940992c958a7f 100644 --- a/docs/en/engines/table-engines/integrations/odbc.md +++ b/docs/en/engines/table-engines/integrations/odbc.md @@ -128,4 +128,4 @@ SELECT * FROM odbc_t - [ODBC external dictionaries](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-odbc) - [ODBC table function](../../../sql-reference/table-functions/odbc.md) -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/odbc/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/odbc/) diff --git a/docs/en/engines/table-engines/integrations/postgresql.md b/docs/en/engines/table-engines/integrations/postgresql.md index 7272f2e5edf8a5241324ffc4301a230da86a77d4..1a2ccf3e0dc9c7d562067b29498650ff528d37c2 100644 --- a/docs/en/engines/table-engines/integrations/postgresql.md +++ b/docs/en/engines/table-engines/integrations/postgresql.md @@ -102,3 +102,5 @@ SELECT * FROM postgresql_table WHERE str IN ('test') - [The 
‘postgresql’ table function](../../../sql-reference/table-functions/postgresql.md) - [Using PostgreSQL as a source of external dictionary](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql) + +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/postgresql/) diff --git a/docs/en/engines/table-engines/integrations/rabbitmq.md b/docs/en/engines/table-engines/integrations/rabbitmq.md index 4a0550275ca8e4cbac4598edf132c996012638d9..476192d3969433590b17ce330ba9f1eb0e1009c4 100644 --- a/docs/en/engines/table-engines/integrations/rabbitmq.md +++ b/docs/en/engines/table-engines/integrations/rabbitmq.md @@ -163,3 +163,5 @@ Example: - `_redelivered` - `redelivered` flag of the message. - `_message_id` - messageID of the received message; non-empty if was set, when message was published. - `_timestamp` - timestamp of the received message; non-empty if was set, when message was published. + +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/rabbitmq/) diff --git a/docs/en/engines/table-engines/integrations/s3.md b/docs/en/engines/table-engines/integrations/s3.md index 5858a0803e633748158030ad3e851f804fe5f78d..93dcbdbc0f1c57e2e205c804a5c85c6bb0ae3639 100644 --- a/docs/en/engines/table-engines/integrations/s3.md +++ b/docs/en/engines/table-engines/integrations/s3.md @@ -6,11 +6,11 @@ toc_title: S3 # S3 {#table_engines-s3} This engine provides integration with [Amazon S3](https://aws.amazon.com/s3/) ecosystem. This engine is similar -to the [HDFS](../../../engines/table-engines/special/file.md#table_engines-hdfs) engine, but provides S3-specific features. +to the [HDFS](../../../engines/table-engines/integrations/hdfs.md#table_engines-hdfs) engine, but provides S3-specific features. ## Usage {#usage} -``` sql +```sql ENGINE = S3(path, [aws_access_key_id, aws_secret_access_key,] format, structure, [compression]) ``` @@ -25,23 +25,23 @@ ENGINE = S3(path, [aws_access_key_id, aws_secret_access_key,] format, structure, **1.** Set up the `s3_engine_table` table: -``` sql +```sql CREATE TABLE s3_engine_table (name String, value UInt32) ENGINE=S3('https://storage.yandexcloud.net/my-test-bucket-768/test-data.csv.gz', 'CSV', 'name String, value UInt32', 'gzip') ``` **2.** Fill file: -``` sql +```sql INSERT INTO s3_engine_table VALUES ('one', 1), ('two', 2), ('three', 3) ``` **3.** Query the data: -``` sql +```sql SELECT * FROM s3_engine_table LIMIT 2 ``` -``` text +```text ┌─name─┬─value─┐ │ one │ 1 │ │ two │ 2 │ @@ -69,7 +69,7 @@ Constructions with `{}` are similar to the [remote](../../../sql-reference/table **Example** -1. Suppose we have several files in TSV format with the following URIs on HDFS: +1. Suppose we have several files in CSV format with the following URIs on S3: - ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_1.csv’ - ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_2.csv’ @@ -82,19 +82,19 @@ Constructions with `{}` are similar to the [remote](../../../sql-reference/table -``` sql +```sql CREATE TABLE table_with_range (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/some_file_{1..3}', 'CSV') ``` 3. Another way: -``` sql +```sql CREATE TABLE table_with_question_mark (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/some_file_?', 'CSV') ``` 4. 
Table consists of all the files in both directories (all files should satisfy format and schema described in query): -``` sql +```sql CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/*', 'CSV') ``` @@ -105,7 +105,7 @@ CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = S3('https: Create table with files named `file-000.csv`, `file-001.csv`, … , `file-999.csv`: -``` sql +```sql CREATE TABLE big_table (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/big_prefix/file-{000..999}.csv', 'CSV') ``` @@ -124,7 +124,7 @@ The following settings can be set before query execution or placed into configur - `s3_max_single_part_upload_size` — Default value is `64Mb`. The maximum size of object to upload using singlepart upload to S3. - `s3_min_upload_part_size` — Default value is `512Mb`. The minimum size of part to upload during multipart upload to [S3 Multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html). -- `s3_max_redirects` — Default value is `10`. Max number of S3 redirects hops allowed. +- `s3_max_redirects` — Default value is `10`. Max number of S3 HTTP redirect hops allowed. Security consideration: if malicious user can specify arbitrary S3 URLs, `s3_max_redirects` must be set to zero to avoid [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery) attacks; or alternatively, `remote_host_filter` must be specified in server configuration. @@ -153,4 +153,4 @@ Example: ``` -[Original article](https://clickhouse.tech/docs/en/operations/table_engines/s3/) +[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/s3/) diff --git a/docs/en/sql-reference/table-functions/file.md b/docs/en/sql-reference/table-functions/file.md index da0999e66eb820ba7e0564aad7d4de8b11303a4d..e1459b5e2547b12d193280a0de3280f400ebe01f 100644 --- a/docs/en/sql-reference/table-functions/file.md +++ b/docs/en/sql-reference/table-functions/file.md @@ -124,6 +124,6 @@ SELECT count(*) FROM file('big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name String, **See Also** -- [Virtual columns](index.md#table_engines-virtual_columns) +- [Virtual columns](../../engines/table-engines/index.md#table_engines-virtual_columns) [Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/file/) diff --git a/docs/en/sql-reference/table-functions/hdfs.md b/docs/en/sql-reference/table-functions/hdfs.md index 512f47a2b461325460a72871dede1ac981935bce..31e2000b22d157c7b3bf01bcc06c5176b90e30c9 100644 --- a/docs/en/sql-reference/table-functions/hdfs.md +++ b/docs/en/sql-reference/table-functions/hdfs.md @@ -97,6 +97,6 @@ FROM hdfs('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name Strin **See Also** -- [Virtual columns](https://clickhouse.tech/docs/en/operations/table_engines/#table_engines-virtual_columns) +- [Virtual columns](../../engines/table-engines/index.md#table_engines-virtual_columns) [Original article](https://clickhouse.tech/docs/en/query_language/table_functions/hdfs/) diff --git a/docs/en/sql-reference/table-functions/index.md b/docs/en/sql-reference/table-functions/index.md index 691687dea252aef85c4a234f619916beb58d13fe..d1368c6a674588f137effc62a21cf6aebd66e035 100644 --- a/docs/en/sql-reference/table-functions/index.md +++ b/docs/en/sql-reference/table-functions/index.md @@ -21,17 +21,18 @@ You can use table functions in: !!! 
warning "Warning" You can’t use table functions if the [allow_ddl](../../operations/settings/permissions-for-queries.md#settings_allow_ddl) setting is disabled. -| Function | Description | -|-----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------| -| [file](../../sql-reference/table-functions/file.md) | Creates a [File](../../engines/table-engines/special/file.md)-engine table. | -| [merge](../../sql-reference/table-functions/merge.md) | Creates a [Merge](../../engines/table-engines/special/merge.md)-engine table. | -| [numbers](../../sql-reference/table-functions/numbers.md) | Creates a table with a single column filled with integer numbers. | -| [remote](../../sql-reference/table-functions/remote.md) | Allows you to access remote servers without creating a [Distributed](../../engines/table-engines/special/distributed.md)-engine table. | -| [url](../../sql-reference/table-functions/url.md) | Creates a [Url](../../engines/table-engines/special/url.md)-engine table. | -| [mysql](../../sql-reference/table-functions/mysql.md) | Creates a [MySQL](../../engines/table-engines/integrations/mysql.md)-engine table. | -| [jdbc](../../sql-reference/table-functions/jdbc.md) | Creates a [JDBC](../../engines/table-engines/integrations/jdbc.md)-engine table. | -| [odbc](../../sql-reference/table-functions/odbc.md) | Creates a [ODBC](../../engines/table-engines/integrations/odbc.md)-engine table. | -| [hdfs](../../sql-reference/table-functions/hdfs.md) | Creates a [HDFS](../../engines/table-engines/integrations/hdfs.md)-engine table. | -| [s3](../../sql-reference/table-functions/s3.md) | Creates a [S3](../../engines/table-engines/integrations/s3.md)-engine table. | +| Function | Description | +|-----------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------| +| [file](../../sql-reference/table-functions/file.md) | Creates a [File](../../engines/table-engines/special/file.md)-engine table. | +| [merge](../../sql-reference/table-functions/merge.md) | Creates a [Merge](../../engines/table-engines/special/merge.md)-engine table. | +| [numbers](../../sql-reference/table-functions/numbers.md) | Creates a table with a single column filled with integer numbers. | +| [remote](../../sql-reference/table-functions/remote.md) | Allows you to access remote servers without creating a [Distributed](../../engines/table-engines/special/distributed.md)-engine table. | +| [url](../../sql-reference/table-functions/url.md) | Creates a [Url](../../engines/table-engines/special/url.md)-engine table. | +| [mysql](../../sql-reference/table-functions/mysql.md) | Creates a [MySQL](../../engines/table-engines/integrations/mysql.md)-engine table. | +| [postgresql](../../sql-reference/table-functions/postgresql.md) | Creates a [PostgreSQL](../../engines/table-engines/integrations/posgresql.md)-engine table. | +| [jdbc](../../sql-reference/table-functions/jdbc.md) | Creates a [JDBC](../../engines/table-engines/integrations/jdbc.md)-engine table. | +| [odbc](../../sql-reference/table-functions/odbc.md) | Creates a [ODBC](../../engines/table-engines/integrations/odbc.md)-engine table. | +| [hdfs](../../sql-reference/table-functions/hdfs.md) | Creates a [HDFS](../../engines/table-engines/integrations/hdfs.md)-engine table. 
| +| [s3](../../sql-reference/table-functions/s3.md) | Creates an [S3](../../engines/table-engines/integrations/s3.md)-engine table. | [Original article](https://clickhouse.tech/docs/en/query_language/table_functions/) diff --git a/docs/en/sql-reference/table-functions/odbc.md b/docs/en/sql-reference/table-functions/odbc.md index ea79cd44a9372819efccd6f8cde2f1a2279d99a5..38ca4d40d17ad57b19bd3f2ad5e5c22d841e6082 100644 --- a/docs/en/sql-reference/table-functions/odbc.md +++ b/docs/en/sql-reference/table-functions/odbc.md @@ -103,4 +103,4 @@ SELECT * FROM odbc('DSN=mysqlconn', 'test', 'test') - [ODBC external dictionaries](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-odbc) - [ODBC table engine](../../engines/table-engines/integrations/odbc.md). -[Original article](https://clickhouse.tech/docs/en/query_language/table_functions/jdbc/) +[Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/odbc/) diff --git a/docs/en/sql-reference/table-functions/postgresql.md b/docs/en/sql-reference/table-functions/postgresql.md index 082931343bf3f01f00d4d0ac8b80aac44352b77d..ad5d8a299045374fb642f7fb84132e6a6fac03a4 100644 --- a/docs/en/sql-reference/table-functions/postgresql.md +++ b/docs/en/sql-reference/table-functions/postgresql.md @@ -100,3 +100,5 @@ SELECT * FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'p - [The ‘PostgreSQL’ table engine](../../engines/table-engines/integrations/postgresql.md) - [Using PostgreSQL as a source of external dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql) + +[Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/postgresql/) diff --git a/docs/en/sql-reference/table-functions/s3.md b/docs/en/sql-reference/table-functions/s3.md index 76a0e042ea4fe89fd52f51ab8aa8e901c19ebe1d..ea5dde707b81d17d983188afbd51789b2419a923 100644 --- a/docs/en/sql-reference/table-functions/s3.md +++ b/docs/en/sql-reference/table-functions/s3.md @@ -164,6 +164,6 @@ Security consideration: if malicious user can specify arbitrary S3 URLs, `s3_max **See Also** -- [Virtual columns](https://clickhouse.tech/docs/en/operations/table_engines/#table_engines-virtual_columns) +- [Virtual columns](../../engines/table-engines/index.md#table_engines-virtual_columns) [Original article](https://clickhouse.tech/docs/en/query_language/table_functions/s3/) diff --git a/docs/en/sql-reference/table-functions/view.md b/docs/en/sql-reference/table-functions/view.md index 08096c2b01951e758b1e1b3276d641f0689b7289..b627feee4c248aa75ac17f2b9700a63f75a10c4e 100644 --- a/docs/en/sql-reference/table-functions/view.md +++ b/docs/en/sql-reference/table-functions/view.md @@ -64,4 +64,5 @@ SELECT * FROM cluster(`cluster_name`, view(SELECT a, b, c FROM table_name)) **See Also** - [View Table Engine](https://clickhouse.tech/docs/en/engines/table-engines/special/view/) -[Original article](https://clickhouse.tech/docs/en/query_language/table_functions/view/) \ No newline at end of file + +[Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/view/) \ No newline at end of file diff --git a/docs/ru/engines/table-engines/integrations/embedded-rocksdb.md b/docs/ru/engines/table-engines/integrations/embedded-rocksdb.md index 9b68bcfc77026a489af0f03c477b9e75b0eb4355..7bd1420dfab2d3a97dbf0fdf60d57e2a18f3b9c6 100644 --- 
a/docs/ru/engines/table-engines/integrations/embedded-rocksdb.md +++ b/docs/ru/engines/table-engines/integrations/embedded-rocksdb.md @@ -41,4 +41,4 @@ ENGINE = EmbeddedRocksDB PRIMARY KEY key; ``` -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/embedded-rocksdb/) \ No newline at end of file +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/embedded-rocksdb/) \ No newline at end of file diff --git a/docs/ru/engines/table-engines/integrations/hdfs.md b/docs/ru/engines/table-engines/integrations/hdfs.md index bd8e760fce4766ffe604b42dfd754f7ffdfeb0f7..449d7c9a20c8985a236bea640db4824c65fffa94 100644 --- a/docs/ru/engines/table-engines/integrations/hdfs.md +++ b/docs/ru/engines/table-engines/integrations/hdfs.md @@ -102,16 +102,104 @@ CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = HDFS('hdfs Создадим таблицу с именами `file000`, `file001`, … , `file999`: ``` sql -CREARE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV') +CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV') ``` +## Конфигурация {#configuration} + +Аналогично GraphiteMergeTree, движок HDFS поддерживает расширенную конфигурацию с использованием файла конфигурации ClickHouse. Есть два раздела конфигурации, которые вы можете использовать: глобальный (`hdfs`) и на уровне пользователя (`hdfs_*`). Сначала применяются глобальные настройки, затем (если она указана) — конфигурация уровня пользователя. + +``` xml +<hdfs> +	<hadoop_kerberos_keytab>/tmp/keytab/clickhouse.keytab</hadoop_kerberos_keytab> +	<hadoop_kerberos_principal>clickuser@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal> +	<hadoop_security_authentication>kerberos</hadoop_security_authentication> +</hdfs> + +<hdfs_root> +	<hadoop_kerberos_principal>root@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal> +</hdfs_root> +``` + +### Список возможных опций конфигурации со значениями по умолчанию +#### Поддерживаемые из libhdfs3 + + +| **параметр** | **по умолчанию** | +|------------------------------------------------------|-------------------------| +| rpc\_client\_connect\_tcpnodelay | true | +| dfs\_client\_read\_shortcircuit | true | +| output\_replace-datanode-on-failure | true | +| input\_notretry-another-node | false | +| input\_localread\_mappedfile | true | +| dfs\_client\_use\_legacy\_blockreader\_local | false | +| rpc\_client\_ping\_interval | 10 * 1000 | +| rpc\_client\_connect\_timeout | 600 * 1000 | +| rpc\_client\_read\_timeout | 3600 * 1000 | +| rpc\_client\_write\_timeout | 3600 * 1000 | +| rpc\_client\_socekt\_linger\_timeout | -1 | +| rpc\_client\_connect\_retry | 10 | +| rpc\_client\_timeout | 3600 * 1000 | +| dfs\_default\_replica | 3 | +| input\_connect\_timeout | 600 * 1000 | +| input\_read\_timeout | 3600 * 1000 | +| input\_write\_timeout | 3600 * 1000 | +| input\_localread\_default\_buffersize | 1 * 1024 * 1024 | +| dfs\_prefetchsize | 10 | +| input\_read\_getblockinfo\_retry | 3 | +| input\_localread\_blockinfo\_cachesize | 1000 | +| input\_read\_max\_retry | 60 | +| output\_default\_chunksize | 512 | +| output\_default\_packetsize | 64 * 1024 | +| output\_default\_write\_retry | 10 | +| output\_connect\_timeout | 600 * 1000 | +| output\_read\_timeout | 3600 * 1000 | +| output\_write\_timeout | 3600 * 1000 | +| output\_close\_timeout | 3600 * 1000 | +| output\_packetpool\_size | 1024 | +| output\_heeartbeat\_interval | 10 * 1000 | +| dfs\_client\_failover\_max\_attempts | 15 | +| dfs\_client\_read\_shortcircuit\_streams\_cache\_size | 256 | +| dfs\_client\_socketcache\_expiryMsec | 3000 | +| dfs\_client\_socketcache\_capacity | 16 | +| dfs\_default\_blocksize | 64 * 1024 * 1024 | +| dfs\_default\_uri | "hdfs://localhost:9000" | +| 
hadoop\_security\_authentication | "simple" | +| hadoop\_security\_kerberos\_ticket\_cache\_path | "" | +| dfs\_client\_log\_severity | "INFO" | +| dfs\_domain\_socket\_path | "" | + + +[Руководство по конфигурации HDFS](https://hawq.apache.org/docs/userguide/2.3.0.0-incubating/reference/HDFSConfigurationParameterReference.html) поможет объяснить назначение некоторых параметров. + + +#### Расширенные параметры для ClickHouse {#clickhouse-extras} + +| **параметр** | **по умолчанию** | +|----------------------------------|------------------| +|hadoop\_kerberos\_keytab | "" | +|hadoop\_kerberos\_principal | "" | +|hadoop\_kerberos\_kinit\_command | kinit | + +#### Ограничения {#limitations} + * hadoop\_security\_kerberos\_ticket\_cache\_path может быть задан только на глобальном уровне, не для конкретного пользователя + +## Поддержка Kerberos {#kerberos-support} + +Если параметр hadoop\_security\_authentication имеет значение 'kerberos', ClickHouse аутентифицируется с помощью Kerberos. +Сделать это помогают [расширенные параметры](#clickhouse-extras) и hadoop\_security\_kerberos\_ticket\_cache\_path. +Обратите внимание, что из-за ограничений libhdfs3 поддерживается только устаревший метод аутентификации: +коммуникация с узлами данных не защищена SASL (HADOOP\_SECURE\_DN\_USER — надежный показатель такого +подхода к безопасности). Пример настроек см. в tests/integration/test\_storage\_kerberized\_hdfs/hdfs_configs/bootstrap.sh. + +Если в настройках указаны hadoop\_kerberos\_keytab, hadoop\_kerberos\_principal или hadoop\_kerberos\_kinit\_command, будет вызван kinit. В этом случае hadoop\_kerberos\_keytab и hadoop\_kerberos\_principal обязательны. Также должны быть установлены kinit и конфигурационные файлы krb5. ## Виртуальные столбцы {#virtualnye-stolbtsy} - `_path` — Путь к файлу. - `_file` — Имя файла. -**Смотрите также** +**См. 
также** -- [Виртуальные столбцы](index.md#table_engines-virtual_columns) +- [Виртуальные столбцы](../../../engines/table-engines/index.md#table_engines-virtual_columns) -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/hdfs/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/hdfs/) diff --git a/docs/ru/engines/table-engines/integrations/index.md b/docs/ru/engines/table-engines/integrations/index.md index db7e527442e08753ce255c6fafda6543d1fa8954..7a11a5176cd5f048cfb1e91909b534763c75a8fa 100644 --- a/docs/ru/engines/table-engines/integrations/index.md +++ b/docs/ru/engines/table-engines/integrations/index.md @@ -14,8 +14,10 @@ toc_priority: 30 - [MySQL](../../../engines/table-engines/integrations/mysql.md) - [MongoDB](../../../engines/table-engines/integrations/mongodb.md) - [HDFS](../../../engines/table-engines/integrations/hdfs.md) +- [S3](../../../engines/table-engines/integrations/s3.md) - [Kafka](../../../engines/table-engines/integrations/kafka.md) - [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md) - [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md) +- [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md) [Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/) diff --git a/docs/ru/engines/table-engines/integrations/jdbc.md b/docs/ru/engines/table-engines/integrations/jdbc.md index d7d438e0633bc6a0b3bc239c9b0e3c20d3aef707..8ead5abb27786e5916c858de17b9efea870ab098 100644 --- a/docs/ru/engines/table-engines/integrations/jdbc.md +++ b/docs/ru/engines/table-engines/integrations/jdbc.md @@ -89,4 +89,4 @@ FROM jdbc_table - [Табличная функция JDBC](../../../engines/table-engines/integrations/jdbc.md). 
-[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/jdbc/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/jdbc/) diff --git a/docs/ru/engines/table-engines/integrations/kafka.md b/docs/ru/engines/table-engines/integrations/kafka.md index 5a6971b1ae6e7b975d49d8e8db8f6075cc912a61..06a0d4df1805959dfc89343d43192908b1d3c9e6 100644 --- a/docs/ru/engines/table-engines/integrations/kafka.md +++ b/docs/ru/engines/table-engines/integrations/kafka.md @@ -193,4 +193,4 @@ ClickHouse может поддерживать учетные данные Kerbe - [Виртуальные столбцы](index.md#table_engines-virtual_columns) - [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size) -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/kafka/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/kafka/) diff --git a/docs/ru/engines/table-engines/integrations/mongodb.md b/docs/ru/engines/table-engines/integrations/mongodb.md index 0765b3909de5b44dc35af5477ae603d94fa47cee..5ab634946482c889c2d3dd7a0e516e41d8653742 100644 --- a/docs/ru/engines/table-engines/integrations/mongodb.md +++ b/docs/ru/engines/table-engines/integrations/mongodb.md @@ -54,4 +54,4 @@ SELECT COUNT() FROM mongo_table; └─────────┘ ``` -[Original article](https://clickhouse.tech/docs/ru/operations/table_engines/integrations/mongodb/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/mongodb/) diff --git a/docs/ru/engines/table-engines/integrations/mysql.md b/docs/ru/engines/table-engines/integrations/mysql.md index 459f8844ce81fce6b3f3b4fa1e24824be3e756d1..bc53e0f1fbbb05ada6e1a984696b9a2fb4c18b3a 100644 --- a/docs/ru/engines/table-engines/integrations/mysql.md +++ b/docs/ru/engines/table-engines/integrations/mysql.md @@ -101,4 +101,4 @@ SELECT * FROM mysql_table - [Табличная функция ‘mysql’](../../../engines/table-engines/integrations/mysql.md) - [Использование MySQL в качестве источника для внешнего словаря](../../../engines/table-engines/integrations/mysql.md#dicts-external_dicts_dict_sources-mysql) -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/mysql/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/mysql/) diff --git a/docs/ru/engines/table-engines/integrations/odbc.md b/docs/ru/engines/table-engines/integrations/odbc.md index 898d569d504eb7025ab7a64737b7ab7963767868..ee34be302bcc2d90dc1ddd286cb2d2a8b56ac951 100644 --- a/docs/ru/engines/table-engines/integrations/odbc.md +++ b/docs/ru/engines/table-engines/integrations/odbc.md @@ -128,4 +128,4 @@ SELECT * FROM odbc_t - [Внешние словари ODBC](../../../engines/table-engines/integrations/odbc.md#dicts-external_dicts_dict_sources-odbc) - [Табличная функция odbc](../../../engines/table-engines/integrations/odbc.md) -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/odbc/) +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/odbc/) diff --git a/docs/ru/engines/table-engines/integrations/postgresql.md b/docs/ru/engines/table-engines/integrations/postgresql.md index 3ab986822032785d0b4b78fc6f178ce5f172fc39..bc26899f55bce0b3853fe913c87ed10befea640a 100644 --- a/docs/ru/engines/table-engines/integrations/postgresql.md +++ b/docs/ru/engines/table-engines/integrations/postgresql.md @@ -102,3 +102,5 @@ SELECT * FROM postgresql_table WHERE str IN ('test') - [Табличная функция 
‘postgresql’](../../../sql-reference/table-functions/postgresql.md) - [Использование PostgreSQL в качестве источника для внешнего словаря](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql) + +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/postgresql/) diff --git a/docs/ru/engines/table-engines/integrations/rabbitmq.md b/docs/ru/engines/table-engines/integrations/rabbitmq.md index f55163c1988eb2cd9e6c88dd9bd99ff8b999605d..1865cb16fccbcba6fc45efd85e7ef35e9e1d909f 100644 --- a/docs/ru/engines/table-engines/integrations/rabbitmq.md +++ b/docs/ru/engines/table-engines/integrations/rabbitmq.md @@ -155,3 +155,5 @@ Example: - `_redelivered` - флаг `redelivered`. (Не равно нулю, если есть возможность, что сообщение было получено более, чем одним каналом.) - `_message_id` - значение поля `messageID` полученного сообщения. Данное поле непусто, если указано в параметрах при отправке сообщения. - `_timestamp` - значение поля `timestamp` полученного сообщения. Данное поле непусто, если указано в параметрах при отправке сообщения. + +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/rabbitmq/) diff --git a/docs/ru/engines/table-engines/integrations/s3.md b/docs/ru/engines/table-engines/integrations/s3.md new file mode 100644 index 0000000000000000000000000000000000000000..f1b2e78b0bace490eb6a4f5edda0e77205d44bae --- /dev/null +++ b/docs/ru/engines/table-engines/integrations/s3.md @@ -0,0 +1,156 @@ +--- +toc_priority: 4 +toc_title: S3 +--- + +# S3 {#table_engines-s3} + +Движок обеспечивает интеграцию с экосистемой [Amazon S3](https://aws.amazon.com/s3/). Он похож на +движок [HDFS](../../../engines/table-engines/integrations/hdfs.md#table_engines-hdfs), но предоставляет функции, специфичные для S3. + +## Использование {#usage} + +```sql +ENGINE = S3(path, [aws_access_key_id, aws_secret_access_key,] format, structure, [compression]) +``` + +**Параметры** + +- `path` — URL, ссылающийся на файл, расположенный в S3. В режиме чтения можно читать несколько файлов как один; поддерживаются следующие шаблоны для указания маски пути к файлам: `*`, `?`, `{abc,def}` и `{N..M}`, где N, M — числа, 'abc', 'def' — строки. +- `format` — [Формат](../../../interfaces/formats.md#formats) файла. +- `structure` — Структура таблицы. Формат `'column1_name column1_type, column2_name column2_type, ...'`. +- `compression` — Алгоритм сжатия, необязательный параметр. Поддерживаемые значения: none, gzip/gz, brotli/br, xz/LZMA, zstd/zst. По умолчанию алгоритм сжатия определяется автоматически по расширению имени файла. + +**Пример:** + +**1.** Создание таблицы `s3_engine_table`: + +```sql +CREATE TABLE s3_engine_table (name String, value UInt32) ENGINE=S3('https://storage.yandexcloud.net/my-test-bucket-768/test-data.csv.gz', 'CSV', 'name String, value UInt32', 'gzip') +``` + +**2.** Заполнение файла: + +```sql +INSERT INTO s3_engine_table VALUES ('one', 1), ('two', 2), ('three', 3) +``` + +**3.** Запрос данных: + +```sql +SELECT * FROM s3_engine_table LIMIT 2 +``` + +```text +┌─name─┬─value─┐ +│ one │ 1 │ +│ two │ 2 │ +└──────┴───────┘ +``` + +## Детали реализации {#implementation-details} + +- Чтение и запись могут быть одновременными и параллельными. +- Не поддерживаются: + - Операции `ALTER` и `SELECT...SAMPLE`. + - Индексы. + - Репликация. + +**Поддержка шаблонов в параметре path** + +Несколько компонентов параметра `path` могут содержать шаблоны. 
Для того чтобы файл был обработан, он должен существовать в S3 и соответствовать шаблону. Список файлов определяется в момент выполнения `SELECT` (но не в момент `CREATE`). + +- `*` — Заменяет любое количество любых символов, кроме `/`, включая пустую строку. +- `?` — Заменяет один символ. +- `{some_string,another_string,yet_another_one}` — Заменяет любую из строк `'some_string', 'another_string', 'yet_another_one'`. +- `{N..M}` — Заменяет любое число в диапазоне от N до M включительно. N и M могут иметь ведущие нули, например `000..078`. + +Конструкции с `{}` работают так же, как в табличной функции [remote](../../../sql-reference/table-functions/remote.md). + +**Пример** + +1. Предположим, у нас есть несколько файлов в формате CSV со следующими URI в S3: + +- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_1.csv’ +- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_2.csv’ +- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_3.csv’ +- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_1.csv’ +- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_2.csv’ +- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_3.csv’ + +2. Есть несколько способов создать таблицу, состоящую из всех шести файлов: + + + +```sql +CREATE TABLE table_with_range (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/some_file_{1..3}', 'CSV') +``` + +3. Другой способ: + +```sql +CREATE TABLE table_with_question_mark (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/some_file_?', 'CSV') +``` + +4. Таблица, состоящая из всех файлов в обоих каталогах (все файлы должны соответствовать формату и схеме, описанным в запросе): + +```sql +CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/{some,another}_prefix/*', 'CSV') +``` + +!!! warning "Предупреждение" + Если список файлов содержит диапазоны номеров с ведущими нулями, используйте конструкцию со скобками для каждой цифры отдельно или используйте `?`. + +**Пример** + +Создание таблицы с именами файлов `file-000.csv`, `file-001.csv`, … , `file-999.csv`: + +```sql +CREATE TABLE big_table (name String, value UInt32) ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/big_prefix/file-{000..999}.csv', 'CSV') +``` + +## Виртуальные столбцы {#virtual-columns} + +- `_path` — Путь к файлу. +- `_file` — Имя файла. + +**Смотрите также** + +- [Виртуальные столбцы](../../../engines/table-engines/index.md#table_engines-virtual_columns) + +## S3-специфичные настройки {#settings} + +Следующие настройки могут быть заданы перед выполнением запроса или установлены в конфигурационном файле для пользовательского профиля (пример приведен ниже). + +- `s3_max_single_part_upload_size` — По умолчанию `64Mb`. Максимальный размер объекта для загрузки в S3 одним запросом (singlepart upload). +- `s3_min_upload_part_size` — По умолчанию `512Mb`. Минимальный размер части данных для загрузки в S3 с помощью [multipart-загрузки](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html). +- `s3_max_redirects` — Значение по умолчанию `10`. Максимально допустимое количество HTTP-перенаправлений от серверов S3. 
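Для наглядности — примерный (гипотетический) фрагмент конфигурации профиля пользователя, задающий эти настройки; имя профиля и значения здесь условные:

``` xml
<profiles>
    <default>
        <!-- Минимальный размер части для multipart-загрузки: 512 МБ -->
        <s3_min_upload_part_size>536870912</s3_min_upload_part_size>
        <!-- Запретить HTTP-перенаправления (см. примечание по безопасности ниже) -->
        <s3_max_redirects>0</s3_max_redirects>
    </default>
</profiles>
```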
+ +Примечание по безопасности: если злоумышленник может указывать произвольные ссылки на S3, установите `s3_max_redirects` в ноль, чтобы избежать атак типа [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery); либо укажите в конфигурации сервера `remote_host_filter`, ограничив список адресов, по которым возможно взаимодействие с S3. + +### Настройки, специфичные для заданной конечной точки {#endpointsettings} + +Следующие настройки могут быть указаны в конфигурационном файле для заданной конечной точки (сопоставление с ней происходит по точному префиксу URL): + +- `endpoint` — Обязательный параметр. Указывает префикс URL для конечной точки. +- `access_key_id` и `secret_access_key` — Необязательные параметры. Задают учетные данные для заданной конечной точки. +- `use_environment_credentials` — Необязательный параметр, значение по умолчанию `false`. Если установлен в `true`, клиент S3 будет пытаться получить учетные данные из переменных окружения и метаданных Amazon EC2 для заданной конечной точки. +- `header` — Необязательный параметр, может быть указан несколько раз. Добавляет указанный HTTP-заголовок к запросам к заданной конечной точке. +- `server_side_encryption_customer_key_base64` — Необязательный параметр. Если указан, к запросам будут добавлены заголовки, необходимые для доступа к объектам S3 с шифрованием SSE-C. + +Пример: + +``` xml +<s3> +    <endpoint-name> +        <endpoint>https://storage.yandexcloud.net/my-test-bucket-768/</endpoint> +        <!-- <access_key_id>ACCESS_KEY_ID</access_key_id> --> +        <!-- <secret_access_key>SECRET_ACCESS_KEY</secret_access_key> --> +        <!-- <use_environment_credentials>false</use_environment_credentials> --> +        <!-- <header>Authorization: Bearer SOME-TOKEN</header> --> +        <!-- <server_side_encryption_customer_key_base64>BASE64_ENCODED_KEY</server_side_encryption_customer_key_base64> --> +    </endpoint-name> +</s3> +``` + +[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/table-engines/integrations/s3/)
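Набросок предполагаемого использования (имена таблицы и файла условные): если URL в определении таблицы начинается с префикса, заданного в `endpoint`, учетные данные и заголовки из этого раздела конфигурации применяются автоматически, и в запросе их указывать не нужно:

```sql
-- Предполагается, что приведенный выше раздел конфигурации уже задан:
-- URL начинается с префикса из <endpoint>, поэтому access_key_id и
-- secret_access_key из конфигурации будут использованы автоматически.
CREATE TABLE endpoint_table (name String, value UInt32)
    ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/data.csv', 'CSV')
```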