Creates an [external dictionary](dicts/external_dicts.md) with the given [structure](dicts/external_dicts_dict_structure.md), [source](dicts/external_dicts_dict_sources.md), [layout](dicts/external_dicts_dict_layout.md) and [lifetime](dicts/external_dicts_dict_lifetime.md).
An external dictionary structure consists of attributes. Dictionary attributes are specified similarly to table columns. The only required attribute property is its type; all other properties may have default values.
Depending on the dictionary [layout](dicts/external_dicts_dict_layout.md), one or more attributes can be specified as dictionary keys.
For more information, see the [External Dictionaries](dicts/external_dicts.md) section.
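As an illustration, a complete statement combining structure, source, layout and lifetime might look like the sketch below. The dictionary name `default.regions_dict`, the source table `default.regions_source`, the column names and the connection parameters are all hypothetical and chosen only for this example.

```sql
-- Sketch only: every name and connection value here is an assumption.
CREATE DICTIONARY IF NOT EXISTS default.regions_dict
(
    region_id UInt64,               -- key attribute (see PRIMARY KEY below)
    region_name String DEFAULT ''   -- ordinary attribute with a default value
)
PRIMARY KEY region_id
SOURCE(CLICKHOUSE(host 'localhost' port 9000 user 'default' password '' db 'default' table 'regions_source'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 360);

-- Reading an attribute from the hypothetical dictionary:
SELECT dictGetString('default.regions_dict', 'region_name', toUInt64(1)) AS region_name;
```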
@@ -4,10 +4,11 @@ You can add your own dictionaries from various data sources. The data source for
ClickHouse:
> - Fully or partially stores dictionaries in RAM.
- Fully or partially stores dictionaries in RAM.
- Periodically updates dictionaries and dynamically loads missing values. In other words, dictionaries can be loaded dynamically.
- Allows creating external dictionaries with XML files or [DDL queries](../create.md#create-dictionary-query).
The configuration of external dictionaries is located in one or more files. The path to the configuration is specified in the [dictionaries_config](../../operations/server_settings/settings.md#server_settings-dictionaries_config) parameter.
The configuration of external dictionaries can be located in one or more xml-files. The path to the configuration is specified in the [dictionaries_config](../../operations/server_settings/settings.md#server_settings-dictionaries_config) parameter.
Dictionaries can be loaded at server startup or at first use, depending on the [dictionaries_lazy_load](../../operations/server_settings/settings.md#server_settings-dictionaries_lazy_load) setting.
...
...
@@ -31,6 +32,8 @@ The dictionary configuration file has the following format:
You can [configure](external_dicts_dict.md) any number of dictionaries in the same file.
[DDL queries for dictionaries](../create.md#create-dictionary-query) don't require any additional records in the server configuration. They let you work with dictionaries as first-class entities, like tables or views.
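For example, a dictionary created with a DDL query can be inspected and removed with statements analogous to those for tables. This is a rough sketch; `default.somedict` is a hypothetical name.

```sql
-- Sketch only: `default.somedict` is a hypothetical dictionary name.
SHOW DICTIONARIES FROM default;
SHOW CREATE DICTIONARY default.somedict;
EXISTS DICTIONARY default.somedict;
DROP DICTIONARY IF EXISTS default.somedict;
```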
!!! attention
You can convert values for a small dictionary by describing it in a `SELECT` query (see the [transform](../functions/other_functions.md) function). This functionality is not related to external dictionaries.
The dictionary is completely stored in memory in the form of a hash table. The dictionary can contain any number of elements with any identifiers. In practice, the number of keys can reach tens of millions of items.
To use a sample for date ranges, define the `range_min` and `range_max` elements in the [structure](external_dicts_dict_structure.md). These elements must contain the elements `name` and `type` (if `type` is not specified, the default type, Date, is used). `type` can be any numeric type (Date / DateTime / UInt64 / Int32 / others).
...
...
@@ -144,6 +171,19 @@ Example:
...
```
or
```sql
CREATE DICTIONARY somedict (
    id UInt64,
    first Date,
    last Date
)
PRIMARY KEY id
LAYOUT(RANGE_HASHED())
RANGE(MIN first MAX last)
```
To work with these dictionaries, you need to pass an additional argument to the `dictGetT` function, for which a range is selected:
```sql
...
...
@@ -193,6 +233,18 @@ Configuration example:
</yandex>
```
or
```sql
CREATE DICTIONARY somedict (
    Abcdef UInt64,
    StartTimeStamp UInt64,
    EndTimeStamp UInt64,
    XXXType String DEFAULT ''
)
PRIMARY KEY Abcdef
RANGE(MIN StartTimeStamp MAX EndTimeStamp)
```
### cache
...
...
@@ -218,6 +270,12 @@ Example of settings:
</layout>
```
or
```sql
LAYOUT(CACHE(SIZE_IN_CELLS 1000000000))
```
Set a large enough cache size. You need to experiment to select the number of cells:
1. Set some value.
...
...
@@ -241,17 +299,17 @@ This type of storage is for mapping network prefixes (IP addresses) to metadata
Example: The table contains network prefixes and their corresponding AS number and country code:
```text
+-----------------+-------+--------+
+-----------------|-------|--------+
| prefix | asn | cca2 |
+=================+=======+========+
| 202.79.32.0/20 | 17501 | NP |
+-----------------+-------+--------+
+-----------------|-------|--------+
| 2620:0:870::/48 | 3856 | US |
+-----------------+-------+--------+
+-----------------|-------|--------+
| 2a02:6b8:1::/48 | 13238 | RU |
+-----------------+-------+--------+
+-----------------|-------|--------+
| 2001:db8::/32 | 65536 | ZZ |
+-----------------+-------+--------+
+-----------------|-------|--------+
```
When using this type of layout, the structure must have a composite key.
...
...
@@ -279,6 +337,17 @@ Example:
...
```
or
```sql
CREATE DICTIONARY somedict (
    prefix String,
    asn UInt32,
    cca2 String DEFAULT '??'
)
PRIMARY KEY prefix
```
The key must have only one String type attribute that contains an allowed IP prefix. Other types are not supported yet.
For queries, you must use the same functions (`dictGetT` with a tuple) as for dictionaries with composite keys:
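A lookup against such a dictionary could be sketched as follows. It assumes a dictionary named `somedict` with the `prefix`, `asn` and `cca2` attributes shown earlier, and it assumes the address is passed as a `UInt32` for IPv4 or a `FixedString(16)` for IPv6.

```sql
-- Sketch only: assumes a dictionary `somedict` with attributes `asn` (UInt32) and `cca2` (String).
-- IPv4 address passed as UInt32:
SELECT dictGetUInt32('somedict', 'asn', tuple(IPv4StringToNum('202.79.32.10'))) AS asn;

-- IPv6 address passed as FixedString(16):
SELECT dictGetString('somedict', 'cca2', tuple(IPv6StringToNum('2001:db8::1'))) AS cca2;
```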
Setting `<lifetime>0</lifetime>` (`LIFETIME(0)`) prevents dictionaries from updating.
You can set a time interval for updates, and ClickHouse will choose a uniformly random time within this range. This distributes the load on the dictionary source when a large number of servers update their dictionaries.
...
...
@@ -32,6 +39,12 @@ Example of settings:
</dictionary>
```
or
```sql
LIFETIME(MIN 300 MAX 360)
```
When updating the dictionaries, the ClickHouse server applies different logic depending on the type of [source](external_dicts_dict_sources.md):
- For a text file, it checks the time of modification. If the time differs from the previously recorded time, the dictionary is updated.
...
...
@@ -56,5 +69,13 @@ Example of settings:
</dictionary>
```
or
```sql
...
SOURCE(ODBC(... invalidate_query 'SELECT update_time FROM dictionary_source where id = 1'))
- `command` – The absolute path to the executable file, or the file name (if the program directory is written to `PATH`).
...
...
@@ -99,6 +120,17 @@ Example of settings:
</source>
```
or
```sql
SOURCE(HTTP(
    url 'http://[::1]/os.tsv'
    format 'TabSeparated'
    credentials(user 'user' password 'password')
    headers(header(name 'API-KEY' value 'key'))
))
```
In order for ClickHouse to access an HTTPS resource, you must [configure openSSL](../../operations/server_settings/settings.md#server_settings-openssl) in the server configuration.
Setting fields:
...
...
@@ -121,12 +153,25 @@ You can use this method to connect any database that has an ODBC driver.
- `host` – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a [Distributed](../../operations/table_engines/distributed.md) table and enter it in subsequent configurations.
- Numeric key. UInt64. Defined in the tag `<id>` .
- Composite key. Set of values of different types. Defined in the tag `<key>` .
A structure can contain either `<id>` or `<key>` .
- Numeric key. UInt64. Defined in the `<id>` tag or using the `PRIMARY KEY` keyword.
- Composite key. Set of values of different types. Defined in the `<key>` tag or using the `PRIMARY KEY` keyword.
!!! warning
The key doesn't need to be defined separately in attributes.
An XML structure can contain either `<id>` or `<key>`. A DDL query must contain a single `PRIMARY KEY`.
### Numeric Key
...
...
@@ -56,6 +68,20 @@ Configuration fields:
- `name` – The name of the column with keys.
For a DDL query:
```sql
CREATE DICTIONARY (
    Id UInt64,
    ...
)
PRIMARY KEY Id
...
```
- `PRIMARY KEY` – The name of the column with keys.
### Composite Key
The key can be a `tuple` of fields of any type. The [layout](external_dicts_dict_layout.md) in this case must be `complex_key_hashed` or `complex_key_cache`.
...
...
@@ -81,6 +107,18 @@ The key structure is set in the element `<key>`. Key fields are specified in the
...
```
or
```sql
CREATE DICTIONARY (
    field1 String,
    field2 String
    ...
)
PRIMARY KEY field1, field2
...
```
For a query to the `dictGet*` function, a tuple is passed as the key. Example: `dictGetString('dict_name', 'attr_name', tuple('string for field1', num_for_field2))`.
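Written out as a full query, a composite-key lookup might look like the sketch below. It assumes a dictionary `dict_name` whose key is the pair (`field1 String`, `field2 String`) as in the DDL example, and a `String` attribute named `attr_name`; all of these names are placeholders.

```sql
-- Sketch only: `dict_name` and `attr_name` are placeholder names.
-- The tuple elements must match the key fields in order and type.
SELECT dictGetString(
    'dict_name',
    'attr_name',
    tuple('string for field1', 'string for field2')
) AS value;
```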
Returns a single `UInt8`-type column, which contains the single value `0` if the table or database doesn't exist, or `1` if the table exists in the specified database.