Commit 19a9672c authored by Ivan Blinkov

More warnings fixed in english docs

Parent e67250ff
......@@ -2,4 +2,5 @@ Array(T)
--------
Array of T-type items. The T type can be any type, including an array.
We don't recommend using multidimensional arrays, because they are not well supported (for example, you can't store multidimensional arrays in tables with engines from the MergeTree family).
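As a minimal sketch (the table and column names below are only for illustration), an Array(String) column can be declared and queried like this:

.. code-block:: sql

    CREATE TABLE array_example (id UInt32, tags Array(String)) ENGINE = Memory;
    INSERT INTO array_example VALUES (1, ['web', 'mobile']);
    SELECT id, tags[1] AS first_tag, length(tags) AS tag_count FROM array_example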
......@@ -251,7 +251,8 @@ ip_trie
The table stores IP prefixes for each key (IP address), which makes it possible to map IP addresses to metadata such as ASN or threat score.
Example: the table contains prefixes mapped to AS numbers and country codes:
..
prefix asn cca2
202.79.32.0/20 17501 NP
2620:0:870::/48 3856 US
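A sketch of looking up such a dictionary (the dictionary name 'my_prefix_dict' is hypothetical, and 'asn' is assumed to be stored as UInt32); for ip_trie layouts the key is passed as a tuple containing the numeric IP address:

.. code-block:: sql

    SELECT dictGetUInt32('my_prefix_dict', 'asn', tuple(IPv4StringToNum('202.79.32.10'))) AS asn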
......
JSON
----
Outputs data in JSON format. Besides data tables, it also outputs column names and types, along with some additional information - the total number of output rows, and the number of rows that could have been output if there weren't a LIMIT. Example:
......
......@@ -35,10 +35,12 @@ The TabSeparated format is convenient for processing data using custom programs
The TabSeparated format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:
.. code-block:: sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
..
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
......
......@@ -2,7 +2,8 @@ TSKV
----
Similar to TabSeparated, but displays data in name=value format. Names are displayed just as in TabSeparated. Additionally, a ``=`` symbol is displayed.
..
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior count()=2166
SearchPhrase=yandex count()=1655
......
......@@ -5,5 +5,5 @@ Prints every row in parentheses. Rows are separated by commas. There is no comma
The minimum set of characters that you must escape in the Values format is the single quote and the backslash.
This is the format that is used in ``INSERT INTO t VALUES ...``
But you can also use it for query results.
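For example, the data part of this INSERT (the same query shown in the "Syntax" section below) is written in the Values format:

.. code-block:: sql

    INSERT INTO t VALUES (1, 'Hello, world'), (2, 'abc'), (3, 'def')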
......@@ -3,12 +3,14 @@ Arithmetic functions
For all arithmetic functions, the result type is calculated as the smallest number type that the result fits in, if there is such a type. The minimum is taken simultaneously based on the number of bits, whether it is signed, and whether it is a floating-point type. If there are not enough bits, the type with the most bits is taken.
Example:
.. code-block:: sql
SELECT toTypeName(0), toTypeName(0 + 0), toTypeName(0 + 0 + 0), toTypeName(0 + 0 + 0 + 0)
..
┌─toTypeName(0)─┬─toTypeName(plus(0, 0))─┬─toTypeName(plus(plus(0, 0), 0))─┬─toTypeName(plus(plus(plus(0, 0), 0), 0))─┐
│ UInt8 │ UInt16 │ UInt32 │ UInt64 │
└───────────────┴────────────────────────┴─────────────────────────────────┴──────────────────────────────────────────┘
......
......@@ -92,7 +92,9 @@ This function is normally used together with ARRAY JOIN. It allows counting some
arrayEnumerate(GoalsReached) AS num
WHERE CounterID = 160656
LIMIT 10
..
┌─Reaches─┬──Hits─┐
│ 95606 │ 31406 │
└─────────┴───────┘
......@@ -106,7 +108,9 @@ In this example, Reaches is the number of conversions (the strings received afte
count() AS Hits
FROM test.hits
WHERE (CounterID = 160656) AND notEmpty(GoalsReached)
..
┌─Reaches─┬──Hits─┐
│ 95606 │ 31406 │
└─────────┴───────┘
......@@ -134,7 +138,9 @@ This function is useful when using ARRAY JOIN and aggregation of array elements.
GROUP BY GoalID
ORDER BY Reaches DESC
LIMIT 10
..
┌──GoalID─┬─Reaches─┬─Visits─┐
│ 53225 │ 3214 │ 1097 │
│ 2825062 │ 3188 │ 1097 │
......@@ -157,7 +163,9 @@ The arrayEnumerateUniq function can take multiple arrays of the same size as arg
.. code-block:: sql
SELECT arrayEnumerateUniq([1, 1, 1, 2, 2, 2], [1, 1, 2, 1, 1, 2]) AS res
..
┌─res───────────┐
│ [1,2,1,1,2,1] │
└───────────────┘
......
......@@ -22,7 +22,8 @@ Example:
arrayJoin([1, 2, 3] AS src) AS dst,
'Hello',
src
..
┌─dst─┬─\'Hello\'─┬─src─────┐
│ 1 │ Hello │ [1,2,3] │
│ 2 │ Hello │ [1,2,3] │
......
Conditional functions
---------------------
if(cond, then, else), ternary operator cond ? then : else
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns 'then' if 'cond != 0', or 'else' if 'cond = 0'.
'cond' must be UInt8, and 'then' and 'else' must be types that have a smallest common type.
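A minimal illustration of both forms (they are interchangeable):

.. code-block:: sql

    SELECT if(1, 'then branch', 'else branch'), (1 > 2) ? 'then branch' : 'else branch'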
......@@ -4,16 +4,20 @@ Functions for working with dates and times
Time Zone Support
All functions for working with the date and time for which this makes sense can take a second, optional argument: the time zone name. Example: Asia/Yekaterinburg. In this case, they use the specified time zone instead of the local (default) one.
.. code-block:: sql
SELECT
toDateTime('2016-06-15 23:00:00') AS time,
toDate(time) AS date_local,
toDate(time, 'Asia/Yekaterinburg') AS date_yekat,
toString(time, 'US/Samoa') AS time_samoa
..
┌────────────────time─┬─date_local─┬─date_yekat─┬─time_samoa──────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-16 │ 2016-06-15 09:00:00 │
└─────────────────────┴────────────┴────────────┴─────────────────────┘
Only time zones that differ from UTC by an integer number of hours are supported.
toYear
......@@ -137,5 +141,5 @@ This function is specific to Yandex.Metrica, since half an hour is the minimum a
timeSlots(StartTime, Duration)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For a time interval starting at 'StartTime' and continuing for 'Duration' seconds, it returns an array of moments in time, consisting of points from this interval rounded down to the half hour.
For example, ``timeSlots(toDateTime('2012-01-01 12:20:00'), toUInt32(600)) = [toDateTime('2012-01-01 12:00:00'), toDateTime('2012-01-01 12:30:00')]``.
This is necessary for searching for page views in the corresponding session.
......@@ -17,7 +17,7 @@ dictGetDate, dictGetDateTime
dictGetString
~~~~~~~~~~~~~
``dictGetT('dict_name', 'attr_name', id)``
Gets the value of the 'attr_name' attribute from the 'dict_name' dictionary by the 'id' key.
'dict_name' and 'attr_name' are constant strings.
'id' must be UInt64.
If the 'id' key is not in the dictionary, it returns the default value set in the dictionary definition.
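A sketch of a lookup; the 'countries' dictionary and its 'name' attribute here are hypothetical and used only for illustration:

.. code-block:: sql

    SELECT dictGetString('countries', 'name', toUInt64(42)) AS country_name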
......@@ -30,14 +30,14 @@ Similar to the functions dictGetT, but the default value is taken from the last
dictIsIn
~~~~~~~~
``dictIsIn('dict_name', child_id, ancestor_id)``
For the 'dict_name' hierarchical dictionary, finds out whether the 'child_id' key is located inside 'ancestor_id' (or matches 'ancestor_id'). Returns UInt8.
dictGetHierarchy
~~~~~~~~~~~~~~~~
``dictGetHierarchy('dict_name', id)``
For the 'dict_name' hierarchical dictionary, returns an array of dictionary keys starting from 'id' and continuing along the chain of parent elements. Returns Array(UInt64).
dictHas
~~~~~~~
``dictHas('dict_name', id)``
Checks whether the key is present in the dictionary. Returns a UInt8 value equal to 0 if the key is missing and 1 if it is present.
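Assuming the same hypothetical 'countries' dictionary as above:

.. code-block:: sql

    SELECT dictHas('countries', toUInt64(42)) AS key_exists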
......@@ -26,18 +26,24 @@ Examples:
.. code-block:: sql
SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res
..
┌─res───────────┐
│ ['abc World'] │
└───────────────┘
.. code-block:: sql
SELECT
arrayFilter(
(i, x) -> x LIKE '%World%',
arrayEnumerate(arr),
['Hello', 'abc World'] AS arr)
AS res
..
┌─res─┐
│ [2] │
└─────┘
......
......@@ -7,7 +7,6 @@ In this section we discuss regular functions. For aggregate functions, see the s
* - There is a third type of function that the 'arrayJoin' function belongs to; table functions can also be mentioned separately.
.. toctree::
:glob:
......
......@@ -33,7 +33,8 @@ visitParamExtractRaw(params, name)
Returns the value of a field, including separators.
Examples:
..
visitParamExtractRaw('{"abc":"\\n\\u0000"}', 'abc') = '"\\n\\u0000"'
visitParamExtractRaw('{"abc":{"def":[1,2,3]}}', 'abc') = '{"def":[1,2,3]}'
......@@ -42,7 +43,8 @@ visitParamExtractString(params, name)
Parses the string in double quotes. The value is unescaped. If unescaping failed, it returns an empty string.
Examples:
..
visitParamExtractString('{"abc":"\\n\\u0000"}', 'abc') = '\n\0'
visitParamExtractString('{"abc":"\\u263a"}', 'abc') = '☺'
visitParamExtractString('{"abc":"\\u263"}', 'abc') = ''
......
......@@ -45,14 +45,16 @@ Accepts a numeric argument and returns a Float64 number close to the cubic root
erf(x)
~~~~~~
If 'x' is non-negative, then ``erf(x / σ√2)`` is the probability that a random variable with a normal distribution and standard deviation 'σ' takes a value separated from the expected value by more than 'x'.
Example (three sigma rule):
.. code-block:: sql
SELECT erf(3 / sqrt(2))
..
┌─erf(divide(3, sqrt(2)))─┐
│ 0.9973002039367398 │
└─────────────────────────┘
......
......@@ -76,7 +76,9 @@ The band is drawn with accuracy to one eighth of a symbol. Example:
FROM test.hits
GROUP BY h
ORDER BY h ASC
..
┌──h─┬──────c─┬─bar────────────────┐
│ 0 │ 292907 │ █████████▋ │
│ 1 │ 180563 │ ██████ │
......@@ -142,7 +144,9 @@ Example:
WHERE SearchEngineID != 0
GROUP BY title
ORDER BY c DESC
..
┌─title─────┬──────c─┐
│ Яндекс │ 498635 │
│ Google │ 229872 │
......@@ -170,7 +174,9 @@ Example:
GROUP BY domain(Referer)
ORDER BY count() DESC
LIMIT 10
..
┌─s──────────────┬───────c─┐
│ │ 2906259 │
│ www.yandex │ 867767 │
......@@ -195,7 +201,9 @@ Example:
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableSize(filesize_bytes) AS filesize
..
┌─filesize_bytes─┬─filesize───┐
│ 1 │ 1.00 B │
│ 1024 │ 1.00 KiB │
......@@ -249,7 +257,9 @@ Example:
ORDER BY EventTime ASC
LIMIT 5
)
..
┌─EventID─┬───────────EventTime─┬─delta─┐
│ 1106 │ 2016-11-24 00:00:04 │ 0 │
│ 1107 │ 2016-11-24 00:00:05 │ 1 │
......
......@@ -29,7 +29,9 @@ Example 1. Converting the date to American format:
FROM test.hits
LIMIT 7
FORMAT TabSeparated
..
2014-03-17 03/17/2014
2014-03-18 03/18/2014
2014-03-19 03/19/2014
......@@ -43,7 +45,9 @@ Example 2. Copy the string ten times:
.. code-block:: sql
SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0') AS res
..
┌─res────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World! │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
......@@ -54,7 +58,9 @@ This does the same thing, but replaces all the occurrences. Example:
.. code-block:: sql
SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res
..
┌─res────────────────────────┐
│ HHeelllloo,, WWoorrlldd!! │
└────────────────────────────┘
......@@ -64,7 +70,9 @@ Example:
.. code-block:: sql
SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
..
┌─res─────────────────┐
│ here: Hello, World! │
└─────────────────────┘
......@@ -43,17 +43,21 @@ To do transformations on DateTime in given time zone, pass second argument with
SELECT
now() AS now_local,
toString(now(), 'Asia/Yekaterinburg') AS now_yekat
..
┌───────────now_local─┬─now_yekat───────────┐
│ 2016-06-15 00:11:21 │ 2016-06-15 02:11:21 │
└─────────────────────┴─────────────────────┘
To format a DateTime value in a given time zone:
..
toString(now(), 'Asia/Yekaterinburg')
To get a unix timestamp for a string with a date and time in a specified time zone:
..
toUnixTimestamp('2000-01-01 00:00:00', 'Asia/Yekaterinburg')
toFixedString(s, N)
......@@ -68,13 +72,13 @@ Example:
.. code-block:: sql
SELECT toFixedString('foo', 8) AS s, toStringCutToZero(s) AS s_cut
┌─s─────────────┬─s_cut─┐
│ foo\0\0\0\0\0 │ foo │
└───────────────┴───────┘
SELECT toFixedString('foo\0bar', 8) AS s, toStringCutToZero(s) AS s_cut
┌─s──────────┬─s_cut─┐
│ foo\0bar\0 │ foo │
......@@ -112,7 +116,9 @@ Example:
CAST(timestamp AS Date) AS date,
CAST(timestamp, 'String') AS string,
CAST(timestamp, 'FixedString(22)') AS fixed_string
..
┌─timestamp───────────┬────────────datetime─┬───────date─┬─string──────────────┬─fixed_string──────────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00\0\0\0 │
└─────────────────────┴─────────────────────┴────────────┴─────────────────────┴───────────────────────────┘
......
......@@ -3,77 +3,78 @@ Functions for working with URLs
None of these functions follow the RFC. They are maximally simplified for improved performance.
Functions that extract part of the URL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the requested part isn't present in the URL, an empty string is returned.
protocol
""""""""
Selects the protocol. Examples: http, ftp, mailto, magnet...
domain
""""""
Selects the domain.
domainWithoutWWW
""""""""""""""""
Selects the domain and removes no more than one 'www.' from the beginning of it, if present.
topLevelDomain
""""""""""""""
Selects the top-level domain. Example: .ru.
firstSignificantSubdomain
"""""""""""""""""""""""""
- Selects the "first significant subdomain". This is a non-standard concept specific to Yandex.Metrica. The first significant subdomain is a second-level domain if it is 'com', 'net', 'org', or 'co'. Otherwise, it is a third-level domain. For example, firstSignificantSubdomain('https://news.yandex.ru/') = 'yandex', firstSignificantSubdomain('https://news.yandex.com.tr/') = 'yandex'. The list of "insignificant" second-level domains and other implementation details may change in the future.
Selects the "first significant subdomain". This is a non-standard concept specific to Yandex.Metrica. The first significant subdomain is a second-level domain if it is 'com', 'net', 'org', or 'co'. Otherwise, it is a third-level domain. For example, firstSignificantSubdomain('https://news.yandex.ru/') = 'yandex', firstSignificantSubdomain('https://news.yandex.com.tr/') = 'yandex'. The list of "insignificant" second-level domains and other implementation details may change in the future.
cutToFirstSignificantSubdomain
""""""""""""""""""""""""""""""
Selects the part of the domain that includes top-level subdomains up to the "first significant subdomain" (see the explanation above).
For example, ``cutToFirstSignificantSubdomain('https://news.yandex.com.tr/') = 'yandex.com.tr'``.
path
""""
Selects the path. Example: /top/news.html. The path does not include the query-string.
pathFull
""""""""
The same as above, but including query-string and fragment. Example: /top/news.html?page=2#comments
queryString
"""""""""""
Selects the query-string. Example: page=1&lr=213. query-string does not include the first question mark, or # and everything that comes after #.
fragment
""""""""
Selects the fragment identifier. fragment does not include the first number sign (#).
queryStringAndFragment
""""""""""""""""""""""
Selects the query-string and fragment identifier. Example: page=1#29390.
extractURLParameter(URL, name)
""""""""""""""""""""""""""""""
Selects the value of the 'name' parameter in the URL, if present. Otherwise, selects an empty string. If there are many parameters with this name, it returns the first occurrence. This function works under the assumption that the parameter name is encoded in the URL in exactly the same way as in the argument passed.
extractURLParameters(URL)
"""""""""""""""""""""""""
Gets an array of name=value strings corresponding to the URL parameters. The values are not decoded in any way.
extractURLParameterNames(URL)
"""""""""""""""""""""""""""""
Gets an array of name strings corresponding to the names of URL parameters. The names are not decoded in any way.
URLHierarchy(URL)
"""""""""""""""""
Gets an array containing the URL trimmed to the ``/``, ``?`` characters in the path and query-string. Consecutive separator characters are counted as one. The cut is made in the position after all the consecutive separator characters. Example:
URLPathHierarchy(URL)
"""""""""""""""""""""
The same thing, but without the protocol and host in the result. The / element (root) is not included. Example:
This function is used for implementing tree-view reports by URL in Yandex.Metrica.
..
URLPathHierarchy('https://example.com/browse/CONV-6788') =
[
'/browse/',
......@@ -83,11 +84,12 @@ This function is used for implementing tree-view reports by URL in Yandex.Metric
decodeURLComponent(URL)
"""""""""""""""""""""""
Returns a URL-decoded URL.
Example:
.. code-block:: sql
SELECT decodeURLComponent('http://127.0.0.1:8123/?query=SELECT%201%3B') AS DecodedURL;
┌─DecodedURL─────────────────────────────┐
│ http://127.0.0.1:8123/?query=SELECT 1; │
......
......@@ -38,7 +38,9 @@ Converts a region to an area (type 5 in the geobase). In every other way, this f
SELECT DISTINCT regionToName(regionToArea(toUInt32(number), 'ua'), 'en')
FROM system.numbers
LIMIT 15
..
┌─regionToName(regionToArea(toUInt32(number), \'ua\'), \'en\')─┐
│ │
│ Moscow and Moscow region │
......@@ -66,7 +68,9 @@ Converts a region to a federal district (type 4 in the geobase). In every other
SELECT DISTINCT regionToName(regionToDistrict(toUInt32(number), 'ua'), 'en')
FROM system.numbers
LIMIT 15
..
┌─regionToName(regionToDistrict(toUInt32(number), \'ua\'), \'en\')─┐
│ │
│ Central │
......
......@@ -21,13 +21,15 @@ Installing from packages
~~~~~~~~~~~~~~~~~~~~~~~~
In `/etc/apt/sources.list` (or in a separate `/etc/apt/sources.list.d/clickhouse.list` file), add the repository:
..
deb http://repo.yandex.ru/clickhouse/trusty stable main
For other Ubuntu versions, replace `trusty` with `xenial` or `precise`.
Then run:
.. code-block:: bash
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4 # optional
sudo apt-get update
sudo apt-get install clickhouse-client clickhouse-server-common
......@@ -45,12 +47,14 @@ Installing from source
To build, follow the instructions in build.md (for Linux) or in build_osx.md (for Mac OS X).
You can compile packages and install them. You can also use programs without installing packages.
..
Client: dbms/src/Client/
Server: dbms/src/Server/
For the server, create a directory for data, such as:
..
/opt/clickhouse/data/default/
/opt/clickhouse/metadata/default/
......@@ -70,7 +74,8 @@ Launch
------
To start the server (as a daemon), run:
.. code-block:: bash
sudo service clickhouse-server start
View the logs in the directory `/var/log/clickhouse-server/`
......@@ -78,25 +83,29 @@ View the logs in the catalog `/var/log/clickhouse-server/`
If the server doesn't start, check the configurations in the file `/etc/clickhouse-server/config.xml`
You can also launch the server from the console:
.. code-block:: bash
clickhouse-server --config-file=/etc/clickhouse-server/config.xml
In this case, the log will be printed to the console, which is convenient during development. If the configuration file is in the current directory, you don't need to specify the '--config-file' parameter. By default, it uses './config.xml'.
You can use the command-line client to connect to the server:
.. code-block:: bash
clickhouse-client
The default parameters indicate connecting to localhost:9000 as the user 'default' without a password.
The client can be used for connecting to a remote server. For example:
.. code-block:: bash
clickhouse-client --host=example.com
For more information, see the section "Command-line client".
Checking the system:
.. code-block:: bash
milovidov@hostname:~/work/metrica/src/dbms/src/Client$ ./clickhouse-client
ClickHouse client version 0.0.18749.
Connecting to localhost:9000.
Connected to ClickHouse server version 0.0.18749.
......
Command-line client
-------------------
To work from the command line, you can use ``clickhouse-client``:
.. code-block:: bash
$ clickhouse-client
ClickHouse client version 0.0.26176.
Connecting to localhost:9000.
......@@ -37,9 +38,12 @@ Only works in non-interactive mode.
``--stacktrace`` - If specified, also prints the stack trace if an exception occurs.
``--config-file`` - Name of the configuration file that has additional settings or changed defaults for the settings listed above.
By default, files are searched for in this order:
..
./clickhouse-client.xml
~/./clickhouse-client/config.xml
/etc/clickhouse-client/config.xml
Settings are only taken from the first file found.
You can also specify any settings that will be used for processing queries. For example, ``clickhouse-client --max_threads=1``. For more information, see the section "Settings".
......@@ -49,7 +53,8 @@ To use batch mode, specify the 'query' parameter, or send data to 'stdin' (it ve
Similar to the HTTP interface, when using the 'query' parameter and sending data to 'stdin', the request is a concatenation of the 'query' parameter, a line break, and the data in 'stdin'. This is convenient for large INSERT queries.
Examples of inserting data via clickhouse-client:
.. code-block:: bash
echo -ne "1, 'some text', '2016-08-14 00:00:00'\n2, 'some more text', '2016-08-14 00:00:01'" | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
cat <<_EOF | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
......
......@@ -160,15 +160,18 @@ By default, the database that is registered in the server settings is used as th
The username and password can be indicated in one of two ways:
1. Using HTTP Basic Authentication. Example:
.. code-block:: bash
echo 'SELECT 1' | curl 'http://user:password@localhost:8123/' -d @-
2. In the 'user' and 'password' URL parameters. Example:
.. code-block:: bash
echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @-
3. Using 'X-ClickHouse-User' and 'X-ClickHouse-Key' headers. Example:
.. code-block:: bash
echo 'SELECT 1' | curl -H "X-ClickHouse-User: user" -H "X-ClickHouse-Key: password" 'http://localhost:8123/' -d @-
......
......@@ -58,5 +58,5 @@ This lets us use the system as the back-end for a web interface. Low latency mea
14. Data replication and support for data integrity on replicas.
----------------------------------------------------------------
Uses asynchronous multi-master replication. After being written to any available replica, data is distributed to all the remaining replicas. The system maintains identical data on different replicas. Data is restored automatically after a failure, or using a "button" for complex cases.
For more information, see the section "Data replication".
......@@ -4,15 +4,18 @@ What is ClickHouse?
ClickHouse is a columnar DBMS for OLAP.
In a "normal" row-oriented DBMS, data is stored in this order:
..
5123456789123456789 1 Eurobasket - Greece - Bosnia and Herzegovina - example.com 1 2011-09-01 01:03:02 6274717 1294101174 11409 612345678912345678 0 33 6 http://www.example.com/basketball/team/123/match/456789.html http://www.example.com/basketball/team/123/match/987654.html 0 1366 768 32 10 3183 0 0 13 0\0 1 1 0 0 2011142 -1 0 0 01321 613 660 2011-09-01 08:01:17 0 0 0 0 utf-8 1466 0 0 0 5678901234567890123 277789954 0 0 0 0 0
5234985259563631958 0 Consulting, Tax assessment, Accounting, Law 1 2011-09-01 01:03:02 6320881 2111222333 213 6458937489576391093 0 3 2 http://www.example.ru/ 0 800 600 16 10 2 153.1 0 0 10 63 1 1 0 0 2111678 000 0 588 368 240 2011-09-01 01:03:17 4 0 60310 0 windows-1251 1466 0 000 778899001 0 0 0 0 0
...
...
In other words, all the values related to a row are stored next to each other. Examples of a row-oriented DBMS are MySQL, Postgres, MS SQL Server, and others.
In a column-oriented DBMS, data is stored like this:
..
WatchID: 5385521489354350662 5385521490329509958 5385521489953706054 5385521490476781638 5385521490583269446 5385521490218868806 5385521491437850694 5385521491090174022 5385521490792669254 5385521490420695110 5385521491532181574 5385521491559694406 5385521491459625030 5385521492275175494 5385521492781318214 5385521492710027334 5385521492955615302 5385521493708759110 5385521494506434630 5385521493104611398
JavaEnable: 1 0 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 1
Title: Yandex Announcements - Investor Relations - Yandex Yandex — Contact us — Moscow Yandex — Mission Ru Yandex — History — History of Yandex Yandex Financial Releases - Investor Relations - Yandex Yandex — Locations Yandex Board of Directors - Corporate Governance - Yandex Yandex — Technologies
......@@ -56,7 +59,8 @@ Columnar-oriented databases are better suited to OLAP scenarios (at least 100 ti
For example, the query "count the number of records for each advertising platform" requires reading one "advertising platform ID" column, which takes up 1 byte uncompressed. If most of the traffic was not from advertising platforms, you can expect at least 10-fold compression of this column. When using a quick compression algorithm, data decompression is possible at a speed of at least several gigabytes of uncompressed data per second. In other words, this query can be processed at a speed of approximately several billion rows per second on a single server. This speed is actually achieved in practice.
Example:
..
milovidov@████████.yandex.ru:~$ clickhouse-client
ClickHouse client version 0.0.52053.
Connecting to localhost:9000.
......
......@@ -4,7 +4,8 @@ Queries
CREATE DATABASE
~~~~~~~~~~~~~~~
Creates the 'db_name' database.
.. code-block:: sql
CREATE DATABASE [IF NOT EXISTS] db_name
A database is just a directory for tables.
......@@ -13,7 +14,8 @@ If "IF NOT EXISTS" is included, the query won't return an error if the database
CREATE TABLE
~~~~~~~~~~~~
The ``CREATE TABLE`` query can have several forms.
.. code-block:: sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
......@@ -25,11 +27,13 @@ Creates a table named 'name' in the 'db' database or the current database if 'db
A column description is ``name type`` in the simplest case. For example: ``RegionID UInt32``.
Expressions can also be defined for default values (see below).
.. code-block:: sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name AS [db2.]name2 [ENGINE = engine]
Creates a table with the same structure as another table. You can specify a different engine for the table. If the engine is not specified, the same engine will be used as for the 'db2.name2' table.
.. code-block:: sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name ENGINE = engine AS SELECT ...
Creates a table with a structure like the result of the ``SELECT`` query, with the 'engine' engine, and fills it with data from SELECT.
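A minimal sketch of this form; the target table name is illustrative, and test.hits is used as in other examples in these docs:

.. code-block:: sql

    CREATE TABLE daily_totals ENGINE = Memory AS SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate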
......@@ -92,14 +96,17 @@ Creates a view. There are two types of views: normal and MATERIALIZED.
Normal views don't store any data, but just perform a read from another table. In other words, a normal view is nothing more than a saved query. When reading from a view, this saved query is used as a subquery in the FROM clause.
As an example, assume you've created a view:
.. code-block:: sql
CREATE VIEW view AS SELECT ...
and written a query:
.. code-block:: sql
SELECT a, b, c FROM view
This query is fully equivalent to using the subquery:
.. code-block:: sql
SELECT a, b, c FROM (SELECT ...)
Materialized views store data transformed by the corresponding SELECT query.
......@@ -130,21 +137,24 @@ This query is used when starting the server. The server stores table metadata as
DROP
~~~~
This query has two types: ``DROP DATABASE`` and ``DROP TABLE``.
.. code-block:: sql
DROP DATABASE [IF EXISTS] db
Deletes all tables inside the 'db' database, then deletes the 'db' database itself.
If IF EXISTS is specified, it doesn't return an error if the database doesn't exist.
.. code-block:: sql
DROP TABLE [IF EXISTS] [db.]name
Deletes the table.
If ``IF EXISTS`` is specified, it doesn't return an error if the table doesn't exist or the database doesn't exist.
DETACH
~~~~~~
Deletes information about the table from the server. The server stops knowing about the table's existence.
.. code-block:: sql
DETACH TABLE [IF EXISTS] [db.]name
This does not delete the table's data or metadata. On the next server launch, the server will read the metadata and find out about the table again. Similarly, a "detached" table can be re-attached using the ATTACH query (with the exception of system tables, which do not have metadata stored for them).
......@@ -154,7 +164,8 @@ There is no DETACH DATABASE query.
RENAME
~~~~~~
Renames one or more tables.
.. code-block:: sql
RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ...
All tables are renamed under global locking. Renaming tables is a light operation. If you indicated another database after TO, the table will be moved to this database. However, the directories with databases must reside in the same file system (otherwise, an error is returned).
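For example, to move a table to another database while renaming it (database and table names are illustrative):

.. code-block:: sql

    RENAME TABLE db1.hits TO db2.hits_archive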
......@@ -166,13 +177,15 @@ The ALTER query is only supported for *MergeTree type tables, as well as for Mer
Column manipulations
""""""""""""""""""""
Lets you change the table structure.
.. code-block:: sql
ALTER TABLE [db].name ADD|DROP|MODIFY COLUMN ...
In the query, specify a list of one or more comma-separated actions. Each action is an operation on a column.
The following actions are supported:
.. code-block:: sql
ADD COLUMN name [type] [default_expr] [AFTER name_after]
Adds a new column to the table with the specified name, type, and default expression (see the section "Default expressions"). If you specify 'AFTER name_after' (the name of another column), the column is added after the specified one in the list of table columns. Otherwise, the column is added to the end of the table. Note that there is no way to add a column to the beginning of a table. For a chain of actions, 'name_after' can be the name of a column that is added in one of the previous actions.
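A sketch of a single ADD COLUMN action with a default expression (table and column names are illustrative):

.. code-block:: sql

    ALTER TABLE visits ADD COLUMN Browser String DEFAULT '' AFTER UserID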
......@@ -248,7 +261,8 @@ Another way to view a set of parts and partitions is to go into the directory wi
The directory with data is
/var/lib/clickhouse/data/database/table/,
where /var/lib/clickhouse/ is the path to ClickHouse data, 'database' is the database name, and 'table' is the table name. Example:
.. code-block:: bash
$ ls -l /var/lib/clickhouse/data/test/visits/
total 48
drwxrwxrwx 2 clickhouse clickhouse 20480 May 13 02:58 20140317_20140323_2_2_0
......@@ -271,8 +285,9 @@ Each part corresponds to a single partition and contains data for a single month
On an operating server, you can't manually change the set of parts or their data on the file system, since the server won't know about it. For non-replicated tables, you can do this when the server is stopped, but we don't recommend it. For replicated tables, the set of parts can't be changed in any case.
The 'detached' directory contains parts that are not used by the server - detached from the table using the ALTER ... DETACH query. Parts that are damaged are also moved to this directory, instead of deleting them. You can add, delete, or modify the data in the 'detached' directory at any time - the server won't know about this until you make the ALTER TABLE ... ATTACH query.
.. code-block:: sql
ALTER TABLE [db.]table DETACH PARTITION 'name'
Move all data for partitions named 'name' to the 'detached' directory and forget about them.
The partition name is specified in YYYYMM format. It can be indicated in single quotes or without them.
......@@ -280,11 +295,13 @@ The partition name is specified in YYYYMM format. It can be indicated in single
After the query is executed, you can do whatever you want with the data in the 'detached' directory — delete it from the file system, or just leave it.
The query is replicated - data will be moved to the 'detached' directory and forgotten on all replicas. The query can only be sent to a leader replica. To find out if a replica is a leader, perform SELECT to the 'system.replicas' system table. Alternatively, it is easier to make a query on all replicas, and all except one will throw an exception.
.. code-block:: sql
ALTER TABLE [db.]table DROP PARTITION 'name'
Similar to the DETACH operation. Deletes data from the table. Data parts will be tagged as inactive and will be completely deleted in approximately 10 minutes. The query is replicated - data will be deleted on all replicas.
.. code-block:: sql
ALTER TABLE [db.]table ATTACH PARTITION|PART 'name'
Adds data to the table from the 'detached' directory.
......@@ -294,7 +311,8 @@ It is possible to add data for an entire partition or a separate part. For a par
The query is replicated. Each replica checks whether there is data in the 'detached' directory. If there is data, it checks the integrity, verifies that it matches the data on the server that initiated the query, and then adds it if everything is correct. If not, it downloads data from the query requestor replica, or from another replica where the data has already been added.
So you can put data in the 'detached' directory on one replica, and use the ALTER ... ATTACH query to add it to the table on all replicas.
.. code-block:: sql
ALTER TABLE [db.]table FREEZE PARTITION 'name'
Creates a local backup of one or multiple partitions. The name can be the full name of the partition (for example, 201403), or its prefix (for example, 2014) - then the backup will be created for all the corresponding partitions.
......@@ -334,7 +352,8 @@ Replication provides protection from device failures. If all data disappeared on
For protection from device failures, you must use replication. For more information about replication, see the section "Data replication".
Backups protect against human error (accidentally deleting data, deleting the wrong data or in the wrong cluster, or corrupting data). For high-volume databases, it can be difficult to copy backups to remote servers. In such cases, to protect from human error, you can keep a backup on the same server (it will reside in /var/lib/clickhouse/shadow/).
.. code-block:: sql
ALTER TABLE [db.]table FETCH PARTITION 'name' FROM 'path-in-zookeeper'
This query only works for replicatable tables.
......@@ -623,7 +642,7 @@ Allows executing JOIN with an array or nested data structure. The intent is simi
ARRAY JOIN is essentially INNER JOIN with an array. Example:
..
:) CREATE TABLE arrays_test (s String, arr Array(UInt8)) ENGINE = Memory
......@@ -684,6 +703,8 @@ An alias can be specified for an array in the ARRAY JOIN clause. In this case, a
FROM arrays_test
ARRAY JOIN arr AS a
..
┌─s─────┬─arr─────┬─a─┐
│ Hello │ [1,2] │ 1 │
│ Hello │ [1,2] │ 2 │
......@@ -697,7 +718,7 @@ An alias can be specified for an array in the ARRAY JOIN clause. In this case, a
Multiple arrays of the same size can be comma-separated in the ARRAY JOIN clause. In this case, JOIN is performed with them simultaneously (the direct sum, not the direct product).
Example:
..
:) SELECT s, arr, a, num, mapped FROM arrays_test ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS mapped
......@@ -733,7 +754,7 @@ Example:
ARRAY JOIN also works with nested data structures. Example:
..
:) CREATE TABLE nested_test (s String, nest Nested(x UInt8, y UInt32)) ENGINE = Memory
......@@ -788,7 +809,7 @@ ARRAY JOIN also works with nested data structures. Example:
When specifying names of nested data structures in ARRAY JOIN, the meaning is the same as ARRAY JOIN with all the array elements that it consists of. Example:
..
:) SELECT s, nest.x, nest.y FROM nested_test ARRAY JOIN nest.x, nest.y
......@@ -816,6 +837,8 @@ This variation also makes sense:
FROM nested_test
ARRAY JOIN `nest.x`
..
┌─s─────┬─nest.x─┬─nest.y─────┐
│ Hello │ 1 │ [10,20] │
│ Hello │ 2 │ [10,20] │
......@@ -836,6 +859,8 @@ An alias may be used for a nested data structure, in order to select either the
FROM nested_test
ARRAY JOIN nest AS n
..
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┐
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │
......@@ -856,6 +881,8 @@ Example of using the arrayEnumerate function:
FROM nested_test
ARRAY JOIN nest AS n, arrayEnumerate(`nest.x`) AS num
..
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┬─num─┐
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │ 1 │
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │ 2 │
......@@ -935,6 +962,8 @@ Example:
ORDER BY hits DESC
LIMIT 10
..
┌─CounterID─┬───hits─┬─visits─┐
│ 1143050 │ 523264 │ 13665 │
│ 731962 │ 475698 │ 102716 │
......@@ -1241,6 +1270,8 @@ Example:
GROUP BY EventDate
ORDER BY EventDate ASC
..
┌──EventDate─┬────ratio─┐
│ 2014-03-17 │ 1 │
│ 2014-03-18 │ 0.807696 │
......
......@@ -3,7 +3,8 @@ Syntax
There are two types of parsers in the system: a full SQL parser (a recursive descent parser), and a data format parser (a fast stream parser). In all cases except the INSERT query, only the full SQL parser is used.
The INSERT query uses both parsers:
.. code-block:: sql
INSERT INTO t VALUES (1, 'Hello, world'), (2, 'abc'), (3, 'def')
The ``INSERT INTO t VALUES`` fragment is parsed by the full parser, and the data ``(1, 'Hello, world'), (2, 'abc'), (3, 'def')`` is parsed by the fast stream parser.
......@@ -82,7 +83,8 @@ Data types and table engines in the ``CREATE`` query are written the same way as
Synonyms
~~~~~~~~
In the SELECT query, expressions can specify synonyms using the AS keyword. Any expression is placed to the left of AS. The identifier name for the synonym is placed to the right of AS. As opposed to standard SQL, synonyms are not only declared on the top level of expressions:
.. code-block:: sql
SELECT (1 AS n) + 2, n
In contrast to standard SQL, synonyms can be used in all parts of a query, not just ``SELECT``.
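For example, an alias declared in the SELECT list can also be used in WHERE (a minimal sketch):

.. code-block:: sql

    SELECT number * 2 AS x FROM system.numbers WHERE x < 10 LIMIT 3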
......
......@@ -26,7 +26,7 @@ The maximum number of query processing threads
This parameter applies to threads that perform the same stages of the query execution pipeline in parallel.
For example, if reading from a table, evaluating expressions with functions, filtering with WHERE and pre-aggregating for GROUP BY can all be done in parallel using at least ``max_threads`` number of threads, then 'max_threads' are used.
By default, 8.
If less than one SELECT query is normally run on a server at a time, set this parameter to a value slightly less than the actual number of processor cores.
......
......@@ -6,9 +6,10 @@ To apply all the settings in a profile, set 'profile'. Example:
SET profile = 'web'
Load the 'web' profile. That is, set all the options belonging to the 'web' profile.
Settings profiles are declared in the user config file. This is normally 'users.xml'.
Example:
.. code-block:: xml
......
......@@ -3,7 +3,8 @@ system.clusters
Contains information about clusters available in the config file and the servers in them.
Columns:
..
cluster String - Cluster name.
shard_num UInt32 - Number of a shard in the cluster, starting from 1.
shard_weight UInt32 - Relative weight of a shard when writing data.
......
......@@ -3,7 +3,8 @@ system.columns
Contains information about the columns in all tables.
You can use this table to get information similar to ``DESCRIBE TABLE``, but for multiple tables at once.
..
database String - Name of the database the table is located in.
table String - Table name.
name String - Column name.
......
......@@ -4,7 +4,8 @@ system.dictionaries
Contains information about external dictionaries.
Columns:
..
name String - Dictionary name.
type String - Dictionary type: Flat, Hashed, Cache.
origin String - Path to the config file where the dictionary is described.
......
......@@ -3,7 +3,7 @@ system.functions
Contains information about normal and aggregate functions.
Columns:
::
name String - Function name.
is_aggregate UInt8 - Whether it is an aggregate function.
......@@ -3,7 +3,8 @@ system.merges
Contains information about merges currently in process for tables in the MergeTree family.
Columns:
..
database String - Name of the database the table is located in.
table String - Name of the table.
elapsed Float64 - Time in seconds since the merge started.
......
......@@ -3,7 +3,8 @@ system.parts
Contains information about parts of a table in the MergeTree family.
Columns:
..
database String - Name of the database where the table that this part belongs to is located.
table String - Name of the table that this part belongs to.
engine String - Name of the table engine, without parameters.
......
......@@ -3,7 +3,8 @@ system.processes
This system table is used for implementing the ``SHOW PROCESSLIST`` query.
Columns:
..
user String - Name of the user who made the request. For distributed query processing, this is the user who helped the requestor server send the query to this server, not the user who made the distributed request on the requestor server.
address String - The IP address the request was made from. The same for distributed processing.
......
......@@ -5,7 +5,7 @@ Contains information and status for replicated tables residing on the local serv
Example:
..
SELECT *
FROM system.replicas
......@@ -34,8 +34,8 @@ Example:
total_replicas: 2
active_replicas: 2
Columns:
..
database: Database name.
table: Table name.
engine: Table engine name.
......
......@@ -4,7 +4,8 @@ system.settings
Contains information about settings that are currently in use (i.e. used for executing the query you are using to read from the system.settings table).
Columns:
..
name String - Setting name.
value String - Setting value.
changed UInt8 - Whether the setting was explicitly defined in the config or explicitly changed.
......
......@@ -9,7 +9,8 @@ To output data for all root nodes, write path = '/'.
If the path specified in 'path' doesn't exist, an exception will be thrown.
Columns:
..
name String - Name of the node.
path String - Path to the node.
value String - Value of the node.
......@@ -27,7 +28,7 @@ Columns:
Example:
..
SELECT *
FROM system.zookeeper
......
......@@ -2,7 +2,8 @@ Buffer
------
Buffers the data to write in RAM, periodically flushing it to another table. During the read operation, data is read from the buffer and the other table simultaneously.
..
Buffer(database, table, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
Engine parameters:
......@@ -20,7 +21,8 @@ During the write operation, data is inserted to a 'num_layers' number of random
The conditions for flushing the data are calculated separately for each of the 'num_layers' buffers. For example, if num_layers = 16 and max_bytes = 100000000, the maximum RAM consumption is 1.6 GB.
Example:
.. code-block:: sql
CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10, 100, 10000, 1000000, 10000000, 100000000)
Creating a 'merge.hits_buffer' table with the same structure as 'merge.hits' and using the Buffer engine. When writing to this table, data is buffered in RAM and later written to the 'merge.hits' table. 16 buffers are created. The data in each of them is flushed if either 100 seconds have passed, or one million rows have been written, or 100 MB of data have been written; or if simultaneously 10 seconds have passed and 10,000 rows and 10 MB of data have been written. For example, if just one row has been written, after 100 seconds it will be flushed, no matter what. But if many rows have been written, the data will be flushed sooner.
......
Distributed
-----------
**The Distributed engine by itself does not store data**, but allows distributed query processing on multiple servers.
Reading is automatically parallelized. During a read, the table indexes on remote servers are used, if there are any.
The Distributed engine accepts parameters: the cluster name in the server's config file, the name of a remote database, the name of a remote table, and (optionally) a sharding key.
Example:
..
Distributed(logs, default, hits[, sharding_key])
- Data will be read from all servers in the 'logs' cluster, from the 'default.hits' table located on every server in the cluster.
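A sketch of creating such a table; the 'logs' cluster and the 'default.hits' table come from the example above, while using rand() as the sharding key is only an assumption:

.. code-block:: sql

    CREATE TABLE hits_all AS default.hits ENGINE = Distributed(logs, default, hits, rand())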
......
File(InputFormat)
-----------------
The data source is a file that stores data in one of the supported input formats (TabSeparated, Native, etc.) ...
......@@ -2,7 +2,8 @@ Join
----
A prepared data structure for JOIN that is always located in RAM.
..
Join(ANY|ALL, LEFT|INNER, k1[, k2, ...])
Engine parameters: ``ANY``|``ALL`` - strictness, and ``LEFT``|``INNER`` - the type. These parameters are set without quotes and must match the JOIN that the table will be used for. k1, k2, ... are the key columns from the USING clause that the join will be made on.
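A minimal sketch with an illustrative column set; such a table could then be used in an ``ANY LEFT JOIN ... USING (UserID)``:

.. code-block:: sql

    CREATE TABLE user_scores (UserID UInt64, Score UInt8) ENGINE = Join(ANY, LEFT, UserID)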
......
......@@ -4,7 +4,8 @@ Merge
The Merge engine (not to be confused with MergeTree) does not store data itself, but allows reading from any number of other tables simultaneously.
Reading is automatically parallelized. Writing to a table is not supported. When reading, the indexes of tables that are actually being read are used, if they exist.
The Merge engine accepts parameters: the database name and a regular expression for tables. Example:
..
Merge(hits, '^WatchLog')
- Data will be read from the tables in the 'hits' database with names that match the regex ``'^WatchLog'``.
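A sketch of such a table; the column list here is illustrative and must match the structure of the underlying WatchLog tables:

.. code-block:: sql

    CREATE TABLE WatchLog_all (EventDate Date, UserID UInt64) ENGINE = Merge(hits, '^WatchLog')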
......@@ -19,7 +20,7 @@ It is possible to create two Merge tables that will endlessly try to read each o
The typical way to use the Merge engine is for working with a large number of TinyLog tables as if with a single table.
Virtual columns
~~~~~~~~~~~~~~~
Virtual columns are columns that are provided by the table engine, regardless of the table definition. In other words, these columns are not specified in CREATE TABLE, but they are accessible for SELECT.
......
......@@ -8,11 +8,13 @@ The engine accepts parameters: the name of a Date type column containing the dat
Example without sampling support:
..
MergeTree(EventDate, (CounterID, EventDate), 8192)
Example with sampling support:
..
MergeTree(EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID)), 8192)
A MergeTree type table must have a separate column containing the date. In this example, it is the 'EventDate' column. The type of the date column must be 'Date' (not 'DateTime').
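Putting it together, a sketch of a CREATE TABLE statement using the form without sampling support (the column names are illustrative):

.. code-block:: sql

    CREATE TABLE hits (EventDate Date, CounterID UInt32, UserID UInt64) ENGINE = MergeTree(EventDate, (CounterID, EventDate), 8192)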
......
......@@ -73,7 +73,8 @@ The ``'Replicated'`` prefix is added to the table engine name. For example, ``Re
Two parameters are also added in the beginning of the parameters list - the path to the table in ZooKeeper, and the replica name in ZooKeeper.
Example:
..
ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/hits', '{replica}', EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID), EventTime), 8192)
As the example shows, these parameters can contain substitutions in curly brackets. The substituted values are taken from the 'macros' section of the config file. Example:
......@@ -145,7 +146,7 @@ An alternative recovery option is to delete information about the lost replica f
There is no restriction on network bandwidth during recovery. Keep this in mind if you are restoring many replicas at once.
Converting from MergeTree to ReplicatedMergeTree
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From here on, we use ``MergeTree`` to refer to all the table engines in the ``MergeTree`` family, including ``ReplicatedMergeTree``.
......@@ -160,7 +161,7 @@ Then run ALTER TABLE ATTACH PART on one of the replicas to add these data parts
If exactly the same parts exist on the other replicas, they are added to the working set on them. If not, the parts are downloaded from the replica that has them.
Converting from ReplicatedMergeTree to MergeTree
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a MergeTree table with a different name. Move all the data from the directory with the ReplicatedMergeTree table data to the new table's data directory. Then delete the ReplicatedMergeTree table and restart the server.
......@@ -170,6 +171,6 @@ If you want to get rid of a ReplicatedMergeTree table without launching the serv
After this, you can launch the server, create a MergeTree table, move the data to its directory, and then restart the server.
Recovery when metadata in the ZooKeeper cluster is lost or damaged
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you lost ZooKeeper, you can save data by moving it to an unreplicated table as described above.
......@@ -23,13 +23,13 @@ Summation is not performed for a read operation. If it is necessary, write the a
In addition, a table can have nested data structures that are processed in a special way.
If the name of a nested table ends in 'Map' and it contains at least two columns that meet the following criteria:
* the first column is numeric ((U)IntN, Date, DateTime); we'll refer to it as 'key',
* the other columns are arithmetic ((U)IntN, Float32/64); we'll refer to them as '(values...)',
Then this nested table is interpreted as a mapping of key => (values...), and when merging its rows, the elements of two data sets are merged by 'key' with a summation of the corresponding (values...).
Examples:
..
[(1, 100)] + [(2, 150)] -> [(1, 100), (2, 150)]
[(1, 100)] + [(1, 150)] -> [(1, 250)]
[(1, 100)] + [(1, 150), (2, 150)] -> [(1, 250), (2, 150)]
......
......@@ -115,7 +115,7 @@ Gentoo overlay: https://github.com/kmeaw/clickhouse-overlay
.. code-block:: bash
milovidov@hostname:~/work/metrica/src/dbms/src/Client$ ./clickhouse-client
ClickHouse client version 0.0.18749.
Connecting to localhost:9000.
Connected to ClickHouse server version 0.0.18749.
......