Unverified commit 0357c118, authored by BayoNet, committed by GitHub

DOCAPI-7129 Nested JSON in JSONEachRow description + date_time_input_format (#5889)

* The input_format_import_nested_json and date_time_input_format settings description.
* Usage of Nested Structures with JSONEachRow.
Parent 70b0c315
......@@ -323,7 +323,7 @@ When using this format, ClickHouse outputs rows as separated, newline-delimited
```json
{"SearchPhrase":"curtain designs","count()":"1064"}
{"SearchPhrase":"baku","count()":"1000"}
-{"SearchPhrase":"","count":"8267016"}
+{"SearchPhrase":"","count()":"8267016"}
```
When inserting the data, you should provide a separate JSON object for each row.
......@@ -386,6 +386,60 @@ Unlike the [JSON](#json) format, there is no substitution of invalid UTF-8 seque
!!! note "Note"
Any set of bytes can be output in the strings. Use the `JSONEachRow` format if you are sure that the data in the table can be formatted as JSON without losing any information.
### Usage of Nested Structures {#jsoneachrow-nested}
If a table has columns of the [Nested](../data_types/nested_data_structures/nested.md) data type, you can insert JSON data with the same structure. Enable this functionality with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
For example, consider the following table:
```sql
CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory
```
As described in the `Nested` data type documentation, ClickHouse treats each component of the nested structure as a separate column: `n.s` and `n.i` in this table. You can therefore insert data as follows:
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}
```
To insert data as a hierarchical JSON object, set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
```json
{
"n": {
"s": ["abc", "def"],
"i": [1, 23]
}
}
```
Without this setting, ClickHouse throws an exception.
```sql
SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'
```
```text
┌─name────────────────────────────┬─value─┐
│ input_format_import_nested_json │ 0 │
└─────────────────────────────────┴───────┘
```
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
```
```text
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: n: (at row 1)
```
```sql
SET input_format_import_nested_json=1
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
SELECT * FROM json_each_row_nested
```
```text
┌─n.s───────────┬─n.i────┐
│ ['abc','def'] │ [1,23] │
└───────────────┴────────┘
```
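The relationship between the hierarchical object and the flat `n.s` / `n.i` form can be modeled outside ClickHouse. The following is a minimal Python sketch, not ClickHouse's own implementation; the helper name `flatten_nested` is hypothetical. It rewrites a hierarchical row into the dotted-key row that `JSONEachRow` accepts without the setting.

```python
import json


def flatten_nested(row, sep="."):
    """Turn {"n": {"s": [...], "i": [...]}} into {"n.s": [...], "n.i": [...]}.

    Hypothetical helper for illustration only.
    """
    flat = {}
    for key, value in row.items():
        if isinstance(value, dict):
            # Each component of a nested object becomes its own dotted column.
            for sub_key, sub_value in flatten_nested(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


hierarchical = json.loads('{"n": {"s": ["abc", "def"], "i": [1, 23]}}')
flat = flatten_nested(hierarchical)
print(json.dumps(flat))  # {"n.s": ["abc", "def"], "n.i": [1, 23]}
```

A client that cannot change server settings could apply such a transformation before sending rows; with `input_format_import_nested_json=1` the hierarchical form can be sent directly.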
## Native {#native}
The most efficient format. Data is written and read by blocks in binary format. For each block, the number of rows, number of columns, column names and types, and parts of columns in this block are recorded one after another. In other words, this format is "columnar" – it doesn't convert columns to rows. This is the format used in the native interface for interaction between servers, for using the command-line client, and for C++ clients.
......
......@@ -231,6 +231,25 @@ Possible values:
Default value: 0.
## input_format_import_nested_json {#settings-input_format_import_nested_json}
Enables or disables the insertion of JSON data with nested objects.
Supported formats:
- [JSONEachRow](../../interfaces/formats.md#jsoneachrow)
Possible values:
- 0 — Disabled.
- 1 — Enabled.
Default value: 0.
**See Also**
- [Usage of Nested Structures](../../interfaces/formats.md#jsoneachrow-nested) with the `JSONEachRow` format.
## input_format_with_names_use_header {#settings-input_format_with_names_use_header}
Enables or disables checking the column order when inserting data.
......@@ -249,6 +268,27 @@ Possible values:
Default value: 1.
## date_time_input_format {#settings-date_time_input_format}
Enables or disables extended parsing of date and time formatted strings.
The setting doesn't apply to [date and time functions](../../query_language/functions/date_time_functions.md).
Possible values:
- `'best_effort'` — Enables extended parsing.
ClickHouse can parse the basic format `YYYY-MM-DD HH:MM:SS` and all the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date and time formats. For example, `'2018-06-08T01:02:03.000Z'`.
- `'basic'` — Use basic parser.
ClickHouse can parse only the basic format.
**See Also**
- [DateTime data type.](../../data_types/datetime.md)
- [Functions for working with dates and times.](../../query_language/functions/date_time_functions.md)
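The difference between the two values can be illustrated with a small Python model. This is only a sketch of the idea, not ClickHouse's parser: the function names are hypothetical, the "best effort" variant covers just two ISO 8601 layouts, and the trailing `Z` is matched as a literal character without time zone handling.

```python
from datetime import datetime

BASIC_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_basic(text):
    """Accept only YYYY-MM-DD HH:MM:SS; raise ValueError for anything else."""
    return datetime.strptime(text, BASIC_FORMAT)


def parse_best_effort(text):
    """Try the basic format first, then a small illustrative subset of ISO 8601."""
    for fmt in (BASIC_FORMAT, "%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date-time: {text!r}")


print(parse_best_effort("2018-06-08 01:02:03"))
print(parse_best_effort("2018-06-08T01:02:03.000Z"))
```

In this model, `parse_basic("2018-06-08T01:02:03.000Z")` raises `ValueError`, which mirrors the `'basic'` value accepting only the basic format.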
## join_default_strictness {#settings-join_default_strictness}
Sets default strictness for [JOIN clauses](../../query_language/select.md#select-join).
......
......@@ -58,11 +58,10 @@ arrayConcat(arrays)
- `arrays` – Arbitrary number of arguments of [Array](../../data_types/array.md) type.
**Example**
-``` sql
+```sql
SELECT arrayConcat([1, 2], [3, 4], [5, 6]) AS res
```
-```
+```text
┌─res───────────┐
│ [1,2,3,4,5,6] │
└───────────────┘
......@@ -204,7 +203,7 @@ Returns the array \[1, 2, 3, ..., length (arr) \]
This function is normally used with ARRAY JOIN. It allows counting something just once for each array after applying ARRAY JOIN. Example:
-``` sql
+```sql
SELECT
count() AS Reaches,
countIf(num = 1) AS Hits
......@@ -215,8 +214,7 @@ ARRAY JOIN
WHERE CounterID = 160656
LIMIT 10
```
-```
+```text
┌─Reaches─┬──Hits─┐
│ 95606 │ 31406 │
└─────────┴───────┘
......@@ -224,15 +222,14 @@ LIMIT 10
In this example, Reaches is the number of conversions (the rows produced by applying ARRAY JOIN), and Hits is the number of pageviews (the rows before ARRAY JOIN). In this particular case, you can get the same result in an easier way:
-``` sql
+```sql
SELECT
sum(length(GoalsReached)) AS Reaches,
count() AS Hits
FROM test.hits
WHERE (CounterID = 160656) AND notEmpty(GoalsReached)
```
-```
+```text
┌─Reaches─┬──Hits─┐
│ 95606 │ 31406 │
└─────────┴───────┘
......@@ -248,7 +245,7 @@ For example: arrayEnumerateUniq(\[10, 20, 10, 30\]) = \[1, 1, 2, 1\].
This function is useful when using ARRAY JOIN and aggregation of array elements.
Example:
-``` sql
+```sql
SELECT
Goals.ID AS GoalID,
sum(Sign) AS Reaches,
......@@ -262,8 +259,7 @@ GROUP BY GoalID
ORDER BY Reaches DESC
LIMIT 10
```
-```
+```text
┌──GoalID─┬─Reaches─┬─Visits─┐
│ 53225 │ 3214 │ 1097 │
│ 2825062 │ 3188 │ 1097 │
......@@ -282,11 +278,10 @@ In this example, each goal ID has a calculation of the number of conversions (ea
The arrayEnumerateUniq function can take multiple arrays of the same size as arguments. In this case, uniqueness is considered for tuples of elements in the same positions in all the arrays.
-``` sql
+```sql
SELECT arrayEnumerateUniq([1, 1, 1, 2, 2, 2], [1, 1, 2, 1, 1, 2]) AS res
```
-```
+```text
┌─res───────────┐
│ [1,2,1,1,2,1] │
└───────────────┘
......@@ -308,11 +303,10 @@ arrayPopBack(array)
**Example**
-``` sql
+```sql
SELECT arrayPopBack([1, 2, 3]) AS res
```
-```
+```text
┌─res───┐
│ [1,2] │
└───────┘
......@@ -332,11 +326,10 @@ arrayPopFront(array)
**Example**
-``` sql
+```sql
SELECT arrayPopFront([1, 2, 3]) AS res
```
-```
+```text
┌─res───┐
│ [2,3] │
└───────┘
......@@ -357,11 +350,10 @@ arrayPushBack(array, single_value)
**Example**
-``` sql
+```sql
SELECT arrayPushBack(['a'], 'b') AS res
```
-```
+```text
┌─res───────┐
│ ['a','b'] │
└───────────┘
......@@ -382,11 +374,10 @@ arrayPushFront(array, single_value)
**Example**
-``` sql
+```sql
SELECT arrayPushFront(['b'], 'a') AS res
```
-```
+```text
┌─res───────┐
│ ['a','b'] │
└───────────┘
......@@ -446,11 +437,10 @@ arraySlice(array, offset[, length])
**Example**
-``` sql
+```sql
SELECT arraySlice([1, 2, NULL, 4, 5], 2, 3) AS res
```
-```
+```text
┌─res────────┐
│ [2,NULL,4] │
└────────────┘
......@@ -464,10 +454,10 @@ Sorts the elements of the `arr` array in ascending order. If the `func` function
Example of integer values sorting:
-``` sql
+```sql
SELECT arraySort([1, 3, 3, 0]);
```
-```
+```text
┌─arraySort([1, 3, 3, 0])─┐
│ [0,1,3,3] │
└─────────────────────────┘
......@@ -475,10 +465,10 @@ SELECT arraySort([1, 3, 3, 0]);
Example of string values sorting:
-``` sql
+```sql
SELECT arraySort(['hello', 'world', '!']);
```
-```
+```text
┌─arraySort(['hello', 'world', '!'])─┐
│ ['!','hello','world'] │
└────────────────────────────────────┘
......@@ -486,10 +476,10 @@ SELECT arraySort(['hello', 'world', '!']);
Consider the following sorting order for the `NULL`, `NaN` and `Inf` values:
-``` sql
+```sql
SELECT arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]);
```
-```
+```text
┌─arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf])─┐
│ [-inf,-4,1,2,3,inf,nan,nan,NULL,NULL] │
└───────────────────────────────────────────────────────────┘
......@@ -504,10 +494,10 @@ Note that `arraySort` is a [higher-order function](higher_order_functions.md). Y
Let's consider the following example:
-``` sql
+```sql
SELECT arraySort((x) -> -x, [1, 2, 3]) as res;
```
-```
+```text
┌─res─────┐
│ [3,2,1] │
└─────────┘
......@@ -517,11 +507,10 @@ For each element of the source array, the lambda function returns the sorting ke
The lambda function can accept multiple arguments. In this case, you need to pass the `arraySort` function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:
-``` sql
+```sql
SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]) as res;
```
-```
+```text
┌─res───────────────┐
│ ['world','hello'] │
└───────────────────┘
......@@ -531,19 +520,19 @@ Here, the elements that are passed in the second array ([2, 1]) define a sorting
Other examples are shown below.
-``` sql
+```sql
SELECT arraySort((x, y) -> y, [0, 1, 2], ['c', 'b', 'a']) as res;
```
-``` sql
+```text
┌─res─────┐
│ [2,1,0] │
└─────────┘
```
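The way the extra arrays act as sorting keys can be reproduced with a short Python model. This is a sketch under the assumption that the lambda result for each position is the sort key and that only elements of the first array are returned; `array_sort` is a hypothetical name, and the model does not cover `NULL`, `NaN`, or `Inf` ordering.

```python
def array_sort(func, *arrays):
    """Return the first array ordered by func applied position-wise to all arrays."""
    # One key per position, computed from the elements at that position.
    keys = [func(*items) for items in zip(*arrays)]
    # Sort positions by key (Python's sort is stable), then pick from the first array.
    order = sorted(range(len(keys)), key=keys.__getitem__)
    return [arrays[0][i] for i in order]


print(array_sort(lambda x: -x, [1, 2, 3]))                        # [3, 2, 1]
print(array_sort(lambda x, y: y, ["hello", "world"], [2, 1]))     # ['world', 'hello']
print(array_sort(lambda x, y: y, [0, 1, 2], ["c", "b", "a"]))     # [2, 1, 0]
```

The three calls mirror the `arraySort` queries shown in this section and give the same results.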
-``` sql
+```sql
SELECT arraySort((x, y) -> -y, [0, 1, 2], [1, 2, 3]) as res;
```
-``` sql
+```text
┌─res─────┐
│ [2,1,0] │
└─────────┘
......@@ -558,10 +547,10 @@ Sorts the elements of the `arr` array in descending order. If the `func` functio
Example of integer values sorting:
-``` sql
+```sql
SELECT arrayReverseSort([1, 3, 3, 0]);
```
-```
+```text
┌─arrayReverseSort([1, 3, 3, 0])─┐
│ [3,3,1,0] │
└────────────────────────────────┘
......@@ -569,10 +558,10 @@ SELECT arrayReverseSort([1, 3, 3, 0]);
Example of string values sorting:
-``` sql
+```sql
SELECT arrayReverseSort(['hello', 'world', '!']);
```
-```
+```text
┌─arrayReverseSort(['hello', 'world', '!'])─┐
│ ['world','hello','!'] │
└───────────────────────────────────────────┘
......@@ -580,10 +569,10 @@ SELECT arrayReverseSort(['hello', 'world', '!']);
Consider the following sorting order for the `NULL`, `NaN` and `Inf` values:
-``` sql
+```sql
SELECT arrayReverseSort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]) as res;
```
-``` sql
+```text
┌─res───────────────────────────────────┐
│ [inf,3,2,1,-4,-inf,nan,nan,NULL,NULL] │
└───────────────────────────────────────┘
......@@ -596,10 +585,10 @@ SELECT arrayReverseSort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]) as res;
Note that the `arrayReverseSort` is a [higher-order function](higher_order_functions.md). You can pass a lambda function to it as the first argument. Example is shown below.
-``` sql
+```sql
SELECT arrayReverseSort((x) -> -x, [1, 2, 3]) as res;
```
-```
+```text
┌─res─────┐
│ [1,2,3] │
└─────────┘
......@@ -612,10 +601,10 @@ The array is sorted in the following way:
The lambda function can accept multiple arguments. In this case, you need to pass the `arrayReverseSort` function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:
-``` sql
+```sql
SELECT arrayReverseSort((x, y) -> y, ['hello', 'world'], [2, 1]) as res;
```
-``` sql
+```text
┌─res───────────────┐
│ ['hello','world'] │
└───────────────────┘
......@@ -628,18 +617,18 @@ In this example, the array is sorted in the following way:
Other examples are shown below.
-``` sql
+```sql
SELECT arrayReverseSort((x, y) -> y, [4, 3, 5], ['a', 'b', 'c']) AS res;
```
-``` sql
+```text
┌─res─────┐
│ [5,3,4] │
└─────────┘
```
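The `arrayReverseSort` results in this section are consistent with a two-step model: sort by the lambda result in ascending order, then reverse the sorted array. The Python sketch below assumes exactly that model; `array_reverse_sort` is a hypothetical name, and `NULL`, `NaN`, and `Inf` handling is not modeled.

```python
def array_reverse_sort(func, *arrays):
    """Sort the first array by func applied position-wise, then reverse the result."""
    keys = [func(*items) for items in zip(*arrays)]
    # Step 1: ascending order of positions by key.
    order = sorted(range(len(keys)), key=keys.__getitem__)
    # Step 2: reverse the sorted sequence.
    return [arrays[0][i] for i in reversed(order)]


print(array_reverse_sort(lambda x: -x, [1, 2, 3]))                     # [1, 2, 3]
print(array_reverse_sort(lambda x, y: y, [4, 3, 5], ["a", "b", "c"]))  # [5, 3, 4]
print(array_reverse_sort(lambda x, y: -y, [4, 3, 5], [1, 2, 3]))       # [4, 3, 5]
```

Because the reversal happens after sorting by key, a key function such as `-x` cancels the reversal, which is why `[1, 2, 3]` comes back unchanged in the first call.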
-``` sql
+```sql
SELECT arrayReverseSort((x, y) -> -y, [4, 3, 5], [1, 2, 3]) AS res;
```
-``` sql
+```text
┌─res─────┐
│ [4,3,5] │
└─────────┘
......
......@@ -328,6 +328,60 @@ JSON با جاوااسکریپت سازگار است. برای اطمینان ا
For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted; they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults. Whitespace between elements is ignored. If a comma is placed after the objects, it is ignored. Objects don't necessarily have to be separated by new lines.
### Usage of Nested Structures {#jsoneachrow-nested}
If a table has columns of the [Nested](../data_types/nested_data_structures/nested.md) data type, you can insert JSON data with the same structure. Enable this functionality with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
For example, consider the following table:
```sql
CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory
```
As described in the `Nested` data type documentation, ClickHouse treats each component of the nested structure as a separate column: `n.s` and `n.i` in this table. You can therefore insert data as follows:
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}
```
To insert data as a hierarchical JSON object, set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
```json
{
"n": {
"s": ["abc", "def"],
"i": [1, 23]
}
}
```
Without this setting, ClickHouse throws an exception.
```sql
SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'
```
```text
┌─name────────────────────────────┬─value─┐
│ input_format_import_nested_json │ 0 │
└─────────────────────────────────┴───────┘
```
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
```
```text
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: n: (at row 1)
```
```sql
SET input_format_import_nested_json=1
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
SELECT * FROM json_each_row_nested
```
```text
┌─n.s───────────┬─n.i────┐
│ ['abc','def'] │ [1,23] │
└───────────────┴────────┘
```
## Native
The most efficient format. Data is written and read by blocks in binary format. For each block, the number of rows, number of columns, column names and types, and parts of columns in this block are recorded one after another. In other words, this format is "columnar": it doesn't convert columns to rows. This format is used in the native interface, between the server and the command-line client, and for C++ clients.
......
......@@ -327,6 +327,60 @@ ClickHouse 支持 [NULL](../query_language/syntax.md), 在 JSON 格式中以 `nu
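For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted; they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults. Whitespace between elements is ignored. If a comma is placed after the objects, it is ignored. Objects don't necessarily have to be separated by new lines.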
### Usage of Nested Structures {#jsoneachrow-nested}
If a table has columns of the [Nested](../data_types/nested_data_structures/nested.md) data type, you can insert JSON data with the same structure. Enable this functionality with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
For example, consider the following table:
```sql
CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory
```
As described in the `Nested` data type documentation, ClickHouse treats each component of the nested structure as a separate column: `n.s` and `n.i` in this table. You can therefore insert data as follows:
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}
```
To insert data as a hierarchical JSON object, set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
```json
{
"n": {
"s": ["abc", "def"],
"i": [1, 23]
}
}
```
Without this setting, ClickHouse throws an exception.
```sql
SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'
```
```text
┌─name────────────────────────────┬─value─┐
│ input_format_import_nested_json │ 0 │
└─────────────────────────────────┴───────┘
```
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
```
```text
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: n: (at row 1)
```
```sql
SET input_format_import_nested_json=1
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
SELECT * FROM json_each_row_nested
```
```text
┌─n.s───────────┬─n.i────┐
│ ['abc','def'] │ [1,23] │
└───────────────┴────────┘
```
## Native {#native}
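The most efficient format. Data is written and read by blocks in binary format. For each block, the number of rows, number of columns, column names and types, and parts of columns in this block are recorded one after another. In other words, this format is "columnar": it doesn't convert columns to rows. This is the format used in the native interface for interaction between servers, for using the command-line client, and for C++ clients.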
......