diff --git a/docs/en/14-reference/13-schemaless/13-schemaless.md b/docs/en/14-reference/13-schemaless/13-schemaless.md
index 3f75364081d7ec242d96a30f3adf0861637a06eb..aad0e63a4228ca303302d4a3970182355f750d53 100644
--- a/docs/en/14-reference/13-schemaless/13-schemaless.md
+++ b/docs/en/14-reference/13-schemaless/13-schemaless.md
@@ -3,13 +3,11 @@ title: Schemaless Writing
 description: This document describes how to use the schemaless write component of TDengine.
 ---

-In IoT applications, data is collected for many purposes such as intelligent control, business analysis, device monitoring and so on. Due to changes in business or functional requirements or changes in device hardware, the application logic and even the data collected may change. Schemaless writing automatically creates storage structures for your data as it is being written to TDengine, so that you do not need to create supertables in advance. When necessary, schemaless writing
-will automatically add the required columns to ensure that the data written by the user is stored correctly.
+In IoT applications, data is collected for many purposes such as intelligent control, business analysis, device monitoring and so on. Due to changes in business or functional requirements or changes in device hardware, the application logic and even the data collected may change. Schemaless writing automatically creates storage structures for your data as it is being written to TDengine, so that you do not need to create supertables in advance. When necessary, schemaless writing will automatically add the required columns to ensure that the data written by the user is stored correctly.

 The schemaless writing method creates super tables and their corresponding subtables. These are completely indistinguishable from the super tables and subtables created directly via SQL. You can write data directly to them via SQL statements. Note that the names of tables created by schemaless writing are based on fixed mapping rules for tag values, so they are not explicitly ideographic and they lack readability.

-Tips:
-The schemaless write will automatically create a table. You do not need to create a table manually, or an unknown error may occur.
+Note: Schemaless writing creates tables automatically. Creating tables manually is not supported with schemaless writing.

 ## Schemaless Writing Line Protocol

@@ -50,8 +48,7 @@ In the schemaless writing data line protocol, each data item in the field_set ne

 - `t`, `T`, `true`, `True`, `TRUE`, `f`, `F`, `false`, and `False` will be handled directly as BOOL types.

-For example, the following data rows write c1 column as 3 (BIGINT), c2 column as false (BOOL), c3 column
-as "passit" (BINARY), c4 column as 4 (DOUBLE), and the primary key timestamp as 1626006833639000000 to child table with the t1 label as "3" (NCHAR), the t2 label as "4" (NCHAR), and the t3 label as "t3" (NCHAR) and the super table named `st`.
+For example, the following string indicates that the one row of data is written to the st supertable with the t1 tag as "3" (NCHAR), the t2 tag as "4" (NCHAR), and the t3 tag as "t3" (NCHAR); the c1 column is 3 (BIGINT), the c2 column is false (BOOL), the c3 column is "passit" (BINARY), the c4 column is 4 (DOUBLE), and the primary key timestamp is 1626006833639000000.

 ```json
 st,t1=3,t2=4,t3=t3 c1=3i64,c3="passit",c2=false,c4=4f64 1626006833639000000
@@ -69,23 +66,31 @@ Schemaless writes process row data according to the following principles.
 "measurement,tag_key1=tag_value1,tag_key2=tag_value2"
 ```

+:::tip
 Note that tag_key1, tag_key2 are not the original order of the tags entered by the user but the result of using the tag names in ascending order of the strings. Therefore, tag_key1 is not the first tag entered in the line protocol.
-The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t_" is a fixed prefix that every table generated by this mapping relationship has.
+The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t\_" is a fixed prefix that every table generated by this mapping relationship has.
+:::
+
 You can configure smlChildTableName in taos.cfg to specify table names, for example, `smlChildTableName=tname`. You can insert `st,tname=cpul,t1=4 c1=3 1626006833639000000` and the cpu1 table will be automatically created. Note that if multiple rows have the same tname but different tag_set values, the tag_set of the first row is used to create the table and the others are ignored.

 2. If the super table obtained by parsing the line protocol does not exist, this super table is created.
+ **Important:** Manually creating supertables for schemaless writing is not supported. Schemaless writing creates appropriate supertables automatically.
+
 3. If the subtable obtained by the parse line protocol does not exist, Schemaless creates the sub-table according to the subtable name determined in steps 1 or 2.
+
 4. If the specified tag or regular column in the data row does not exist, the corresponding tag or regular column is added to the super table (only incremental).
-5. If there are some tag columns or regular columns in the super table that are not specified to take values in a data row, then the values of these columns are set to
- NULL.
+
+5. If there are some tag columns or regular columns in the super table that are not specified to take values in a data row, then the values of these columns are set to NULL.
+
 6. For BINARY or NCHAR columns, if the length of the value provided in a data row exceeds the column type limit, the maximum length of characters allowed to be stored in the column is automatically increased (only incremented and not decremented) to ensure complete preservation of the data.
+
 7. Errors encountered throughout the processing will interrupt the writing process and return an error code.
-8. It is assumed that the order of field_set in a supertable is consistent, meaning that the first record contains all fields and subsequent records store fields in the same order. If the order is not consistent, set smlDataFormat in taos.cfg to false. Otherwise, data will be written out of order and a database error will occur.(smlDataFormat in taos.cfg default to false after version of 3.0.1.3, discarded since 3.0.3.0)
-:::tip
-All processing logic of schemaless will still follow TDengine's underlying restrictions on data structures, such as the total length of each row of data cannot exceed
-48KB, and the total length of tag value cannot exceed 16KB. See [TDengine SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
+8. It is assumed that the order of field_set in a supertable is consistent, meaning that the first record contains all fields and subsequent records store fields in the same order. If the order is not consistent, set smlDataFormat in taos.cfg to false. Otherwise, data will be written out of order and a database error will occur.
+ Note: TDengine 3.0.3.0 and later automatically detect whether order is consistent. This parameter is no longer used.
+:::tip
+All processing logic of schemaless will still follow TDengine's underlying restrictions on data structures, such as the total length of each row of data cannot exceed 48 KB and the total length of a tag value cannot exceed 16 KB. See [TDengine SQL Boundary Limits](/taos-sql/limit) for specific constraints in this area.
 :::

 ## Time resolution recognition
@@ -114,8 +119,7 @@ In OpenTSDB file and JSON protocol modes, the precision of the timestamp is dete

 ## Data Model Mapping

-This section describes how data in line protocol is mapped to a schema. The data measurement in each line is mapped to a
-supertable name. The tag name in tag_set is the tag name in the schema, and the name in field_set is the column name in the schema. The following example shows how data is mapped:
+This section describes how data in InfluxDB line protocol is mapped to a schema. The data measurement in each line is mapped to a supertable name. The tag name in tag_set is the tag name in the schema, and the name in field_set is the column name in the schema. The following example shows how data is mapped:

 ```json
 st,t1=3,t2=4,t3=t3 c1=3i64,c3="passit",c2=false,c4=4f64 1626006833639000000
@@ -160,7 +164,7 @@ The preceding data includes a new entry, c6, with type binary(6). When this occu

 TDengine guarantees the idempotency of data writes. This means that you can repeatedly call the API to perform write operations with bad data. However, TDengine does not guarantee the atomicity of multi-row writes. In a multi-row write, some data may be written successfully and other data unsuccessfully.

-##: Error Codes
+## Error Codes

 The TSDB_CODE_TSC_LINE_SYNTAX_ERROR indicates an error in the schemaless writing component. This error occurs when writing text. For other errors, schemaless writing uses the standard TDengine error codes
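
Reviewer note on the subtable naming rule in the `@@ -69,23 +66,31 @@` hunk: the rule sorts tags by name, takes the MD5 of the combined string, and adds a `t_` prefix. A minimal Python sketch of that rule follows. The function name `sml_child_table_name` is hypothetical, and the UTF-8 encoding, the lowercase hex digest, and the exact separator layout are assumptions, since the text only says "MD5 hash value". Names produced here may therefore not match the ones TDengine generates.

```python
import hashlib


def sml_child_table_name(measurement, tags):
    """Hypothetical helper: derive a subtable name from a measurement and its tags."""
    # Tags are ordered by tag name (ascending string order), not by input order.
    ordered = sorted(tags.items())
    # Build "measurement,tag_key1=tag_value1,tag_key2=tag_value2".
    key = ",".join([measurement] + [f"{name}={value}" for name, value in ordered])
    # Assumption: hash the UTF-8 bytes and use the lowercase hex digest.
    md5_val = hashlib.md5(key.encode("utf-8")).hexdigest()
    # "t_" is the fixed prefix described in the documentation.
    return "t_" + md5_val


# The same tags in a different input order map to the same name.
a = sml_child_table_name("st", {"t1": "3", "t2": "4", "t3": "t3"})
b = sml_child_table_name("st", {"t3": "t3", "t1": "3", "t2": "4"})
print(a == b)
```

The sketch only illustrates why the tip says that tag order in the input line does not affect the generated table name. It is not a way to predict real table names.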