From 3cfdbf2f0cef54a06c41c418985ca9db935f5c86 Mon Sep 17 00:00:00 2001
From: gccgdb1234
Date: Fri, 19 May 2023 08:18:29 +0800
Subject: [PATCH] doc: refine Kafka source connector configuration parameters

---
 docs/en/20-third-party/11-kafka.md | 13 ++++++++-----
 docs/zh/20-third-party/11-kafka.md |  8 +++++---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/docs/en/20-third-party/11-kafka.md b/docs/en/20-third-party/11-kafka.md
index 3b0de6c349..71d8c41173 100644
--- a/docs/en/20-third-party/11-kafka.md
+++ b/docs/en/20-third-party/11-kafka.md
@@ -424,11 +424,14 @@ The following configuration items apply to TDengine Sink Connector and TDengine

 ### TDengine Source Connector specific configuration

 1. `connection.database`: source database name, no default value.
-2. `topic.prefix`: topic name prefix after data is imported into kafka. Use `topic.prefix` + `connection.database` name as the full topic name. Defaults to the empty string "".
-3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. Default "1970-01-01 00:00:00".
-4. `poll.interval.ms`: Pull data interval, the unit is ms. Default is 1000.
-5. `fetch.max.rows`: The maximum number of rows retrieved when retrieving the database. Default is 100.
-6. `out.format`: The data format. The value could be line or json. The line represents the InfluxDB Line protocol format, and json represents the OpenTSDB JSON format. Default is `line`.
+2. `topic.prefix`: topic name prefix used when importing data into Kafka. Its default value is the empty string "".
+3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. If it is not set, importing data into Kafka starts from the first/oldest row in the database.
+4. `poll.interval.ms`: The time interval, in ms, for checking for newly created or removed tables. The default value is 1000.
+5. `fetch.max.rows`: The maximum number of rows retrieved when querying the database. The default is 100.
+6. `out.format`: The data format. The value could be `line`, which represents the InfluxDB Line protocol format.
+7. `query.interval.ms`: The time range of data read from TDengine in each query, in milliseconds. It should be adjusted according to the data inflow rate. The default value is 1000.
+8. `topic.per.stable`: If set to true, one super table in TDengine corresponds to one topic in Kafka, and the topic naming rule is `--`; if set to false, the whole DB corresponds to one topic in Kafka, and the topic naming rule is `-`.
+

 ## Other notes

diff --git a/docs/zh/20-third-party/11-kafka.md b/docs/zh/20-third-party/11-kafka.md
index 75d8deebb1..2470bf7c9a 100644
--- a/docs/zh/20-third-party/11-kafka.md
+++ b/docs/zh/20-third-party/11-kafka.md
@@ -435,10 +435,12 @@ confluent local services connect connector unload TDengineSourceConnector

 1. `connection.database`: source database name, no default value.
 2. `topic.prefix`: topic name prefix after data is imported into kafka. Use `topic.prefix` + `connection.database` name as the full topic name. Defaults to the empty string "".
-3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. Defaults to "1970-01-01 00:00:00".
-4. `poll.interval.ms`: Pull data interval, in ms. Defaults to 1000.
+3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. If not specified, synchronization starts from the earliest record in the specified DB.
+4. `poll.interval.ms`: The time interval, in ms, for checking whether tables have been created or deleted. Defaults to 1000.
 5. `fetch.max.rows`: The maximum number of rows retrieved when querying the database. Defaults to 100.
-6. `out.format`: The data format. The value is line or json. line represents the InfluxDB Line protocol format, and json represents the OpenTSDB JSON format. Defaults to line.
+6. `out.format`: The data format. The value is `line`, which represents the InfluxDB Line protocol format.
+7. `query.interval.ms`: The time span of data read from TDengine in a single query. Configure it according to the characteristics of the data in the tables so that a single query returns neither too much nor too little data; in a specific environment, it is recommended to find a good value through testing. The default value is 1000.
+8. `topic.per.stable`: If set to true, one super table corresponds to one Kafka topic, and the topic naming rule is `--`; if set to false, all data in the specified DB goes into one Kafka topic, and the topic naming rule is `-`.

 ## Other notes

--
GitLab
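
Reviewer note: the source connector parameters documented in this patch could be collected as in the rough Python sketch below. It is only an illustration, not part of the connector. The helper name `build_source_config` and the choice to store every value as a string are assumptions. Only the keys and defaults stated in the parameter list are used, and `topic.per.stable` is left unset because the patch gives no default for it. A real connector properties file needs further entries that this patch does not cover.

```python
from datetime import datetime
from typing import Dict, Optional

# Defaults taken from the parameter list in the patch.
# connection.database has no default; topic.per.stable has no stated default.
SOURCE_DEFAULTS: Dict[str, str] = {
    "topic.prefix": "",
    "poll.interval.ms": "1000",
    "fetch.max.rows": "100",
    "out.format": "line",  # the only value listed after this patch
    "query.interval.ms": "1000",
}

KNOWN_KEYS = set(SOURCE_DEFAULTS) | {
    "connection.database",
    "timestamp.initial",
    "topic.per.stable",
}


def build_source_config(
    database: str,
    timestamp_initial: Optional[str] = None,
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: merge documented defaults with user settings."""
    if not database:
        raise ValueError("connection.database has no default and must be set")

    config = dict(SOURCE_DEFAULTS)
    config["connection.database"] = database

    if timestamp_initial is not None:
        # Documented format is 'yyyy-MM-dd HH:mm:ss'; strptime raises
        # ValueError if the string does not parse with the equivalent pattern.
        datetime.strptime(timestamp_initial, "%Y-%m-%d %H:%M:%S")
        config["timestamp.initial"] = timestamp_initial
    # If timestamp.initial is omitted, the documented behaviour is to start
    # from the oldest row in the database, so nothing is written here.

    for key, value in (overrides or {}).items():
        if key not in KNOWN_KEYS:
            raise ValueError(f"not a documented source connector key: {key}")
        config[key] = value

    return config


if __name__ == "__main__":
    cfg = build_source_config(
        "test",
        timestamp_initial="2023-05-19 08:18:29",
        overrides={"topic.per.stable": "true", "fetch.max.rows": "500"},
    )
    for key in sorted(cfg):
        print(f"{key}={cfg[key]}")
```

Rejecting undocumented keys is a design choice of the sketch, meant to catch misspelled parameter names early; it says nothing about how the connector itself treats unknown properties.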