From 3cfdbf2f0cef54a06c41c418985ca9db935f5c86 Mon Sep 17 00:00:00 2001
From: gccgdb1234 <wxzhang@taosdata.com>
Date: Fri, 19 May 2023 08:18:29 +0800
Subject: [PATCH] doc: refine Kafka source connector configuration parameters

---
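Reviewer note (not part of the commit): the topic naming rule documented for
`topic.per.stable` can be sketched as below. This is only an illustration of
the rule as written in the docs, not the connector's actual code. The function
name `topic_name` and its arguments are hypothetical, and how the connector
treats an empty `topic.prefix` is not stated in the docs, so the sketch simply
applies the pattern literally.

```python
from typing import Optional


def topic_name(prefix: str, database: str,
               stable: Optional[str] = None,
               per_stable: bool = False) -> str:
    """Build a Kafka topic name following the documented naming rule.

    Hypothetical helper: it mirrors the pattern in the docs and nothing more.
    """
    if per_stable:
        # topic.per.stable=true: one topic per super table
        if stable is None:
            raise ValueError("a super table name is required when per_stable is True")
        return f"{prefix}-{database}-{stable}"
    # topic.per.stable=false: one topic for the whole database
    return f"{prefix}-{database}"
```

For example, with the assumed values `topic.prefix=tdengine` and
`connection.database=power`, a super table named `meters` maps to
`tdengine-power-meters` when `topic.per.stable` is true, and all data goes to
`tdengine-power` when it is false.
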
 docs/en/20-third-party/11-kafka.md | 13 ++++++++-----
 docs/zh/20-third-party/11-kafka.md |  8 +++++---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/docs/en/20-third-party/11-kafka.md b/docs/en/20-third-party/11-kafka.md
index 3b0de6c349..71d8c41173 100644
--- a/docs/en/20-third-party/11-kafka.md
+++ b/docs/en/20-third-party/11-kafka.md
@@ -424,11 +424,14 @@ The following configuration items apply to TDengine Sink Connector and TDengine
 ### TDengine Source Connector specific configuration
 
 1. `connection.database`: source database name, no default value.
-2. `topic.prefix`: topic name prefix after data is imported into kafka. Use `topic.prefix` + `connection.database` name as the full topic name. Defaults to the empty string "".
-3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. Default "1970-01-01 00:00:00".
-4. `poll.interval.ms`: Pull data interval, the unit is ms. Default is 1000.
-5. `fetch.max.rows`: The maximum number of rows retrieved when retrieving the database. Default is 100.
-6. `out.format`: The data format. The value could be line or json. The line represents the InfluxDB Line protocol format, and json represents the OpenTSDB JSON format. Default is `line`.
+2. `topic.prefix`: topic name prefix used when importing data into Kafka. The default value is the empty string "".
+3. `timestamp.initial`: Data synchronization start time. The format is 'yyyy-MM-dd HH:mm:ss'. If it is not set, importing into Kafka starts from the oldest row in the database.
+4. `poll.interval.ms`: The interval, in milliseconds, at which the connector checks for newly created or removed tables. The default value is 1000.
+5. `fetch.max.rows`: The maximum number of rows retrieved when querying the database. The default value is 100.
+6. `out.format`: The data format. The value is `line`, which represents the InfluxDB Line protocol format.
+7. `query.interval.ms`: The time range of data read from TDengine in each query, in milliseconds. Adjust it according to the data inflow rate. The default value is 1000.
+8. `topic.per.stable`: If set to true, each super table in TDengine corresponds to one Kafka topic, named `<topic.prefix>-<connection.database>-<stable.name>`. If set to false, the whole database corresponds to one Kafka topic, named `<topic.prefix>-<connection.database>`.
+
 
 
 ## Other notes
diff --git a/docs/zh/20-third-party/11-kafka.md b/docs/zh/20-third-party/11-kafka.md
index 75d8deebb1..2470bf7c9a 100644
--- a/docs/zh/20-third-party/11-kafka.md
+++ b/docs/zh/20-third-party/11-kafka.md
@@ -435,10 +435,12 @@ confluent local services connect connector unload TDengineSourceConnector
 
 1. `connection.database`: 源数据库名称，无缺省值。
 2. `topic.prefix`： 数据导入 kafka 后 topic 名称前缀。 使用 `topic.prefix` + `connection.database` 名称作为完整 topic 名。默认为空字符串 ""。
-3. `timestamp.initial`: 数据同步起始时间。格式为'yyyy-MM-dd HH:mm:ss'。默认为 "1970-01-01 00:00:00"。
-4. `poll.interval.ms`: 拉取数据间隔，单位为 ms。默认为 1000。
+3. `timestamp.initial`: 数据同步起始时间。格式为'yyyy-MM-dd HH:mm:ss'，若未指定则从指定 DB 中最早的一条记录开始。
+4. `poll.interval.ms`: 检查是否有新建或删除的表的时间间隔，单位为 ms。默认为 1000。
 5. `fetch.max.rows` : 检索数据库时最大检索条数。 默认为 100。
-6. `out.format`: 数据格式。取值 line 或 json。line 表示 InfluxDB Line 协议格式， json 表示 OpenTSDB JSON 格式。默认为 line。
+6. `out.format`: 数据格式。取值为 `line`，表示 InfluxDB Line 协议格式。
+7. `query.interval.ms`: 从 TDengine 一次读取数据的时间跨度，单位为 ms。需要根据表中的数据特征合理配置，避免一次查询的数据量过大或过小；在具体的环境中建议通过测试设置一个较优值，默认值为 1000。
+8. `topic.per.stable`: 如果设置为 true，表示一个超级表对应一个 Kafka topic，topic 的命名规则为 `<topic.prefix>-<connection.database>-<stable.name>`；如果设置为 false，则指定的 DB 中的所有数据进入一个 Kafka topic，topic 的命名规则为 `<topic.prefix>-<connection.database>`。
 
 ## 其他说明
 
-- 
GitLab