[skip ci] Update flush doc (#8261)

Signed-off-by: N yudong.cai <yudong.cai@zilliz.com>

[skip ci] Update flush doc (#8261)
Signed-off-by: N yudong.cai <yudong.cai@zilliz.com>
6e664429 · Cai Yudong · GitHub · b2f6b2df · 6e664429
隐藏空白更改
内联并排

Showing with 46 addition and 46 deletion

docs/design_docs/milvus_flush_collections_en.md docs/design_docs/milvus_flush_collections_en.md +46 -46

未找到文件。
--- a/docs/design_docs/milvus_flush_collections_en.md
+++ b/docs/design_docs/milvus_flush_collections_en.md
 # Flush Collection 
-`Flush` operation is used to make sure that the data has been writen into the persistent storage, this document introduce how `Flush` operation works in `Milvus 2.0`. The following figure shows the execution flow of `Flush`
+`Flush` operation is used to make sure that inserted data will be written into persistent storage. This document will introduce how `Flush` operation works in `Milvus 2.0`. The following figure shows the execution flow of `Flush`.

 ![flush_collections](./graphs/flush_data_coord.png)

@@ -7,9 +7,7 @@
 ```proto
 service MilvusService {
  ...
-
  rpc Flush(FlushRequest) returns (FlushResponse) {}
-
  ...
 }

@@ -27,7 +25,7 @@ message FlushResponse{
 ```


-2. When received the `Flush` request, the `Proxy` would wraps this request into `FlushTask`, and pushs this task into `DdTaskQueue` queue. After that, `Proxy` would call method of `WatiToFinish` to wait until the task finished.
+2. When `Proxy` receives `Flush` request, it would wrap this request into `FlushTask`, and push this task into `DdTaskQueue` queue. After that, `Proxy` would call method `WatiToFinish` to wait until the task finished.
 ```go
 type task interface {
 	TraceCtx() context.Context
@@ -55,46 +53,50 @@ type FlushTask struct {
 }
 ```

-3. There is a backgroud service in `Proxy`, this service would get the `FlushTask` from `DdTaskQueue`, and executes it in three phases.
-    - `PreExecute`,`FlushTask` does nothing at this phase, and return directly
-    - `Execute`, at this phase, `Proxy` would send `Flush` request to `DataCoord` via `Grpc`,and wait for the reponse, the `proto` is defined as follow:
-    ```proto
-        service DataCoord {
-          ...
-
-          rpc Flush(FlushRequest) returns (FlushResponse) {}
+3. There is a backgroud service in `Proxy`. This service gets `FlushTask` from `DdTaskQueue`, and executes in three phases:
+    - `PreExecute`
+    
+      `FlushTask` does nothing at this phase, and returns directly
+    
+    - `Execute`

-          ...
-        }
-
-        message FlushRequest {
-          common.MsgBase base = 1;
-          int64 dbID = 2;
-          int64 collectionID = 4;
-        }
-
-        message FlushResponse {
-          common.Status status = 1;
-          int64 dbID = 2;
-          int64 collectionID = 3;
-          repeated int64 segmentIDs = 4;
+      `Proxy` sends `Flush` request to `DataCoord` via `Grpc`, and waits for the response, the `proto` is defined as follow:
+    ```proto
+    service DataCoord {
+      ...
+      rpc Flush(FlushRequest) returns (FlushResponse) {}
+      ...
+    }
+    
+    message FlushRequest {
+      common.MsgBase base = 1;
+      int64 dbID = 2;
+      int64 collectionID = 4;
+    }
+    
+    message FlushResponse {
+      common.Status status = 1;
+      int64 dbID = 2;
+      int64 collectionID = 3;
+      repeated int64 segmentIDs = 4;
+    }
    ```
-    - `PostExecute`, `FlushTask` does nothing at this phase, and return directly
-
-4. After receiving `Flush` request from `Proxy`, `DataCoord` would call `SealAllSegments` to seal all the growing segments that belong to this `Collection`, and no longer allocate new `ID`s for these segments. After that, `DataCoord` would send response to `Proxy`, and the response should contain all the sealed segment ID.
-
-5. In `Milvus 2.0`, the `Flush` is an asynchronous operation. So when `SDK` receives the response of `Flush`, it only means that the `DataCoord` has sealed these segments, and there are 2 problem that we have to soluved.
-    - The sealed segments might still in the memory, and not have been writen into persistent storage yet.
+    - `PostExecute`
+    
+      `FlushTask` does nothing at this phase, and returns directly
+    
+4. After receiving `Flush` request from `Proxy`, `DataCoord` would call `SealAllSegments` to seal all the growing segments belonging to this `Collection`, and do not allocate new `ID`s for these segments any more. After that, `DataCoord` would send response to `Proxy`, which contain all the sealed segment `ID`s.
+
+5. In `Milvus 2.0`,  `Flush` is an asynchronous operation. So when `SDK` receives the response of `Flush`, it only means that the `DataCoord` has sealed these segments. There are 2 problems that we have to solve.
+    - The sealed segments might still in memory, and have not been written into persistent storage yet.
    - `DataCoord` would no longer allocate new `ID`s for these sealed segments, but how to make sure all the allocated `ID`s have been consumed by `DataNode`.


-6. For the first problem, `SDK` should send `GetSegmentInfo` request to `DataCoord` periodically, until all the sealed segment are in state of `Flushed`. the `proto` is defined as following.
+6. For the first problem, `SDK` should send `GetSegmentInfo` request to `DataCoord` periodically, until all sealed segments are in state of `Flushed`. The `proto` is defined as following.
 ```proto
 service DataCoord {
  ...
-
  rpc GetSegmentInfo(GetSegmentInfoRequest) returns (GetSegmentInfoResponse) {}
-
  ...
 }

@@ -132,25 +134,23 @@ enum SegmentState {

 ```

-7. For second problem, `DataNode` would report a timestamp to `DataCoord` every time it consumes a package from `MsgStream`,the Proto is define as follow.
+7. For the second problem, `DataNode` would report a timestamp to `DataCoord` every time it consumes a package from `MsgStream`, the `proto` is define as follow.

 ```proto
- message DataNodeTtMsg {
-    common.MsgBase base =1;
-    string channel_name = 2;
-    uint64 timestamp = 3;
+message DataNodeTtMsg {
+  common.MsgBase base = 1;
+  string channel_name = 2;
+  uint64 timestamp = 3;
 }
-```
+ ```

 8. There is a backgroud service, `startDataNodeTsLoop`, in `DataCoord` to process the message of `DataNodeTtMsg`.
-    - Firstly, `DataCoord` would extract `channel_name` from `DataNodeTtMsg`, and filter out all the sealed segments that attached on this `channel_name`
-    - Compare the timestamp when the segment enters into state of `Sealed` with the `DataNodeTtMsg.timestamp`, if `DataNodeTtMsg.timestamp` is greater, it means that all the `ID`s belong to that segment have been consumed by `DataNode`,so it's safe to notify `DataNode` to write that segment into persistent storage. The `proto` is defined as follow.
+    - Firstly, `DataCoord` would extract `channel_name` from `DataNodeTtMsg`, and filter out all sealed segments that attached on this `channel_name`
+    - Compare the timestamp when the segment enters into state of `Sealed` with the `DataNodeTtMsg.timestamp`, if `DataNodeTtMsg.timestamp` is greater, which means that all `ID`s belonging to that segment have been consumed by `DataNode`, it's safe to notify `DataNode` to write that segment into persistent storage. The `proto` is defined as follow:
 ```proto
 service DataNode {
  ...
-
  rpc FlushSegments(FlushSegmentsRequest) returns(common.Status) {}
-
  ...
 }

@@ -160,5 +160,5 @@ message FlushSegmentsRequest {
  int64 collectionID = 3;
  repeated int64 segmentIDs = 4;
 }
+```

-```
\ No newline at end of file