diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index 1478cb37db09b63d143ef3dcda902bb533e9f5a0..390b0ee83a25f97656ed41528ad28356fa8b0242 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -27,7 +27,7 @@ The provided annotation file format is as follow, seperated by "\t":
 " Image file name                    Image annotation information encoded by json.dumps"
 ch4_test_images/img_61.jpg    [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]], ...}]
 ```
-The image annotation before json.dumps() encoding is a list containing multiple dictionaries. The `points` in the dictionary represent the coordinates (x, y) of the four points of the text box, arranged clockwise from the point at the upper left corner.
+The image annotation after json.dumps() encoding is a list containing multiple dictionaries. The `points` in the dictionary represent the coordinates (x, y) of the four points of the text box, arranged clockwise from the point at the upper left corner.
 `transcription` represents the text of the current text box, and this information is not needed in the text detection task.
 
 If you want to train PaddleOCR on other datasets, you can build the annotation file according to the above format.
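The annotation format described in the changed paragraph can be sketched in code. This is a minimal illustration, not part of the PaddleOCR codebase: the image path and label values are taken from the example line in the diff, and the helper name `make_annotation_line` is invented here for clarity.

```python
import json

def make_annotation_line(image_path, labels):
    """Build one line of a PaddleOCR detection annotation file:
    "image path<TAB>annotation info encoded by json.dumps"."""
    return image_path + "\t" + json.dumps(labels)

# Values below mirror the example in the diff; real labels would
# cover every text box in the image.
labels = [
    {
        # Text content of the box; not needed for the detection task.
        "transcription": "MASA",
        # Four (x, y) corner points, clockwise from the upper left corner.
        "points": [[310, 104], [416, 141], [418, 216], [312, 179]],
    }
]

line = make_annotation_line("ch4_test_images/img_61.jpg", labels)
print(line)
```

Because the label list goes through `json.dumps`, it can be recovered losslessly with `json.loads` after splitting the line on the tab character.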