Unverified commit f9e4d49b, authored by guru4elephant, committed by GitHub

Merge pull request #14702 from barrierye/update_checkfile_in_datafeed

add the comment for CheckFile function
@@ -259,6 +259,14 @@ bool MultiSlotDataFeed::CheckFile(const char* filename) {
return false;
}
}
// When data is processed by Hadoop, a '\t' character may be appended to
// the end of each line of the reduce task's output. (When the output of
// the reduce task has only one field, Hadoop adds a '\t' at the end of
// the line by default; this can be avoided with the option
// `-D mapred.textoutputformat.ignoreseparator=true`.) The trailing '\t'
// does not affect the correctness of the data. Therefore, a line is
// judged abnormal only when its end contains characters that are not
// whitespace.
while (endptr - str != len) {
if (!isspace(*(endptr++))) {
VLOG(0)
......
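
The trailing-character check described in the comment can be illustrated in isolation. The following is a minimal sketch, not the code from this commit: the helper name `TrailingIsWhitespaceOnly` is hypothetical, and it assumes a line holding a single float parsed with `std::strtof`, whereas the real `CheckFile` is only partially shown in the diff above.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the original source): parses one float
// from the start of `line` and reports whether everything after the parsed
// number is whitespace, e.g. the '\t' that Hadoop may append.
bool TrailingIsWhitespaceOnly(const std::string& line) {
  const char* str = line.c_str();
  const std::size_t len = line.size();
  char* endptr = nullptr;
  std::strtof(str, &endptr);
  if (endptr == str) {
    return false;  // nothing could be parsed as a number
  }
  // Walk from the end of the parsed number to the end of the line.
  while (static_cast<std::size_t>(endptr - str) != len) {
    // Cast to unsigned char: passing a negative char to isspace is undefined.
    if (!std::isspace(static_cast<unsigned char>(*(endptr++)))) {
      return false;  // a non-whitespace character follows the number
    }
  }
  return true;
}
```

Under these assumptions, a line such as `"3.5\t"` is accepted because the trailing tab is whitespace, while `"3.5 x"` is rejected because `x` follows the parsed value.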