Commit 81033451, author: Abhijit Subramanya

Fix the rejected row count reported by single row error handling in COPY.

When an error log is used and the master finds a badly formatted data row, it
increments the rejected row count and then sends the row to the segment so that
it can be stored in the error log file. On the segment, the row is parsed
again and the segment increments its reject count as well. To report the
total number of rejected rows to the user, we sum the rejected row counts from
the master and the segments. We need to ignore the count from the master
because those rows are already included in the reject count from the segments.
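
The counting rule described above can be sketched in isolation. This is a
minimal illustration, not the code from this commit: the helper name
total_rejected_rows and its parameters are hypothetical, and only the
log_to_file condition mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the source tree) that combines the
 * reject counts from the master (QD) and the segments (QEs).
 *
 * When rows are logged to the error log, every row the master rejects is
 * forwarded to a segment, which parses it again and counts it again. The
 * master's count is therefore dropped to avoid counting those rows twice.
 */
static int
total_rejected_rows(int rejected_on_master, int rejected_on_segments,
                    bool log_to_file)
{
    assert(rejected_on_master >= 0);
    assert(rejected_on_segments >= 0);

    if (log_to_file)
        rejected_on_master = 0; /* already included in the segment count */

    return rejected_on_master + rejected_on_segments;
}
```

With error logging on, a master count of 4 and a segment count of 4 give 4
rather than 8. With error logging off, a master count of 4 and a segment count
of 2 give 6.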

Also includes additional cleanup and a fix for a typo in a variable name, by
Heikki Linnakangas.
Parent bc27ea2d
@@ -2564,7 +2564,7 @@ CopyFromDispatch(CopyState cstate)
Datum *values;
bool *nulls;
int *attr_offsets;
-int total_rejeted_from_qes = 0;
+int total_rejected_from_qes = 0;
bool isnull;
bool *isvarlena;
ResultRelInfo *resultRelInfo;
@@ -3493,7 +3493,7 @@ CopyFromDispatch(CopyState cstate)
* databases Now we would like to end the copy command on
* all segment databases across the cluster.
*/
-total_rejeted_from_qes = cdbCopyEnd(cdbCopy);
+total_rejected_from_qes = cdbCopyEnd(cdbCopy);
/*
* If we quit the processing loop earlier due to a
@@ -3545,12 +3545,17 @@ CopyFromDispatch(CopyState cstate)
{
int total_rejected = 0;
int total_rejected_from_qd = cstate->cdbsreh->rejectcount;
-/* if used errtable, QD bad rows were sent to QEs and counted there. ignore QD count */
-if (cstate->cdbsreh)
+/*
+ * If error log has been requested, then we send the row to the segment
+ * so that it can be written in the error log file. The segment process
+ * counts it again as a rejected row. So we ignore the reject count
+ * from the master and only consider the reject count from segments.
+ */
+if (cstate->cdbsreh->log_to_file)
total_rejected_from_qd = 0;
-total_rejected = total_rejected_from_qd + total_rejeted_from_qes;
+total_rejected = total_rejected_from_qd + total_rejected_from_qes;
cstate->processed -= total_rejected;
/* emit a NOTICE with number of rejected rows */
......
@@ -1621,18 +1621,13 @@ PortalRunMulti(Portal portal, bool isTopLevel,
* process utility functions (create, destroy, etc..)
*
* These are assumed canSetTag if they're the only stmt in the
- * portal, with the following exception:
- *
- * A COPY FROM that specifies a non-existent error table, will
- * be transformed (parse_analyze) into a (CreateStmt, CopyStmt).
- * XXX Maybe this should be treated like DECLARE CURSOR?
+ * portal.
*/
-if (list_length(portal->stmts) == 1 || portal->sourceTag == T_CopyStmt)
+if (list_length(portal->stmts) == 1)
PortalRunUtility(portal, stmt, isTopLevel, dest, completionTag);
else
PortalRunUtility(portal, stmt, isTopLevel, altdest, NULL);
}
/*
* Increment command counter between queries, but not after the last
......
@@ -86,8 +86,8 @@ typedef enum ErrLocType
typedef enum CopyErrMode
{
ALL_OR_NOTHING, /* Either all rows or no rows get loaded (the default) */
-SREH_IGNORE, /* Sreh - ignore errors (REJECT but no error table) */
-SREH_LOG /* Sreh - log errors in an error table */
+SREH_IGNORE, /* Sreh - ignore errors (REJECT, but don't log errors) */
+SREH_LOG /* Sreh - log errors */
} CopyErrMode;
......
@@ -685,6 +685,7 @@ CONTEXT: COPY test_first_segment_reject_limit, line 2: "error1"
-- should go through fine
SET gp_initial_bad_row_limit = 6;
COPY test_first_segment_reject_limit FROM STDIN WITH DELIMITER '|' segment reject limit 20;
+NOTICE: Found 4 data formatting errors (4 or more input rows). Rejected related input data.
SELECT COUNT(*) FROM test_first_segment_reject_limit;
count
-------
......