Unverified commit 5028054e, authored by xiong-gang, committed by GitHub

Correctly seek to the end of buffile that contains multiple physical files

When stashing a 'VisimapDelete' entry to the buffile, we must seek to the end
of the last physical file if the buffile contains multiple files. This commit
cherry-picks part of the following upstream commit:

commit 808e13b282efa7e7ac7b78e886aca5684f4bccd3
Author: Amit Kapila <akapila@postgresql.org>
Date:   Wed Aug 26 07:36:43 2020 +0530

    Extend the BufFile interface.

    Allow BufFile to support temporary files that can be used by the single
    backend when the corresponding files need to be survived across the
    transaction and need to be opened and closed multiple times. Such files
    need to be created as a member of a SharedFileSet.

    Additionally, this commit implements the interface for BufFileTruncate to
    allow files to be truncated up to a particular offset and extends the
    BufFileSeek API to support the SEEK_END case. This also adds an option to
    provide a mode while opening the shared BufFiles instead of always opening
    in read-only mode.

    These enhancements in BufFile interface are required for the upcoming
    patch to allow the replication apply worker, to handle streamed
    in-progress transactions.

    Author: Dilip Kumar, Amit Kapila
    Reviewed-by: Amit Kapila
    Tested-by: Neha Sharma
    Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
Parent 9363718d
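To see why the old `filesize > offset` test breaks down once a BufFile spans several physical files, here is a minimal, self-contained sketch. It does not use the real BufFile API: `ToyBufFile`, `toy_total_size`, `toy_seek_end` and `toy_old_check_wants_seek` are hypothetical names invented for illustration, and the model tracks nothing but the size of each physical segment.

```c
#include <assert.h>

/* Hypothetical toy model: a logical file made of several physical segments. */
#define TOY_MAX_SEGMENTS 8

typedef struct ToyBufFile
{
    int  numFiles;                  /* number of physical segments */
    long sizes[TOY_MAX_SEGMENTS];   /* size in bytes of each segment */
} ToyBufFile;

/* Total size over all segments; this is what the old check compared against. */
static long
toy_total_size(const ToyBufFile *f)
{
    long total = 0;

    for (int i = 0; i < f->numFiles; i++)
        total += f->sizes[i];
    return total;
}

/* End position in this model: the last segment, at that segment's size. */
static void
toy_seek_end(const ToyBufFile *f, int *fileno, long *offset)
{
    *fileno = f->numFiles - 1;
    *offset = f->sizes[f->numFiles - 1];
}

/*
 * The old test, mirrored in the toy model: it compares the total size of all
 * segments with an offset that is only meaningful within a single segment.
 * Returns nonzero when the old code would have decided to seek.
 */
static int
toy_old_check_wants_seek(const ToyBufFile *f, long offset_in_current_file)
{
    return toy_total_size(f) > offset_in_current_file;
}
```

With two segments of 1024 and 100 bytes, a writer that is already at the end sits at file 1, offset 100. The total size is 1124, so the old comparison still asks for a seek, and the old code then passed that total as an offset into file 0. Resolving the end position as "last file, size of the last file" avoids mixing the two units.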
@@ -619,7 +619,6 @@ AppendOnlyVisimapDelete_Stash(
     bool found;
     off_t offset;
     int fileno;
-    int64 filesize;
     Assert(visiMapDelete);
     visiMap = visiMapDelete->visiMap;
@@ -643,37 +642,14 @@ AppendOnlyVisimapDelete_Stash(
     oldContext = MemoryContextSwitchTo(visiMap->memoryContext);
     AppendOnlyVisimapEntry_WriteData(&visiMap->visimapEntry);
     BufFileTell(visiMapDelete->workfile, &fileno, &offset);
-    filesize = BufFileSize(visiMapDelete->workfile);
     /*
      * If the BufFile was seeked to an internal position for reading a
      * previously stashed visimap entry before we were called, we must seek
      * till the end of it before writing new visimap entries.
-     *
-     * GPDB_12_MERGE_FIXME if the BufFile ends up with multiple files
-     * (numFiles > 1), the following (filesize > offset) comaprison is
-     * invalid. The offset is within a single file whereas filesize is total
-     * size of all files comprising this BufFile. BufFile interface may need
-     * some enhancements to address this problem. E.g. API to seek to the end
-     * so as to append to the BufFile, API to flush existing in-memory buffer
-     * to disk.
      */
-    if (filesize > offset)
-    {
-        if (BufFileSeek(visiMapDelete->workfile, 0, filesize, SEEK_SET) != 0)
-            elog(ERROR, "failed to seek to end of visimap buf file: offset " INT64_FORMAT, filesize);
-        BufFileTell(visiMapDelete->workfile, &fileno, &offset);
-    }
-    else
-    {
-        /*
-         * The previous write was shorter than the buffer size used by
-         * BufFile. That means it was not actually written to disk, leading
-         * to disk file size smaller than the in-memory size. The BufFile is
-         * already positioned to the offest past the previous write in that
-         * case, no need to seek.
-         */
-    }
+    if (BufFileSeek(visiMapDelete->workfile, 0, 0, SEEK_END) != 0)
+        elog(ERROR, "failed to seek to end of visimap buf file");
+    BufFileTell(visiMapDelete->workfile, &fileno, &offset);
     elogif(Debug_appendonly_print_visimap, LOG,
            "Append-only visi map delete: Stash dirty visimap entry %d/" INT64_FORMAT,
......
@@ -63,7 +63,7 @@
  * The reason is that we'd like large BufFiles to be spread across multiple
  * tablespaces when available.
  */
-#define MAX_PHYSICAL_FILESIZE 0x40000000
+#define MAX_PHYSICAL_FILESIZE 0x40000000
 #define BUFFILE_SEG_SIZE (MAX_PHYSICAL_FILESIZE / BLCKSZ)
 /* To align upstream's structure, minimize the code differences */
@@ -900,11 +900,22 @@ BufFileSeek(BufFile *file, int fileno, off_t offset, int whence)
             newFile = file->curFile;
             newOffset = (file->curOffset + file->pos) + offset;
             break;
-#ifdef NOT_USED
         case SEEK_END:
-            /* could be implemented, not needed currently */
-            break;
-#endif
+            /*
+             * The file size of the last file gives us the end offset of that
+             * file.
+             */
+            if (file->curFile == file->numFiles - 1 && file->dirty)
+                BufFileFlush(file);
+            newFile = file->numFiles - 1;
+            newOffset = FileSize(file->files[file->numFiles - 1]);
+            if (newOffset < 0)
+                ereport(ERROR,
+                        (errcode_for_file_access(),
+                         errmsg("could not determine size of temporary file \"%s\" from BufFile \"%s\": %m",
+                                FilePathName(file->files[file->numFiles - 1]),
+                                file->name)));
+            break;
         default:
             elog(ERROR, "invalid whence: %d", whence);
             return EOF;
......
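The flush before `FileSize` in the `SEEK_END` case matters because data still held in the in-memory buffer is not yet reflected in the on-disk size of the last file. A rough sketch of that idea follows. It is not the actual BufFile code: `ToyLastSegment`, `toy_flush` and `toy_end_offset` are hypothetical names, and the model assumes the buffered bytes are appended at the current end of the last physical file.

```c
#include <assert.h>

/* Hypothetical toy model of the last physical file plus its write buffer. */
typedef struct ToyLastSegment
{
    long on_disk_size;  /* bytes already written to the physical file */
    long buffered;      /* bytes waiting in the in-memory buffer */
    int  dirty;         /* nonzero if the buffer holds unwritten data */
} ToyLastSegment;

/* Write out pending bytes (assumed here to extend the file at its end). */
static void
toy_flush(ToyLastSegment *seg)
{
    if (seg->dirty)
    {
        seg->on_disk_size += seg->buffered;
        seg->buffered = 0;
        seg->dirty = 0;
    }
}

/*
 * End offset of the segment: flush first, so that the on-disk size also
 * covers data that had only been buffered so far.
 */
static long
toy_end_offset(ToyLastSegment *seg)
{
    toy_flush(seg);
    return seg->on_disk_size;
}
```

In this model, a segment with 8192 bytes on disk and 300 dirty buffered bytes reports an end offset of 8492 after the flush; reading the on-disk size without flushing would give 8192 and place the next write on top of data that has not been written out yet.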