Commits · 64d5b1bd759581c16e37f64850737d308311a9bc · Greenplum / Gpdb

15 Sep, 2020 1 commit
- Revert "ic-proxy: support hostname as proxy addresses" · 64d5b1bd
  Authored by Ning Yu on Sep 15, 2020
```
This reverts commit 4c8332b5.
```
  64d5b1bd
14 Sep, 2020 4 commits

Authored by japinli on Sep 14, 2020

In commit 5eaa5889, the --novacuum option was removed; however, the help
page of gpexpand still lists the -V option, which is the short form of
--novacuum.

(cherry picked from commit cb2afaf9)

63bab4c4

Refactor query string truncation on top of · abf6b330

Authored by Asim R P on Aug 21, 2020

Commit 889ba39e fixed the query string truncation in dispatcher
to make it locale-aware.  This patch refactors that change so as to
avoid accessing a string beyond its length.

Reviewed by: Heikki, Ning Yu and Polina Bungina

(cherry picked from commit ef09a3a6)

abf6b330

Fix query string truncation while dispatching to QE · f31600e9

Authored by Polina Bungina on Aug 13, 2020

Executing a sufficiently long query that contains multi-byte characters can cause the query string to be truncated incorrectly. The truncation may cut a multi-byte character in half, and (with log_min_duration_statement set to 0) an invalid byte sequence is then written to the segment logs. The broken character causes problems when fetching log information from the gp_toolkit.__gp_log_segment_ext table: queries fail with the following error: «ERROR: invalid byte sequence for encoding…».
This is caused by the buildGpQueryString function in `cdbdisp_query.c`, which prepares the query text for dispatch to QE. It does not take character length into account when truncation is necessary (that is, when the text is longer than QUERY_STRING_TRUNCATE_SIZE).
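
A minimal sketch of character-aware truncation, for illustration only. It assumes the text is valid UTF-8, whereas the server has to handle whatever encoding the database uses; `utf8_clip_length` is a hypothetical helper and not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the server's code): return the largest length
 * <= max_bytes at which a valid UTF-8 string can be cut without splitting
 * a multi-byte character.
 */
size_t
utf8_clip_length(const char *s, size_t len, size_t max_bytes)
{
    size_t cut = max_bytes;

    assert(s != NULL);
    if (len <= max_bytes)
        return len;             /* short enough, nothing to truncate */

    /*
     * s[cut] is the first byte that would be dropped; cut < len here, so the
     * read stays inside the string. While that byte is a continuation byte
     * (10xxxxxx), the cut point is inside a character, so back up.
     */
    while (cut > 0 && ((unsigned char) s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

For the three-byte string "aé" and a limit of two bytes, the helper backs up to one byte instead of keeping half of the "é".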

(cherry picked from commit 889ba39e)

f31600e9

Fix flaky test uao_crash_compaction_column (#10807) · 75bb046e

Authored by Paul Guo on Sep 14, 2020

Here is part of the diff output.

@@ -14,11 +14,11 @@
  role | preferred_role | content | mode | status
  ------+----------------+---------+------+--------
  m    | m              | -1      | s    | u
- m    | m              | 0       | s    | u
+ m    | m              | 0       | n    | u

The root cause has nothing to do with this test case. The
prepared_xact_deadlock_pg_rewind test ends by calling gprecoverseg to recover the
cluster but does not wait until the cluster state is restored.
Reviewed-by: Junfeng(Jerome) Yang <jeyang@pivotal.io>
(cherry picked from commit 07c594ed)

75bb046e

12 Sep, 2020 1 commit
- Docs - update ODBC driver version · c87d3237
  Authored by David Yozie on Sep 11, 2020
  
  c87d3237
11 Sep, 2020 3 commits

gpperfmon: fix definition of terminate_timeout and recv_timeout params (#10725) · 27c91557

Authored by Maksim Milyutin on Sep 11, 2020

The terminate_timeout and recv_timeout parameters are defined based on
incoming quantum value before its invalidation/normalization phase. As a
consequence those parameters can take on values that lead to unstable
interaction between gpmmon and gpsmon.

The current fix moves the assignment of terminate_timeout and recv_timeout
after the invalidation/normalization block. Furthermore, because
terminate_timeout is passed to gpsmon as a startup option, a guard that
disallows a zero timeout value for tcp_event is added.

27c91557

docs - update pxf xrefs to v5.15 (#10797) · b86e93f7
Authored by Lisa Owen on Sep 10, 2020

b86e93f7

Use return instead of exit() in configure · 92bb6697

Authored by Peter Eisentraut on Aug 30, 2016

Using exit() requires stdlib.h, which is not included.  Use return
instead.  Also add return type for main().
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
(cherry picked from commit 1c0cf52b)
(cherry picked from commit 6d3c99bb)

92bb6697

10 Sep, 2020 6 commits

Add .git-blame-ignore-revs · 59aa1fec

Authored by Jesse Zhang on Jul 13, 2020

This file will be used to record commits to be ignored by default by
git-blame (user still has to opt in). This is intended to include
large (generally automated) reformatting or renaming commits.

(cherry picked from commit b19e6abb)

59aa1fec

ic-proxy: support hostname as proxy addresses · 4c8332b5

Authored by Ning Yu on Sep 10, 2020

The GUC gp_interconnect_proxy_addresses sets the listener addresses and
ports of all the proxy bgworkers. Previously only IP addresses were
supported, which is inconvenient to use.

Hostnames are now supported as well; IP addresses continue to work.

Note that if a hostname is bound to a different IP at runtime, we must
reload the setting with the "gpstop -u" command.
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
(cherry picked from commit 2a1794bc)

4c8332b5

ic-proxy: type checking in ic_proxy_new() · 93e0c196

Authored by Ning Yu on Aug 10, 2020

A typical mistake on allocating typed memory is as below:

    int64 *ptr = malloc(sizeof(int32));

To prevent this, ic_proxy_new() is now a typed allocator: it always
returns a pointer of the specified type, for example:

    int64 *p1 = ic_proxy_new(int64); /* good */
    int64 *p2 = ic_proxy_new(int32); /* bad, gcc will raise a warning */
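
One way such a typed allocator can be written is as a macro that casts the result of malloc() to a pointer to the requested type. The sketch below is an assumption about the general technique, not the actual definition of ic_proxy_new(); `typed_new` and `make_counter` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for ic_proxy_new(): the cast gives the result the
 * type `type *`, so assigning it to a pointer of another type draws a
 * compiler diagnostic instead of silently under-allocating.
 */
#define typed_new(type) ((type *) malloc(sizeof(type)))

/* Allocate an int64_t and initialise it; returns NULL if malloc fails. */
int64_t *
make_counter(int64_t start)
{
    int64_t *p = typed_new(int64_t);    /* pointee type and size agree */

    /* The invariant the typed allocator guarantees: size matches pointee. */
    assert(sizeof(*p) == sizeof(int64_t));
    if (p != NULL)
        *p = start;
    return p;
}
```

Writing `int64_t *p = typed_new(int32_t);` instead would be flagged as an incompatible pointer assignment at compile time.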
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
(cherry picked from commit a3ef623d)

93e0c196

PANIC when register file to a non-active workfile set. (#10793) · 16db7fe3

Authored by (Jerome)Junfeng Yang on Sep 10, 2020

We used to rely on an `Assert` to check that `RegisterFileWithSet` never registers
a file to a non-active workfile_set. But in production there can be
corner cases in which a caller registers a file to a non-active workfile_set.
In some situations this makes `workfile_shared->num_active` inconsistent with the
real number of active workfile_sets.
For example,
1. `RegisterFileWithSet` registers a file to a newly created work_set (current
`work_set->num_files` is 1).
2. `FileClose` closes the file and causes `WorkFileDeleted` to delete
the work_set, since `work_set->num_files` is 0 after the file is deleted.
This also decreases `workfile_shared->num_active`.
3. `RegisterFileWithSet` registers another file to the same work_set (which
is actually no longer active, but we don't prevent that and only use
`Assert` to check).
4. `FileClose` closes the file and causes `WorkFileDeleted` to delete
the work_set again. `workfile_shared->num_active` gets decreased
again.

Raise PANIC to expose these corner cases.
Normally the caller of `RegisterFileWithSet` should ensure correctness,
but `RegisterFileWithSet` itself lacked the check.

(cherry picked from commit c23980cb)

16db7fe3

Fixes for optimizer_join_order exhaustive2 (#10792) · 06b6bdea

Authored by Hans Zeller on Sep 09, 2020

Refactor caching of scalar expressions in MEMO groups
-----------------------------------------------------

We cache a full CExpression tree for scalar expressions in each scalar
CGroup - but only if no subqueries are involved. With subqueries,
things get a bit complicated and we store CExpressions with incorrect
arity. This caused a problem in handling nary LOJs, because for those
we have to descend into the scalar expression.

Refactored the code such that we can ask a CGroupExpression for an
exact scalar expression tree, with the possibility that it will return
NULL if there is a subquery in the expression. Changed the numerous
callers of this method into two subclasses:

1. Places where we accept an inexact version of the scalar expression,
with subqueries replaced by a "true" boolean constant or a NULL value.
This works well for statistics derivation, costing, etc. where some
imprecision is acceptable.

2. Places where we need an exact expression, like constraint
derivation. Those places have to deal with the possibility of getting
a NULL pointer.

Handle NAry joins that contain LOJs in the decorrelator
-------------------------------------------------------

The decorrelator methods didn't handle the special scalar expression that
is used when we have LOJs in an NAry join.

The fix checks for outer references in the ON clause and the right child
of LOJs and prevents decorrelation when such outer refs are found.

The fix also passes the correct inner join predicates to recursive calls
and updates any changes in the inner join predicates.

Finally, this also fixes a bug unrelated to NAry joins, when we have a
regular 2-way LOJ or FOJ with outer refs in the ON clause or the right
child (for LOJ) or any child (FOJ). We shouldn't decorrelate such
outer refs.

Look for specific outer references when decorrelating
-----------------------------------------------------

Changing the decorrelator from trying to push up all outer references
to pushing up only those outer references that come from the outer side
of the apply that is driving the decorrelation process.

This is done by passing the ColRefs to remove as an additional argument.

There are two reasons for this change: First, it allows us to fix a
bug. The existing code didn't have a good way of checking whether
there are any outer refs in the ON clause of a left join that's part
of an NAry join. Second, the new code should allow us to decorrelate
hierarchies of subqueries better, by decorrelating those that satisfy
the conditions while leaving the rest in nested form.

Fix for fallback on correlated subquery with exhaustive2
--------------------------------------------------------

We currently don't expand NAry joins if they have outer references.
To avoid a regression when moving from "exhaustive" to "exhaustive2",
we need to allow expansion of NAry joins that can't be decorrelated,
because there are outer refs in the LOJ parts of them.

The added fix is a bit more general, it allows expansion of NAry
joins with LOJs in them, regardless of where the outer refs are.

Fix join stats calculation for NAry join with LOJs and outer refs
-----------------------------------------------------------------

The join stats calculation had a bug when using
optimizer_join_order = exhaustive2.

We didn't handle outer refs correctly when encountering the new
flavor of NAry joins that contain left outer joins. As a result,
we would crash in retail builds and run into an assert in debug
builds.

Fix handling of NAry LOJs in CPredicateUtils::PexprRemoveImpliedConjuncts
-------------------------------------------------------------------------

This method also needs to preserve the CScalarNAryJoinPredList operator.

We are using a "representative" expression. In addition to fixing an assert,
this has two consequences:

- queries with a mix of predicates and subqueries, such as
  a = 5 and (sq) will get somewhat better estimates, as they will
  use the non-subquery parts.
- queries with scalar subqueries and negated subqueries may get lower
  estimates, which are probably more risky than overestimates. Examples:
  where a = (sq)   gets converted to where a = NULL
  where not (sq)   gets converted to where FALSE

Handle n-ary LOJs in subquery to apply xform
--------------------------------------------

The CXformSubqNAryJoin2Apply xform didn't handle NAry joins with
LOJs correctly. Added logic to preserve the LOJ-related data in the
NAry join and to push subqueries only to inner join children.

Remove failing assert from xform
--------------------------------

Looks like other code changes suddenly expose this xform.
The assert isn't correct. It calls the promise function on a CExpression,
but the promise function is written to work only with a CGroupExpression
attached to the expression handle. Since the assert doesn't seem very
useful, I just removed it.

Fixes for preprocessor-related methods
--------------------------------------

Fixed CLogicalNAryJoin::DeriveNotNullColumns and
CLogicalNAryJoin::DerivePropertyConstraint so that they handle NAry
LOJs correctly.

Columns from LOJ children are always nullable.

Equivalence classes and property constraints can only be passed to
the parent if they come from non-LOJ children.

Expand NAry joins with outer refs in "exhaustive2"
--------------------------------------------------

When optimizer_join_order is set to "exhaustive", that also enables
the "query" join order. When we have NAry joins with outer references,
only the "query" join order gets triggered. This is because in many
cases, we will be able to decorrelate the query tree. The "query"
join order provides a stop-gap for when decorrelation isn't possible.

We need a similar stop-gap for "exhaustive2", where DP, query, mincard
and greedy are all baked into a single transform. This fix enables
the DPv2 xform to fire on NAry joins with outer refs. Note that for
now we do a full expand, assuming that DPv2 can handle this, given
that very large joins in subqueries are not common. We could restrict
the logic to query only, but that would be a bit messy.

Handle NAry joins with outer references in DPv2
-----------------------------------------------

We need one change in the DPv2 xform to handle joins with outer references.
This is because we may see predicates of the form <col> = <outer ref>.
Such predicates are not true join predicates, involving multiple tables.
Therefore, they need to be applied as a separate select node on top of the
expanded join tree. To do this, we need to build an "expression to edge map",
used to find such unused edges.

This fix ensures that we build the expression to edge map if the NAry join
has outer references.

Add tests
---------

Added some tests, both explains and actual queries, to gporca.sql.
Some of these tests fall back to planner, as they would have before
this change. Others fail in the executor, see
https://github.com/greenplum-db/gpdb/issues/10791. The executor
failure happens without ORCA, so it is an independent issue.

(cherry picked from commit c1d143b1eb14247de021fb66423ec401330f7496)

06b6bdea

gpstart: when standby is unreachable don't start it · ec5f45a5

Authored by Bhuvnesh Chaudhary on Sep 03, 2020

When the standby is unreachable and the user proceeds with startup,
gpstart would still attempt to start the standby, resulting in a stack trace.
Detect when the standby is unreachable and set start_standby to False to
prevent starting it later in the startup process.
Co-authored-by: Kalen Krempely <kkrempely@vmware.com>

ec5f45a5

09 Sep, 2020 5 commits

Test gpfdists external table with hostname (#10781) · 55dfa785

Authored by Peifeng Qiu on Sep 09, 2020

On CentOS 7, the system libcurl uses NSS instead of OpenSSL as the backend
for TLS/SSL connections. Previously it would fail if a hostname was used
in the external table location, probably due to initialization issues.
This is fixed by commit 89a1211c. Modify the original test case to use a
hostname.

curl is used for connection testing in gpfdist_ssl_start, but it will
always fail since we are now using SSL. Change the condition to check that
the exit code is not equal to 7. This code is listed in the curl documentation:
https://curl.haxx.se/libcurl/c/libcurl-errors.html

55dfa785

Fix url_curl on MacOS (#10261) · 14266c6e

Authored by Xiaoran Wang on Aug 26, 2020

* Fix url_curl on MacOS

Fix libcurl being unable to read data from gpfdist
on macOS.

However, gpfdist with a pipe does not work on macOS, because
flock(2), which is used in gfile.c, is not supported
on macOS.

14266c6e

Enable and enhance gpfdist SSL test cases (#9722) · 119ec9fc

Authored by Huiliang.liu on Mar 16, 2020

* Enable and enhance gpfdist SSL test cases

1. Add multiple root CA test cases for gpfdist SSL
2. Fix output file due to foreign table modification

119ec9fc

Fix 6x gpload fail when capital letters in column in merge mode (#10783) · 968499f6

Authored by xiaoxiao on Sep 09, 2020

* fix gpload multi-level partition table and special char in columns issue

fix the column match condition to resolve a primary key conflict when using the gpload
merge mode to import data to a multi-level partition table
fix failure when column names contain special characters or capital letters

* add double quotations when creating staging table
omit distribution key

* fix gpload fail when column names have capital letters in merge mode
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>

968499f6

Docs - update versioning for 6.11 release · 3219b792
Authored by David Yozie on Sep 08, 2020

3219b792

08 Sep, 2020 1 commit
- Revert "fix gpload fail when capital letters in column name in merge mode (#10751)" (#10778) · 7f1e3379
  Authored by xiaoxiao on Sep 08, 2020
```
This reverts commit 4ba11904.
```
  7f1e3379
07 Sep, 2020 1 commit

fix gpload fail when capital letters in column name in merge mode (#10751) · 4ba11904

Authored by xiaoxiao on Sep 07, 2020


* fix gpload fail when capital letters in column name in merge mode

add double quotations for column names when create staging table
omit distributio
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>

4ba11904

04 Sep, 2020 2 commits

Fix server crash under rebuilding of RTEs and RelOptInfos · 78fb79d9

Authored by Maksim Milyutin on Sep 04, 2020

The GROUPING SETS statement with multiple canonical rollups working on
randomly distributed partitioned table causes rebuilding of
root->simple_rte_array and root->simple_rel_array on planning stage by
PostgreSQL optimizer. The rebuilding process initializes both RTEs and
RelOptInfos item by item and doesn't take into account that the
build_simple_rel() routine is recursive for inherited relations and
requires a full list of already initialized child RTEs inside
root->simple_rte_array. As a result, the call of build_simple_rel() fails
on the access to child RTEs.

The current fix separates the loops that build root->simple_rte_array and
root->simple_rel_array, leaving the overall semantics unchanged.

78fb79d9

Fix a 'VACUUM FULL' bug · 5f87e80a

Authored by xiong-gang on Sep 04, 2020

When doing 'VACUUM FULL', 'swap_relation_files' updates the pg_class entry but
does not increment the command counter, so the later 'vac_update_relstats'
updates the 'relfrozenxid' and 'relhasindex' of the old tuple in place. If the
transaction is then interrupted and aborted on the QE, the old entry is
corrupted.

5f87e80a

03 Sep, 2020 4 commits

Handle the PROCSIG_NOTIFY_INTERRUPT signal · 547de885
Authored by Adam Lee on Sep 01, 2020
```
This will make LISTEN and NOTIFY work on the QD node.
```
547de885

fix gpload multi-level partition table and special char in columns issue (#10745) · 45823611

Authored by xiaoxiao on Sep 03, 2020

fix the column match condition to resolve a primary key conflict when using the gpload
merge mode to import data to a multi-level partition table
fix failure when column names contain special characters or capital letters
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>

45823611

Using lwlock to protect resgroup slot in session state · 73b477a0

Authored by Hubert Zhang on Sep 02, 2020

Resource groups used to access resGroupSlot in SessionState without a
lock. This is safe as long as a session only accesses its own resGroupSlot.
But with the runaway feature, we need to traverse the
session array to find the top consumer session when the redzone is reached.
This requires:
1. The runaway detector should hold the shared resgroup lock so that resGroupSlot
is not detached from a session concurrently when the redzone is reached.
2. A normal session should hold the exclusive lock when modifying resGroupSlot
in SessionState.

Also fix a compile warning.
Reviewed-by: Ning Yu <nyu@pivotal.io>

(cherry picked from commit a4cb06b4)

73b477a0

ic-proxy: Quit proxy bgworker when postmaster is dead · 67c52b59

Authored by Hubert Zhang on Sep 01, 2020

The proxy bgworker becomes an orphan process after the postmaster dies,
because it does not check the pipe postmaster_alive_fds[POSTMASTER_FD_WATCH].
Epoll this pipe inside the proxy bgworker main loop as well.
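
A minimal sketch of the underlying idea: a child holds the read end of a pipe whose write end is owned by the parent, and a hang-up on the read end means the parent is gone. This uses poll(2) as a portable stand-in rather than the event loop the proxy bgworker actually uses, and `parent_is_gone` is a hypothetical name.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Return true if the write end of the pipe behind `watch_fd` has been
 * closed, which is how the death of the process holding it shows up.
 * Does not block: the poll timeout is zero.
 */
bool
parent_is_gone(int watch_fd)
{
    struct pollfd pfd;

    assert(watch_fd >= 0);
    pfd.fd = watch_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    if (poll(&pfd, 1, 0) <= 0)
        return false;           /* no event (or poll failed): assume alive */

    /* POLLHUP is reported once every write end of the pipe is closed. */
    return (pfd.revents & (POLLHUP | POLLERR)) != 0;
}
```

In a main loop, the same file descriptor would be registered alongside the worker's other descriptors so that the loop wakes up and exits when the hang-up arrives.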
Reviewed-by: Ning Yu <nyu@pivotal.io>
(cherry picked from commit 9ce59d1a)

67c52b59

02 Sep, 2020 3 commits

Allow direct dispatch in Orca if predicate on column gp_segment_id (#10679) · f55b0fac

Authored by David Kimura on Sep 01, 2020

This approach special cases gp_segment_id enough to include the column
as a distributed column constraint. It also updates direct dispatch info
to be aware of gp_segment_id which represents the raw value of the
segment where the data resides. This is different than other columns
which hash the datum value to decide where the data resides.

After this change the following DDL shows Gather Motion from 2 segments
on a 3 segment demo cluster.

```
CREATE TABLE t(a int, b int) DISTRIBUTED BY (a);
EXPLAIN SELECT gp_segment_id, * FROM t WHERE gp_segment_id=1 or gp_segment_id=2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..431.00 rows=1 width=12)
   ->  Seq Scan on t  (cost=0.00..431.00 rows=1 width=12)
         Filter: ((gp_segment_id = 1) OR (gp_segment_id = 2))
 Optimizer: Pivotal Optimizer (GPORCA)
(4 rows)

```

(cherry picked from commit 10e2b2d9)

f55b0fac

Fix resource group runaway rounding issue · 0ade3e4e

Authored by Hubert Zhang on Sep 01, 2020

When calculating the safeChunksThreshold for runaway detection in resource groups,
we used to divide by 100 to get the number of safe chunks. This can
cause small chunk numbers to be rounded down to zero. Fix it by storing
safeChunksThreshold100 (100 times the real number of safe chunks) and
doing the computation on the fly.
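
A minimal sketch of why dividing early loses small values. It assumes, purely for illustration (the commit message does not say so), that the threshold is accumulated from several small contributions; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Divide early: each contribution is truncated before it is added. */
int64_t
sum_divide_early(const int32_t *chunks, int n, int32_t percent)
{
    int64_t total = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += (int64_t) chunks[i] * percent / 100;
    return total;
}

/* Keep the sum scaled by 100, so no contribution is truncated. */
int64_t
sum_scaled100(const int32_t *chunks, int n, int32_t percent)
{
    int64_t total = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += (int64_t) chunks[i] * percent;
    return total;
}

/* Compare on the fly: scale the usage instead of dividing the threshold. */
bool
over_threshold(int64_t used_chunks, int64_t threshold100)
{
    return used_chunks * 100 > threshold100;
}
```

With four contributions of 30 chunks at 3 percent each, the early division yields 0 for every contribution, while the scaled sum keeps 360 (that is, 3.6 chunks) and the comparison stays exact.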
Reviewed-by: Ning Yu <nyu@pivotal.io>
(cherry picked from commit 757184f9)

0ade3e4e

Export MASTER_DATA_DIRECTORY when calling gpconfig · ccce626b

Authored by Bhuvnesh Chaudhary on Aug 25, 2020

MASTER_DATA_DIRECTORY must be exported before calling
gpconfig; otherwise gpconfig fails to set the GUC.

Allow setting DCA_VERSION_FILE to enable testing

Allow overriding the value of DCA_VERSION_FILE so that during testing it
can be manipulated. Also add a test for the same to ensure that DCA
configuration GUCs are set properly on the environment.

ccce626b

01 Sep, 2020 2 commits
- Docs - rebranding for proprietary docs condition · fa0b1637
  Authored by David Yozie on Aug 31, 2020
  
  fa0b1637
- pg_upgrade: Add setup to log timing for check and upgrade steps · 0cf1c0d8
  Authored by Bhuvnesh Chaudhary on Aug 14, 2020
  
  0cf1c0d8
31 Aug, 2020 3 commits

Fix crashes when a Values Scan needs to create "fake ctids". · 3b7ca45c

Authored by Heikki Linnakangas on Aug 31, 2020

When converting semi-join to inner-join, a distinct agg on ctid is added
above the hash-join node. But the fake ctids generated in Values Scan
were invalid, with offset number 0, which caused an assertion failure.

This patch is based on commit d8886cf9, which fixed the same issue for
Function Scans.
Co-authored-by: dh-cloud <60729713+dh-cloud@users.noreply.github.com>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>

3b7ca45c

Add ctid in function scan · ba8f2fe0

Authored by Jinbao Chen on Dec 13, 2019

When we convert semi join to inner join, a distinct agg on ctid is added
above the hash join node. But tuples from function scan have no ctid. So
the assert check failed.

Set a synthetic ctid based on a fake ctid by calling the existing function
slot_set_ctid_from_fake.

(cherry picked from commit d8886cf9)

ba8f2fe0

Disable strxfrm for mk_sort at compile time · 4d248acb

Authored by Denis Smirnov on Jul 20, 2020

Glibc implementations are known to return inconsistent results for
strcoll() and strxfrm() on many platforms that can cause
unpredictable bugs. Because of that PostgreSQL disabled strxfrm()
by default since 9.5 at compile time by TRUST_STRXFRM definition.
Greenplum has its own mk sort implementation that can also use
strxfrm(). Hence mk sort can also be affected by strcoll() and
strxfrm() inconsistency (breaks merge joins). That is why strxfrm()
should be disabled by default with TRUST_STRXFRM_MK_SORT definition
for mk sort as well. We don't use PostgreSQL's TRUST_STRXFRM
definition as many users used Greenplum with strxfrm() enabled for
mk sort and disabled in PostgreSQL core. Keeping TRUST_STRXFRM_MK_SORT
as a separate definition allows these users not to reindex after
version upgrade.
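
A minimal sketch of a compile-time switch of this kind. Only the macro name TRUST_STRXFRM_MK_SORT comes from the commit message; the function `compare_keys`, the buffer size, and the way the comparison is structured are assumptions and do not reflect the mk sort code.

```c
#include <assert.h>
#include <string.h>

/*
 * Compare two NUL-terminated keys. strxfrm() is used only when the build
 * defines TRUST_STRXFRM_MK_SORT; otherwise strcoll() is used.
 * (Hypothetical function; the buffer size is arbitrary.)
 */
int
compare_keys(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
#ifdef TRUST_STRXFRM_MK_SORT
    {
        char xa[256];
        char xb[256];

        /* strxfrm() returns the length it needs; use the result only if both fit. */
        if (strxfrm(xa, a, sizeof(xa)) < sizeof(xa) &&
            strxfrm(xb, b, sizeof(xb)) < sizeof(xb))
            return strcmp(xa, xb);
    }
#endif
    return strcoll(a, b);
}
```

Building with `-DTRUST_STRXFRM_MK_SORT` opts back in to the strxfrm() path; without the definition, the strxfrm() branch is not compiled at all.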
Reviewed-by: Asim R P <pasim@vmware.com>
Reviewed-by: Heikki Linnakangas <linnakangash@vmware.com>
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>

4d248acb

28 Aug, 2020 1 commit
- Fix postgres_fdw makefile (#10716) · 3ba32474
  Authored by Xiaoran Wang on Aug 28, 2020
```
* Fix distclean of postgres_fdw and dblink
```
  3ba32474
27 Aug, 2020 2 commits

Fix missing xlog files with orphaned prepared transactions after crash recovery · dafe0c8c

Authored by Paul Guo on Aug 14, 2020

After crash recovery finishes, the startup process will create an
end-of-recovery checkpoint. The checkpoint will recycle/remove xlog files
according to orphaned prepared transaction LSNs, replication slot data, etc.
The orphaned prepared transaction LSN data (TwoPhaseState->prepXacts, etc) for
checkpoint are populated in the startup process RecoverPreparedTransactions(),
but the function is called after the end-of-recovery checkpoint creation so
xlog files with orphaned prepared transactions might be recycled/removed. This
can cause errors such as "requested WAL segment pg_xlog/000000010000000000000009 has already
been removed" when bringing up the crashed primary. For example,
if you run 'gprecoverseg -a -v', you might see a failure with the stack
below. More specifically, this happens when running postgres in single mode in
pg_rewind.
2 0x5673a1 postgres <symbol not found> (xlogutils.c:572)
3 0x567b2f postgres read_local_xlog_page (xlogutils.c:870)
4 0x5658f3 postgres <symbol not found> (xlogreader.c:503)
5 0x56518a postgres XLogReadRecord (xlogreader.c:226)
6 0x54d725 postgres RecoverPreparedTransactions (twophase.c:1955)
7 0x55b0b7 postgres StartupXLOG (xlog.c:7760)
8 0xb0087f postgres InitPostgres (postinit.c:704)
9 0x97a97c postgres PostgresMain (postgres.c:4829)

Fixing this by prescanning the prepared transaction data from xlogs directly
for checkpoint creation if it's still InRecovery (i.e. when
TwoPhaseState->prepXacts is not populated).

Please note that "orphaned" might be temporary (usually happens on an
environment with heavy write-operation load) or permanent unless master dtx
recovery (implies a product bug).

This is a gpdb 6 only issue. On gpdb 7, state files are used to store prepared
transactions during checkpoint so we do not keep the wal files with orphaned
prepared transactions and thus we won't encounter this issue.
Reviewed-by: Asim R P <pasim@vmware.com>

dafe0c8c

Fix assertion failures in BackoffSweeper · 11daae4e

Authored by ggbq on Dec 23, 2019

Previous commits ab74e1c6 and c7befb1d did not completely solve the race
condition, because they did not test for the last iteration of the while/for loop.
This could result in a failed assertion in the following loop. The patch
moves the check to the end of the for loop. This is safe, because
the first iteration will never trigger Assert(activeWeight > 0.0).

Also, another race condition can trigger the assertion
Assert(gl->numFollowersActive > 0). Consider this situation:

    Backend A, B belong to the same statement.

    Timestamp1: backend A's leader is A, backend B's leader is B.

    Timestamp2: backend A's numFollowersActive remains zero due to timeout.

    Timestamp3: Sweeper calculates leader B's numFollowersActive to 1.

    Timestamp4: backend B changes its leader to A even if A is inactive.

We stop sweeping for this race condition just like commit ab74e1c6 did.

Both Assert(activeWeight > 0.0) and Assert(gl->numFollowersActive > 0)
are removed.

(cherry picked from commit b1c19196)

11daae4e

26 Aug, 2020 1 commit

Fix dblink's libpq issue on gpdb 6X (#10656) · 1dc23be2

Authored by Xiaoran Wang on Aug 26, 2020

* Fix dblink's libpq issue

When using dblink to connect to a postgres database, it reports the
following error:
unsupported frontend protocol 28675.0: server supports 2.0 to 3.0

Although dblink.so is dynamically linked against a libpq.so that is compiled
with the option -DFRONTEND, when it is loaded into gpdb and run
it uses the backend libpq that is compiled together with the
postgres program, and so reports the error.

This PR is almost the same as the PR
https://github.com/greenplum-db/gpdb/pull/10617
which fixes the same issue for postgres_fdw.

* Disable dblink on macOS

--exclude-libs and -Bstatic are not supported on macOS

1dc23be2