Commits · 058ac4f9ff7403b5834b9a3ab2631cb2083123fb · Greenplum / Gpdb

29 Jul, 2016 2 commits
- Introduce alien node elimination for code generation. · 058ac4f9
  Authored by Karthikeyan Jambu Rajaraman on Jul 20, 2016
```
Note: We don't do alien node elimination when there is an explain codegen
or explain analyze codegen query in master.
```
  058ac4f9
- Squashing compiler warning by casting void pointer to char pointer during Assert in aset.c. · f63b5d4b
  Authored by Foyzur Rahman on Jul 28, 2016
  
  f63b5d4b
28 Jul, 2016 7 commits

Add dev template pipeline to run GPDB icg with orca · 554f9e87
Authored by Karthikeyan Jambu Rajaraman on Jul 13, 2016
```
Signed-off-by: Shreedhar Hardikar <shardikar@gmail.com>
```
554f9e87

Add missing 'consttypmod' field to outfast/readfast functions. · 63d5743a

Authored by Heikki Linnakangas on Jul 28, 2016

This field was added to Const struct and the text out/read functions in
PostgreSQL 8.3 (upstream commit 0f4ff460). In the merge, we missed that
it also needs to be handled in the "fast" out/read functions.

I wasn't able to come up with a test case that would misbehave because of
this. When a Const is dispatched from the master to segments, the typmod
has already been "applied" to the value, e.g. a numeric has been rounded
to the desired precision. So it doesn't matter that the typmod is not
available in the segments. However, gpcheckcat complained about the
mismatch between the master and segments, both in stored index expressions
and in the atttypmod stored for a view. And if we leave the typmod
uninitialized, it's left as 0, rather than -1 which would mean "no typmod",
so clearly this must be fixed, even if it doesn't have any user-visible
effects at the moment.

Reported by Jimmy.

63d5743a

Fixing the issue with low speed of passing arrays containing nulls into PL/Python functions · e8cf3855

Authored by Alexey Grishchenko on Jul 28, 2016

The original implementation called array_ref for each element of the array. If the array
contains nulls, each call to array_ref loops over the nulls bitmap to
calculate the data offset. For an array with 100,000 elements, this makes performance 350x lower.
This fix uses the code from array_out for efficient array data traversal.
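
The cost difference can be sketched in Python. This is a simplified model, not the PostgreSQL array code: it assumes a hypothetical array stored as a list of null flags plus a packed list of non-null values. `element_at` and `iter_elements` are illustrative names; the first rescans the null flags on every lookup, the second carries the data offset along in a single pass.

```python
def element_at(values, nulls, i):
    """Per-element lookup: rescan the null flags before i to find the data offset."""
    if nulls[i]:
        return None
    offset = sum(1 for is_null in nulls[:i] if not is_null)  # O(i) work per call
    return values[offset]


def iter_elements(values, nulls):
    """Single pass: carry the data offset along instead of recomputing it."""
    offset = 0
    for is_null in nulls:
        if is_null:
            yield None
        else:
            yield values[offset]
            offset += 1


# Hypothetical array [10, NULL, 20, NULL, 30]: only non-null values are stored.
nulls = [False, True, False, True, False]
values = [10, 20, 30]

slow = [element_at(values, nulls, i) for i in range(len(nulls))]
fast = list(iter_elements(values, nulls))
```

Both give the same elements, but fetching every element through `element_at` does work proportional to the square of the array length, while `iter_elements` stays linear.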

e8cf3855

Fixing the issue with PL/Python "subtransaction" regression tests - specifying... · 00a61fcb

Authored by Alexey Grishchenko on Jul 28, 2016

Fixing the issue with PL/Python "subtransaction" regression tests - specifying order for table selects, and adding more verbose output of SPI errors

00a61fcb

Add gpcrondump behave test for -t and --table-file to ignore pg_temp_% · 28bff271

Authored by Marbin Tan on Jul 27, 2016

* We should ignore pg_temp_% during backup even if the user specifies
  the table. There is a potential error if we try to back up pg_temp_%,
  so we will prevent it.

28bff271

gpcrondump should not dump pg_temp schemas · 8d266683

Authored by Marbin Tan on Jul 15, 2016

* We noticed that when gpcrondump is given the -S option (exclude schema),
  it also tries to dump pg_temp schemas.
  This can be reproduced as follows:
  user creates a new schema
  user creates a temp table
  -- Keep the session alive

  On another session:
  Run gpcrondump -S <exclude schema> -x <dbname>

* We've filtered out pg_temp in our queries to prevent this issue.

Dumping pg_temp schemas doesn't make sense, as gpdbrestore can't restore
those schemas and tables anyway, since pg_ and gp_ schemas are
reserved for the system.

Make sure to ignore pg_temp from backup.

* We want to make sure that we ignore pg_temp and not error out if we
  can't find the pg_temp tables during a backup
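
A minimal sketch of the "ignore rather than fail" behavior, in Python. The commit itself filters pg_temp out in gpcrondump's queries; the function below is a hypothetical illustration of the same rule applied to a list of schema-qualified table names, and `filter_dumpable_tables` is not a gpcrondump function.

```python
def filter_dumpable_tables(tables):
    """Drop schema-qualified tables that live in a pg_temp_<n> schema."""
    kept = []
    for name in tables:
        schema, _, _table = name.partition(".")
        if schema.startswith("pg_temp_"):
            continue  # temp schemas cannot be restored, so skip rather than fail
        kept.append(name)
    return kept


# Hypothetical request that explicitly names a temp table.
requested = ["public.sales", "pg_temp_12.scratch", "reports.daily"]
dumpable = filter_dumpable_tables(requested)
```

Here `dumpable` keeps only the two permanent tables, so a temp table that disappears mid-backup can no longer cause an error.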

Authors: Marbin Tan, Karen Huddleston, & Chumki Roy

8d266683

Add ORCA optimizer GUC optimizer_parallel_union for parallel append (#977) · 86e49e51

Authored by Haisheng Yuan on Jul 27, 2016

This GUC is disabled by default.

Currently, when users run a `UNION ALL` query, such as
```sql
SELECT a FROM foo UNION ALL SELECT b FROM bar
```
GPDB will parallelize the execution across all the segments, but the
table scans on `foo` and `bar` are not executed in parallel. With this GUC
enabled, we add a redistribution motion node under the APPEND(UNION) operator,
which makes all the children of the APPEND(UNION) operator execute in
parallel in every segment.

86e49e51

27 Jul, 2016 4 commits

Fix deparsing of group by columns in EXPLAIN of Agg node when Append is the outer child. · f82644f7

Authored by Foyzur Rahman on Jul 21, 2016

If Agg has an Append as the outer child, group by cannot resolve the variable names of
the group by expression. This is because of the incorrect assumption that it can take the outerPlan
of any child to access the outer child of its child, which is not true for Append. For
Append we keep a list of children, and lefttree and righttree are set to NULL.

This breaks bfv_partition test suite when optimizer is enabled. For repro:

```sql
-- start_ignore
drop table if exists t;
drop table if exists p1;
drop table if exists p2;
-- end_ignore

create table p1 (a int, b int) partition by range(b) (start (1) end(100) every (20));
create table p2 (a int, b int) partition by range(b) (start (1) end(100) every (20));
create table t(a int, b int);

insert into t select g, g*10 from generate_series(1,100) g;
insert into p1 select g, g%99 +1 from generate_series(1,10000) g;
insert into p2 select g, g%99 +1 from generate_series(1,10000) g;

analyze t;
analyze p1;
analyze p2;

explain select * from (select * from p1 union select * from p2) as p_all, t where p_all.b=t.b;
```

This commit fixes this by treating Append as a special case and properly extracting the first
subplan under Append as its outer child.

f82644f7

Fixing wrong assumption that PartitionSelector always comes under Sequence. · 48fb3d13

Authored by Foyzur Rahman on Jul 26, 2016

Looks like it can also come under NestedLoop.

This came up in a PR #980 comment, which asked to remove a compiler warning by casting the parentPlan
of PartitionSelector to Sequence. That cast failed an Assert check, showing that parentPlan is not
always a Sequence.

48fb3d13

Fix of PartitionSelector INNER VarNo reference by passing parent plan pointer. · 6a7f02ed

Authored by Foyzur Rahman on Jul 20, 2016

In this PR we revert the earlier fix in #840, but keep some part of it. This
PR is similar in spirit to the earlier one, as it depends on the parent of
PartitionSelector (which is expected to be Sequence) to deparse printablePredicate.
However, this PR resolves the parent from child, instead of resolving child from
parent and localizes the changes to just explain.c. We also keep the old behavior
of clearly isolating Sequence node from PartitionSelector node.

Besides this, we also pass the parent to resolve both outer and inner varno of a
PartitionSelector's printablePredicate, which is also necessary to make bfv_partition green.

6a7f02ed

Fixes syntax issues in backup/restore and gparray. · 5e453a8f

Authored by Chumki Roy on Jul 13, 2016

These are minor fixes identified through static code analysis.

Add unit tests to gprestore filter and restore.

Authors: Chumki Roy and Chris Hajas

5e453a8f

25 Jul, 2016 5 commits

Add description about DF_CANCEL_ON_ERROR, DF_NEED_TWO_PHASE, DF_WITH_SNAPSHOT · 7e7a9aa5
Authored by Pengzhou Tang on Jul 25, 2016

7e7a9aa5

Abandon cdbdisp_dispatchUtilityStatement · 75480b8c

Authored by Pengzhou Tang on Jul 19, 2016

cdbdisp_dispatchUtilityStatement is designed to dispatch a utility statement synchronously, and is used by
DefineIndex(), DefineExternalRelation(), DefineRelation() and createdb(). The former three functions
actually use it in an asynchronous way, so replace it with CdbDispatchUtilityStatement.

createdb() uses it synchronously, but the performance improvement is small. To keep the code clean, abandon
this function and replace it with the asynchronous version.

75480b8c

Refactor command dispatch related function, · f7078db2

Authored by Pengzhou Tang on Jul 12, 2016

The original cdbdisp_dispatchRMCommand() and CdbDoCommand() are easily confused. This commit combines
them into one and pushes error handling down to make coding easier.

f7078db2

Refactor utility statement dispatch interfaces · 01769ada

Authored by Pengzhou Tang on Jul 08, 2016

Refactor CdbDispatchUtilityStatement() to make it flexible enough for cdbCopyStart() and
dispatchVacuum() to call directly. Introduce flags like DF_NEED_TWO_PHASE,
DF_WITH_SNAPSHOT, DF_CANCEL_ON_ERROR to make function calls much clearer.

01769ada

Fix the self-deadlock caused by reentrance of malloc/free when QD is in idle state. · c3fdfd74

Authored by Kenan Yao on Jul 13, 2016

There are two cases leading to this self-deadlock:
(1) SIGALRM for IdleSessionGangTimeout comes when QD is in malloc function call
of SSL code for example, and the handler HandleClientWaitTimeout would call
function free to destroy Gangs, hence we are calling free inside malloc, which
would produce a deadlock;
(2) If a SIGUSR1 comes when we are inside HandleClientWaitTimeout and calling
function free, then we would be interrupted to process the Catchup event first,
in which we would possibly call malloc, hence we are calling malloc inside free,
which would cause a deadlock also.

To fix this issue, for case 1, we only enable SIGALRM handling of IdleSessionGangTimeout
exactly before recv call, which can protect malloc/free from being interrupted
by this kind of SIGALRM; for case 2, we prevent reentrant signal handling.

This fix is mainly borrowed from a patch by Tao Ma in the Apache HAWQ project.

c3fdfd74

24 Jul, 2016 1 commit
- We don't need "Development Tools" · 57a9e28a
  Authored by Jesse Zhang on Jul 23, 2016
  
  57a9e28a
23 Jul, 2016 1 commit
- Use PGFuncGeneratorInfo to codegen pg functions · dcd40712
  Authored by Karthikeyan Jambu Rajaraman on Jul 13, 2016
  
  dcd40712
22 Jul, 2016 7 commits

Fix compiler warning. · 15bc00dc

Authored by Heikki Linnakangas on Jul 21, 2016

The compiler rightly pointed out that this memset() call was not zeroing
the whole struct, as it was clearly intended to. 'dd_options' is a pointer,
so sizeof(dd_options) is only 4 or 8, depending on architecture. That was
harmless, because the caller just zeroed out the array, but let's silence
the compiler warning. Zeroing it twice is not necessary, but this is not
at all performance-critical, so I don't want to second-guess the intentions
of the original coder right now.
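
The pointer-versus-struct size mix-up can be reproduced from Python with `ctypes`, as a rough sketch. `Options` is a hypothetical 32-byte stand-in for the real struct, and the buffer is pre-filled with 0xAA only to make the uncleared bytes visible (in the actual code the caller had already zeroed it, which is why the bug was harmless).

```python
import ctypes


class Options(ctypes.Structure):
    # Hypothetical stand-in for the real options struct; 32 bytes on any platform.
    _fields_ = [("fields", ctypes.c_uint8 * 32)]


def fill(opts):
    """Overwrite the whole struct with a visible non-zero pattern."""
    ctypes.memset(ctypes.addressof(opts), 0xAA, ctypes.sizeof(opts))


opts = Options()
ptr = ctypes.pointer(opts)

# Buggy pattern: sizeof(pointer) is 4 or 8, so only the first few bytes are cleared.
fill(opts)
ctypes.memset(ctypes.addressof(opts), 0, ctypes.sizeof(ptr))
partially_cleared = bytes(opts.fields)

# Intended pattern: sizeof(struct) covers the whole object.
fill(opts)
ctypes.memset(ctypes.addressof(opts), 0, ctypes.sizeof(ptr.contents))
fully_cleared = bytes(opts.fields)
```

With the pointer size, everything past the first 4 or 8 bytes keeps the 0xAA pattern; with the struct size, all 32 bytes are zero.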

15bc00dc

Fix query used in regression tests. · 376dd9f1
Authored by Heikki Linnakangas on Jul 21, 2016
```
It was broken by the removal of int to text cast, as part of the PostgreSQL
8.3 merge.
```
376dd9f1

Replace usage of RSA BSAFE lockbox file with custom implementation. · 496b48fc

Authored by Heikki Linnakangas on Jul 21, 2016

The proprietary RSA BSAFE library was used for one thing only: for storing
the "lockbox" file containing the credentials for DDBoost. Replace with
a little hand-written key-value configuration file format, with password
obfuscation based on XOR, rand() and base64 encoding. This allows us to
finally remove the dependency on the BSAFE library.
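
As a rough illustration of that kind of obfuscation (not the actual file format or algorithm from this commit), the Python sketch below XORs the password with a pseudo-random keystream and base64-encodes the result. The function names and the seed parameter are assumptions, and `random.Random` stands in for C's `srand()`/`rand()`. This is obfuscation only, not encryption.

```python
import base64
import random


def _keystream(seed, length):
    """Deterministic pseudo-random bytes; stand-in for C's srand()/rand()."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


def obfuscate(password, seed):
    """XOR the password with the keystream, then base64-encode the result."""
    raw = password.encode("utf-8")
    mixed = bytes(b ^ k for b, k in zip(raw, _keystream(seed, len(raw))))
    return base64.b64encode(mixed).decode("ascii")


def deobfuscate(token, seed):
    """Reverse the steps: base64-decode, then XOR with the same keystream."""
    mixed = base64.b64decode(token)
    raw = bytes(b ^ k for b, k in zip(mixed, _keystream(seed, len(mixed))))
    return raw.decode("utf-8")


token = obfuscate("s3cret", seed=42)
restored = deobfuscate(token, seed=42)
```

Because XOR is its own inverse, applying the same keystream twice restores the password, and the base64 step keeps the stored value printable in a text configuration file.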

The configuration file is now in a human-readable and -editable format
(except for the password), which is a bonus.

496b48fc

Clean up the compiler and linker options used to build ddboost stuff. · 92dd39ea

Authored by Heikki Linnakangas on Jul 21, 2016

Passing CPPFLAGS when building libgpbsa75.so and libgpbsa71.so allows the
compilation to work, if ddboost header files are in a directory specified
in "./configure --with-includes=...". Using CFLAGS_SL rather than plain
CFLAGS ensures we use the correct flags determined by autoconf, for building
a shared library. That includes -fPIC on platforms that need it, so we don't
need to pass that explicitly anymore.

Move $(LIBS) after all the other libraries in all the command lines.
Doesn't make a difference right now, but with the upcoming patch to get rid
of the dependency on RSA BSAFE, it seems to be necessary that "-ldl" is
passed after -lDDBoost. Apparently, DDBoost.so uses dlopen() and friends,
and gets upset unless -ldl is passed after -lDDBoost (-ldl is part of LIBS,
at least on my system). This is probably highly platform-dependent, but I
think this order should work everywhere, even if it doesn't matter for
some platforms.

92dd39ea

Fix bugs in cdb_dump's base64 implementation. · 8a28d485

Authored by Heikki Linnakangas on Jul 21, 2016

It was copied from src/backend/utils/adt/encode.c, but a few bugs were
introduced in the translation:

* Confusion between signed and unsigned datatypes, causing the encoding
  routine to work incorrectly for bytes >= 128. They all got decoded as
  0xFF.

* Base64ToData() returned a conservative estimate of the length, rather
  than the exact length of the original data.

While we're at it, also mark the input args as 'const', and add a test
case.
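
The two properties being fixed can be checked with a short Python sketch. It uses Python's standard `base64` module rather than cdb_dump's C routines, and the two length helpers are hypothetical names that assume standard padded base64 without embedded whitespace.

```python
import base64


def decoded_length_estimate(encoded):
    """Conservative upper bound: three bytes for every four characters."""
    return (len(encoded) * 3) // 4


def decoded_length_exact(encoded):
    """Exact length: subtract one byte per trailing '=' padding character."""
    padding = len(encoded) - len(encoded.rstrip("="))
    return decoded_length_estimate(encoded) - padding


high_bytes = bytes([0x7F, 0x80, 0xC3, 0xFF])  # includes values >= 128
encoded = base64.b64encode(high_bytes).decode("ascii")
decoded = base64.b64decode(encoded)
```

A correct implementation round-trips bytes at or above 128 unchanged, and the exact length matches the original data even when the padded estimate is larger.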

8a28d485

Remove dead function. · 655570ac

Authored by Heikki Linnakangas on Jul 21, 2016

This is a remnant of an old configuration file format that pre-dates the
RSA BSAFE lockbox implementation. I'm actually about to replace the current
lockbox implementation with something that resembles this dead function, but
it's not going to be quite the same, so let's get this dead function out of
the way first.

655570ac

Add missing #includes. · 7fdb692a

Authored by Heikki Linnakangas on Jul 21, 2016

PGConn and PQExpBuffer are used in these header files, so they should
include the headers that provide them. Things have worked without this,
because all the .c files that use these happen to include the needed
libpq header first, but let's be tidy.

7fdb692a

21 Jul, 2016 7 commits
- Replace dynamic_cast with dyn_cast for llvm objects · 83869d1c
  Authored by Karthikeyan Jambu Rajaraman on Jul 20, 2016
  
  83869d1c
- Fix missed test case in gpcheckcat · 558c710c
  Authored by Christopher Hajas on Jul 20, 2016
```
Fixing a test in gpcheckcat from commit f9fcaafd, where the missing/extraneous
repair must be run with the -E option.
```
  558c710c
- Decide what to generate for ExecEvalExpr based on PlanState · 23007017
  Authored by Shreedhar Hardikar on Jul 13, 2016
```
  * Skip generating ExecEvalExpr ScanNodes when ProjInfo is a Var list
  * Skip generating SlotGetAttr in AggNodes
```
  23007017
- Add EXPLAIN CODEGEN to print generated IR to the client · 387d8ce8
  Authored by Shreedhar Hardikar on Jul 14, 2016
```
This will aid tremendously in debugging codegen because the generated IR
will be easily available to be seen. The rest of the functionality of
EXPLAIN is maintained. Also, we only perform the compilation step if it
is not an EXPLAIN-only query. That is, EXPLAIN CODEGEN will perform code
generation only and will not compile the resultant code. EXPLAIN ANALYZE
will do both, and also execute the query.
```
  387d8ce8
- Introduce Datum to cpp cast, cpp type to Datum cast and normal cast. (#944) · c4a9bd27
  Authored by Karthikeyan Jambu Rajaraman on Jul 07, 2016
  
  c4a9bd27
- Add -E flag to gpcheckcat to perform missing/extraneous repair · f9fcaafd
  Authored by Christopher Hajas on Jul 18, 2016
```
The missing and extraneous repair in gpcheckcat should not generate a repair file unless the -E flag
is provided, as this repair does not cover all cases.

Authors: Chris Hajas and Karen Huddleston
```
  f9fcaafd
- Record external function names for useful debugging · 66158dfd
  Authored by Shreedhar Hardikar on Jul 18, 2016
  
  66158dfd
20 Jul, 2016 6 commits

Fix copy-paste typo in append only code comments · 787b3ce2

Authored by Daniel Gustafsson on Jul 20, 2016

The appendonlygettup() method fetches appendonly tuples, not heap
tuples; this is probably a copy-paste artefact from heapam.c, which
has the same comment. While there, also remove unused file header
tags.

787b3ce2

Fix annoyances in the regression test added for memtuple-forming bug. · 48f9ab4a

Authored by Heikki Linnakangas on Jul 20, 2016

Commit 785c9bea, "Fix memtuple forming bug
when null-saved size is larger than 32763 bytes", added a regression test
that was annoying for multiple reasons:

* The 'aggregates' regression test is from the upstream. We're trying to keep
upstream test files unchanged, to make diffing and merging with the upstream
easier.

* The query is huge, about 250000 bytes, and it's all on one line. Not all
tools play well with such long lines. Emacs became really sluggish and the
laptop fan started screaming, when I moved the cursor over that line. When I
tried to copy-paste that into psql, my terminal ground to a halt. It took
minutes to process it, i.e. just to paste the line to the terminal. (Note to
self: get a better terminal program :-) ).

* The query takes a very long time to run, about 30 seconds on my laptop. I
did some profiling: all the time is spent in the code in ExecInitAgg() that
checks for duplicate aggregates. That loop is O(n^2), and calls equal() on
the AggRef expressions on each iteration.

* When the query doesn't crash because of the bug that was fixed, it produces
an "invalid timestamp" error. Seems surprising that the expected result is
an error, unless you're specifically testing for error handling. It also
makes me worried that the test might fail to test the right thing in the
future, if the planner decides to rearrange things so that the cast to
timestamp is evaluated first, so that it errors out before even constructing
the array with the aggregate results.
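
The quadratic cost of the duplicate check described in the third point can be modelled in a few lines of Python. This is a simplified sketch, not the ExecInitAgg() code: `count_duplicate_checks` is a hypothetical function, and plain string comparison stands in for a deep equal() on expression trees.

```python
def count_duplicate_checks(exprs):
    """Compare each expression against all earlier unique ones and count comparisons."""
    comparisons = 0
    unique = []
    for expr in exprs:
        for seen in unique:
            comparisons += 1
            if seen == expr:  # stands in for a deep equal() on expression trees
                break
        else:
            unique.append(expr)
    return len(unique), comparisons


# n distinct expressions never match, so every pair is compared: n * (n - 1) / 2.
n = 1000
unique_count, comparisons = count_duplicate_checks(
    [f"c4 % {i}" for i in range(1, n + 1)]
)
```

With 1000 distinct aggregates the loop performs 499500 comparisons, so making each comparison cheaper lowers the constant factor but leaves the O(n^2) growth in place.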

To fix those issues:

* Move the test query to 'bfv_aggregate' test. BFV stands for Bug Fix
Verification, i.e. checking that old bugs haven't reappeared, and that's
exactly what this is. (There are other GPDB additions in 'aggregates' test
besides this, but let's not make it worse.)

* Split the test query into reasonably-sized lines.

* Replace the aggregated expression with a simpler one. I'm hesitant to try
optimizing the O(n^2) code, because code comes from the upstream. There
have been some changes in that area in recent versions of PostgreSQL, and
I'm not sure if the recent changes will make this better or worse, but not
much point spending time on it before we catch up. And then we should get
any improvements there committed to the upstream first.

Replacing the CASE-WHEN construct with something simpler doesn't fix the
O(n^2) behavior, but it makes the constant factor smaller, by making the
equal() calls cheaper. I replaced it with "c4 % [number]", and now the
query takes about 17 seconds on my laptop (was 30 s). That's still longer
than I'd wish, but every little helps.

* Remove the cast to timestamp that was causing the error on success.

In passing, also add a comment explaining what the purpose of the test
is, and some notes on the O(n^2) behavior.

48f9ab4a

Remove redundant 'functions' bugbuster test. · c8c8084b
Authored by Heikki Linnakangas on Jul 19, 2016
```
Identical queries were in the 'qp_functions' test of the main suite
```
c8c8084b
Remove redundant bugbuster 'sirv_functions' test. · 5a96884f
Authored by Heikki Linnakangas on Jul 19, 2016
```
There is a 'sirv_functions' test in the main greenplum schedule, which
contains all the same test queries.
```
5a96884f

Remove redundant tests. · c442971d

Authored by Heikki Linnakangas on Jul 19, 2016

There are similar tests elsewhere. There's a test for an indexable
row comparison in the 'rowtypes' test, and for "ORDER BY col DESC" on
an indexed column, in the 'select' test.

c442971d

Remove redundant test queries. · 44a4cdd1
Authored by Heikki Linnakangas on Jul 19, 2016
```
There's a test for "ALTER DOMAIN ... DROP NOT NULL" in the 'domain' test.
```
44a4cdd1