1. 04 Dec 2018, 2 commits
  2. 30 Nov 2018, 1 commit
  3. 29 Nov 2018, 1 commit
    • Compare with None using the is operator · e39047b5
      Daniel Gustafsson committed
      While == None works for comparison, it is a wasteful operation because it
      performs type conversion and expansion. Instead, move to using the
      "is" operator, which is the documented best practice for Python code.
      
      Reviewed-by: Jacob Champion
      e39047b5
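      A minimal, generic illustration of the pattern this commit adopts (not code
      taken from the repository):

        def describe(value):
            # "is" checks identity against the None singleton; it skips the
            # __eq__ machinery that "== None" would invoke.
            if value is None:
                return "missing"
            return str(value)

        print(describe(None))  # missing
        print(describe(42))    # 42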
  4. 25 Oct 2018, 3 commits
    • Unify the way to fetch/manage the number of segments (#6034) · 8eed4217
      Tang Pengzhou committed
      * Don't use GpIdentity.numsegments directly for the number of segments
      
      Use getgpsegmentCount() instead.
      
      * Unify the way to fetch/manage the number of segments
      
      Commit e0b06678 lets us expand a GPDB cluster without a restart, so the
      number of segments may now change during a transaction and we need to
      handle numsegments with care.
      
      We now have two ways to get the number of segments: 1) from GpIdentity.numsegments,
      and 2) from gp_segment_configuration (cdb_component_dbs), which the dispatcher
      uses to decide the range of segments to dispatch to. Commit e0b06678 did a lot
      of hard work to keep GpIdentity.numsegments up to date, which made the
      management of segments more complicated; now we want a simpler way
      to do it:
      
      1. Segment info (including the number of segments) may only be obtained through
      gp_segment_configuration, which always holds the newest segment info. There is
      no need to update GpIdentity.numsegments anymore; GpIdentity.numsegments is
      left only for debugging and can be removed entirely in the future.
      
      2. Each global transaction fetches the newest snapshot of
      gp_segment_configuration and never changes it until the end of the transaction
      unless a writer gang is lost, so a global transaction sees a consistent
      state of the segments. We used gxidDispatched for the same purpose before; now
      it can be removed.
      
      * Remove GpIdentity.numsegments
      
      GpIdentity.numsegments has no effect now, so remove it. This commit
      does not remove gp_num_contents_in_cluster because that requires
      modifying utilities like gpstart, gpstop, gprecoverseg, etc.; let's
      do that cleanup work in another PR.
      
      * Exchange the default UP/DOWN value in fts cache
      
      Previously, the FTS prober read gp_segment_configuration, checked the
      status of the segments, and then recorded that status in the shared
      memory array ftsProbeInfo->fts_status[], so other components
      (mainly the dispatcher) could detect that a segment was down.
      
      All segments were initialized as DOWN and then updated to UP in the
      most common case, which brings two problems:
      
      1. fts_status is invalid until FTS completes its first loop, so the QD
      needs to check ftsProbeInfo->fts_statusVersion > 0.
      2. When gpexpand adds a new segment to gp_segment_configuration, the
      newly added segment may be marked DOWN if FTS has not scanned it
      yet.
      
      This commit changes the default value from DOWN to UP, which resolves
      both problems mentioned above.
      
      * FTS should not be used to notify backends that a gpexpand has occurred
      
      As Ashwin mentioned in PR #5679, "I don't think giving FTS responsibility to
      provide new segment count is right. FTS should only be responsible for HA
      of the segments. The dispatcher should independently figure out the count
      based on catalog. gp_segment_configuration should be the only way to get
      the segment count", so FTS should be decoupled from gpexpand.
      
      * Access gp_segment_configuration inside a transaction
      
      * Upgrade the log level from ERROR to FATAL if the expand version has changed
      
      * Modify gpexpand test cases according to the new design
      8eed4217
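      A hedged sketch of the new rule that gp_segment_configuration is the single
      source of truth for the segment count; this is illustrative PyGreSQL-style
      Python, not the backend C change (which goes through getgpsegmentCount()):

        import pg

        db = pg.DB(dbname="postgres")
        # content >= 0 excludes the master (content = -1); role = 'p' selects
        # primaries, so this count is the number of segments in the cluster.
        numsegments = db.query(
            "SELECT count(*) FROM gp_segment_configuration "
            "WHERE content >= 0 AND role = 'p'").getresult()[0][0]
        print(numsegments)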
    • Fix bad variable name · dcd6e301
      Jamie McAtamney committed
      Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
      dcd6e301
    • Update gp_distribution_policy in gpexpand · 20d55451
      Jamie McAtamney committed
      After commit 4eb65a53 added the numsegments column to gp_distribution_policy,
      gpexpand would no longer correctly redistribute tables over the new segments
      after an expansion.  ALTER TABLE ... SET WITH(REORGANIZE=true) now relies upon
      the numsegments value, and that value was not being updated by gpexpand.
      
      This commit adds an UPDATE statement before every redistribution statement to
      set numsegments to the proper value for the corresponding table so that the
      data is correctly redistributed over the new segments.
      Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
      20d55451
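      A hedged sketch of the per-table fix described above; the table name and new
      cluster size are placeholders, and gpexpand generates the real statements itself:

        import pg

        db = pg.DB(dbname="postgres")
        table, new_size = "public.sales", 3  # hypothetical example values
        # Record the new cluster size for this table first ...
        db.query("UPDATE gp_distribution_policy SET numsegments = %d "
                 "WHERE localoid = '%s'::regclass" % (new_size, table))
        # ... then redistribute the data over the new segments.
        db.query("ALTER TABLE %s SET WITH(REORGANIZE=true)" % table)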
  5. 17 Oct 2018, 1 commit
  6. 15 Oct 2018, 1 commit
    • Online expand without restarting gpdb cluster (#5679) · e0b06678
      Ning Yu committed
      * Protect catalog changes on master.
      
      To allow gpexpand to do its job without restarting the cluster we need
      to prevent concurrent catalog changes on the master. A catalog lock is
      provided for this purpose: all inserts/updates/deletes to catalog tables
      must hold this lock in shared mode before making changes; gpexpand
      holds this lock in exclusive mode to 1) wait for in-progress catalog
      updates to commit or roll back and 2) prevent concurrent catalog updates.
      
      Add UDF to hold catalog lock in exclusive mode.
      
      Add test cases for the catalog lock.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Numsegments management.
      
      GpIdentity.numsegments is a global variable that stores the cluster size in
      each process. It is important to the cluster: a change of the cluster size
      means the expansion has finished and the new segments have taken effect.
      
      FTS counts the number of primary segments from gp_segment_configuration
      and records it in shared memory. Later, other processes, including the master,
      FTS, GDD, and QDs, update GpIdentity.numsegments with this information.
      
      So far it is not easy to make old transactions run with new segments,
      so on the QD GpIdentity.numsegments can only be updated at the beginning
      of a transaction. Old transactions can only see the old segments. The QD
      dispatches GpIdentity.numsegments to the QEs, so they see the same cluster size.
      
      Catalog changes in old transactions are disallowed.
      
      Consider the workflow below:
      
              A: begin;
              B: gpexpand from N segments to M (M > N);
              A: create table t1 (c1 int);
              A: commit;
              C: drop table t1;
      
      Transaction A began when the cluster size was still N, so all its commands
      are dispatched to only N segments, even after the cluster has been expanded
      to M segments. As a result, t1 is created on only N segments; not only the
      data distribution but also the catalog records are affected. This causes the
      later DROP TABLE to fail on the new segments.
      
      To prevent this issue we currently disable all catalog updates in old
      transactions once the expansion is done. New transactions can update the
      catalog as they are already running on all M segments.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Online gpexpand implementation.
      
      Do not restart the cluster during expansion.
      - Lock the catalog
      - Create the new segments from master segment
      - Add new segments to cluster
      - Reload the cluster so new transactions and background processes
        can see the new segments
      - Unlock the catalog
      
      Add test cases.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * New job to run ICW after expansion.
      
      It is better to run ICW after online expansion to see whether the
      cluster still works well, so we add a new job that first creates a
      cluster with two segments, then expands it to three segments and runs
      all the ICW cases. Restarting the cluster is forbidden because we must
      be sure all the cases pass after an online expansion, so any case
      that restarts the cluster must be removed from this job.
      
      The new job is icw_planner_centos7_online_expand; it is the same as
      icw_planner_centos7 but adds two new parameters, EXCLUDE_TESTS and
      ONLINE_EXPAND. If ONLINE_EXPAND is set, the ICW shell takes a
      different branch: creating a cluster of size two, expanding it, etc.
      EXCLUDE_TESTS lists the cases that restart the cluster, as well as
      cases that would fail once the restarting cases are excluded.
      
      After the whole run, the pid of the master is checked; if it has
      changed, the cluster must have been restarted, so the job fails.
      As a result, any new test case that restarts the cluster should be
      added to EXCLUDE_TESTS.
      
      * Add README.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Small changes per review comments.
      
      - Delete confusing test case
      - Change function name updateBackendGpIdentityNumsegments to
        updateSystemProcessGpIdentityNumsegments
      - Fix some typos
      
      * Remove useless Assert
      
      These two asserts cause failures in isolation2/uao/
      compaction_utility_insert. That case checks AO tables in utility
      mode, but numsegments is meaningless for a single segment, so the
      asserts must be removed.
      e0b06678
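      A very rough sketch of the flow listed under "Online gpexpand implementation"
      above; every helper below is a hypothetical stand-in for gpexpand internals,
      shown only to make the ordering (lock, create, add, reload, unlock) explicit:

        def acquire_catalog_lock_exclusive():
            print("catalog locked in exclusive mode")

        def create_segments_from_master(hosts):
            print("creating new segments on %s from the master segment" % hosts)
            return hosts

        def add_segments_to_cluster(segments):
            print("registering %s in gp_segment_configuration" % segments)

        def reload_cluster_configuration():
            print("reloading so new transactions and background processes "
                  "see the new segments")

        def release_catalog_lock():
            print("catalog unlocked")

        def online_expand(new_hosts):
            acquire_catalog_lock_exclusive()
            try:
                add_segments_to_cluster(create_segments_from_master(new_hosts))
                reload_cluster_configuration()
            finally:
                release_catalog_lock()

        online_expand(["sdw3", "sdw4"])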
  7. 18 Sep 2018, 1 commit
    • Fix gpexpand: start new mirror segments after expansion. (#5773) · 018525e4
      Jialun committed
      We found that gpexpand on master does not start the new mirror segments after expansion, and the new mirrors are marked as down in gp_segment_configuration. This is because the function sync_new_mirrors was removed in commit 28eec592, and the test cases did not cover the status check. So we re-add this function and add a status check to the test cases.
      018525e4
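      A hedged sketch of the status check the re-added test coverage implies:
      after expansion, every mirror row in gp_segment_configuration should report
      status 'u' (up), not 'd' (down). Connection details are placeholders:

        import pg

        db = pg.DB(dbname="postgres")
        down_mirrors = db.query(
            "SELECT content, hostname FROM gp_segment_configuration "
            "WHERE role = 'm' AND status = 'd'").getresult()
        assert not down_mirrors, "mirrors still down after expansion: %s" % down_mirrors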
  8. 17 Aug 2018, 1 commit
  9. 15 Aug 2018, 1 commit
  10. 29 Mar 2018, 1 commit
    • Support replicated table in GPDB · 7efe3204
      Pengzhou Tang committed
      * Support replicated table in GPDB
      
      Currently, tables in GPDB are distributed across all segments by hash or randomly. There
      is a requirement to introduce a new table type, called a replicated table, in which every
      segment holds a full, duplicate copy of the table data.
      
      To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark
      a replicated table, and a new locus type named CdbLocusType_SegmentGeneral to describe
      the distribution of tuples of a replicated table. CdbLocusType_SegmentGeneral implies the
      data is generally available on all segments but not on the qDisp, so a plan node with
      this locus type can be flexibly planned to execute on either a single QE or all QEs. It is
      similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral
      node can't be executed on the qDisp. To guarantee this, we try our best to add a gather motion
      on top of a CdbLocusType_SegmentGeneral node when planning motions for a join, even if the
      other rel has a bottleneck locus type. A problem is that such a motion may be redundant if the
      single QE is ultimately not promoted to execute on the qDisp, so we need to detect that case
      and omit the redundant motion at the end of apply_motion(). We don't reuse
      CdbLocusType_Replicated since it always implies a broadcast motion below it, and it's not easy
      to plan such a node as direct dispatch to avoid getting duplicate data.
      
      We don't support replicated tables with inherit/partition-by clauses yet; the main problem is
      that update/delete on multiple result relations can't work correctly, and we can fix this
      later.
      
      * Allow spi_* to access replicated table on QE
      
      Previously, GPDB didn't allow a QE to access non-catalog tables because their
      data is incomplete; we can remove this limitation now as long as the QE only
      accesses replicated tables.
      
      One problem is that a QE needs to know whether a table is replicated.
      Previously, QEs didn't maintain the gp_distribution_policy catalog, so we
      need to pass the policy info to the QEs for replicated tables.
      
      * Change schema of gp_distribution_policy to identify replicated table
      
      Previously, we used the magic number -128 in the gp_distribution_policy table
      to identify replicated tables, which was quite a hack, so we add a new column
      to gp_distribution_policy to identify replicated tables and partitioned
      tables.
      
      This commit also abandons the old way of using a 1-length NULL list and a
      2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED
      FULLY clauses.
      
      Besides that, this commit refactors the code to make the decision-making around
      distribution policy clearer.
      
      * support COPY for replicated table
      
      * Disable the row ctid unique path for replicated tables.
        Previously, GPDB used a special Unique path on the row id to handle queries
        like "x IN (subquery)". For example, for
        select * from t1 where t1.c2 in (select c2 from t2), the plan looks
        like:
         ->  HashAggregate
               Group By: t1.ctid, t1.gp_segment_id
               ->  Hash Join
                     Hash Cond: t2.c2 = t1.c2
                     ->  Seq Scan on t2
                     ->  Hash
                           ->  Seq Scan on t1
      
        Obviously, the plan is wrong if t1 is a replicated table because ctid
        + gp_segment_id can't identify a tuple; in a replicated table, a logical
        row may have different ctid and gp_segment_id values on different
        segments. So we disable such plans for replicated tables temporarily.
        This is not the best approach, because the rowid unique path may be
        cheaper than a normal hash semi join, so we leave a FIXME for later
        optimization.
      
      * ORCA-related fix
        Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
        Fall back to the legacy query optimizer for queries over replicated tables
      
      * Adapt pg_dump/gpcheckcat to replicated tables
        gp_distribution_policy is no longer a master-only catalog, so do the
        same checks as for other catalogs.
      
      * Support gpexpand on replicated tables && altering the dist policy of replicated tables
      7efe3204
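      A hedged usage sketch for the feature, assuming the DISTRIBUTED REPLICATED
      syntax this work introduces; the table and the final catalog lookup are only
      illustrative, and gp_distribution_policy columns vary across GPDB versions:

        import pg

        db = pg.DB(dbname="postgres")
        # Every segment keeps a full copy of a replicated table's data.
        db.query("CREATE TABLE spare_parts (id int, name text) DISTRIBUTED REPLICATED")
        # The policy row for the table records that it is replicated.
        print(db.query(
            "SELECT * FROM gp_distribution_policy "
            "WHERE localoid = 'spare_parts'::regclass").getresult())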
  11. 07 Feb 2018, 1 commit
    • gpstart: fix OOM issue · a993ef03
      Shoaib Lari committed
      gpstart did a cluster-wide check of heap_checksum settings and refused
      to start the cluster if this setting was inconsistent. This meant a
      round of ssh'ing across the cluster, which was causing OOM errors on
      large clusters.
      
      This commit moves the heap_checksum validation to gpsegstart.py, and
      changes the logic so that only those segments which have the same
      heap_checksum setting as master are started.
      
      Author: Jim Doty <jdoty@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
      a993ef03
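      A small sketch of the new start-up filtering logic, reduced to a pure
      function over (segment, heap_checksum) pairs; the real code gathers these
      values per host inside gpsegstart.py:

        def segments_to_start(master_checksum, segment_checksums):
            # Only segments whose heap_checksum setting matches the master's
            # are started; mismatched segments are skipped.
            return [seg for seg, checksum in segment_checksums
                    if checksum == master_checksum]

        print(segments_to_start(True, [("seg0", True), ("seg1", False), ("seg2", True)]))
        # ['seg0', 'seg2']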
  12. 30 Jan 2018, 2 commits
    • Address PR comments: · 73763d6c
      Shoaib Lari committed
      - Change gpdeletesystem to delete tablespaces before the datadir
      - Refactor SegmentStart.noWait to pg_ctrl_wait
      - Create PgBaseBackup class
      - Revert the change to default Mirror status
      - Assorted typos and bugfixes
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Jim Doty <jdoty@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      73763d6c
    • gpexpand: Make it work with segwalrep. · 4534d171
      Shoaib Lari committed
      The gpexpand utility and associated code are modified to work with the WALREP
      code. Previously, gpexpand only worked with the primaries and relied on Filerep
      to populate the mirrors. We are changing gpexpand so that it initializes the
      mirrors using pg_basebackup and sets them up for WAL replication.
      
      Also, since the WALREP branch removed filespaces, we have done the same
      and replaced references to filespaces with the data directory of the segments.
      
      Author: Marbin Tan <mtan@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      4534d171
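      A hedged sketch of what "initializes the mirrors using pg_basebackup" means
      in practice; the flags shown are standard pg_basebackup options, while the
      real gpexpand code adds more GPDB-specific handling around the call:

        import subprocess

        def init_mirror(primary_host, primary_port, mirror_datadir):
            subprocess.check_call([
                "pg_basebackup",
                "-h", primary_host,
                "-p", str(primary_port),
                "-D", mirror_datadir,   # the mirror's data directory
                "-X", "stream",         # stream WAL during the base backup
                "-R",                   # write recovery.conf so it starts as a standby
            ])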
  13. 13 Jan 2018, 9 commits
  14. 09 Jan 2018, 2 commits
    • gppylib: refactor SegmentPair to not support multiple mirrors · a19f7327
      Shoaib Lari committed
      Long ago, we thought we might need to support multiple mirrors. But we
      don't, and don't foresee that coming soon. Simplify the code to only ever
      have one mirror, but still allow for the possibility of no mirrors.
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
      a19f7327
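      A tiny sketch of the simplified shape: a pair is one primary plus at most
      one mirror (illustrative only, not the actual gppylib class):

        class SegmentPair(object):
            # One primary plus at most one mirror; mirror is None when the
            # cluster is configured without mirrors.
            def __init__(self, primary, mirror=None):
                self.primary = primary
                self.mirror = mirror

            def has_mirror(self):
                return self.mirror is not None

        pair = SegmentPair("primary-seg0")
        print(pair.has_mirror())  # False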
    • gppylib: Rename gpArray variables and classes · bbc47080
      Marbin Tan committed
      The gpArray use of the GpDB and Segment classes was confusing. This change renames
      GpDB to Segment and Segment to SegmentPair to clarify usage. It's a big diff, but
      a simple, repetitive change.
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Marbin Tan <mtan@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
      bbc47080
  15. 06 Dec 2017, 1 commit
  16. 14 Nov 2017, 1 commit
  17. 01 Sep 2017, 1 commit
    • unix.RemoveDirectories: optimize and isolate cases · 872c02b3
      Shoaib Lari committed
      Previously, RemoveDirectories and RemoveFiles used the unix command
      "rm -rf", but this is inefficient for huge numbers of files.
      Also, these functions accepted any globbed path.
      
      Instead, use "rsync" to optimize deletion of files in a directory.
      On a DCA using 1 million files, this increased speed by about 3x.
      
      Also, this commit breaks up the different use cases of deletion into
      separate methods, adding RemoveDirectoryContents(), RemoveFile(),
      and RemoveGlob() to help isolate the assumptions of each case and
      optimize for them.
      Signed-off-by: Larry Hamel <lhamel@pivotal.io>
      Signed-off-by: Shoaib Lari <slari@pivotal.io>
      872c02b3
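      A hedged sketch of the rsync trick behind the speed-up: syncing an empty
      directory over the target deletes its contents faster than "rm -rf" when
      there are huge numbers of files (the function name here is illustrative):

        import subprocess
        import tempfile

        def remove_directory_contents(path):
            # rsync deletes everything in `path` that is absent from the
            # (empty) source directory.
            empty_dir = tempfile.mkdtemp()
            subprocess.check_call(
                ["rsync", "-a", "--delete", empty_dir + "/", path + "/"])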
  18. 30 Aug 2017, 4 commits
  19. 02 Jun 2017, 1 commit
  20. 04 May 2017, 3 commits
  21. 08 Feb 2017, 2 commits