- 14 Mar 2019 (1 commit)

Committed by Daniel Gustafsson

As we merge with upstream and thereby keep refining the Postgres planner, "legacy planner" is no longer a suitable name. This changes all variations of the spelling (legacy planner, legacy optimizer, legacy query optimizer, etc.) to say "Postgres" rather than "legacy".

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: David Yozie <dyozie@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
- 12 Mar 2019 (1 commit)

Committed by David Krieger

This commit is part of Add Partitioned Indexes (#7047). Tests were added to verify:

- Index-backed constraint names match the index name.
- Constraints on partition tables, including ADD PARTITION and EXCHANGE PARTITION.
- Constraints and indexes can be upgraded. This includes testing directly in pg_regress, or creating tables to be used by pg_upgrade.

Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
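As a hedged illustration of the first point (the table and constraint names below are invented, not from the commit), the name of the index backing a constraint can be checked through pg_constraint.conindid:

```sql
-- Hypothetical names; the backing index shares the constraint's name.
CREATE TABLE conidx_demo (a int, CONSTRAINT conidx_demo_a_key UNIQUE (a));

SELECT c.conname, c.conindid::regclass AS index_name
FROM pg_constraint c
WHERE c.conrelid = 'conidx_demo'::regclass;
-- both columns should read conidx_demo_a_key
```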
-
- 11 Mar 2019 (1 commit)

Committed by Ning Yu

This method was introduced to improve data redistribution performance during gpexpand phase 2; however, per benchmark results the effect did not meet our expectations. For example, when expanding a table from 7 segments to 8 segments the reshuffle method is only 30% faster than the traditional CTAS method, and when expanding from 4 to 8 segments reshuffle is even 10% slower than CTAS. When there are indexes on the table the reshuffle performance can be worse, and an extra VACUUM is needed to actually free the disk space. According to our experiments, the bottleneck of the reshuffle method is the tuple deletion operation; it is much slower than the insertion operation used by CTAS.

The reshuffle method does have some benefits: it requires less extra disk space, and it also requires less network bandwidth (similar to the CTAS method with the new JCH reduce method, but less than CTAS + MOD). It can also be faster in some cases; however, as we cannot automatically determine when it is faster, it is not easy to get a benefit from it in practice. On the other hand, the reshuffle method is less tested and may have bugs in corner cases, so it is not production ready yet.

Given all this, we decided to retire it entirely for now. We might add it back in the future if we can get rid of the slow deletion or find reliable ways to automatically choose between the reshuffle and CTAS methods.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
- 07 Dec 2018 (1 commit)

Committed by Ning Yu

The following WARNING is generated by ANALYZE when some sample tuples come from segments outside the [0, numsegments-1] range; however, this does not indicate that the data distribution is wrong. Take inherited tables for example: when an inherited table has a greater numsegments than its parent, this WARNING will be raised, and it is expected. This can happen normally in the random_numsegments pipeline job, so ignore this WARNING:

WARNING: table "patest0" contains rows in segment 2, which is outside the # of segments for the table's policy (2 segments)

Added this pattern to init_file to ignore it.
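A hypothetical reproduction of that scenario (table names are mine, and it assumes the random numsegments prehook from the pipeline job is active):

```sql
-- Hypothetical: under random numsegments, the child may be created on
-- more segments than its parent.
CREATE TABLE patest_parent (a int);                     -- e.g. placed on 2 segments
CREATE TABLE patest_child () INHERITS (patest_parent);  -- e.g. placed on 3 segments
INSERT INTO patest_child SELECT generate_series(1, 1000);

-- Sampling the inheritance hierarchy can then see rows in segment 2,
-- which is outside the parent's [0, numsegments-1] range.
ANALYZE patest_parent;
```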
-
- 30 Nov 2018 (1 commit)

Committed by Daniel Gustafsson

These rules cover messages that are no longer in the code, so they will never match. Remove them.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
- 29 Nov 2018 (1 commit)

Committed by Ning Yu

By loading this prehook, CREATE TABLE will create tables with random numsegments, using the gp_debug_numsegments extension. It can be enabled via a make command like this:

    make installcheck EXTRA_REGRESS_OPTS=--prehook=randomize_create_table_default_numsegments

However, as plans can differ with random numsegments, it is recommended to also ignore the plan diffs, so the make command becomes:

    make installcheck EXTRA_REGRESS_OPTS="--prehook=randomize_create_table_default_numsegments --ignore-plans"
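A quick way to see the effect is to look at the numsegments recorded for a new table in gp_distribution_policy (a minimal sketch; the table name is mine):

```sql
CREATE TABLE numsegs_demo (a int);

-- With the prehook loaded, numsegments should vary between tables
-- instead of always equaling the cluster's segment count.
SELECT localoid::regclass AS tbl, numsegments
FROM gp_distribution_policy
WHERE localoid = 'numsegs_demo'::regclass;
```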
-
- 23 Nov 2018 (1 commit)

Committed by Ning Yu

There are three reshuffle tests: the ao one, the co one, and the heap one. They share almost the same cases, but differ in table names and CREATE TABLE options. There are also some differences introduced when adding regression tests: they were added in only one file but not the others. We want to keep minimal differences between these tests, so that a regression test for ao also covers the similar case for heap, and so that once we understand one of the test files we have almost the same knowledge of the others. Here is a list of changes to these tests:

- reduce differences in table names by using schemas;
- reduce differences in CREATE TABLE options by setting default storage options;
- simplify the creation of partially distributed tables by using the gp_debug_numsegments extension;
- copy some regression tests to all the tests;
- retire the no longer used helper function;
- move the tests into an existing parallel test group.

The pg_regress test framework provides some @@ tokens for ao/co tests; however, we still cannot merge the ao and co tests into one file, as WITH (OIDS) is only supported by ao but not co.
-
- 22 Sep 2018 (1 commit)

Committed by Heikki Linnakangas

We had changed this in GPDB, to print fewer parens. That's fine and dandy, but it hardly seems worth it to carry a diff vs. upstream for this. Which format is better is a matter of taste: the extra parens make some expressions more clear, but OTOH, it's unnecessarily verbose for simple expressions. Let's follow the upstream on this.

These changes were made to GPDB back in 2006, as part of backporting EXPLAIN-related patches from PostgreSQL 8.2, but I didn't see any explanation for this particular change in output in that commit message.

It's nice to match upstream, to make merging easier. However, this won't make much difference to that: almost all EXPLAIN plans in regression tests are different from upstream anyway, because GPDB needs Motion nodes for most queries. But every little helps.
-
- 11 Aug 2018 (1 commit)

Committed by Ashuka Xue

Prior to this commit, there was no support for GiST indexes in GPORCA. For queries involving GiST indexes, ORCA was selecting Table Scan paths as the optimal plan. These plans could take up to 300+ times longer than Planner, which generated an index scan plan using the GiST index.

Example:
```
CREATE TABLE gist_tbl (a int, p polygon);
CREATE TABLE gist_tbl2 (b int, p polygon);
CREATE INDEX poly_index ON gist_tbl USING gist(p);

INSERT INTO gist_tbl SELECT i, polygon(box(point(i, i+2),point(i+4, i+6)))
FROM generate_series(1,50000)i;
INSERT INTO gist_tbl2 SELECT i, polygon(box(point(i+1, i+3),point(i+5, i+7)))
FROM generate_series(1,50000)i;

ANALYZE;
```
With the query `SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;`, we see a performance increase with the support of GiST.

Before:
```
EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
   ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..171401912.12 rows=1 width=8)
         ->  Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
               ->  Nested Loop  (cost=0.00..171401912.12 rows=335499869 width=1)
                     Join Filter: gist_tbl.p <@ gist_tbl2.p
                     ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                     ->  Materialize  (cost=0.00..530.81 rows=49997 width=101)
                           ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..525.76 rows=49997 width=101)
                                 ->  Table Scan on gist_tbl  (cost=0.00..432.24 rows=16666 width=101)
 Optimizer status: PQO version 2.65.1
(10 rows)

Time: 170.172 ms

SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
 count
-------
 49999
(1 row)

Time: 546028.227 ms
```
After:
```
EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
   ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..21749053.24 rows=1 width=8)
         ->  Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
               ->  Nested Loop  (cost=0.00..21749053.24 rows=335499869 width=1)
                     Join Filter: true
                     ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..526.39 rows=50328 width=101)
                           ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                     ->  Bitmap Table Scan on gist_tbl  (cost=0.00..21746725.48 rows=6667 width=1)
                           Recheck Cond: gist_tbl.p <@ gist_tbl2.p
                           ->  Bitmap Index Scan on poly_index  (cost=0.00..0.00 rows=0 width=0)
                                 Index Cond: gist_tbl.p <@ gist_tbl2.p
 Optimizer status: PQO version 2.65.1
(12 rows)

Time: 617.489 ms

SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
 count
-------
 49999
(1 row)

Time: 7779.198 ms
```
GiST support was implemented by sending GiST index information to GPORCA in the metadata, using a new index enum specifically for GiST.

Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
- 03 Aug 2018 (1 commit)

Committed by Karen Huddleston

This reverts commit 4750e1b6.
-
- 02 Aug 2018 (1 commit)

Committed by Richard Guo

This is the final batch of commits from PostgreSQL 9.2 development, up to the point where the REL9_2_STABLE branch was created and 9.3 development started on the PostgreSQL master branch. Notable upstream changes:

* Index-only scans were included in the batch of upstream commits. They allow queries to retrieve data only from indexes, avoiding heap access (a small sketch follows this entry).
* Group commit was added to work effectively under heavy load. Previously, batching of commits became ineffective as the write workload increased, because of internal lock contention.
* A new fast-path lock mechanism was added to reduce the overhead of taking and releasing certain types of locks which are taken and released very frequently but rarely conflict.
* The new "parameterized path" mechanism was added. It allows inner index scans to use values from relations that are more than one join level up from the scan. This can greatly improve performance in situations where semantic restrictions (such as outer joins) limit the allowed join orderings.
* The SP-GiST (Space-Partitioned GiST) index access method was added to support unbalanced partitioned search structures. For suitable problems, SP-GiST can be faster than GiST in both index build time and search time.
* Checkpoints are now performed by a dedicated background process. Formerly the background writer did both dirty-page writing and checkpointing. Separating this into two processes allows each goal to be accomplished more predictably.
* Custom plans are now supported for specific parameter values, even when using prepared statements.
* The API for FDWs was improved to provide multiple access "paths" for their tables, allowing more flexibility in join planning.
* The security_barrier option was added for views, to prevent optimizations that might allow view-protected data to be exposed to users.
* Range data types were added to store a lower and upper bound belonging to their base data type.
* CTAS (CREATE TABLE AS / SELECT INTO) is now treated as a utility statement. The SELECT query is planned during the execution of the utility. To conform to this change, GPDB executes the utility statement only on the QD and dispatches the plan of the SELECT query to the QEs.

Co-authored-by: Adam Lee <ali@pivotal.io>
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Asim R P <apraveen@pivotal.io>
Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
Co-authored-by: Haozhou Wang <hawang@pivotal.io>
Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Co-authored-by: Paul Guo <paulguo@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
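As a small, hedged sketch of the index-only scan item above (the object names are mine, not from the merge):

```sql
CREATE TABLE ios_demo (id int, payload text);
CREATE INDEX ios_demo_id_idx ON ios_demo (id);
INSERT INTO ios_demo SELECT g, 'x' FROM generate_series(1, 100000) g;

-- VACUUM sets the visibility map, which index-only scans rely on.
VACUUM ANALYZE ios_demo;

-- The plan may now show "Index Only Scan using ios_demo_id_idx".
EXPLAIN SELECT id FROM ios_demo WHERE id < 100;
```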
-
- 19 Jun 2018 (1 commit)

Committed by Omer Arap

This commit introduces an end-to-end scalable solution to generate statistics for root partitions, by merging the statistics of the leaf partition tables. The ability to merge leaf table statistics for the root table makes analyze very incremental and stable.

**CHANGES IN LEAF TABLE STATS COLLECTION:**

Incremental analyze will create a sample for each partition, as in the previous version. While analyzing the sample and generating statistics for the partition, it will also create a `hyperloglog_counter` data structure and add values from the sample to it, along with metadata such as the number of multiples and the sample size. Once the entire sample is processed, analyze will save the `hyperloglog_counter` as a byte array in the `pg_statistic` catalog table. We reserve a slot for the `hyperloglog_counter` in the table and signify this with a specific statistic kind, `STATISTIC_KIND_HLL`. We keep the `hyperloglog_counter` in the catalog only for the leaf partitions. If the user chooses to run a FULL scan for HLL, we signify the kind as `STATISTIC_KIND_FULLHLL`.

**MERGING LEAF STATISTICS**

Once all the leaf partitions are analyzed, we analyze the root partition. Initially, we check whether all the partitions have been analyzed properly and have all their statistics available in the `pg_statistic` catalog table. A partition with no tuples is considered analyzed even though it has no catalog entry. If for some reason a single partition is not analyzed, we fall back to the original analyze algorithm, which acquires a sample for the root partition and calculates statistics based on that sample.

Merging the null fraction and average width from leaf partition statistics is trivial and does not involve significant challenges, so we calculate them first. The remaining statistics are:

- Number of distinct values (NDV)
- Most common values (MCV), and their frequencies, termed most common frequencies (MCF)
- Histograms that represent the distribution of the data values in the table

**Merging NDV:**

Hyperloglog provides functionality to merge multiple `hyperloglog_counter`s into one and calculate the number of distinct values using the aggregated counter. This aggregated counter is sufficient only if the user chooses to run a full scan for hyperloglog. In the sample-based approach, without the hyperloglog algorithm, derivation of the number of distinct values is not possible. Hyperloglog enables us to merge the `hyperloglog_counter`s from each partition and calculate the NDV on the merged counter with an acceptable error rate. However, it does not give us the ultimate NDV of the root partition; it provides the NDV of the union of the samples from each partition. The rest of the NDV interpolation depends on four metrics, following the formula used in Postgres: the NDV in the sample, the number of multiple values in the sample, the sample size, and the total rows in the table. Using these values the algorithm calculates the approximate NDV for the table. While merging the statistics from the leaf partitions, with the help of hyperloglog we can accurately generate the NDV of the sample, the sample size, and the total rows; however, the number of multiples in the accumulated sample is unknown, since we do not have access to the accumulated sample at this point.
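For reference, the Postgres interpolation formula alluded to above appears to be the Haas-Stokes (Duj1) estimator used by ANALYZE; mapping the commit's four metrics onto it is my reading, not something the commit spells out:

$$
\hat{D} \;=\; \frac{n\,d}{\,n - f_1 + f_1 \cdot n / N\,},
\qquad f_1 = d - m,
$$

where $d$ is the NDV of the (merged) sample, $m$ the number of multiples in it, $n$ the sample size, and $N$ the total rows in the table. The unknown input here is $m$ for the merged sample, which the next paragraphs estimate.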
_Number of Multiples:_ Our approach to estimating the number of multiples in the aggregated sample (which itself is unavailable) for the root requires the availability of the NDV, number of multiples, and sample size of each leaf sample. The NDV of each sample is trivial to calculate using the partition's `hyperloglog_counter`. The number of multiples and the sample size for each partition are saved in the partition's `hyperloglog_counter` during leaf statistics gathering, to be used in the merge.

Estimating the number of multiples in the aggregate sample for the root partition is a two-step process. First, we estimate the number of values that reside in more than one partition's sample. Then, we estimate the number of multiples that exist uniquely in a single partition. Finally, we add these values to estimate the overall number of multiples in the aggregate sample of the root partition.

To count the number of values that exist uniquely in one single partition, we utilize hyperloglog functionality: we can easily estimate how many values appear only in a specific partition i. We call the NDV of the overall aggregate of all partitions `NDV_all`, and the NDV of the aggregate of all partitions but i `NDV_minus_i`. The difference of `NDV_all` and `NDV_minus_i` yields the number of values that appear in only one partition. The rest of the values will contribute to the overall number of multiples in the root's aggregated sample; we call their count `nMultiple_inter`, the number of values that appear in more than one partition.

However, that is not enough: even if a value resides in only one partition, the partition might contain multiples of it, and we need a way to account for these values as well. We already know the number of multiples inside each partition sample; we normalize this value by the proportion of the number of values unique to the partition sample to the number of distinct values of the partition sample. The normalized value is partition sample i's contribution to the overall calculation of the nMultiple. Finally, `nMultiple_root` is the sum of `nMultiple_inter` and `normalized_m_i` over each partition sample (collected into a closed form at the end of this entry).

**Merging MCVs:**

We utilize the merge functionality imported from the 4.3 version of Greenplum DB. The algorithm is straightforward: we convert each MCV's frequency into a count, and add the counts up if the value appears in more than one partition. After every possible candidate's count has been calculated, we sort the candidate values and pick the top ones, as defined by `default_statistics_target`. 4.3 previously blindly picked the top values with the highest count; we instead incorporated the same logic used in current Greenplum and Postgres, and test whether a value is a real MCV by running some checks. Therefore, even after the merge, the logic fully aligns with Postgres.

**Merging Histograms:**

One of the main novel contributions of this commit is how we merge the histograms from the leaf partitions. In 4.3 a priority queue is used to merge the histograms from the leaf partitions. However, that approach is naive and loses very important statistical information. In Postgres, the histogram is calculated over the values that did not qualify as an MCV. The merge logic for the histograms in 4.3 did not take this into consideration, and significant statistical information is lost while we merge the MCV values.
We introduce a novel approach to feed the MCVs from the leaf partitions that did not qualify as a root MCV into the histogram merge logic. To fully utilize the previously implemented priority queue logic, we treat non-qualified MCVs as the histograms of so-called `dummy` partitions. To be more precise, if an MCV m1 is a non-qualified MCV, we create a histogram [m1, m1] that has only one bucket, whose size is the count of this non-qualified MCV. When we merge the histograms of the leaf partitions and these dummy partitions, the merged histogram does not lose any statistical information.

Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
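Collecting the nMultiple derivation above into one closed form (the notation is mine; the commit gives only prose):

$$
u_i = \mathrm{NDV}_{\mathrm{all}} - \mathrm{NDV}_{\mathrm{minus}\,i}, \qquad
\mathrm{nMultiple}_{\mathrm{inter}} = \mathrm{NDV}_{\mathrm{all}} - \sum_i u_i,
$$
$$
\mathrm{normalized\_m}_i = m_i \cdot \frac{u_i}{d_i}, \qquad
\mathrm{nMultiple}_{\mathrm{root}} = \mathrm{nMultiple}_{\mathrm{inter}} + \sum_i \mathrm{normalized\_m}_i,
$$

where $m_i$ and $d_i$ are the number of multiples and the NDV of partition $i$'s sample, and $u_i$ counts the values unique to partition $i$.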
-
- 09 Nov 2017 (1 commit)

Committed by Adam Lee

* Several small fixes to the tests:
  1. ignore two generated test files;
  2. remove the string containing unpredictable segment numbers;
  3. drop tables in the external_table case, so it can be run multiple times in a row.

* Fix cases which are unpredictable:

> commit 3bbedbe9
> Author: Heikki Linnakangas <hlinnakangas@pivotal.io>
> Date: Thu Nov 2 10:04:58 2017 +0200
>
> Wake up faster, if a segment returns an error.
>
> Previously, if a segment reported an error after starting up the interconnect, it would take up to 250 ms for the main thread in the QD process to wake up and poll the dispatcher connections, and to see that there was an error. Shorten that time, by waking up immediately if the QD->QE libpq socket becomes readable while we're waiting for data to arrive in a Motion node.
>
> This isn't a complete solution, because this will only wake up if one arbitrarily chosen connection becomes readable, and we still rely on polling for the others. But this greatly speeds up many common scenarios. In particular, the "qp_functions_in_select" test now runs in under 5 s on my laptop, when it took about 60 seconds before.

Before that commit, the master would only check every 250 ms whether one of the segments had reported an error. Now it wakes up and cancels the whole query as soon as it receives an error from the first segment. That makes it more likely that the other segments have not yet reached the same number of errors as what is memorized in the expected output.

These two cases check: 1) when selecting from a CTE fails because one of the external tables of the CTE reached the error limit, how many errors happened in the other external table of the CTE, which would not have reached the limit; 2) when selecting from an external table with two locations mapped to two segments, whether the other segment reaches the same reject limit once one segment has. We could not predict these two results without special test files, even without that commit, actually. This commit removes the CTE case and, for the readable_query26 case, checks that at least one segment failed.
-
- 18 Oct 2017 (1 commit)

Committed by Richard Guo

-
- 14 Aug 2017 (1 commit)

Committed by Ning Yu

* resgroup: increase max slots for isolation tests.
* ICW: ignore resgroup related warnings.
* ICW: try to load the resgroup variant of answers when resgroup is enabled.
* ICW: provide resgroup variants of answers.
* ICW: check whether resqueue is enabled in the UDF.
* ICR: substitute the username in gpconfig output.
* ICR: explicitly set max_connections.
* isolation2: increase resgroup concurrency for the max_concurrency tests.
-
- 09 Aug 2017 (2 commits)

Committed by Heikki Linnakangas

This replaces all places in regression tests where the gpfaultinject binary was used with the SQL-callable function in the new gp_inject_fault extension. The SQL function is more forgiving about the dev environment, and doesn't need gpfaultinject to be in $PATH, for starters. Also, it's just good to harmonize and have only one way of injecting faults. More uses of gpfaultinject remain in the TINC tests, so we cannot get rid of it any time soon, but this is a step in that direction, anyway.
-
Committed by Heikki Linnakangas

* Turn it into an extension, for easier installation.
* Add a simpler variant of the gp_inject_fault function, with fewer options. This is applicable to almost all the calls in the regression suite, so it's nice to make them less verbose.
* Change the dbid argument from smallint to int4, for convenience, so that you don't need a cast when calling the function.
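A hedged usage sketch of the extension (the fault name and dbid below are illustrative, not from the commit):

```sql
CREATE EXTENSION gp_inject_fault;

-- Simple variant: fault name, fault type, and an int4 dbid (no cast needed).
SELECT gp_inject_fault('checkpoint', 'skip', 1);

-- ... exercise the code path under test ...

SELECT gp_inject_fault('checkpoint', 'reset', 1);
```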
-
- 04 Aug 2017 (1 commit)

Committed by Daniel Gustafsson

Using errcode 0 causes ereport() to treat the error as an internal error and print the filename/line. Since this is a user-facing error, it should have a proper errcode to avoid this. This also allows the gpdiff rule to be removed.
-
- 10 May 2017 (1 commit)

Committed by xiong-gang

Previously, we removed the shared memory object when dropping a resource group, and restored it if the transaction aborted. Concurrent access to the shared memory object could fail in this way. Now, if the resource group is dropped, new transactions in this resource group are queued up until the drop transaction is finished.

Signed-off-by: Richard Guo <riguo@pivotal.io>
-
- 03 May 2017 (1 commit)

Committed by Adam Lee

Support a COPY statement that exports a table directly from the segments to local files, in parallel. This commit adds the keywords ON SEGMENT to save the copied files on the segments instead of on the master. Two placeholders are used, "<SEG_DATA_DIR>" and "<SEGID>", which will be replaced by the segment data directory and the segment id. E.g.:

```
COPY tbl TO '/tmp/<SEG_DATA_DIR>filename<SEGID>.txt' ON SEGMENT;
```

Signed-off-by: Yuan Zhao <yuzhao@pivotal.io>
Signed-off-by: Haozhou Wang <hawang@pivotal.io>
Signed-off-by: Adam Lee <ali@pivotal.io>
-
- 28 Feb 2017 (1 commit)

Committed by Daniel Gustafsson

The error messages are developer- or debug-facing; there is no reason to believe this will break anyone's regexing of logfiles in prod.
-
- 27 Jan 2017 (1 commit)

Committed by Nikos Armenatzoglou

Closes #1606

Signed-off-by: Haisheng Yuan <hyuan@pivotal.io>
-
- 24 Jan 2017 (2 commits)

Committed by Heikki Linnakangas

This reduces the risk of accidentally masking out messages in a test that's not supposed to produce such messages in the first place, and is just nicer in general, IMHO. While we're at it, add a brief comment to init_file to explain what it's for. Also, remove a few more matchsubs from atmsort.pm that seem to be unused.
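For context (the block below is an illustrative pattern of mine, not taken from the commit), a per-test init_file carries matchsubs directives as SQL comments, which gpdiff applies before diffing:

```sql
-- start_matchsubs
-- m/\(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\)/
-- s/\(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\)/(cost=### rows=### width=###)/
-- end_matchsubs
```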
-
Committed by Heikki Linnakangas

When a table that has an attribute whose type has been dropped goes through the ALTER TABLE command queue, a "hidden" type will be created, and immediately dropped, during ALTER TABLE processing for table redistribution. This emits several NOTICEs which can be confusing to the user, as the name is autogenerated and the DROP TYPE can have happened at a previous time. Below is an example of the output:

create table <tablename> (a integer, b <typename>);
drop type <typename>;
...
alter table <tablename> set with(reorganize = true) distributed randomly;
NOTICE: return type pg_atsdb_<oid>_2_3 is only a shell
NOTICE: argument type pg_atsdb_<oid>_2_3 is only a shell
NOTICE: drop cascades to function pg_atsdb_<oid>_2_3_out(pg_atsdb_<oid>_2_3)
NOTICE: drop cascades to function pg_atsdb_<oid>_2_3_in(cstring)

The reason for adding the hidden types is that the redistribution is performed with a CTAS doing SELECT *. To fix, change the way the CTAS is done so that it does not create hidden types.

The temp table that we create still needs to include dropped columns at the same positions as the old one; otherwise, when we swap the relation files, a tuple's on-disk representation won't match the catalogs. However, we cannot easily re-construct a dropped column with the same attlen, attalign, etc. as the original dropped column. Instead, create it as if it were an INT4 column, and just before swapping the relation files, update the attlen and attalign fields in the pg_attribute entries of the dropped columns to match those of INT4. That way, the original table's catalog entries match those of the temp table.

Alternatively, we could build the temp table without the dropped columns, and remove them from pg_attribute altogether. However, we'd need to update the attnum field of all following columns, and cascade that change to at least pg_attrdef and pg_depend. That seems more complicated.

Also remove output from expected test files and perform minor cleanups.

Original patch by Daniel Gustafsson, with the int4-placeholder mechanism added by me.
-
- 19 Dec 2016 (2 commits)

Committed by Daniel Gustafsson

While it should be rare (and the original ticket referenced indicates that it is), it's perfectly legal for a UDP buffer to fill up. Set the message level to LOG rather than WARNING.
-
Committed by Daniel Gustafsson

The different kinds of NOTICE messages regarding table distribution were using a mix of upper and lower case for 'DISTRIBUTED BY'. Make them consistent by using upper case in all messages, and update the test files, and atmsort regexes, to match.
-
- 18 Nov 2016 (2 commits)

Committed by Heikki Linnakangas

Attach a suitable error code to many errors that were previously reported as "internal errors". GPDB's elog.c prints a source file name and line number for any internal error, which is a bit ugly for errors that are in fact not unexpected internal errors, but user-facing errors that happen as a result of e.g. an invalid query. To make sure we don't accumulate more of these, adjust the regression tests to not ignore the source file and line number in error messages. There are a few exceptions, which are listed explicitly.
-
Committed by Heikki Linnakangas

With commit 61972775, we use a proper SQLSTATE for the errors that needed this before.
-
- 19 Sep 2016 (1 commit)

Committed by Pengzhou Tang

When the fault at process_startup_packets was triggered, gp_debug_linger had not yet been set to 0, which caused annoying "HINT: process xxxx" messages to appear in the output file and made tests unstable. This commit changes the fault injection location to send_qe_details_init_backend, where gp_debug_linger has already been set to 0, so no HINT message is generated in the output file.
-
- 13 Sep 2016 (1 commit)

Committed by Pengzhou Tang

To test corner cases, we use the faultinjector utility to simulate segment recovery and segment FATAL/ERROR level errors while gangs are being created.
-
- 06 Sep 2016 (1 commit)

- 05 Sep 2016 (1 commit)

Committed by Kenan Yao

motion_listener back
-
- 20 Aug 2016 (1 commit)

- 17 Aug 2016 (1 commit)

Committed by Haisheng Yuan

Also updated the gp_optimizer expected output, and ignore line number differences for functions.c.
-
- 16 Jul 2016 (1 commit)

Committed by Heikki Linnakangas

* Anchor all the ERROR, WARNING etc. messages to the beginning of the line, with "/^...".
* Remove obsolete substitutions, for error messages that don't appear anywhere in the code anymore.
* Remove redundant replacements of source line numbers in error messages, like "(xact.c:%d)". There is a special rule that replaces all of those with (SOMEFILE:SOMEFUNC).
* Replace case-insensitive rules with case-sensitive ones.
* Replace sloppy uses of "\s+" with the actual amount of whitespace in the error messages.
* Remove unnecessary "s/.../" lines from the matchignore block in init_file. You don't need those with "matchignore", only with "matchsubs".

Aside from being tidier, these changes make the diffing significantly faster: there are fewer regular expressions to parse, and the remaining ones are faster to evaluate.
-
- 22 Jun 2016 (1 commit)

Committed by Daniel Gustafsson

When bfz_close() is called in the codepath during the abort of a transaction, we must avoid throwing even more errors unless the situation calls for it. For bfz_close() it's fine to lower the ereport level to WARNING in this case. Longer term we should move this, and other, codepaths away from calling unlink() directly and instead use the API provided, but this closes a current issue in ICG, so better to fix it immediately and refactor all callsites once we have a clean ICG.
-
- 12 Mar 2016 (1 commit)

Committed by Jimmy Yih

Most of these test additions are inspired by Pivotal's internal testing and needed to be added to the open source installcheck to give the community more test coverage on AO/CO tables. This commit mostly adds extra coverage for indexes and partition tables.
-
- 11 Mar 2016 (1 commit)

Committed by Kuien Liu

Based on a patch by Daniel Gustafsson.
-
- 08 Mar 2016 (1 commit)

Committed by Jimmy Yih

Most of these test additions are inspired by Pivotal's internal testing and needed to be added to the open source installcheck to give the community more test coverage.
-
- 05 Dec 2015 (1 commit)

Committed by Abhijit Subramanya

A bug caused auxiliary tables to not get shrunk, generating a NOTICE to the user. The AppendOnlyCompaction_IsRelationEmpty() function incorrectly assumed that the column number of the tupcount column is the same in the pg_aoseg and pg_aocsseg tables. This caused it to incorrectly return true even when the CO relation was not empty. This function is used in vacuum to determine whether the auxiliary relations need to be vacuumed. Due to the bug, vacuum would update the pg_aocsseg relation and vacuum it within the same transaction, hence generating the NOTICE that it can't shrink the relation because a transaction is already in progress, and it would not shrink the relation.

Also make sure that we do a vacuum on the auxiliary relations in only two cases:

1. the vacuum cleanup phase;
2. the relation is empty and we are in the prepare phase.

Otherwise we will end up with the same issue as above if some of the segments have zero rows.
-