1. 16 Jun, 2016 2 commits
  2. 15 Jun, 2016 10 commits
    • D
      Fix documentation typos in cdbgroup · 67414f01
      Daniel Gustafsson committed
      67414f01
    • H
      Re-enable psql queries that depend on 8.3 catalogs. · a243b842
      Heikki Linnakangas committed
      These queries were backported earlier from later PostgreSQL releases,
      but had to be disabled when we started the 8.3 merge, because we didn't
      have all the catalog changes from 8.3 yet. Now we do, so re-enable the
      queries.
      a243b842
    • H
      Backport the "proper" fix for CVE-2013-0255. · 40d6ea40
      Heikki Linnakangas committed
      The PostgreSQL 8.3 merge included a band-aid fix for this, but since we
      haven't yet made a GPDB 5.0 release, and are therefore free to still modify
      the catalogs, let's apply the proper fix that went into PostgreSQL 9.3 at
      the time.
      
      commit 71627f3d
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Wed Feb 13 16:20:01 2013 -0500
      
          Fix CVE-2013-0255 properly.
      
          Revert commit ab0f7b60 (in HEAD only)
          in favor of the proper solution, which is to declare enum_recv() correctly
          in the system catalogs.  It should be declared to take type "internal"
          not "cstring".
      
          Also improve the type_sanity regression test, which should have caught
          this typo, so that it actually would.  Most of the relevant checks on
          the signature of type I/O functions should not have been restricted to
          basetypes/pseudotypes, as they should apply to any type's I/O functions.
      40d6ea40
    • H
      Merge with PostgreSQL 8.3.23. · a453004e
      Heikki Linnakangas committed
      Everything in PostgreSQL 8.3 is now in Greenplum. PostgreSQL 8.3 has been
      end-of-lifed in upstream, so there will be no new upstream minor releases
      of this series anymore either.
      
      This includes a lot of new features from PostgreSQL 8.3, and a ton of bug
      fixes. See upstream release notes for details. One big user-visible change
      included is the removal of implicit cast from other datatypes to text. When
      this was done in PostgreSQL, it caused a lot of sloppily written
      application queries to break. Which is a good thing in the long run, but it
      caused a lot of pain on upgrade.
      
      A few features work slightly differently in GPDB:
      
      * Lazy XIDs feature in upstream reduces XID consumption, by not assigning
        XIDs to read-only transactions. That optimization has not been
        implemented for GPDB distributed transactions, however, so practically
        all queries in GPDB still consume XIDs, whether they're read-only or not.
      
      * The temp_tablespaces GUC was added in upstream, but it has been disabled in
        GPDB, in favor of a GPDB-specific system to decide which filespace to use
        for temporary files.
      
      * B-tree indexes can now be used to answer "col IS NULL" queries. That
        support has not been implemented for bitmap indexes.
      
      * Support was added for "top N" sorting, speeding up queries with ORDER BY
        and LIMIT. That support was not implemented for tuplesort_mk.c, however,
        so you only get that benefit with enable_mk_sort=off.
      
      * Multi-worker autovacuum support was merged, but autovacuum as a whole is
        still disabled in GPDB.
      
      * Planner hook was added. Nothing special there, but we should consider
        refactoring the ORCA glue code to use it.
      
      * Plan cache invalidation was added upstream, which made the
        gp_plpgsql_clear_cache_always option obsolete. We also backported the
        further invalidation improvements from 8.4. See below for more
        information.
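The "top N" sorting item in the list above can be illustrated with a bounded heap. This is only a sketch of the general technique, not the tuplesort.c code: the function name `top_n_sort` and its parameters are made up for the example, and it assumes numeric sort keys.

```python
import heapq

def top_n_sort(rows, limit, key):
    """Return the `limit` rows with the smallest numeric keys, in ascending order.

    Hypothetical helper for illustration. It keeps at most `limit` rows in
    memory instead of sorting the whole input, which is the idea behind
    bounded ("top N") sorting for ORDER BY ... LIMIT.
    """
    if limit <= 0:
        return []
    heap = []  # entries are (-key, -seq, row); heap[0] is the worst row kept so far
    for seq, row in enumerate(rows):
        entry = (-key(row), -seq, row)
        if len(heap) < limit:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new row sorts before the current worst row, so swap them.
            heapq.heapreplace(heap, entry)
    # Largest negated key first means smallest original key first.
    return [row for _, _, row in sorted(heap, reverse=True)]
```

For example, `top_n_sort([5, 1, 4, 2, 3], 3, key=lambda r: r)` keeps only three rows at a time while scanning the input.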
      
      In addition to those, there were some big refactorings that were not that
      interesting from a user's point of view, but caused a lot of code churn:
      InitPlans are now initialized separately at executor startup, the executor
      range table was changed to be "flat", and code related to parse analysis of
      DDL commands was moved around. The way UPDATE WHERE CURRENT OF was
      implemented in GPDB was refactored to match the new upstream support for
      the same (see below for more information).
      
      Plan cache invalidation
      -----------------------
      
      One notable feature included in this merge is the support for plan cache
      invalidation. GPDB contained a hack to invalidate all plans in master
      between transactions, for the same purpose, but the proper plan cache
      invalidation is a much better solution. However, because the GPDB hack also
      handled dropped/recreated functions in some cases, if we just leave it out
      and replace with the 8.3 plan cache invalidation, there will be a
      regression in those cases. To fix, this merge commit includes a backport of
      the rest of the plan cache invalidation support from PostgreSQL 8.4, which
      handles functions and operators too. The upstream commit for that was:
      
      commit ee33b95d
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Sep 9 18:58:09 2008 +0000
      
          Improve the plan cache invalidation mechanism to make it invalidate plans
          when user-defined functions used in a plan are modified.  Also invalidate
          plans when schemas, operators, or operator classes are modified; but for these
          cases we just invalidate everything rather than tracking exact dependencies,
          since these types of objects seldom change in a production database.
      
          Tom Lane; loosely based on a patch by Martin Pihlak.
      
      As a result of that, the gp_plpgsql_clear_cache_always GUC is removed, as
      it's no longer needed. This also includes new test cases in the plpgsql
      cache regression test, to demonstrate the improvements.
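As a rough illustration of the two invalidation strategies described above (exact dependency tracking for functions, invalidate-everything for schemas and operators), here is a toy cache. It is a sketch under assumptions, not the plancache.c implementation; the class and method names are hypothetical.

```python
class ToyPlanCache:
    """Toy plan cache keyed by query text (illustrative only)."""

    def __init__(self):
        # query text -> (plan, frozenset of function names the plan uses)
        self._plans = {}

    def store(self, query, plan, functions=()):
        self._plans[query] = (plan, frozenset(functions))

    def lookup(self, query):
        entry = self._plans.get(query)
        return entry[0] if entry is not None else None

    def function_changed(self, name):
        # Exact tracking: drop only the plans that depend on this function.
        self._plans = {
            query: entry
            for query, entry in self._plans.items()
            if name not in entry[1]
        }

    def schema_or_operator_changed(self):
        # Rare event: drop everything instead of tracking dependencies.
        self._plans.clear()
```

After `function_changed("f")`, a plan stored with `functions=["f"]` must be rebuilt on the next lookup, while unrelated plans stay cached.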
      
      CURRENT OF code changes
      -----------------------
      
      This merge includes the upstream support for UPDATE WHERE CURRENT OF. GPDB
      had support for that already, but it has been refactored to use the upstream code
      as much as possible (execCurrentOf()). Now that we have the upstream code
      available, this also refactors the way current position of a cursor is
      dispatched. Instead of modifying every CurrentOfExpr node in the plan tree
      before dispatching it, store the current positions of each cursor mentioned
      in a CurrentOfExpr in a separate CursorPosInfo node, and dispatch those
      along with the query plan, in QueryDispatchDesc.
      
      This was a joint effort between me (Heikki) and Daniel Gustafsson.
      a453004e
    • H
      Change the way static Partition Selector nodes are printed in EXPLAIN · 5dc7dda9
      Heikki Linnakangas committed
      The printablePredicate of a static PartitionSelector node contains Var nodes
      with varno=INNER. That's bogus, because the PartitionSelector node doesn't
      actually have any child nodes, but works at execution time because the
      printablePredicate is only used by EXPLAIN. In most cases, it still
      worked, because most Var nodes carry a varnoold field, which is used by
      EXPLAIN for the lookup, but there was one case of "bogus varno" error even
      memorized in the expected output of the regression suite. (PostgreSQL 8.3
      changed the way EXPLAIN resolves the printable name so that varnoold no
      longer saves the bacon, and you would get a lot more of those errors)
      
      To fix, teach the EXPLAIN of a Sequence node to also reach into the
      static PartitionSelector node, and print the printablePredicate as if that
      qual was part of the Sequence node directly.
      
      The user-visible effect of this is that the static Partition Selector
      expression now appears in EXPLAIN output as a direct attribute of the
      Sequence node, not as a separate child node. Also, if a static Partition
      Selector doesn't have a "printablePredicate", i.e. it doesn't actually
      do any selection, it's not printed at all.
      5dc7dda9
    • H
      Add support for CoerceViaIO nodes to ORCA translator library. · 84286b9c
      Heikki Linnakangas committed
      This is mostly copy-pasted from CoerceToDomain.
      
      Note: This requires an up-to-date version of ORCA to compile, older
      versions of the ORCA library itself don't know about CoerceViaIO nodes
      either.
      84286b9c
    • H
      Revert unintentional change to horology.sql. · 72d6b148
      Heikki Linnakangas committed
      This change in commit af5e4bcf was surely not intentional, and made the
      test fail.
      72d6b148
    • H
      Porting optimizer functional tests to ICG [#120031461] · af5e4bcf
      Haisheng Yuan and Omer Arap committed
          Port `dml/triggers`
          Port `olap/groupingfunction`
          Port `functions/builtin`
          Port `dml/functional`
          Port `dml/functional/sql_partiton`
          Port `functions/functionProperty`
      
          Update optfunctional_schedule with ported tests
      
          Update the Makefile for optimizer functional tests
      af5e4bcf
    • N
      Allocate tuples retrieved by a SharedScan in a longer living memory context · c36ecd6f
      Nikos Armenatzoglou committed
      When SharedScan requests a tuple from the underlying Sort, and the Sort has spilled to disk, the mk_sort code (also the regular sort) makes a copy of the memtuple, but places it in the sortcontext.
      When hitting a QueryFinishPending (e.g., master received enough tuples, so stop the execution), EagerFree is called for the entire tree. When reaching the sort node, the sortcontext is deleted, even though the SharedScan above it still has a pointer to a memtuple allocated in this context.
      
      The solution is: pass a memory context for the mk_sort and regular sort to use (change the API and implementation of the following two functions: tuplesort_gettupleslot_pos_mk and tuplesort_gettupleslot_pos). Those functions should use the context provided by the caller to make a copy of the memtuple.
      Signed-off-by: George Caragea <gcaragea@pivotal.io>
      c36ecd6f
    • J
      Tiny bugfix for gpcrondump · abf26ba2
      Jamie McAtamney committed
      Removed extraneous argument to _prompt_continue that would cause
      gpcrondump to fail if run without -a.
      abf26ba2
  3. 14 Jun, 2016 4 commits
  4. 13 Jun, 2016 3 commits
    • A
      Small improvements to the vagrant build for centos · fb911c72
      Andreas Scherbaum committed
      Fix provided by @akon-dey
      Closes: #780
      fb911c72
    • K
      Dispatch exactly same text string for all slices. · 4b360942
      Kenan Yao committed
      Include a map from sliceIndex to gang_id in the dispatched string,
      and remove the localSlice field, so a QE now gets its localSlice
      from the map. This way, we avoid duplicating and modifying
      the dispatch text string slice by slice, and each QE of a sliced
      dispatch now gets the same contents.
      
      The extra space cost is sizeof(int) * SliceNumber bytes, and the extra
      computing cost is iterating the SliceNumber-size array. Compared with
      memcpy of text string for each slice in previous implementation, this
      way is much cheaper, because SliceNumber is much smaller than the size
      of dispatch text string. Also, since SliceNumber is so small, we just
      use an array for the map instead of a hash table.
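The idea can be sketched as follows. This is an illustration under assumptions, not the dispatcher code: the JSON payload and the function names are invented for the example, and the real dispatch string has its own serialization.

```python
import json

def build_dispatch_payload(plan_text, slice_to_gang):
    """Build ONE payload that is sent unchanged to every QE.

    `slice_to_gang` is a plain list indexed by slice index whose values are
    gang ids. The JSON encoding is an assumption made for this sketch only.
    """
    return json.dumps({"plan": plan_text, "slice_to_gang": slice_to_gang})

def local_slice_for(payload, my_gang_id):
    """On the QE side, find the local slice by scanning the small array."""
    slice_to_gang = json.loads(payload)["slice_to_gang"]
    for slice_index, gang_id in enumerate(slice_to_gang):
        if gang_id == my_gang_id:
            return slice_index
    return None  # this gang does not run any slice of the plan
```

The linear scan costs one pass over the slice array per QE, which is cheap next to copying the whole plan text once per slice.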
      
      Also, clean up some dead code in dispatcher, including:
      (1) Remove primary_gang_id field of Slice struct and DispatchCommandDtxProtocolParms
      struct, since dispatch agent is deprecated now;
      (2) Remove redundant logic in cdbdisp_dispatchX;
      (3) Clean up buildGpDtxProtocolCommand;
      4b360942
    • P
      Fix bugs when a SET command is executed within init plan · 2cda812c
      Pengzhou Tang committed
      In commit d2725929, GPDB marked all allocatedReaderGangs with the noReuse flag. When a plan contains
      an init plan and a SET command is executed within it, GPDB marks the pre-assigned gangs as noReuse and
      destroys them, which makes the query crash.
      2cda812c
  5. 11 Jun, 2016 4 commits
    • F
      The GPDB Vmem is the lowest layer of memory allocator that supports higher... · 42ca3506
      Foyzur Rahman committed
          The GPDB Vmem is the lowest layer of memory allocator that supports higher allocators such as AllocSet. This layer (mostly defined in memprot.c) is in charge of actually calling malloc/realloc/free to allocate/reallocate/free memory. In this process this layer is also in charge of reserving "virtual" memory or Vmem, which is a GPDB specific shared memory counter to track per-segment combined allocations across all the GPDB processes under Vmem umbrella. The Vmem counter is managed by a separate module Vmem_Tracker, and the memprot functions (such as gp_malloc, gp_free2 and gp_realloc) call the APIs provided by VmemTracker.
      
          Previously the memprot allocators (gp_alloc/gp_realloc/gp_free) were only allocating/freeing memory but were not adding any additional metadata in the header (and there was no header) to track the size of allocations. Therefore, there was no gp_free as freeing memory requires the size of the free to adjust Vmem counter inside VmemTracker. This was patched by explicitly passing size info in gp_free2.
      
          In this PR we do the following:
      
          * We add allocation size in Vmem header (along with checksums which are only available in debug build to detect header and footer boundary, and buffer overruns).
      
          * We remove size information from the block header of AllocSet.
      
          * We rename gp_free2 to gp_free as the second parameter (size information) is now obtained from the header and therefore no longer necessary
      
          * We modify all the consumers of memprot.c APIs to use the new APIs
      
          * We add unit tests to test the metadata and the correctness of the new Vmem allocators
      
          This is the first step to integrate external modules and third party allocations with Vmem. A long running issue in GPDB is its inability to track allocations by external components including libraries such as ORCA. Therefore, the central Vmem counter is often way off from the underlying allocations, and this may run the system out of memory. By maintaining the size information in the Vmem header, we now have a self-contained allocator that can be exposed to external allocators such as GPOS allocators, without forcing them to manage size information separately.
      
          This fixes #117269929.
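A minimal sketch of the header idea, assuming a toy byte-buffer "allocator": the size is written into a header in front of the user data, so the free routine can adjust the counter without being told the size. The names `ToyVmemTracker`, `toy_malloc` and `toy_free` are hypothetical and do not mirror the memprot.c signatures. Checksums and footers are left out.

```python
import struct

# Hypothetical header layout for this sketch: one 8-byte little-endian size.
HEADER = struct.Struct("<Q")

class ToyVmemTracker:
    """Stand-in for the shared Vmem counter (illustrative only)."""

    def __init__(self):
        self.reserved = 0

    def reserve(self, nbytes):
        self.reserved += nbytes

    def release(self, nbytes):
        self.reserved -= nbytes

def toy_malloc(tracker, size):
    """Reserve `size` bytes and record the size in the block header."""
    tracker.reserve(size)
    block = bytearray(HEADER.size + size)
    HEADER.pack_into(block, 0, size)
    return block

def toy_free(tracker, block):
    """Release the reservation using the size stored in the header."""
    (size,) = HEADER.unpack_from(block, 0)
    tracker.release(size)
```

Because the size travels with the block, a caller such as an external allocator only needs the block itself to free it.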
      Signed-off-by: Marc Spehlmann <marc.spehlmann@gmail.com>
      42ca3506
    • J
      [#120984085] Adds GUC for array expansion. · 53230187
      Jesse Zhang and Marc Spehlmann committed
      This GUC will be used to control the MEMO size as well as optimization
      time for large IN list or large array comparison expressions.
      
      Only arrays with fewer elements than the GUC value are
      expanded and participate in constraint derivation.
      
      The trade-off of using this GUC is losing the potential benefits of
      constraint derivation (e.g. conflict detection, partition elimination)
      in exchange for shorter optimization time and lower memory utilization.
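The threshold check can be pictured like this. It is a sketch only: the function name, the tuple-based predicate form and the `threshold` parameter are assumptions for illustration, and the real logic lives in the optimizer.

```python
def expand_in_list(column, elements, threshold):
    """Expand `column IN (elements)` only when the list is short enough.

    Returns a list of (column, operator, value) predicates. A short list is
    expanded into one equality per element, which constraint derivation can
    reason about. A long list is kept as a single opaque array comparison.
    """
    if len(elements) < threshold:
        return [(column, "=", value) for value in elements]
    return [(column, "= ANY", tuple(elements))]
```

With `threshold=3`, `expand_in_list("a", [1, 2], 3)` is expanded, while a three-element list stays as one array comparison.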
      53230187
    • N
    • N
      Revert "Reset g_dataSourceCtx variable, since the context is destroyed at EOX." · b918bb17
      Nikos Armenatzoglou committed
      This reverts commit d9edb869.
      g_dataSourceCtx should not be reset in AtAbort_ExtTables, since external tables is not the only component that uses it.
      b918bb17
  6. 10 Jun, 2016 2 commits
    • J
      Add check for backup directory existence to gprecoverseg · 5b853b3a
      Jamie McAtamney committed
      A previous commit added the capability to gprecoverseg to copy all
      backup files from the mirror to the primary during a full recovery,
      but assumed that the backup directory would always exist and so
      gprecoverseg -F would fail if it was not present.  This commit adds
      a check to ensure that gprecoverseg -F will finish successfully if
      there is no backup directory.
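The shape of such a guard, sketched for a local filesystem: the function name and directory arguments are hypothetical, and the real gprecoverseg copies between hosts rather than between local directories.

```python
import os
import shutil

def copy_backup_files(mirror_backup_dir, primary_backup_dir):
    """Copy backup files if the source directory exists; otherwise do nothing.

    Returns the number of files copied. A missing backup directory is treated
    as "nothing to copy" rather than as an error.
    """
    if not os.path.isdir(mirror_backup_dir):
        return 0
    os.makedirs(primary_backup_dir, exist_ok=True)
    copied = 0
    for name in os.listdir(mirror_backup_dir):
        source = os.path.join(mirror_backup_dir, name)
        if os.path.isfile(source):
            shutil.copy2(source, os.path.join(primary_backup_dir, name))
            copied += 1
    return copied
```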
      5b853b3a
    • H
      Update minirepro and pg_dump utility to dump both relation and function ddl · 318b86af
      Haisheng Yuan committed
      This patch updates the minirepro utility and pg_dump as follows:
      1. Dump ddl and stats for multiple queries in a query file
      2. Dump ddl of relation that is used in CTE
      3. Dump ddl of function that is used in the query
      4. Add 2 options: relation-oids and function-oids into pg_dump command line tool
      318b86af
  7. 09 Jun, 2016 4 commits
    • H
      Simplify counting of tupletable slots, by getting rid of the counting. · 10ca88b4
      Heikki Linnakangas committed
      Backport this patch from PostgreSQL 9.0, which replaces the tuple table
      array with a linked list of individually palloc'd slots. With that, we
      don't need to know the size of the array beforehand, and don't need to
      count the slots. The counting was especially funky for subplans in GPDB,
      and it was about to change with the upcoming PostgreSQL 8.3 merge again.
      This makes it a lot simpler.
      
      I don't plan to backport the follow-up patch to remove the ExecCountSlots
      infrastructure. We'll get that later, when we merge with PostgreSQL 9.0.
      
      commit f92e8a4b
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sun Sep 27 20:09:58 2009 +0000
      
          Replace the array-style TupleTable data structure with a simple List of
          TupleTableSlot nodes.  This eliminates the need to count in advance
          how many Slots will be needed, which seems more than worth the small
          increase in the amount of palloc traffic during executor startup.
      
          The ExecCountSlots infrastructure is now all dead code, but I'll remove it
          in a separate commit for clarity.
      
          Per a comment from Robert Haas.
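To make the difference concrete, here is a toy comparison of the two designs. It is a sketch with invented class names, not the executor code: the fixed table needs its capacity counted in advance, the list-based one simply grows.

```python
class ToySlot:
    """Stand-in for a TupleTableSlot (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.tuple = None

class FixedTupleTable:
    """Old style: capacity must be counted before any slot is handed out."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.used = 0

    def alloc(self, name):
        if self.used >= len(self.slots):
            raise RuntimeError("tuple table overflow: slot count was too low")
        slot = ToySlot(name)
        self.slots[self.used] = slot
        self.used += 1
        return slot

class ListTupleTable:
    """New style: each slot is allocated on demand and appended to a list."""

    def __init__(self):
        self.slots = []

    def alloc(self, name):
        slot = ToySlot(name)
        self.slots.append(slot)
        return slot
```

With the list-based table, a miscounted plan node can no longer overflow the table, at the cost of one small allocation per slot.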
      10ca88b4
    • H
      Change the way temp schemas are created. · 8a2cf323
      Heikki Linnakangas committed
      This reverts much of the changes vs. upstream, related to temp schema
      creation. Instead of using the normal CREATE SCHEMA processing to also
      create the temporary schema, let InitTempTableNameSpace() do that
      like in the upstream. But in addition to creating the temp schema
      locally, it dispatches a special CreateSchemaStmt command to the
      executor nodes, which instructs the executor nodes to also call
      InitTempTableNameSpace().
      8a2cf323
    • H
      Bump check on regression database size to 5 GB. · 2ba36ba6
      Heikki Linnakangas committed
      The regression database has grown over time, so that it's just above the
      1 GB size that the regression test used as a "sanity check". I think the
      new zlib regression test broke the camel's back. Bump it up to 5 GB, giving
      us about 4 GB of headroom to grow.
      2ba36ba6
    • S
      c19b3a05
  8. 08 Jun, 2016 4 commits
  9. 07 Jun, 2016 7 commits
    • H
      Don't call ExecAssignScanProjectionInfo while in partitionMemoryContext. · d7662fd7
      Heikki Linnakangas committed
      This allows removing the weird pfree() of the resultTupleSlot's tuple
      descriptor. What would've happened without the pfree() is that the old
      slot was allocated in the first ExecAssignScanProjectionInfo() call, in
      partitionMemoryContext, and then immediately destroyed when the memory
      context was reset. The second call to ExecAssignScanProjectionInfo() tries
      to free the slot, again, causing the segfault. But we can avoid that
      by this rearrangement of the calls in a cleaner way.
      
      In passing, clean up the code a bit. I found having separate variables,
      indexState and scanState, which point to the same struct, to be confusing.
      d7662fd7
    • H
      Fix out-of-bounds writes to scanTupleSlot · a33699bc
      Heikki Linnakangas committed
      ss_ScanTupleSlot is not an array, it's a single slot. The slot is allocated
      from a bigger array, however, so this trampled over some other slot that was
      allocated right after the scan slot. This has apparently been harmless, as
      no-one's noticed, but it's surely wrong.
      
      I bumped into this in the PostgreSQL 8.3 merge branch, where I had changed
      the way the slots are allocated so that they're not stored in one big array
      anymore. This bug led to segfaults in that case.
      a33699bc
    • H
      Remove limitation that a default must be provided for ADD COLUMN on AO table · 455f0e19
      Heikki Linnakangas committed
      We had later added code that allows "DEFAULT NULL", by doing a table
      rewrite. DEFAULT NULL is really the same as no default, so we might as well
      do a table rewrite for that case too, and save the code needed to handle
      them differently.
      455f0e19
    • H
      Honour PLAIN storage also when constructing a memtuple. · 625ffa82
      Heikki Linnakangas committed
      A function might legitimately assume that an argument's varlen datum always
      has a 4-byte header, if the datatype is marked as 'plain', and crash if we
      then pass it a datum with 1-byte header, because it was packed in a
      memtuple. I bumped into this while working on the PostgreSQL 8.3 merge,
      because the merge brought us one such function: ts_rewrite().
      625ffa82
    • H
      Remove duplicated bitgetpage() function. · 77659037
      Heikki Linnakangas committed
      The one in nodeBitmapHeapScan.c is what the upstream has. The one in
      execBitmapHeapScan.c was copied from it in GPDB. No need for the
      duplication.
      
      It's not clear to me why we have the execBitmapHeapScan.c file at all,
      why not just use all the functions in nodeBitmapHeapScan.c. But I'll
      leave investigating that for another day.
      77659037
    • H
      Move the check for gp_mapreduce, to reduce diff vs. upstream. · ff881c1a
      Heikki Linnakangas committed
      This has the effect that gpmapreduce is exempt from logging also when it
      uses the extended query protocol (we used to only perform the checks in
      exec_simple_query()). That makes no difference in practice, though, because
      gpmapreduce doesn't actually use the extended query protocol.
      ff881c1a
    • D
      Remove instr_time definition in favor of portability/instr_time.h · e4b43c18
      Daniel Gustafsson committed
      The redefinitions broke the win32 build on Pulse. This was part of
      3bc25384 which was backported from
      upstream but left this in. This makes the win32 Pulse build green.
      e4b43c18