1. 12 Mar 2019, 1 commit
  2. 05 Mar 2019, 1 commit
    • COPY a replicated table on a subset of segments should work · de99f760
      Pengzhou Tang authored
      Commit 4eb65a53 brought GPDB the ability to distribute tables on a subset
      of segments. That commit took good care of replicated tables for
      SELECT/UPDATE/INSERT on a subset, but COPY FROM was not handled.
      
      For COPY FROM a replicated table, only one replica should be picked to
      provide the data, and the QE whose segid matches
      gp_session_id % segment_size is chosen. Obviously, for a table on a subset
      of segments, an invalid QE might be chosen.
      
      To fix this, the real numsegments of the replicated table should be used
      instead of the current cluster segment count. What's more, since the
      dispatcher can now allocate gangs on a subset of segments, the QD can
      allocate the picked gang directly and the QE no longer needs to care about
      whether it should provide data.
      
      For COPY TO a replicated table, we should also allocate the correct QEs,
      matching the numsegments of the replicated table.
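
      A minimal SQL sketch of the scenario, assuming a 3-segment cluster and a
      hypothetical replicated table rep_t whose data lives on only 2 of the
      segments (for example, a table created before gpexpand added the third
      segment):

      ```sql
      -- The table's policy records the subset it is replicated on.
      SELECT localoid::regclass, policytype, numsegments
          FROM gp_distribution_policy
          WHERE localoid = 'rep_t'::regclass;
      --  localoid | policytype | numsegments
      -- ----------+------------+-------------
      --  rep_t    | r          |           2

      -- Copying data out should involve exactly one of the 2 valid replicas;
      -- copying data in should target those 2 segments, not all 3.
      COPY rep_t TO '/tmp/rep_t.dat';
      COPY rep_t FROM '/tmp/rep_t.dat';
      ```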
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
  3. 13 Feb 2019, 1 commit
  4. 14 Jan 2019, 1 commit
    • Set client_encoding to match DB encoding in QD->QE connections. · a6c9b436
      Heikki Linnakangas authored
      QE processes largely don't care about client_encoding, because query
      results are sent through the interconnect (except for a few internal
      commands), and the query text in QD->QE messages is presumed to already be
      in the database encoding.
      
      But there were a couple of cases where it mattered. Error messages
      generated in QEs were being converted to client_encoding, but the QD
      assumed that they were in the server encoding.
      
      Now that the QEs don't know the user's client_encoding, COPY TO needs
      changes. In COPY TO, the QEs are responsible for forming the rows in the
      final client_encoding, so the QD now needs to explicitly use the COPY's
      ENCODING option when it dispatches the COPY to QEs.
      
      The COPY TO handling wasn't quite right even before this patch. It showed
      up as a regression failure in the src/test/mb/mbregress.sh 'sjis' test.
      When client_encoding was set with the PGCLIENTENCODING environment
      variable, it wasn't set correctly in the QEs, which showed up as
      incorrectly encoded COPY output. Now that we always set it to match the
      database encoding in QEs, that's moot.
      
      While we're at it, change the mbregress test so that it's not sensitive to
      row orderings. And make atmsort.pm more lenient, to recognize "COPY
      <tablename> TO STDOUT", even when the tablename contains non-ascii
      characters. These changes were needed to make the src/test/mb/ tests pass
      cleanly.
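
      As an illustration, a hedged SQL sketch of the two ways a non-default
      output encoding can now be requested for COPY TO (the table name is made
      up); in both cases the conversion is performed in the QEs, driven by the
      ENCODING option the QD attaches when dispatching:

      ```sql
      -- Explicit per-COPY encoding:
      COPY some_table TO STDOUT (ENCODING 'SJIS');

      -- Or via the client encoding (e.g. set through PGCLIENTENCODING);
      -- per the description above, the QD turns the effective client
      -- encoding into an explicit ENCODING option for the QEs.
      SET client_encoding = 'SJIS';
      COPY some_table TO STDOUT;
      ```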
      
      Fixes https://github.com/greenplum-db/gpdb/issues/5241.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/WPmHXuU9T94/gvpNOE73FwAJ
      Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
  5. 11 Jan 2019, 1 commit
  6. 25 Dec 2018, 1 commit
    • Make SendTuple do de-toasting for heap as well as memtuples (#6527) · bf5e3d5d
      David Kimura authored
      * Make SendTuple do de-toasting for heap as well as memtuples
      
      The Motion node didn't handle consistently where MemTuples and HeapTuples
      are de-toasted. This change delays de-toasting of MemTuples in the motion
      sender until SerializeTupleDirect(), which is where HeapTuples are
      de-toasted.
      
      This change also corrects a bug in de-toasting of memtuples in motion
      sender.  De-toasting of memtuples marked the slot containing the
      de-toasted memtuple as virtual.  When a TupleTableSlot is marked
      virtual, its values array should point to addresses within the memtuple.
      The bug was that the values array was populated *before* de-toasting and
      the memory used by the toasted memtuple was freed.  Subsequent usages of
      this slot suffered from dereferencing invalid memory through the
      pointers in values array.
      
      This change corrects the issue by simplifying ExecFetchSlotMemTuple() so
      that it no longer manages the memory required for de-toasting, which also
      makes the code structure more similar to ExecFetchSlotMinimalTuple() in
      Postgres.
      An alternative solution that we considered is to continue de-toasting in
      ExecFetchSlotMemTuple() and clear the virtual bit before freeing the
      tuple. A disadvantage of that approach is that it modifies an existing
      slot and complicates the contract of memtuple and virtual table slots.
      Also, it doesn't match with where HeapTuples are de-toasted so the
      contract becomes less clear.
      
      The bug is hard to discover because the motion node in the executor is the
      only place where a memtuple is de-toasted, just before sending it over the
      interconnect. Only a few plan nodes reuse a tuple that has already been
      returned to the parent node, e.g. Unique (DISTINCT ON ()) and SetOp
      (INTERSECT / EXCEPT). An example plan that manifests the bug:
      
       Gather Motion 3:1
         Merge Key: a, b
         ->  Unique
               Group By: a, b
               ->  Sort
                     Sort Key (Distinct): a, b
                     ->  Seq Scan
      
      The Unique node uses the previous tuple de-toasted by the Gather Motion sender
      to compare with the new tuple obtained from the Sort node for eliminating
      duplicates.
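
      A hedged SQL sketch of the kind of query that exercises this path (the
      table, data and column names are made up; the text values need to be
      large enough to require de-toasting):

      ```sql
      CREATE TABLE toast_test (a int, b text) DISTRIBUTED BY (a);
      INSERT INTO toast_test
          SELECT i % 10, repeat('x', 100000) || i
          FROM generate_series(1, 1000) i;

      -- A DISTINCT over (a, b) can plan as Unique -> Sort below the Gather
      -- Motion, so the Unique node compares against the slot that the motion
      -- sender has already de-toasted.
      SELECT DISTINCT a, b FROM toast_test ORDER BY a, b;
      ```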
      Co-authored-by: Asim R P <apraveen@pivotal.io>
  7. 24 Dec 2018, 1 commit
  8. 15 Dec 2018, 1 commit
  9. 14 Dec 2018, 1 commit
  10. 13 Dec 2018, 1 commit
    • Reporting cleanup for GPDB specific errors/messages · 56540f11
      Daniel Gustafsson authored
      The Greenplum-specific error handling via ereport()/elog() calls was in
      need of a unification effort, as some parts of the code were using a
      different messaging style from others (and from upstream). This aims at
      bringing many of the GPDB error calls in line with the upstream error
      message writing guidelines and thus makes the user experience of
      Greenplum more consistent.
      
      The main contributions of this patch are:
      
      * errmsg() messages shall start with a lowercase letter and not end
        with a period. errhint() and errdetail() shall be complete sentences
        starting with a capital letter and ending with a period. This attempts
        to fix as many ereport() calls as possible, with overly detailed
        errmsg() content broken up into details and hints where possible.
      
      * Reindent ereport() calls to be more consistent with the common style
        used in upstream and most parts of Greenplum:
      
      	ereport(ERROR,
      			(errcode(<CODE>),
      			 errmsg("short message describing error"),
      			 errhint("Longer message as a complete sentence.")));
      
      * Avoid breaking messages due to long lines since it makes grepping
        for error messages harder when debugging. This is also the de facto
        standard in upstream code.
      
      * Convert a few internal error ereport() calls to elog(). There are
        no doubt more that can be converted, but the low hanging fruit has
        been dealt with. Also convert a few elog() calls which are user
        facing to ereport().
      
      * Update the testfiles to match the new messages.
      
      Spelling and wording are mostly left for a follow-up commit, as this one
      was getting big enough as it was. The most obvious cases have been
      handled, but there is work left to be done here.
      
      Discussion: https://github.com/greenplum-db/gpdb/pull/6378
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
  11. 12 Dec 2018, 2 commits
    • Don't pass distribution policy from QD to QEs in COPY. · 8acaa682
      Heikki Linnakangas authored
      We store the distribution policy in the segments nowadays (since commit
      7efe3204), so passing it along is no longer needed.
      Reviewed-by: Adam Lee <ali@pivotal.io>
    • Optimize the code to send data rows from QD to QEs in COPY FROM. · 64713257
      Heikki Linnakangas authored
      This code is performance-critical, as the QD will quickly become the
      bottleneck when loading data with COPY FROM. Optimizing it is therefore
      warranted.
      
      I did some quick testing on my laptop, loading a table with 50 boolean
      columns. This commit improved the performance of that by about 10%.
      
      In GPDB 5, we used to postpone calling the input functions for columns
      that were not needed to determine which segment to send the row to. We
      lost that optimization during the 9.1 merge, when the COPY code was
      heavily refactored. We now call the input functions for every column in
      the QD, which added a lot of overhead to the QD. On the other hand, some
      other things got faster. Before this commit, the test case with 50
      booleans was about 10% slower in GPDB 6, compared to GPDB 5, so this buys
      back the performance, so that it's about the same speed again. I'm sure
      you can find a test case where GPDB 6 is still slower - you can make the
      input functions arbitrarily slow, after all - but every little helps.
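
      A rough sketch of the benchmark scenario described above (the table,
      column and file names are made up):

      ```sql
      -- Wide table of boolean columns; in COPY FROM, all the parsing and
      -- row-forwarding work funnels through the QD.
      CREATE TABLE many_bools (
          c1 bool,
          c2 bool
          -- ... up to c50 bool
      ) DISTRIBUTED BY (c1);

      COPY many_bools FROM '/tmp/many_bools.dat';
      ```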
      Reviewed-by: Adam Lee <ali@pivotal.io>
  12. 07 Dec 2018, 5 commits
  13. 04 Dec 2018, 1 commit
  14. 03 Dec 2018, 1 commit
  15. 29 Nov 2018, 7 commits
  16. 14 Nov 2018, 1 commit
    • Avoid freeing memory in error path · de09e863
      Daniel Gustafsson authored
      Erroring out via ereport(ERROR ..) will clean up resources allocated
      during execution, so explicitly freeing right before it is not useful
      (unless the allocation is in TopMemoryContext). Remove pfree() calls for
      allocations in lower memory contexts, and reorder one to happen just
      after a conditional ereport instead, to make for slightly easier
      debugging when breaking on the error.
      Reviewed-by: Jacob Champion <pchampion@pivotal.io>
  17. 13 Nov 2018, 1 commit
    • Support 'copy (select statement) to file on segment' (#6077) · bad6cebc
      Jinbao Chen authored
      In 'copy (select statement) to file', we generate a query plan, set its
      dest receiver to copy_dest_receive, and run the dest receiver on the QD.
      In 'copy (select statement) to file on segment', we modify the query plan,
      remove the Gather Motion, and let the dest receiver run on the QEs.
      
      Change 'isCtas' in Query to 'parentStmtType' to be able to mark the upper
      utility statement type. Add a CopyIntoClause node to store the copy
      information, and add copyIntoClause to PlannedStmt.
      
      In PostgreSQL we don't need to make a different query plan for a query
      inside a utility statement, but in Greenplum we do. So we use a field to
      indicate whether the query is contained in a utility statement, and the
      type of that utility statement.
      
      The behavior of 'copy (select statement) to file on segment' is actually
      very similar to 'SELECT ... INTO ...' and 'CREATE TABLE ... AS SELECT ...'.
      We use the distribution policy inherent in the query result as the final
      data distribution policy; if there is none, we use the first column in the
      target list as the key and redistribute. The only difference is that we
      use 'copy_dest_receiver' instead of 'intorel_dest_receiver'.
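
      A hedged example of the new form (the table and file path are made up);
      each QE writes its own portion of the query result to a segment-local
      file, with <SEGID> expanded to the segment id:

      ```sql
      COPY (SELECT id, amount FROM sales WHERE amount > 100)
          TO '/tmp/sales_high_<SEGID>.csv' ON SEGMENT
          WITH (FORMAT csv);
      ```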
  18. 06 Nov 2018, 1 commit
  19. 05 Nov 2018, 1 commit
  20. 29 Oct 2018, 1 commit
  21. 25 Oct 2018, 1 commit
    • Unify the way to fetch/manage the number of segments (#6034) · 8eed4217
      Tang Pengzhou authored
      * Don't use GpIdentity.numsegments directly for the number of segments
      
      Use getgpsegmentCount() instead.
      
      * Unify the way to fetch/manage the number of segments
      
      Commit e0b06678 lets us expand a GPDB cluster without a restart, so the
      number of segments may now change during a transaction and we need to
      take care of numsegments.
      
      We used to have two ways to get the number of segments: 1) from
      GpIdentity.numsegments, and 2) from gp_segment_configuration
      (cdb_component_dbs), which the dispatcher uses to decide the range of
      segments to dispatch to. We did some hard work in e0b06678 to keep
      GpIdentity.numsegments updated correctly, which made the management of
      segments more complicated; now we want an easier way to do it:
      
      1. We only allow getting segment info (including the number of segments)
      through gp_segment_configuration, which always has the newest segment
      info, so there is no need to update GpIdentity.numsegments anymore.
      GpIdentity.numsegments is left only for debugging and can be removed
      entirely in the future (see the query sketch at the end of this message).
      
      2. Each global transaction fetches/updates the newest snapshot of
      gp_segment_configuration and never changes it until the end of the
      transaction unless a writer gang is lost, so a global transaction sees a
      consistent state of the segments. We used to use gxidDispatched to do the
      same thing; it can now be removed.
      
      * Remove GpIdentity.numsegments
      
      GpIdentity.numsegments has no effect now, so remove it. This commit
      does not remove gp_num_contents_in_cluster because that would require
      modifying utilities like gpstart, gpstop, gprecoverseg etc.; let's do
      such cleanup work in another PR.
      
      * Exchange the default UP/DOWN value in the FTS cache

      Previously, the FTS prober read gp_segment_configuration, checked the
      status of the segments and then recorded it in the shared-memory array
      ftsProbeInfo->fts_status[], so that other components (mainly the
      dispatcher) could detect that a segment was down.

      All segments were initialized as down and then, in the most common case,
      updated to up. This brings two problems:

      1. fts_status is invalid until FTS completes its first loop, so the QD
      needs to check ftsProbeInfo->fts_statusVersion > 0.
      2. When gpexpand adds a new segment to gp_segment_configuration, the
      newly added segment may be marked as DOWN if FTS hasn't scanned it
      yet.

      This commit changes the default value from DOWN to UP, which resolves the
      problems mentioned above.
      
      * FTS should not be used to notify backends that a gpexpand has occurred

      As Ashwin mentioned in PR#5679, "I don't think giving FTS responsibility
      to provide new segment count is right. FTS should only be responsible for
      HA of the segments. The dispatcher should independently figure out the
      count based on catalog. gp_segment_configuration should be the only way
      to get the segment count", so FTS should be decoupled from gpexpand.
      
      * Access gp_segment_configuration inside a transaction

      * Upgrade the log level from ERROR to FATAL if the expand version has changed

      * Modify gpexpand test cases according to the new design
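
      As referenced above, a hedged sketch of the catalog-only way to derive
      the segment count (the C code uses getgpsegmentCount(); the query below
      is an illustrative SQL equivalent):

      ```sql
      -- gp_segment_configuration is the single source of truth; counting the
      -- primary segments gives the number of segments in the cluster.
      SELECT count(*) AS numsegments
      FROM gp_segment_configuration
      WHERE content >= 0 AND role = 'p';
      ```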
  22. 19 Oct 2018, 1 commit
    • Fix error handling in "COPY <table> TO <file>". · eab449f8
      Heikki Linnakangas authored
      If an error occurred in the segments, in a "COPY <table> TO <file>"
      command, the COPY was stopped, but the error was not reported to the user.
      That gave the false impression that it finished successfully, but what you
      actually got was an incomplete file.
      
      A test case is included. It uses a little helper output function that
      sometimes throws an error. Output functions are fairly unlikely to fail,
      but it could happen e.g. because of an out of memory error, or a disk
      failure. The "COPY (SELECT ...) TO <file>" variant did not suffer from
      this (otherwise, a query that throws an error would've been a much simpler
      way to test this.)
      
      The reason for this was that the code in cdbCopyGetData() that called
      PQgetResult(), and extracted the error message from the result, didn't
      indicate to the caller in any way that the error happened. To fix, delay
      the call to PQgetResult(), to a later call to cdbCopyEnd(). cdbCopyEnd()
      already had the logic to extract the error information from the PGresult,
      and throw it to the user. While we're at it, refactor cdbCopyEnd a
      little bit, to give the callers a nicer function signature.
      
      I also changed a few places that used 32-bit int to store rejected row
      counts, to use int64 instead. There was a FIXME comment about that. I
      didn't fix all the places that do that, though, so I moved the FIXME to
      one of the remaining places.
      
      Apply to master branch only. GPDB 5 didn't handle this too well, either;
      with the included test case, you got an error like this:
      
      postgres=# copy broken_type_test to '/tmp/x';
      ERROR:  missing error text
      
      That's not very nice, but at least you get an error, even if it's not a very
      good one. The code looks quite different in 5X_STABLE, so I'm not going to
      attempt improving that.
      Reviewed-by: Adam Lee <ali@pivotal.io>
  23. 28 Sep 2018, 1 commit
    • Allow tables to be distributed on a subset of segments · 4eb65a53
      ZhangJackey authored
      There was an assumption in GPDB that a table's data is always distributed
      on all segments. However, this is not always true: for example, when a
      cluster is expanded from M segments to N (N > M), all the tables are still
      on M segments. To work around the problem we used to have to alter all the
      hash-distributed tables to randomly distributed to get correct query
      results, at the cost of bad performance.
      
      Now we support table data to be distributed on a subset of segments.
      
      A new column `numsegments` is added to the catalog table
      `gp_distribution_policy` to record how many segments a table's data is
      distributed on. By doing so we can allow DML on M-segment tables; joins
      between M-segment and N-segment tables are also supported.
      
      ```sql
      -- t1 and t2 are both distributed on (c1, c2),
      -- one on 1 segment, the other on 2 segments
      select localoid::regclass, attrnums, policytype, numsegments
          from gp_distribution_policy;
       localoid | attrnums | policytype | numsegments
      ----------+----------+------------+-------------
       t1       | {1,2}    | p          |           1
       t2       | {1,2}    | p          |           2
      (2 rows)
      
      -- t1 and t1 have exactly the same distribution policy,
      -- join locally
      explain select * from t1 a join t1 b using (c1, c2);
                         QUERY PLAN
      ------------------------------------------------
       Gather Motion 1:1  (slice1; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Seq Scan on t1 b
       Optimizer: legacy query optimizer
      
      -- t1 and t2 are both distributed on (c1, c2),
      -- but as they have different numsegments,
      -- one has to be redistributed
      explain select * from t1 a join t2 b using (c1, c2);
                                QUERY PLAN
      ------------------------------------------------------------------
       Gather Motion 1:1  (slice2; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Redistribute Motion 2:1  (slice1; segments: 2)
                           Hash Key: b.c1, b.c2
                           ->  Seq Scan on t2 b
       Optimizer: legacy query optimizer
      ```
  24. 06 Sep 2018, 1 commit
    • Fix command tag in "COPY (SELECT ...) TO <file>". · f5130c20
      Heikki Linnakangas authored
      It used to always say "COPY 0", instead of the number of rows copied. This
      source line was added in PostgreSQL 9.0 (commit 8ddc05fb), but it was
      missed in the merge. Add a test case to check the command tags of
      different variants of COPY, including this one.
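
      A hedged psql illustration of the fixed behavior (the output path is made
      up; writing a server-side file requires superuser):

      ```sql
      COPY (SELECT g FROM generate_series(1, 5) g) TO '/tmp/five_rows.dat';
      -- The command tag now reports the row count:
      --   COPY 5
      -- whereas it previously always read "COPY 0".
      ```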
  25. 05 Sep 2018, 1 commit
  26. 15 Aug 2018, 1 commit
  27. 14 Aug 2018, 1 commit
    • Refine dispatching of COPY command · a1b6b2ae
      Pengzhou Tang authored
      Previously, COPY used CdbDispatchUtilityStatement directly to dispatch
      'COPY' statements to all QEs and then sent/received data through the
      primaryWriterGang. This happened to work because the primaryWriterGang is
      not recycled when a dispatcher state is destroyed, but it is nasty because
      logically the COPY command has already finished.
      
      This commit splits the COPY dispatching logic into two parts to make it
      more reasonable.
  28. 03 Aug 2018, 2 commits