1. 09 Nov 2020, 5 commits
    • ic-proxy: refresh peers on demand · 9265ea6a
      Authored by Ning Yu
      The user can adjust the ic-proxy peer addresses at runtime and reload
      them by sending SIGHUP. If an address is modified or removed, the
      corresponding peer connection must be closed or reestablished. The same
      applies to the peer listener: if the listener port is changed, the
      listener must be set up again.
    • ic-proxy: classify peer addresses · 854c4b84
      Authored by Ning Yu
      The peer addresses are specified with the GUC
      gp_interconnect_proxy_addresses, which can be reloaded on SIGHUP. We
      used to care only about newly added addresses; however, the user can
      also modify or even remove some of them.

      So we now add logic to classify the addresses after parsing the GUC, so
      that we can tell whether an address was added, removed, or modified.
      
      The handling of the classified addresses will be done in the next
      commit.
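
      To illustrate, a minimal standalone sketch of the classification
      (hypothetical names and containers; the actual ic-proxy code parses the
      GUC into its own structures):

        #include <map>
        #include <string>
        #include <utility>
        #include <vector>

        enum class PeerChange { Added, Removed, Modified };

        // Diff the old and new address maps (keyed here by a peer id) and
        // report what changed for each peer.
        std::vector<std::pair<std::string, PeerChange>>
        classify_peers(const std::map<std::string, std::string> &oldAddrs,
                       const std::map<std::string, std::string> &newAddrs)
        {
            std::vector<std::pair<std::string, PeerChange>> changes;

            for (const auto &[id, addr] : newAddrs)
            {
                auto it = oldAddrs.find(id);
                if (it == oldAddrs.end())
                    changes.emplace_back(id, PeerChange::Added);
                else if (it->second != addr)
                    changes.emplace_back(id, PeerChange::Modified);
            }
            for (const auto &entry : oldAddrs)
                if (newAddrs.find(entry.first) == newAddrs.end())
                    changes.emplace_back(entry.first, PeerChange::Removed);

            return changes;
        }
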
    • ic-proxy: optimize looking up of my addr · 40facdb1
      Authored by Ning Yu
      We used to scan the whole address list to find my own address; now we
      record it directly when parsing the addresses.
    • ic-proxy: rename ICProxyAddr.addr to sockaddr · 2c2ca626
      Authored by Ning Yu
      An ICProxyAddr variable is usually named "addr", so the attribute is
      referred to as "addr->addr", which is confusing and sometimes ambiguous.

      So the attribute is renamed to "sockaddr"; the function
      ic_proxy_extract_addr() is also renamed to ic_proxy_extract_sockaddr().
    • Separate external table options out of copy option check (#11104) · e898755e
      Authored by Peifeng Qiu
      ProcessCopyOptions checks the option list of the COPY command. It's
      also called by external tables when the text/csv format is used. It's
      better not to mix external-table-specific options in here; check them
      separately instead.

      Checking the custom protocol here is not necessary because it's checked
      when parsing location URLs in GenerateExtTableEntryOptions anyway.
  2. 05 Nov 2020, 3 commits
    • Correctly seek to the end of buffile that contains multiple physical files · 5028054e
      Authored by xiong-gang
      When stashing a 'VisimapDelete' to the buffile, we must seek to the end
      of the last physical file if the buffile contains multiple files. This
      commit cherry-picks part of the following commit from upstream:
      
      commit 808e13b282efa7e7ac7b78e886aca5684f4bccd3
      Author: Amit Kapila <akapila@postgresql.org>
      Date:   Wed Aug 26 07:36:43 2020 +0530
      
          Extend the BufFile interface.
      
          Allow BufFile to support temporary files that can be used by the single
          backend when the corresponding files need to be survived across the
          transaction and need to be opened and closed multiple times. Such files
          need to be created as a member of a SharedFileSet.
      
          Additionally, this commit implements the interface for BufFileTruncate to
          allow files to be truncated up to a particular offset and extends the
          BufFileSeek API to support the SEEK_END case. This also adds an option to
          provide a mode while opening the shared BufFiles instead of always opening
          in read-only mode.
      
          These enhancements in BufFile interface are required for the upcoming
          patch to allow the replication apply worker, to handle streamed
          in-progress transactions.
      
          Author: Dilip Kumar, Amit Kapila
          Reviewed-by: Amit Kapila
          Tested-by: Neha Sharma
          Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
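
      A rough sketch of the resulting usage (C-style, assuming the extended
      BufFileSeek()/BufFileWrite() interface from the quoted upstream commit;
      the real GPDB change is in the visimap stash path):

        extern "C" {
        #include "postgres.h"
        #include "storage/buffile.h"
        }

        static void
        append_stashed_entry(BufFile *buffile, void *entry, size_t len)
        {
            /* With SEEK_END, offset 0 means the end of the *last* physical
             * file, even when the buffile spans multiple files. */
            if (BufFileSeek(buffile, 0, 0, SEEK_END) != 0)
                elog(ERROR, "could not seek to end of temporary file");

            if (BufFileWrite(buffile, entry, len) != len)
                elog(ERROR, "could not write to temporary file");
        }
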
    • Experimental cost model update (port from 6X) (#11115) · 9363718d
      Authored by Hans Zeller
      This is a cherry-pick of the change from PR https://github.com/greenplum-db/gporca/pull/607
      
      Avoid costing change for IN predicates on btree indexes
      
      Commit e5f1716 changed the way we handle IN predicates on indexes; it
      now uses a more efficient array comparison instead of treating it like
      an OR predicate. A side effect is that the cost function,
      CCostModelGPDB::CostBitmapTableScan, now goes through a different code
      path, using the "small NDV" or "large NDV" costing method. This produces
      very high cost estimates when the NDV increases beyond 2, so we
      basically never choose an index for these cases, although a btree
      index used in a bitmap scan isn't very sensitive to the NDV.
      
      To avoid this, we go back to the old formula we used before commit e5f1716.
      The fix is restricted to IN predicates on btree indexes, used in a bitmap
      scan.
      
      Add an MDP for a larger IN list, using a btree index on an AO table
      
      Misc. changes to the calibration test program
      
      - Added tests for btree indexes (btree_scan_tests).
      - Changed data distribution so that all column values range from 1...n.
      - Parameter values for test queries are now proportional to selectivity,
        a parameter value of 0 produces a selectivity of 0%.
      - Changed the logic to fake statistics somewhat, hopefully this will
        lead to more precise estimates. Incorporated the changes to the
        data distribution with no more 0 values. Added fake stats for
        unique columns.
      - Headers of tests now use semicolons to separate parts, to give
        a nicer output when pasting into Google Docs.
      - Some formatting changes.
      - Log fallbacks.
      - When using existing tables, the program now determines the table
        structure (heap or append-only) and the row count.
      - Split off two very slow tests into separate test units. These are
        not included when running "all" tests, they have to be run
        explicitly.
      - Add btree join tests, rename "bitmap_join_tests" to "index_join_tests"
        and run both bitmap and btree joins
      - Update min and max parameter values to cover a range that includes
        or at least is closer to the cross-over between index and table scan
      - Remove the "high NDV" tests, since the ranges in the general test
        now include both low and high NDV cases (<= and > 200)
      - Print out selectivity of each query, if available
      - Suppress standard deviation output when we execute queries only once
      - Set search path when connecting
      - Decrease the parameter range when running bitmap scan tests on
        heap tables
      - Run btree scan tests only on AO tables, they are not designed
        for testing index scans
      
      Updates to the experimental cost model, new calibration
      
       1. Simplify some of the formulas; the calibration process seemed to justify
          that. We might have to revisit if problems come up. Changes:
          - Rewrite some of the formulas so the costs per row and costs per byte
            are easier to see
         - Make the cost for the width directly proportional
         - Unify the formula for scans and joins, use the same per-byte costs
           and make NDV-dependent costs proportional to num_rebinds * dNDV,
           except for the logic in item 3.
      
          That makes the cost for the new experimental cost model a simple linear
          formula (transcribed in the sketch after this list):
      
         num_rebinds * ( rows * c1 + rows * width * c2 + ndv * c3 + bitmap_union_cost + c4 ) + c5
      
         We have 5 constants, c1 ... c5:
      
         c1: cost per row (rows on one segment)
         c2: cost per byte
         c3: cost per distinct value (total NDV on all segments)
         c4: cost per rebind
         c5: initialization cost
         bitmap_union_cost: see item 3 below
      
      2. Recalibrate some of the cost parameters, using the updated calibration
         program src/backend/gporca/scripts/cal_bitmap_test.py
      
      3. Add a cost penalty for bitmap index scans on heap tables. The added
         cost takes the form bitmap_union_cost = <base table rows> * (NDV-1) * c6.
      
         The reason for this is, as others have pointed out, that heap tables
         lead to much larger bit vectors, since their CTIDs are more spaced out
         than those of AO tables. The main factor seems to be the cost of unioning
         these bit vectors, and that cost is proportional to the number of bitmaps
         minus one and the size of the bitmaps, which is approximated here by the
         number of rows in the table.
      
         Note that because we use (NDV-1) in the formula, this penalty does not
         apply to usual index joins, which have an NDV of 1 per rebind. This is
         consistent with what we see in measurements and it also seems reasonable,
         since we don't have to union bitmaps in this case.
      
      4. Fix to select CostModelGPDB for the 'experimental' model, as we do in 5X.
      
      5. Calibrate the constants involved (c1 ... c6), using the calibration program
         and running experiments with heap and append-only tables on a laptop and
         also on a Linux cluster with 24 segments. Also run some other workloads
         for validation.
      
      6. Give a small initial advantage to bitmap scans, so they will be chosen over
         table scans for small tables. Otherwise, small queries will
         have more or less random plans, all of which cost around 431, the value
          of the initial cost. Added a 10% advantage to the bitmap scan.
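
         The formulas in items 1 and 3 can be transcribed directly; a minimal
         sketch with hypothetical names (the real code lives in
         CCostModelGPDB, and the constants come from the calibration in
         item 5):

           struct CostConstants
           {
               double c1;  // cost per row (rows on one segment)
               double c2;  // cost per byte
               double c3;  // cost per distinct value (total NDV on all segments)
               double c4;  // cost per rebind
               double c5;  // initialization cost
               double c6;  // bitmap union cost factor (heap tables, item 3)
           };

           // Item 3: zero when ndv == 1, so ordinary index joins (NDV of 1
           // per rebind) pay no penalty.
           double BitmapUnionCost(double base_table_rows, double ndv,
                                  const CostConstants &k)
           {
               return base_table_rows * (ndv - 1) * k.c6;
           }

           // Item 1: the linear cost formula of the experimental model.
           double BitmapScanCost(double num_rebinds, double rows, double width,
                                 double ndv, double bitmap_union_cost,
                                 const CostConstants &k)
           {
               return num_rebinds * (rows * k.c1 + rows * width * k.c2 +
                                     ndv * k.c3 + bitmap_union_cost + k.c4) +
                      k.c5;
           }
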
      
      * Port calibration program to Python 3
      
      - Used 2to3 program to do the basics.
      - Version parameter in argparse no longer supported
      - Needs additional option in connection string to keep the search path
       - The dbconn.execSQL call can no longer be used to get a cursor;
         this was probably a non-observable defect in the Python 2 version
      - Needed to use // (floor division) in some cases
      Co-authored-by: David Kimura <dkimura@vmware.com>
    • Improve partition elimination when indexes are present (port from 6X) (#11107) · bfcc63e1
      Authored by Hans Zeller
      * Improve partition elimination when indexes are present (port from 6X)
      
      * Use original join pred for DPE with index nested loop joins
      
      Dynamic partition selection is based on a join predicate. For index
      nested loop joins, however, we push the join predicate to the inner
      side and replace the join predicate with "true". This meant that
      we couldn't do DPE for nested index loop joins.
      
      This commit remembers the original join predicate in the index nested
      loop join, to be used in the generated filter map for DPE. The original
      join predicate needs to be passed through multiple layers.
      
      * SPE for index preds
      
      Some of the xforms use method CXformUtils::PexprRedundantSelectForDynamicIndex
      to duplicate predicates that could be used both as index predicates and as
      partition elimination predicates. The call was missing in some other xforms.
      Added it.
      
      * Changes to equivalent distribution specs with redundant predicates
      
      Adding redundant predicates causes some issues with generating
      equivalent distribution specs, to be used for the outer table of
      a nested index loop join. We want the equivalent spec to be
      expressed in terms of outer references, which are the columns of
      the outer table.
      
      By passing in the outer refs, we can ensure that we won't replace
      an outer ref in a distribution spec with a local variable from
      the original distribution spec.
      
      Also removed the asserts in CPhysicalFilter::PdsDerive that ensure the
      distribution spec is complete (consisting of only columns from the
      outer table) after we see a select node. Even without my changes, the
      asserts do not always hold, as this test case shows:
      
        drop table if exists foo, bar;
        create table foo(a int, b int, c int, d int, e int) distributed by(a,b,c);
        create table bar(a int, b int, c int, d int, e int) distributed by(a,b,c);
      
        create index bar_ixb on bar(b);
      
        set optimizer_enable_hashjoin to off;
        set client_min_messages to log;
      
        -- runs into assert
        explain
        select *
        from foo join bar on foo.a=bar.a and foo.b=bar.b
        where bar.c > 10 and bar.d = 11;
      
      Instead of the asserts, we now use the new method of passing in the
      outer refs to ensure that we move towards completion. We also know
      now that we can't always achieve a complete distribution spec, even
      without redundant predicates.
      
      * MDP changes
      
      Various changes to MDPs:
      
      - New SPE filters used in plan
      - New redundant predicates (partitioning or on non-partitioning columns)
      - Plan space changes
      - Cost changes
      - Motion changes
      - Regenerated, because plan switched to a hash join, so used a guc
        to force an index plan
      - Fixed lookup failures
      - Add mdp where we try unsuccessfully to complete a distribution spec
      
      * ICG result changes
      
      - Test used the 'experimental' cost model to force an index scan, but we
        now get the index scan even with the default cost model (plans currently
        fall back).
      
      This is a cherry-pick of commit 4a7a6821 from the 6X_STABLE branch.
  3. 04 Nov 2020, 8 commits
    • Remove unused chevron operator. · 460780eb
      Authored by Jesse Zhang
    • Remove no-op OsPrint methods. · ad86711b
      Authored by Jesse Zhang
      They have the same no-op implementation as the overridden base method,
      and they aren't called anywhere.
    • Remove no-op print from test. · a567bbbb
      Authored by Jesse Zhang
      The method called doesn't actually do any printing, and we don't assert
      on the output. Removing the call doesn't even change the program output.
    • Make CJob::OsPrint const. · 420f3eb0
      Authored by Jesse Zhang
    • Make CMemo::OsPrint const by fixing CSyncList. · cc7fa71b
      Authored by Jesse Zhang
      While working on extracting a common implementation of DbgPrint() into a
      mixin (commit forthcoming), I ran into the curious phenomenon that is
      the non-const CMemo::OsPrint. I almost dropped the requirement that
      DbgPrint requires "OsPrint() const", before realizing that the root
      cause is CSyncList has non-const Next() and friends. And that could be
      easily fixed. Make it so.
      
      While we're at it, also fixed a fairly obvious omission in
      CMemo::OsPrint where the output stream parameter was unused. We output
      to an unrelated "auto" stream instead. This was probably never noticed
      because we were relying on the assumption that streams are always
      connected to standard output.
    • Remove unnecessary "virtual" specifier. · 0158b5ce
      Authored by Jesse Zhang
      These are classes that are only implementing an OsPrint method just so
      that they can have a debug printing facility. They are not overriding
      anything from a base class, so the "virtual" specifiers were just a bad
      habit. Remove them.
    • Remove unused class gpos::COstreamFile. · 0f2c1b75
      Authored by Jesse Zhang
      I looked through the history; this class was dead on arrival and *never*
      used. Ironically, we kept adding #include for its header over the years
      to places that didn't use the class.
    • Remove dead headers. · dfa1f0a6
      Authored by Jesse Zhang
      Mortality date, in chronological order:
      
      gpos/memory/ICache.h: Added in 2010, orphaned in 2011 (private commit)
      CL 90194
      
      gpopt/utils/COptClient.h and gpopt/utils/COptServer.h: Added in 2012,
      orphaned in 2015 (private commit) MPP-25631
      
      gpopt/base/CDrvdPropCtxtScalar.h: Dead on arrival when added in 2013
      
      gpos/error/CAutoLogger.h: Added in 2012, orphaned in 2014 (private
      commit) CL 189022
      
      unittest/gpos/task/CWorkerPoolManagerTest.h: Added in 2010, orphaned in
      2019 in commit 61c7405a "Remove multi-threading code"
      (greenplum-db/gporca#510)
      
      unittest/gpos/task/CAutoTaskProxyTest.h wasn't removed in commit
      61c7405a, probably because there was a reference in
      CWorkerPoolManagerTest.h, which was also left behind (chained
      orphaning).
  4. 03 Nov 2020, 1 commit
  5. 30 Oct 2020, 1 commit
    • Reset wrote_xlog in pg_conn to avoid keeping old value. (#11077) · 777b51cd
      Authored by (Jerome)Junfeng Yang
      On QD, we track whether a QE wrote xlog in the libpq connection.

      The logic is: if a QE writes xlog, it sends a libpq message to the QD.
      But the message is sent in ReadyForQuery, so before the QE executes
      that function, it may already have sent results back to the QD. When
      the QD processes those results, it has not yet read the new wrote_xlog
      value. This leaves the connection holding the previous dispatch's
      wrote_xlog value, which affects whether one-phase commit is chosen.

      The issue only happens when the QE flushes the libpq message before the
      ReadyForQuery function, so it is hard to find a case to cover it. I
      found the issue while playing with the code to send some information
      from QE to QD; it broke the gangsize test, which shows the commit info.
  6. 29 Oct 2020, 1 commit
    • Skip fts probe for fts process · 3cf72f6c
      Authored by dh-cloud
      If cdbcomponent_getCdbComponents() caught an error thrown by the
      function getCdbComponents, FtsNotifyProber would be called. But if that
      happened inside the fts process, the fts process would hang.

      Skip the fts probe for the fts process; after that, under the same
      situation, the fts process will exit and then be restarted by the
      postmaster.
  7. 28 Oct 2020, 2 commits
    • Collect pgstat from QE to enable auto-ANALYZE on partition leaf table. (#10988) · 259cb9e7
      Authored by (Jerome)Junfeng Yang
      Collect tuple-related pgstat table info from segments, so that
      auto-analyze can now consider partition tables. Previously we did not
      have accurate pgstat for partition leaf tables: this kind of info is
      counted through the access method on segments, and we used to collect
      it via the estate es_processed count on QD. So when inserting into the
      root partition table, we could not know how many tuples were inserted
      into each leaf, and autovacuum never triggered auto-ANALYZE for a leaf
      table.

      The idea is: a writer QE reports the pgstat of the current nest level's
      xact tables to the QD through libpq at the end of a query statement. A
      single statement does not operate on many tables, so the effort is
      really small. On the QD, retrieve these tables' stats from the dispatch
      result, combine them, and add them to the current nest level's xact
      pgstats. Now we can remove the old pgstat collection code on the QD.

      The pgstat for a table can be viewed via the query
      `pg_stat_all_tables_internal`. Except for the scan-related counters,
      the other counters should now be accurate. On master, the table's
      scan-related pgstat counters are not gathered from segments yet; this
      requires extra work. The current implementation is already enough to
      support auto-ANALYZE on partition tables.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
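
      A minimal standalone sketch (hypothetical types and counters; not the
      actual GPDB code) of the QD-side combine step described above:

        #include <cstdint>
        #include <map>

        struct TableCounters
        {
            std::int64_t tuples_inserted = 0;
            std::int64_t tuples_updated = 0;
            std::int64_t tuples_deleted = 0;
        };

        // Fold the per-table counters reported by one QE into the QD's
        // current nest-level xact pgstat, keyed by table OID.
        void
        combine_qe_pgstat(std::map<unsigned int, TableCounters> &qd_stats,
                          const std::map<unsigned int, TableCounters> &qe_stats)
        {
            for (const auto &entry : qe_stats)
            {
                TableCounters &dst = qd_stats[entry.first];
                dst.tuples_inserted += entry.second.tuples_inserted;
                dst.tuples_updated += entry.second.tuples_updated;
                dst.tuples_deleted += entry.second.tuples_deleted;
            }
        }
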
    • mask all signals in the udp pthreads · 54451fc0
      Authored by 盏一
      In some cases, signals (like SIGQUIT) that should only be processed by
      the main thread of the postmaster may be dispatched to rxThread. So we
      should block all signals in the UDP pthreads, and it is safe to do so.
      
      Fix #11006
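
      A minimal sketch of the pattern (a hypothetical wrapper; the real
      change is in the UDP interconnect's thread setup): block every signal
      before pthread_create() so the new thread inherits a full mask, then
      restore the creator's mask.

        #include <pthread.h>
        #include <signal.h>

        static void *
        rx_thread_main(void *)  // stand-in for the interconnect receive loop
        {
            return nullptr;
        }

        int
        spawn_rx_thread(pthread_t *thread)
        {
            sigset_t all_signals;
            sigset_t saved;

            sigfillset(&all_signals);
            /* The new thread inherits the creator's mask at creation time. */
            pthread_sigmask(SIG_BLOCK, &all_signals, &saved);

            int rc = pthread_create(thread, nullptr, rx_thread_main, nullptr);

            /* The main thread goes back to handling signals as before. */
            pthread_sigmask(SIG_SETMASK, &saved, nullptr);
            return rc;
        }
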
  8. 27 Oct 2020, 5 commits
    • EXCLUDE in window functions works now, remove 'gp_ignore_window_exclude'. · 8299a524
      Authored by Heikki Linnakangas
      Previously, GPDB did not support the SQL "EXCLUDE [CURRENT ROW | GROUP |
      TIES]" syntax in window functions. We got support for that from upstream
      with the PostgreSQL v12 merge. That left the GUC obsolete and unused.
      
      Update the 'olap_window' test accordingly. NOTE: the 'olap_window' test
      isn't currently run as part of the regression suite! I don't know why
      it's been neglected like that, but that's not this patch's fault. The
      upstream 'window' test has queries with the EXCLUDE clause, so it's
      covered.
      
      Reviewed-by: Jimmy Yih
    • Add query info hook for CTAS query type. (#11050) · c8d84436
      Authored by Jialun
      GPCC wants to hook queries like:
      - create table ... as select ...
      - create materialized view ... as select ...
    • Remove Orca assertions when merging buckets · 34ae3d94
      Authored by Chris Hajas
      These assertions started getting tripped in the previous commit when
      adding tests, but they aren't related to the Epsilon change. Rather, we
      are calculating the frequency of a singleton bucket using two different
      methods, which causes the assertion to break down. The first method
      (calculating the upper_third) assumes the singleton has 1 NDV and that
      there is an even distribution across the NDVs. The second (in
      GetOverlapPercentage) calculates a "resolution" that is based on
      Epsilon and assumes the bucket contains some small Epsilon frequency.
      This results in the overlap percentage being too high; instead, it too
      should likely be based on the NDV.
      
      In practice, this won't have much impact unless the NDV is very small.
      Additionally, the conditional logic is based on the bounds, not
      frequency. However, it would be good to align in the future so our
      statistics calculations are simpler to understand and predictable.
      
      For now, we'll remove the assertions and add a TODO. Once we align the
      methods, we should add these assertions back.
    • Fix stats bucket logic for Double values in UNION queries in Orca · ba4deed0
      Authored by Chris Hajas
      When merging statistics buckets for UNION and UNION ALL queries
      involving a column that maps to Double (e.g. floats, numeric, and
      time-related types), we could end up in an infinite loop. This occurred
      if the bucket boundaries that we compared were within a very small
      value, defined in Orca as Epsilon. While we considered two values equal
      if they were within Epsilon, we didn't apply this when computing
      whether datum1 < datum2. Therefore we'd get into a situation where a
      datum could be both equal to and less than another datum, which the
      logic wasn't able to handle.
      
      The fix is to make sure we have a hard boundary for when we consider a
      datum less than another datum, by including the epsilon logic in all
      datum comparisons. Now, two datums are equal if they are within
      epsilon, but datum1 is less than datum2 only if datum1 < datum2 - epsilon.
      
      Also add some tests since we didn't have any tests for types that mapped
      to Double.
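
      A minimal sketch of the comparison rule (hypothetical helpers; Orca's
      real code works on its datum abstractions, and the Epsilon value here
      is purely illustrative):

        #include <cmath>

        const double Epsilon = 1e-10;  // illustrative value only

        bool DatumsEqual(double d1, double d2)
        {
            return std::fabs(d1 - d2) <= Epsilon;
        }

        // Hard boundary: a datum can no longer be both "equal to" and
        // "less than" another datum.
        bool DatumLess(double d1, double d2)
        {
            return d1 < d2 - Epsilon;
        }
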
    • Make 'rows' estimate more accurate for plans that fetch only a few rows. · f4d48358
      Authored by Heikki Linnakangas
      In commit c5f6dbbe, we changed the row and cost estimates on plan nodes
      to represent per-segment costs. That made some estimates worse, because
      the effects of the estimate "clamping" compound. Per my comment on the
      PR back then:
      
      > One interesting effect of this change, that explains many of the
      > plan changes: If you have a table with very few rows, or e.g. a qual
      > like id = 123 that matches exactly one row, the Seq/Index Scan on it
      > will be marked with rows=1. It now means that we estimate that every
      > segment returns one row, although in reality, only one of them will
      > return a row, and the rest will return nothing. That's because the
      > row count estimates are "clamped" in the planner to at least
      > 1. That's not a big deal on its own, but if you then have e.g. a
      > Gather Motion on top of the Scan, the planner will estimate that the
      > Gather Motion returns as many rows as there are segments. If you
      > have e.g. 100 segments, that's relatively a big discrepancy, with
      > 100 rows vs 1. I don't think that's a big problem in practice, I
      > don't think most plans are very sensitive to that kind of a
      > misestimate. What do you think?
      >
      > If we wanted to fix that, perhaps we should stop "clamping" the
      > estimates to 1. I don't think there's any fundamental reason we need
      > to do it. Perhaps clamp down to 1 / numsegments instead.
      
      But I came up with a less intrusive idea, implemented in this commit:
      Most Motion nodes have a "parent" RelOptInfo, and the RelOptInfo
      contains an estimate of the total number of rows, before dividing it
      with the number of segments or clamping. So if the row estimate we get
      from the subpath seems clamped to 1.0, we look at the row estimate on
      the underlying RelOptInfo instead, and use that if it's smaller. That
      makes the row count estimates better for plans that fetch a single row
      or a few rows, same as they were before commit c5f6dbbe. Not all
      RelOptInfos have a row count estimate, and the subpaths estimate is
      more accurate if the number of rows produced by the path differs from
      the number of rows in the underlying relation, e.g.  because of a
      ProjectSet node, so we still prefer the subpath's estimate if it
      doesn't seem clamped.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
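
      A minimal sketch of the tweak (hypothetical signature; the real logic
      sits where Motion paths take their row counts from subpaths):

        // Prefer the un-clamped total on the parent RelOptInfo when the
        // subpath's per-segment estimate looks clamped to 1.0.
        double
        motion_row_estimate(double subpath_rows, bool rel_has_rows,
                            double rel_rows)
        {
            if (subpath_rows <= 1.0 && rel_has_rows && rel_rows < subpath_rows)
                return rel_rows;  /* e.g. 0.01 instead of 1.0 per segment */
            return subpath_rows;
        }
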
  9. 26 Oct 2020, 2 commits
    • Handle PartitionSelector in plan_tree_mutator(). · 76c99759
      Authored by Heikki Linnakangas
      It was handled in expression_tree_mutator(), which is why everything
      worked, but that's not the right place. expression_tree_mutator() is
      supposed to handle nodes that can appear in expressions, and
      plan_tree_mutator() is supposed to handle Plan nodes.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
    • Fix: might recycle wrong gang size. · 269c3b73
      Authored by dh-cloud
      In buildGangDefinition, newGangDefinition->db_descriptors are
      initialized one by one, but newGangDefinition->size was already set to
      its final value. If an error was caught partway through, the size
      should be reset to the number of descriptors actually initialized.
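
      A minimal sketch of the idea (hypothetical types; the real code works
      on SegmentDatabaseDescriptor arrays): grow the size only as each
      descriptor is successfully initialized, so error cleanup never touches
      uninitialized slots.

        #include <vector>

        struct SegmentDescriptor { /* connection state, etc. */ };

        struct Gang
        {
            int size = 0;
            std::vector<SegmentDescriptor *> db_descriptors;
        };

        void
        build_gang(Gang &gang, int requested)
        {
            gang.db_descriptors.resize(requested, nullptr);
            for (int i = 0; i < requested; i++)
            {
                gang.db_descriptors[i] = new SegmentDescriptor();  // may fail
                gang.size = i + 1;  // counts only initialized descriptors
            }
        }
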
  10. 23 Oct 2020, 12 commits
    • Relfrozenxid must be invalid for append-optimized tables · e68d5b8a
      Authored by Asim R P
      Append-optimized tables do not contain transaction information in
      their tuples.  Therefore, pg_class.relfrozenxid must remain invalid.
      This is done correctly during table creation; however, when the
      table was rewritten, the relfrozenxid was accidentally set.  Fix it
      such that the diff with upstream is minimised.  In particular, the
      function "should_have_valid_relfrozenxid" is removed.
      
      The fixme comments that led me to this bug are also removed.
      
      Reviewed by: Ashwin Agrawal
    • Fix CLOSE_WAIT leaks when Gang recycling · 990454e8
      Authored by dh-cloud
      The PostgreSQL libpq documentation says:
      
      > Note that when PQconnectStart or PQconnectStartParams returns a
      > non-null pointer, you must call PQfinish when you are finished
      > with it, in order to dispose of the structure and any associated
      > memory blocks. **This must be done even if the connection attempt
      > fails or is abandoned**.
      
      However, the cdbconn_disconnect() function did not call PQfinish when
      the connection status was CONNECTION_BAD, which can leak sockets (left
      in the CLOSE_WAIT state).
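
      A minimal sketch of the corrected behaviour (simplified; not the
      actual cdbconn_disconnect()):

        #include <libpq-fe.h>

        void
        disconnect(PGconn **conn)
        {
            if (*conn != nullptr)
            {
                /* PQfinish() disposes of the PGconn and closes the socket
                 * even when PQstatus(*conn) == CONNECTION_BAD; libpq
                 * requires this for failed or abandoned connections too. */
                PQfinish(*conn);
                *conn = nullptr;
            }
        }
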
    • 9a4c1c0b
    • Decorate virtual functions with "override" · aedfd1f5
      Authored by Jesse Zhang
      This conforms our code to the best practice that each polymorphic
      function should have exactly one of the "virtual", "override", and
      "final" specifiers. I made this change using the following invocation:
      
      clang-tidy -checks '-*,modernize-use-override'
      
      Once we've made this change, ideally this practice can be enforced in
      CI. We can just run clang-tidy with a single "modernize-use-override"
      check to start with. Or we can see if our compilers are helpful enough.
      Fortunately Clang already issues warnings (turned into errors by
      -Werror) when we have inconsistent use of "override", and GCC appears to
      have something similar (-Wsuggest-override) in versions 9.2 and later.
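
      A small illustration of the convention being enforced:

        #include <ostream>

        class CBase
        {
        public:
            virtual ~CBase() = default;
            // "virtual" appears only where the function is introduced.
            virtual void OsPrint(std::ostream &os) const { os << "CBase"; }
        };

        class CDerived : public CBase
        {
        public:
            // "override" alone; no redundant "virtual".
            void OsPrint(std::ostream &os) const override { os << "CDerived"; }
        };
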
    • Publicize deleted member functions · 76b1b6eb
      Authored by Jesse Zhang
      Now that we are explicitly declaring copy-assignment operators and copy
      constructors as deleted, we should also make them public -- private and
      =delete don't make much sense in combination, and this results in the
      best diagnostics in practice [1].
      
      In the previous commit where private unimplemented special member
      functions are changed to be declared as deleted, we left the new
      declarations private. This follows up by moving those deleted functions
      to the public section of their classes.
      
      I made this change guided by tooling: even though it doesn't offer a
      fix, clang-tidy is useful enough to emit diagnostics that look like the
      following:
      
      /home/pivotal/workspace/gpdb/src/backend/gporca/libgpopt/include/gpopt/base/CColRef.h:76:2: warning: deleted member function should be public [modernize-use-equals-delete]
              CColRef(const CColRef &) = delete;
              ^
      
      The process of making this change is a tale of much sorrow, blood and
      tears. Suffice it to say 20 different regular expressions were
      involved, and I'm edging closer to PTSD when looking at backslashes.
      
      References:
      [1] https://abseil.io/tips/143#summary
    • Modernize: use equals-default and equals-delete · 9f2e9344
      Authored by Jesse Zhang
      This commit replaces two pre-C++ 11 practices with their modern, more
      intent-expressive equivalents: "disallowed" copy and assignments, and
      pairs of empty braces for special member function (destructors, default
      constructors, and etc) bodies.
      
      "Disallowed" copy constructors / assignment
      -------------------------------------------
      Old: private, unimplemented copy constructors and copy-assignment
      operators in classes. These are usually paired with a semi-descriptive
      comment.
      
      New: publicly declare them as "= delete".
      
      Fun fact: we had 13 spellings in comments on disallowed copy
      constructors:
      
      01. disable copy ctor
      02. hidden copy ctor
      03. inaccessible copy ctor
      04. no copy ctor
      05. no default copy ctor
      06. private copy ctor
      07. private no copy ctor
      08. disabled copy constructor
      09. no copy constructor
      10. private copy constructor
      11. copy c'tor - not defined
      12. no copy c'tor
      13. private copy c'tor
      
      This commit removes all of them, because the new code
      ("T(const T &) = delete") already clearly expresses the intent of
      disallowing copy and assignment.
      
      To keep the history clear, this commit leaves the declaration of
      "prohibited" functions private. A forthcoming commit will wholesale
      change them to public.
      
      Defaulted special member functions
      ----------------------------------
      Old:
      struct A {
        A() {}
        ~A();
      };
      A::~A() {}
      
      New:
      struct A {
        A() = default;
        ~A();
      };
      A::~A() = default;
      
      Replacing empty braces with defaulting not only makes these functions
      clearer, it also enables more opportunities for compiler optimization,
      as e.g. some defaulted functions might be recognized as trivial.
      
      Most of this commit is produced by running clang-tidy with an invocation
      like the following (plus some CMake and shell tricks):
      
      clang-tidy-12 -header-filter 'gpdbcost|gpopt|gpos|naucrates' -checks '-*,modernize-use-equals-delete,modernize-use-equals-default'
      
      The tool uses a slightly conservative heuristic to detect a large
      portion of the two outdated patterns above and rewrite them into using
      "= delete" and "= default". Making the "= delete" functions public, is
      sadly a FIXME item, so we'll have to do it by hand (in a forthcoming
      commit).
    • Avoid ".." in include paths to accommodate tooling. · d5bf6eae
      Authored by Jesse Zhang
      The current setup in CMake (and the Makefiles too, but that's an even
      harder problem) leads to equivalent-but-not-identical header include
      paths (-I) for the same directory, e.g. -Ilibgpopt/include vs
      -Ilibgpdbcost/../libgpopt/include .
      
      This confuses Clang-based tooling into identifying multiple paths for
      the same header -- the difference being the extra sibling directory
      followed by dot-dot, or "niece" directory followed by two levels of
      dot-dot, and so forth -- as different headers. That, in turn, undermines
      the conflict resolution and edit deduplication features in Clang's
      refactoring engine, leading to duplicate edits when applying FixIt's.
      
      This commit applies some fairly simple fixes to spell the sibling
      directories in a way that generates consistent include paths, so that
      Clang tooling is more functional. The Makefiles are left unchanged, as
      they are a lot more difficult to make "right". One can argue that we
      _might_ want to instead transform the intermediate representation of
      Clang's "replacement" YAML files, but that's left for another day.
    • Remove unused private fields · fcbbe77e
      Authored by Jesse Zhang
      During the development of a forthcoming commit that replaces private
      unimplemented special member functions with explicitly deleted ones, we
      got a surprising improvement in compiler diagnostics: Clang started to
      see a lot of unused private fields that had been masked by the presence
      of unimplemented functions in the class. This makes sense because an
      ostensibly unused field _could be_ used by a method whose definition
      the compiler hasn't seen in the current translation unit -- it also
      suggests that the compiler would be better equipped to detect this if
      we had used whole-program analysis.
      
      A sample of the errors looks like the following:
      
      In file included from CMiniDumper.cpp:14:
      In file included from ../../../../../../src/backend/gporca/libgpos/include/gpos/error/CErrorContext.h:17:
      ../../../../../../src/backend/gporca/libgpos/include/gpos/error/CMiniDumper.h:32:15: error: private field 'm_mp' is not used [-Werror,-Wunused-private-field]
              CMemoryPool *m_mp;
                           ^
      1 error generated.
      
      There are 12 such warnings, and this commit fixes them all. Note that
      code mortality is a chain reaction: oftentimes, a variable (including a
      parameter) is only live because it's passed to another variable. While I
      was at it, I also performed chained removal of variables and parameters
      that became dead after removing the dead fields.
    • Remove debug-only private fields in release builds · 28d15c6e
      Authored by Jesse Zhang
      We have a bunch of classes with private fields that are used only in
      debug builds; some of them are probably opportunities for complete
      removal. This is exposed by an upcoming commit that enforces the use of
      "=delete" and "=default" throughout the codebase.
      
      I've attempted to solve this by adding the GPOS_ASSERTS_ONLY attribute,
      but GCC doesn't like that, throwing errors like the following:
      
      In file included from ../src/backend/gporca/libgpos/include/gpos/common/clibwrapper.h:22,
                       from ../src/backend/gporca/libgpos/include/gpos/error/CMessage.h:21,
                       from ../src/backend/gporca/libgpos/include/gpos/error/CMessageTable.h:14,
                       from ../src/backend/gporca/libgpos/include/gpos/error/CMessageRepository.h:14,
                       from ../src/backend/gporca/libgpos/src/_api.cpp:15:
      ../src/backend/gporca/libgpos/include/gpos/attributes.h:15:43: error: ‘unused’ attribute ignored [-Werror=attributes]
         15 | #define GPOS_UNUSED __attribute__((unused))
            |                                           ^
      ../src/backend/gporca/libgpos/include/gpos/attributes.h:21:27: note: in expansion of macro ‘GPOS_UNUSED’
         21 | #define GPOS_ASSERTS_ONLY GPOS_UNUSED
            |                           ^~~~~~~~~~~
      ../src/backend/gporca/libgpos/include/gpos/memory/CAutoMemoryPool.h:56:31: note: in expansion of macro ‘GPOS_ASSERTS_ONLY’
         56 |  ELeakCheck m_leak_check_type GPOS_ASSERTS_ONLY;
            |                               ^~~~~~~~~~~~~~~~~
      cc1plus: all warnings being treated as errors
      
      So we're back to the good ol' #ifdef GPOS_DEBUG.
      
      This is the first of a pair of manual changes. In the immediately
      following commit I'll remove the remaining (unconditionally) unused
      private fields.
    • Add workload3 to explain pipeline · 91ed33c9
      Authored by David Kimura
    • 31a35cd8
    • Extract new error message fields in cdbdisp_get_PQerror() · 5cf09827
      Authored by Shaoqi Bai
      Postgres commit 991f3e5a introduced new error message fields, but
      cdbdisp_get_PQerror() did not extract these newly added fields, so
      clients could not see them through libpq.
      
      This fixes https://github.com/greenplum-db/gpdb/issues/7934.
      Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
      Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>