1. 24 Oct 2019, 3 commits
    • gpinitsystem: Return 0 on warnings · d7b7a40a
      Committed by Mark Sliva
      Previously gpinitsystem returned 1 in case of warnings (2 in case of
      errors). We have changed gpinitsystem to return 0 for warnings so that other
      tools using gpinitsystem (such as gpupgrade) will not fail when gpinitsystem
      generates warnings.
      
      Also, gpinitsystem was exiting with a non-zero return code if certain errors or
      warnings had happened on any previous run of gpinitsystem that day. This has
      also been fixed in this commit.
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      Co-authored-by: Shoaib Lari <slari@pivotal.io>
    • Fix heap-use-after-free in gp_inject_fault() · 61042efb
      Committed by 盏一
      The Postgres libpq documentation says:
      
      >  The pointer returned by PQgetvalue points to storage that is part of
      >  the PGresult structure. One should not modify the data it points to,
      >  and one must explicitly copy the data into other storage if it is to
      >  be used past the lifetime of the PGresult structure itself.
      
      So we should call pstrdup() on the result of PQgetvalue(), because the
      response will be used past the lifetime of the PGresult structure
      returned by PQexec().
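      
      A minimal sketch of the pattern, assuming a backend extension with both
      libpq and the palloc family available (the helper name is invented):
      ```c
      #include "postgres.h"
      #include "libpq-fe.h"
      
      /*
       * Copy the first field of the result before freeing the PGresult.
       * PQgetvalue() returns a pointer into the PGresult's own storage, so
       * returning it directly and then calling PQclear() would leave the
       * caller with a dangling pointer -- the heap-use-after-free.
       */
      static char *
      fetch_first_field(PGconn *conn, const char *sql)
      {
          PGresult *res = PQexec(conn, sql);
          char     *copy = pstrdup(PQgetvalue(res, 0, 0));
      
          PQclear(res);   /* frees the storage the value pointed into */
          return copy;    /* safe: allocated in the current memory context */
      }
      ```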
  2. 23 Oct 2019, 4 commits
    • Add append optimized support on partition table (#8880) · acbfbf8b
      Committed by Jinbao Chen
      Commit b3b2797e added appendoptimized as an alias for appendonly, but
      the corresponding changes for partition tables were missed. Add them
      back.
    • Fix array overflow in AO concurrency handling · d72e2a19
      Committed by 盏一
      In the function ao_foreach_extent_file() an array named concurrency is
      used to store the concurrency levels. Its size was
      AOTupleId_MaxSegmentFileNum, which equals (MAX_AOREL_CONCURRENCY - 1).
      However, the possible index range is [0, MAX_AOREL_CONCURRENCY - 1],
      which exceeds the size of the array, so an array overflow could happen
      at runtime.
      
      Fixed by correcting the array size, defining it with
      MAX_AOREL_CONCURRENCY instead of AOTupleId_MaxSegmentFileNum.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
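      
      A hedged, standalone illustration of the off-by-one (the macro value is
      made up for the example; GPDB's real definitions live in the AO headers):
      ```c
      #define MAX_AOREL_CONCURRENCY        128    /* illustrative value */
      #define AOTupleId_MaxSegmentFileNum  (MAX_AOREL_CONCURRENCY - 1)
      
      /* BUGGY: valid indices are 0 .. MAX_AOREL_CONCURRENCY - 1, i.e.
       * MAX_AOREL_CONCURRENCY distinct slots, but this declares one fewer;
       * writing the last slot overflows the array. */
      int concurrency_buggy[AOTupleId_MaxSegmentFileNum];
      
      /* FIXED: size the array by the count of possible indices. */
      int concurrency_fixed[MAX_AOREL_CONCURRENCY];
      ```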
    • Set gp_debug_linger to 0 for assert builds as well · 3dbbaa88
      Committed by Ashwin Agrawal
      gp_debug_linger exists for developer convenience during interactive
      testing, so it is better to set it when required instead of defaulting
      it differently for assert and non-assert builds. The old default
      unnecessarily halted processes in CI, where no one is monitoring them
      or planning to attach a debugger. Whenever you want a process to hang
      around, set gp_debug_linger and the behavior from before this commit is
      restored, irrespective of assert or non-assert build.
      
      This matters more now than before because walreceiver uses the regular
      upstream error handling code instead of the GPDB-specific one. With
      that, if any error happens in the walreceiver process, it gets
      converted to FATAL, as in upstream, and walreceiver exits. But in GPDB
      assert-enabled builds, a process that hits FATAL hangs around for
      gp_debug_linger (which used to default to 2 minutes for assert builds),
      causing unnecessary, weird test failures in assert builds.
    • Align walreceiver.c code to upstream · f58591f3
      Committed by Ashwin Agrawal
      I could neither think of nor find a reason for having these diffs
      against upstream. Hence, it seems best to get rid of them.
  3. 22 Oct 2019, 9 commits
    • Properly determine the waiting backends in an isolation2 test · 83cc9590
      Committed by Asim R P
      The method used in the reader_waits_for_lock test to identify a reader
      backend that is waiting and belongs to the desired session was wrong.
      It used a separate distributed table to record the desired session ID,
      and later queried that table in utility mode.  However, the utility mode
      connection was established with a segment that didn't receive the
      session ID tuple.  Fix it by identifying the reader that is waiting on a
      lock from pg_locks, and the session it belongs to based on the query
      string in pg_stat_activity, as suggested by Heikki.
      
      Fixes GitHub issue #8830.
      
      Reviewed thoroughly and patiently by Heikki Linnakangas.
    • Bump ORCA version to 3.79 (this time for real) · b9957614
      Committed by Hans Zeller
    • Change CFLAGS to CXXFLAGS when using CXX to link (#8852) · e2c9303a
      Committed by Hao Wu
      Currently, we use CXX to link postgres because some C++ code (the glue
      code for ORCA) has been added, but the Makefile still passes CFLAGS to
      the compiler when linking. This causes an error if a compiler option is
      supported by CC but not by CXX.
      NB: CXX is used to link only when creating the postgres executable or
      when linking object files into postgres.o.
    • a3c4978d
    • Move server-build publishing to a separate CI job · aac2eeb9
      Committed by Bradford D. Boyle
      The `without_asserts` pipeline includes a time-based trigger which can
      cause the pipeline to re-run the compilation jobs for the same commit.
      When the compilation jobs attempt to re-`put` the server-build artifact,
      the job fails because the build artifacts bucket is immutable.
      
      This commit moves the server-build publishing to a separate, hidden CI
      job that does not block the release.
      Authored-by: Bradford D. Boyle <bboyle@pivotal.io>
    • Avoid recovery log messages during initdb · 1af089bf
      Committed by Ashwin Agrawal
      Initdb outputs:
      ```
      creating template1 database in data-master/base/1 ... 2019-10-15 17:32:43.566646 CST,,,p73904,th345056704,,,,0,,,seg-10000,,,,,"LOG","00000","end of transaction log location is 0/4000098",,,,,,,,"StartupXLOG","xlog.c",7417,
      2019-10-15 17:32:43.567223 CST,,,p73904,th345056704,,,,0,,,seg-10000,,,,,"LOG","00000","latest completed transaction id is 4294967295 and next transaction id is 3",,,,,,,,"StartupXLOG","xlog.c",7705,
      2019-10-15 17:32:43.567528 CST,,,p73904,th345056704,,,,0,,,seg-10000,,,,,"LOG","00000","database system is ready",,,,,,,,"StartupXLOG","xlog.c",7729,
      ok
      initializing pg_authid ... 2019-10-15 17:32:45.008956 CST,,,p73906,th83031488,,,,0,,,seg-10000,,,,,"LOG","00000","end of transaction log location is 0/40101D8",,,,,,,,"StartupXLOG","xlog.c",7417,
      2019-10-15 17:32:45.009424 CST,,,p73906,th83031488,,,,0,,,seg-10000,,,,,"LOG","00000","latest completed transaction id is 4294967295 and next transaction id is 3",,,,,,,,"StartupXLOG","xlog.c",7705,
      2019-10-15 17:32:45.010083 CST,,,p73906,th83031488,,,,0,,,seg-10000,,,,,"LOG","00000","database system is ready",,,,,,,,"StartupXLOG","xlog.c",7729,
      ok
      ```
      
      These messages are helpful during crash recovery but are noise for
      initdb. Hence, avoid them during initdb. With this change the initdb
      output looks cleaner:
      ```
      creating template1 database in data-master/base/1 ... ok
      initializing pg_authid ... ok
      initializing dependencies ... ok
      creating system views ... ok
      loading system objects' descriptions ... ok
      creating conversions ... ok
      ```
      Reported-by: Yandong Yao <yyao@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
    • Avoid crash for explain analyze with sort in utility mode · 91cdc618
      Committed by Ashwin Agrawal
      show_sort_info() crashes if `((PlanState *)
      sortstate)->instrument->cdbNodeSummary` is NULL.
      cdbexplain_localExecStats() should be called for utility mode
      connections or for DISPATCH mode on the master. Hence, the check is
      modified to avoid the crash.
      
      In general, utility mode doesn't have much of a use case for this, but
      crashing is definitely not good, hence avoid it.
      
      Fixes GitHub issue #8804.
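      
      A minimal sketch of the guard involved, with stand-in type definitions
      rather than the real executor structs:
      ```c
      #include <stddef.h>
      
      /* Stand-in types; the real ones live in the executor and cdbexplain
       * headers. */
      typedef struct CdbExplain_NodeSummary CdbExplain_NodeSummary;
      typedef struct { CdbExplain_NodeSummary *cdbNodeSummary; } Instrumentation;
      typedef struct { Instrumentation *instrument; } PlanState;
      
      /* Skip printing sort statistics when the per-node summary was never
       * populated (e.g. a utility-mode backend with no dispatched slices),
       * instead of dereferencing a NULL pointer. */
      static void
      show_sort_info_sketch(PlanState *ps)
      {
          if (ps->instrument == NULL || ps->instrument->cdbNodeSummary == NULL)
              return;             /* nothing to report; avoids the crash */
      
          /* ... emit sort method and per-segment statistics here ... */
      }
      ```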
    • docs - add step to set selinux_provider in sssd.conf (#8841) · d0beb5de
      Committed by Chuck Litzell
      * docs - add step to set selinux_provider in sssd.conf
      
      * Edit from review
    • Docs - sysctl reorg and adding reference links (#8785) · 84e35648
      Committed by Lena Hunter
      * changes to OS parameters page org
      
      * adding links for sysctl reorg
      
      * added links to sysctl section
      
      * fixing xref element issue
      
      * sysctl reorg few edits
      
      * Chuck's review edits
      
      * updates from review comments
      
      * updates from review comments
      
      * Mel's last comments and html->xml edits
      
      * remove titles from xref links
  4. 21 Oct 2019, 3 commits
    • Handle dispatching of rule statements · 2a59025b
      Committed by Georgios Kokolatos
      The common practice is to dispatch already-analysed statements. However,
      for rules, raw statements were dispatched. As a consequence, all the
      possible nodes in the tree would have to be serialisable. This was (and
      still is) not the case.
      
      Follow the common practice and dispatch the already analysed query tree
      for rules.
      
      The issue arose by coincidence during the 9.6 merge cycle, in the
      'rowsecurity' tests. Rule tests have been disabled for quite some time
      now, and even if this case were included it was never tested. A
      specific test is added now.
      
      Addresses issue #8787
      
      Reported, reviewed and heavily modified by Heikki Linnakangas.
    • Use Explicit Motion, instead of Redistribute, on top of Split Updates. · 2fdc5376
      Committed by Heikki Linnakangas
      SplitUpdate no longer needs the old row's distribution key columns to
      calculate the segment where the old row should be deleted from.
      Instead, we get the old target segment directly from the
      "gp_segment_id" junk column, like a normal non-split update does. So we
      now use an Explicit Motion on top of Split Updates, instead of a
      Redistribute Motion. For the new row, the SplitUpdate node computes the
      target segment by hashing the distribution key columns, like a
      Redistribute Motion does.
      
      The reason to do this now is that with the 9.6 merge, and the
      "pathification" of Split Updates, it became hard to represent the old
      and new values of the same columns in the Motion's hash expression
      before set_plan_references() has run. I'm sure there would be other
      solutions, but this seems like a good idea anyway.
      Reviewed-by: David Kimura <dkimura@pivotal.io>
    • Remove stuff that was left unused by commit 00e25afe. · ad8cf65b
      Committed by Heikki Linnakangas
      Noted by David Kimura.
  5. 19 Oct 2019, 4 commits
    • docs - update disk usage planning info. (#8842) · 16296e0c
      Committed by Mel Kiyama
      * docs - update disk usage planning info.
      --Add information about using tablespace for temp files
      --Also update performance overview topic (not related to this PR)
      
      This will be backported to 5.x and 4.3.x
      
      * docs - fixed some errors
      
      * docs - mentioned both temp and transaction files in description.
    • gp_replica_check: avoid printing "Success" for invalid page header · c0f9037d
      Committed by Ashwin Agrawal
      It's confusing to read:
      ```
      WARNING:  page verification failed, calculated checksum 14987 but expected 31011
      NOTICE:  invalid page header or checksum in heap file "/home/ashwin/workspace/gpdb/gpAux/gpdemo/datadirs/dbfast_mirror1/demoDataDir0/base/1/12390", block 0: Success
      ```
      
      Avoid printing `%m` for these messages: the read itself succeeded, it
      is just these verification checks that fail, and they don't set errno.
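      
      A hedged, simplified sketch of the problem and the fix (plain stdio
      stands in for the server's ereport() machinery):
      ```c
      #include <errno.h>
      #include <stdio.h>
      #include <string.h>
      
      /*
       * A checksum/header failure is detected by inspecting bytes that were
       * already read successfully, so errno was never set by this failure.
       * Appending strerror(errno) -- which is what %m expands to -- prints
       * a bogus ": Success".
       */
      void
      report_bad_page(const char *path, int blockno)
      {
          /* BUGGY: stale errno decorates the message with ": Success" */
          fprintf(stderr, "invalid page header or checksum in \"%s\", block %d: %s\n",
                  path, blockno, strerror(errno));
      
          /* FIXED: no errno decoration for a pure verification failure */
          fprintf(stderr, "invalid page header or checksum in \"%s\", block %d\n",
                  path, blockno);
      }
      ```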
    • gp_replica_check: run checks in parallel to cut down time · b662fb53
      Committed by Ashwin Agrawal
      gp_replica_check can easily perform checks in parallel for the multiple
      database directories of the various segments. This should help to
      significantly cut down time. On my laptop, just after installcheck,
      gp_replica_check finishes in 2 seconds compared to 17 seconds before. I
      think in CI this will have a huge impact, as currently gp_replica_check
      takes approximately 10 minutes after installcheck-world.
      
      I have not implemented any upper bound or batching currently. If that
      becomes a problem, we can later change it to spawn N threads at a time.
      Let's first see what effect this change has.
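      
      A minimal sketch of the fan-out with POSIX threads, assuming a
      hypothetical check_directory() helper; the real tool's structure may
      differ:
      ```c
      #include <pthread.h>
      #include <stdio.h>
      
      #define NDIRS 3
      
      /* Hypothetical per-directory check; stands in for the real comparison
       * of primary and mirror relation files. */
      static void *
      check_directory(void *arg)
      {
          printf("checking %s\n", (const char *) arg);
          return NULL;
      }
      
      int
      main(void)
      {
          const char *dirs[NDIRS] = {"seg0", "seg1", "seg2"};
          pthread_t   threads[NDIRS];
      
          /* One worker per database directory, no upper bound or batching,
           * mirroring the commit's stated approach. */
          for (int i = 0; i < NDIRS; i++)
              pthread_create(&threads[i], NULL, check_directory, (void *) dirs[i]);
          for (int i = 0; i < NDIRS; i++)
              pthread_join(threads[i], NULL);
          return 0;
      }
      ```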
    • Bump ORCA version to 3.78.0, Optimize CMemoryPoolPalloc in ORCA (#8752) · 6c92fa9f
      Committed by Chris Hajas
      CMemoryPoolPalloc previously used headers and logic that were only
      needed in CMemoryPoolTracker. For each allocation, a fairly large
      header was added, which caused memory-intensive operations in ORCA to
      use large amounts of memory. Now, we only store the size of array
      allocations in a header, when needed. Otherwise, no header information
      is needed or stored on the ORCA side. This reduces the memory
      utilization of some queries by 30% or more. For TPC-DS Q72 on my
      laptop, peak memory utilization went from 1.1GB to 720MB; the header
      accounted for ~20MB of the 720MB peak usage in Q72.
      
      Corresponding ORCA commit: greenplum-db/gporca@d828eed "Simplify CMemoryPool to reduce unnecessary headers and logic".
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
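      
      A hedged C sketch of the idea (ORCA itself is C++, and these function
      names are invented): keep a length header only for array allocations,
      which must recover their size on destruction, and hand out headerless
      storage otherwise.
      ```c
      #include <stdlib.h>
      #include <string.h>
      
      /* Array allocations keep a small prefix recording the requested size,
       * because array delete must recover it; scalar allocations get raw
       * storage with no header at all. */
      void *
      pool_alloc_array(size_t size)
      {
          char *p = malloc(sizeof(size_t) + size);
      
          if (p == NULL)
              return NULL;
          memcpy(p, &size, sizeof(size_t));   /* the only header kept */
          return p + sizeof(size_t);
      }
      
      void *
      pool_alloc(size_t size)
      {
          return malloc(size);                /* headerless fast path */
      }
      ```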
  6. 18 Oct 2019, 3 commits
  7. 17 Oct 2019, 3 commits
    • Revert c93bb171 · 8877a8e2
      Committed by Tyler Ramer
      The logic used in the initial commit is faulty and fragile.
      
      In the event that we want to force the master postmaster process to
      listen on a subset of addresses available on the system, it is most
      likely that we don't want to use the address(es) used by the
      interconnect.
      
      In the event of an external network and an internal interconnect,
      binding the "backend" listening sockets to the "external" network would
      break the interconnect.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
    • Update comments for getaddrinfo · ab61936a
      Committed by Tyler Ramer
      Some of the commentary was incorrect or misleading.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
    • Avoid panic with DEBUG5 enabled · efe11a47
      Committed by Ashwin Agrawal
      elog(DEBUG5) inside PostgresMain() after sigsetjmp() is a bad idea.
      That is the place control jumps to when an error happens in the normal
      flow, so we shouldn't perform elog() without handling that error
      first. write_stderr() exists to write exactly what this elog()
      intends. Having elog(DEBUG5) there was causing a SIGSEGV.
      
      A simple repro:
      SET log_min_messages=debug5;
      SELECT xmlconcat('bad', '<syntax');
      
      This fixes GitHub issue #8733.
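      
      A hedged, heavily simplified sketch of the control flow (the cleanup
      routines named in the comments are real backend functions; everything
      else is illustrative):
      ```c
      #include <setjmp.h>
      #include <stdio.h>
      
      static sigjmp_buf local_sigjmp_buf;
      
      int
      main(void)
      {
          if (sigsetjmp(local_sigjmp_buf, 1) != 0)
          {
              /* Control lands here mid-error, with error state still
               * pending.  Re-entering the logging machinery (elog())
               * before that state is cleaned up can recurse or touch
               * broken state; plain stderr output (write_stderr() in the
               * backend) is safe. */
              fprintf(stderr, "error recovery point reached\n");
              /* ... then FlushErrorState(), AbortCurrentTransaction() ... */
              return 1;
          }
      
          /* The normal query-processing loop would run here; on error it
           * would siglongjmp() back to the check above. */
          return 0;
      }
      ```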
  8. 16 Oct 2019, 5 commits
    • docs - add gp_workfile_* GUCs to GUC table · ce4b6bfa
      Committed by mkiyama
    • Fix session ID assignment to utility mode connections · 5c4328a0
      Committed by Asim R P
      In utility mode on a segment, a valid value should not be assigned to
      gp_session_id.  Otherwise, it may collide with a session ID dispatched
      by the master (QD) to QE backends.  One problem with such a collision
      is that a lock acquired by a utility backend may not conflict with a
      lock requested by another backend when it should.  LockCheckConflicts()
      groups together locks acquired by all processes having the same session
      ID.  Locks held by the same session do not conflict with each other.
      We have seen false positives in CI due to a utility mode backend's
      session ID being identical to a normal mode backend's session ID, e.g.
      in reader_waits_for_locks and vacuum_drop_phase_ao.
      
      The patch fixes this such that utility mode connections started on a
      segment postmaster will not set gp_session_id, leaving it initialized
      to -1.  A session with an invalid (-1) session ID is treated as a
      singleton backend, not part of an MPP session, when evaluating lock
      conflicts.  Therefore, locks held by two or more backends that all have
      -1 as session ID are all considered for conflict evaluation.  Utility
      connections on the master, however, will continue to get a valid
      gp_session_id.  This will never conflict because session IDs are
      generated only on the master, by incrementing an integer counter
      atomically.
      
      An underlying assumption for all of this to work is that the dispatcher
      assigns the EXECUTE role, which is different from UTILITY, to backends
      that take part in an MPP session.  If that changes in the future, we
      will have to rethink this solution.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      Reviewed by Ashwin Agrawal and Soumyadeep Chakraborty.
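      
      A hedged sketch of the conflict-grouping rule described above (the
      helper is invented; the real logic lives in LockCheckConflicts()):
      ```c
      #include <stdbool.h>
      
      #define InvalidGpSessionId (-1)
      
      /* Locks held by members of the same MPP session do not conflict with
       * each other.  A backend whose session ID is -1 is a singleton: it
       * shares a session with no one, so its locks always take part in
       * conflict evaluation. */
      static bool
      same_mpp_session(int holder_session_id, int requester_session_id)
      {
          if (holder_session_id == InvalidGpSessionId ||
              requester_session_id == InvalidGpSessionId)
              return false;
          return holder_session_id == requester_session_id;
      }
      ```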
    • ic: tcp: init incoming conns before outgoing conns · 296dba82
      Committed by Ning Yu
      In SetupTCPInterconnect() we initialize both incoming and outgoing
      connections. A state pointer, sendingChunkTransportState, is created to
      track the status of outgoing connections; it is an entry of the states
      array, and we expect the pointer to stay valid for the duration of the
      function.
      
      However, after we get this pointer we initialize the incoming
      connections, which may resize the states array with repalloc(), so
      sendingChunkTransportState can end up pointing to invalid memory and
      crash at runtime.
      
      To fix that, initialize the incoming connections before the outgoing
      ones, so the sendingChunkTransportState pointer stays valid throughout
      its lifecycle.
      
      No tests are added, as the bug has a chance of being triggered by
      existing tests.
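      
      A hedged, generic illustration of the hazard, using realloc() in place
      of the backend's repalloc() and invented names:
      ```c
      #include <stdlib.h>
      
      typedef struct { int fd; } ConnState;
      
      int
      main(void)
      {
          int        nstates = 4;
          ConnState *states = malloc(nstates * sizeof(ConnState));
      
          /* A pointer into the array, expected to stay valid... */
          ConnState *sending = &states[0];
      
          /* ...but growing the array may move the storage (as repalloc()
           * can in the backend), leaving `sending` dangling. */
          states = realloc(states, 2 * nstates * sizeof(ConnState));
      
          /* FIX: do all the resizing first, then take pointers into the
           * array. */
          sending = &states[0];
          sending->fd = -1;
      
          free(states);
          return 0;
      }
      ```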
    • ab8a8e0c
    • docs - fix custom hash operator class example (#8833) · 9ce24e90
      Committed by Chuck Litzell
  9. 15 Oct 2019, 4 commits
  10. 14 Oct 2019, 2 commits
    • Remove locking while looking up tablespace using name · 362c48b6
      Committed by Bhuvnesh Chaudhary
      While looking up a tablespace oid using its name, a tuple lock was
      taken on the pg_tablespace entry for the tablespace. This lock was
      there to prevent a concurrent drop of the same tablespace. However, in
      queries containing a reader and a writer gang, if the temp_tablespaces
      GUC forces spills to a specific tablespace, the reader may try to
      assign a transaction id in order to take the lock, which a reader
      process is not allowed to do. This causes such queries to fail.
      
      This commit fixes the issue by removing the requirement to acquire a
      lock while looking up a tablespace using its name.
      
      Dropping a tablespace mandates that the tablespace be empty, i.e. any
      tables created in that tablespace must be dropped before proceeding to
      drop the tablespace. If any data exists in the tablespace before DROP
      TABLESPACE is dispatched to the segments, the drop tablespace operation
      will fail. In cases where data was created in the tablespace directory
      after the command was dispatched, the contents of the tablespace
      directory are not dropped, as someone might have created a table in the
      tablespace being dropped in this small window, but the pg_tablespace
      entry is removed. The user can still run ALTER commands to move such a
      table to another tablespace.
      
      With the above behavior, and considering that DROP TABLESPACE is not
      frequent, we don't need to take locks while looking up the tablespace
      using its name, as doing so restricts other use cases. For instance:
      
      ```sql
      CREATE TABLE hashagg_spill(col1 numeric, col2 int) DISTRIBUTED BY (col1);
      INSERT INTO hashagg_spill SELECT id, 1 FROM generate_series(1,20000) id;
      ANALYZE hashagg_spill;
      SET temp_tablespaces=pg_default;
      SET statement_mem='1000kB';
      CREATE TABLE spill_temptblspace (a numeric) DISTRIBUTED BY (a);
      INSERT INTO spill_temptblspace SELECT avg(col2) col2 FROM hashagg_spill GROUP BY col1 HAVING(sum(col1)) < 0;
      ERROR: AssignTransactionId() called by Segment Reader process...
      ```
      
      Also added tests to validate that the data files in the tablespace are
      not removed in the case described above.
    • resgroup: refactor python based tests · 993c206b
      Committed by Ning Yu
      Simplified some cgroup verification logic with Python, and removed some
      hard-coded dbnames.