Correctly deallocate prepared statements if we fail inside a transaction

- Addresses issue #12330 Overview ======== Cached postgres prepared statements become invalidated if the schema changes in a way that it affects the returned result. Examples: - adding or removing a column then doing a 'SELECT *' - removing the foo column then doing a 'SELECT bar.foo' In normal operation this isn't a problem, we can rescue the error, deallocate the prepared statement and re-issue the command. However in PostgreSQL transactions, once any command fails, the transaction becomes 'poisoned' and any subsequent commands will raise InFailedSQLTransaction. This includes DEALLOCATE statements, so the default deallocation strategy instead of removing the cached prepared statement instead raises InFailedSQLTransaction. Why this is bad =============== 1. InFailedSQLTransaction is a fairly cryptic error and doesn't communicate any useful information about what has actually gone wrong. 2. In the naive implementation the prepared statement never gets deallocated - it stays alive for the length of the session taking up memory on the postgres server. 3. It is unsafe to retry the transaction because the bad prepared statement is still in the cache and we would see the exact same failure repeated. Solution ======== If we are outside a transaction we can continue to handle these failures gracefully in the usual way. Inside a transaction instead of issuing a DEALLOCATE command that will certainly fail, we now raise ActiveRecord::PreparedStatementCacheExpired. This can be handled further up the stack, notably inside TransactionManager#within_new_transaction. Here we can make sure to first rollback the transaction, then safely issue DEALLOCATE statements to invalidate the rest of the cached prepared statements. This also allows the user (or some gem) the opportunity to catch this error and voluntarily retry the transaction if a schema change causes the prepared statement cache to become invalidated. Because the outdated statement has been deallocated, we can expect the transaction to succeed on the second try.

Correctly deallocate prepared statements if we fail inside a transaction
- Addresses issue #12330 Overview ======== Cached postgres prepared statements become invalidated if the schema changes in a way that it affects the returned result. Examples: - adding or removing a column then doing a 'SELECT *' - removing the foo column then doing a 'SELECT bar.foo' In normal operation this isn't a problem, we can rescue the error, deallocate the prepared statement and re-issue the command. However in PostgreSQL transactions, once any command fails, the transaction becomes 'poisoned' and any subsequent commands will raise InFailedSQLTransaction. This includes DEALLOCATE statements, so the default deallocation strategy instead of removing the cached prepared statement instead raises InFailedSQLTransaction. Why this is bad =============== 1. InFailedSQLTransaction is a fairly cryptic error and doesn't communicate any useful information about what has actually gone wrong. 2. In the naive implementation the prepared statement never gets deallocated - it stays alive for the length of the session taking up memory on the postgres server. 3. It is unsafe to retry the transaction because the bad prepared statement is still in the cache and we would see the exact same failure repeated. Solution ======== If we are outside a transaction we can continue to handle these failures gracefully in the usual way. Inside a transaction instead of issuing a DEALLOCATE command that will certainly fail, we now raise ActiveRecord::PreparedStatementCacheExpired. This can be handled further up the stack, notably inside TransactionManager#within_new_transaction. Here we can make sure to first rollback the transaction, then safely issue DEALLOCATE statements to invalidate the rest of the cached prepared statements. This also allows the user (or some gem) the opportunity to catch this error and voluntarily retry the transaction if a schema change causes the prepared statement cache to become invalidated. Because the outdated statement has been deallocated, we can expect the transaction to succeed on the second try.
50c53340 · Sam Davies · e670611e · 50c53340 · 50c53340 · 50c53340
4 changed file
--- a/activerecord/lib/active_record/connection_adapters/abstract/transaction.rb
+++ b/activerecord/lib/active_record/connection_adapters/abstract/transaction.rb
@@ -182,7 +182,10 @@ def within_new_transaction(options = {})
        transaction = begin_transaction options
        yield
      rescue Exception => error
-        rollback_transaction if transaction
+        if transaction
+          rollback_transaction
+          after_failure_actions(transaction, error)
+        end
        raise
      ensure
        unless error
@@ -208,7 +211,16 @@ def current_transaction
      end

      private
+
        NULL_TRANSACTION = NullTransaction.new
+
+        # Deallocate invalidated prepared statements outside of the transaction
+        def after_failure_actions(transaction, error)
+          return unless transaction.is_a?(RealTransaction)
+          return unless error.is_a?(ActiveRecord::PreparedStatementCacheExpired)
+          @connection.clear_cache!
+        end
+
    end
  end
 end
--- a/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb
+++ b/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb
@@ -589,25 +589,41 @@ def exec_cache(sql, name, binds)
            @connection.exec_prepared(stmt_key, type_casted_binds)
          end
        rescue ActiveRecord::StatementInvalid => e
-          pgerror = e.cause
+          raise unless is_cached_plan_failure?(e)

-          # Get the PG code for the failure.  Annoyingly, the code for
-          # prepared statements whose return value may have changed is
-          # FEATURE_NOT_SUPPORTED.  Check here for more details:
-          # http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/cache/plancache.c#l573
-          begin
-            code = pgerror.result.result_error_field(PGresult::PG_DIAG_SQLSTATE)
-          rescue
-            raise e
-          end
-          if FEATURE_NOT_SUPPORTED == code
+          # Nothing we can do if we are in a transaction because all commands
+          # will raise InFailedSQLTransaction
+          if in_transaction?
+            raise ActiveRecord::PreparedStatementCacheExpired.new(e.cause.message)
+          else
+            # outside of transactions we can simply flush this query and retry
            @statements.delete sql_key(sql)
            retry
-          else
-            raise e
          end
        end

+        # Annoyingly, the code for prepared statements whose return value may
+        # have changed is FEATURE_NOT_SUPPORTED.
+        #
+        # This covers various different error types so we need to do additional
+        # work to classify the exception definitively as a
+        # ActiveRecord::PreparedStatementCacheExpired
+        #
+        # Check here for more details:
+        # http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/cache/plancache.c#l573
+        CACHED_PLAN_HEURISTIC = 'cached plan must not change result type'.freeze
+        def is_cached_plan_failure?(e)
+          pgerror = e.cause
+          code = pgerror.result.result_error_field(PGresult::PG_DIAG_SQLSTATE)
+          code == FEATURE_NOT_SUPPORTED && pgerror.message.include?(CACHED_PLAN_HEURISTIC)
+        rescue
+          false
+        end
+
+        def in_transaction?
+          open_transactions > 0
+        end
+
        # Returns the statement identifier for the client side cache
        # of statements
        def sql_key(sql)

--- a/activerecord/lib/active_record/errors.rb
+++ b/activerecord/lib/active_record/errors.rb
@@ -139,6 +139,11 @@ class PreparedStatementInvalid < ActiveRecordError
  class NoDatabaseError < StatementInvalid
  end

+  # Raised when Postgres returns 'cached plan must not change result type' and
+  # we cannot retry gracefully (e.g. inside a transaction)
+  class PreparedStatementCacheExpired < StatementInvalid
+  end
+
  # Raised on attempt to save stale record. Record is stale when it's being saved in another query after
  # instantiation, for example, when two users edit the same wiki page and one starts editing and saves
  # the page before the other.

--- a/activerecord/test/cases/hot_compatibility_test.rb
+++ b/activerecord/test/cases/hot_compatibility_test.rb
 require 'cases/helper'
+require 'support/connection_helper'

 class HotCompatibilityTest < ActiveRecord::TestCase
  self.use_transactional_tests = false
+  include ConnectionHelper

  setup do
    @klass = Class.new(ActiveRecord::Base) do
@@ -51,4 +53,87 @@ def self.name; 'HotCompatibility'; end
    record.reload
    assert_equal 'bar', record.foo
  end
+
+  if current_adapter?(:PostgreSQLAdapter)
+    test "cleans up after prepared statement failure in a transaction" do
+      with_two_connections do |original_connection, ddl_connection|
+        record = @klass.create! bar: 'bar'
+
+        # prepare the reload statement in a transaction
+        @klass.transaction do
+          record.reload
+        end
+
+        # add a new column
+        ddl_connection.add_column :hot_compatibilities, :baz, :string
+
+        assert_raise(ActiveRecord::PreparedStatementCacheExpired) do
+          @klass.transaction do
+            record.reload
+          end
+        end
+
+        prepared_statement_cache = @klass.connection
+          .instance_variable_get(:@statements)
+          .instance_variable_get(:@cache)[Process.pid]
+
+        assert_empty prepared_statement_cache,
+          "expected prepared statement cache to be empty but it wasn't"
+      end
+    end
+
+    test "cleans up after prepared statement failure in nested transactions" do
+      with_two_connections do |original_connection, ddl_connection|
+        record = @klass.create! bar: 'bar'
+
+        # prepare the reload statement in a transaction
+        @klass.transaction do
+          record.reload
+        end
+
+        # add a new column
+        ddl_connection.add_column :hot_compatibilities, :baz, :string
+
+        assert_raise(ActiveRecord::PreparedStatementCacheExpired) do
+          @klass.transaction do
+            @klass.transaction do
+              @klass.transaction do
+                record.reload
+              end
+            end
+          end
+        end
+
+        prepared_statement_cache = @klass.connection
+          .instance_variable_get(:@statements)
+          .instance_variable_get(:@cache)[Process.pid]
+
+        assert_empty prepared_statement_cache,
+          "expected prepared statement cache to be empty but it wasn't"
+      end
+    end
+  end
+
+  private
+
+  # Rails will automatically clear the prepared statements on the connection
+  # that runs the migration, so we use two connections to simulate what would
+  # actually happen on a production system; we'd have one connection running the
+  # migration from the rake task ("ddl_connection" here), and we'd have another
+  # connection in a web worker.
+  def with_two_connections
+    run_without_connection do |original_connection|
+      ActiveRecord::Base.establish_connection(original_connection.merge(pool_size: 2))
+      begin
+        ddl_connection = ActiveRecord::Base.connection_pool.checkout
+        begin
+          yield original_connection, ddl_connection
+        ensure
+          ActiveRecord::Base.connection_pool.checkin ddl_connection
+        end
+      ensure
+        ActiveRecord::Base.clear_all_connections!
+      end
+    end
+  end
 end