Auto merge of #91919 - Aaron1011:query-recursive-read, r=michaelwoerister

Don't perform any new queries while reading a query result on disk In addition to being very confusing, this can cause us to add dep node edges between two queries that would not otherwise have an edge. We now panic if any new dep node edges are created during the deserialization of a query result. This requires serializing the full `AdtDef` to disk, instead of just serializing the `DefId` and invoking the `adt_def` query during deserialization. I'll probably split this up into several smaller PRs for perf runs.

Auto merge of #91919 - Aaron1011:query-recursive-read, r=michaelwoerister
Don't perform any new queries while reading a query result on disk In addition to being very confusing, this can cause us to add dep node edges between two queries that would not otherwise have an edge. We now panic if any new dep node edges are created during the deserialization of a query result. This requires serializing the full `AdtDef` to disk, instead of just serializing the `DefId` and invoking the `adt_def` query during deserialization. I'll probably split this up into several smaller PRs for perf runs.
a7e2e339 · bors · 488acf86 · 27ed52c0 · a7e2e339 · a7e2e339
Showing with 76 addition and 2 deletion

compiler/rustc_query_system/src/dep_graph/graph.rs compiler/rustc_query_system/src/dep_graph/graph.rs +69 -0

compiler/rustc_query_system/src/query/plumbing.rs compiler/rustc_query_system/src/query/plumbing.rs +7 -2

未找到文件。
--- a/compiler/rustc_query_system/src/dep_graph/graph.rs
+++ b/compiler/rustc_query_system/src/dep_graph/graph.rs
@@ -177,6 +177,62 @@ pub fn with_ignore<OP, R>(&self, op: OP) -> R
        K::with_deps(None, op)
    }

+    /// Used to wrap the deserialization of a query result from disk,
+    /// This method enforces that no new `DepNodes` are created during
+    /// query result deserialization.
+    ///
+    /// Enforcing this makes the query dep graph simpler - all nodes
+    /// must be created during the query execution, and should be
+    /// created from inside the 'body' of a query (the implementation
+    /// provided by a particular compiler crate).
+    ///
+    /// Consider the case of three queries `A`, `B`, and `C`, where
+    /// `A` invokes `B` and `B` invokes `C`:
+    ///
+    /// `A -> B -> C`
+    ///
+    /// Suppose that decoding the result of query `B` required re-computing
+    /// the query `C`. If we did not create a fresh `TaskDeps` when
+    /// decoding `B`, we would still be using the `TaskDeps` for query `A`
+    /// (if we needed to re-execute `A`). This would cause us to create
+    /// a new edge `A -> C`. If this edge did not previously
+    /// exist in the `DepGraph`, then we could end up with a different
+    /// `DepGraph` at the end of compilation, even if there were no
+    /// meaningful changes to the overall program (e.g. a newline was added).
+    /// In addition, this edge might cause a subsequent compilation run
+    /// to try to force `C` before marking other necessary nodes green. If
+    /// `C` did not exist in the new compilation session, then we could
+    /// get an ICE. Normally, we would have tried (and failed) to mark
+    /// some other query green (e.g. `item_children`) which was used
+    /// to obtain `C`, which would prevent us from ever trying to force
+    /// a non-existent `D`.
+    ///
+    /// It might be possible to enforce that all `DepNode`s read during
+    /// deserialization already exist in the previous `DepGraph`. In
+    /// the above example, we would invoke `D` during the deserialization
+    /// of `B`. Since we correctly create a new `TaskDeps` from the decoding
+    /// of `B`, this would result in an edge `B -> D`. If that edge already
+    /// existed (with the same `DepPathHash`es), then it should be correct
+    /// to allow the invocation of the query to proceed during deserialization
+    /// of a query result. We would merely assert that the dep-graph fragment
+    /// that would have been added by invoking `C` while decoding `B`
+    /// is equivalent to the dep-graph fragment that we already instantiated for B
+    /// (at the point where we successfully marked B as green).
+    ///
+    /// However, this would require additional complexity
+    /// in the query infrastructure, and is not currently needed by the
+    /// decoding of any query results. Should the need arise in the future,
+    /// we should consider extending the query system with this functionality.
+    pub fn with_query_deserialization<OP, R>(&self, op: OP) -> R
+    where
+        OP: FnOnce() -> R,
+    {
+        let mut deps = TaskDeps::default();
+        deps.read_allowed = false;
+        let deps = Lock::new(deps);
+        K::with_deps(Some(&deps), op)
+    }
+
    /// Starts a new dep-graph task. Dep-graph tasks are specified
    /// using a free function (`task`) and **not** a closure -- this
    /// is intentional because we want to exercise tight control over
@@ -257,6 +313,7 @@ fn with_task_impl<Ctxt: HasDepContext<DepKind = K>, A: Debug, R>(
                reads: SmallVec::new(),
                read_set: Default::default(),
                phantom_data: PhantomData,
+                read_allowed: true,
            }))
        };
        let result = K::with_deps(task_deps.as_ref(), || task(cx, arg));
@@ -368,6 +425,11 @@ pub fn read_index(&self, dep_node_index: DepNodeIndex) {
                if let Some(task_deps) = task_deps {
                    let mut task_deps = task_deps.lock();
                    let task_deps = &mut *task_deps;
+
+                    if !task_deps.read_allowed {
+                        panic!("Illegal read of: {:?}", dep_node_index);
+                    }
+
                    if cfg!(debug_assertions) {
                        data.current.total_read_count.fetch_add(1, Relaxed);
                    }
@@ -1129,6 +1191,12 @@ pub struct TaskDeps<K> {
    reads: EdgesVec,
    read_set: FxHashSet<DepNodeIndex>,
    phantom_data: PhantomData<DepNode<K>>,
+    /// Whether or not we allow `DepGraph::read_index` to run.
+    /// This is normally true, except inside `with_query_deserialization`,
+    /// where it set to `false` to enforce that no new `DepNode` edges are
+    /// created. See the documentation of `with_query_deserialization` for
+    /// more details.
+    read_allowed: bool,
 }

 impl<K> Default for TaskDeps<K> {
@@ -1139,6 +1207,7 @@ fn default() -> Self {
            reads: EdgesVec::new(),
            read_set: FxHashSet::default(),
            phantom_data: PhantomData,
+            read_allowed: true,
        }
    }
 }

--- a/compiler/rustc_query_system/src/query/plumbing.rs
+++ b/compiler/rustc_query_system/src/query/plumbing.rs
@@ -9,7 +9,6 @@
    report_cycle, QueryInfo, QueryJob, QueryJobId, QueryJobInfo, QueryShardJobId,
 };
 use crate::query::{QueryContext, QueryMap, QuerySideEffects, QueryStackFrame};
-
 use rustc_data_structures::fingerprint::Fingerprint;
 use rustc_data_structures::fx::{FxHashMap, FxHasher};
 #[cfg(parallel_compiler)]
@@ -515,7 +514,13 @@ fn try_load_from_disk_and_cache_in_memory<CTX, K, V>(
    // Some things are never cached on disk.
    if query.cache_on_disk {
        let prof_timer = tcx.dep_context().profiler().incr_cache_loading();
-        let result = query.try_load_from_disk(tcx, prev_dep_node_index);
+
+        // The call to `with_query_deserialization` enforces that no new `DepNodes`
+        // are created during deserialization. See the docs of that method for more
+        // details.
+        let result = dep_graph
+            .with_query_deserialization(|| query.try_load_from_disk(tcx, prev_dep_node_index));
+
        prof_timer.finish_with_query_invocation_id(dep_node_index.into());

        if let Some(result) = result {