revset: minor fixes to documentation of graph iterator

Martin von Zweigbergk 2023-02-16 10:54:53 -08:00 committed by Martin von Zweigbergk
parent 7bf1ab712a
commit a897b27770


@@ -20,81 +20,82 @@ use crate::default_index_store::{IndexEntry, IndexPosition};
use crate::nightly_shims::BTreeMapExt; use crate::nightly_shims::BTreeMapExt;
use crate::revset::{RevsetGraphEdge, RevsetGraphEdgeType}; use crate::revset::{RevsetGraphEdge, RevsetGraphEdgeType};
// Given an iterator over some set of revisions, yields the same revisions with /// Given an iterator over some set of revisions, yields the same revisions with
// associated edge types. /// associated edge types.
// ///
// If a revision's parent is in the input set, then the edge will be "direct". /// If a revision's parent is in the input set, then the edge will be "direct".
// Otherwise, there will be one "indirect" edge for each closest ancestor in the /// Otherwise, there will be one "indirect" edge for each closest ancestor in
// set, and one "missing" edge for each edge leading outside the set. /// the set, and one "missing" edge for each edge leading outside the set.
// ///
// Example (uppercase characters are in the input set): /// Example (uppercase characters are in the input set):
// ///
// A A /// A A
// |\ |\ /// |\ |\
// B c B : /// B c B :
// |\| => |\: /// |\| => |\:
// d E ~ E /// d E ~ E
// |/ ~ /// |/ ~
// root /// root
// ///
// The implementation works by walking the input iterator in one commit at a /// The implementation works by walking the input iterator one commit at a
// time. It then considers all parents of the commit. It looks ahead in the /// time. It then considers all parents of the commit. It looks ahead in the
// input iterator far enough that all the parents will have been consumed if /// input iterator far enough that all the parents will have been consumed if
// they are in the input (and puts them away so we can emit them later). If a /// they are in the input (and puts them away so we can emit them later). If a
// parent of the current commit is not in the input set (i.e. it was not /// parent of the current commit is not in the input set (i.e. it was not
// in the look-ahead), we walk these external commits until we end up back back /// in the look-ahead), we walk these external commits until we end up back back
// in the input set. That walk may result in consuming more elements from the /// in the input set. That walk may result in consuming more elements from the
// input iterator. In the example above, when we consider "A", we will initially /// input iterator. In the example above, when we consider "A", we will
// look ahead to "B" and "c". When we consider edges from the external commit /// initially look ahead to "B" and "c". When we consider edges from the
// "c", we will further consume the input iterator to "E". /// external commit "c", we will further consume the input iterator to "E".
// ///
// Missing edges are those that don't lead back into the input set. If all edges /// Missing edges are those that don't lead back into the input set. If all
// from an external commit are missing, we consider the edge to that edge to /// edges from an external commit are missing, we consider the edge to that
// also be missing. In the example above, that means that "B" will have a /// commit to also be missing. In the example above, that means that "B" will
// missing edge to "d" rather than to the root. /// have a missing edge to "d" rather than to the root.
// ///
// The iterator can be configured to skip transitive edges that it would /// The iterator can be configured to skip transitive edges that it would
// otherwise return. In this mode (which is the default), the edge from "A" to /// otherwise return. In this mode (which is the default), the edge from "A" to
// "E" in the example above would be excluded because there's also a transitive /// "E" in the example above would be excluded because there's also a transitive
// path from "A" to "E" via "B". The implementation of that mode /// path from "A" to "E" via "B". The implementation of that mode
// adds a filtering step just before yielding the edges for a commit. The /// adds a filtering step just before yielding the edges for a commit. The
// filtering works doing a DFS in the simplified graph. That may require even /// filtering works by doing a DFS in the simplified graph. That may require
// more look-ahead. Consider this example (uppercase characters are in the input /// even more look-ahead. Consider this example (uppercase characters are in the
// set): /// input set):
// ///
// J /// J
// /| /// /|
// | i /// | i
// | |\ /// | |\
// | | H /// | | H
// G | | /// G | |
// | e f /// | e f
// | \|\ /// | \|\
// | D | /// | D |
// \ / c /// \ / c
// b / /// b /
// |/ /// |/
// A /// A
// | /// |
// root /// root
// ///
// When walking from "J", we'll find indirect edges to "H", "G", and "D". This /// When walking from "J", we'll find indirect edges to "H", "G", and "D". This
// is our unfiltered set of edges, before removing transitive edges. In order to /// is our unfiltered set of edges, before removing transitive edges. In order
// know that "D" is an ancestor of "H", we need to also walk from "H". We use /// to know that "D" is an ancestor of "H", we need to also walk from "H". We
// the same search for finding edges from "H" as we used from "J". That results /// use the same search for finding edges from "H" as we used from "J". That
// in looking ahead all the way to "A". We could reduce the amount of look-ahead /// results in looking ahead all the way to "A". We could reduce the amount of
// by stopping at "c" since we're only interested in edges that could lead to /// look-ahead by stopping at "c" since we're only interested in edges that
// "D", but that would require extra book-keeping to remember for later that the /// could lead to "D", but that would require extra book-keeping to remember for
// edges from "f" and "H" are only partially computed. /// later that the edges from "f" and "H" are only partially computed.
pub struct RevsetGraphIterator<'revset, 'index> { pub struct RevsetGraphIterator<'revset, 'index> {
input_set_iter: Box<dyn Iterator<Item = IndexEntry<'index>> + 'revset>, input_set_iter: Box<dyn Iterator<Item = IndexEntry<'index>> + 'revset>,
// Commits in the input set we had to take out of the iterator while walking external /// Commits in the input set we had to take out of the iterator while
// edges. Does not necessarily include the commit we're currently about to emit. /// walking external edges. Does not necessarily include the commit
/// we're currently about to emit.
look_ahead: BTreeMap<IndexPosition, IndexEntry<'index>>, look_ahead: BTreeMap<IndexPosition, IndexEntry<'index>>,
// The last consumed position. This is always the smallest key in the look_ahead map, but it's /// The last consumed position. This is always the smallest key in the
// faster to keep a separate field for it. /// look_ahead map, but it's faster to keep a separate field for it.
min_position: IndexPosition, min_position: IndexPosition,
// Edges for commits not in the input set. /// Edges for commits not in the input set.
// TODO: Remove unneeded entries here as we go (that's why it's an ordered map)? // TODO: Remove unneeded entries here as we go (that's why it's an ordered map)?
edges: BTreeMap<IndexPosition, HashSet<(IndexPosition, RevsetGraphEdge)>>, edges: BTreeMap<IndexPosition, HashSet<(IndexPosition, RevsetGraphEdge)>>,
skip_transitive_edges: bool, skip_transitive_edges: bool,