Apparently, libgit2 doesn't deduce "core.bare" config from the directory name,
but gitoxide implements it correctly. So we shouldn't blindly canonicalize
the Git repository path. Fortunately, the saved git_target path isn't a fully-
canonicalized form (unless user explicitly sepcified "--git-repo ./.git"), so
we don't need a hack to remap git_target back to the symlink path.
is_colocated_git_workspace() is adjusted since the git_workdir is no longer
resolved from the fully-canonicalized repo path, at least in our code. Still we
have the ".git/.." fallback because test_init_git_colocated_symlink_gitlink()
would otherwise fail. I haven't figured out why, and the test might be actually
wrong compared to the git CLI behavior, but let's not change that for now.
Fixes#2668
This adds an initial `jj util gc` command, which simply calls `git gc`
when using the Git backend. That should already be useful in
non-colocated repos because it's not obvious how to GC (repack) such
repos. In my own jj repo, it shrunk `.jj/repo/store/` from 2.4 GiB to
780 MiB, and `jj log --ignore-working-copy` was sped up from 157 ms to
86 ms.
I haven't added any tests because the functionality depends on having
`git` binary on the PATH, which we don't yet depend on anywhere
else. I think we'll still be able to test much of the future parts of
garbage collection without a `git` binary because the interesting
parts are about manipulating the Git repo before calling `git gc` on
it.
I think this is a variant of the problem fixed by 7fda80fc22 "tree: simplify
conflict before resolving at hunk level." We need to simplify() the conflict
before and after extracting file ids because the source conflict values may
contain trees to be cancelled out, and the file values may differ only in exec
bits. Since the legacy tree passes a simplified conflict in to this function,
I made the merged tree do the same.
Fixes#2654
Otherwise an empty subtree would be added to the parent tree.
If the stored tree contained an empty subtree, simplify() wouldn't work
against new "absent" subtree representation. I don't know if there's a
such code path, but I believe it's very rare to encounter the problem.
#2654
This basically backs out the change 1b9a3e27e0 "merged_tree: read before/after
trees concurrently." As we decided to add a separate impl for async access, it
doesn't make sense to read before/after pair in parallel.
The async single_tree() is moved to TreeDiffStreamImpl. It will help remove
the sync version when the performance problem is solved.
In snapshot(), changed_file_states are received in arbitrary order. For the
other callers, entries are in diff_stream order, so we don't have to sort
them.
With watchman enabled, we can see the cost of sorting the sorted proto entries.
I don't think this is significant, but we can mitigate it by adding
is_file_states_sorted flag to the proto message if needed:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
Time (mean ± σ): 164.8 ms ± 16.6 ms [User: 50.2 ms, System: 111.7 ms]
Range (min … max): 148.1 ms … 195.0 ms 20 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
Time (mean ± σ): 171.8 ms ± 13.6 ms [User: 61.7 ms, System: 109.0 ms]
Range (min … max): 159.5 ms … 192.1 ms 20 runs
```
Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
Time (mean ± σ): 367.3 ms ± 30.3 ms [User: 1415.2 ms, System: 633.8 ms]
Range (min … max): 325.4 ms … 421.7 ms 20 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
Time (mean ± σ): 327.7 ms ± 24.9 ms [User: 1059.1 ms, System: 654.3 ms]
Range (min … max): 296.0 ms … 385.4 ms 20 runs
```
I haven't measured snapshotting against dirty working copy, but I don't think
it would be slower than the original implementation.
I'll replace the current lazy loading mechanism with this. Read-only methods
are implemented on the borrowed type so that we can narrow lookup scope
recursively.
self.contains(other) means that the self tree contains the other tree (i.e.
the self path is prefix of the other), but it could be confused the other way
around if we were thinking about the path literal, not the tree. Let's add
.starts_with() instead by copying the std::path::Path definition.
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
I'm going to add borrowed RepoPath type, and most from_internal_string()
callers will be migrated to it. For the remaining callers, it makes more
sense to move the ownership of String to RepoPathBuf.
This follows up on 3967f63 (see that commit's description for more
motivation) and e79c8b6.
In a discussion linked below, it was decided that forbidding `-r --skip-empty`
entirely is preferable to the mixed behavior introduced in 3967f63.
3967f637dc (commitcomment-133539911)
RepoPath::from_components() is removed since it is no longer a primitive
function.
The components iterator could be implemented on top of str::split(), but
it's not as we'll probably want to add components.as_path() -> &RepoPath.
Tree walking and tree_states map construction get slightly faster thanks to
fewer allocations and/or better cache locality. If we add a borrowed RepoPath
type, we can also implement a cheap &str to &RepoPath conversion on top. Then,
we can get rid of BTreeMap<RepoPath, FileState> construction at all.
Snapshot without watchman:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 950.1 ms ± 24.9 ms [User: 1642.4 ms, System: 681.1 ms]
Range (min … max): 913.8 ms … 990.9 ms 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 872.1 ms ± 14.5 ms [User: 1922.3 ms, System: 625.8 ms]
Range (min … max): 853.2 ms … 895.9 ms 10 runs
Relative speed comparison
1.09 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status
```
Tree walk:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files --ignore-working-copy"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
Time (mean ± σ): 375.3 ms ± 15.4 ms [User: 223.3 ms, System: 151.8 ms]
Range (min … max): 359.4 ms … 394.1 ms 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
Time (mean ± σ): 357.1 ms ± 16.2 ms [User: 214.7 ms, System: 142.6 ms]
Range (min … max): 341.6 ms … 378.9 ms 10 runs
Relative speed comparison
1.05 ± 0.06 target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
```
The added test would fail if paths were purely ordered by concatenated strings.
I'm not sure if we want to preserve the current ordering, but let's not break
it for the moment.
This follows up on @matts1 's #2609.
We still allow the `-r` commit to become empty. I would be more comfortable if
there was a test for that, but I haven't done that (yet?) and it seems pretty
safe. If that's a problem, I'm happy to forbid `-r --skip-empty` entirely,
since it is far less useful than `-s --skip-empty` or `-b --skip-empty`.
I think it is undesired to abandon emptied descendants. As far as descendants
of `A` are concerned, `jj rebase -r A` should be equivalent to `jj abandon A`,
and `jj abandon` does not remove emptied commits. It also doesn't seem very
useful to do that, since I think descendant commits of an abandoned (or moved
with `-r`) commit only become empty in pathological cases.
Additionally, if we did want -r to empty descendants of `A`, we'd have to add
thorough tests and possibly improve the algorithm. I want to refactor `rebase
-r` and add features to it, and having to consider cases of commits becoming
abandoned makes everything harder.
For example, if we have
```
root -> A -> B -> C
```
and `jj rebase -r A -d C` empties commit `B` (or `C`), I do not know whether
the current algorithm will work correctly. It seems possible that it would, but
that depends on the fact that empty merge commits are not abandoned for
descendants. That seems dangerous to rely on without tests.
I hope (but can't promise) that in the near future, making DescendantRebaser
return more information should help make it possible to create such
functionality in a more robust way. I am likely to attempt this as part of
implementing `-r --after`.
This is a step towards introducing a borrowed RepoPath type. The current
RepoPath type is inefficient as each component String is usually short. We
could apply short-string optimization, but still each inlined component would
consume 24 bytes just for e.g. "src", and increase the chance of random memory
access. If the owned RepoPath type is backed by String, we can implement cheap
cast from &str to borrowed &RepoPath type.
Leading/trailing slashes would introduce a bit of complexity if we migrate
the backing type from Vec<String> to String. Empty components are okay, but
let's reject them as they are cryptic and invalid.
It seems generally useful to optimize revset expressions in
`evaluate_programmatic()` so the caller doesn't have to remember to do
it. It should generally be cheap to do so even if it's often not
needed.
`RevsetExpression::resolve()` is meant for programmatically created
expressions. In particular, it may not contain symbols. Let's try to
clarify that by renaming the function and documenting it.
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.
In my "linux" repo (watchman enabled):
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms]
Range (min … max): 742.3 ms … 783.1 ms 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms]
Range (min … max): 681.5 ms … 731.1 ms 10 runs
Relative speed comparison
1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".
I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.
In my "linux" repo (watchman eanbled):
- jj-0: baseline
- jj-1: previous (don't randomize by HashMap)
- jj-2: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms]
Range (min … max): 760.8 ms … 805.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps
This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.
In my "linux" repo (watchman enabled):
- jj-0: baseline
- jj-1: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
Cache-misses got reduced:
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status
1,091.68 msec task-clock # 1.032 CPUs utilized
4,179,596,978 cycles # 3.829 GHz
6,166,231,489 instructions # 1.48 insn per cycle
134,032,047 cache-references # 122.776 M/sec
29,322,707 cache-misses # 21.88% of all cache refs
1.057474164 seconds time elapsed
0.897042000 seconds user
0.194819000 seconds sys
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status
927.05 msec task-clock # 1.083 CPUs utilized
3,451,299,198 cycles # 3.723 GHz
6,222,418,272 instructions # 1.80 insn per cycle
98,499,363 cache-references # 106.251 M/sec
11,998,523 cache-misses # 12.18% of all cache refs
0.855938336 seconds time elapsed
0.720568000 seconds user
0.207924000 seconds sys
The idea is the same as the heads_pos() change in 9832ee205d. While
IndexEntry::position() should be cheap, saving 20 bytes per entry appears to
improve the performance in mid-size repos.
In my "linux" repo:
revsets/all()
-------------
baseline 1.24 156.0±1.06ms
this 1.00 126.0±0.51ms
I don't see significant difference in small-sized repos like "jj" or "git".
IndexEntryByPosition isn't removed since it's still used by the revset engine.
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
This removes the last use of `ouroboros` in `merged_tree.rs`. The set
of conflicts to iterate is usually so small that I didn't bother
checking the performance impact.
While importing the `ouroboros` crate and the `aliasable` crate it
depends on, the "unsafe Rust reviewer" expressed some concern that
they contain a lot of unsafe code that's hard to review. We can avoid
the unsafe code altogether by making `TreeEntriesIterator` not
self-refential. Instead, we can collect the matching entries in an
individual tree up front. It does have some performance cost:
```
❯ hyperfine --warmup 3 --runs 30 \
'/tmp/jj-before --ignore-working-copy files -r v6.0' \
'/tmp/jj-after --ignore-working-copy files -r v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0
Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms]
Range (min … max): 443.4 ms … 496.3 ms 30 runs
Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0
Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms]
Range (min … max): 461.8 ms … 513.3 ms 30 runs
Summary
'/tmp/jj-before --ignore-working-copy files -r v6.0' ran
1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0'
```
I think that's acceptable.
This is much faster (maybe because of better cache locality?) Another option
is to use BTreeSet, but the BinaryHeap version is slightly faster.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 1.98 339.6±1.64ms
3 (this) 1.00 171.2±0.30ms
Apparently, IndexEntry::generation_number() isn't cheap probably because it
involves random access to larger memory region, and the u32 value might not
be aligned. Let's instead store the generation numbers in BinaryHeap.
Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries
small, so I've removed the IndexEntry at all.
This makes the default log and disambiguation revsets fast, which evaluate
'heads(immutable_heads())'.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 (this) 1.98 339.6±1.64ms
All callers just iterate over the parent entries.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 (this) 2.92 500.0±2.99ms
For loose refs, uninteresting directories can be just skipped. For packed refs,
gix will have to do binary search for each prefix to find the starting point.
Still it's better overall if the repository contains tons of refs/jj/keep refs.
With my linux repo containing ~5k loose jj refs, this saves ~40ms:
% hyperfine --warmup 3 --runs 10 \
"/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" \
"/tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux"
Benchmark 1: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 151.6 ms ± 11.4 ms [User: 38.8 ms, System: 111.6 ms]
Range (min … max): 129.8 ms … 159.5 ms 10 runs
Benchmark 2: /tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 109.9 ms ± 11.6 ms [User: 27.5 ms, System: 82.4 ms]
Range (min … max): 89.4 ms … 117.8 ms 10 runs
Gitoxide errors are boxed since there are various error types and they tend
to exceed the clippy size limit.
Apparently, gitoxide is faster than git2:
% hyperfine --warmup 3 --runs 10 \
"/tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux" \
"/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux"
Benchmark 1: /tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 205.4 ms ± 15.7 ms [User: 59.6 ms, System: 144.6 ms]
Range (min … max): 189.7 ms … 223.9 ms 10 runs
Benchmark 2: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 176.2 ms ± 13.7 ms [User: 41.2 ms, System: 134.0 ms]
Range (min … max): 155.4 ms … 186.5 ms 10 runs
We need to .collect_vec() the parents iterator to temporary buffer since the
borrowed iterator can't be returned back to the dag_walk functions. Another
option is to clone op_store and parent ids to remove &self lifetime from the
iterator, but that also means a temporary Vec is created.
Unlike dfs_ok(), this function short-circuits at an Err as we use non-lazy
topo_order_forward() internally. I think that's good enough. If we implement
GC on operation log, deleted parents will be excluded (or mapped to tombstone)
by caller. An Err shouldn't mean it's GC-ed.
This unblocks the use of Result<T, E> in op.parents().
There are two ways to encode errors:
a. impl IntoIterator<Item = Result<T, E>>
b. Result<V, E> where V: FromIterator<Item = T>
I think (a) is more natural to algorithms like dfs(), which can process error
nodes transparently.
Still the caller might have to collect the source iterator to temporary Vec
to conform to the neighbors_fn signature. It's not easy for neighbors_fn to
return an iterator borrowing the input node. We already have GAT, but doesn't
have return-position impl Trait in trait yet.
Recognize signature metadata from git commit objects, implement
a basic version of that for the native backend.
Extract the signed data (a commit binary repr without the signature) to
be verified later.
Otherwise, ref updates would fail if we port git::export_refs() to gitoxide.
This change isn't strictly needed for the backend itself, but we'll reuse the
gix::Repository instance created by the backend when importing and exporting
Git refs.
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful to pass store-specific parameters, so I've updated all
factory functions.
While the safe implementation is a bit more complex (and probably more branchy),
I don't think the runtime overhead would matter here. Let's remove one more
unsafe for better code maintainability.
We have a few places where we have a `MergedTreeValue` and need to
read the data associated with it so we can write to the working copy
or include it in a diff. Let's extract some of that shared logic to a
function so we can reuse it. I plan to use it for reading file
contents in advance while streaming a diff in `local_working_copy`
soon (and probably in `jj diff` thereafter), but I think it seems like
an improvement on its own.
I'd like to read N files ahead from the backend, to avoid serializing
too many server calls on backends that are backed by a server. Moving
the reads a little earlier is a little step towards that.
The `TreeState::write_*()` functions can now be made into free/static
functions if we prefer.
We no longer have "unsafe" in this function, so let's use the iterator API
instead of recursion. Apparently I haven't pushed this change before because
unsafe in .find_map() looked scary.
Remove a couple of unnecessary unsafes:
- The NonZeroUsize is a constant where the unwrap will optimize away
anyway and we don't have an unsafe without any good reason there :)
- The other two were simply not needed, lifetimes worked fine, maybe
Rust became better since that code was written? NLL? Anyway, they're
gone now
As discussed in Discord, it's less useful if remote_branches() included
Git-tracking branches. Users wouldn't consider the backing Git repo as
a remote.
We could allow explicit 'remote_branches(remote=exact:"git")' query by changing
the default remote pattern to something like 'remote=~exact:"git"'. I don't
know which will be better overall, but we don't have support for negative
patterns anyway.
It's super common that a Merge object holds a resolved value, so let's inline
up to 1 element. T of Merge<T> usually consists of a couple of pointer-sized
fields. I don't see any measurable speed up, but it's no worse than the
original.
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
When diffing two trees, we currently start at the root and diff those
trees. Then we diff each subtree, one at a time, recursively. When
using a commit backend that uses remote storage, like our backend at
Google does, diffing the subtrees one at a time gets very slow. We
should be able to diff subtrees concurrently. That way, the number of
roundtrips to a server becomes determined by the depth of the deepest
difference instead of by the number of differing trees (times 2,
even). This patch implements such an algorithm behind a `Stream`
interface. It's not hooked in to `MergedTree::diff_stream()` yet; that
will happen in the next commit.
I timed the new implementation by updating `jj diff -s` to use the new
diff stream and then ran it on the Linux repo with `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by
~20%, from ~750 ms to ~900 ms. Maybe we can get some of that
performance back but I think it'll be hard to match
`MergedTree::diff()`. We can decide later if we're okay with the
difference (after hopefully reducing the gap a bit) or if we want to
keep both implementations.
I also timed the new implementation on our cloud-based repo at
Google. As expected, it made some diffs much faster (I'm not sure if
I'm allowed to share figures).
I'm going to add a Merge method that removes negative/positive terms pair, and
swap_remove() is the easiest option. The order of the conflicted ref targets
doesn't matter.
Many callers use interleaved iterators, and recently-added serialization code
is built on top of that, so I think it's better to store terms in that format.
map() functions no longer use MergeBuilder as we know the mapped values are
ordered properly. flatten() and simplify() are reimplemented to work with the
interleaved values. The other changes are trivial.
This motivation for this is so we can easily skip calling the function
if the user has opted out of the propagation of abandoned commits we
usually do (#2504). However, it seems like a good piece of code to
extract regardless of that feature.
One less git2 API use in CLI.
The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
What make rebase_to_dest_parent a good candidate for jj_lib::rewrite module:
- It is used both in obslog and interdiff. It's a sign that it may be moved to a lower layer
- CommandError is returned by converting from TreeMergeError. Not explicitly.
- It only use jj_lib::rewrite fonctions.
This will make it a little faster to update the working copy at Google
once we've made `MergedTree::diff_stream()` fetch trees
concurrently. (It only makes it a little faster because we still fetch
files serially.)
I'm going to implement a `Stream`-based version optimized for
high-latency (RPC-based) commit backends. So far, that implementation
is about 20% slower in the Linux repo when running `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost
only because the algorithm is different, not because it's async per
se.
This commit adds a `Stream`-based version of `MergedTree::diff()` that
just wraps the regular iterator in stream. I updated `jj diff` to use
it. I couldn't measure any difference on the command above in the
Linux repo. I think that means we can safely use the same
`Stream`-based interface regardless of backend, even if we end up
needing two different implementations of the `Stream`. We would then
be using the wrapped iterator from this commit for local backends, and
the new implementation for remote backends. But ideally we can make
the remote-friendly implementation fast enough that we don't need two
implementations.
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.
I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
Since gix::Repository::config_snapshot() borrows the repo instance, it has to
be allocated in caller's stack. That's why GitBackend::git_config() is removed.
My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll
need to know more about the underlying Git data model.
Random comments on gix API:
* gix::Repository provides API similar to git2::Repository, but has less
"convenient" functions. For example, we need to use .find_object() +
.try_to/into_<kind>() instead of .find_<kind>().
* gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs
types provide high-level views on such data.
* Tree building is pretty low-level compared to git2.
* gix leverages bstr (i.e. bytes) extensively.
It's probably not difficult to migrate git::import/export_refs(). It might
help eliminate the startup overhead of libssl initialization. The gix-based
GitBackend appears to be a bit faster, but that wouldn't practically matter.
#2316
Otherwise, the initialized repo could have a different work-dir path than the
load()-ed one. libgit2 appears to do some normalization somewhere, but gix
won't.