mirrors/jj

mirror of https://github.com/martinvonz/jj.git synced 2025-01-16 00:56:23 +00:00

Author	SHA1	Message	Date
Martin von Zweigbergk	c0295c5dbc	merged_tree: make ConflictsDirItem not self-referential This removes the last use of `ouroboros` in `merged_tree.rs`. The set of conflicts to iterate is usually so small that I didn't bother checking the performance impact.	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	e1a02c5c5b	merged_tree: make TreeDiffDirItem not self-referential This removes another dependency on `ouroboros`, for a small performance hit: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \ '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 689.7 ms ± 23.9 ms [User: 400.0 ms, System: 289.8 ms] Range (min … max): 666.9 ms … 759.2 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 710.9 ms ± 19.2 ms [User: 420.4 ms, System: 290.6 ms] Range (min … max): 688.5 ms … 752.0 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran 1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' ```	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	61d87fe296	merged_tree: make `TreeEntriesIterator` not self-referential While importing the `ouroboros` crate and the `aliasable` crate it depends on, the "unsafe Rust reviewer" expressed some concern that they contain a lot of unsafe code that's hard to review. We can avoid the unsafe code altogether by making `TreeEntriesIterator` not self-referential. Instead, we can collect the matching entries in an individual tree up front. It does have some performance cost: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy files -r v6.0' \ '/tmp/jj-after --ignore-working-copy files -r v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0 Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms] Range (min … max): 443.4 ms … 496.3 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0 Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms] Range (min … max): 461.8 ms … 513.3 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy files -r v6.0' ran 1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0' ``` I think that's acceptable.	2023-11-17 03:50:34 -08:00
Waleed Khan	a60733f632	tree: remove unsafe with `ouroboros` for self-referential iterators	2023-11-09 21:50:29 -08:00
Yuya Nishihara	2c128f1b61	merged_tree: convert from legacy conflicts through interleaved list This is basically the same change as the previous commit.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	a734f46130	merged_tree: build unresolved Merge<Tree> from interleaved list We no longer need to iterate removes and adds separately.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	dd26b7be40	merge: add Merge constructor that accepts interleaved values Also migrated some callers of 3-way merge, where [left, base, right] order looks okay.	2023-11-07 17:10:12 +09:00
Martin von Zweigbergk	1140295829	merged_tree: extract polling of tree futures into a function	2023-11-07 00:03:50 -08:00
Martin von Zweigbergk	c77417d4e4	merged_tree: drop outer loop in `TreeDiffStreamImpl::poll_next()` As suggested by Yuya. I also added a comment and an assertion in the case where we return `Poll::Pending`.	2023-11-07 00:03:50 -08:00
Martin von Zweigbergk	d989d4093d	merged_tree: let backend influence whether to use new diff algo Since the concurrent diff algorithm is significantly slower when using the Git backend, I think we'll have to switch between the two algorithms depending on backend. Even if the concurrent version always performed as well as the sequential version, exactly how concurrent it should be probably still depends on the backend. This commit therefore adds a function to the `Backend` trait, so each backend can say how much concurrency it deals well with. I then use that number for choosing between the sequential and concurrent versions in `MergedTree::diff_stream()`, and also to decide the number of concurrent reads to do in the concurrent version.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	f40adb84fc	merged_tree: add a `Stream` for concurrent diff of trees When diffing two trees, we currently start at the root and diff those trees. Then we diff each subtree, one at a time, recursively. When using a commit backend that uses remote storage, like our backend at Google does, diffing the subtrees one at a time gets very slow. We should be able to diff subtrees concurrently. That way, the number of roundtrips to a server becomes determined by the depth of the deepest difference instead of by the number of differing trees (times 2, even). This patch implements such an algorithm behind a `Stream` interface. It's not hooked in to `MergedTree::diff_stream()` yet; that will happen in the next commit. I timed the new implementation by updating `jj diff -s` to use the new diff stream and then ran it on the Linux repo with `jj diff --ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by ~20%, from ~750 ms to ~900 ms. Maybe we can get some of that performance back but I think it'll be hard to match `MergedTree::diff()`. We can decide later if we're okay with the difference (after hopefully reducing the gap a bit) or if we want to keep both implementations. I also timed the new implementation on our cloud-based repo at Google. As expected, it made some diffs much faster (I'm not sure if I'm allowed to share figures).	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	c9ce80a82a	merged_tree: extract function for merged iterator of basenames in diff I'm going to reuse this for stream/async diffing.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	b72f04ba61	merged_tree: rename `all_tree_conflict_names()` since it's not about conflicts	2023-11-06 23:12:02 -08:00
Yuya Nishihara	d9fbf21794	merge: have Merge::adds()/removes() return iterator The Merge type will be changed to store interleaved values internally.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	1c6913d618	merge: use Merge::iter() instead of adds()/removes() where order doesn't matter Merge::iter() will be a slice::Iter, and be more efficient than chaining adds and removes.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	f6d85c51cd	merge: add non-optional Merge accessor to the zeroth value We have a few callers which just need to obtain an object common among all the merge values. Let's add a non-failing accessor for that purpose.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	b12c688ea0	merge: add method for indexed adds/removes access The current adds()/removes() will be changed to return iterators.	2023-11-05 16:43:06 +09:00
Martin von Zweigbergk	72245cfac5	merged_tree: add `Stream`-based version of `diff()`, delegating for now I'm going to implement a `Stream`-based version optimized for high-latency (RPC-based) commit backends. So far, that implementation is about 20% slower in the Linux repo when running `jj diff --ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost only because the algorithm is different, not because it's async per se. This commit adds a `Stream`-based version of `MergedTree::diff()` that just wraps the regular iterator in a stream. I updated `jj diff` to use it. I couldn't measure any difference on the command above in the Linux repo. I think that means we can safely use the same `Stream`-based interface regardless of backend, even if we end up needing two different implementations of the `Stream`. We would then be using the wrapped iterator from this commit for local backends, and the new implementation for remote backends. But ideally we can make the remote-friendly implementation fast enough that we don't need two implementations.	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	24b706641f	async: switch to `pollster`'s `block_on()` During the transition to using more async code, I keep running into https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want to convert `MergedTree::diff()` into a `Stream`. I don't want to update all call sites at once, so instead I'm adding a `MergedTree::diff_stream()` method, which just wraps `MergedTree::diff()` in a `Stream`. However, since the iterator is synchronous, it needs to block on the async `Backend::read_tree()` calls. If we then also block on the `Stream` in the CLI, we run into the panic.	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	a1ef9dc845	merged_tree: propagate backend errors in diff iterator I want to fix error propagation before I start using async in this code. This makes the diff iterator propagate errors from reading tree objects. Errors include the path and don't stop the iteration. The idea is that we should be able to show the user an error inline in diff output if we failed to read a tree. That's going to be especially useful for backends that can return `BackendError::AccessDenied`. That error variant doesn't yet exist, but I plan to add it, and use it in Google's internal backend.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	309f1200d6	merge: introduce a type alias for `Merge<Option<TreeValue>>` Reasons to introduce this alias: * Reduces complexity of a type, to silence Clippy warnings in the future if we use this type as a type parameter * The type is used quite frequently, so it makes sense to have a name for it * It's easier to visually scan for the end of the type when you don't have to match opening and closing angle brackets	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	6ad71e658d	merged_tree: rename `MergedTreeValue` to `MergedTreeVal` I'm going to add `MergedTreeValue` as an alias for `Merge<Option<TreeValue>>`, but we already have a type by that name in `merged_tree`. This patch renames it away, to make room for the new alias. I used `MergedTreeVal` for this borrowing version to be a bit like how `str` is a borrowed version of `String`.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	f541f9f3a6	cleanup: import `futures::executor::block_on()` instead of qualifying It seems we'll end up using `block_on()` quite a bit, at least until we're done transitioning to async, and the function name doesn't conflict with anything else, so let's always import it when we need it.	2023-10-20 07:38:34 -07:00
Martin von Zweigbergk	1b9a3e27e0	merged_tree: read before/after trees concurrently I'm going to rewrite `TreeDiffIterator` to fetch one level (depth) of the tree at a time and concurrently. One step towards that is to convert the iterator to a `Stream`. I'd like to do that by making the current `Iterator` implementation call the new `Stream` implementation. However, we can't call `futures::executor::block_on()` on a future that itself calls `futures::executor::block_on()` (as `Store::read_tree()` does), so the first step is to bubble up the async-ness a bit. This patch does that by fetching both sides of the diff concurrently. That should give close to a 2x speedup on high-latency backends. (It doesn't help with our backend at Google, however, because we have a daemon process that does some speculative prefetching that usually downloads the child trees anyway.)	2023-10-08 23:36:49 -07:00
Martin von Zweigbergk	7fda80fc22	tree: simplify conflict before resolving at hunk level I ran into a bug the other day where `jj status` said there was a conflict in a file but there were no conflict markers in the working copy. The commit was created when I squashed a conflict resolution into the commit's parent. The rebased child commit then ended up in this state. I.e., it looked something like this before squashing: ``` C (no conflict) | | B conflict |/ A conflict ``` The conflict in B was different from the conflict in A. When I squashed in C, jj would try to resolve the conflicts by first creating a 7-way conflict (3 from A, 3 from B, 1 from C). Because of the exact content-level changes, the 7-way conflict couldn't be automatically resolved by `files::merge()` (the way it currently works anyway). However, after simplifying the conflict, it could be resolved. Because `MergedTree::merge()` does another round of conflict simplification of the result at the end of the function, it was the simplified version that actually got stored in the commit. So when inspecting the conflict later (e.g. in the working copy, as I did), it could be automatically resolved. I think there are at least two ways to solve this. One is to call `merge_trees()` again after calling `tree.simplify()` in `MergedTree::merge()`. However, I think it would only matter in the case of content-level conflicts. Therefore, it seems better to make the content-level resolution solve this case to start with. I've done that by simplifying the conflict before passing it into `files::merge()`. We could even do the fix in `files::merge()`, but doing it before calling it has the advantage that we can avoid reading some unchanged content from the backend.	2023-09-27 22:14:39 -07:00
Martin von Zweigbergk	f39b0d24c8	tests: use test backend in working copy tests, fix `MergedTree` bug Only tests dealing with Git submodules care about the backend type. Switching the tests to use the test backend also uncovered another bug in `MergedTree`, so I fixed that too. The bug only happens with legacy trees (path-level conflicts) and backends that care about the conflict path, so it wouldn't happen with Git backends, and it wouldn't happen at Google either (because we use tree-level conflicts).	2023-09-19 20:49:41 -07:00
Martin von Zweigbergk	7ecd64fde1	merged_tree: use child path when merging child This fixes a bug where we used the parent directory's path when trying to read trees and files for a child entry. Many tests in `test_merged_tree` fail after switching to the test backend there without this fix.	2023-09-18 07:53:19 -07:00
Martin von Zweigbergk	5ef0be73c1	merged_tree: allow building trees with variable-arity overrides When restoring (`jj restore`) a 3-sided conflict from one tree into a 2-sided tree (or a resolved tree), we'll need to extend the arity of the target tree to that of the source tree. I had not considered this case before. This patch relaxes the constraint in `MergedTreeBuilder` to allow such cases. The additional trees are based on empty trees with only the larger overrides in them.	2023-09-01 06:09:37 -07:00
Martin von Zweigbergk	8e47d2d66f	merged_tree: add config option to write trees using new format We're finally ready to start writing trees using the new format where we represent conflicts by having multiple trees in the commit instead of having a single tree with multiple entries at a path. This patch adds a config option for that. It's not ready to be used yet, so I haven't updated the release notes or other documentation. I added only a simple CLI test for testing what happens when the config is enabled in an existing repo. 108 tests currently fail if we flip the default.	2023-08-30 06:17:21 -07:00
Martin von Zweigbergk	2d50d8a077	merged_tree: propagate errors from `from_legacy_tree()`	2023-08-29 08:32:04 -07:00
Martin von Zweigbergk	67832a3940	merged_tree: take store argument to `write_tree()` instead of `new()` The store isn't needed until we write the trees, so I think it makes more sense to pass it there.	2023-08-29 08:32:04 -07:00
Martin von Zweigbergk	64b47bae56	tree: inline `legacy_id()` into its sole caller	2023-08-29 07:01:52 -07:00
Martin von Zweigbergk	f47da04a43	tree: delete recursive diff iterator, which is no longer used	2023-08-28 16:21:44 -07:00
Martin von Zweigbergk	1b24b522f6	tree: move `diff_summary()` to `MergedTree`	2023-08-28 16:21:44 -07:00
Martin von Zweigbergk	873a6f0674	merged_tree: add a function for merging 3 `MergedTree`s With the already existing `MergedTree::resolve()` and all the recent refactorings into `Merge<T>`, it's now very easy to add support for 3-way merging of `MergedTree` instances.	2023-08-28 15:58:34 -07:00
Martin von Zweigbergk	1674a421ec	commit_builder: take `MergedTreeId` for root id argument	2023-08-28 15:58:34 -07:00
Martin von Zweigbergk	49e32aa532	merged_tree: teach tree builder to build multiple trees	2023-08-27 06:49:45 -07:00
Martin von Zweigbergk	2dd2e77170	merged_tree: add `entries()` for iterating over all entries We already have `entries_matching()`, so this is just a version of that that doesn't take a matcher.	2023-08-27 06:49:45 -07:00
Martin von Zweigbergk	36674e8f7e	merged_tree: make `id()` return a `MergedTreeId` We will rarely want to use the tree id without knowing whether it can contain `TreeValue::Conflict` values, so let's make the callers check.	2023-08-27 06:49:45 -07:00
Martin von Zweigbergk	389f27f042	working_copy: move writing of conflict objects into new tree builder This introduces a `MergedTreeBuilder` type, which takes a set of base trees and overrides. The idea is that it will be able to write multiple trees or a legacy tree. For now, it's only able to write legacy trees. To show that it works, the working copy's snapshotting code has been updated to use it.	2023-08-26 08:16:57 -07:00
Martin von Zweigbergk	598cfcb89b	merged_tree: in diff iterator, maintain legacy/modern variant in subtree As #2165 showed, when diffing two `MergedTree::Legacy` variants (or one of each variant) and we recurse into a subtree, we need to treat that as a legacy tree too, so we expand `TreeValue::Conflict`s found in the diff.	2023-08-26 05:58:54 -07:00
Martin von Zweigbergk	f3fbdf9f84	merged_tree: pass `MergedTree` into `TreeDiffIterator::tree()` This converts `TreeDiffIterator::tree()` and `TreeDiffIterator::single_tree()` into associated functions and passes in the `&MergedTree` into the former. This prepares for fixing #2165, and it removes the need for the `TreeDiffIterator::store` field.	2023-08-26 05:58:54 -07:00
Martin von Zweigbergk	416fa2741c	merged_tree: add entry iterator	2023-08-25 07:06:20 -07:00
Martin von Zweigbergk	d5ceefcd8e	merged_tree: add diff iterator If we're going to be able to replace most instances of `Tree` by `MergedTree`, we'll need to be able to diff two `MergedTree`s. This implements support for that. The implementation copies a lot from the diff iterator we have for `Tree`. I suspect we should be able to reuse some of the code by introducing some traits that can then be implemented by both `Tree` and `MergedTree`. I've left a TODO about that.	2023-08-25 06:40:36 -07:00
Martin von Zweigbergk	6b5544f335	tree_builder: add a `set_or_remove()` and simplify callers Many of the `TreeBuilder` users have an `Option<TreeValue>` and call either `set()` or `remove()` on the builder depending on whether the value is present. Let's centralize this logic in a new `TreeBuilder::set_or_remove()`.	2023-08-24 06:08:25 -07:00
Martin von Zweigbergk	1d55a404cc	merged_tree: add `path_value()`	2023-08-15 07:56:55 -07:00
Martin von Zweigbergk	d4e755b4e4	merged_tree: rename some symbols away from "conflict" There were still many instances of `conflict` left from before we renamed `Conflict<T>` to `Merge<T>`. I decided to rename many of them based on the type parameter instead of the container. I think that made it more readable in many cases.	2023-08-11 21:11:25 +00:00
Martin von Zweigbergk	0570963fe3	merge: add a `Merge::into_resolved()` to avoid cloning I don't know if this has any measurable impact. It just seems like we should be able to take a resolved value out of a `Merge` without cloning.	2023-08-09 21:58:15 +00:00
Martin von Zweigbergk	f7160cf936	merge: add `absent()` and `normal()` to `Merge<Option<T>>` These mimic the `RefTarget` functions. They're very useful in `MergedTree`. I might copy over other helpers from `RefTarget` later.	2023-08-09 21:58:15 +00:00
Martin von Zweigbergk	ef5f97f8d7	conflicts: move `Merge<T>` to `merge` module The `merge` module now seems like the obvious place for this type.	2023-08-06 22:08:09 +00:00
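
Several of the `merge:` commits above (dd26b7be40, d9fbf21794, 1c6913d618, f6d85c51cd) describe storing merge values as one interleaved list, with `adds()`/`removes()` returning iterators and a non-failing accessor for the zeroth value. The following is a minimal sketch of that idea, not jj's actual `Merge<T>`: the type name `InterleavedMerge` and the constructor name `from_interleaved` are hypothetical, and the layout (add, remove, add, ..., add, so that a 3-way merge reads `[left, base, right]`) is an assumption inferred from the commit messages.

```rust
/// Hypothetical, simplified stand-in for a merge container.
/// Assumed layout: add, remove, add, remove, ..., add (always an odd length).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InterleavedMerge<T> {
    values: Vec<T>,
}

impl<T> InterleavedMerge<T> {
    /// Builds a merge from interleaved values, e.g. `[left, base, right]`.
    /// Panics if the length is even, since there must be exactly one more
    /// add than there are removes.
    pub fn from_interleaved(values: Vec<T>) -> Self {
        assert!(values.len() % 2 == 1, "expected an odd number of values");
        Self { values }
    }

    /// A merge with one add and no removes, i.e. no conflict.
    pub fn resolved(value: T) -> Self {
        Self { values: vec![value] }
    }

    /// The zeroth value. It always exists, so this accessor cannot fail.
    pub fn first(&self) -> &T {
        &self.values[0]
    }

    /// The adds: values at even indices.
    pub fn adds(&self) -> impl Iterator<Item = &T> {
        self.values.iter().step_by(2)
    }

    /// The removes: values at odd indices.
    pub fn removes(&self) -> impl Iterator<Item = &T> {
        self.values.iter().skip(1).step_by(2)
    }

    /// All values in storage order, for callers that don't care which
    /// values are adds and which are removes.
    pub fn iter(&self) -> std::slice::Iter<'_, T> {
        self.values.iter()
    }

    /// Returns the single value if there are no removes.
    pub fn as_resolved(&self) -> Option<&T> {
        if self.values.len() == 1 {
            Some(&self.values[0])
        } else {
            None
        }
    }
}
```

With this layout, `InterleavedMerge::from_interleaved(vec!["left", "base", "right"])` yields `left` and `right` from `adds()` and `base` from `removes()`, while `iter()` is a plain slice iterator over all three with no chaining of two separate collections.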

55 commits