This also allows some minor optimizations to be performed, such as
avoiding recomputation of the connected target set when
`MoveCommitsTarget::Roots` is used since the connected target set is
identical to the target set (all descendants of the roots).
is_empty() could also return Result<bool, _>, but I think the current definition
is also good. If an error occurred, revset.iter() would return at least one
item, so it's not empty.
Let's say we're updating one parent of a merge:
```
E E'
/|\ /|\
B C D -> B C D'
\|/ \|/
A A
```
When rebasing `E` to create `E'` there, we do that by merging the
changes compared to the auto-merged parents. The auto-merged parents
before is `B+(C-A)+(D-A)`, and after it's `B+(C-A)+(D'-A)`. Then we
rebase the diff, which gives us `E' = B+(C-A)+(D'-A) + (E -
(B+(C-A)+(D-A))) = D' + (E - D')`.
However, we currently don't do quite that simplification because we
first resolve conflicts when possible in the two auto-merged parent
trees (before and after). That rarely makes a difference to the
result, but it's wasteful to do it. It does make a difference in some
cases where our merge algorithm is lossy, which currently is only the
"A+(A-B)=A" case. I added a test case showing where it does make a
difference. It's a non-obvious cases but I think the new behavior is
more correct (the old behavior was a conflict).
This was added at f5f61f6bfe "revset: resolve 'HEAD@git' just like other
pseudo @git branches." As I said in this patch, there was no practical use case
of the HEAD@git symbol.
Suppose we implement colocated workspaces/worktrees #4436, there may be multiple
Git HEAD revisions. This means HEAD can no longer be abstracted as a symbol of
the "git" remote.
Custom backends may rely on networking or other unreliable implementations to support revsets, this change allows them to return errors cleanly instead of panicking.
For simplicity, only the public-facing Revset and RevsetGraph types are changed in this commit; the internal revset engine remains mostly unchanged and error-free since it cannot generally produce errors.
Clippy 1.83 (currently in beta) detects more cases of unneeded lifetimes,
namely in trait implementation declarations. Since this lint is warn by
default, we need to fix those instances to get a clean CI.
This is required when performing `rebase -s a -s b` where "b" is a
descendant of "a". Both "a" and "b" should be regarded as the roots of
the target set and be rebased onto the new destination.
This allows for `RebaseOptions` to be respected. This will be used when
migrating `rebase --source`/`rebase --branch` to use `move_commits` to
respect the `--before`/`--after` options.
The `coalesce` function takes a list of revsets and returns the commits in the
first revset in the list which evalutes to a non-empty set of commits.
It can be used to display fallbacks if a certain commit cannot be found,
e.g. `coalesce(present(user_configured_trunk), builtin_trunk)`.
A new module is added to jj_lib which exposes a function
get_annotation_for_file. This annotates the given file line by line with
commit information according to the commit that made the most recent
change to the line.
Similarly, a new command is added to the CLI called `jj file annotate` which
accepts a file path. It then prints out line by line the commit
information for the line and the line itself. Specific commit
information can be configured via the templates.annotate_commit_summary
config variable
This appears to be a bit faster if there are tons of unchanged ranges.
```
group new old
----- --- ---
bench_diff_git_git_read_tree_c 1.00 58.5±0.12µs 1.07 62.7±0.60µs
bench_diff_lines/modified/10k 1.00 34.2±0.72ms 1.08 37.0±1.09ms
bench_diff_lines/modified/1k 1.00 3.1±0.08ms 1.12 3.5±0.01ms
bench_diff_lines/reversed/10k 1.00 28.0±0.15ms 1.01 28.4±0.51ms
bench_diff_lines/reversed/1k 1.00 616.0±16.20µs 1.00 617.0±9.29µs
bench_diff_lines/unchanged/10k 1.00 3.5±0.04ms 1.10 3.9±0.06ms
bench_diff_lines/unchanged/1k 1.00 328.4±4.44µs 1.07 352.0±1.41µs
```
One particular use case for these is escape sequences -- and to that
end, I'm also adding `\e` as a shorthand for `\x1b`.
Change-Id: Id000000040ea6fd8e2d720219931485960c570dd
The Ord implementation didn't conform to Eq (which compares "self.others"
literally), and we don't need Ord to insert non-overlapping ranges before the
current range.
This isn't fancy, but I couldn't find a better abstraction. Note that
MutableRepo::base_repo() isn't removed as it has to return &Arc<_>, whereas
<ReadonlyRepo as Repo>::base_repo() can't return &Arc<_>.
The id.shortest() template prints a warning and falls back to repo-global
resolution. This seems better than erroring out. There are a few edge cases
in which the short-prefixes resolution can fail unexpectedly. For example, the
trunk() revision might not exist in operations before "jj git clone".
The idea is that the disambiguation index can be loaded from a repo which is
different from the symbol resolution context.
Suppose we add at_operation(op, expr) revset, a symbol inside at_operation()
expression will have to be resolved within that operation, whereas the
disambiguation index is cached globally by WorkspaceCommandHelper. We could
build temporary disambiguation index for each at-op repo, but that would be
complicated implementation-wise, and wouldn't be useful. For example, a query
"x | at_operation(@-, x)" might be resolved to "xy | at_operation(@-, xz)"
if disambiguation index were reloaded for the @- operation. Instead, the
short change ID "x" can be disambiguated to "xy", then resolved to the
corresponding commit IDs at each operation.
This unblocks reuse of a symbol resolver instance for a different repo view
specified by at_operation() revset. See later commits for details. It's also
easier to handle error if there is a single function that can fail.
This removes an invalid View state from the root operation.
Note that the root index will have to be reindexed in order to resolve "root()"
in the root operation. I don't think this would practically matter, so this
patch doesn't bump the index version to invalidate the existing indexes.
See also 48a9f9ef56 "repo: use Transaction for creating repo-init operation."
See the next patch for why. It might look odd that OpStore depends on the root
CommitId, but that seems okay because OpStore manages Views, and a View is
basically a set of CommitIds.
For #3673, we will have aliases such as:
```toml
'upload(revision)' = [
["fix", "-r", "$revision"],
["lint", "-r", "$revision"],
["git", "push", "-r", "$revision"],
]
```
Template aliases:
1) Start as Config::Value
2) Are converted to String
3) Are placed in the alias map
4) Expand to a TemplateExpression type via expand_defn.
However, command aliases:
1) Start as Config::Value
2) Are converted to Vec<Vec<String>>
3) Are placed in an alias map
4) Do not expand
Thus, AliasesMap will need to support non-string values.
We're likely to use the right (or new) context lines in rendered diffs, but
it's odd that the hunks iterator choose which context hunk to return. We'll
also need both contents to calculate left/right line numbers.
Since the hunk content types are the same, I also split enum DiffHunk into
{ kind, contents } pair.
This change made some diff benches slow, maybe because the generated code
becomes slightly worse due to the added abstraction? I'll revisit the
performance problem later. There are a couple of ways to mitigate it.
```
group new old
----- --- ---
bench_diff_git_git_read_tree_c 1.02 61.0±0.23µs 1.00 59.7±0.38µs
bench_diff_lines/modified/10k 1.00 41.6±0.24ms 1.02 42.3±0.22ms
bench_diff_lines/modified/1k 1.00 3.8±0.07ms 1.00 3.8±0.03ms
bench_diff_lines/reversed/10k 1.29 23.4±0.20ms 1.00 18.2±0.26ms
bench_diff_lines/reversed/1k 1.05 517.2±5.55µs 1.00 493.7±59.72µs
bench_diff_lines/unchanged/10k 1.00 3.9±0.10ms 1.08 4.2±0.10ms
bench_diff_lines/unchanged/1k 1.01 356.8±2.33µs 1.00 353.7±1.99µs
```
(I don't get stable results on my noisy machine, so the results would vary.)
The added comparison functions correspond to --ignore-all-space and
--ignore-space-change. --ignore-space-at-eol can be combined with the other
flags, so it will have to be implemented as a preprocessing function.
--ignore-blank-lines will also require some change in the tokenizer function.
This could be implemented as a newtype `Wrapper<'a>(&'a [u8])`, but a lifetime
of the wrap function couldn't be specified correctly:
fn diff(left: &[u8], right: &[u8], wrap_fn: F, ..)
where
F: for<'a> Fn(&'a [u8]) -> W<'a>, // F::Output<'a> can't be specified
W: Copy + Eq + Hash
If the wrapper were of `&Wrapper([u8])` type, `Fn(&[u8]) -> &W` works. However,
it means we can no longer set comparison parameter (such as Regex) dynamically.
Another idea is to add some filter function of `Fn(&[u8]) -> Cow<'_, [u8]>`
type, but I don't think we would want to pay the allocation cost in
hashing/comparison code. `Fn(&[u8]) -> impl Iterator<Item = &[u8]>` might work,
but it would be equally complex.
We'll use low-level HashTable to customize Eq/Hash without implementing newtype
wrappers.
Unneeded default features are disabled for now. Note that the new default
hasher, foldhash, is released under the Zlib license, which isn't currently
included in the allow list.
Most collection references implement `.into_iter()` or its mutable version,
so it is possible to iterate over the elements without using an explicit
method to do so.
We have two options to achieve "diff --ignore-*-space":
a. preprocess contents to be diffed, then translate hunk ranges back
b. add hooks to customize eq and hash functions
I originally thought (a) would be easier, but actually, there aren't many
changes needed to implement (b). And (b) should have a fewer logic errors.
This patch removes assumption that each unchanged region has the same content
length. It won't be true if whitespace characters are ignored.
Intersection of unchanged ranges becomes a simple merge-join loop, so I've
removed the existing tests. I also added a fast path for the common 2-way
diffs in which we don't have to build vec![(pos, vec![pos])].
One source of confusion introduced by this change is that WordPosition means
both global and local indices. This is covered by the added tests, but I might
add separate local/global types later.
It's silly that we build new Vec for each recursion stack and merge elements
back. I don't see a measurable performance difference in the diff bench, but
this change will help simplify the next patch. If a result vec were created for
each unchanged_ranges() invocation, it would probably make more sense to return
a list of "local" word positions. Then, callers would have to translate the
returned positions to the caller's local positions.
We can assign a unique integer to each (word, occurrence) pair instead. As a
bonus, HashMap can be replaced with Vec.
```
group new old
----- --- ---
bench_diff_git_git_read_tree_c 1.00 72.5±3.25µs 1.08 78.5±0.48µs
bench_diff_lines/modified/10k 1.00 45.1±1.18ms 1.10 49.8±1.85ms
bench_diff_lines/modified/1k 1.00 4.1±0.07ms 1.11 4.5±0.34ms
bench_diff_lines/reversed/10k 1.00 19.0±0.12ms 1.12 21.2±1.26ms
bench_diff_lines/reversed/1k 1.00 558.5±37.42µs 1.17 655.6±16.27µs
bench_diff_lines/unchanged/10k 1.00 5.3±0.78ms 1.33 7.0±0.89ms
bench_diff_lines/unchanged/1k 1.00 422.0±16.68µs 1.28 540.7±13.96µs
```
I'll add a few more helper methods there. It might also make sense to cache
precomputed hash values.
unchanged_ranges() is made private since there are no external callers, and
I'm going to add more private types.
See discussion thread in linked issue.
With this PR, all revset functions in [BUILTIN_FUNCTION_MAP](8d166c7642/lib/src/revset.rs (L570))
that return multiple values are either named in plural or the naming is hard to misunderstand (e.g. `reachable`)
Fixes: #4122
Stacking at AliasExpanded node looks wonky. If we migrate error handling to
Diagnostics API, it might make sense to remove AliasExpanded node and add
node.aliases: vec![(id, span), ..] field instead.
Some closure arguments are inlined in order to help type inference.
I recently got random test_commit_parallel*() failures, and this patch appears
to fix the problem.
SimpleOpHeadsStore::update_op_heads() adds new_id file and removes old_ids
files in that order. It ensures that there exists at least one id file, but it
doesn't mean readdir() can observe the added id file.
1. process A: get_op_heads() -> opendir()
2. process B: update_op_heads() -> add_op_head(new_id), remove_op_head(old_id)
3. process A: -> readdir() (can miss new_id)
update_op_heads() could do rename(old_ids[0], new_id), but I don't remember if
readdir() can always pick up a renamed entry.
I think it's slightly easier to follow if we calculate a diff of input bits
first. I don't know which one is faster, but I assume compiler can optimize to
similar instructions.
Comparing each byte before comparing the nibbles is more efficient. A
benchmark comparing the old and new implementations with various common
prefix lengths shows:
```
Common hex len/old/3 time: [7.5444 ns 7.5807 ns 7.6140 ns]
Common hex len/new/3 time: [1.2100 ns 1.2144 ns 1.2192 ns]
Common hex len/old/6 time: [11.849 ns 11.879 ns 11.910 ns]
Common hex len/new/6 time: [1.9950 ns 2.0046 ns 2.0156 ns]
Common hex len/old/32 time: [63.030 ns 63.345 ns 63.718 ns]
Common hex len/new/32 time: [6.4647 ns 6.4800 ns 6.4999 ns]
```
Jujutsu's branches do not behave like Git branches, which is a major
hurdle for people adopting it from Git. They rather behave like
Mercurial's (hg) bookmarks.
We've had multiple discussions about it in the last ~1.5 years about this rename in the Discord,
where multiple people agreed that this _false_ familiarity does not help anyone. Initially we were
reluctant to do it but overtime, more and more users agreed that `bookmark` was a better for name
the current mechanism. This may be hard break for current `jj branch` users, but it will immensly
help Jujutsu's future, by defining it as our first own term. The `[experimental-moving-branches]`
config option is currently left alone, to force not another large config update for
users, since the last time this happened was when `jj log -T show` was removed, which immediately
resulted in breaking users and introduced soft deprecations.
This name change will also make it easier to introduce Topics (#3402) as _topological branches_
with a easier model.
This was mostly done via LSP, ripgrep and sed and a whole bunch of manual changes either from
me being lazy or thankfully pointed out by reviewers.
`jj new <commit>` automatically adds the checked out commits into the view head ids. However,
`jj edit` does not.
To reproduce:
```
jj git init test
cd test
jj commit -m "my commit"
jj log -r @- -T commit_id # Save the id
jj abandon -r @-
jj edit <saved_id>
jj log -r :: # Does not show the currently editing commit
```
It's a pretty frequent request to have support for turning off
auto-tracking of new files and to have a command to manually track
them instead. This patch adds a `snapshot.auto-track` config to decide
which paths to auto-track (defaults to `all()`). It also adds a `jj
track` command to manually track the untracked paths.
This patch does not include displaying the untracked paths in `jj
status`, so for now this is probably only useful in colocated repos
where you can run `git status` to find the untracked files.
#323