Layers are now constructed per file, not per source type. This will allow us
to report precise position where bad configuration variable is set. Because
layers are now created per file, it makes sense to require existence of the
file, instead of ignoring missing files which would leave an empty layer in the
stack. The path existence is tested by ConfigEnv::existing_config_path(), so I
simply made the new load_file/dir() methods stricter. However, we still need
config::File::required(false) flag in order to accept /dev/null as an empty
TOML file.
The lib type is renamed to StackedConfig to avoid name conflicts. The cli
LayeredConfigs will probably be reorganized as an environment object that
builds a StackedConfig.
Adds a new "git" conflict marker style option. This option matches Git's
"diff3" conflict style, allowing these conflicts to be parsed by some
external tools that don't support JJ-style conflicts. If a conflict has
more than 2 sides, then it falls back to the similar "snapshot" conflict
marker style.
The conflict parsing code now supports parsing Git-style conflict
markers in addition to the normal JJ-style conflict markers, regardless
of the conflict marker style setting. This has the benefit of allowing
the user to switch the conflict marker style while they already have
conflicts checked out, and their old conflicts will still be parsed
correctly.
Example of "git" conflict markers:
```
<<<<<<< Side #1 (Conflict 1 of 1)
fn example(word: String) {
println!("word is {word}");
||||||| Base
fn example(w: String) {
println!("word is {w}");
=======
fn example(w: &str) {
println!("word is {w}");
>>>>>>> Side #2 (Conflict 1 of 1 ends)
}
```
Adds a new "snapshot" conflict marker style which returns a series of
snapshots, similar to Git's "diff3" conflict style. The "snapshot"
option uses a subset of the conflict hunk headers as the "diff" option
(it just doesn't use "%%%%%%%"), meaning that the two options are
trivially compatible with each other (i.e. a file materialized with
"snapshot" can be parsed with "diff" and vice versa).
Example of "snapshot" conflict markers:
```
<<<<<<< Conflict 1 of 1
+++++++ Contents of side #1
fn example(word: String) {
println!("word is {word}");
------- Contents of base
fn example(w: String) {
println!("word is {w}");
+++++++ Contents of side #2
fn example(w: &str) {
println!("word is {w}");
>>>>>>> Conflict 1 of 1 ends
}
```
Adds a new "ui.conflict-marker-style" config option. The "diff" option
is the default jj-style conflict markers with a snapshot and a series of
diffs to apply to the snapshot. New conflict marker style options will
be added in later commits.
The majority of the changes in this commit are from passing the config
option down to the code that materializes the conflicts.
Example of "diff" conflict markers:
```
<<<<<<< Conflict 1 of 1
+++++++ Contents of side #1
fn example(word: String) {
println!("word is {word}");
%%%%%%% Changes from base to side #2
-fn example(w: String) {
+fn example(w: &str) {
println!("word is {w}");
>>>>>>> Conflict 1 of 1 ends
}
```
Since commit e716fe7, the function no longer returns a tree id. That
commit updated the function documentation to match the new behavior,
but then the old documentation was restored in commit ef83f2b, perhaps
by a bad merge conflict resolution.
There's a subtle difference in error message, but the conversion function to be
called is the same. The error message now includes "for key <key>", which is
nice.
.get_table() isn't implemented because it isn't cheap to build a HashMap,
and a table of an abstract Value type wouldn't be useful. Maybe we'll
instead provide an iterator of table keys.
.config() is renamed to .raw_config() to break existing callers.
Ui::with_config() is unchanged because I'm not sure if UserSettings should be
constructed earlier. I assume UserSettings will hold an immutable copy of
LayerdConfigs object, whereas Ui has to be initialized before all config layers
get loaded.
I'm planning to rewrite config store layer by leveraging toml_edit instead of
the config crate. It will allow us to merge config overlays in a way that
deprecated keys are resolved within a layer prior to merging, for example.
This patch moves ConfigNamePathBuf to jj-lib where new config API will be
hosted. We'll probably extract LayeredConfigs to this module, but we'll first
need to split environment dependencies from it.
Some editors strip trailing whitespace on save, which breaks any diffs
which have context lines, since the parsing function expects them to
start with a space. There's no visual difference between " \n" and "\n",
so it seems reasonable to accept both.
Currently, conflict markers ending in CRLF line endings aren't allowed.
I don't see any reason why we should reject them, since some
editors/tools might produce CRLF automatically on Windows when saving
files, which would break the conflicts otherwise.
Because the output of diff.hunks() is a list of [&BStr]s, a Merge object
reconstructed from a diff hunk will be Merge<&BStr>. I'm not pretty sure if
we'll implement conflict diffs in that way, but this change should be harmless
anyway.
I believe this was an oversight. "jj duplicate" should duplicate commits (=
patches), not trees.
This patch adds a separate test file because test_rewrite.rs is pretty big, and
we'll probably want to migrate CLI tests to jj-lib.
The working-copy revision is usually the latest commit, but it's not always
true. This patch ensures that the wc branch is emitted first so the graph node
order is less dependent on rewrites.
This isn't always fast because it increases the chance of cache miss, but in
practice, it makes "jj file annotate" faster. It's still slower than
"git blame", though.
Maybe we should also change the hash function.
```
group new old
----- --- ---
bench_diff_git_git_read_tree_c 1.00 45.2±0.38µs 1.29 58.4±0.32µs
bench_diff_lines/modified/10k 1.00 32.7±0.24ms 1.05 34.4±0.17ms
bench_diff_lines/modified/1k 1.00 2.9±0.00ms 1.06 3.1±0.01ms
bench_diff_lines/reversed/10k 1.00 22.7±0.18ms 1.02 23.2±0.29ms
bench_diff_lines/reversed/1k 1.00 439.0±9.46µs 1.19 523.9±6.05µs
bench_diff_lines/unchanged/10k 1.00 2.9±0.06ms 1.20 3.5±0.02ms
bench_diff_lines/unchanged/1k 1.00 240.8±1.03µs 1.30 312.1±1.05µs
```
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
'target/release-with-debug/{bin} --ignore-working-copy file annotate lib/src/revset.rs'
Benchmark 1: target/release-with-debug/jj-0 ..
Time (mean ± σ): 1.604 s ± 0.259 s [User: 1.543 s, System: 0.057 s]
Range (min … max): 1.348 s … 1.917 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 ..
Time (mean ± σ): 1.183 s ± 0.026 s [User: 1.118 s, System: 0.062 s]
Range (min … max): 1.155 s … 1.237 s 10 runs
```
Since patience diff is recursive, it makes some sense to reuse precomputed
hash values. This patch migrates Histogram to remembering hashed values. The
precomputed values will be cached globally by DiffSource.
Technically, Histogram doesn't have to keep a separate copy of hash values, but
this appears to give better perf than slicing text and hash value from two Vecs.
This can be used to find the fork point (best common ancestors) of a
revset with an arbitrary number of commits, which cannot be expressed
currently in the revset language.
- gix::object::tree::diff::change::Event::Rewrite is flattened
- diff options are extracted to separate type
2b81e6c8bd
- signature text now includes a trailing newline
4a6bbb1b79
The last change means that our SecureSig { sig } returned by GitBackend is now
terminated by '\n'. I think this is harmless since textual signature is usually
ends with '\n'.
I don't think we need to declare these dependencies separately because both
are the optional dependencies enabled by the git feature.
We'll also need the gix's "attributes" feature at some point.
> attributes - Query attributes and excludes. Enables access to pathspecs,
> worktree checkouts, filter-pipelines and submodules.
https://docs.rs/gix/0.67.0/gix/index.html#feature-flags
There is now an updated version of `esl01-renderdag` published from
the Sapling repo, so let's use that. That lets us remove the
`bitflags` 1.x dependency and the `itertools` 0.10.x non-dev
dependency.
This test reliably failed if I dropped tv_nsec part from statx().
Since we reload the repo now, several assertions get "fixed". I've added
index().has_id() test to clarify that it's still broken.
This is different from skipped paths because the file state has to remain as
FileType::GitSubmodule in order to ignore the submodule directory when
snapshotting.
Fixes#4825.
In "jj absorb", we'll need to calculate annotation from the parent tree. It's
usually identical to the tree of the parent commit, but this is not true for a
merge commit. Since I'm not sure how we'll process conflict trees in general,
this patch adds a minimal API to specify a single file content, not a
MergedTree.
The primary use case is to exclude immutable commits when calculating line
ranges to absorb. For example, "jj absorb" will build annotation of @ revision
with domain = mutable().
Due to the gradual rewrite to use the
`MutableRepo::transform_descendants` API, `DescendantRebaser` is no
longer used in `MutableRepo::rebase_descendants`, and is only used
(indirectly through `MutableRepo::rebase_descendants_return_map`) in
`rewrite::squash_commits`.
`DescendantRebaser` is removed since it contains a lot of logic
similar to `MutableRepo::transform_descendants`. Instead,
`MutableRepo::rebase_descendants_with_options_return_map` is rewritten
to use `MutableRepo::transform_descendants` directly, and the other
`MutableRepo::rebase_descendants_{return_map,with_options}` functions
call `MutableRepo::rebase_descendants_with_options_return_map` directly.
`MutableRepo::rebase_descendants_return_rebaser` is also removed.
https://rustsec.org/advisories/RUSTSEC-2024-0384 says to migrate off
of the `instant` crate because it's unmaintained. We depend on it only
via the `backoff` crate. That crate also seems unmaintained. So this
patch replaces our use of `backoff` by a custom implementation.
I initially intended to migrate to the `backon` crate, but that made
`lock::tests::lock_concurrent` tests fail. The test case spawns 72
threads (on my machine) and lets them all lock a file, and then it
waits 1 millisecond before releasing the file lock. I think the
problem is that their version of jitter is implemented as a random
addition of up to the initial backoff value. In our case, that means
we would add at most a millisecond. The `backoff` crate, on the other
hand does it by adding -50% to +50% of the current backoff value. I
think that leads to a much better distribution of backoffs, while
`backon`'s implementation means that only a few threads can lock the
file for each backoff exponent.
Both user and programmatic expressions use the same .evaluate() function now.
optimize() is applied globally after symbol resolution. The order shouldn't
matter, but it might be nicer because union of commit refs could be rewritten
to a single Commits(Vec<CommitId>) node.
This helps add library API that takes resolved revset expressions. For example,
"jj absorb" will first compute annotation within a user-specified ancestor range
such as "mutable()". Because the range expression may contain symbols, it should
be resolved by caller.
There are two ideas to check resolution state at compile time:
<https://github.com/martinvonz/jj/pull/4374>
a. add RevsetExpressionWrapper<PhantomState> and guarantee inner tree
consistency at public API boundary
b. parameterize RevsetExpression variant types in a way that invalid variants
can never be constructed
(a) is nice if we want to combine "resolved" and "unresolved" expressions. The
inner expression types are the same, so we can just calculate new state as
Resolved & Unresolved = Unresolved. (b) is stricter as the compiler can
guarantee invariants. This patch implements (b) because there are no existing
callers who need to construct "resolved" expression and convert it to "user"
expression.
.evaluate_programmatic() now requires that the expression is resolved.
This will become safe as I'm going to add static check that the expression does
never contain CommitRef(_)s. We could make Present(_) unconstructible, but the
existence of Present node is harmless.
I'm going to add RevsetExpression<State> type parameter, but the existing tree
transformer can't rewrite nodes to different state because the input and the
output must be of the same type. (If they were of different types, we couldn't
reuse the input subtree by Rc::clone().) The added visitor API will handle
state transitions by mapping RevsetExpression::<St1>::<Kind> to
RevsetExpression::<St2>::<Kind>.
CommitRef and AtOperation nodes are processed by specialized methods because
these nodes will depend on the State type. OTOH, Present node won't be
State-dependent, so it's inspected by the common fold_expression() method.
An input expression is not taken as an &Rc<RevsetExpression> but a &_ because
we can't reuse the allocation behind the Rc.
I originally considered adding deny-list-based implementation, but the Windows
compatibility rules are super confusing and I don't have a machine to find out
possible aliases. This patch instead adds directory equivalence tests.
In order to test file entity equivalence, we first need to create a file or
directory of the requested name. It's harmless to create an empty .jj or .git
directory, but materializing .git file or symlink can temporarily set up RCE
situation. That's why new empty file is created to test the path validity. We
might want to add some optimization for safe names (e.g. ASCII, not contain
"git" or "jj", not contain "~", etc.)
That being said, I'm not pretty sure if .git/.jj in sub directory must be
checked. It's not safe to cd into the directory and run "jj", but the same
thing can be said to other tools such as "cargo". Perhaps, our minimum
requirement is to protect our metadata (= the root .jj and .git) directories.
Despite the crate name (and internal use of std::fs::File),
same_file::is_same_file() can test equivalence of directories. This is
documented and tested, so I've removed my custom implementation, which was
slightly simpler but lacks Windows support.
If new file would overwrite an existing regular file, the file path is skipped.
It makes sense to apply the same rule to existing symlinks. Without this patch,
check out would fail if an existing path was a dead symlink or a symlink to
a directory.
I'm not sure if this was attackable before, but it should be better to not
try to remove file across symlinks.
The disk_path is now returned from create_parent_dirs() to clarify that the
path is identical.
This should be safer than relying on file open error. It's scary to continue
processing if the file was a symlink.
I'll add a few more sanity checks to remove_old_file(), so it's extracted as a
function.
I'm going to add "checked" version of to_fs_path(), but all callers can't be
migrated to it. For example, an error message should be produced even if the
path is malformed.
This patch also adds error variants to propagate InvalidRepoPathError. They
don't use ::Other { .. } so the errors can be distinguished in tests.
I'm going to replace the current .evaluate_programmatic() which does minimal
commit-ref resolution. The new .evaluate_programmatic() will be implemented on
a "resolved" expression.
For the same reason as the previous patch. It's nice if root() is considered
a "resolved" expression. With this change, most of the evaluate_programmatic()
callers won't have to do symbol resolution at all.
I'm going to add RevsetExpression<State = Resolved|User> type parameter to
detect API misuse at compile time. VisibleHeads is similar to All, and appears
in generic expression substitution function where a concrete State type
shouldn't be known.
This ensures that a symbol-resolved at_operation() expression won't be resolved
again when it's intersected with another expression, for example.
# in CLI
let expr1 = parse("at_operation(..)").resolve_user_symbol();
# in library
let expr2 = RevsetExpression::ancestors().intersection(&expr1);
expr2.evaluate_programmatic()
This will help construct file content based on diff hunks. For example, "jj
absorb" will first calculate annotation of the source parent (within mutable
ancestors), calculate diff, then "squash" hunks into ancestor commits of the
surrounding ranges.
This unblocks the use of TestBackend in long-running processes such as fuzzer.
It should also be safer because TempDir doesn't guarantee that the path is never
reused.
TestBackendData instances persist in memory right now, but they should be
discarded when the corresponding temp_dir gets dropped. The added struct will
manage the TestBackendData mapping.
We had documented that we support `git.auto-local-bookmark` but we
don't. The documentation has been incorrect since d9c68e08b1. This
patch fixes it by adding support for `git.auto-local-bookmark` with
fallback to the old/current `git.auto-local-branch`.
.