Commit graph

3351 commits

Author SHA1 Message Date
Yuya Nishihara
3144a8cb9e annotate: add line_ranges() and compact_line_ranges() iterator
They allow callers to test range overlaps with e.g. diff hunks. "jj absorb"
will leverage compact_line_ranges().
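
As a rough illustration of what "compacting" annotated line ranges could mean, the sketch below merges adjacent ranges that are attributed to the same commit, so a caller can test overlap against a diff hunk with fewer comparisons. This is not the jj-lib implementation: the function name `compact_ranges` and the use of `&str` as a stand-in for a commit ID are assumptions made for the example.

```rust
use std::ops::Range;

/// Merges adjacent byte ranges that share the same id into a single range.
/// Hypothetical helper for illustration only; not the jj-lib API.
fn compact_ranges<'a>(
    lines: impl IntoIterator<Item = (&'a str, Range<usize>)>,
) -> Vec<(&'a str, Range<usize>)> {
    let mut out: Vec<(&'a str, Range<usize>)> = Vec::new();
    for (id, range) in lines {
        match out.last_mut() {
            // Extend the previous entry when the id matches and the ranges touch.
            Some((last_id, last)) if *last_id == id && last.end == range.start => {
                last.end = range.end;
            }
            // Otherwise start a new entry.
            _ => out.push((id, range)),
        }
    }
    out
}
```

Given per-line ranges `("a", 0..4), ("a", 4..9), ("b", 9..12)`, the helper yields `("a", 0..9), ("b", 9..12)`.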
2024-11-12 08:26:42 +09:00
Yuya Nishihara
077bac8be1 annotate: add low-level function to specify starting file content
In "jj absorb", we'll need to calculate annotation from the parent tree. It's
usually identical to the tree of the parent commit, but this is not true for a
merge commit. Since I'm not sure how we'll process conflict trees in general,
this patch adds a minimal API to specify a single file content, not a
MergedTree.
2024-11-12 08:26:42 +09:00
Yuya Nishihara
85e0a8b068 annotate: add option to not search all ancestors of starting commit
The primary use case is to exclude immutable commits when calculating line
ranges to absorb. For example, "jj absorb" will build annotation of @ revision
with domain = mutable().
2024-11-12 08:26:42 +09:00
Benjamin Tan
0e67ef9184 rewrite: avoid abandoned commit parent lookup in rebase_commit_with_options
This is an optimization to avoid fetching the parent commit of an
abandoned commit after rebasing, given that it might not even be used.
2024-11-12 01:33:12 +08:00
Benjamin Tan
9bd7e7707f repo: add docs for MutableRepo::rebase_descendants_* functions 2024-11-12 01:33:12 +08:00
Benjamin Tan
d7f929fefb repo: group MutableRepo::rebase_descendants_* functions together 2024-11-12 01:33:12 +08:00
Benjamin Tan
18faaf72a3 rewrite: remove DescendantRebaser
Due to the gradual rewrite to use the
`MutableRepo::transform_descendants` API, `DescendantRebaser` is no
longer used in `MutableRepo::rebase_descendants`, and is only used
(indirectly through `MutableRepo::rebase_descendants_return_map`) in
`rewrite::squash_commits`.

`DescendantRebaser` is removed since it contains a lot of logic
similar to `MutableRepo::transform_descendants`. Instead,
`MutableRepo::rebase_descendants_with_options_return_map` is rewritten
to use `MutableRepo::transform_descendants` directly, and the other
`MutableRepo::rebase_descendants_{return_map,with_options}` functions
call `MutableRepo::rebase_descendants_with_options_return_map` directly.

`MutableRepo::rebase_descendants_return_rebaser` is also removed.
2024-11-12 01:33:12 +08:00
Martin von Zweigbergk
02486dc064 fallback: replace use of backoff crate with own implementation
https://rustsec.org/advisories/RUSTSEC-2024-0384 says to migrate off
of the `instant` crate because it's unmaintained. We depend on it only
via the `backoff` crate. That crate also seems unmaintained. So this
patch replaces our use of `backoff` by a custom implementation.

I initially intended to migrate to the `backon` crate, but that made
`lock::tests::lock_concurrent` tests fail. The test case spawns 72
threads (on my machine) and lets them all lock a file, and then it
waits 1 millisecond before releasing the file lock. I think the
problem is that their version of jitter is implemented as a random
addition of up to the initial backoff value. In our case, that means
we would add at most a millisecond. The `backoff` crate, on the other
hand, does it by adding -50% to +50% of the current backoff value. I
think that leads to a much better distribution of backoffs, while
`backon`'s implementation means that only a few threads can lock the
file for each backoff exponent.
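
The difference between the two jitter strategies described above can be sketched as follows. This is a minimal sketch based only on the description in this message, not the code of either crate or of the replacement implementation. The function names are hypothetical, and `percent` (0 to 100) stands in for a random draw so that the arithmetic stays deterministic.

```rust
use std::time::Duration;

/// Jitter of -50% to +50% of the current backoff value.
/// `percent` in 0..=100 maps to a factor of 50%..=150%.
/// Hypothetical helper; not taken from any crate.
fn proportional_jitter(current: Duration, percent: u32) -> Duration {
    current * (50 + percent) / 100
}

/// Jitter as an addition of up to the initial backoff value.
/// `percent` in 0..=100 selects how much of `initial` is added.
/// Hypothetical helper; not taken from any crate.
fn additive_jitter(current: Duration, initial: Duration, percent: u32) -> Duration {
    current + initial * percent / 100
}
```

With an initial backoff of 1 ms and a current backoff of 64 ms, the proportional variant spreads waiters over 32 ms to 96 ms, while the additive variant keeps them all within 64 ms to 65 ms.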
2024-11-11 07:04:21 -08:00
dploch
41631bc0e6 test_git: fix some clippy ref errors 2024-11-08 13:59:37 -05:00
Benjamin Tan
9b99f4810c rewrite: move_commits: do not allow emptying of descendants 2024-11-08 14:35:17 +08:00
Yuya Nishihara
62e4943c04 revset: reorganize expression resolution/evaluation methods
Both user and programmatic expressions use the same .evaluate() function now.
optimize() is applied globally after symbol resolution. The order shouldn't
matter, but it might be nicer because union of commit refs could be rewritten
to a single Commits(Vec<CommitId>) node.
2024-11-08 10:34:02 +09:00
Yuya Nishihara
e55d03a2ee revset: introduce type-safe user/resolved expression states
This helps add library API that takes resolved revset expressions. For example,
"jj absorb" will first compute annotation within a user-specified ancestor range
such as "mutable()". Because the range expression may contain symbols, it should
be resolved by caller.

There are two ideas to check resolution state at compile time:
<https://github.com/martinvonz/jj/pull/4374>

 a. add RevsetExpressionWrapper<PhantomState> and guarantee inner tree
    consistency at public API boundary
 b. parameterize RevsetExpression variant types in a way that invalid variants
    can never be constructed

(a) is nice if we want to combine "resolved" and "unresolved" expressions. The
inner expression types are the same, so we can just calculate new state as
Resolved & Unresolved = Unresolved. (b) is stricter as the compiler can
guarantee invariants. This patch implements (b) because there are no existing
callers who need to construct "resolved" expression and convert it to "user"
expression.

.evaluate_programmatic() now requires that the expression is resolved.
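
Idea (b) can be illustrated with a small, self-contained sketch. It is not the actual jj-lib type definition: `Expr`, `State`, `User`, `Resolved`, `resolve` and `evaluate` are hypothetical names, and commit IDs are simplified to `u32`. The state-dependent variant carries an associated type, and the resolved state picks an uninhabited type for it, so that variant can never be constructed for a resolved expression.

```rust
use std::convert::Infallible;

/// Compile-time marker for the resolution state of an expression.
trait State {
    type CommitRef;
}
struct User;
struct Resolved;
impl State for User {
    type CommitRef = String; // symbol name to be resolved later
}
impl State for Resolved {
    type CommitRef = Infallible; // uninhabited: variant is unconstructible
}

enum Expr<St: State> {
    Commits(Vec<u32>),
    CommitRef(St::CommitRef),
    Union(Box<Expr<St>>, Box<Expr<St>>),
}

/// Maps a user expression to a resolved one by looking up each symbol.
fn resolve(expr: &Expr<User>, lookup: &dyn Fn(&str) -> Vec<u32>) -> Expr<Resolved> {
    match expr {
        Expr::Commits(ids) => Expr::Commits(ids.clone()),
        Expr::CommitRef(name) => Expr::Commits(lookup(name)),
        Expr::Union(a, b) => {
            Expr::Union(Box::new(resolve(a, lookup)), Box::new(resolve(b, lookup)))
        }
    }
}

/// Only accepts resolved expressions; the CommitRef arm is statically dead.
fn evaluate(expr: &Expr<Resolved>) -> Vec<u32> {
    match expr {
        Expr::Commits(ids) => ids.clone(),
        Expr::CommitRef(never) => match *never {},
        Expr::Union(a, b) => {
            let mut ids = evaluate(a);
            ids.extend(evaluate(b));
            ids.sort();
            ids.dedup();
            ids
        }
    }
}
```

Passing an `Expr<User>` to `evaluate` is a type error in this sketch, which is the kind of API misuse the state parameter is meant to catch at compile time.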
2024-11-08 09:56:33 +09:00
Yuya Nishihara
e83072b98f revset: ignore Present node when building backend expression
This will become safe as I'm going to add a static check that the expression
never contains CommitRef(_)s. We could make Present(_) unconstructible, but the
existence of the Present node is harmless.
2024-11-08 09:56:33 +09:00
Yuya Nishihara
e6ea88aac0 revset: add visitor-like tree rewriting function, reimplement symbol resolution
I'm going to add RevsetExpression<State> type parameter, but the existing tree
transformer can't rewrite nodes to different state because the input and the
output must be of the same type. (If they were of different types, we couldn't
reuse the input subtree by Rc::clone().) The added visitor API will handle
state transitions by mapping RevsetExpression::<St1>::<Kind> to
RevsetExpression::<St2>::<Kind>.

CommitRef and AtOperation nodes are processed by specialized methods because
these nodes will depend on the State type. OTOH, Present node won't be
State-dependent, so it's inspected by the common fold_expression() method.

An input expression is not taken as an &Rc<RevsetExpression> but a &_ because
we can't reuse the allocation behind the Rc.
2024-11-08 09:56:33 +09:00
Yuya Nishihara
78d68f98f5 revset: group RevsetExpression constructors by resolved/user/generic
They'll become different in State types.
2024-11-08 09:56:33 +09:00
Yuya Nishihara
ba76299818 tests: use platform path separator in symlink content
Appears that this was the reason why we got the error "The filename, directory
name, or volume label syntax is incorrect" on Windows CI.
2024-11-07 13:38:04 +09:00
Yuya Nishihara
adef815d1d tests: try both DOS and hashed NT short file names
For some unknown reason, a hashed 8.3 file name is chosen for ".jj" on GitHub
CI. The hashed ".git" short name is also added for consistency.
2024-11-07 13:38:04 +09:00
Yuya Nishihara
dedab69eaa local_working_copy: lstat() path to test file existence if creation failed
Appears that file creation fails for other unknown reasons on Windows CI.
2024-11-07 13:38:04 +09:00
Martin von Zweigbergk
c697ee7d80 tests: work around codespell suggesting dows->does 2024-11-07 13:38:04 +09:00
Yuya Nishihara
ded48ff6e7 local_working_copy: do not create file or write in directory named .jj or .git
I originally considered adding a deny-list-based implementation, but the Windows
compatibility rules are super confusing and I don't have a machine to find out
possible aliases. This patch instead adds directory equivalence tests.

In order to test file entity equivalence, we first need to create a file or
directory of the requested name. It's harmless to create an empty .jj or .git
directory, but materializing a .git file or symlink can temporarily set up an
RCE situation. That's why a new empty file is created to test the path validity.
We might want to add some optimization for safe names (e.g. ASCII, not
containing "git" or "jj", not containing "~", etc.)

That being said, I'm not quite sure whether .git/.jj in a subdirectory must be
checked. It's not safe to cd into the directory and run "jj", but the same
thing can be said of other tools such as "cargo". Perhaps our minimum
requirement is to protect our metadata (= the root .jj and .git) directories.

Despite the crate name (and internal use of std::fs::File),
same_file::is_same_file() can test equivalence of directories. This is
documented and tested, so I've removed my custom implementation, which was
slightly simpler but lacked Windows support.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
eaafde7119 cargo: add same-file dependency 2024-11-06 15:03:41 -08:00
Yuya Nishihara
f10c5db739 local_working_copy: skip existing symlinks consistently
If a new file would overwrite an existing regular file, the file path is
skipped. It makes sense to apply the same rule to existing symlinks. Without
this patch, checkout would fail if an existing path was a dead symlink or a
symlink to a directory.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
24ccfda781 local_working_copy: do not try to remove old file traversing symlinks
I'm not sure if this was attackable before, but it should be better to not
try to remove file across symlinks.

The disk_path is now returned from create_parent_dirs() to clarify that the
path is identical.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
8540536ea2 local_working_copy: detect error of file removal earlier
This should be safer than relying on file open error. It's scary to continue
processing if the file was a symlink.

I'll add a few more sanity checks to remove_old_file(), so it's extracted as a
function.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
1c30f3b3e8 repo_path: reject invalid path components by to_fs_path/name()
This addresses a simple path traversal attack.

I don't have a Windows machine, so the added Windows tests aren't checked
locally.
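
The kind of check involved can be sketched with the standard library alone. This is an assumption-laden illustration, not the jj-lib code: `to_fs_path_checked` here is a hypothetical free function taking a `/`-separated repository path, and it rejects any name that the platform would not parse as exactly one normal path component (for example `..`, `.`, an empty name, or a name containing a platform separator).

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `repo_path` (separated by '/') onto `base`, returning None if any
/// name is not a single normal component on the current platform.
/// Hypothetical helper for illustration only.
fn to_fs_path_checked(base: &Path, repo_path: &str) -> Option<PathBuf> {
    let mut result = base.to_path_buf();
    if repo_path.is_empty() {
        return Some(result); // the root path maps to the base directory
    }
    for name in repo_path.split('/') {
        let mut components = Path::new(name).components();
        match (components.next(), components.next()) {
            // Exactly one component, and it is a plain file or directory name.
            (Some(Component::Normal(_)), None) => result.push(name),
            _ => return None,
        }
    }
    Some(result)
}
```

With this sketch, `"src/lib.rs"` is accepted, while `"../etc/passwd"`, `"src/./lib.rs"` and `"src//lib.rs"` are rejected before anything touches the file system.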
2024-11-06 15:03:41 -08:00
Yuya Nishihara
739bf8decf repo_path: add stub for checked to_fs_path(), rename unchecked functions
I'm going to add a "checked" version of to_fs_path(), but not all callers can
be migrated to it. For example, an error message should be produced even if the
path is malformed.

This patch also adds error variants to propagate InvalidRepoPathError. They
don't use ::Other { .. } so the errors can be distinguished in tests.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
e819cec305 revset: inline resolve/evaluate_programmatic() in tests
I'm going to replace the current .evaluate_programmatic() which does minimal
commit-ref resolution. The new .evaluate_programmatic() will be implemented on
a "resolved" expression.
2024-11-06 09:45:09 +09:00
Yuya Nishihara
0a73245b82 revset: move RevsetCommitRef::Root to RevsetExpression
For the same reason as the previous patch. It's nice if root() is considered
a "resolved" expression. With this change, most of the evaluate_programmatic()
callers won't have to do symbol resolution at all.
2024-11-04 09:20:46 +09:00
Yuya Nishihara
0e8f1ce579 revset: move RevsetCommitRef::VisibleHeads to RevsetExpression
I'm going to add RevsetExpression<State = Resolved|User> type parameter to
detect API misuse at compile time. VisibleHeads is similar to All, and appears
in generic expression substitution function where a concrete State type
shouldn't be known.
2024-11-04 09:20:46 +09:00
Yuya Nishihara
e38f7b0594 revset: add RevsetExpression::present() as there's an external caller 2024-11-04 09:20:46 +09:00
Yuya Nishihara
a740eaeb86 revset: add convenient method that extracts symbol name from expression 2024-11-04 09:20:46 +09:00
Yuya Nishihara
12eb5c5515 revset: split "resolved" variant from RevsetExpression::AtOperation
This ensures that a symbol-resolved at_operation() expression won't be resolved
again when it's intersected with another expression, for example.

    # in CLI
    let expr1 = parse("at_operation(..)").resolve_user_symbol();
    # in library
    let expr2 = RevsetExpression::ancestors().intersection(&expr1);
    expr2.evaluate_programmatic()
2024-11-04 09:20:46 +09:00
Yuya Nishihara
f251f08ce7 diff: impl Clone, Debug for DiffHunkIterator
For consistency with the ranges iterator.
2024-11-02 10:09:10 +09:00
Yuya Nishihara
de2a8a579a diff: extract hunk_ranges() iterator
This will help construct file content based on diff hunks. For example, "jj
absorb" will first calculate annotation of the source parent (within mutable
ancestors), calculate diff, then "squash" hunks into ancestor commits of the
surrounding ranges.
2024-11-02 10:09:10 +09:00
Yuya Nishihara
9f1d2abd76 testutils: move global TestBackendData mapping to TestEnvironment
This unblocks the use of TestBackend in long-running processes such as fuzzer.
It should also be safer because TempDir doesn't guarantee that the path is never
reused.
2024-11-02 08:39:02 +09:00
Yuya Nishihara
7b5df93fe4 testutils: move default_store_factories() to TestEnvironment
It will capture the TestBackendData mapping.
2024-11-02 08:39:02 +09:00
Yuya Nishihara
d4786a3256 testutils: move load_repo_at_head() to TestEnvironment
It will depend on the TestBackendData mapping.
2024-11-02 08:39:02 +09:00
Yuya Nishihara
22f2393322 testutils: add stub TestEnvironment that will manage in-memory backend data
TestBackendData instances persist in memory right now, but they should be
discarded when the corresponding temp_dir gets dropped. The added struct will
manage the TestBackendData mapping.
2024-11-02 08:39:02 +09:00
Martin von Zweigbergk
30ab71d340 bookmarks: add support for git.auto-local-bookmark (to match docs)
We had documented that we support `git.auto-local-bookmark` but we
don't. The documentation has been incorrect since d9c68e08b1. This
patch fixes it by adding support for `git.auto-local-bookmark` with
fallback to the old/current `git.auto-local-branch`.
2024-10-30 08:01:02 -07:00
Yuya Nishihara
e464c0e607 annotate: rename AnnotateResults to FileAnnotation
The name "Results" was a bit misleading because Result<T, E> aliases are often
called FooResult.
2024-10-29 23:33:46 +09:00
Yuya Nishihara
ab10b7c0a0 annotate: do not collect result lines into Vec, return Iterator instead
We might want to calculate (commit_id, range) pairs of consecutive lines in
order to "absorb" changes, for example.

This should also be cheaper since Vec<u8> doesn't have to be allocated per line.
2024-10-29 23:33:46 +09:00
Yuya Nishihara
bd1024547d annotate: use sorted Vec<(usize, usize)> to propagate lines to ancestors
This isn't so complicated compared to the HashMap version, and we can handle
multiple (cur, orig1), (cur, orig2) pairs. It's also cheaper to access.
2024-10-29 14:57:57 +09:00
Yuya Nishihara
1fd628a0cf annotate: omit building intermediate same_line_map
Perhaps, get_same_line_map() could return an iterator, but implementing an
iterator to be "pull"-ed is much harder than writing a function to "push",
especially when lifetime is involved.
2024-10-29 14:57:57 +09:00
Yuya Nishihara
0eedc0cbae annotate: simply use Vec<_> for list of originating commit IDs
Since we're going to fill the whole list anyway, it doesn't make sense to keep
it as a sparse HashMap.
2024-10-29 14:57:57 +09:00
Yuya Nishihara
53af8a1fbc annotate: simplify condition when to exit early from process_commits() loop 2024-10-29 14:57:57 +09:00
Yuya Nishihara
89a0f46986 annotate: remove redundant .is_absent() test from get_file_contents()
Here we shouldn't care whether the file value is absent or a tree, for example.
2024-10-28 12:40:01 +09:00
Yuya Nishihara
d6026e46e9 annotate: inline process_files_in_commits()
The doc comment describes what the caller should do, not what the function would do.
2024-10-28 12:40:01 +09:00
Yuya Nishihara
db239536da annotate: inline mark_lines_from_original()
This function was short, and this change makes it clear that !.is_empty() was
redundant. Duplicated doc comment is also removed. I feel the inline comment is
easier to follow here.
2024-10-28 12:40:01 +09:00
Yuya Nishihara
52d842b8df annotate: use "let else" and "continue" where it makes sense 2024-10-28 12:40:01 +09:00
Yuya Nishihara
7b9c90d8e6 annotate: remove unneeded commit object lookup 2024-10-28 12:40:01 +09:00