71 commits

Author SHA1 Message Date
Yuya Nishihara
49999a7507 local_working_copy: on snapshot, ignore submodule in ignored directories
Fixes #5246
2025-01-08 09:39:59 +09:00
Yuya Nishihara
b8653989c1 tests: add convenient method to initialize TestWorkspace with test settings
Functions are renamed, and their arguments are reordered to be consistent with
the TestRepo API.
2025-01-06 22:37:33 +09:00
Yuya Nishihara
cff73841ed repo: remove &UserSettings argument from new/rewrite_commit(), use self.settings
2024-12-31 10:51:57 +09:00
Yuya Nishihara
14b52205fb repo: remove &UserSettings argument from start_transaction(), use self.settings 2024-12-31 10:51:57 +09:00
Yuya Nishihara
168c7979fe working_copy: on snapshot, warn new large files and continue
I think this provides a better UX than refusing any operation due to large
files. Because untracked files won't be overwritten, it's usually safe to
continue the operation and ignore the untracked files. One caveat is that new
large files can become tracked if files of the same name are checked out (see
the test case).

FWIW, the warning will be printed only once if watchman is enabled. If we use
the snapshot stats to print untracked paths in "jj status", this will be a
problem.

Closes #3616, #3912
2024-12-11 20:19:51 +09:00
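The "warn and continue" behavior described in 168c7979fe can be illustrated with a minimal sketch. It is not the jj implementation: `SnapshotStats`, `snapshot_new_files`, and `render_warnings` are hypothetical names, and file sizes are passed in directly instead of being read from disk. The point is only that oversized new files are collected for a warning rather than aborting the whole snapshot.

```rust
/// Hypothetical result of a snapshot: paths that were recorded, plus new
/// paths left untracked because they exceeded the size limit.
#[derive(Debug, Default, PartialEq)]
struct SnapshotStats {
    tracked: Vec<String>,
    untracked_too_large: Vec<(String, u64)>,
}

/// Classifies new files by size instead of failing at the first large file.
/// `files` holds (path, size in bytes) pairs; this is an assumption of the
/// sketch, a real snapshot would stat the working copy.
fn snapshot_new_files(files: &[(&str, u64)], max_size: u64) -> SnapshotStats {
    let mut stats = SnapshotStats::default();
    for &(path, size) in files {
        if size > max_size {
            // Remember the path so the caller can warn about it later.
            stats.untracked_too_large.push((path.to_owned(), size));
        } else {
            stats.tracked.push(path.to_owned());
        }
    }
    stats
}

/// Builds one warning line per skipped path; the operation itself goes on.
fn render_warnings(stats: &SnapshotStats, max_size: u64) -> Vec<String> {
    stats
        .untracked_too_large
        .iter()
        .map(|(path, size)| {
            format!("Warning: {path} ({size} bytes) exceeds the {max_size}-byte limit and stays untracked")
        })
        .collect()
}
```

Returning the skipped paths to the caller, rather than printing inside the snapshot code, mirrors the "plumbing to propagate untracked paths to caller" idea in the commit that follows.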
Yuya Nishihara
f4fdc19d9e working_copy: plumbing to propagate untracked paths to caller 2024-12-11 20:19:51 +09:00
Yuya Nishihara
8caec186c1 local_working_copy: filter deleted files per directory (or job)
This greatly reduces the number of paths to be sent over the channel and the
strings to be hashed.

Benchmark:
1. original (omitted)
2. per-directory spawn (previous patch)
3. per-directory deleted files (this patch)
4. shorter path comparison (omitted)

gecko-dev (~357k files, ~25k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/gecko-dev debug snapshot
  Time (mean ± σ):     710.7 ms ±   9.1 ms    [User: 3070.7 ms, System: 2142.6 ms]
  Range (min … max):   695.9 ms … 740.1 ms    30 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/gecko-dev debug snapshot
  Time (mean ± σ):     480.1 ms ±   8.8 ms    [User: 3190.5 ms, System: 2127.2 ms]
  Range (min … max):   471.2 ms … 509.8 ms    30 runs

Relative speed comparison
        1.76 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/gecko-dev debug snapshot
        1.19 ±  0.03  target/release-with-debug/jj-3 -R ~/mirrors/gecko-dev debug snapshot
```

linux (~87k files, ~6k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/linux debug snapshot
  Time (mean ± σ):     242.3 ms ±   3.3 ms    [User: 656.8 ms, System: 538.0 ms]
  Range (min … max):   236.9 ms … 252.3 ms    30 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/linux debug snapshot
  Time (mean ± σ):     204.2 ms ±   3.0 ms    [User: 667.3 ms, System: 545.6 ms]
  Range (min … max):   197.1 ms … 209.2 ms    30 runs

Relative speed comparison
        1.27 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/linux debug snapshot
        1.07 ±  0.02  target/release-with-debug/jj-3 -R ~/mirrors/linux debug snapshot
```

nixpkgs (~45k files, ~31k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/nixpkgs debug snapshot
  Time (mean ± σ):     190.7 ms ±   4.1 ms    [User: 859.3 ms, System: 881.1 ms]
  Range (min … max):   184.6 ms … 202.4 ms    30 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/nixpkgs debug snapshot
  Time (mean ± σ):     173.3 ms ±   6.7 ms    [User: 899.4 ms, System: 889.0 ms]
  Range (min … max):   166.5 ms … 197.9 ms    30 runs

Relative speed comparison
        1.18 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/nixpkgs debug snapshot
        1.07 ±  0.04  target/release-with-debug/jj-3 -R ~/mirrors/nixpkgs debug snapshot
```

git (~4.5k files, 0.2k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 30 --runs 50 ..
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/git debug snapshot
  Time (mean ± σ):      30.6 ms ±   1.1 ms    [User: 33.8 ms, System: 39.0 ms]
  Range (min … max):    29.0 ms …  35.0 ms    50 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/git debug snapshot
  Time (mean ± σ):      28.8 ms ±   1.0 ms    [User: 33.0 ms, System: 37.6 ms]
  Range (min … max):    26.8 ms …  31.3 ms    50 runs

Relative speed comparison
        1.06 ±  0.05  target/release-with-debug/jj-2 -R ~/mirrors/git debug snapshot
        1.00          target/release-with-debug/jj-3 -R ~/mirrors/git debug snapshot
```
2024-12-10 10:51:04 +09:00
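As a rough illustration of the per-directory filtering in 8caec186c1, the sketch below computes the deleted entries of each directory locally, so only deleted paths (not every present path) need to be reported upward. The function names and the in-memory sets are assumptions made for the example; the real code works on tracked file states and directory scans.

```rust
use std::collections::BTreeSet;

/// Returns the tracked entries of one directory that are no longer on disk.
fn deleted_in_directory(tracked: &BTreeSet<&str>, present: &BTreeSet<&str>) -> Vec<String> {
    tracked
        .difference(present)
        .map(|name| (*name).to_owned())
        .collect()
}

/// Runs the per-directory filter for several directories and joins each
/// deleted name with its directory. Each tuple is (dir, tracked, present).
fn deleted_paths(dirs: &[(&str, BTreeSet<&str>, BTreeSet<&str>)]) -> Vec<String> {
    dirs.iter()
        .flat_map(|(dir, tracked, present)| {
            deleted_in_directory(tracked, present)
                .into_iter()
                .map(move |name| format!("{dir}/{name}"))
        })
        .collect()
}
```

Because the set difference is taken inside each directory, the data that crosses job boundaries is proportional to the number of deletions rather than to the number of files.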
Yuya Nishihara
99d8703d3b local_working_copy: spawn snapshot job per directory with file count threshold
This change basically means two things:
 a. a directory scan isn't split into too many small jobs, and
 b. a directory scan isn't blocked by recursive visit_directory() calls.
Before, small jobs were created at each recursion depth, so there were idle
time slices before these jobs started producing work.

I don't know if this mitigates issue #4508, but it's slightly faster on my
Linux machine.

matcher.visit(dir) is moved to caller because it's silly to spawn an empty job.
TreeState::snapshot() already checks that for the root path.

Benchmark:
1. original
2. per-directory spawn (this patch)
3. per-directory deleted files (omitted)
4. shorter path comparison (omitted)

gecko-dev (~357k files, ~25k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/gecko-dev debug snapshot
  Time (mean ± σ):     764.9 ms ±  16.7 ms    [User: 3274.7 ms, System: 2183.3 ms]
  Range (min … max):   731.9 ms … 814.2 ms    30 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/gecko-dev debug snapshot
  Time (mean ± σ):     710.7 ms ±   9.1 ms    [User: 3070.7 ms, System: 2142.6 ms]
  Range (min … max):   695.9 ms … 740.1 ms    30 runs

Relative speed comparison
        1.89 ±  0.05  target/release-with-debug/jj-1 -R ~/mirrors/gecko-dev debug snapshot
        1.76 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/gecko-dev debug snapshot
```

linux (~87k files, ~6k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/linux debug snapshot
  Time (mean ± σ):     268.2 ms ±  11.3 ms    [User: 636.6 ms, System: 518.5 ms]
  Range (min … max):   247.5 ms … 295.2 ms    30 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/linux debug snapshot
  Time (mean ± σ):     242.3 ms ±   3.3 ms    [User: 656.8 ms, System: 538.0 ms]
  Range (min … max):   236.9 ms … 252.3 ms    30 runs

Relative speed comparison
        1.40 ±  0.06  target/release-with-debug/jj-1 -R ~/mirrors/linux debug snapshot
        1.27 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/linux debug snapshot
```

nixpkgs (~45k files, ~31k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 3 --runs 30 ..
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/nixpkgs debug snapshot
  Time (mean ± σ):     201.0 ms ±   8.5 ms    [User: 929.3 ms, System: 917.6 ms]
  Range (min … max):   170.3 ms … 218.5 ms    30 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/nixpkgs debug snapshot
  Time (mean ± σ):     190.7 ms ±   4.1 ms    [User: 859.3 ms, System: 881.1 ms]
  Range (min … max):   184.6 ms … 202.4 ms    30 runs

Relative speed comparison
        1.24 ±  0.06  target/release-with-debug/jj-1 -R ~/mirrors/nixpkgs debug snapshot
        1.18 ±  0.03  target/release-with-debug/jj-2 -R ~/mirrors/nixpkgs debug snapshot
```

git (~4.5k files, 0.2k dirs)
```
% JJ_CONFIG=/dev/null hyperfine --sort command --warmup 30 --runs 50 ..
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/git debug snapshot
  Time (mean ± σ):      30.3 ms ±   1.1 ms    [User: 40.5 ms, System: 39.4 ms]
  Range (min … max):    28.3 ms …  35.7 ms    50 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/git debug snapshot
  Time (mean ± σ):      30.6 ms ±   1.1 ms    [User: 33.8 ms, System: 39.0 ms]
  Range (min … max):    29.0 ms …  35.0 ms    50 runs

Relative speed comparison
        1.05 ±  0.05  target/release-with-debug/jj-1 -R ~/mirrors/git debug snapshot
        1.06 ±  0.05  target/release-with-debug/jj-2 -R ~/mirrors/git debug snapshot
```

- CPU: 8-core AMD Ryzen 7 PRO 4750U with Radeon Graphics (-MT MCP-)
- speed/min/max: 1600/1400/1700 MHz Kernel: 6.11.10-amd64 x86_64
- Filesystem: ext4
2024-12-10 10:51:04 +09:00
Martin von Zweigbergk
409be2e1c4 store: make get_tree() functions take owned repo path
The function needs an owned value, so we might as well pass it one and
avoid a few clone calls.
2024-11-27 18:53:28 -08:00
Scott Taylor
e5cb9f94f6 conflicts: add "ui.conflict-marker-style" config
Adds a new "ui.conflict-marker-style" config option. The "diff" option
is the default jj-style conflict markers with a snapshot and a series of
diffs to apply to the snapshot. New conflict marker style options will
be added in later commits.

The majority of the changes in this commit are from passing the config
option down to the code that materializes the conflicts.

Example of "diff" conflict markers:

```
<<<<<<< Conflict 1 of 1
+++++++ Contents of side #1
fn example(word: String) {
    println!("word is {word}");
%%%%%%% Changes from base to side #2
-fn example(w: String) {
+fn example(w: &str) {
     println!("word is {w}");
>>>>>>> Conflict 1 of 1 ends
}
```
2024-11-23 08:28:47 -06:00
Martin von Zweigbergk
de6da1a088 transaction: propagate errors from commit() 2024-11-13 23:05:24 -08:00
Yuya Nishihara
062a1bceb9 local_working_copy: on check out, skip entries conflicting with untracked dirs
This seems more consistent because file->directory conflicts are skipped.
2024-11-12 16:12:12 +09:00
Yuya Nishihara
f3a75c5c46 local_working_copy: on check out, ignore diff of Git submodule ids
This is different from skipped paths because the file state has to remain as
FileType::GitSubmodule in order to ignore the submodule directory when
snapshotting.

Fixes #4825.
2024-11-12 16:12:12 +09:00
Yuya Nishihara
4983db563f local_working_copy: migrate Git submodule test to MergedTreeBuilder
I also removed tx.commit() because the test doesn't rely on the committed
operation.
2024-11-12 16:12:12 +09:00
Yuya Nishihara
ba76299818 tests: use platform path separator in symlink content
It appears that this was the reason why we got the error "The filename,
directory name, or volume label syntax is incorrect" on Windows CI.
2024-11-07 13:38:04 +09:00
Yuya Nishihara
adef815d1d tests: try both DOS and hashed NT short file names
For some unknown reason, a hashed 8.3 file name is chosen for ".jj" on GitHub
CI. A hashed ".git" short name is also added for consistency.
2024-11-07 13:38:04 +09:00
Yuya Nishihara
ded48ff6e7 local_working_copy: do not create file or write in directory named .jj or .git
I originally considered adding deny-list-based implementation, but the Windows
compatibility rules are super confusing and I don't have a machine to find out
possible aliases. This patch instead adds directory equivalence tests.

In order to test file entity equivalence, we first need to create a file or
directory of the requested name. It's harmless to create an empty .jj or .git
directory, but materializing a .git file or symlink can temporarily set up an
RCE situation. That's why a new empty file is created to test the path validity.
We might want to add some optimization for safe names (e.g. ASCII, not
containing "git" or "jj", not containing "~", etc.)

That being said, I'm not quite sure whether .git/.jj in a subdirectory must be
checked. It's not safe to cd into the directory and run "jj", but the same
thing can be said of other tools such as "cargo". Perhaps our minimum
requirement is to protect our metadata (= the root .jj and .git) directories.

Despite the crate name (and internal use of std::fs::File),
same_file::is_same_file() can test equivalence of directories. This is
documented and tested, so I've removed my custom implementation, which was
slightly simpler but lacked Windows support.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
f10c5db739 local_working_copy: skip existing symlinks consistently
If a new file would overwrite an existing regular file, the file path is
skipped. It makes sense to apply the same rule to existing symlinks. Without
this patch, check out would fail if an existing path was a dead symlink or a
symlink to a directory.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
24ccfda781 local_working_copy: do not try to remove old file traversing symlinks
I'm not sure if this was attackable before, but it should be better to not
try to remove files across symlinks.

The disk_path is now returned from create_parent_dirs() to clarify that the
path is identical.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
8540536ea2 local_working_copy: detect error of file removal earlier
This should be safer than relying on file open error. It's scary to continue
processing if the file was a symlink.

I'll add a few more sanity checks to remove_old_file(), so it's extracted as a
function.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
1c30f3b3e8 repo_path: reject invalid path components by to_fs_path/name()
This addresses a simple path traversal attack.

I don't have a Windows machine, so the added Windows tests aren't checked
locally.
2024-11-06 15:03:41 -08:00
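To make the path traversal concern in 1c30f3b3e8 concrete, here is a minimal sketch of a checked conversion from a '/'-separated repository path to a filesystem path. `checked_fs_name`, `checked_fs_path`, and `InvalidPathComponent` are hypothetical names for this example, not the jj API; the check relies on `std::path::Component` so that the platform's own parsing rules decide what counts as a single normal component.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical error carrying the rejected component.
#[derive(Debug, PartialEq)]
struct InvalidPathComponent(String);

/// Accepts `name` only if the platform parses it as exactly one normal
/// component that round-trips unchanged. This rejects "", ".", "..", and
/// names containing a path separator.
fn checked_fs_name(name: &str) -> Result<&str, InvalidPathComponent> {
    let mut components = Path::new(name).components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(parsed)), None) if parsed == std::ffi::OsStr::new(name) => Ok(name),
        _ => Err(InvalidPathComponent(name.to_owned())),
    }
}

/// Joins a '/'-separated repo path onto `base`, validating every component.
/// In this sketch the empty (root) path is rejected as well.
fn checked_fs_path(base: &Path, repo_path: &str) -> Result<PathBuf, InvalidPathComponent> {
    let mut result = base.to_path_buf();
    for name in repo_path.split('/') {
        result.push(checked_fs_name(name)?);
    }
    Ok(result)
}
```

With this shape, a repository entry such as `../etc/passwd` produces an error before any filesystem call is made, and callers that only need a path for an error message can keep using an unchecked conversion.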
Yuya Nishihara
739bf8decf repo_path: add stub for checked to_fs_path(), rename unchecked functions
I'm going to add a "checked" version of to_fs_path(), but not all callers can
be migrated to it. For example, an error message should be produced even if the
path is malformed.

This patch also adds error variants to propagate InvalidRepoPathError. They
don't use ::Other { .. } so the errors can be distinguished in tests.
2024-11-06 15:03:41 -08:00
Yuya Nishihara
7b5df93fe4 testutils: move default_store_factories() to TestEnvironment
It will capture the TestBackendData mapping.
2024-11-02 08:39:02 +09:00
Samuel Tardieu
12f4d6d17b style: avoid using .to_owned()/.to_vec() on owned objects
`.clone()` is more explicit when we already have an object
of the right type.
2024-10-04 22:29:13 +02:00
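The style rule from 12f4d6d17b can be shown with a small sketch. The `Config` type and both functions are made up for illustration; the contrast is between duplicating a value that is already owned and converting from a borrowed form.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Config {
    name: String,
    tags: Vec<String>,
}

/// The fields are already `String` and `Vec<String>`, so `.clone()` states
/// exactly what happens. `.to_owned()` / `.to_vec()` would produce the same
/// values but read like a conversion from a borrowed type.
fn copies(config: &Config) -> (String, Vec<String>) {
    (config.name.clone(), config.tags.clone())
}

/// Here the inputs are a borrowed `str` and a slice, so the converting
/// methods are the natural choice.
fn from_borrowed(name: &str, tags: &[String]) -> Config {
    Config {
        name: name.to_owned(),
        tags: tags.to_vec(),
    }
}
```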
Yuya Nishihara
653e8087da workspace: make workspace_root() and repo_path() return slice &Path
It's common to return &PathBuf as &Path.
2024-09-08 05:40:52 +09:00
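A minimal sketch of the accessor change in 653e8087da, assuming a simplified `Workspace` struct with two `PathBuf` fields (the field layout is an assumption of the example). Returning `&Path` hides the owned storage type, and deref coercion makes the body a plain borrow.

```rust
use std::path::{Path, PathBuf};

struct Workspace {
    workspace_root: PathBuf,
    repo_path: PathBuf,
}

impl Workspace {
    /// `&PathBuf` coerces to `&Path`, so callers see only the slice type.
    fn workspace_root(&self) -> &Path {
        &self.workspace_root
    }

    fn repo_path(&self) -> &Path {
        &self.repo_path
    }
}
```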
Yuya Nishihara
47307556dd working_copy: pass SnapshotOptions by reference
Though SnapshotOptions can be cheaply cloned, it doesn't make much sense that
snapshot() consumes a settings-like object.
2024-09-08 04:51:21 +09:00
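The by-reference signature from 47307556dd might look like the sketch below. The `max_new_file_size` field and the free function `snapshot` are assumptions for illustration; the real `SnapshotOptions` has more fields and `snapshot()` is a method on the working copy.

```rust
#[derive(Clone, Debug)]
struct SnapshotOptions {
    // Hypothetical field: size limit in bytes for newly tracked files.
    max_new_file_size: u64,
}

/// Borrows the options, so the caller can reuse them for further snapshots.
/// Returns how many of the given file sizes are within the limit.
fn snapshot(options: &SnapshotOptions, new_file_sizes: &[u64]) -> usize {
    new_file_sizes
        .iter()
        .filter(|&&size| size <= options.max_new_file_size)
        .count()
}
```

Since the options are only read, borrowing avoids a clone at every call site without changing behavior.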
Martin von Zweigbergk
b22d8fefd9 local_working_copy: pass max file size to snapshot directly
We were passing the max file size for snapshotting to
`WorkingCopy::snapshot()` via `UserSettings`. It's simpler and more
flexible to set it on `SnapshotOptions` instead.
2024-09-07 11:33:05 -07:00
Martin von Zweigbergk
8d090628c3 transaction: rename mut_repo() to idiomatic repo_mut()
We had both `repo()` and `mut_repo()` on `Transaction` and I think it
was easy to get confused and think that the former returned a
`&ReadonlyRepo` but both of them actually return a reference to
`MutableRepo` (the latter obviously returns a mutable reference). I
hope that renaming to the more idiomatic `repo_mut()` will help
clarify.

We could instead have renamed them to `mut_repo()` and
`mut_repo_mut()` but that seemed unnecessarily long. It would better
match the `mut_repo` variables we typically use, though.
2024-09-07 10:51:43 -07:00
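The naming discussed in 8d090628c3 follows the usual Rust convention of pairing `x()` with `x_mut()`. The sketch below uses stand-in types (a `MutableRepo` with a single `heads` field is an assumption of the example) to show that both accessors expose the same in-progress repo, differing only in mutability.

```rust
#[derive(Debug, Default)]
struct MutableRepo {
    // Placeholder state for the sketch.
    heads: Vec<String>,
}

struct Transaction {
    mut_repo: MutableRepo,
}

impl Transaction {
    /// Shared reference to the repo being modified by this transaction.
    fn repo(&self) -> &MutableRepo {
        &self.mut_repo
    }

    /// Mutable reference to the same repo, named with the `_mut` suffix.
    fn repo_mut(&mut self) -> &mut MutableRepo {
        &mut self.mut_repo
    }
}
```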
Martin von Zweigbergk
bc06b2a442 store: make write_symlink() async 2024-09-04 18:34:11 -07:00
Matt Kulukundis
8ead72e99f formatting only: switch to Item level import granularity 2024-08-22 14:52:54 -04:00
Yuya Nishihara
37c41d0eaf tests: do not pass in commit objects loaded from different store
Otherwise the assertion would fail in the next patch.
2024-08-08 23:05:37 +09:00
Martin von Zweigbergk
352ca72314 tests: make helpers create non-legacy trees
Extracted and modified from #3746 by @ilyagr.
2024-07-24 14:33:05 +02:00
Matt Kulukundis
8aa71f58f3 feat: add an option to monitor the filesystem asynchronously
- make an internal set of watchman extensions until the client api gets
  updates with triggers
- add a config option to enable using triggers in watchman

Co-authored-by: Waleed Khan <me@waleedkhan.name>
2024-06-16 23:24:22 -04:00
Benjamin Tan
716ec37560 test_local_working_copy: add test for snapshotting of edited materialized simplified conflict 2024-06-15 06:05:06 +08:00
Benjamin Tan
9be33724dc conflicts: materialize simplified file conflicts 2024-06-15 06:05:06 +08:00
Benjamin Tan
f74991c2e1 tests: add tests showing that individual file conflicts are not simplified/deduplicated 2024-06-15 06:05:06 +08:00
Martin von Zweigbergk
404f31cbc1 backend: add error variant for access denied, handle when diffing
Some backends, like the one we have at Google, can restrict access to
certain files. For such files, if they return a regular
`BackendError::ReadObject`, then that will terminate iteration in many
cases (e.g. when diffing or listing files). This patch adds a new
error variant for them to return instead, plus handling of such errors
in diff output and in the working copy.

In order to test the feature, I added a new commit backend that
returns the new `ReadAccessDenied` error when the caller tries to read
certain objects.
2024-05-30 18:27:38 -07:00
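The error handling described in 404f31cbc1 can be sketched as follows. The two variant names come from the commit message, but their payloads, the `read_file` stub, and `list_contents` are assumptions made for this example. The idea shown is that an access-denied error is rendered in place and iteration continues, while any other read error still stops it.

```rust
#[derive(Debug, PartialEq)]
enum BackendError {
    ReadObject(String),
    ReadAccessDenied(String),
}

/// Stub backend for the sketch: paths under "secret/" are denied and paths
/// ending in ".missing" fail with a regular read error.
fn read_file(path: &str) -> Result<String, BackendError> {
    if path.starts_with("secret/") {
        Err(BackendError::ReadAccessDenied(path.to_owned()))
    } else if path.ends_with(".missing") {
        Err(BackendError::ReadObject(path.to_owned()))
    } else {
        Ok(format!("contents of {path}"))
    }
}

/// Lists file contents, replacing denied files with a placeholder line.
fn list_contents(paths: &[&str]) -> Result<Vec<String>, BackendError> {
    let mut lines = Vec::new();
    for &path in paths {
        match read_file(path) {
            Ok(contents) => lines.push(contents),
            // Access denied is reported inline and does not end the listing.
            Err(BackendError::ReadAccessDenied(denied)) => {
                lines.push(format!("access denied to {denied}"));
            }
            // Every other error is still propagated to the caller.
            Err(err) => return Err(err),
        }
    }
    Ok(lines)
}
```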
Martin von Zweigbergk
07bb1d81b7 tree_builder: propagate errors from write_tree() 2024-05-22 06:46:38 -07:00
Thomas Castiglione
59d3a2c866 local_working_copy: when all sides of a conflict are executable, materialise the conflicted file as executable
Fixes #3579 and adds a testcase for an executable conflict treevalue.
2024-05-21 14:37:17 +08:00
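The rule from 59d3a2c866 reduces to an "all sides" check. The sketch below assumes every side of the conflict is a regular file described by a hypothetical `FileSide` struct; how absent sides are treated is not covered here.

```rust
#[derive(Clone, Copy, Debug)]
struct FileSide {
    executable: bool,
}

/// The materialized conflict file is executable only if every side is.
/// An empty slice is treated as non-executable in this sketch.
fn materialized_executable(sides: &[FileSide]) -> bool {
    !sides.is_empty() && sides.iter().all(|side| side.executable)
}
```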
Martin von Zweigbergk
0d1ff8a150 merged_tree: propagate errors from TreeEntriesIterator
We shouldn't panic if we fail to read a tree from the backend.
2024-05-01 06:10:08 -07:00
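One common way to propagate read errors from an iterator, as 0d1ff8a150 describes, is to make the item type a `Result`. The sketch below is not the `TreeEntriesIterator` code: `read_tree` is a stub backend, tree ids are plain integers, and `ReadTreeError` is a hypothetical error type.

```rust
#[derive(Debug, PartialEq)]
struct ReadTreeError(String);

/// Stub backend: only trees 1 and 2 exist.
fn read_tree(id: u32) -> Result<Vec<&'static str>, ReadTreeError> {
    match id {
        1 => Ok(vec!["a", "b"]),
        2 => Ok(vec!["c"]),
        _ => Err(ReadTreeError(format!("tree {id} not found"))),
    }
}

/// Yields each entry as a `Result`, so a failed read shows up as an `Err`
/// item instead of a panic inside the iterator.
fn entries(tree_ids: &[u32]) -> impl Iterator<Item = Result<&'static str, ReadTreeError>> + '_ {
    tree_ids
        .iter()
        .flat_map(|&id| -> Vec<Result<&'static str, ReadTreeError>> {
            match read_tree(id) {
                Ok(names) => names.into_iter().map(Ok).collect(),
                Err(err) => vec![Err(err)],
            }
        })
}
```

Callers can then stop at the first failure with `collect::<Result<Vec<_>, _>>()` or handle each item individually.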
Thomas Castiglione
d661f59f9d working_copy: implement symlinks on windows with a helper function
enables symlink tests on windows, ignoring failures due to disabled developer mode,
and updates windows.md
2024-03-05 15:16:38 +08:00
Austin Seipp
6c31bab0d3 fsmonitor: allow core.fsmonitor = "none" to disable
When doing things like testing snapshot performance differences,
this allows you to turn off the monitor regardless of what the
user or repository configuration enables, e.g.

    jj st --config-toml='core.fsmonitor="none"'

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-02-20 20:19:47 -06:00
Ilya Grigoriev
a9c3af8153 test_local_working_copy: use std::fs::write instead of OpenOptions 2024-02-10 16:06:28 -08:00
Ilya Grigoriev
b2e37d448b clippy: add truncate option as suggested by clippy
In the next commit, I replace the whole thing with
std::fs::write, but I'll leave this here in case
the next commit is somehow incorrect.
2024-02-10 16:06:28 -08:00
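The two commits above (b2e37d448b and a9c3af8153) correspond to two standard-library ways of overwriting a file. The sketch below shows both side by side; the function names are hypothetical, and the actual test code being changed is not reproduced here.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Explicit form: create the file if needed and truncate any old contents.
/// Without `.truncate(true)`, writing shorter data would leave the tail of
/// the previous contents in place.
fn overwrite_with_open_options(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.write_all(contents)
}

/// Short form: `std::fs::write` creates the file if it does not exist and
/// replaces its contents entirely if it does.
fn overwrite_with_fs_write(path: &Path, contents: &[u8]) -> io::Result<()> {
    fs::write(path, contents)
}
```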
Austin Seipp
5b517b542e rust: bump MSRV to 1.76.0
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-02-09 15:48:01 -06:00
Martin von Zweigbergk
6c1aeff7a9 working copy: materialize symlinks on Windows as regular files
I was a bit surprised to learn (or be reminded?) that checking out
symlinks on Windows leads to a panic. This patch fixes the crash by
materializing symlinks from the repo as regular files. It also updates
the snapshotting code so we preserve the symlink-ness of a path. The
user can update the symlink in the repo by updating the regular file
in the working copy. This seems to match Git's behavior on Windows
when symlinks are disabled.
2024-02-09 09:20:24 -08:00
Martin von Zweigbergk
b343289238 working_copy: make reset() take a commit instead of a tree
Our virtual file system at Google (CitC) would like to know the commit
so it can scan backwards and find the closest mainline tree based on
it. Since we always record an operation id (which resolves to a
working-copy commit) when we write the working-copy state, it doesn't
seem like a restriction to require a commit.
2024-02-06 12:41:09 -08:00
Yuya Nishihara
5a7d8ac596 working_copy: don't follow symlinks when visiting files in gitignored directory
Fixes #2878
2024-01-24 16:38:48 +09:00
Yuya Nishihara
d0d4496258 tests: add executable files and symlinks to gitignored directory test 2024-01-24 16:38:48 +09:00
Yuya Nishihara
95d83cbfe5 object_id: make ObjectId constructors non-trait methods
I'm going to add try_from_hex(), which requires Self: Sized. Such trait bound
could be added, but I don't think we'll need abstracted ObjectId constructors
at all.
2024-01-05 23:36:57 +09:00
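The design choice in 95d83cbfe5 can be sketched like this: accessors stay on the trait, which remains usable as `dyn ObjectId`, while constructors that return `Self` become inherent methods on each concrete id type. The `CommitId` layout and the method bodies are assumptions for the example; only the names `ObjectId` and `try_from_hex()` come from the commit message.

```rust
trait ObjectId {
    fn as_bytes(&self) -> &[u8];

    /// Provided method: lowercase hex encoding of the id.
    fn hex(&self) -> String {
        self.as_bytes().iter().map(|b| format!("{b:02x}")).collect()
    }
}

#[derive(Debug, PartialEq)]
struct CommitId(Vec<u8>);

impl CommitId {
    // Inherent constructors: they return `Self`, so keeping them off the
    // trait avoids a `Self: Sized` bound there.
    fn from_bytes(bytes: &[u8]) -> Self {
        CommitId(bytes.to_vec())
    }

    /// Parses an even-length hex string; returns `None` for anything else.
    fn try_from_hex(hex: &str) -> Option<Self> {
        if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let bytes = (0..hex.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
            .collect::<Option<Vec<u8>>>()?;
        Some(CommitId(bytes))
    }
}

impl ObjectId for CommitId {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// Works with any id type through a trait object.
fn describe(id: &dyn ObjectId) -> String {
    id.hex()
}
```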