Commit graph

203 commits

Author SHA1 Message Date
Martin von Zweigbergk
0d1ec835c1 repo: rename .jj/repo/store/backend to .jj/repo/store/type
We decided to call the files identifying the backend type `type`. We
already use that name for `OpStore` and `OpHeadsStore`.
2023-01-25 09:22:38 -08:00
Yuya Nishihara
c018ef229b repo: proxy shortest unique prefix function through RepoRef
Since this function depends on both index and view, it can't be moved to
one of the storage objects. If we go forward with this approach, some
revset::resolve_*() functions will also be migrated to RepoRef.

This patch slightly changes the function name since a "prefix" might have
various meanings.
2023-01-25 10:47:39 +09:00
Yuya Nishihara
c0c5e8f041 repo: rewrite "all()" query to clarify data dependency 2023-01-25 10:47:39 +09:00
Martin von Zweigbergk
ce094c618b repo: propagate error when current working-copy commit is not found
This should fix the panic in the case reported in #1107. It's a bit
hard to reproduce because we normally notice the missing commit when
we snapshot the working copy, but it's possible to reproduce it using
`--no-commit-working-copy`.

I suspect the added test is too brittle because it checks the exact
error message. On the other hand, it might be useful to have one test
case like this so we catch accidental changes in the format.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
63aa484046 repo: add a specific error type for MutableRepo::check_out() 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
eb7de6dd3c repo: inline leave_commit() into single caller 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
4777508df0 repo: make check_out() call edit()
This reduces duplication a little, and it makes logical sense.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
dd3472924b repo: add a specific error type for MutableRepo::edit()
The new type is just an enum version of `RewriteRootCommit`.  I'll add
another variant soon.
2023-01-24 12:20:28 -08:00
Yuya Nishihara
c82a62cf99 repo: turn IdIndex into sorted Vec, use binary search
Since IdIndex is immutable, we don't need fast insertion provided by BTreeMap.
Let's simply use Vec for some speed up. More importantly, this allows us to
store multiple (ChangeId, CommitId) pairs for the same change id, and will
unblock the use of IdIndex in revset::resolve_symbol().

Some benchmark numbers (against my "linux" repo) follow.

Command:
    hyperfine --warmup 3 "jj log -r master \
      -T 'commit_id.short_prefix_and_brackets()' \
      --no-commit-working-copy --no-graph"

Original:
    Time (mean ± σ):      1.892 s ±  0.031 s    [User: 1.800 s, System: 0.092 s]
    Range (min … max):    1.833 s …  1.935 s    10 runs

This commit:
    Time (mean ± σ):     867.5 ms ±   2.7 ms    [User: 809.9 ms, System: 57.7 ms]
    Range (min … max):   862.3 ms … 871.0 ms    10 runs
2023-01-23 07:38:04 +09:00
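The idea in c82a62cf99 (an immutable index kept as a sorted `Vec` and queried by binary search, with several entries allowed per key) can be illustrated with a minimal sketch. The type `SortedIdIndex`, its methods, and the `u32` payload are hypothetical stand-ins, not the actual `IdIndex` code.

```rust
/// Hypothetical stand-in for an immutable id index: (key, value) pairs kept
/// in a Vec sorted by key and queried with binary search.
pub struct SortedIdIndex {
    entries: Vec<(Vec<u8>, u32)>,
}

impl SortedIdIndex {
    /// Builds the index once. Several pairs with the same key are kept,
    /// which a map keyed by id would not allow.
    pub fn from_pairs(mut entries: Vec<(Vec<u8>, u32)>) -> Self {
        entries.sort();
        SortedIdIndex { entries }
    }

    /// Returns every value stored under `key`, in sorted order.
    pub fn lookup(&self, key: &[u8]) -> Vec<u32> {
        // partition_point performs a binary search for the first entry
        // whose key is not less than `key`.
        let start = self.entries.partition_point(|(k, _)| k.as_slice() < key);
        self.entries[start..]
            .iter()
            .take_while(|(k, _)| k.as_slice() == key)
            .map(|(_, v)| *v)
            .collect()
    }
}
```

Because the index is never modified after construction, the one-time sort replaces the per-insert cost of a tree.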
Yuya Nishihara
879f585b21 repo: leverage stored index to calculate shortest prefix in commit id space
With my "jj" work repo, this saves ~4ms to show the log with default revset.

Command:
    JJ_CONFIG=/dev/null hyperfine --warmup 3 --runs 100 \
      "jj log -T 'commit_id.short_prefix_and_brackets() \
                  change_id.short_prefix_and_brackets()' \
              --no-commit-working-copy"

Baseline (a7541e1ba4):
    Time (mean ± σ):      54.1 ms ±  16.4 ms    [User: 46.4 ms, System: 7.8 ms]
    Range (min … max):    36.5 ms …  78.1 ms    100 runs

This commit:
    Time (mean ± σ):      49.5 ms ±  16.4 ms    [User: 42.4 ms, System: 7.2 ms]
    Range (min … max):    31.4 ms …  70.9 ms    100 runs
2023-01-22 17:24:03 +09:00
Yuya Nishihara
a7541e1ba4 repo: add workaround for shortest prefix calculation of root ids
This is ugly, but we need a special case because root_change_id and
root_commit_id aren't equal but share the same prefix bytes. In practice,
no one would care for the shortest root id prefix, but we'll need to deal
with a similar problem when migrating prefix id resolution to repo layer.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1a4b5c5ee6 index: make IdIndex store raw bytes, not hex bytes
This helps us to migrate commit_id index to ReadonlyIndex. For large
repositories, this also reduces initialization cost, but that's not the main
intent of this change.

https://github.com/martinvonz/jj/pull/1041#issuecomment-1399225876

common_hex_len() and iter_half_bytes() are added to backend.rs since more
call sites will be added to index.rs, and I feel index.rs isn't a good place
to host this kind of utility function.
2023-01-22 12:03:08 +09:00
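Comparing ids as raw bytes while still reporting prefix lengths in hex digits, as 1a4b5c5ee6 describes, might look roughly like the sketch below. The function names are taken from the commit message, but the bodies and signatures are assumptions and may differ from the code in backend.rs.

```rust
/// Yields the 4-bit halves (hex digits) of each byte, high half first.
/// Assumed signature; the real helper may differ.
pub fn iter_half_bytes(bytes: &[u8]) -> impl Iterator<Item = u8> + '_ {
    bytes.iter().flat_map(|&b| [b >> 4, b & 0x0f])
}

/// Number of leading hex digits shared by two raw byte strings.
pub fn common_hex_len(a: &[u8], b: &[u8]) -> usize {
    iter_half_bytes(a)
        .zip(iter_half_bytes(b))
        .take_while(|(x, y)| x == y)
        .count()
}
```

Working on half bytes means the index can store ids in their compact binary form without losing the ability to answer "how many hex digits match".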
Yuya Nishihara
65a659347e tests: pad odd-length hex bytes passed in to repo::IdIndex
This allows us to migrate IdIndex to raw bytes. In practice, these ids are
full hashes which should never be odd length.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1d2642de1e repo: split commit_id and change_id indices
The goal is to replace the commit_id index with ReadonlyIndex to save the
initialization cost, but this also helps to fix root id handling.
2023-01-22 12:03:08 +09:00
Daniel Ploch
bd43580437 op_heads_store: remove LockedOpHeads
Make op resolution a closed operation, powered by a callback provided by the
caller which runs under an internal lock scope. This allows for greatly
simplifying the internal lifetime structuring.
2023-01-20 15:18:08 -08:00
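The shape described in bd43580437 (the caller supplies a callback, and the store runs it while holding its own lock, so no lock guard escapes) can be sketched as follows. `HeadsStore`, `resolve_single_head`, and the `u64` head ids are hypothetical; the real `OpHeadsStore` API is not reproduced here.

```rust
use std::sync::Mutex;

/// Hypothetical store holding the current operation heads as plain numbers.
pub struct HeadsStore {
    heads: Mutex<Vec<u64>>,
}

impl HeadsStore {
    pub fn new(heads: Vec<u64>) -> Self {
        HeadsStore { heads: Mutex::new(heads) }
    }

    /// Resolves divergent heads into one. The `merge` callback runs while
    /// the internal lock is held, so callers never see a lock guard.
    /// Assumes the store contains at least one head.
    pub fn resolve_single_head(&self, merge: impl FnOnce(&[u64]) -> u64) -> u64 {
        let mut heads = self.heads.lock().unwrap();
        if heads.len() > 1 {
            let merged = merge(heads.as_slice());
            *heads = vec![merged];
        }
        heads[0]
    }
}
```

Since the guard lives only inside the method, there is no separate "locked" type whose lifetime the caller has to manage.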
Martin von Zweigbergk
0f8622dd5c repo: move test_id_index() into a tests module
This is the usual convention (to save on compilation time when not
running tests).
2023-01-18 16:59:16 -08:00
Ilya Grigoriev
606eefa8c4 A BTree-based index of commit & change ids to optimize unique_prefix
This is fast enough to be used on medium-sized repositories such as git/git.
It is a bit slow, but bearable, on huge repositories such as torvalds/linux.

There is no performance penalty if the display of unique prefixes is disabled.

A trie-based implementation will be submitted for consideration in a
follow-up PR. It is faster, but more complicated.

**Update:** I also just discovered https://sapling-scm.com/docs/internals/indexedlog/

There are three important aspects of performance that seemed relevant:

1. Speed of computing the shortest unique prefix per id. It is worlds faster
  than the naive implementation before this commit. It can be optimized
  further by using a trie or maybe the `fst` crate.

2. Speed of initial loading of the index that happens before the first commit is
  shown. This is the part that's noticeable but bearable on torvalds/linux.
  
  This could be optimized by storing a sorted list of commit and change ids on
  disk.  This would likely involve reworking the `Index`.

  Failing that, the speed of initial loading doesn't change if a trie is used
  and would likely be worse with the `fst` crate.

3. Memory use is unremarkable here. I don't have good tools to measure it
  precisely, but it does not balloon to gigabytes even on the linux repo.
2023-01-17 22:01:09 -08:00
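One way an ordered index speeds up the shortest-unique-prefix query in 606eefa8c4: in sorted order, the ids sharing the longest prefix with a given id are its immediate neighbours, so only two comparisons are needed. The sketch below uses a `BTreeSet<String>` of hex ids and hypothetical function names; it assumes all ids have the same length and is not the commit's actual code.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Number of leading bytes two ids have in common.
fn shared_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}

/// Length of the shortest prefix of `target` that no other id in `ids` shares.
pub fn shortest_unique_prefix_len(ids: &BTreeSet<String>, target: &str) -> usize {
    let below: (Bound<&str>, Bound<&str>) = (Bound::Unbounded, Bound::Excluded(target));
    let above: (Bound<&str>, Bound<&str>) = (Bound::Excluded(target), Bound::Unbounded);
    // Closest ids on either side of `target` in sorted order.
    let prev = ids.range::<str, _>(below).next_back();
    let next = ids.range::<str, _>(above).next();
    prev.into_iter()
        .chain(next)
        .map(|neighbour| shared_prefix_len(neighbour, target))
        .max()
        .unwrap_or(0)
        + 1
}
```

Each query costs two tree lookups instead of a scan over all ids; building the set is the up-front cost the commit message refers to as initial loading.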
Ilya Grigoriev
19d341d32a Templater: naive implementation of shortest prefix highlight for ids
This creates a templater function `short_underscore_prefix` for commit and
change ids. It is similar to `short` function, but shows one fewer hexadecimal
digit and inserts an underscore after the shortest unique prefix.

Highlighting with an underline and perhaps color/bold will be in a follow-up
PR.

The implementation is quadratic, a simple comparison of each id with every
other id. It is replaced in a subsequent commit. The problem with it is that,
while it works fine for a `jj`-sized repo, it becomes painfully slow with a
repo the size of git/git.

Still, this naive implementation is included here since it's simple, and could
be used as a reference implementation.

The `shortest_unique_prefix_length` function goes into `repo.rs` since that's
convenient for follow-up commits in this PR to have nicer diffs.
2023-01-17 22:01:09 -08:00
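The naive approach described in 19d341d32a, comparing one id against every other id, can be sketched as below. The names are hypothetical (the commit's own function is `shortest_unique_prefix_length`, whose signature is not shown here), and the sketch assumes equal-length hex ids.

```rust
/// Number of leading characters two ids have in common.
fn common_prefix_len(a: &str, b: &str) -> usize {
    a.chars().zip(b.chars()).take_while(|(x, y)| x == y).count()
}

/// Reference version: compares `target` with every other id. Calling it for
/// each id in the repository is quadratic overall.
pub fn shortest_unique_prefix_len_naive(ids: &[&str], target: &str) -> usize {
    let longest_shared = ids
        .iter()
        .filter(|id| **id != target)
        .map(|id| common_prefix_len(id, target))
        .max()
        .unwrap_or(0);
    // One more character than the longest shared prefix tells the ids apart.
    longest_shared + 1
}
```

A version like this is easy to check by eye, which is what makes it useful as a reference when testing a faster implementation.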
Ilya Grigoriev
a9e7c9bffc Make jj undo work after jj duplicate
Fixes https://github.com/martinvonz/jj/issues/1050

Thanks to Martin for suggesting the exact fix.

The tests go into the new tests/test_duplicate_command.rs, which will be
expanded shortly with other tests depending on this bugfix.
2023-01-17 21:17:27 -08:00
Martin von Zweigbergk
d6fcf4c7b2 repo: load correct OpHeadsStore depending on repo's type
We forgot to actually call `StoreFactories::load_op_heads_store()` to
load the right type of `OpHeadsStore` depending on the contents of
`.jj/repo/op_heads/type`. That shouldn't have any effect yet since we
only have one type so far, and there are no out-of-tree types yet
either (clearly, since they would not work).
2022-12-31 01:22:29 -08:00
Martin von Zweigbergk
d86ba708a3 repo: add MutableRepo::rewrite_commit() returning CommitBuilder
Same reasoning as the previous commit.
2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
812ef97adb repo: add MutableRepo::new_commit() returning CommitBuilder
Since `CommitBuilder` now has a reference to `MutableRepo`, it's
convenient to create instances of it by calling a method on
`MutableRepo`.
2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
f3208f59c4 store: propagate error from Backend::write_commit() 2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
49b2f3b6ca commit_builder: keep MutableRepo reference
When you're done with the `CommitBuilder`, you're going to have to
call `write_to_repo()`, passing it a mutable `MutableRepo`
reference. It's a bit simpler to pass that reference when we create
the `CommitBuilder` instead, so that's what this patch does.

A drawback of passing in the mutable reference when we create the
builder is that we can't have multiple unfinished `CommitBuilder`
instances live at the same time. We don't have any such use cases yet,
and it's not hard to work around them, so I think this change is worth
it.
2022-12-26 23:30:52 -08:00
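The trade-off in 49b2f3b6ca can be seen in a stripped-down sketch where the builder borrows the repo mutably for its whole lifetime. Only the type names `MutableRepo` and `CommitBuilder` come from the commit message; the fields, constructor, and methods below are hypothetical simplifications.

```rust
/// Hypothetical, heavily simplified repo: just a list of commit descriptions.
#[derive(Default)]
pub struct MutableRepo {
    commits: Vec<String>,
}

impl MutableRepo {
    pub fn description(&self, id: usize) -> Option<&str> {
        self.commits.get(id).map(|d| d.as_str())
    }
}

/// The builder keeps the mutable repo reference from creation until `write()`.
pub struct CommitBuilder<'repo> {
    repo: &'repo mut MutableRepo,
    description: String,
}

impl<'repo> CommitBuilder<'repo> {
    pub fn new(repo: &'repo mut MutableRepo) -> Self {
        CommitBuilder { repo, description: String::new() }
    }

    pub fn set_description(mut self, description: &str) -> Self {
        self.description = description.to_string();
        self
    }

    /// Writes the commit and returns its (hypothetical) numeric id.
    /// No repo argument is needed because the builder already holds it.
    pub fn write(self) -> usize {
        let CommitBuilder { repo, description } = self;
        repo.commits.push(description);
        repo.commits.len() - 1
    }
}
```

With this shape, creating a second `CommitBuilder` from the same repo before the first is written is rejected by the borrow checker, which is the drawback the commit message mentions.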
Daniel Ploch
e9bd6fbeae op_heads_store: give the OpHeadsStore factory semantics 2022-12-16 10:47:48 -08:00
Daniel Ploch
2c5b3d0cc7 op_heads_store: convert load() to take &Path like other factories 2022-12-16 10:47:48 -08:00
Daniel Ploch
309a3f91a1 op_heads_store: refactor into an interface and simple implementation
The implementation has some hoops to jump through because Rust does not allow
`self: &Arc<Self>` on trait methods, and two of the OpHeadsStore functions need
to return cloned selves. This is worked around by making the implementation type
itself a wrapper around Arc<>.

This is not particularly noteworthy for the current implementation type where
the only data copied is a PathBuf, but for extensions it is likely to be more
critical that the lifetime management of the OpHeadsStore is properly
maintained.
2022-12-16 10:47:48 -08:00
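The workaround described in 309a3f91a1, making the implementation type itself a wrapper around `Arc` so that a plain `&self` method can return a clone, might look like this in miniature. The trait `HeadsStore`, its methods, and `SimpleHeadsStore` are hypothetical names, not the real `OpHeadsStore` interface.

```rust
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical trait whose methods need to hand out owned handles.
pub trait HeadsStore {
    fn dir(&self) -> &Path;
    /// Returns another handle to the same underlying store.
    fn handle(&self) -> Box<dyn HeadsStore>;
}

struct Inner {
    dir: PathBuf,
}

/// The implementation type is a thin wrapper around `Arc`, so cloning it
/// from `&self` is cheap and shares the inner data.
#[derive(Clone)]
pub struct SimpleHeadsStore {
    inner: Arc<Inner>,
}

impl SimpleHeadsStore {
    pub fn load(dir: &Path) -> Self {
        SimpleHeadsStore { inner: Arc::new(Inner { dir: dir.to_path_buf() }) }
    }
}

impl HeadsStore for SimpleHeadsStore {
    fn dir(&self) -> &Path {
        &self.inner.dir
    }

    fn handle(&self) -> Box<dyn HeadsStore> {
        Box::new(self.clone())
    }
}
```

Every handle points at the same `Inner`, so an extension with heavier state than a `PathBuf` would still copy only a reference count.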
Daniel Ploch
bd31bfd2d7 repo: give OpStore factory load semantics 2022-12-14 14:10:30 -08:00
Daniel Ploch
0f62c795d8 repo: move backend loading onto the StoreFactories struct 2022-12-14 14:10:30 -08:00
Daniel Ploch
25c379429c op_store: init/load by &Path, for consistency with other stores 2022-12-14 14:10:30 -08:00
Daniel Ploch
7cbea42a24 repo: rename BackendFactories to StoreFactories 2022-12-14 14:10:30 -08:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Martin von Zweigbergk
9502d84872 operations: make hostname and username configurable
We currently get the hostname and username from the `whoami` crate. We
do that in the lib crate, without giving the caller a way to override
them. That seems wrong since the library might be used in a server that
performs operations on behalf of some other user. This commit makes
the hostname and username configurable, so the calling crate can pass
them in. If they have not been passed in, we still default to the
values from the `whoami` crate.
2022-11-14 10:02:04 -08:00
Martin von Zweigbergk
43cfb98f78 transaction: store full OperationMetadata instead of parts
We already store the description, start time, and tags. It's easier to
store the whole struct.
2022-11-13 19:06:11 -08:00
Martin von Zweigbergk
4aa4b838b4 op_store: move logic out of OperationMetadata
`OperationMetadata` is a data type used in the interface. It seems
wrong for it to know where to get data from.
2022-11-13 19:06:11 -08:00
Martin von Zweigbergk
9f0ae4586b repo: pass in OperationMetadata to OpHeadsStore::init()
Just a little refactoring to prepare for being able to get the
username and hostname from config.
2022-11-13 19:06:11 -08:00
Martin von Zweigbergk
26a554818a git: update our record of Git branches on export
When we export branches to Git, we didn't update our own record of
Git's refs. This frequently led to spurious conflicts in these refs
(e.g. #463). This is typically what happened:

 1. Import a branch pointing to commit A from Git
 2. Modify the branch in jj to point to commit B
 3. Export the branch to Git
 4. Update the branch in Git to point to commit C
 5. Import refs from Git

In step 3, we forgot to update our record of the branch in the repo
view's `git_refs` field. That led the import in step 5 to think
that the branch moved from A to C in Git, which conflicts with the
internal branch target of B.

This commit fixes the bug by updating the refs in the `MutableRepo`.

Closes #463.
2022-11-13 15:06:10 -08:00
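The five steps in 26a554818a can be replayed with a toy three-way merge of a single ref. Everything below is a simplified model with hypothetical names; jj's real ref targets and merge logic are richer than a single string per branch.

```rust
/// Toy three-way merge: `base` is our record of Git's ref, `ours` is the
/// jj branch target, `theirs` is what Git has now. `None` means conflict.
pub fn merge_ref<'a>(base: &'a str, ours: &'a str, theirs: &'a str) -> Option<&'a str> {
    if ours == theirs || theirs == base {
        Some(ours)
    } else if ours == base {
        Some(theirs)
    } else {
        None
    }
}

/// Replays the scenario, with or without updating the record on export.
pub fn import_after_export(update_record_on_export: bool) -> Option<&'static str> {
    // Step 1: import a branch pointing to A; our record of Git's ref is A.
    let recorded_after_import = "A";
    // Step 2: move the branch to B in jj.
    let jj_branch = "B";
    // Step 3: export to Git. The fix is to also update our record here.
    let recorded_after_export = if update_record_on_export {
        jj_branch
    } else {
        recorded_after_import
    };
    // Step 4: the branch is moved to C in Git.
    let git_ref = "C";
    // Step 5: import, merging Git's change into the jj branch.
    merge_ref(recorded_after_export, jj_branch, git_ref)
}
```

With the stale record the import sees "A to C" against a local target of B and reports a conflict; with the updated record it sees "B to C" and simply moves the branch.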
Yuya Nishihara
eb790760a2 repo: simplify unsafe cast of view reference 2022-11-12 01:03:48 +09:00
Yuya Nishihara
2c042cee75 repo: extract unsafe view handling bits to separate struct
This should help us reason about the safety implication. New inner module
is added to encapsulate unsafe access.

DirtyCell provides .with_ref(callback) instead of .borrow(). This isn't
strictly needed, but should clarify the intent of the temporary reference.
This also allows us to rewrite DirtyCell without unsafe code, if needed,
by leveraging OnceCell<T> x RefCell<Option<T>> pair.
2022-11-12 01:03:48 +09:00
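The last paragraph of 2c042cee75 suggests that `DirtyCell` could be written without unsafe code using a `OnceCell<T>` and `RefCell<Option<T>>` pair. A rough sketch of that idea follows, using the standard library's `std::cell::OnceCell`; the method signatures (in particular passing the clean-up step as a second callback) are assumptions and not the actual `DirtyCell` API.

```rust
use std::cell::{OnceCell, RefCell};

/// Holds a value that is either "dirty" (needs fixing before it can be
/// read) or "clean". Exactly one of the two slots is occupied.
pub struct DirtyCell<T> {
    clean: OnceCell<T>,
    dirty: RefCell<Option<T>>,
}

impl<T> DirtyCell<T> {
    pub fn with_dirty(value: T) -> Self {
        DirtyCell { clean: OnceCell::new(), dirty: RefCell::new(Some(value)) }
    }

    /// Runs `f` on the clean value, first applying `fix` if it is dirty.
    /// The reference only lives for the duration of the callback.
    pub fn with_ref<R>(&self, fix: impl FnOnce(&mut T), f: impl FnOnce(&T) -> R) -> R {
        let value = self.clean.get_or_init(|| {
            let mut value = self
                .dirty
                .borrow_mut()
                .take()
                .expect("value is either clean or dirty");
            fix(&mut value);
            value
        });
        f(value)
    }

    /// Mutable access marks the value dirty again.
    pub fn get_mut(&mut self) -> &mut T {
        if let Some(value) = self.clean.take() {
            *self.dirty.get_mut() = Some(value);
        }
        self.dirty
            .get_mut()
            .as_mut()
            .expect("value is either clean or dirty")
    }
}
```

Mutation requires `&mut self`, so no shared reference handed out by `with_ref` can be alive while the value is being changed.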
Martin von Zweigbergk
a27da7d8d5 repo: remove last mutating method from ReadonlyRepo
`ReadonlyRepo::reindex()` is only used in `jj debug reindex`, and it
can be implemented by creating a new instance instead of mutating
`ReadonlyRepo`.
2022-11-06 16:43:54 -08:00
Yuya Nishihara
2681e8f908 repo: use OnceCell instead of Mutex<Option<_>> to store index
This is a perfect use case for OnceCell, per the safety comment in index().
2022-11-06 13:54:25 +09:00
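The pattern in 2681e8f908, a lazily computed value that is written at most once and then only read, can be sketched with the standard library's `OnceLock` (a thread-safe once cell). `LazyRepo` and its fields are hypothetical, and the commit itself may have used a different once-cell type.

```rust
use std::sync::OnceLock;

/// Hypothetical repo whose index is built on first access.
pub struct LazyRepo {
    commit_count: usize,
    index: OnceLock<Vec<usize>>,
}

impl LazyRepo {
    pub fn new(commit_count: usize) -> Self {
        LazyRepo { commit_count, index: OnceLock::new() }
    }

    /// Builds the index on the first call and returns the cached one after
    /// that. Unlike `Mutex<Option<_>>`, the result is a plain shared
    /// reference that does not hold a lock.
    pub fn index(&self) -> &[usize] {
        self.index.get_or_init(|| (0..self.commit_count).collect())
    }

    pub fn is_index_loaded(&self) -> bool {
        self.index.get().is_some()
    }
}
```

Returning `&[usize]` directly is possible because a once cell never replaces its contents after initialization.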
Martin von Zweigbergk
61468ed126 commit_builder: remove redundant for_open_commit()
The function is now the same as `for_new_commit()`, except that it
accepts only one parent.
2022-11-05 06:14:37 -07:00
Yuya Nishihara
f62aafa79f repo: comment about safety implication of view/view_dirty 2022-11-05 00:13:26 +09:00
Yuya Nishihara
f26c5c1a05 repo: turn off view_dirty flag by enforce_view_invariants()
Otherwise enforce_view_invariants() could mutate view while its reference
is alive.
2022-11-05 00:13:26 +09:00
Martin von Zweigbergk
416a36a59c git: don't abandon root commit when all refs are gone
If you remove all refs from the backing Git repo and then run `jj git
import`, we would see that all commits disappeared from the Git repo,
so we would remove them from the jj repo too. However, we do that by
doing a history walk from old heads to the new heads, which includes
the root commit when the set of new heads is empty. That means that we
mark the root commit as abandoned, which led to a crash in
`rewrite.rs` (when we try to pick the root commit's first parent to use
as parent for rebased commits).
2022-10-29 03:02:26 -07:00
Benjamin Saunders
3d1ac8b933 repo: propagate I/O errors gracefully from ReadonlyRepo::init 2022-10-28 11:51:53 -07:00
Benjamin Saunders
cfa46c50e2 workspace: propagate I/O errors gracefully 2022-10-28 11:51:53 -07:00
Martin von Zweigbergk
597e10103f repo: remove unused error type 2022-10-27 17:46:46 -07:00
Martin von Zweigbergk
2b64e52b4d repo: inline single-user init_repo_dir()
The function has only one caller since 25b922cd0b and it's pretty
small. Inlining also means we can reuse the joined paths created in
it, so I did that by extracting variables for them.
2022-10-27 17:46:13 -07:00
Benjamin Saunders
037eaaf36c repo: forbid checking out the root commit
Prevents `jj edit root` from succeeding, which would otherwise place
the repo in a state where every operation panics.
2022-10-21 10:10:07 -07:00