Commit graph

697 commits

Author SHA1 Message Date
Martin von Zweigbergk
905c45ddd7 cargo: upgrade to backoff 0.3
For no particular reason other than to stay up to date.
2021-12-01 08:13:20 -08:00
Martin von Zweigbergk
f27f110d10 repo: remove now-unused working copy path 2021-11-25 21:14:24 -08:00
Martin von Zweigbergk
b4f2b88522 workspace: move initialization of WorkingCopy from ReadonlyRepo 2021-11-25 21:14:24 -08:00
Martin von Zweigbergk
c6cba59c27 workspace: move creation of .jj/ directory from Repo to Workspace 2021-11-25 21:08:43 -08:00
Martin von Zweigbergk
f17aced374 workspace: move search for .jj/ directory from Repo to Workspace 2021-11-25 21:08:43 -08:00
Martin von Zweigbergk
466d35d4bc workspace: add functions for initializing a repo
`ReadonlyRepo::init_*()` currently calls `WorkingCopy::init()`. In
order to remove that dependency, this patch wraps the
`ReadonlyRepo::init_*()` functions in new `Workspace` functions. A
later patch will have those functions call `WorkspaceCopy::init()`.`
2021-11-25 21:07:28 -08:00
Martin von Zweigbergk
c08adba424 repo: remove working_copy(), only used in tests
This is another step towards removing coupling between the repo and
the working copy, so we can have multiple working copies for a single
repo (#13).
2021-11-25 21:04:56 -08:00
Martin von Zweigbergk
afe3bd5c2f testutils: update write_working_copy_file() to take a Workspace 2021-11-25 21:04:56 -08:00
Martin von Zweigbergk
ce094f64d5 testutils: make init_repo return a struct, including a Workspace
This is a fairly mechanical change; I'll do some minor cleanups in
later commits.
2021-11-25 21:04:56 -08:00
Martin von Zweigbergk
70073b94a8 repo: stop keeping a WorkingCopy instance
The `Repo` doesn't do anything with the `WorkingCopy` except keeping a
reference to it for its users to use. In fact, the entire lib crate
doesn't do antyhing with the `WorkingCopy`. It therefore seems simpler
to have the users of the crate manage the `WorkingCopy` instance. This
patch does that by letting `Workspace` own it. By not keeping an
instance in `Repo`, which is `Sync`, we can also drop the
`Arc<Mutex<>>` wrapping.

I left `Repo::working_copy()` for convenience for now, but now it
creates a new instance every time. It's only used in tests.

This further decoupling should help us add support for multiple
working copies (#13).
2021-11-25 21:04:56 -08:00
Martin von Zweigbergk
c0711f47cf workspace: introduce Workspace type
Having a concept of a "workspace" will be useful for adding support
for multiple workspaces (#13). You can think of the "workspace" as a
repo combined with a working copy. A workspace corresponds 1:1 with a
`.jj/` directory. It's pretty close to what other VCS simply call a
"repo", but I've ended up using the word "repo" for what Git calls a
"bare repo".
2021-11-25 21:04:56 -08:00
Martin von Zweigbergk
046b3c0541 op_store: make Vec inside ViewId and OperationId non-public 2021-11-19 23:19:13 -08:00
Martin von Zweigbergk
8cf5dd286a backend: make Vec inside CommitId non-public
The recent e5dd93cbf7, whose description says "cleanup: make Vec
inside CommitId etc. non-public", made all ID types in the `backend`
module *except* for `CommitId` non-public :P This patch makes
2021-11-19 23:19:00 -08:00
Martin von Zweigbergk
b35086d4c9 repo: use local variable for WorkingCopy in Repo::init()
No need to pass it into `Repo` and then get it back from there.
2021-11-17 10:22:15 -08:00
Martin von Zweigbergk
33b272f5fa working_copy: make some functions require mutable references
We use interior mutability for caching in `WorkingCopy`, but let's
still take mutable reference in the functions where the state change
is visible.
2021-11-17 10:15:33 -08:00
Martin von Zweigbergk
287602966e working_copy: add a method for untracking specified paths 2021-11-17 08:41:37 -08:00
Martin von Zweigbergk
41860692e9 trees: pass a matcher to the entry iterator 2021-11-14 22:00:46 -08:00
Martin von Zweigbergk
f2d5aa188d tree: refactor TreeEntriesIterator to look like TreeDiffIterator
Consistency is better, and I think this will make it a little easier
to add support for restricting the iterator by a `Matcher`.
2021-11-14 21:50:10 -08:00
Martin von Zweigbergk
f846112f80 cleanup: remove an unnecessary reference-taking noticed by Clippy 2021-11-14 12:47:14 -08:00
Martin von Zweigbergk
30845c1f3b cleanup: fix formatting regression 2021-11-14 12:46:38 -08:00
Martin von Zweigbergk
ced252f766 cleanup: replace some as_slice() by & 2021-11-10 10:55:58 -08:00
Martin von Zweigbergk
e5dd93cbf7 cleanup: make Vec inside CommitId etc. non-public 2021-11-10 10:46:10 -08:00
Martin von Zweigbergk
27107637f2 git: fix regression in pruning of remote refs
A while ago, I replaced a call to git2-rs's `Remote::fetch()` by calls
to `Remote::download()` and `Remote::update_tips()`. The function is
documented to be a convenience for those function, but it turns out
that the pruning of deleted remote refs is a separate call
(`Remote::prune()`), so we need to call that too.
2021-11-07 16:02:18 -08:00
Martin von Zweigbergk
bdea3ae31a git: don't pass remote credentials callback to Remote::update_tips()
AFAICT, the credentials callback is not used by
`Remote::update_tips()`.
2021-11-07 15:57:19 -08:00
Martin von Zweigbergk
e901a4e66b repo: don't materialize conflicts on checkout
Since the working copy can now handle conflicts, we don't need to
materialize conflicts when checking out a commit.

Before this patch, we used to create a new commit on top whenever we
checked out a commit with conflicts. That new commit was intended just
for resolving the conflicts. The typical workflow was the resolve the
conflicts and then amend. To use the same workflow after this patch,
one needs to explicitly create a new commit on top with `jj new` after
checking out a commit with conflict.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
ea82340654 working_copy: preserve conflicts in the working copy until markers are removed
I realized only recently that we can try to parse conflict markers in
files and leave them as conflicted if they haven't changed. If they
have changed and some conflict markers have been removed, we can even
update the conflict with that partial resolution.

This change teaches the working copy to write conflicts to the working
copy. It used to expect that the caller had already updated the tree
by materializing conflicts. With this change, we also start parsing
the conflict markers and leave the conflicts unresolved in the working
copy if the conflict markers remain.

There are some cases that we don't handle yet. For example, we don't
even try to set the executable bit correctly when we write
conflicts. OTOH, we didn't do that even before this change.

We still never actually write conflicts to the working copy (outside
of tests) because we currently materialize conflicts in
`MutRepo::check_out()`. I'll change that next.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
cea6061f3d conflicts: add a function for parsing a materialized conflict
I initially made the working copy materialize conflicts in its
`check_out()` method. Then I changed it later (exactly a year ago, on
Halloween of 2020, actually) so that the working copy expected
conflicts to already have been materalized, which happens in
`MutableRepo::check_out`().

I think my reasoning then was that the file system cannot represent a
conflict. While it's true that the file system itself doesn't have
information to know whether a file represents a conflict, we can
record that ourselves. We already record whether a file is executable
or not and then preserve that if we're on a file system that isn't
able to record it. It's not that different to do the same for
conflicts if we're on a file system that doesn't understand conflicts
(i.e. all file systems).

The plan is to have the working copy remember whether a file
represents a conflict. When we check if it has changed, we parse the
file, including conflict markers, and recreate the conflict from
it. We should be able to do that losslessly (and we should adjust
formats to make it possible if we find cases where it's not).

Having the working copy preserve conflict states has several
advantages:

 * Because conflicts are not materialized in the working copy, you can
   rebase the conflicted commit and the working copy without causing
   more conflicts (that's currently a UX bug I run into every now and
   then).

 * If you don't change anything in the working copy, it will be
   unchanged compared to its parent, which means we'll automatically
   abandon it if you update away from it.

 * The user can choose to resolve only some of the conflicts in a file
   and squash those in, and it'll work they way you'd hope.

 * It should make it easier to implement support for external merge
   tools (#18) without having them treat the working copy differently.

This patch prepares for that work by adding support for parsing
materialized conflicts.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
f6839a4ceb files: implement Debug for MergeResult, display byte vector as string 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
856b219943 working_copy: if a file's recorded mtime is equal to the state's mtime, set to 0 early 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
033bfe7d5b working_copy: fix an incorrect comment about timestamps 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
9ac643ed1a working_copy: rename confusingly named read_time field 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
baa403b41e working_copy: clarify code for preserving current file state
On Windows, we preserve the executable bit. I plan to also teach the
working copy to preserve conflict state. This refactoring prepares for
that by simplifying how we preserve parts of the current file state.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
7c12014513 working_copy: extract a function for updating the file state and tree if dirty 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
730c442462 working_copy: move file_state() off of TreeState
The function doesn't use `self`, so let's move it off of
`TreeState`. That way we can call it even while holding a mutable
reference.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
f86ffbd47b working_copy: take advantage of DirEntry::path() 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
5dce6e9884 working_copy: avoid using file_state() for getting mtime of tree_state file
I want to change the signature of `file_state()` and this caller is
different from the others and only needs the mtime.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
efe7316354 working_copy: make FileType variants more similar to TreeValue variants
This change doesn't make much difference yet, but I think it will
enable further improvements.
2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
bf61e9cf3e working_copy: don't stat ignored files 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
76c58b3949 tree_builder: rename repo() to more accurate store() 2021-11-07 15:17:51 -08:00
Martin von Zweigbergk
acf3798abc revsets: allow underscore in identifiers 2021-11-07 07:55:52 -08:00
Martin von Zweigbergk
9abee0096f conflicts: propagate errors from materialize_conflict()
All current callers pass in a buffer, so it should never fail, but the
function itself can't know that.
2021-10-24 23:02:00 -07:00
Martin von Zweigbergk
0f46c2453e revset graph: fix formatting I missed in recent commit 2021-10-23 20:53:41 -07:00
Martin von Zweigbergk
b4a0e513dd revset graph: make order of edges stable
While working on demos, I noticed that `jj log` output in the
octocat/Hello-World repo was unstable: sometimes the first parent of
the merge was on the left and sometimes it was on the right. This
patch fixes that by sorting the edges by position in the index just
before returning them. It seems that most applications would want
stable output so I put it in the `RevsetGraphIterator` rather than
doing at the call site in the CLI. I ordered them with the reverse
index position rather than forward because it seemed to make the
graphs in the git.git repo slight nicer, with the left-most edge going
between subsequent releases.

There performance difference is within the noise level.
2021-10-23 20:33:59 -07:00
Martin von Zweigbergk
33bf6ce1d5 view: don't force commits pointed to by refs to be current heads
If you rewrite a commit that's also available on some remote, you'll
currently see both the old version and the new version in the view,
which means they're divergent. They're not logically divergent (the
rewritten version should replace the old version), so this is a UX
bug. I think it indicates that the set of current heads should be
redefined to be the *desired* heads. That's also what I had suspected
in the TODO removed by this change.  I think another indication that
we should hide the old heads even if they have e.g. a remote branch
pointing to them is that we don't want them to be rebased if we
rewrite an ancestor.

So that's what I decided to do: let the view's heads be the desired
heads. The user can still define revsets for showing non-current
commits pointed to by e.g. remote branches.
2021-10-23 09:15:05 -07:00
Martin von Zweigbergk
c0ae4b16e8 trees: try to resolve file conflicts after simplifying conflicts
This fixes a bug I've run into somewhat frequently. What happens is
that if you have a conflict on top of another conflict and you resolve
the conflict in the bottom commit, we just simplify the `Conflict`
object in the second commit, but we don't try to resolve the new
conflict. That shows up as an unexpected "conflict" in `jj log`
output, and when you check out the commit, there are actually no
conflicts, so you can just `jj squash` right away.

This patch fixes that bug. It also teaches the code to work with more
than 3 parts in the merge, so if there's a 5-way conflict, for
example, we still try to resolve it if possible.
2021-10-22 09:20:50 -07:00
Martin von Zweigbergk
f1e2124a64 trees: make simplify_conflict() always return a Conflict
The interface is much cleaner this way. The function no longer makes
any writes to the store.
2021-10-20 22:07:42 -07:00
Martin von Zweigbergk
2e4dc019d9 git: don't update public heads for now 2021-10-20 14:23:58 -07:00
Martin von Zweigbergk
23c3e02134 stacked_table: don't expect a certain listing order from the file system 2021-10-20 13:53:32 -07:00
Martin von Zweigbergk
c260fea811 GitBackend: move extra metadata from Git notes to stacked-table storage
Git notes (at least as implemented by libgit2) quickly gets really
slow, as noted in issue #7. This patch replaces it by a custom storage
format.

I tested the performance in the git.git repo with just a few hundred
annotated commits (~450, I think) and no sharding. I listed the first
~2900 commits there using `jj log --no-graph -r ,,v1.0.0 -T 'author
"\n"' | wc -l`. That took about 882ms. After this patch, it dropped to
108ms.

I did a similar test in this repo with 12700 annotated commits and
sharding, listing all visible commits. That took 142ms before this
patch (the sharding helps a lot!) and 55ms after.

Closes #3.
Closes #7.
2021-10-20 13:22:59 -07:00
Martin von Zweigbergk
deddd90762 stacked_table: add caching of tables
We don't want to re-read the whole table(s) every time we read extra
metadata for a commit (which is the immediate use-case I'm aiming for
in #7)..
2021-10-20 13:22:36 -07:00
Martin von Zweigbergk
85773cf81f stacked_table: add a generic store based on the stacked-table format
The new store works the same way as the `OpHeadsStore`. It keeps track
of the current head file(s) by recording their names in a
directory. When a write happens, it adds the new head and then removes
the old head. There will be generally be a single head at a time. The
only exception is when there's been concurrent operations (locally, or
remotely, in the case of a distributed file system). When there are
multiple heads files, they are automatically merged. No guarantee is
given about which value wins if the key exists in several heads; the
store is meant to be used for data that's immutable once written. As
long as different keys are written, this is a CRDT. That makes it fit
for solving both #3 and #7.
2021-10-20 13:21:12 -07:00
Martin von Zweigbergk
e86d266e6b stacked_table: add a file format for stacked, sorted tables
I'm trying to replace the Git backend's use of Git notes for storing
metadata (#7). This patch adds a file format that I hope can be used
for that. It's a simple generic format for storing fixed-size keys and
associated variable-size values. The keys are stored in sorted
order. Each key is followed by an offset to the value. The offset is
relative to the first value. All values are concatenated after each
other. I suppose it's a bit like Git's pack files but lacking both
delta-encoding and compression.

Each file can also have a parent pointer (just like the index files
have), so we don't have to rewrite the whole file each time. As with
the index files, the new format squashes a file into its parent if it
contains more than half the number of entries of the parent. The code
is also based on `index.rs`.

Perhaps we can alo replace the default operation storage with this
format. Maybe also the native local backend's storage. We'll need
delta-encoding and compression soon then.
2021-10-20 13:19:32 -07:00
Martin von Zweigbergk
d8795b9ae7 store: move logic for initialization of GitBackend to that type 2021-10-18 08:49:22 -07:00
Martin von Zweigbergk
e8f44fb470 index: remove predecessor information from the index
Since we removed the evolution state in commit 4c4e436f38, we no
longer use the predecessor information in the index, so let's remove
it.
2021-10-17 09:13:31 -07:00
Martin von Zweigbergk
0354b8721b index: transparently reindex if the index is corrupt
I'm about to change the index format (to remove predecessor
information), which will break the format. Let's prepare for that by
having `IndexStore` reindex the repo if it fails to read the index..
2021-10-17 09:13:31 -07:00
Martin von Zweigbergk
6722a79e27 repo: fix crash on upgrade of repo backed by external Git repo
If the repo is backed by Git repo that's not inside `.jj/`, then
there's no `.jj/git/` directory to move into `.jj/store/`.
2021-10-14 16:41:57 -07:00
Martin von Zweigbergk
97b48635f6 repo: move Backend initialization to Store
This is just a cleanup. Now only `Store` knows how to tell which
backend to use.
2021-10-13 23:55:16 -07:00
Martin von Zweigbergk
1b6efdc3f8 repo: make .jj/store a directory also in Git-backed repos
I think this is just cleaner, and it gives us room to put other
store-related data in the `.jj/store/` directory. I may want to use
that place for writing the metadata we currently write in Git notes
(#7).
2021-10-13 23:42:25 -07:00
Martin von Zweigbergk
dba72af857 repo: move creation of directories .jj/* directories closer and earlier 2021-10-13 22:49:37 -07:00
Martin von Zweigbergk
c0a26f7642 conflicts: work around rust-lang/rust#89716 2021-10-13 13:41:09 -07:00
Martin von Zweigbergk
d8c0282873 revset: make head() accept candidate set to find heads within
With this change, you can do e.g. `heads(remote_branches())`. That
should currently be the same as `public_heads()`, except that we don't
yet remove public heads when remote branches have been updated. Having
this support should be generally useful, but I may use it in the short
term specifically for depending less on the public heads, until I get
around to keeping them up to date.
2021-10-13 09:41:20 -07:00
Martin von Zweigbergk
aebdd4e8fd revset: add all() function 2021-10-13 09:28:11 -07:00
Martin von Zweigbergk
5874d5abfb revset: add none() function 2021-10-13 09:24:08 -07:00
Martin von Zweigbergk
8bbfe9731e cleanup: let newer Clippy fix a few things it found 2021-10-13 08:27:44 -07:00
Martin von Zweigbergk
94408f01e0 revset: extract a function for creating a revset from commit ids
This is just to avoid some repetition.
2021-10-10 12:44:35 -07:00
Martin von Zweigbergk
8465a578be revset: split out a new remote_branches() from branches()
It seems useful to be able to use `remote_branches()` in revsets
e.g. for filtering out commits known on remotes.
2021-10-10 12:24:48 -07:00
Martin von Zweigbergk
e52c902d3a diff: compact adjacent unchanged regions also when using for_tokenizer()
I noticed while working on support for unified diffs (#33) that
`Diff::for_tokenizer(..., &find_line_ranges)` would return a
`DiffHunk::Matching` for each matching line instead of a single
`DiffHunk::Matching` for all the matching lines. That's different from
what you get from `Diff::default_refinement()` and seems less
convenient to work with.
2021-10-10 00:00:06 -07:00
Martin von Zweigbergk
28a2c534a0 git: add support for password-less SSH keys
My SSH keys are password-protected, so I haven't been able to test
this patch completely, but I believe it should work. We now use
ssh-agent if `$SSH_AGENT_PID` is set, otherwise we check if
`$HOME/.ssh/id_rsa` exists and assume it's a password-less key. That's
quite hacky but I think it's good enough for now. We eventually need
to move this out of the library crate just like libgit2 has done.

Closes #25.
2021-10-09 09:44:57 -07:00
Martin von Zweigbergk
fc883d1d02 git: extract a function for creating RemoteCallbacks 2021-10-09 09:12:03 -07:00
Martin von Zweigbergk
d434b948f1 git: use git's HTTP proxy configs
I don't have an HTTP proxy to test with, but I hope this is all that's
needed.

Closes #34.
2021-10-09 09:01:47 -07:00
Martin von Zweigbergk
fdb861b957 backend: remove unused Commit::is_pruned (#32) 2021-10-06 23:53:15 -07:00
Martin von Zweigbergk
4c4e436f38 evolution: delete it now that we don't use it anymore (#32)
It's been a lot of work, but now we're finally able to remove the
`Evolution` state! `jj obslog` still works as before (it just walks
the predecessor pointers).
2021-10-06 23:28:30 -07:00
Martin von Zweigbergk
4c8beee81c Transaction: don't remove hidden heads on commit
The removal of hidden heads was just there to help with the transition
away from evolution (#32). Now that we no longer depend on evolution
for removing old heads, we can remove the hack.
2021-10-06 23:20:23 -07:00
Martin von Zweigbergk
56c7f32023 revset: avoid using evolution for resolving a change id to its commits
This rewrites the code for resolving a change id to simply walk the
entire index. That's obviously not optimal, but it's not worse than
what we did in the evolution-based resolution. This is yet another
step towards removing support for evolution (#32).
2021-10-06 23:20:23 -07:00
Martin von Zweigbergk
108c38d816 evolution: don't create pruned commits (#32)
Now that we rebase descendants and remove old heads as appropriate
(without using evolution), we don't need to create pruned commits
anymore.
2021-10-06 23:20:23 -07:00
Martin von Zweigbergk
b7acbae168 DescendantRebaser: also remove old heads
This patch teaches `DescendantRebaser` to also update heads. That's
done at the end of the rebase (when `rebase_next()` starts returning
`None`), which is a little weird. We should probably change the
interface, but this will do for now.

With this change, we should no longer need to remove hidden heads when
the transaction commits. That will remove one of the last bits of
dependence on evolution from most commands (#32).
2021-10-06 22:01:39 -07:00
Martin von Zweigbergk
e26d21dc18 revset: add .commit_ids() and .commits() to RevsetIterator
This makes it a little easier to iterate over commits or commit ids.
2021-10-06 14:18:11 -07:00
Martin von Zweigbergk
41779551fb revsets: remove heads arguments from builder API functions
Now that we no longer have to be careful whether we mean "all heads"
or "non-obsolete heads", there's no need to pass them as
arguments. It's still possible to get a DAG range to a hidden commit
by using `RevsetExpression::dag_range_to()`, as long as the hidden
commit is indexed.
2021-10-06 13:28:51 -07:00
Martin von Zweigbergk
9ec27645cf revsets: remove all_heads()
Now that we remove hidden heads whenever a transaction commits,
`non_obsolete_heads()` should always be the same as `all_heads()`,
except during a transaction. I don't think we depend on the difference
even during a transaction. Let's simplify a bit by removing the revset
function `all_heads()` and renaming `non_obsolete_heads()` to
`heads()`. This is part of issue #32.
2021-10-06 12:21:18 -07:00
Martin von Zweigbergk
f3bee0b12e evolution: remove support evolving orphans and divergent commits (#32) 2021-10-03 22:38:49 -07:00
Martin von Zweigbergk
fcb79ef0a6 DescendantRebaser: also update checkout (#32)
This is similar to how a recent change taught `DescendantRebaser` to
update branches pointing to rewritten commits. Now we also update the
checkout if it pointed to a rewritten commit.
2021-10-03 22:38:49 -07:00
Martin von Zweigbergk
5be75d0e31 DescendantRebaser: also update branches
This patch moves the logic for updating branches from
`update_branches_after_rewrite()` into `DescendantRebaser`. The
branches are now updated along with each rebased commit rather than
all being updated at the end. The new code uses the information about
rewritten and abandoned commits that `DescendantRebaser` gets from
`MutableRepo`. That is different from the old code, which used the
evolution state. This patch thus moves us one step closer to removing
evolution (#32).
2021-10-03 10:07:09 -07:00
Martin von Zweigbergk
a78976cd29 DescendantRebaser: allow passing divergent changes as input
I'm going to teach `DescendantRebaser` to also update local branches
pointing to rewritten commits, taking over the responsibility from
`rewrite::update_branches_after_rewrite()`. For commits that have been
rewritten as multiple new commits (divergent, not split), that
function makes local branches pointing to the old commit point to all
the new commits. To replicate that behavior in `DescendantRebaser`, it
needs to know about divergent changes. This change addresses that.
2021-10-02 23:12:46 -07:00
Martin von Zweigbergk
04c313462b DescendantRebaser: minor cleanups and refactoring 2021-10-02 23:12:46 -07:00
Martin von Zweigbergk
1c55c02106 Transaction: remove hidden heads on commit
I recently made the CLI remove hidden heads when a transaction is
committed (38474a9). Let's move that to `Transaction::commit()`, so
the library crate becomes more similar to how the CLI behaves and more
similar to our evolution-less future (#32).
2021-10-02 23:12:46 -07:00
Martin von Zweigbergk
5e0e90d81e MutableRepo: record old checkout abandoned if it's empty
Same reasoning as the previous change.

With this change, I believe we now record all rewritten and abandoned
commits correctly. We're now almost ready to switch the CLI away from
using evolution for automatically rebasing commits.
2021-09-29 20:47:23 -07:00
Martin von Zweigbergk
76352564ad CommitBuilder: record rewritten commits in MutableRepo
This is part of removing support for evolution (#32). Since
`CommitBuilder` now records rewritten commits in `MutableRepo`, we can
use that recorded information to automatically rebase descendants.
2021-09-29 15:45:38 -07:00
Martin von Zweigbergk
d00b805d97 MutableRepo: add support for recording rewritten and abandoned commits
When we remove support for evolution (#32), we need to still make it
easy for application code to rebase descendants of rewritten and
abandoned commits. The way applications currently do that is by using
e.g.  `CommitBuilder::for_rewrite_from()` followed by
`evolve_orphans()`. This patch puts some bookkeeping in `MutableRepo`
for rewritten and abandoned commits, along with a function for
creating a `DescendantRebaser` based on it. I'll then make
`CommitBuilder` record rewritten commits there.
2021-09-29 15:24:04 -07:00
Martin von Zweigbergk
23c3b503c0 tests: move assert_rebased() function to testutils 2021-09-29 11:41:51 -07:00
Martin von Zweigbergk
736a0e7442 Transaction: don't wrap MutableRepo in Arc
I think this was a vestige from the time when `MutableEvolution` had a
back-reference into `MutableRepo` (at least I think it used to be that
way).
2021-09-29 10:13:32 -07:00
Martin von Zweigbergk
ff71af1e11 MutableRepo: accept just CommitId instead of whole Commit where possible 2021-09-29 10:13:32 -07:00
Martin von Zweigbergk
e1fd69cfa8 evolution: don't crash when commit is no longer in view's set of heads
If you rewrite a change twice, from A to A' to A'', then undo the
operation that created A', you'll end up with a repo where A'' refers
to commit (A') that's not reachable from any head in the view. We
currently crash when that happens. This change fixes the
crash. Undoing the A' operation now instead produces a state where A
and A'' are divergent. That at least makes some sense.

This may not seem important since I'm working on removing support for
evolution (#32), but I wanted to get it fixed in order to help with
the transition off of evolution. Specifically, I want to be able to
start removing old heads more freely.

This closes #28.
2021-09-29 10:13:32 -07:00
Martin von Zweigbergk
de5bf90675 cleanup: fix issues found by latest rustc and clippy 2021-09-29 10:12:38 -07:00
Martin von Zweigbergk
edf7ec0a72 git: don't fetch tags we're not fetching commits for
In the recent switch away from `git2::Remote::fetch()`, I passed
`git2::AutotagOption::All`, which caused cloning of e.g. the `clap`
repo to fail like this:

```
Error: Fetch failed: Error { code: -1, klass: 4, message: "target OID for the reference doesn't exist on the repository" }
```

This commit changes from `All` to `Unspecified`, which respects the
remote's configuration.
2021-09-24 07:46:54 -07:00
Martin von Zweigbergk
eed715dc51 git: try to fix flaky (?) fetch tests by keeping connection open longer
It seems it wasn't Windows that behaved differently when it comes
getting the remote's default branch; the test failed on Ubuntu
too.

The documentation for `Remote::default_branch()` says that it can be
called even after the connection has been closed, but let's see if
calling it while the connection is open helps anyway. To do that, we
have to replicate what `Remote::fetch()` does.
2021-09-22 11:48:42 -07:00
Martin von Zweigbergk
d4004fcb6f rewrite: teach DescendantRebaser to handle abandoned commits specially
Descendants of abandoned commits should be rebased onto their parents,
or the rewritten parents if they had been rewritten. This patch
teaches `DescendantRebaser` to do that. It updates `jj rebase -r` to
use the functionality. I plan to also use it in `jj abandon`
(naturally, given the name), and for rebasing descendants of deleted
refs imported from `jj git refresh/fetch/push`.
2021-09-19 22:51:12 -07:00
Martin von Zweigbergk
439fe1cfd3 rewrite: don't report skipped commits when rebasing descendants
The fact that `DescendantRebaser` visits some commits that don't need
to be rebased is mostly an implementation detail. I can't think of a
reason that callers would care about these commits.
2021-09-19 22:51:12 -07:00
Martin von Zweigbergk
ae7f00e7b1 cli: rename jj prune to jj abandon
The command's help text says "Abandon a revision", which I think is a
good indication that the command's name should be `abandon`. This
patch renames the command and other user-facing occurrences of the
word. The remaining occurrences should be removed when I remove
support for evolution.
2021-09-19 22:51:12 -07:00
Martin von Zweigbergk
ef4cb663ae cli: move logic for updating branches after rewrite to lib crate
This patch moves the function for updating branches after rewrite from
`commands.rs` into `rewrite.rs`.

It also changes the function to update branches even if they were
conflicted or become conflicted. I think that seems better than
leaving branches on old commits. For example, let's say you have start
with this:

```
C main
|
B origin@main
|
A
```

You now pull from origin, which has updated the main branch from B to
B'. We apply that change to both the remote branch and the local
branch, which results in a conflict in the local branch:

```
C main?
|
B B' main? origin@main
|/
A
```

If you now rewrite C to C', the conflicted main branch will still
point to C, which is just weird. This patch changes that so the
conflicted side of main gets repointed to C'.

I also refactored the code to reuse our existing
`MutableRepo::merge_single_ref()`, which improves the behavior in
several cases, such as the conflict-resolution case in the last test
case.
2021-09-18 10:03:26 -07:00
Martin von Zweigbergk
0c1ce664ea store: remove (weak) self-reference and take &Arc<Self> arguments instead
The `weak_self` stuff was from before I knew that `self` could be of
type `&Arc<Self>`.
2021-09-16 23:30:30 -07:00
Martin von Zweigbergk
ca114d6d7e rewrite: add support for rebasing descendants of multiple rewritten commits
I plan to use this for rebasing descendants of rewritten remote
branches (on fetch).
2021-09-15 22:13:40 -07:00
Martin von Zweigbergk
30bcf6508e rewrite: when rebasing forward, also rebase "side branches"
As the updates test case shows, when rebasing forward, we missed
commits that fork off from the section between the source and the
destination.

As part of the fix, I also restructured the code a bit to prepare for
support for rebasing descendants of multiple rewritten commits.
2021-09-15 22:08:32 -07:00
Martin von Zweigbergk
e4bc8f5b4c tests: extract helpers to reduce repetition in test_rewrite 2021-09-15 22:02:58 -07:00
Martin von Zweigbergk
48f237e33e cli: correctly update to remote's default branch after clone
It turns out that `FETCH_HEAD` is not the remote's `HEAD` (it's
actually not even a normal symbolic ref; it contains many lines of
commits and names). We're supposed to ask the remote for its default
branch instead. That's what this patch does.
2021-09-13 22:28:13 -07:00
Martin von Zweigbergk
ce5e95fa80 store: rename Store to Backend and StoreWrapper to Store
For what's currently called `Store` in the code, I have been using
"backend" in plain text. That probably means that `Backend` is a good
name for it.
2021-09-12 12:02:10 -07:00
Martin von Zweigbergk
cea3c1537a cli: make jj git push push all branches by default
It's annoying to have to add `--branch main` every time I push to
GitHub.

Maybe we should make it push only the current branch by default, but
we don't even have a concept of a current branch yet...
2021-09-11 23:51:53 -07:00
Martin von Zweigbergk
0bc42c0066 cli: extract function for figuring out how to update branches on a remote 2021-09-11 23:41:53 -07:00
Martin von Zweigbergk
344435e90f git: add support for pushing multiple ref updates at once 2021-09-11 22:54:29 -07:00
Martin von Zweigbergk
11c0130303 index: squash an index segment into its parent more aggressively
Before this change, you could end up with an index segment with 10
commits, then a child segment with 9 commits, then another child with
8 commits, and so on. That's not what I had intended. This changes
makes it so we squash if a segment has more than half as many commits
as its parent instead.
2021-09-11 22:51:13 -07:00
Martin von Zweigbergk
20e9d29c4b CommitBuilder: remove write_to_new_transaction(), which was only used in tests 2021-09-11 10:11:15 -07:00
Martin von Zweigbergk
49e1462fe5 working_copy: delete two obsolete TODOs about ignores
We have had support for ignores via `.gitignore` files since
3b326a942c, and we haven't had the problem with the temporary
`.git/` directory created by libgit2 since 88f7f4732b.
2021-09-03 23:10:45 -07:00
Martin von Zweigbergk
ccdd651953 working_copy: ignore .git directory/file when writing tree to store
Git doesn't want `.git` entries in its trees, so at least when using
the Git backend, we need to ignore such paths. Let's just ignore
`.git` paths regardless of backend to keep it simple.

Closes #24.
2021-09-01 08:40:28 -07:00
Martin von Zweigbergk
f27ca16a16 rewrite: when rebasing descendants, actually rebase them
When I added the function for rebasing descendants, I forgot to call
the existing `rebase()` function and instead simply created a new
commit with the new parents but the old contents.
2021-08-29 09:42:37 -07:00
Martin von Zweigbergk
4e0a89b3dd rewrite: add a function for rebasing descendant commits
This should be useful in lots of places. For example, `jj rebase -r`
currently rebases all descendants, because that's what the auto-evolve
feature does. I think it would be nice to instead copy from
Mercurial's `-s` flag for also rebasing descendants. Then `jj rebase
-r` can be made to pull a commit out of a stack, rebasing descendants
onto the rebased commit's parents. I also intend to use this
functionality for rebasing descendants when remote branches have been
rewritten.
2021-08-28 10:01:00 -07:00
Martin von Zweigbergk
658b41b4e9 revset: add methods on RevsetExpression for constructing them
This makes it much easier to create `RevsetExpression` instances
programmatically.
2021-08-28 10:00:59 -07:00
Martin von Zweigbergk
451451563b revset: work with Rc<RevsetExpression> everywhere
It's about break-even in this commit to `Rc` everywhere, but it will
allow big savings in the next commit.
2021-08-25 22:53:57 -07:00
Martin von Zweigbergk
2afed65132 working_copy: move logic for creating commit to caller
The auto-rebasing of descendants doesn't work if you have an open
commit checked out, which means that you may still end up with orphans
in that case (though that's usually a short-lived problem since they
get rebased when you close the commit). I'm also about to make
branches update to successors, but that also doesn't work when the
branch is on a working copy commit that gets rewritten. To fix this
problem, I've decided to let the caller of `WorkingCopy::commit()`
responsible for the transaction.

I expect that some of the code that this change moves from the lib
crate to the cli crate will later move back into the lib crate in some
form.
2021-08-15 18:55:09 -07:00
Martin von Zweigbergk
7f335a4632 repo: remove some incorrect "mut" modifiers 2021-08-11 10:58:38 -07:00
Martin von Zweigbergk
4594932fc0 git: remove trailing single quotes from error messages 2021-08-11 08:21:42 -07:00
Martin von Zweigbergk
81ba65e3a5 git: force push when not known to be a fast-forward
With this change, we no longer fail if the user moves a branch
sideways or backwards and then push.

The push should ideally only succeed if the remote branch is where we
thought it was (like `git push --force-with-lease`), but that requires
rust-lang/git2-rs#733 to be fixed first.
2021-08-04 23:28:42 -07:00
Martin von Zweigbergk
d555e0c326 git: when fetching, prune refs that have been deleted on the remote
Otherwise remote-tracking branches just pile up.

It seems that both git and libgit2 remove the remote-tracking branch
when you push a deletion, so `jj branch --delete foo; jj git push
--branch foo` already sees `foo` disappear locally as well. However,
if a branch has been deleted on the remote, we would never know before
this change.
2021-08-04 23:00:43 -07:00
Martin von Zweigbergk
8b8aff171e cli: delete branch from git remote when pushing locally deleted branch 2021-08-04 22:50:52 -07:00
Martin von Zweigbergk
e57948347e git: extract a function that can be reused for pushing branch-deletion 2021-08-04 22:14:43 -07:00
Martin von Zweigbergk
7dc82c1580 cli: make jj git push push given branch
Now that we have native branches, we can make `jj git push` only be
about pushing a branch to a remote branch with the same name.

We may want to add back support for the more advanced case of pushing
an arbitrary commit to an arbitrary branch later, but let's get the
common case simplified first.
2021-08-04 22:14:43 -07:00
Martin von Zweigbergk
aeab6660d9 revsets: add branches() and tags() functions
The `branches()` function resolves to all "adds" on both local and
remote branches.
2021-08-04 12:03:13 -07:00
Martin von Zweigbergk
7fad705062 revsets: add support for resolving symbols as tags and branches
This adds support for resolving tags and branches in revsets. Branches
and tags can be resolved by specifying their name (e.g. "main"). To
specify a branch's target on a remote, use e.g. "main@origin". In case
of conflicts, they get resolved to their "adds".
2021-08-04 11:42:03 -07:00
Martin von Zweigbergk
8738421990 git: update own branch and tag records based on git refs
Now that we have our own representation of branches and tags, let's
update them when we import git refs. The View object's git refs are
now just a record of what the refs are in the underlying git ref last
time we imported them (we don't -- and won't -- provide a way for the
user to update our record of the git refs). We can therefore do a nice
3-way ref-merge using the `refs` module we added recently. That means
that we'll detect conflicts caused by changes made concurrently in the
underlying git repo and in jj's view.
2021-08-04 11:39:07 -07:00
Martin von Zweigbergk
044f23bc33 view: add support for ref-based branches and tags to model
I've finally decided to copy Git's branching model (issue #21), except
that I'm letting the name identify the branch across
remotes. Actually, now that I think about, that makes them more like
Mercurial's "bookmarks". Each branch will record the commit it points
to locally, as well as the commits it points to on each remote (as far
as the repo knows, of course). Those records are effectively the same
thing as Git's "remote-tracking branches"; the difference is that we
consider them the same branch. Consequently, when you pull a new
branch from a remote, we'll create that branch locally.

For example, if you pull branch "main" from a remote called "origin",
that will result in a local branch called "main", and also a record of
the position on the remote, which we'll show as "main@origin" in the
CLI (not part of this commit). If you then update the branch locally
and also pull a new target for it from "origin", the local "main"
branch will be divergent. I plan to make it so that pushing "main"
will update the remote's "main" iff it was currently at "main@origin"
(i.e. like using Git's `git push --force-with-lease`).

This commit adds a place to store information about branches in the
view model. The existing git_refs field will be used as input for the
branch information. For example, we can use it to tell if
"refs/heads/main" has changed and how it has changed. We will then use
that ref diff to update our own record of the "main" branch. That will
come later. In order to let git_refs take a back seat, I've also added
tags (like Git's lightweight tags) to the model in this commit.

I haven't ruled out *also* having some more persistent type of
branches (like Mercurials branches or topics).
2021-08-04 11:33:57 -07:00
Martin von Zweigbergk
9fb3521bf5 view: rename insert_git_ref() to set_git_ref()
I just feel like `set_git_ref()` is a more natural name (I was looking
for it the other day before I realized I had called it
`insert_git_ref()`).
2021-08-04 08:45:37 -07:00
Martin von Zweigbergk
7511b02da8 simple_op_store: add some tests
I'm about to start branches and tags in the view and we didn't have
any tests of the store itself (only via higher-level tests).
2021-07-31 19:49:30 -07:00
Martin von Zweigbergk
38032b0132 cleanup: commit transactions in tests when it's simpler
There were some tests that discarded a transaction only because it
used to be easier to do that than to commit and reload the repo. We
get the new repo back when we commit the transaction these days, so
now it's often easier to commit the transaction instead.
2021-07-30 17:47:00 -07:00
Martin von Zweigbergk
6b1ccd4512 view: add support for merging git ref targets
When there are two concurrent operations, we would resolve conflicting
updates of git refs quite arbitrarily before this change. This change
introduces a new `refs` module with a function for doing a 3-way merge
of ref targets. For example, if both sides moved a ref forward but by
different amounts, we pick the descendant-most target. If we can't
resolve it, we leave it as a conflict. That's fine to do for git refs
because they can be resolved by simply running `jj git refresh` to
import refs again (the underlying git repo is the source of truth).

As with the previous change, I'm doing this now because mostly because
it is a good stepping stone towards branch support (issue #21). We'll
soon use the same 3-way merging for updating the local branch
definition (once we add that) when a branch changes in the git repo or
on a remote.
2021-07-24 19:01:56 -07:00
Martin von Zweigbergk
0aa738a518 view: add support for conflicting git refs in the model
This adds support for having conflicting git refs in the view, but we
never create conflicts yet. The `git_refs()` revset includes all "add"
sides of any conflicts. Similarly `origin/main` (for example) resolves
to all "adds" if it's conflicted (meaning that `jj co origin/main` and
many other commands will error out if `origin/main` is
conflicted). The `git_refs` template renders the reference for all
"adds" and adds a "?" as suffix for conflicted refs.

The reason I'm adding this now is not because it's high priority on
its own (it's likely extremely uncommon to run two concurrent `jj git
refresh` and *also* update refs in the underlying git repo at the same
time) but because it's a building block for the branch support I've
planned (issue #21).
2021-07-24 19:01:56 -07:00
Martin von Zweigbergk
a14114256e cleanup: propagate some errors up when failing to write to file 2021-07-24 10:49:32 -07:00
Martin von Zweigbergk
d6a1f9848a cleanup: add explicit import of assert_matches, as required by new rustc 2021-07-24 10:48:52 -07:00
Martin von Zweigbergk
1a4d9d5644 evolution: don't create merge commits with one parent ancestor of another 2021-07-02 23:49:36 -07:00
Martin von Zweigbergk
443528159e conflicts: use new multi-way diff for materialized multi-way conflicts
This copies the conflict marker format I added a while ago to
Mercurial (https://phab.mercurial-scm.org/D9551), except that it uses
`+++++++` instead of `=======` for sections that are pure adds. The
reason I made that change is because we also have support for pure
removes (Mercurial never ends up in that situation because it has
exactly one remove and two adds).

This change resolves part of issue #19.
2021-06-30 12:29:20 -07:00
Martin von Zweigbergk
1390a044e7 files: make merge() accept any number of removes and adds as input 2021-06-30 12:09:36 -07:00
Martin von Zweigbergk
f8c016a8ea files: make MergeHunk support any number of removes and adds
I think `files::merge()` will be a useful place to share code for
resolving conflicting hunks after all. We'll want `MergeHunk` to
support multi-way merges then.
2021-06-30 09:55:16 -07:00
Martin von Zweigbergk
c93e806265 conflicts: make materialized tree/file conflicts etc a little more useful
When there are conflicts between different types of tree entries, we
currently materialize them as "Unresolved complex conflict.". This
change makes it so we mention what types were involved and what their
ids were (though we still don't have an easy way of resolving an id).
2021-06-29 10:49:15 -07:00
Martin von Zweigbergk
7effefa802 files: rewrite 3-way content merge using new multi-way diff 2021-06-26 23:49:58 -07:00
Martin von Zweigbergk
81b51de300 diff: rewrite diff() using new multi-way diff 2021-06-26 23:49:58 -07:00
Martin von Zweigbergk
9448fe665a files: use diff::DiffHunk in DiffLine definition
The new `diff::DiffHunk` type is very similar but more generic. We
don't need the generality here. I just don't two very similar types
with the same name.
2021-06-26 23:49:58 -07:00
Martin von Zweigbergk
987aecc749 diff: add a type for diffing arbitrary number of inputs
I have been trying to figure out how to generalize diffs and merges
for arbitrary number of inputs. For example, I want to have an
internal representation of an octopus merge adding 5 inputs (file
states/contents) and removing 4 inputs. I also want to be to represent
a diff from a regular 3-way-conflict state to a resolved state. Such a
diff would be from a state adding two inputs and removing one, to a
state adding just one input.

I finally realized last week that the problem is simple if you don't
care about adds vs removes. Instead, you line up the matching and
differing parts of all the inputs. It's then up to the caller to use
it in an appropriate way for its use case. For example, a regular diff
would pass in two inputs and would get back a list of matching and
dffering hunks. It might then present the first element of differing
hunks in red and the second element in green. Similarly, a 3-way merge
would pass in three inputs with the base first. It would then compare
the sides and decide on a resolution (or leave it unresolved if all
three sides are different).

This change adds a type representing this kind of multi-way
diff. Coming changes will update existing code to use it. In addition
to making the existing code simpler and more consistent, having this
in place should also:

 * Make it much easier to present merge conflicts involving more than
   3 parts.

 * Experiment with different ways of displaying diffs from/to conflict
   states.

 * Experiment with sub-line-level merging.
2021-06-26 23:49:58 -07:00
Martin von Zweigbergk
7360589246 working_copy: consider it an error if temp file cannot be renamed to target
Unlike the other places I fixed in 134940d2bb, the calls in
`working_copy.rs` should not simply use an existing file if the target
file was open. They should probably try again instead, but I'll leave
that for later.
2021-06-16 10:52:55 -07:00
Martin von Zweigbergk
4c416dd864 cleanup: let Clippy fix a bunch of warnings 2021-06-14 00:27:31 -07:00
Martin von Zweigbergk
134940d2bb windows: don't fail when concurrent threads/processes fail to rename file
On Windows, it seems that you can't rename a file if the target file
is open (Stebalien/tempfile#131). I think that's the reason for our
failing tests on Windows. This patch adds a simple wrapper around
`NamedTempFile::persist()` that returns the existing file instead of
failing, if there is one.
2021-06-14 00:09:22 -07:00
Martin von Zweigbergk
f26c9f01ce working_copy: silence some warnings about unused code on Windows 2021-06-13 22:34:33 -07:00
Martin von Zweigbergk
1ac72b9807 cli: rewrite jj restore <path> to use a matcher created from arguments 2021-06-09 16:45:30 -07:00
Martin von Zweigbergk
dd4c47f373 tree: support filtering diff by matcher
This change teaches `Tree::diff()` to filter by a matcher. It only
filters the result so far; it does not restrict the tree walk to what
`Matcher::visit()` says is necessary yet. It also doesn't teach the
CLI to create a matcher and pass it in.
2021-06-09 16:26:58 -07:00
Martin von Zweigbergk
b593e552b8 cleanup: remove some Vec<_> annotations, mostly by using collect_vec() 2021-06-09 14:21:57 -07:00
Martin von Zweigbergk
0ce50e137a store: use RepoPathComponent in Tree 2021-06-05 23:36:38 -07:00
Martin von Zweigbergk
6b609a8a9f tree: use RepoPathComponent in merge_tree_value() 2021-06-05 23:01:23 -07:00
Martin von Zweigbergk
5ece748dff tree: make TreeEntryDiffIterator use RepoPathComponent 2021-06-05 22:57:57 -07:00
Martin von Zweigbergk
ac38c8b641 repo_path: rename value() to clearer as_str() 2021-06-05 22:49:52 -07:00
Martin von Zweigbergk
fdeb499836 trees: merge into tree module 2021-06-05 14:20:07 -07:00
Martin von Zweigbergk
61ffa66fb5 view: move merge_views() onto View 2021-06-05 14:08:04 -07:00
Martin von Zweigbergk
ac2530637a view: merge ReadonlyView and MutableView into a single View
The two types have become very similar so it doesn't seem that there's
any point in having two types. We should probably do the same with
`ReadonlyEvolution` and `MutableEvolution`.
2021-06-05 14:08:04 -07:00
Martin von Zweigbergk
f722e8662f revsets: replace while loop by for loop, as pointed out by clippy 2021-06-05 09:10:04 -07:00
Martin von Zweigbergk
b50ef1410d styler: rename Styler to more standard Formatter 2021-06-05 08:38:28 -07:00
Martin von Zweigbergk
f6a9523b75 revsets: resolve symbol as change id if nothing else matches
This patch makes it so we attempt to resolve a symbol as the
non-obsolete commits in a change id if all other resolutions
fail.

This addresses issue #15. I decided to not require any operator for
looking up by change id. I want to make it as easy as possible to use
change ids instead of commit ids to see how well it works to interact
mostly with change ids instead of commit ids (I'll try to test that by
using it myself).
2021-05-31 08:32:10 -07:00
Martin von Zweigbergk
ca1df00e05 git: make change id the reverse bits of commit id
The fact that the default change id in git repos is currently a prefix
of the commit id makes it impossible to use for resolving a prefix of
the change id to commits. This patch addresses that by reversing the
bits of the change id (relative to the commit id). The next patch will
make it so a change id (or a prefix thereof) is a valid revset.
2021-05-30 22:10:02 -07:00
Martin von Zweigbergk
c7f701fddb revsets: make resolve_symbol() return multiple revisions
I'd like to experiment with mostly using change ids instead of commit
ids on the CLI. Then it needs to be easy to refer to the non-obsolete
commits in a change, which means we probably don't want to require any
operators (i.e. a plain change id should resolve to the non-obsolete
commits in the change). This patch prepares for letting a change id
resolve to (possibly) many commits.
2021-05-30 13:39:24 -07:00
Martin von Zweigbergk
a3cb0ee3d1 index: add a type parameter to PrefixResolution to prepare for ChangeId prefix 2021-05-30 10:24:35 -07:00
Martin von Zweigbergk
0d9d141705 revsets: refactor symbol lookup to prepare for trying to look up change id 2021-05-29 23:48:48 -07:00
Martin von Zweigbergk
e658cc0084 revsets: add a RevsetExpression::evaluate() method for convenience 2021-05-28 09:01:34 -07:00
Martin von Zweigbergk
c4e2e6b598 revsets: remove unnecessary lifetime parameter 2021-05-28 08:58:44 -07:00
Martin von Zweigbergk
ba8ff31e32 cli: make the working copy changes in jj status clearer 2021-05-23 22:08:34 -07:00
Martin von Zweigbergk
ef6e5c7ec2 TreeDiffIterator: simplify conditions by separating trees from non-trees 2021-05-19 21:27:53 -07:00
Martin von Zweigbergk
29fe1e9737 TreeEntryDiffIterator: yield (Option<T>, Option<T>) instead of Diff<T>
This will allow us to simplify `TreeDiffIterator::next()` in the next
commit.
2021-05-19 17:01:12 -07:00
Martin von Zweigbergk
54f6165ef1 repo_path: replace remaining uses of DirRepoPath by RepoPath 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
69664f57fe working_copy: use RepoPath in write_tree() 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
5421251e72 repo_path: change representation of RepoPath to have only a list of components 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
8347d95159 repo_path: remove unnecessary RepoPath::new() and let caller use join() 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
57d2bc4068 repo_path: remove unused functions 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
c66990d3a3 repo_path: rename from() to from_internal_{,dir}_string()
Since `RepoPath` can be either a file or a directory, I made its name
not include `file`.
2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
ef726be78b repo_path: rename to_internal_string() to separate names for dir vs file 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
257ea39e68 gitignore: add assertion that prefix is empty or ends with '/' 2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
03ae1b747c repo_path: remove FileRepoPath in favor of just RepoPath
I had initially hoped that the type-safety provided by the separate
`FileRepoPath` and `DirRepoPath` types would help prevent bugs. I'm
not sure if it has prevented any bugs so far. It has turned out that
there are more cases than I had hoped where it's unknown whether a
path is for a directory or a file. One such example is for the path of
a conflict. Since it can be conflict between a directory and a file,
it doesn't make sense to use either. Instead we end up with quite a
bit of conversion between the types. I feel like they are not worth
the extra complexity. This patch therefore starts simplifying it by
replacing uses of `FileRepoPath` by `RepoPath`. `DirRepoPath` is a
little more complicated because its string form ends with a '/'. I'll
address that in separate patches.
2021-05-19 15:11:04 -07:00
Martin von Zweigbergk
1fe8f6ac27 cli: on init, give a proper error message instead crashing when repo exists 2021-05-19 14:53:37 -07:00
Martin von Zweigbergk
525a5116a2 RepoLoader: stop returning Result since the functions cannot currently fail 2021-05-19 14:12:54 -07:00
Martin von Zweigbergk
ecc2a6b968 cli: avoid using angle brackets in name/email placeholder, to please git
Git doesn't like seeing "<" and ">" in the user's name or email, so
let's switch to parentheses.
2021-05-19 13:02:39 -07:00
Martin von Zweigbergk
b6038399f0 matchers: rename AlwaysMatcher to EverythingMatcher 2021-05-16 21:21:34 -07:00
Martin von Zweigbergk
45bb748f9b tests: use OS directory separator in path when writing files to working copy 2021-05-16 21:16:27 -07:00
Martin von Zweigbergk
31f3984728 cli: use placeholder name/email if not configured instead of crashing 2021-05-16 14:52:44 -07:00
Martin von Zweigbergk
b671eca7ad cli: print file paths as relative and using OS directory separator 2021-05-16 13:43:23 -07:00
Martin von Zweigbergk
4d51cd96a1 repo: scan for .jj/ directory up the tree
It's kind of silly that I haven't fixed this earlier. I just don't
need it often myself.
2021-05-16 11:06:13 -07:00
Martin von Zweigbergk
c8f2de1ecc matchers: simplify tests using hashset! macro and improve coverage 2021-05-16 10:44:45 -07:00
Martin von Zweigbergk
0d58060d34 repo: move code for loading store into StoreWrapper 2021-05-16 09:52:08 -07:00
Martin von Zweigbergk
6d61475f66 merge: use recursive merge when there are multiple common ancestors 2021-05-15 14:05:34 -07:00
Martin von Zweigbergk
d42e6c77b2 project: rename project from Jujube to Jujutsu
"Jujutsu" is probably much more familiar and relatable to most
people. Also, I'm still not sure how "jujube" is supposed to be
pronounced :P
2021-05-15 10:28:40 -07:00
Martin von Zweigbergk
1cdac0f902 working copy: use system path separator when creating file system path
Hopefully this fixes the failing Windows CI.
2021-05-15 10:04:09 -07:00
Martin von Zweigbergk
a71c56e5e1 evolution: rewrite orphan resolution as iterator 2021-05-15 09:16:19 -07:00
Martin von Zweigbergk
79b5b8d681 evolution: rewrite divergence resolution as iterator
This commit rewites the divergence-resolution part of `evolve()` as an
iterator (though not implementing the `Iterator` trait). Iterators are
just much easier to work with: they can easily be stopped, and errors
are easy to propagate. This patch therefore lets us propagate errors
from writing to stdout (typically pipe errors).
2021-05-15 09:16:19 -07:00
Martin von Zweigbergk
401bd2bd7b evolution: refactor evolve_divergent_change() to return result
I'm going to replace the `evolve()` function and its listener trait by
an iterator or something similar. This commit is to prepare for that.
2021-05-15 09:16:19 -07:00
Martin von Zweigbergk
a028f33e3b working copy: don't remove already-tracked files in ignored directories
The recent optimization to not walk ignored directories did not
account for the case where there already are files in the ignored
directory.
2021-05-15 09:15:45 -07:00
Martin von Zweigbergk
cbc3e915b7 working copy: allow overwriting untracked files
Before this commit, we would crash when checking out a commit that has
a file that's currently ignored in the working copy.
2021-05-15 08:24:56 -07:00
Martin von Zweigbergk
df175f549f working copy: don't assume that $HOME exists (it doesn't on Windows CI) 2021-05-13 23:02:56 -07:00
Martin von Zweigbergk
c1060610bd working copy: optimize simple case of entire directory being ignored
This makes the workging copy walk skip an entire ignored directory if
there are no negative patterns later in the ignore file. That speeds
up `jj st` in this repo with ~13k files in `target/` from ~100 ms to
~25 ms (6.0dB). This closes issue #8.
2021-05-13 22:28:38 -07:00
Martin von Zweigbergk
88f7f4732b gitignores: add own implementation and stop using libgit2's
This is to address issue #8. I haven't added the optimization to avoid
walking all the files in `target/` yet. Even so, this patch still
speeds up `jj st` in this repo, with ~13k files in `target/`, from
~320 ms to ~100 ms (-5.1dB). The time actually checking if paths match
gitignores seems to go down from 116 ms to 6 ms. I think that's mostly
because libgit2 has to look for `.gitignore` files in every parent
directory every time we ask it about a file, while the rewritten code
looks for a `.gitignore` file only when visiting a new directory.
2021-05-13 22:23:59 -07:00
Martin von Zweigbergk
31eff96cb7 cli: record full argv in operation log
When using the command line interface (which is the only interface so
far), it seems more useful to see the exact command that was run than
a logical description of what it does. This patch makes the CLI record
that information in the operation metadata in a new key/value field. I
put it in a generic key/value field instead of a more specialized
field because the key/value field seems like a useful thing to have in
general. However, that means that we "have to" do shell-escaping when
saving the data instead of leaving the data unescaped and adding the
shell-escaping when presenting it. I added very simple shell-escaping
for now.
2021-05-09 22:42:13 -07:00
Martin von Zweigbergk
70b99c960e transaction: make commit() return resulting ReadonlyRepo
I've wanted the API to look like this for a while. It seems like a
good API to me. It means that the caller won't have to reload the repo
after committing. The cost seems relatively small. It involves copying
potentially a lot of data in memory (at least the View object), but it
shouldn't involve reading from disk or any other processing. To reduce
the amount of data to copy, it may be worth switching to persistent
data types. I've also wanted to do that for the copying we do when
start a transaction.

I couldn't measure any slowdown caused by this change.
2021-05-08 13:50:59 -07:00
Martin von Zweigbergk
9e3e6f03a1 repo: store Operation object, not just its ID in ReadonlyRepo
This change simplifies a bit on its own, and it will help with the
next change as well.
2021-05-07 22:53:11 -07:00
Martin von Zweigbergk
c05878b9a6 maintenance: add support for latest nightly toolchain 2021-05-07 22:48:24 -07:00
Martin von Zweigbergk
eb348be32c revsets: make description() lazy by using new FilterRevset 2021-05-02 15:16:51 -07:00
Martin von Zweigbergk
e6f751a24d revsets: generalize merge revset implementation to be a generic filter 2021-05-02 14:58:10 -07:00
Martin von Zweigbergk
7065cecfdc revsets: add revset yielding merge commits 2021-05-02 14:33:38 -07:00
Martin von Zweigbergk
aef27d5701 revsets: remove transitive edges in graph iterator by default
The git.git repo seems to have lots of merges from far back in the
history into newer history. That results in `jj log -r 'git_refs()'`
being completely useless because of the number of such edges. For
example, v2.31.0 has almost 600 edges going out of it and presumably
merging (forking) back into various different previous versions. Git,
unlike Mercurial, seems to remove an edge from the graph if the edge
can also be reached via a longer path. This commit makes it so we also
do that (i.e. the filtered graph is a transitive reduction of the
graph before filtering).

This slows down `jj log -r ,,v2.0.0 -T ""` by about 2%. That's still
small enough that it doesn't seem worth it to have a separate iterator
for contiguous ranges (which would be an option).
2021-05-01 23:25:33 -07:00
Martin von Zweigbergk
33da97f0bf revsets: add iterator adapter for rendering simplified graph of set
When rendering a non-contiguous subset of the commits, we want to
still show the connections between the commits in the graph, even
though they're not directly connected. This commit introduces an
adaptor for the revset iterators that also yield the edges to show in
such a simplified graph.

This has no measurable impact on `jj log -r ,,v2.0.0` in the git.git
repo.


The output of `jj log -r 'v1.0.0 | v2.0.0'` now looks like this:

```
o   e156455ea491 e156455ea491 gitster@pobox.com 2014-05-28 11:04:19.000 -07:00 refs/tags/v2.0.0
:\  Git 2.0
: ~
o c2f3bf071ee9 c2f3bf071ee9 junkio@cox.net 2005-12-21 00:01:00.000 -08:00 refs/tags/v1.0.0
~ GIT 1.0.0
```

Before this commit, it looked like this:

```
o e156455ea491 e156455ea491 gitster@pobox.com 2014-05-28 11:04:19.000 -07:00 refs/tags/v2.0.0
| Git 2.0
| o   c2f3bf071ee9 c2f3bf071ee9 junkio@cox.net 2005-12-21 00:01:00.000 -08:00 refs/tags/v1.0.0
| |\  GIT 1.0.0
```

The output of `jj log -r 'git_refs()'` in the git.git repo is still
completely useless (it's >350k lines and >500MB of data). I think
that's because we don't filter out edges to ancestors that we have
transitive edges to. Mercurial also doesn't filter out such edges, but
Git (with `--simplify-by-decoration`) seems to filter them out. I'll
change it soon so we filter them out.
2021-05-01 14:56:52 -07:00
Martin von Zweigbergk
b953d7d801 git_store: revert lock timeout to 10s
This backs out commit 67e11e0fc3. We now use one thread per CPU in
tests (419002fab4), so I hope the tests won't need the 1-minute
timeout anymore.
2021-05-01 14:40:47 -07:00
Martin von Zweigbergk
df2caab274 tests: add a helper for building commit graphs when only topology is important 2021-04-30 22:46:20 -07:00
Martin von Zweigbergk
419002fab4 tests: use one thread per core in concurrency tests
The tests have been failing in GitHub's CI quite frequently. It's
about time I try to do something about it. Let's see if this helps.
2021-04-29 00:01:04 -07:00
Martin von Zweigbergk
46edbbef09 revsets: add revset function for getting all git refs
This adds a `git_refs()` revset that includes all commits pointed to
by a git ref. It's not very useful yet because the graph log doesn't
use the right type of edges for non-contiguous commits.
2021-04-28 23:34:17 -07:00
Martin von Zweigbergk
f5151bdbbe revsets: add a RevsetIterator type, to simplify API and enable adapters 2021-04-28 23:34:17 -07:00
Martin von Zweigbergk
0145be3693 index: add a newtype wrapper for IndexPosition
It seems better to both hide the specific type and to get some more
type safety.
2021-04-28 23:34:14 -07:00
Martin von Zweigbergk
5b18e89a4d diff: fix LCS when a line/word/byte has been moved later 2021-04-28 23:33:18 -07:00
Martin von Zweigbergk
13134bd5a4 cleanup: address warnings reported by new clippy version 2021-04-28 09:12:48 -07:00
Martin von Zweigbergk
c6f6498cc9 conflicts: add newline after conflict marker lines
Merging is currently done with line-level granularity, so it makes
sense to have newlines after the markers. That makes them easier to
edit out when resolving conflicts.
2021-04-24 13:53:24 -07:00
Martin von Zweigbergk
9dc18524fc revsets: add "ancestor difference" range operator (like git's ..) 2021-04-23 19:10:28 -07:00
Martin von Zweigbergk
49173de423 revsets: add DAG range operator (like hg's infix ::)
This lets you use the same operator as we currently have for ancestors
and descendants (`,,`) to also specify a DAG range. That's what
Mercurial uses the `::` operator for and what Git has `git log
--ancestry-path` for.
2021-04-23 19:10:26 -07:00
Martin von Zweigbergk
d8c209c82a revsets: give parents/children operators higher precedence than range operators 2021-04-23 18:45:42 -07:00
Martin von Zweigbergk
9de5f94af6 revsets: use same error variant for imcomplete parse as for syntax error 2021-04-23 16:59:04 -07:00
Martin von Zweigbergk
0b0374d401 revsets: make parsed Children and Descendants have roots and heads
It seems clearer to let the parsed `RevsetExpression`s have only root
and head expression instead of adding the ancestors when building the
expression tree.
2021-04-23 16:15:17 -07:00
Martin von Zweigbergk
f209354503 revsets: simplify and clarify description revset slightly 2021-04-23 13:30:31 -07:00
Martin von Zweigbergk
145731ec74 revsets: change operators around a bit to prepare for infix DAG range operator
I really liked the idea of having the operators for parents and
ancestors (etc.) look similar, but that turned out to be problematic
when we want to add an infix operator for a DAG range (hg's `::`
revset operator and git's `--ancestry-path` flag). Let's say we chose
`:*:` as the operator. Part of the problem is how to parse `foo:*:bar`
without eagerly parsing the `foo:`. It would also be nicer to use
exactly the same operator as prefix, postfix, and infix. Since the
"parents" operator can be repeated, we can't have it be just `:` and
the "ancestors" operator be `::`. We could make the "ancestors"
operator be something like `*:*` (or anything symmetric with the `:`
symbol on the inside). However, at that point, the operator is getting
ugly and hard to type. Another option would be to use `:` for
ancestors and `::` for parents, but that is counterintuitive and get
annoying if you want to repeat it. So it seems that the best option is
to simply pick different symbols for parents/children and
ancestors/descendants/range.

This patch changes the ancestors/descendants operators to both be
`,,`. I'm not at all attached to that particular symbol. I suspect
we'll change it later.
2021-04-23 11:11:07 -07:00
Martin von Zweigbergk
c894a7435f revsets: make function arguments always be revset expressions
Now that expressions may contain literal strings, we can simply have
functions accept only expressions arguments. That simplifies both the
grammar and the code.

A small drawback is that `description((foo), bar)` is now allowed and
does a search for the string "foo" (not "(foo)"). That seems
unlikely to trip up users.
2021-04-23 10:51:41 -07:00
Martin von Zweigbergk
5819687237 revsets: accept quoted symbol names
Git refs with names containing e.g "-" are currently not accepted
symbol names, and I don't plan to change the grammar to accept
them. Instead, let's have the user quote symbol names containing
unusual characters. That way we can keep these symbols reserved for
revset operators.

With this patch the user can do e.g. `jj diff -r '"v2.9.0-rc2"'`.
2021-04-23 10:37:42 -07:00
Martin von Zweigbergk
d78fd9e979 revsets: add functions and operators for children and descendants
This adds `children(<set>)` and `<set>:` for the children of the given
set, and `descendants(<set>)` and `<set>:*` for the descendants of the
given set. The children and descendants are filtered to be among
ancestors of non-obsolete commits. I haven't added a way of overriding
that yet.
2021-04-21 23:34:20 -07:00
Martin von Zweigbergk
618abf4379 revsets: use consistent "_op" suffix for operator rules in the grammar
This is especially important now that we leak the rule names into the
`SyntaxError` message. For example, the error message when doing `jj
diff -r :` will now mention "expected parents_op, ancestors_op, or
primary". It seems much clearer with the "_op" suffixes there. Longer
term, we should think more about how we can best surface syntax errors
from the library crate.
2021-04-21 18:53:58 -07:00
Martin von Zweigbergk
a4ef42962c revsets: don't crash when given ungrammatical revset
I also snuck in some updates to the test cases.
2021-04-21 18:53:58 -07:00
Martin von Zweigbergk
744f209e76 revsets: move parse-tests to to revset module (from separate test module)
The tests don't need any complex set up (no repo necessary), so they
can be in the `revset` module itself. I'm sure we'll need to split up
that module later (at least separate out the parsing), but that's a
separate problem.
2021-04-21 18:53:58 -07:00
Martin von Zweigbergk
6bc1361b84 index: make revision walk be by position instead of generation number
I don't know why I made it walk by generation number to start
with. Walking by position is better in at least two ways: 1) revsets
now depend on the walks to be by descending index position (though
they could equally well depend on the walks to be by generation number
-- it just needs to be consistent), and 2) the log output gets less
interleaved.

This commit makes the number of bytes in the graphlog output in the
git.git repo drop by ~40% due to the reduced amount of
interleaving. Also, it reduces the time of `jj bench walkrevs v1.0.0
v2.0.0` in the git.git repo by 32% (9.4ms -> 6.4ms) and `jj bench
walkrevs v2.0.0 v1.0.0` by 33% (7.7ms -> 5.1ms).
2021-04-21 18:53:56 -07:00
Martin von Zweigbergk
98f4e24892 cli: make benchmark ids include parameters
It makes no sense to compare a run of `jj walkrevs v1.0.0 v2.0.0` with
a run of `jj walkrevs v2.0.0 v1.0.0`, for example.
2021-04-21 16:56:45 -07:00
Martin von Zweigbergk
64fcf90c68 view: make root commit public 2021-04-18 23:04:15 -07:00
Martin von Zweigbergk
b52cfc156c revsets: add a public_heads() revset function 2021-04-18 22:52:31 -07:00
Martin von Zweigbergk
3a65c1d2ab revsets: add intersection operator 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
332580918c revsets: add union operator 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
2ac5d1f912 revsets: allow spaces in most places (but not after prefix operators) 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
c04f418e67 revsets: add difference operator 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
e733b074e1 revsets: restructure grammar to prepare for operator precedences 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
d9ae7cdd6d revsets: allow parenthesized expressions
We'll clearly want to allow parenthesized expressions once we have
infix operators (if not before). Let's prepare by allowing parentheses
already now.
2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
d71c083a7f cli: use revsets also when looking up by description 2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
62f0778942 revsets: add a description() revset
The revset is currently eagerly evaluated, which is clearly bad. We'll
need to fix that later.
2021-04-18 22:45:12 -07:00
Martin von Zweigbergk
05e9149157 revsets: add a non_obsolete_heads() revset
This change adds a `non_obsolete_heads(<set>)` revset, which walks up
ancestors of the input set until it gets to a non-obsolete and
non-pruned commit. That's what we do by default in `jj log`
(i.e. without `--all`). Now we can make `jj log` use revsets and teach
it a `-r` option!
2021-04-18 22:45:10 -07:00
Martin von Zweigbergk
30aa459d2a revsets: add a all_heads() revset function
This adds a `all_heads()` revset function, which contains all heads in
the view, i.e. including non-public heads and obsolete heads.
2021-04-18 22:31:46 -07:00
Martin von Zweigbergk
88904e2b63 revsets: add support for function syntax
This adds `parents(foo)` and `ancestors(foo)` as alternative ways of
writing `:foo` and `*:foo`. 

I haven't added support for for whitespace yet; the parsing is very
strict. The error messages will also need to be improved later.
2021-04-18 21:25:58 -07:00
Martin von Zweigbergk
2d6325b0f4 revsets: define grammar in pest 2021-04-18 21:25:58 -07:00
Martin von Zweigbergk
0d62a336af revsets: initial support for Mercurial-style revsets
This patch adds initial support for a DSL for specifying revisions
inspired by Mercurial's "revset" language. The initial support
includes prefix operators ":" (parents) and "*:" (ancestors) with
naive parsing of the revsets. Mercurial uses postfix operator "^" for
parent 1 just like Git does. It uses prefix operator "::" for
ancestors and the same operator as postfix operator for descendants. I
did it differently because I like the idea of using the same operator
as prefix/postfix depending on desired direction, so I wanted to apply
that to parents/children as well (and for
predecessors/successors). The "*" in the "*:" operator is copied from
regular expression syntax. Let's see how it works out. This is an
experimental VCS, after all.

I've updated the CLI to use the new revset support.

The implementation feels a little messy, but you have to start
somewhere...
2021-04-18 21:25:51 -07:00
Martin von Zweigbergk
7861968f64 index: make IndexRef::entry_by_id() etc return entry with repo's lifetime
It's useful to be able to know that given a `repo: RepoRef<'a>`, the
the lifetime of `repo.index().entry_by_id()` will also be `'a`.
2021-04-15 07:00:04 -07:00
Martin von Zweigbergk
4c3d73ff3b evolution: walk orphans using index
This actually seems to make it slightly slower, but it fixes an
important bug (we used to evolve only one topological branch per `jj
evolve` call). The slowdown seemed to be on the order of 5% when
evolving 100 commits on git.git's "what's cooking" branch.
2021-04-14 08:25:14 -07:00
Martin von Zweigbergk
783e1f6512 repo: make MutableRepo have an Arc<ReadonlyRepo> instead of a reference
I suspect that at least one reason that I didn't make
`MutableRepo::base_repo` by an `Arc<ReadonlyRepo>` before was that I
thought that that would mean that `start_transaction()` would need be
moved off of `ReadonlyRepo` so it can be given an
`&Arc<ReadonlyRepo>`, which would make it much less convenient to
use. It turns out that a `self` argument can actually be of type
`&Arc<ReadonlyRepo>`.
2021-04-11 13:42:31 -07:00
Martin von Zweigbergk
ce855bccfa repo: make reload() and reload_at() return a new ReadonlyRepo
After this patch `ReadonlyRepo` is even closer to readonly. That makes
it easier to reason about. It will allow some further cleanups too.
2021-04-11 10:39:29 -07:00
Martin von Zweigbergk
e3ca27bf77 revsets: support git refs 2021-04-10 10:10:09 -07:00
Martin von Zweigbergk
40f75ec641 revsets: don't crash if given non-hex symbol 2021-04-10 10:08:47 -07:00
Martin von Zweigbergk
9e8a7e2ba6 revsets: move code for resolving symbol to commit to new module 2021-04-10 09:46:27 -07:00
Martin von Zweigbergk
102f7a0416 diff: also recurse into final region after after unchanged regions
See test case for details.


Before:
test bench_diff_10k_lines_reversed  ... bench:  36,249,659 ns/iter (+/- 174,455)
test bench_diff_10k_modified_lines  ... bench:  37,258,890 ns/iter (+/- 803,963)
test bench_diff_10k_unchanged_lines ... bench:       4,252 ns/iter (+/- 69)
test bench_diff_1k_lines_reversed   ... bench:     982,834 ns/iter (+/- 6,467)
test bench_diff_1k_modified_lines   ... bench:   3,343,469 ns/iter (+/- 23,243)
test bench_diff_1k_unchanged_lines  ... bench:         231 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:      95,559 ns/iter (+/- 816)


After:
test bench_diff_10k_lines_reversed  ... bench:  36,186,715 ns/iter (+/- 196,903)
test bench_diff_10k_modified_lines  ... bench:  37,511,000 ns/iter (+/- 1,370,476)
test bench_diff_10k_unchanged_lines ... bench:       3,099 ns/iter (+/- 8)
test bench_diff_1k_lines_reversed   ... bench:     986,010 ns/iter (+/- 11,565)
test bench_diff_1k_modified_lines   ... bench:   3,370,938 ns/iter (+/- 17,041)
test bench_diff_1k_unchanged_lines  ... bench:         230 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:     102,189 ns/iter (+/- 1,052)


So this patch makes diffing even slower (but still easily fast enough
for all cases I've run into in real life). There's probably a lot that
can be done to make things faster, but the first priority is that the
diffs are correct and easy to read.
2021-04-08 23:54:54 -07:00
Martin von Zweigbergk
f4a41f3880 trees: make tree diff return an iterator instead of taking a callback
This is yet another step towards making it easy to propagate
`BrokenPipe` errors. The `jj diff` code (naturally) diffs two trees
and prints the diffs. If the printing fails, we shouldn't just crash
like we do today.

The new code is probably slower since it does more copying (the
callback got references to the `FileRepoPath` and `TreeValue`). I hope
that won't make a noticeable difference. At least `jj diff -r
334afbc76fbd --summary` didn't seem to get measurably slower.
2021-04-07 23:18:00 -07:00
Martin von Zweigbergk
8b2ce18254 trees: make diff_entries() return an iterator instead of taking a callback
The iterator version is easier to use and we get rid of the ugly type
parameter for the error type. I also simplified the code by using
`Peekable` iterators.
2021-04-07 15:48:11 -07:00
Martin von Zweigbergk
5c10c93e64 diff: fix tests broken by the previous commit
Sorry, I forgot to run the automated tests again :(
2021-04-07 11:00:04 -07:00
Martin von Zweigbergk
0dd000d236 diff: do final refinement at byte-level for non-word bytes
This results in significantly more readable diffs on commits like
659393bec2 in this repo.


Before:
test bench_diff_10k_lines_reversed  ... bench:  38,122,998 ns/iter (+/- 557,688)
test bench_diff_10k_modified_lines  ... bench:  32,556,563 ns/iter (+/- 548,114)
test bench_diff_10k_unchanged_lines ... bench:       4,231 ns/iter (+/- 15)
test bench_diff_1k_lines_reversed   ... bench:     958,296 ns/iter (+/- 46,963)
test bench_diff_1k_modified_lines   ... bench:   3,014,723 ns/iter (+/- 15,830)
test bench_diff_1k_unchanged_lines  ... bench:         249 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:      78,599 ns/iter (+/- 1,079)

After:
test bench_diff_10k_lines_reversed  ... bench:  38,289,493 ns/iter (+/- 413,712)
test bench_diff_10k_modified_lines  ... bench:  37,352,516 ns/iter (+/- 1,293,950)
test bench_diff_10k_unchanged_lines ... bench:       4,238 ns/iter (+/- 13)
test bench_diff_1k_lines_reversed   ... bench:     967,253 ns/iter (+/- 8,506)
test bench_diff_1k_modified_lines   ... bench:   3,358,028 ns/iter (+/- 37,154)
test bench_diff_1k_unchanged_lines  ... bench:         233 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:      95,787 ns/iter (+/- 740)


So the biggest slowdown is when there are modified lines.
2021-04-07 10:27:17 -07:00
Martin von Zweigbergk
f634ff0e3f files: make diff() return an iterator instead of using a callback
Iterators are generally nicer to work with. My immediate goal is to be
able to propagate errors when failing to write to stdout.
2021-04-07 10:07:18 -07:00
Martin von Zweigbergk
d7395cc34a diff: add copyright header 2021-04-06 21:26:37 -07:00
Martin von Zweigbergk
7e4e43f358 diff: first diff lines, then refine to words, producing better diffs
The new diff algorithm produces pretty bad diffs in some cases, such
as cc4b1e9230 in this repo (the parent of this commit). I think the
problem there is that many words are repeated over and over. Diffing
first at the line level and then refining the diff of the changed
ranges at the word level gives much better results. That's what this
patch does. After this patch, `jj diff -r cc4b1e923091` looks pretty
similar to the diff in GitHub's UI.

I hope to get around to doing the same for the merge code soon.

Impact on benchmarks:

Before:
test bench_diff_10k_lines_reversed  ... bench:  42,647,532 ns/iter (+/- 765,347)
test bench_diff_10k_modified_lines  ... bench:  21,407,980 ns/iter (+/- 126,366)
test bench_diff_10k_unchanged_lines ... bench:       4,235 ns/iter (+/- 16)
test bench_diff_1k_lines_reversed   ... bench:   1,190,483 ns/iter (+/- 7,192)
test bench_diff_1k_modified_lines   ... bench:   1,919,766 ns/iter (+/- 9,665)
test bench_diff_1k_unchanged_lines  ... bench:         231 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:     174,702 ns/iter (+/- 1,199)

After:
test bench_diff_10k_lines_reversed  ... bench:  38,289,509 ns/iter (+/- 129,004)
test bench_diff_10k_modified_lines  ... bench:  33,140,659 ns/iter (+/- 3,989,339)
test bench_diff_10k_unchanged_lines ... bench:       3,099 ns/iter (+/- 14)
test bench_diff_1k_lines_reversed   ... bench:     973,551 ns/iter (+/- 94,895)
test bench_diff_1k_modified_lines   ... bench:   3,033,818 ns/iter (+/- 29,513)
test bench_diff_1k_unchanged_lines  ... bench:         230 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:      79,100 ns/iter (+/- 963)


So most of them get slower, as expected. The last one, taken from a
real diff in the git.git repo, get faster, however (which is also what
I would have expected).
2021-04-04 21:50:31 -07:00
Martin von Zweigbergk
cc4b1e9230 test: fix merge tests to expect line-based merging
I made a quite late change in a recent patch to make the merge code to
merge based on lines instead of words. I forgot to update the tests
(and to even run them). Sorry :(
2021-04-01 08:27:27 -07:00
Martin von Zweigbergk
c071d412af diff: use new diff algorithm for content diff
The previous patch switched over the content-merge code to use the new
histogram diff code. This patch switches over the content-diff code to
use the histogram diff code. As before, the immediate goal is to speed
it up. `jj diff -r c28ded83fc` in the git.git repo is a good example
of a diff that's extremely slow to calculate with our current
LCS-based diff. With this patch, that drops from 35 s to 0.12 s.

The diff was slightly better before. I think that's mostly because of
our different definition of a "word" in the data. We can improve that
later. The speedup we get now is easily worth the slightly worse diff.
2021-03-31 22:22:59 -07:00
Martin von Zweigbergk
3c35dbace6 merge: use new diff algorithm for finding sync regions
With the histogram diff code from the previous patch, we can now start
using that for finding the "sync regions" in 3-way merge. That helps a
lot with the slow merging we had before this patch. `jj diff -r
9d540e9726` in the git.git repo drops from 22 s to 0.15 s with this
patch. (That commit is a rather arbitrary merge commit from aroun 5
years ago.)

With the new diff algorithm, the output of `jj diff -r 9d540e9726` in
git.git looks better if we find unchanged sync regions based on lines
than on words, so that's what I'm using in this patch. That's a change
compared the the LCS-based diff we used before this patch. I suspect
the reason that finding sync regions based on words works worse now is
not because of the change from LCS to histogram but because of the
change in how we define a word. My goal right now is mostly to make it
faster; I'll get back to refining the diff result later.
2021-03-31 22:16:19 -07:00
Martin von Zweigbergk
1e657c5331 diff: add a histogram(-like?) diff algorithm
The current diff algorithm does a full LCS on the words of the texts,
which is really slow. Diffing the working copy when e.g.
`src/commands.py` has changes far apart takes seconds. This patch adds
an implementation inspired by JGit's Histogram diff. I say "inspired"
because I just didn't quite understand it :P In particular, I didn't
understand what it does when it finds non-unique elements. I decided
to line up the leading common elements on both sides of the merge. I
don't know if that usually gives good enough results in practice.

I'm sure this can still be optimized a lot, but this seems good enough
as a start. There is also many things to improve about the quality of
the diffs.
2021-03-31 22:15:36 -07:00
Martin von Zweigbergk
998e23db3c index: add IndexEntry::parents() and predecessors() returning Vec<IndexEntry> 2021-03-31 14:48:03 -07:00
Martin von Zweigbergk
53d1757994 dag_walk: remove unused TopoIter 2021-03-18 16:42:30 -07:00
Martin von Zweigbergk
db4e8bc458 cargo: upgrade to protobuf 2.22.1 to avoid workaround for rustfmt::skip 2021-03-18 13:06:42 -07:00
Martin von Zweigbergk
07c2b2316f repo: remove obsolete part of a TODO (we use the index to filter out non-heads) 2021-03-17 08:28:21 -07:00
Martin von Zweigbergk
30cd94f842 dag_walk: rename unreachable() to heads() to match name we use in index module 2021-03-16 23:54:51 -07:00
Martin von Zweigbergk
5aec8b9d77 evolution: use index for filtering out ancestors of candidates in new_parent()
This speeds up `jj evolve` of 100 linear commits of the "what's
cooking" branch in the git.git repo further, from ~700 ms to ~400 ms.
2021-03-16 23:43:44 -07:00
Martin von Zweigbergk
73f20c8696 transaction: delete write_commit() and as_repo_ref() helpers
With this patch, the simple delegating helpers are gone from
`Transaction`.
2021-03-16 22:45:58 -07:00
Martin von Zweigbergk
f9873c49ec transaction: remove add_head(), remove_head(), and set_view() helpers 2021-03-16 22:31:28 -07:00
Martin von Zweigbergk
06df609482 transaction: delete check_out() and set_checkout() helpers 2021-03-16 22:31:28 -07:00
Martin von Zweigbergk
808d0af66d transaction: remove evolution() and store() helpers 2021-03-16 22:31:24 -07:00
Martin von Zweigbergk
16d97ef8c0 transaction: remove index() and view() helpers 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
5ed14185a0 git: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
2c2b5fb3b7 evolution: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
c3b9d1cd13 rewrite: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
ee8423a69e MutableRepo: rename repo to base_repo to clarify its role 2021-03-16 22:05:50 -07:00
Martin von Zweigbergk
69de4698ac tests: set $HOME in a few tests to avoid depending in developer's ~/.gitignore
I just changed my `~/.gitignore` and some tests started failing
because the working copy respects the user's `~/.gitignore`. We should
probably not depend on `$HOME` in the library crate. For now, this
patch just makes sure we set it to an arbitrary directory in the tests
where it matters.
2021-03-16 22:05:36 -07:00
Martin von Zweigbergk
67e11e0fc3 git_store: wait 1 minute for lock on refs to help tests
`test_commit_parallel` was failing on Mac in the GitHub CI. I suspect
the reason was that it was timing out. The test runs in about 1 s on
my Linux desktop and in about 3 s on my Mac laptop. It failed after 31
in the GitHub CI. This patch increases the timeout to 1 minute to try
to make the test pass. It would be better to set the timeout to a
higher value only in tests, but this will be good enough for now. By
the way, it has turned out that git notes (at least libgit2's
implementation of them) are too slow, so we should probably eventually
create our own storage for the extra metadata instead.
2021-03-16 11:28:22 -07:00
Martin von Zweigbergk
81a0e0bd2a protobuf: upgrade to version 2.22.0
I only noticed that there was a newer version when running `cargo
install --path .`, which resulted in warnings about deprecated
functions. There's no other reason I'm aware of to upgrade now.
2021-03-15 17:09:29 -07:00
Martin von Zweigbergk
1ebdd4ecf0 MutableRepo: use index when enforcing view invariants
We can now finally use the commit index for filtering out ancestors
from the sets of heads.

I haven't timed the change from most of the recent work on
performance, but I did a measurement after this commit. I modified a
commit in the git.git repo's "what's cooking" branch (because that's
linear). Then I ran `jj evolve` so the 100 commits after it would get
evolved. That took ~700ms. `git rebase` of the same 100 commits took
~6s.

I also compared `jj op undo` of that `jj evolve` operation. With this
patch, that was sped up from ~6.8s to ~125ms.
2021-03-15 16:35:45 -07:00
Martin von Zweigbergk
3ecb4ec16b MutableRepo: in fast-path for adding head, simply remove parent heads 2021-03-15 15:38:09 -07:00
Martin von Zweigbergk
2c92fca75a MutableView: don't require whole Commit when CommitId is enough 2021-03-15 15:36:03 -07:00
Martin von Zweigbergk
b4b1de3ddc view: let MutableRepo enforce view invariants
`MutableRepo` has more information needed for taking fast-paths, and
it will have to make the same decision for doing incremental updates
of the evolution state anyway.
2021-03-15 15:17:36 -07:00
Martin von Zweigbergk
b9fe944e76 view: remove unnecessary removing of parents in add_head()
We call `enforce_invariants()` right after removing the parent
commits, and that will remove parents anyway.
2021-03-15 15:06:14 -07:00
Martin von Zweigbergk
12a47bd6ed MutableRepo: don't calculate evolution state only to update it 2021-03-15 15:03:50 -07:00
Martin von Zweigbergk
f0619c07ac MutableEvolution: make MutableRepo responsible for lazy calculation
This patch continues the work from the previous pathc. From this
patch, we no longer calculate the evolution state just because a
transaction starts. We still unnecessarily calculate it when adding a
commit within the transaction, however. I'll fix that next.
2021-03-15 15:03:14 -07:00
Martin von Zweigbergk
61acee52f4 ReadonlyEvolution: make ReadonlyRepo responsible for lazy calculation
This patch changes it so that `ReadonlyEvolution` does not lazily
calculate its state and the caller, i.e. `ReadonlyRepo`, is instead
responsible for the laziness. That will allow the caller to make
decisions based on whether the state has been
calculated. Specifically, we don't want to calculate the evolution
state in order to update it incrementally if it hasn't already been
calculated. It's better to just leave it uncalculated in that case.

As a result of moving the laziness out of `ReadonlyEvolution`, we also
don't need to the reference to `ReadonlyRepo` anymore, which
simplifies things a bunch. The next patch will continue by making the
corresponding change to `MutableEvolution`, which will let us simplify
even more.
2021-03-15 14:41:27 -07:00
Martin von Zweigbergk
91117f36b6 cargo: work around warning in generated protobuf code with new nightly rustc 2021-03-14 22:25:43 -07:00
Martin von Zweigbergk
1e9d428406 git: skip tags pointing to GPG keys and similar when importing refs 2021-03-14 20:14:18 -07:00
Martin von Zweigbergk
429a1ad7ab git: set authentication callback on fetch as well
I guess I had not run `jj git fetch` from GitHub until I tried to
fetch the result of PR #6 just now.
2021-03-14 17:18:51 -07:00
Jun Wu
935da3e13f lock: treat PermissionDenied on Windows as transient error
On Windows it can be PermissionDenied when creating the new file
exclusively. This change makes lock_concurrent test pass on Windows.
2021-03-14 15:51:32 -07:00
Jun Wu
eacab648b0 working_copy: clean up ".git" automatically
TreeState::write_tree leaves a ".git" file in the working copy. This is
undesirable but more problematic on Windows - The second time
TreeState::write_tree would panic because Repository::init_opts will fail
with a Permission Denied error.

This seems to be a libgit2 defect. But for now let's just remove ".git"
automatically. This makes `cargo test --test smoke_test` pass on Windows.
2021-03-14 15:49:42 -07:00
Jun Wu
4cd29a2130 working_copy: avoid std::os::unix on Windows
std::os::unix::fs::PermissionsExt::mode() does not exist on Windows.
Treat files on Windows as regular files.
2021-03-14 15:49:22 -07:00
Martin von Zweigbergk
5631e85502 view: don't enforce invariants in merge_views()
We now only call the function from `MutableRepo::merge()`. There we
pass the result to `MutableView::set_view()`, which already enforces
the invariants.
2021-03-14 11:07:34 -07:00