It's useful to know which commit is checked out in the underlying Git
repo (if there is one), so let's show that. This patch indicates that
commit with `HEAD@git` in the log output. It's probably not very
useful when the Git repo is "internal" (i.e. stored inside `.jj/`),
because then it's unlikely to change often. I therefore considered not
showing it when the Git repo is internal. However, it turned out that
`HEAD` points to a non-existent branch in the repo I use, so it won't
get imported anyway (by the function added in the previous patch). We
can always review this decision later.
This is part of #44.
This patch adds a place for tracking the current `HEAD` commit in the
underlying Git repo. It updates `git::import_refs()` to record it. We
don't use it anywhere yet.
This is part of #44.
If nothing changed in a transaction, it's rarely useful to commit it,
so let's avoid that. For example, if you run `jj git import` without
changing the anything in the Git repo, we now just print "Nothing
changed.".
The `Repo` doesn't do anything with the `WorkingCopy` except keeping a
reference to it for its users to use. In fact, the entire lib crate
doesn't do antyhing with the `WorkingCopy`. It therefore seems simpler
to have the users of the crate manage the `WorkingCopy` instance. This
patch does that by letting `Workspace` own it. By not keeping an
instance in `Repo`, which is `Sync`, we can also drop the
`Arc<Mutex<>>` wrapping.
I left `Repo::working_copy()` for convenience for now, but now it
creates a new instance every time. It's only used in tests.
This further decoupling should help us add support for multiple
working copies (#13).
Having a concept of a "workspace" will be useful for adding support
for multiple workspaces (#13). You can think of the "workspace" as a
repo combined with a working copy. A workspace corresponds 1:1 with a
`.jj/` directory. It's pretty close to what other VCS simply call a
"repo", but I've ended up using the word "repo" for what Git calls a
"bare repo".
Since the working copy can now handle conflicts, we don't need to
materialize conflicts when checking out a commit.
Before this patch, we used to create a new commit on top whenever we
checked out a commit with conflicts. That new commit was intended just
for resolving the conflicts. The typical workflow was the resolve the
conflicts and then amend. To use the same workflow after this patch,
one needs to explicitly create a new commit on top with `jj new` after
checking out a commit with conflict.
If you rewrite a commit that's also available on some remote, you'll
currently see both the old version and the new version in the view,
which means they're divergent. They're not logically divergent (the
rewritten version should replace the old version), so this is a UX
bug. I think it indicates that the set of current heads should be
redefined to be the *desired* heads. That's also what I had suspected
in the TODO removed by this change. I think another indication that
we should hide the old heads even if they have e.g. a remote branch
pointing to them is that we don't want them to be rebased if we
rewrite an ancestor.
So that's what I decided to do: let the view's heads be the desired
heads. The user can still define revsets for showing non-current
commits pointed to by e.g. remote branches.
I think this is just cleaner, and it gives us room to put other
store-related data in the `.jj/store/` directory. I may want to use
that place for writing the metadata we currently write in Git notes
(#7).
It's been a lot of work, but now we're finally able to remove the
`Evolution` state! `jj obslog` still works as before (it just walks
the predecessor pointers).
The removal of hidden heads was just there to help with the transition
away from evolution (#32). Now that we no longer depend on evolution
for removing old heads, we can remove the hack.
I'm going to teach `DescendantRebaser` to also update local branches
pointing to rewritten commits, taking over the responsibility from
`rewrite::update_branches_after_rewrite()`. For commits that have been
rewritten as multiple new commits (divergent, not split), that
function makes local branches pointing to the old commit point to all
the new commits. To replicate that behavior in `DescendantRebaser`, it
needs to know about divergent changes. This change addresses that.
I recently made the CLI remove hidden heads when a transaction is
committed (38474a9). Let's move that to `Transaction::commit()`, so
the library crate becomes more similar to how the CLI behaves and more
similar to our evolution-less future (#32).
Same reasoning as the previous change.
With this change, I believe we now record all rewritten and abandoned
commits correctly. We're now almost ready to switch the CLI away from
using evolution for automatically rebasing commits.
When we remove support for evolution (#32), we need to still make it
easy for application code to rebase descendants of rewritten and
abandoned commits. The way applications currently do that is by using
e.g. `CommitBuilder::for_rewrite_from()` followed by
`evolve_orphans()`. This patch puts some bookkeeping in `MutableRepo`
for rewritten and abandoned commits, along with a function for
creating a `DescendantRebaser` based on it. I'll then make
`CommitBuilder` record rewritten commits there.
The command's help text says "Abandon a revision", which I think is a
good indication that the command's name should be `abandon`. This
patch renames the command and other user-facing occurrences of the
word. The remaining occurrences should be removed when I remove
support for evolution.
Now that we have native branches, we can make `jj git push` only be
about pushing a branch to a remote branch with the same name.
We may want to add back support for the more advanced case of pushing
an arbitrary commit to an arbitrary branch later, but let's get the
common case simplified first.
I've finally decided to copy Git's branching model (issue #21), except
that I'm letting the name identify the branch across
remotes. Actually, now that I think about, that makes them more like
Mercurial's "bookmarks". Each branch will record the commit it points
to locally, as well as the commits it points to on each remote (as far
as the repo knows, of course). Those records are effectively the same
thing as Git's "remote-tracking branches"; the difference is that we
consider them the same branch. Consequently, when you pull a new
branch from a remote, we'll create that branch locally.
For example, if you pull branch "main" from a remote called "origin",
that will result in a local branch called "main", and also a record of
the position on the remote, which we'll show as "main@origin" in the
CLI (not part of this commit). If you then update the branch locally
and also pull a new target for it from "origin", the local "main"
branch will be divergent. I plan to make it so that pushing "main"
will update the remote's "main" iff it was currently at "main@origin"
(i.e. like using Git's `git push --force-with-lease`).
This commit adds a place to store information about branches in the
view model. The existing git_refs field will be used as input for the
branch information. For example, we can use it to tell if
"refs/heads/main" has changed and how it has changed. We will then use
that ref diff to update our own record of the "main" branch. That will
come later. In order to let git_refs take a back seat, I've also added
tags (like Git's lightweight tags) to the model in this commit.
I haven't ruled out *also* having some more persistent type of
branches (like Mercurials branches or topics).
There were some tests that discarded a transaction only because it
used to be easier to do that than to commit and reload the repo. We
get the new repo back when we commit the transaction these days, so
now it's often easier to commit the transaction instead.
When there are two concurrent operations, we would resolve conflicting
updates of git refs quite arbitrarily before this change. This change
introduces a new `refs` module with a function for doing a 3-way merge
of ref targets. For example, if both sides moved a ref forward but by
different amounts, we pick the descendant-most target. If we can't
resolve it, we leave it as a conflict. That's fine to do for git refs
because they can be resolved by simply running `jj git refresh` to
import refs again (the underlying git repo is the source of truth).
As with the previous change, I'm doing this now because mostly because
it is a good stepping stone towards branch support (issue #21). We'll
soon use the same 3-way merging for updating the local branch
definition (once we add that) when a branch changes in the git repo or
on a remote.
This adds support for having conflicting git refs in the view, but we
never create conflicts yet. The `git_refs()` revset includes all "add"
sides of any conflicts. Similarly `origin/main` (for example) resolves
to all "adds" if it's conflicted (meaning that `jj co origin/main` and
many other commands will error out if `origin/main` is
conflicted). The `git_refs` template renders the reference for all
"adds" and adds a "?" as suffix for conflicted refs.
The reason I'm adding this now is not because it's high priority on
its own (it's likely extremely uncommon to run two concurrent `jj git
refresh` and *also* update refs in the underlying git repo at the same
time) but because it's a building block for the branch support I've
planned (issue #21).
The two types have become very similar so it doesn't seem that there's
any point in having two types. We should probably do the same with
`ReadonlyEvolution` and `MutableEvolution`.
I've wanted the API to look like this for a while. It seems like a
good API to me. It means that the caller won't have to reload the repo
after committing. The cost seems relatively small. It involves copying
potentially a lot of data in memory (at least the View object), but it
shouldn't involve reading from disk or any other processing. To reduce
the amount of data to copy, it may be worth switching to persistent
data types. I've also wanted to do that for the copying we do when
start a transaction.
I couldn't measure any slowdown caused by this change.
I suspect that at least one reason that I didn't make
`MutableRepo::base_repo` by an `Arc<ReadonlyRepo>` before was that I
thought that that would mean that `start_transaction()` would need be
moved off of `ReadonlyRepo` so it can be given an
`&Arc<ReadonlyRepo>`, which would make it much less convenient to
use. It turns out that a `self` argument can actually be of type
`&Arc<ReadonlyRepo>`.
We can now finally use the commit index for filtering out ancestors
from the sets of heads.
I haven't timed the change from most of the recent work on
performance, but I did a measurement after this commit. I modified a
commit in the git.git repo's "what's cooking" branch (because that's
linear). Then I ran `jj evolve` so the 100 commits after it would get
evolved. That took ~700ms. `git rebase` of the same 100 commits took
~6s.
I also compared `jj op undo` of that `jj evolve` operation. With this
patch, that was sped up from ~6.8s to ~125ms.
`MutableRepo` has more information needed for taking fast-paths, and
it will have to make the same decision for doing incremental updates
of the evolution state anyway.
This patch continues the work from the previous pathc. From this
patch, we no longer calculate the evolution state just because a
transaction starts. We still unnecessarily calculate it when adding a
commit within the transaction, however. I'll fix that next.
This patch changes it so that `ReadonlyEvolution` does not lazily
calculate its state and the caller, i.e. `ReadonlyRepo`, is instead
responsible for the laziness. That will allow the caller to make
decisions based on whether the state has been
calculated. Specifically, we don't want to calculate the evolution
state in order to update it incrementally if it hasn't already been
calculated. It's better to just leave it uncalculated in that case.
As a result of moving the laziness out of `ReadonlyEvolution`, we also
don't need to the reference to `ReadonlyRepo` anymore, which
simplifies things a bunch. The next patch will continue by making the
corresponding change to `MutableEvolution`, which will let us simplify
even more.
We now only call the function from `MutableRepo::merge()`. There we
pass the result to `MutableView::set_view()`, which already enforces
the invariants.
This adds `MutableRepo::merge()`, which applies the difference between
two `ReadonRepo`s to itself. That results in much simpler code than
the current code in `merge_op_heads()`. It also lets us write `undo`
using the new function. Finally -- and this is the actual reason I did
it now -- it prepares for using the index when enforcing view
invariants.
It's sometimes useful to create a `RepoLoader` given an existing
`ReadonlyRepo`. We already do that in `ReadonlyRepo::reload()`. This
patch repurposes `ReadonlyRepo::reload()` for that.
This is yet another step towards making the `View` types
simpler. Perhaps we eventually won't need to wrap the types returned
from the `OpStore` at all.
I'd like to make `ReadonlyView` and `MutableView` focused on just the
state of the view (i.e. the set of heads, git refs, etc.). The
responsibility for managing the `.jj/view/op_heads/` directory should
be moved out of it. This prepares for that.
The methods are now only called from within the type. Inlining means
that the borrow checker will let us borrow these separate fields
concurrently. We'll take advantage of that soon.
I think the `Option<>` wrapping was from the time when `MutableView`
had a reference back to the repo (and `MutableIndex` was probably
wrapped out of habit).
Most methods on `Transaction` only need the `MutableRepo`, so it makes
for that functionality to be on the latter. That will let us update
the methods to also update the index, which would otherwise have been
harder because it would require a mutable borrow of both the view and
the index. This patch makes most current methods on `Transaction` just
delegate to `MutableRepo`. We may want to remove some of these
delegating methods later.
Updating the index on disk means that reader won't have to calculate
the state. Updating it in memory means that we can take advantage of
it while resolving conflicts. We will do that soon.
This patch introduces a new `IndexStore` struct. The idea is that it
will know about the directory in which the index files are stored, the
associations with operations. It may also cache `Arc<ReadonlyIndex>`
instances so if multiple `ReadonlyIndex` instances are loaded, they
can be returned from the cache. That may be useful when merging
operations because the operations are likely to share a large parent
index file. For now, however, all the new type has is `init()`,
`load()`, and `reinit()`.
The only way to load the repo at a current operation (as with
`--at-op`) is currently to first load it at the head operation and
then call `reload()` on the repo. This patch makes it so we can load
the repo directly at the requested operation.
We'll want to be able to load the repo at a given operation without
first loading the head operation as we do today. This patch introduces
a struct for keeping the state of a half-loaded repo. In that
half-loaded state, the store and the op-store have been loaded, but
the view has not yet been loaded. That makes it possible for callers
to use the loaded op-store for looking up an operation to load the
view at.
We want to support loading the repo at a specific operation without
first loading the head operation (like we currently do). One reason
for that is of course efficiency. A possibly more important reason is
that the head operation may be conflicted, depending on how we decide
to deal with operation-level conflicts. In order to do that, it makes
sense to move the creation of the `OpStore` outside of the
`View`. That also suggests that the `.jj/view/op_store/` directory
should move to `.jj/op_store/`, so this patch also does that. That's
consistent with how `.jj/store/` is outside of `.jj/working_copy/`.
With this change, we start writing the incremental index to disk, so
the next reader won't have to re-read the commits and create the
index.
As of this change, we simply write a new index file for each
transaction. That will clearly mean that the stack of files gets deep
pretty quickly. For now, the user will have to do `jj debug reindex`
when things get slow. I plan to change it so instead of writing an
incremental index file every time, we first check if the new index
file would have at least as many commits as the parent file, and if it
will, we write a combined one instead. That should apply recursively,
so we'd have O(log n) index files.
With tons of groundwork done, wee can now finally keep the index up to
date within a transaction! That means that we can start relying on the
index to always be valid, so we can use it e.g. for finding common
ancestors within a transaction. That should help speed up `jj evolve`
immensely on large repos.
We still don't write the updated index to disk when the transaction
closes. That will come later.
I want to keep the index updated within the transaction. I tried doing
that by adding a `trait Index`, implemented by `ReadonlyIndex` and
`MutableIndex`. However, `ReadonlyRepo::index` is of type
`Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized),
and we cannot get a `&dyn Index` that lives long enough to be returned
from a `Repo::index()` from that. It seems the best solution is to
instead create an `Index` enum (instead of a trait), with one readonly
and one mutable variant. This commit starts the migration to that
design by replacing the `Repo` trait by an enum. I never intended for
there there to be more implementations of `Repo` than `ReadonlyRepo`
and `MutableRepo` anyway.
The function used to be larger when we had more reference cycles
between e.g. the `WorkingCopy` and `Repo`. Now there's only
`Evolution` left, so let's inline the function.
This commits makes it so that running commands outside a repo results
in an error message instead of a panic.
We still don't look for a `.jj/` directory in ancestors of the current
directory.
It's annoying to have to have the Git repo and Jujube repo in separate
directories. This commit adds `jj init --git`, which creates a new
Jujube repo with an empty, bare git repo in `.jj/git/`. Hopefully the
`jj git` subcommands will eventually provide enough functionality for
working with the Git repo that the user won't have to use Git commands
directly. If they still do, they can run them from inside `.jj/git/`,
or create a new worktree based on that bare repo.
The implementation is quite straight-forward. One thing to note is
that I made `.jj/store` support relative paths to the Git repo. That's
mostly so the Jujube repo can be moved around freely.
This removes one level of indirection, which is nice because it was
visible to the callers. The `Index` struct is now empty. The next step
is obviously to delete it (and perhaps rename `IndexFile` to `Index`
or `ReadonlyIndex`).