A subsequent commit will teach more variants of ahead/behind messages,
so refactor the existing code to avoid code duplication and a future
combinatorial explosion.
Gerrit also needs to be able to push low-level refs into upstream remotes, just
like `jj git push` does. Doing so requires providing callbacks e.g. for various
password entry mechanisms, which was private to the `git` command module.
Pull these out to a new module `git_utils` so we can reuse it across the two
call sites.
This also moves a few other strictly Git-related functions into `git_utils`
as well, just for the sake of consistency.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Move canonicalization of the external git repo path into the Workspace::init_git_external().
This keeps necessary code together.
* Add a new variant of WorkspaceInitError for reporting path not found errors. The user error
string is written to pass existing tests.
It seems obvious in hindsight to have a virtual root operation just
like we have a virtual root commit. It removes the same kind of
problems by making sure there's always a common ancestor (or multiple)
between any two commits.
I think the reason I didn't add a root operation from the beginning
was that there used to be a mandatory working-copy commit in the view
(this was before support for multiple workspaces).
Perhaps we should remove the "initialize repo" operation now. The only
difference between their view objects is that the "initialize repo"
operation adds the root commit as a head. We could add that to the
root operation, but then the root operation's value depends on the
commit backend.
We've had the public_heads for as long as we've had the View object,
IIRC (I didn't check), but we still don't use it for anything. I don't
have any concrete plans for using it either. Maybe our config for
immutable commits is good enough, or maybe we'll want something more
generic (like Mercurial's phases). For now, I think we should simplify
by removing it the storage for public heads.
This function respects --ignore-working-copy and --at-op arguments, but the
name snapshot() sounds like it could forcibly trigger the snapshotting. Let's
clarify the actual behavior.
import_git_refs_and_head() and snapshot_working_copy() are unchecked functions,
and there are no external callers.
When debugging behavior of badly-GCed repos, I find it's annoying that "op log"
fails because the index can't be loaded. Since "op log" doesn't need a repo, I
think it's better to display the exact op-heads state without merging.
I'm going to make "op log" not load a repo nor resolve concurrent operations,
so there will be the case where no unique "current" operation exists. Since the
"current" operation represents the repo to be loaded, I think it's better to
not consider multi-head operations as the "current" ones. That's why the
templater field isn't Vec<_> but Option<_>.
If indexing failed due to missing commit objects, the repo won't be loadable
without --ignore-working-copy (at least in colocated environment.) In that
case, we can use "op abandon" to recover, but we had to work around the failed
index loading by --ignore-working-copy. Since "op abandon" isn't a repo-level
command, it's better to bypass working-copy snapshot and import of git refs at
all.
--at-op is rejected because it's useless and we'll need extra care for "@"
expression resolution and working-copy updates.
The formatting is closer to hg than git just because it's easier to
conditionalize the whole line per keyword. Our template language doesn't
have infix logical operators.
Unlike the other default templates, all remote branches are displayed because
it's "detailed" output.
Closes#2509
If the existing index was corrupt, it would have to be rebuilt while loading a
repo (at least in colocated environment.) Then, the index will be rebuilt again.
Since new operations and views may be added concurrently by another process,
there's a risk of data corruption. The keep_newer parameter is a mitigation
for this problem. It's set to preserve files modified within the last 2 weeks,
which is the default of "git gc". Still, a concurrent process may replace an
existing view which is about to be deleted by the gc process, and the view
file would be lost.
#12
It doesn't make sense to do gc from a non-head operation because that means
either the head operation would be corrupted or the --at-op argument is
ignored.
We would like to use a non-static version string at Google. We have a
workaround using Lazy, but it seems unfortunate to have to do
that. Using dynamic strings requires Clap's `string` feature, which
increases the binary size by 140 kiB (0.6%).
I'm going to add try_from_hex(), which requires Self: Sized. Such trait bound
could be added, but I don't think we'll need abstracted ObjectId constructors
at all.
I'm going to add a prefix resolution method to OpStore, but OpStore is
unrelated to the index. I think ObjectId, HexPrefix, and PrefixResolution can
be extracted to this module.
In order to implement GC (#12), we'll need to somehow prune old operations.
Perhaps the easiest implementation is to just remove unwanted operation files
and put tombstone file instead (like git shallow.) However, the removed
operations might be referenced by another jj process running in parallel. Since
the parallel operation thinks all the historical head commits are reachable, the
removed operations would have to be resurrected (or fix up index data, etc.)
when the op heads get merged.
The idea behind this patch is to split the "op log" GC into two steps:
1. recreate operations to be retained and make the old history unreachable,
2. delete unreachable operations if the head was created e.g. 3 days ago.
The latter will be run by "jj util gc". I don't think GC can be implemented
100% safe against lock-less append-only storage, and we'll probably need some
timestamp-based mechanism to not remove objects that might be referenced by
uncommitted operation.
FWIW, another nice thing about this implementation is that the index is
automatically invalidated as the op id changes. The bad thing is that the
"undo" description would contain an old op id. It seems the performance is
pretty okay.
This removes CommandError dependency from these resolution functions. We might
want to refactor the error types again if we introduce a real "opset" evaluator.
The error message for unresolved op heads now includes "@" instead of the whole
expression.
The resolver callback usually returns wider error type, which I don't think
is a variant of OpHeadResolutionError.
To help type inference, resolver's error type is E, not E1 where E: From<E1>.