We need to .collect_vec() the parents iterator to temporary buffer since the
borrowed iterator can't be returned back to the dag_walk functions. Another
option is to clone op_store and parent ids to remove &self lifetime from the
iterator, but that also means a temporary Vec is created.
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful to pass store-specific parameters, so I've updated all
factory functions.
I'll make it propagate OpStoreError, but OpStoreError is quite different
from the existing StaleWorkingCopyError. I think this error isn't actually
an "error" but a description of the working copy state.
Per discussion in https://github.com/martinvonz/jj/discussions/2555. I'm
okay with either way, but it's confusing if we had "branch create" and
"branch set" and both of these could create a new branch.
Renamed `description_template_for_commit` to
`description_template_for_describe` since it's only used in
`cmd_describe`.
Renamed `description_template_for_cmd_split` to
`description_template_for_commit` and modified to accomodate empty
`intro` argument.
Fixes#2439.
We have a few places where we have a `MergedTreeValue` and need to
read the data associated with it so we can write to the working copy
or include it in a diff. Let's extract some of that shared logic to a
function so we can reuse it. I plan to use it for reading file
contents in advance while streaming a diff in `local_working_copy`
soon (and probably in `jj diff` thereafter), but I think it seems like
an improvement on its own.
Remove a couple of unnecessary unsafes:
- The NonZeroUsize is a constant where the unwrap will optimize away
anyway and we don't have an unsafe without any good reason there :)
- The other two were simply not needed, lifetimes worked fine, maybe
Rust became better since that code was written? NLL? Anyway, they're
gone now
As discussed in Discord, it's less useful if remote_branches() included
Git-tracking branches. Users wouldn't consider the backing Git repo as
a remote.
We could allow explicit 'remote_branches(remote=exact:"git")' query by changing
the default remote pattern to something like 'remote=~exact:"git"'. I don't
know which will be better overall, but we don't have support for negative
patterns anyway.
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
One less git2 API use in CLI.
The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
This also adds `jobs`, the argument reading the thread count to use and `shell_command`.
While we're at it, make `execute` a no-op and teach `run` to resolve the passed revsets.
I also fixed my misunderstanding of `Clap` which makes
`jj run 'echo hello world' -r 'mine() & ~origin@remote' --jobs 4` parse correctly.
Also contains a small fix in the `pre-commit` example for it.
Summary: This is currently used by `new.rs`, `workspace.rs`, and `rebase.rs`,
and may be useful for other commands and custom CLIs. So just go ahead and move
it into the parent module hierarchy.
Also rename the function to `resolve_all_revs`, as it isn't actually specific to
rebase at all.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I0ea12afd8107f95a37a91340820221a0
Summary: A natural extension of the existing support, as suggested by Scott
Olson. Closes#2496.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I91c9c8c377ad67ccde7945ed41af6c79
What make rebase_to_dest_parent a good candidate for jj_lib::rewrite module:
- It is used both in obslog and interdiff. It's a sign that it may be moved to a lower layer
- CommandError is returned by converting from TreeMergeError. Not explicitly.
- It only use jj_lib::rewrite fonctions.
I'm going to implement a `Stream`-based version optimized for
high-latency (RPC-based) commit backends. So far, that implementation
is about 20% slower in the Linux repo when running `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost
only because the algorithm is different, not because it's async per
se.
This commit adds a `Stream`-based version of `MergedTree::diff()` that
just wraps the regular iterator in stream. I updated `jj diff` to use
it. I couldn't measure any difference on the command above in the
Linux repo. I think that means we can safely use the same
`Stream`-based interface regardless of backend, even if we end up
needing two different implementations of the `Stream`. We would then
be using the wrapped iterator from this commit for local backends, and
the new implementation for remote backends. But ideally we can make
the remote-friendly implementation fast enough that we don't need two
implementations.
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.
I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
Since gix::Repository::config_snapshot() borrows the repo instance, it has to
be allocated in caller's stack. That's why GitBackend::git_config() is removed.
My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll
need to know more about the underlying Git data model.
Random comments on gix API:
* gix::Repository provides API similar to git2::Repository, but has less
"convenient" functions. For example, we need to use .find_object() +
.try_to/into_<kind>() instead of .find_<kind>().
* gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs
types provide high-level views on such data.
* Tree building is pretty low-level compared to git2.
* gix leverages bstr (i.e. bytes) extensively.
It's probably not difficult to migrate git::import/export_refs(). It might
help eliminate the startup overhead of libssl initialization. The gix-based
GitBackend appears to be a bit faster, but that wouldn't practically matter.
#2316
If we create a `ColorFormatter`, add some labels to it, print
something using the configured style, and then return early because of
an error, we would leave the terminal in a bad state. I think many
shells reset color codes after a command returns, but let's still do
our best.
Like "jj log PATHS...", unmatched name isn't an error. I don't think
"jj branch list glob:'push-*'" should fail just because there are no in-flight
PR branches.
This avoids https://github.com/rust-lang/futures-rs/issues/2090. I
don't think we need to worry about reading legacy conflicts
asynchronously - async is really only useful for Google's backend
right now, and we don't use the legacy format at Google. In
particular, I don't want `MergedTree::value()` to have to be async.
Both local and remote refs are backed by the same value type since we'll need
some kind of runtime abstraction to represent "branches" keyword (which is a
list of local + remote branches.) It's tedious to implement separate
local/remote/both ref types.
The "unsynced" flag is inverted just because the positive term is slightly
easier to document.
I'm going to change "branches" to return a list instead of formatted string,
and I don't think "if(branches, ..)" should be invalidated by that. Perhaps,
a container type like String, Vec<T>, Option<T> can implement the cast.
Makes sure that, at the point the commit summary for the new commit is written,
the original commit that is being rewritten is already abandoned. Otherwise,
once we show divergent change ids (in a subsequent commit) in the short commit
template, the commits would be shown as divergent.
This also has an effect on whether branches are displayed next to the commit;
the changes in test_resotre_command happen because, now, the branch is properly
propagated to the restored commit before its summary is displayed.
test_templater_alias(), test_templater_alias_override(), and
test_templater_bad_alias_decl() aren't moved since they also test config loading
and error formatting. The first test in test_templater_parse_error() is left for
the same reason. test_templater_upper_lower() depends on the commit templater.
I don't think many of the tests in test_templater.rs should use "jj log" command
as they check very specific template syntax and function behaviors. Let's move
them to in-module tests. We could add a separate test file, but we would have
to export a couple of templater macros.
test_templater_timestamp_method() is migrated as example.
I want to fix error propagation before I start using async in this
code. This makes the diff iterator propagate errors from reading tree
objects.
Errors include the path and don't stop the iteration. The idea is that
we should be able to show the user an error inline in diff output if
we failed to read a tree. That's going to be especially useful for
backends that can return `BackendError::AccessDenied`. That error
variant doesn't yet exist, but I plan to add it, and use it in
Google's internal backend.
Reasons to introduce this alias:
* Reduces complexity of a type, to silence Clippy warnings in the
future if we use this type as a type parameter
* The type is used quite frequently, so it makes sense to have a name
for it
* It's easier to visually scan for the end of the type when you don't
have to match opening and closing angle brackets
Since "jj git fetch --branch" supports glob patterns, users would expect that
"jj git push --branch glob:.." also works.
The error handling bits are copied from "branch" sub commands. We might want to
extract it to a common helper function, but I haven't figured out a reasonable
boundary point yet.
Thanks to @glencbz for noticing that VS Code works fine now as a
merge tool, and thanks to @solson for suggesting
`merge-tool-edits-conflict-markers = true`.
AFAICT, all callers of `Merge::to_file_merge()` are already well
prepared for working with executable files. It's called from these
places:
* `local_working_copy.rs`: Materialized conflicts are correctly
updated using `Merge::with_new_file_ids()`.
* `merge_tools/`: Same as above.
* `cmd_cat()`: We already ignore the executable bit when we print
non-conflicted files, so it makes sense to also ignore it for
conflicted files.
* `git_diff_part()`: We print all conflicts with mode "100644" (the
mode for regular files). Maybe it's best to use "100755" for
conflicts that are unambiguously executable, or maybe it's better to
use a fake mode like "000000" for all conflicts. Either way, the
current behavior seems fine.
* `diff_content()`: We use the diff content in various diff
formats. We could add more detail about the executable bits in some
of them, but I think the current output is fine. For example,
instead of our current "Created conflict in my-file", we could say
"Created conflict in executable file my-file" or "Created conflict
in ambiguously executable file my-file". That's getting verbose,
though.
So, I think all we need to do is to make `Merge::to_file_merge()` not
require its inputs to be non-executable.
Closes#1279.
I'm about to make conflicts also get materialized in executable
files. We'll lose some of the test coverage in `test_chmod_command.rs`
then, because the those tests rely on the materialized content to
describe the executable bits. So this commit adds a debug command for
printing tree values and uses that in the tests.
If we add glob support, users will probably want to do something like
'jj branch untrack glob:"*@origin"'. It would be annoying if the command
failed just because one of the remote branches has already been untracked.
Since branch tracking/untracking is idempotent, it's safe to continue in
those cases.
The parse rule is lax compared to revset. We could require the pattern to be
quoted, but that would mean glob patterns have to be quoted like 'glob:"foo*"'.
find_forgettable_branches() is unchanged for now. I might want to rewrite it
to not remove untracked remote branches (because untracked branches aren't
associated with the local counterparts.)
We need to let async-ness propagate up from the backend because
`block_on()` doesn't like to be called recursively. The conflict
materialization code is a good place to make async because it doesn't
depends on anything that isn't already async-ready.
I personally don't mind if "jj branch list" showed all non-tracking branches,
but I agree it would be a mess if ~500 remote branches were listed. So let's
hide them by default as non-tracking branches aren't so interesting.
Closes#1136
This will be the option to include non-tracking remote branches. We could add
more fine-grained filtering flags, but I think --all is good enough and easier
to remember.
This patch also updates many of the test outputs to include synchronized remote
branches. I think verbose outputs will help catch future bugs.
This replaces our existing mechanism of adding `/.jj/` to
`.git/info/exclude` by adding `*` to `.jj/.gitignore`, as suggested by
@ppwwyyxx. That simplifies the code quite a bit, and it avoids the
problem with `.git/info/exclude` not existing (it apparently doesn't
exist when the user uses
https://git-scm.com/docs/git-init#_template_directory).
Closes#2385.
We can provide more actionable error message than "not fast-forwardable". If
the push was fast-forwardable, "jj branch track" should be able to merge the
remote branch without conflicts, so the added step would be minimal.
Although this is logically correct, the error message is a bit cryptic. It's
probably better to reject push if non-tracking remote branches exist.
#1136
We'll use remote_ref.tracking_target() to classify push action, but not all
callers of local_remote_branches() need tracking_target() instead of target.
This means that the commits previously pinned by remote branches are no longer
abandoned. I think that's more correct since "push" is the operation to
propagate local view to remote, and uninteresting commits should have been
locally abandoned.
Since I'm going to make git::push_branches() update the repo view internally,
it should fail fast if the remote name is reserved. Before, the problem was
detected on git::import_refs().
Since pushed remote branches will share the common base targets with locals,
these branches should be marked as tracking. git::push_branches() will handle
that. It looks ugly that the public GitBranchPushTargets type keeps "force"-d
branches as a separate set, but we'll need to rework that anyway when we
implement --force-with-lease behavior. So let's leave it for now.
Some of the git::push_updates() tests have been migrated to the new function.
I left a couple of basic tests for git::push_updates() because push_updates()
will be used to implement a low-level "jj git push-refs" command.
This add support for custom `jj` binaries to use custom working-copy
backends. It works in the same way as with the other backends, i.e. we
write a `.jj/working_copy/type` file when the working copy is
initialized, and then we let that file control which implementation to
use (see previous commit).
I included an example of a (useless) working-copy implementation. I
hope we can figure out a way to test the examples some day.
This makes `Workspace::load()` look a new `.jj/working_copy/type` file
in order to load the right working copy implementation, just like
`Repo::load()` picks the right backends based on `.jj/store/type`,
`.jj/op_store/type`, etc. We don't write the file yet, and we don't
have a way of adding alternative working copy implementations, so it
will always be `LocalWorkingCopy` for now.
Our internal working copy implementations at Google will need the
commit so they can walk history backwards until they get to a "public"
commit. They'll then use that to tell build tools and virtual file
systems to present that as a base.
I'm not sure if we'll need to update `reset()` too. It's currently
only used by `jj untrack`, which doesn't change the commit's parent,
so it wouldn't affect any history walks.
`ReadonlyRepo::init()` takes callbacks for initializing each kind of
backend. We called these things like `op_store_initializer`. I found
that confusing because it is not a `OpStoreFactory` (which is for
loading an existing backend). This patch tries to clarify that by
renaming the arguments and adding types for each kind of callback
function.
This patch adds MutableRepo::track_remote_branch() as we'll probably need to
track the default branch on "jj git clone". untrack_remote_branch() is also
added for consistency.
I'm not sure if this is the best way to render non-tracking branches, but
it helps to write CLI tests. Maybe we can add some hint or decoration to
non-tracking branches, but I'd like to avoid bikeshedding at this point.
Since we haven't migrated the push function yet, a deleted branch can be
pushed to non-tracking remotes. This will be addressed later.
#1136
I'm about to make `LockedLocalWorkingCopy` not borrow from
`LocalWorkingCopy`. That will make it easier to forget to update any
`LocalWorkingCopy` variables when the modifications have been
committed. This patch introduces a wrapper around
`LockedLocalWorkingCopy` to help prevent that.
Thanks to Yuya for the suggestion.
`LocalWorkingCopy::check_out()` can be expressed using the planned
`WorkingCopy` trait, so it doesn't need to be in the trait itself
`WorkingCopy`. I wasn't sure if I should make it a free function in
`working_copy`, but I ended up moving it onto `Workspace`.
This isn't important, but I'm going to change remote_targets to store RemoteRef
instead of RefTarget, so I went ahead and change the other field types as well.
Summary: Yuya's changes and mine had a semantic conflict ("merge skew") between
the two of them, as b7c7b19e changed the `op log `output slightly, whereas
220292ad included a new test that used `op log` itself.
Generated by `cargo insta review`.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I51d4de7316b1abc09be4f9fa0dd0d1a1
We could fix do_git_clone() instead, but it seemed a bit weird that the
git_repo_path is relative to the store path which is unknown to callers.
Fixes#2374
There's a subtle behavior change. Unlike the original remove_remote_branch(),
remote_views entry is not discarded when the branches map becomes empty. The
reasoning here is that the remote view can be added/removed when the remote
is added/removed respectively, though that's not implemented yet. Since the
serialized data cannot represent an empty remote, such view may generate
non-unique content hash.
Summary: This allows `workspace forget` to forget multiple workspaces in a
single action; it now behaves more consistently with other verbs like `abandon`
which can take multiple revisions at one time.
There's some hoop-jumping involved to ensure the oplog transaction description
looks nice, but as they say: small conveniences cost a lot.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Id91da269f87b145010c870b7dc043748
The `TreeStateError` type is specific to the current local-disk
working-copy backend, so it should not be part of the generic
working-copy interface I'm trying to create.
Summary: Workspaces are most useful to test different versions (commits) of
the tree within the same repository, but in many cases you want to check out a
specific commit within a workspace.
Make that trivial with a `--revision` option which will be used as the basis
for the new workspace. If no `-r` option is given, then the previous behavior
applies: the workspace is created with a working copy commit created on top of
the current working copy commit's parent.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I23549efe29bc23fb9f75437b6023c237
Before this patch, it was an error to run `jj config set --user foo
'[1]'` twice. But it's only been broken since the previous commit
because '[1]' was interpreted as a string before then.
Now we have a separate map for "git" tracking remote, we can always preserve
the last imported/exported git_refs. The option to restore git-tracking refs
has been removed. Perhaps, --what can be reorganized as --local and --remote
<NAME>.
The commit backend at Google is cloud-based (and so are the other
backends); it reads and writes commits from/to a server, which stores
them in a database. That makes latency much higher than for disk-based
backends. To reduce the latency, we have a local daemon process that
caches and prefetches objects. There are still many cases where
latency is high, such as when diffing two uncached commits. We can
improve that by changing some of our (jj's) algorithms to read many
objects concurrently from the backend. In the case of tree-diffing, we
can fetch one level (depth) of the tree at a time. There are several
ways of doing that:
* Make the backend methods `async`
* Use many threads for reading from the backend
* Add backend methods for batch reading
I don't think we typically need CPU parallelism, so it's wasteful to
have hundreds of threads running in order to fetch hundreds of objects
in parallel (especially when using a synchronous backend like the Git
backend). Batching would work well for the tree-diffing case, but it's
not as composable as `async`. For example, if we wanted to fetch some
commits at the same time as we were doing a diff, it's hard to see how
to do that with batching. Using async seems like our best bet.
I didn't make the backend interface's write functions async because
writes are already async with the daemon we have at Google. That
daemon will hash the object and immediately return, and then send the
object to the server in the background. I think any cloud-based
solution will need a similar daemon process. However, we may need to
reconsider this if/when jj gets used on a server with a custom backend
that writes directly to a database (i.e. no async daemon in between).
I've tried to measure the performance impact. That's the largest
difference I've been able to measure was on `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo,
which increases from 749 ms to 773 ms (3.3%). In most cases I've
tested, there's no measurable difference. I've tried diffing from the
root commit, as well as `jj --ignore-working-copy log --no-graph -r
'::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a
commit-heavy load).
Before this patch, when updating to a commit that has a file that's
currently an ignored file on disk, jj would crash. After this patch,
we instead leave the conflicting files or directories on disk. We
print a helpful message about how to inspect the differences between
the intended working copy and the actual working copy, and how to
discard the unintended changes.
Closes#976.
On my Debian laptop, openssl_init() takes ~30ms to load the default CA
certificates serialized in PEM format, and the cost is added to each jj
invocation. This change saves 20s (of 50s) on my machine.
% wc -l /usr/lib/ssl/cert.pem
3517 /usr/lib/ssl/cert.pem
It's about time we make the working copy a pluggable backend like we
have for the other storage. We will use it at Google for at least two
reasons:
* To support our virtual file system. That will be a completely
separate working copy backend, which will interact with the virtual
file system to update and snapshot the working copy.
* On local disk, we need to tell our build system where to find the
paths that are not in the sparse patterns. We plan to do that by
wrapping the standard local working copy backend (the one moved in
this commit), writing a symlink that points to the mainline commit
where the "background" files can be read from.
Let's start by renaming the exising implementation to
`local_working_copy`.