Commit graph

2215 commits

Author SHA1 Message Date
Yuya Nishihara
b07b370ed3 merge: simply generate content hash from interleaved values 2023-11-06 07:21:06 +09:00
Yuya Nishihara
46ffb2f0b2 merge: store negative/positive terms internally in an interleaved Vec
Many callers use interleaved iterators, and recently-added serialization code
is built on top of that, so I think it's better to store terms in that format.

map() functions no longer use MergeBuilder as we know the mapped values are
ordered properly. flatten() and simplify() are reimplemented to work with the
interleaved values. The other changes are trivial.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
287728fee7 merge: extract trivial_merge() that takes interleaved adds/removes iterator
The Merge type will store interleaved terms instead of separate adds/removes
vecs.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
01523ba4f3 merge: rewrite bottom half of trivial_merge() for non-copyable types
The input values of trivial_merge() will be changed to Iterator<Item = T>
where T: Eq + Hash. It could be <Item = &'a T>, but it doesn't have to be.
2023-11-06 07:21:06 +09:00
Martin von Zweigbergk
7c923514ee git: add config to disable abandoning of unreachable commits
Some users prefer to have commits not get abandoned when importing
refs. This adds a config option for that.

Closes #2504.
2023-11-05 06:10:54 -08:00
Martin von Zweigbergk
7bf8906f9c git: extract a function for abandoning unreachable commits
This motivation for this is so we can easily skip calling the function
if the user has opted out of the propagation of abandoned commits we
usually do (#2504). However, it seems like a good piece of code to
extract regardless of that feature.
2023-11-05 06:10:54 -08:00
Yuya Nishihara
d9fbf21794 merge: have Merge::adds()/removes() return iterator
The Merge type will be changed to store interleaved values internally.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
1c6913d618 merge: use Merge::iter() instead of adds()/removes() where order doesn't matter
Merge::iter() will be a slice::Iter, and be more efficient than chaining adds
and removes.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
99e6ff493a merge: fix copy-paste error in doc comment for adds() 2023-11-05 16:43:06 +09:00
Yuya Nishihara
f6d85c51cd merge: add non-optional Merge accessor to the zeroth value
We have a few callers which just need to obtain an object common among all
the merge values. Let's add a non-failing accessor for that purpose.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
b12c688ea0 merge: add method for indexed adds/removes access
The current adds()/removes() will be changed to return iterators.
2023-11-05 16:43:06 +09:00
Martin von Zweigbergk
6a5615c933 rewrite: use MergedTree::diff_stream() when restoring from tree 2023-11-04 21:07:49 -07:00
Yuya Nishihara
602b44258e workspace: add function that initializes colocated git repository
One less git2 API use in CLI.

The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
77e16243d6 tests: assert paths of initialized GitBackend 2023-11-05 08:48:35 +09:00
Yuya Nishihara
ce46c10c96 git_backend: extract inner function that initializes backend with open git repo 2023-11-05 08:48:35 +09:00
Yuya Nishihara
dce640aaf1 workspace: one less cloning of workspace_root in init_external_git()
Just a trivial code cleanup.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
c866b4a42d workspace: fix repository path in init_internal_git() doc comment
Also rephrased "Git backend" as "Git repo" since the new backend storage will
be created.
2023-11-05 08:48:35 +09:00
Antoine Cezar
5973ab47b9 commands: move rebase_to_dest_parent to jj_lib::rewrite
What make rebase_to_dest_parent a good candidate for jj_lib::rewrite module:

- It is used both in obslog and interdiff. It's a sign that it may be moved to a lower layer
- CommandError is returned by converting from TreeMergeError. Not explicitly.
- It only use jj_lib::rewrite fonctions.
2023-11-03 20:48:00 +01:00
Martin von Zweigbergk
904c37d36d working copy: use MergedTree::diff_stream()
This will make it a little faster to update the working copy at Google
once we've made `MergedTree::diff_stream()` fetch trees
concurrently. (It only makes it a little faster because we still fetch
files serially.)
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
72245cfac5 merged_tree: add Stream-based version of diff(), delegating for now
I'm going to implement a `Stream`-based version optimized for
high-latency (RPC-based) commit backends. So far, that implementation
is about 20% slower in the Linux repo when running `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost
only because the algorithm is different, not because it's async per
se.

This commit adds a `Stream`-based version of `MergedTree::diff()` that
just wraps the regular iterator in stream. I updated `jj diff` to use
it. I couldn't measure any difference on the command above in the
Linux repo. I think that means we can safely use the same
`Stream`-based interface regardless of backend, even if we end up
needing two different implementations of the `Stream`. We would then
be using the wrapped iterator from this commit for local backends, and
the new implementation for remote backends. But ideally we can make
the remote-friendly implementation fast enough that we don't need two
implementations.
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
24b706641f async: switch to pollster's block_on()
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
3a378dc234 cli: add a function for restoring part of a tree from another tree
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.

I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
2023-11-02 06:07:45 -07:00
Yuya Nishihara
162dcd49b4 cli: rewrite base GitIgnoreFile lookup to use gitoxide instead of libgit2
Since gix::Repository::config_snapshot() borrows the repo instance, it has to
be allocated in caller's stack. That's why GitBackend::git_config() is removed.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
c88e69ad6f git_backend: replace git2::Repository with gix::Repository
My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll
need to know more about the underlying Git data model.

Random comments on gix API:

 * gix::Repository provides API similar to git2::Repository, but has less
   "convenient" functions. For example, we need to use .find_object() +
   .try_to/into_<kind>() instead of .find_<kind>().
 * gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs
   types provide high-level views on such data.
 * Tree building is pretty low-level compared to git2.
 * gix leverages bstr (i.e. bytes) extensively.

It's probably not difficult to migrate git::import/export_refs(). It might
help eliminate the startup overhead of libssl initialization. The gix-based
GitBackend appears to be a bit faster, but that wouldn't practically matter.

#2316
2023-11-02 19:33:06 +09:00
Yuya Nishihara
9a86b77e38 tests: force gitoxide to not load config nor use "main" as default branch
AFAIK, there's no global config state for gitoxide. We can use
Config::isolated() in tests, but GitBackend should load config files in a
normal way.

https://docs.rs/gix/0.55.2/gix/open/permissions/struct.Config.html#method.isolated
https://docs.rs/gix/0.55.2/gix/init/constant.DEFAULT_BRANCH_NAME.html
2023-11-02 19:33:06 +09:00
Yuya Nishihara
f5a61dc2b7 git_backend: open just-initialized repo with canonicalized path
Otherwise, the initialized repo could have a different work-dir path than the
load()-ed one. libgit2 appears to do some normalization somewhere, but gix
won't.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
fd187d266f git_backend: box GitBackendInit/LoadError up front
These error enums will wrap gix error types, and will become bigger enough for
clippy to complain.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
b48569b104 cargo: add gitoxide (or gix) dependency
I've enabled the "index" component from the "basic" feature set, which would
be needed to implement colocated repo functionality. The doc suggests that
a library shouldn't activate "max-performance-safe", but our crate is also
an application so it would be okay to enable the feature. We'll need "parallel"
anyway to make GitBackend Sync.

https://docs.rs/gix/latest/gix/#feature-flags
2023-11-02 19:33:06 +09:00
Isabella Basso
749d8bb15a git: preserve HEAD when possible
Closes: #2210
2023-11-01 08:23:52 -03:00
Yuya Nishihara
1788b5014e git_backend: remove redundant copy back of author timestamp
Only the committer timestamp can be updated inside a loop.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
f5aa739c70 git_backend: use .strip_suffix() instead of manual slicing 2023-10-31 06:51:27 +09:00
Yuya Nishihara
9bd84c55e0 git_backend: use file mode extensively in read_tree()
Both filemode() and kind() are calculated from the same underlying data,
and kind() is libgit2-specific API.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
b3c9cab12d git_backend: handle read_tree() lookup/encoding errors gracefully 2023-10-31 06:51:27 +09:00
Yuya Nishihara
847adc832f git_backend: use lossy conversion to decode non-UTF-8 commit message
If message() returned None, it doesn't mean the commit message is empty. I
originally mapped it to an error, but that made import of linux repo fail.

https://docs.rs/git2/latest/git2/struct.Commit.html#method.message
2023-10-31 06:51:27 +09:00
Yuya Nishihara
06c254e742 git_backend: use non-owned str::from_utf8() to decode symlink target
Just for consistency with the other changes. str::Utf8Error is 2 words long,
so I removed the boxing.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
d1c71c05c9 git_backend: remove redundant error handling for invalid hash length
The only error that could be returned by libgit2 is invalid hash length, and
we check that explicitly. If we switch the backends to gitoxide, there will be
panicking constructor.

https://docs.rs/git2/latest/git2/struct.Oid.html#method.from_bytes
2023-10-31 06:51:27 +09:00
Ilya Grigoriev
8bc3f5fd67 cli rebase: Allow jj rebase -r to rebase a commit onto a descendant
#1188

There are some additional test changes because children and descendants are now
rebased before the commit itself.
2023-10-30 10:56:27 -07:00
Yuya Nishihara
2d3fe7eee2 rewrite: replace use of "lift"ed function application with try_collect()
Also removed redundant borrow + clone.
2023-10-30 13:50:37 +09:00
Martin von Zweigbergk
35a23172ec backend: delete unused Phase enum
The idea was to support phases like in hg, but that hasn't happened
yet. We can add back this simple enum if we do add support for phases.
2023-10-29 12:02:40 -07:00
Martin von Zweigbergk
cfcdd71865 backend: make read_conflict synchronous again
This avoids https://github.com/rust-lang/futures-rs/issues/2090. I
don't think we need to worry about reading legacy conflicts
asynchronously - async is really only useful for Google's backend
right now, and we don't use the legacy format at Google. In
particular, I don't want `MergedTree::value()` to have to be async.
2023-10-28 16:45:40 -07:00
Martin von Zweigbergk
a1ef9dc845 merged_tree: propagate backend errors in diff iterator
I want to fix error propagation before I start using async in this
code. This makes the diff iterator propagate errors from reading tree
objects.

Errors include the path and don't stop the iteration. The idea is that
we should be able to show the user an error inline in diff output if
we failed to read a tree. That's going to be especially useful for
backends that can return `BackendError::AccessDenied`. That error
variant doesn't yet exist, but I plan to add it, and use it in
Google's internal backend.
2023-10-26 06:20:56 -07:00
Martin von Zweigbergk
309f1200d6 merge: introduce a type alias for Merge<Option<TreeValue>>
Reasons to introduce this alias:

* Reduces complexity of a type, to silence Clippy warnings in the
  future if we use this type as a type parameter

* The type is used quite frequently, so it makes sense to have a name
  for it

* It's easier to visually scan for the end of the type when you don't
  have to match opening and closing angle brackets
2023-10-26 06:20:56 -07:00
Martin von Zweigbergk
6ad71e658d merged_tree: rename MergedTreeValue to MergedTreeVal
I'm going to add `MergedTreeValue` as an alias for
`Merge<Option<TreeValue>>`, but we already have a type by that name in
`merged_tree`. This patch renames it away, to make room for the new
alias. I used `MergedTreeVal` for this borrowing version to be a bit
like how `str` is a borrowed version of `String`.
2023-10-26 06:20:56 -07:00
Yuya Nishihara
1bfe5b5b56 cli: add string pattern support to "git push --branch"
Since "jj git fetch --branch" supports glob patterns, users would expect that
"jj git push --branch glob:.." also works.

The error handling bits are copied from "branch" sub commands. We might want to
extract it to a common helper function, but I haven't figured out a reasonable
boundary point yet.
2023-10-26 04:51:17 +09:00
Yuya Nishihara
a6ac9b46e7 git: simply call fetch() with one or more branch name filters 2023-10-25 03:58:48 +09:00
Yuya Nishihara
560b63544a cli: parse "git fetch --branch" parameter as string pattern
Even though "*" can't be used as a branch name to fetch, it should be better
to explicitly enable glob matching like the other commands.
2023-10-25 03:58:48 +09:00
Martin von Zweigbergk
8157d4ff63 merge: materialize conflicts in executable files like regular files
AFAICT, all callers of `Merge::to_file_merge()` are already well
prepared for working with executable files. It's called from these
places:

* `local_working_copy.rs`: Materialized conflicts are correctly
  updated using `Merge::with_new_file_ids()`.

* `merge_tools/`: Same as above.

* `cmd_cat()`: We already ignore the executable bit when we print
  non-conflicted files, so it makes sense to also ignore it for
  conflicted files.

* `git_diff_part()`: We print all conflicts with mode "100644" (the
  mode for regular files). Maybe it's best to use "100755" for
  conflicts that are unambiguously executable, or maybe it's better to
  use a fake mode like "000000" for all conflicts. Either way, the
  current behavior seems fine.

* `diff_content()`: We use the diff content in various diff
  formats. We could add more detail about the executable bits in some
  of them, but I think the current output is fine. For example,
  instead of our current "Created conflict in my-file", we could say
  "Created conflict in executable file my-file" or "Created conflict
  in ambiguously executable file my-file". That's getting verbose,
  though.

So, I think all we need to do is to make `Merge::to_file_merge()` not
require its inputs to be non-executable.

Closes #1279.
2023-10-24 06:45:45 -07:00
Martin von Zweigbergk
21b11df8a9 merge: make non-conflicted debug string for Merge shorter
Resolves states are most common and the current format is pretty
verbose. Let's print it as if `Merge` were an enum with `Resolved` and
`Conflicted` variants instead.
2023-10-24 06:45:45 -07:00
Yuya Nishihara
831dc84c5b git: remove RefName::GitRef variant which isn't used anymore 2023-10-24 09:45:01 +09:00
Yuya Nishihara
a756333216 git: move RefName enum from view module
It's no longer used by View API.
2023-10-24 09:45:01 +09:00