When deciding the order to visit commits to rebase, we currently look
up parents in the index. I'm trying to remove the current `IndexEntry`
type and will probably have revsets iterators yield simply
`CommitId`. Let's therefore look up commit objects here.
I timed this by rewriting all commits in the jj repo. I couldn't
measure any difference. That makes sense since we cache the commits in
`Store` and we would read the commit when rebasing it anyway.
The function is only used in tests, so it doesn't belong in
`default_revset_engine`. Also, it's not specific to that
implementation, so I rewrote as a revset evaluation.
There should be no problem to evaluate revset against base_repo and collect
commit objects from (mut_)repo, but it seemed a bit odd.
In rebase examples other than the "new --insert-after", we could switch to
tx.repo(). However, I think the use of tx.base_repo() makes it clear that
there's no data dependency on the previous mutation.
I'd like to be able to pass a `self` of `type `&ReadonlyRepo` to
functions that take a `&dyn Repo`. For that, we need `ReadonlyRepo`
itself to implement `Repo` instead of having `Arc<ReadonlyRepo>`
implement it. I could have solved it in a different way, but the `Arc`
requirement seems like an unnecessary constraint.
In most cases, we just need to access the commit backend and then the
shorter `base()` works. I noticed because I wanted to implement `Repo`
on `ReadonlyRepo` instead of on `Arc<ReadonlyRepo>` and then these
uses failed.
The functions resolving a change id to commits currently return a
`Vec<IndexEntry>`. We want to avoid depending on `IndexEntry` and we
only need the commit ids here.
The index position is specific to the default index implementation and
we don't want to use it in outside of there. This commit removes the
use of it as a key for nodes in the graphlog.
I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
Since we hid the graph iterator implementation behind
`Revset::iter_graph()`, I don't think we have any callers of
`Revset::iter()` require the iteration to be in index position order,
so let's not promise that. We do want to promise that the iteration is
in topological order with children before parents, however.
I think requests to reset the author came up twice in the last week,
so let's just add support for it. I copied git's behavior of resetting
the name, email, and timestamp. The flag name is also from git.
We need 1.64 to bump `clap` to `4.1`. We don't really need to upgrade
to that, but being on an older version causes minor confusions like
#1393. Rust 1.64 is very close to 6 months old at this point.
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.
Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:
* We support very large repos using custom backends, and it's probably
cheaper to resolve a prefix within the subset because it can all be
cached on the client. Resolving the prefix within the whole repo
requires a roundtrip to the server.
* We want to be able to resolve change id prefixes, which is always
done in *some* revset. That revset is currently `all()`, i.e. all
visible commits. Even on local disk, it's probably cheaper to
resolve a small revset first and then resolve the prefix within that
than it is to build up the index of all visible change ids.
We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.
This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
The callers don't need to hold on to the revset expression once it's
been evaluated, and having an owned expression (well, an expression
with shared ownership) will avoid a clone in the next commit.
A mapped template is basically a combined function that takes context: &C,
extracts Vec<O>, and formats each item with Template<C>. It cannot be cleanly
turned into a function of (&C) -> Vec<Template<()>> type. So list-like methods
are implemented on Box<dyn ListTemplate<C>> instead.
I'm going to add a trait that provides .join() -> Box<dyn Template>.
wrap_template() should handle it transparently, but the current interface
would require excessive boxing.
This involves a little hack to insert a lambda parameter 'x' to be used at
keyword position. If the template language were dynamically typed (and were
interpreted), .map() implementation would be simpler. I considered that, but
interpreter version has its own warts (late error reporting, uneasy to cache
static object, etc.), and I don't think the current template engine is
complex enough to rewrite from scratch.
.map() returns template, which can't be join()-ed. This will be fixed later.
A lambda expression will be allowed only in .map() operation. The syntax is
borrowed from Rust closure.
In Mercurial, a map operation is implemented by context substitution. For
example, 'parents % "{node}"' prints parents[i].node for each. There are two
major problems: 1. the top-level context cannot be referred from the inner map
expression. 2. context of different types inserts arbitrarily-named keywords
(e.g. a dict type inserts "{key}" and "{value}", but how we could know.)
These issues should be avoided by using explicitly named parameters.
parents.map(|parent| parent.commit_id ++ " " ++ commit_id)
^^^^^^^^^ global keyword
A downside is that we can't reuse template fragment in map expression. Suppose
we have -T commit_summary, -T 'parents.map(commit_summary)' doesn't work.
# only usable as a top-level template
'commit_summary' = 'commit_id.short() ++ " " ++ description.first_line()'
Another problem is that a lambda expression might be confused with an alias
function.
# .map(f) doesn't work, but .map(g) does
'f(x)' = 'x'
'g' = '|x| x'
The `jj debug` commands are hidden from help and are described as
"Low-level commands not intended for users", but e.g. `jj debug
completion` is intended for users, and should be visible in the help
output.
By using one letter for the path type before and one letter for path
type after, we can encode much more information than just the current
'M'/'A'/'R'. In particular, we can indicate new and resolved
conflicts. The color still encodes the same information as before. The
output looks a bit weird after many years of using `hg status`. It's a
bit more similar to the `git status -s` format with one letter for the
index and one with the working copy. Will we get used to it and find
it useful?