mirror of
https://github.com/salsa-rs/salsa.git
synced 2025-02-02 09:46:06 +00:00
move the accepted RFCs to the book and describe a new process
This commit is contained in:
parent
3b8f0754c3
commit
baee86bc25
6 changed files with 821 additions and 1 deletions
|
@ -9,4 +9,9 @@
|
|||
- [YouTube videos](./videos.md)
|
||||
- [Plumbing](./plumbing.md)
|
||||
- [Query groups](./plumbing/query_groups.md)
|
||||
- [Database](./plumbing/database.md)
|
||||
- [Database](./plumbing/database.md)
|
||||
- [RFCs](./rfcs.md)
|
||||
- [Template](./rfcs/template.md)
|
||||
- [RFC 0001: Query group traits](./rfcs/RFC0001-Query-Group-Traits.md)
|
||||
- [RFC 0002: Intern queries](./rfcs/RFC0002-Intern-Queries.md)
|
||||
- [RFC 0003: Query dependencies](./rfcs/RFC0003-Query-Dependencies.md)
|
29
book/src/rfcs.md
Normal file
|
@ -0,0 +1,29 @@
|
|||
# RFCs
|
||||
|
||||
The Salsa RFC process is used to describe the motivations for major changes made
|
||||
to Salsa. RFCs are recorded here in the Salsa book as a historical record of the
|
||||
considerations that were raised at the time. Note that the contents of RFCs,
|
||||
once merged, are typically not updated to match further changes. Instead, the
|
||||
rest of the book is updated to include the RFC text and then kept up to
|
||||
date as more PRs land and so forth.
|
||||
|
||||
## Creating an RFC
|
||||
|
||||
If you'd like to propose a major new Salsa feature, simply clone the repository
|
||||
and create a new chapter under the list of RFCs based on the [RFC template].
|
||||
Then open a PR with a subject line that starts with "RFC:".
|
||||
|
||||
[RFC template]: ./rfcs/template.md
|
||||
|
||||
## RFC vs Implementation
|
||||
|
||||
The RFC can be in its own PR, or it can also include work on the implementation
|
||||
together, whatever works best for you.
|
||||
|
||||
## Does my change need an RFC?
|
||||
|
||||
Not all PRs require RFCs. RFCs are only needed for larger features or major
|
||||
changes to how Salsa works. And they don't have to be super complicated, but
|
||||
they should capture the most important reasons you would like to make the
|
||||
change. When in doubt, it's ok to just open a PR, and we can always request an
|
||||
RFC if we want one.
|
373
book/src/rfcs/RFC0001-Query-Group-Traits.md
Normal file
|
@ -0,0 +1,373 @@
|
|||
# Motivation
|
||||
|
||||
- Support `dyn QueryGroup` for each query group trait as well as `impl QueryGroup`
|
||||
- `dyn QueryGroup` will be much more convenient, at the cost of runtime efficiency
|
||||
- Don't require you to redeclare each query in the final database, just the query groups
|
||||
|
||||
# User's guide
|
||||
|
||||
## Declaring a query group
|
||||
|
||||
Users will declare query groups by decorating a trait with `salsa::query_group`:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group(MyGroupStorage)]
|
||||
trait MyGroup {
|
||||
// Inputs are annotated with `#[salsa::input]`. For inputs, the final trait will include
|
||||
// a `set_my_input(&mut self, key: K1, value: V1)` method automatically added,
|
||||
// as well as possibly other mutation methods.
|
||||
#[salsa::input]
|
||||
fn my_input(&self, key: K1) -> V1;
|
||||
|
||||
// "Derived" queries are just a getter.
|
||||
fn my_query(&self, key: K2) -> V2;
|
||||
}
|
||||
```
|
||||
|
||||
The `query_group` attribute is a procedural macro. It takes as
|
||||
argument the name of the **storage struct** for the query group --
|
||||
this is a struct, generated by the macro, which represents the query
|
||||
group as a whole. It is attached to a trait definition which defines the
|
||||
individual queries in the query group.
|
||||
|
||||
The macro generates three things that users interact with:
|
||||
|
||||
- the trait, here named `MyGroup`. This will be used when writing the definitions
|
||||
for the queries and other code that invokes them.
|
||||
- the storage struct, here named `MyGroupStorage`. This will be used later when
|
||||
constructing the final database.
|
||||
- query structs, named after each query but converted to camel-case
|
||||
and with the word `Query` appended (e.g., `MyInputQuery` for `my_input`). These
|
||||
types are rarely needed, but are presently useful for things like
|
||||
invoking the GC. These types violate our rule that "things the user
|
||||
needs to name should be given names by the user", but we choose not
|
||||
to fully resolve this question in this RFC.
|
||||
|
||||
In addition, the macro generates a number of structs that users should
|
||||
not have to be aware of. These are described in the "reference guide"
|
||||
section.
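As a rough illustration of the user-facing items listed above, the expansion of `MyGroup` might look something like the following hand-written sketch. This is not the macro's real output: `K1`, `V1`, `K2` and `V2` are stand-in type aliases, and `ToyDb` is a hypothetical `HashMap`-backed database with no memoization or dependency tracking.

```rust
use std::collections::HashMap;

// Stand-in key/value types; the RFC leaves K1, V1, K2, V2 abstract.
type K1 = u32;
type V1 = String;
type K2 = u32;
type V2 = usize;

// The trait as users see it after expansion: the input getter gains a setter.
trait MyGroup {
    fn my_input(&self, key: K1) -> V1;
    fn set_my_input(&mut self, key: K1, value: V1);
    fn my_query(&self, key: K2) -> V2;
}

// The storage struct (named by the user) and the per-query structs
// (named after the queries). They are only markers in this sketch.
#[derive(Default)]
struct MyGroupStorage;
#[derive(Default, Debug)]
struct MyInputQuery;
#[derive(Default, Debug)]
struct MyQueryQuery;

// The definition of a derived query is an ordinary function that takes
// the database; by default it has the same name as the query.
fn my_query(db: &impl MyGroup, key: K2) -> V2 {
    db.my_input(key).len()
}

// Hypothetical toy database: inputs live in a plain map.
#[derive(Default)]
struct ToyDb {
    inputs: HashMap<K1, V1>,
}

impl MyGroup for ToyDb {
    fn my_input(&self, key: K1) -> V1 {
        self.inputs[&key].clone()
    }
    fn set_my_input(&mut self, key: K1, value: V1) {
        self.inputs.insert(key, value);
    }
    fn my_query(&self, key: K2) -> V2 {
        // Calls the free function above, not this method.
        my_query(self, key)
    }
}
```

Code that uses the group only ever names the trait (`&impl MyGroup` or `&dyn MyGroup`); the storage struct shows up again when the database is declared.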
|
||||
|
||||
### Controlling query modes
|
||||
|
||||
Input queries, as described in the trait, are specified via the
|
||||
`#[salsa::input]` attribute.
|
||||
|
||||
Derived queries can be customized by the following attributes,
|
||||
attached to the getter method (e.g., `fn my_query(..)`):
|
||||
|
||||
- `#[salsa::invoke(foo::bar)]` specifies the path to the function to invoke
|
||||
when the query is called (default is `my_query`).
|
||||
- `#[salsa::volatile]` specifies a "volatile" query, which is assumed to
|
||||
read untracked input and hence must be re-executed on every revision.
|
||||
- `#[salsa::dependencies]` specifies a "dependencies-only" query, which tracks
its dependencies but does not memoize its value, recomputing it on demand.
|
||||
|
||||
## Creating the database
|
||||
|
||||
Creating a salsa database works by using a `#[salsa::database(..)]`
|
||||
attribute. The `..` content should be a list of paths leading to the
|
||||
storage structs for each query group that the database will
|
||||
implement. It is no longer necessary to list the individual
|
||||
queries. In addition to the `salsa::database` attribute, the struct must
|
||||
have access to a `salsa::Runtime` and implement the `salsa::Database`
|
||||
trait. Hence the complete declaration looks roughly like so:
|
||||
|
||||
```rust
|
||||
#[salsa::database(MyGroupStorage)]
|
||||
struct MyDatabase {
|
||||
runtime: salsa::Runtime<MyDatabase>,
|
||||
}
|
||||
|
||||
impl salsa::Database for MyDatabase {
|
||||
fn salsa_runtime(&self) -> &salsa::Runtime<MyDatabase> {
|
||||
&self.runtime
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This (procedural) macro generates various impls and types that cause
|
||||
`MyDatabase` to implement all the traits for the query groups it
|
||||
supports, and which customize the storage in the runtime to have all
|
||||
the data needed. Users should not have to interact with these details,
|
||||
and they are written out in the reference guide section.
|
||||
|
||||
# Reference guide
|
||||
|
||||
The goal here is not to give the *full* details of how to do the
|
||||
lowering, but to describe the key concepts. Throughout the text, we
|
||||
will refer to names (e.g., `MyGroup` or `MyGroupStorage`) that appear
|
||||
in the example from the User's Guide -- this indicates that we use
|
||||
whatever name the user provided.
|
||||
|
||||
## The `plumbing::QueryGroup` trait
|
||||
|
||||
The `QueryGroup` trait is a new trait added to the plumbing module. It
|
||||
is implemented by the query group storage struct `MyGroupStorage`. Its
|
||||
role is to link from that struct to the various bits of data that the
|
||||
salsa runtime needs:
|
||||
|
||||
```rust
|
||||
pub trait QueryGroup<DB: Database> {
|
||||
type GroupStorage;
|
||||
type GroupKey;
|
||||
}
|
||||
```
|
||||
|
||||
This trait is implemented by the **storage struct** (`MyGroupStorage`)
|
||||
in our example. You can see there is a bit of confusing naming going
|
||||
on here -- what we call (for users) the "storage struct" actually
|
||||
does not wind up containing the true *storage* (that is, the hashmaps
|
||||
and things salsa uses). Instead, it merely implements the `QueryGroup`
|
||||
trait, which has associated types that lead us to structs we need:
|
||||
|
||||
- the **group storage** contains the hashmaps and things for all the queries in the group
|
||||
- the **group key** is an enum with variants for each of the
|
||||
queries. It basically stores all the data needed to identify some
|
||||
particular *query value* from within the group -- that is, the name
|
||||
of the query, plus the keys used to invoke it.
|
||||
|
||||
As described further on, the `#[salsa::query_group]` macro is
|
||||
responsible for generating an impl of this trait for the
|
||||
`MyGroupStorage` struct, along with the group storage and group key
|
||||
type definitions.
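To make the three pieces concrete, here is a hand-written sketch of the shapes involved. It assumes simplified stand-ins for the plumbing traits, uses plain `HashMap`s where real salsa storage would also track revisions and dependencies, and introduces a hypothetical `ToyDb` type purely so the associated types can be named.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

// Minimal stand-ins for the plumbing traits described in this RFC.
trait Database {}
trait QueryGroup<DB: Database> {
    type GroupStorage;
    type GroupKey;
}

// The user-named "storage struct": just a marker that carries the impl.
struct MyGroupStorage;

// The true storage: one field per query in the group.
struct MyGroupGroupStorage<DB: Database> {
    my_input: HashMap<u32, String>,
    my_query: HashMap<u32, usize>,
    _db: PhantomData<DB>,
}

// The group key: which query was invoked, plus the key it was invoked with.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[allow(non_camel_case_types)]
enum MyGroupGroupKey {
    my_input(u32),
    my_query(u32),
}

impl<DB: Database> QueryGroup<DB> for MyGroupStorage {
    type GroupStorage = MyGroupGroupStorage<DB>;
    type GroupKey = MyGroupGroupKey;
}

// Hypothetical database type, used only to name the associated types.
struct ToyDb;
impl Database for ToyDb {}

// Reaching the real storage type goes through the marker struct.
fn empty_storage() -> <MyGroupStorage as QueryGroup<ToyDb>>::GroupStorage {
    MyGroupGroupStorage {
        my_input: HashMap::new(),
        my_query: HashMap::new(),
        _db: PhantomData,
    }
}
```

The marker struct holds no data; everything the runtime needs is reached through its two associated types.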
|
||||
|
||||
## The `plumbing::HasQueryGroup<G>` trait
|
||||
|
||||
The `HasQueryGroup<G>` trait is a new trait added to the plumbing
|
||||
module. It is implemented by the database struct `MyDatabase` for
|
||||
every query group that `MyDatabase` supports. Its role is to offer
|
||||
methods that move back and forth between the context of the *full
|
||||
database* and the context of an *individual query group*:
|
||||
|
||||
```rust
|
||||
pub trait HasQueryGroup<G>: Database
|
||||
where
|
||||
G: QueryGroup<Self>,
|
||||
{
|
||||
/// Access the group storage struct from the database.
|
||||
fn group_storage(db: &Self) -> &G::GroupStorage;
|
||||
|
||||
/// "Upcast" a group key into a database key.
|
||||
fn database_key(group_key: G::GroupKey) -> Self::DatabaseKey;
|
||||
}
|
||||
```
|
||||
|
||||
Here the "database key" is an enum that contains variants for each
|
||||
group. Its role is to take a group key and put it into the context of
|
||||
the entire database.
|
||||
|
||||
## The `Query` trait
|
||||
|
||||
The query trait (pre-existing) is extended to include links to its
|
||||
group, and methods to convert from the group storage to the query
|
||||
storage, plus methods to convert from a query key up to the group key:
|
||||
|
||||
```rust
|
||||
pub trait Query<DB: Database>: Debug + Default + Sized + 'static {
|
||||
/// Type that you give as a parameter -- for queries with zero
|
||||
/// or more than one input, this will be a tuple.
|
||||
type Key: Clone + Debug + Hash + Eq;
|
||||
|
||||
/// What value does the query return?
|
||||
type Value: Clone + Debug;
|
||||
|
||||
/// Internal struct storing the values for the query.
|
||||
type Storage: plumbing::QueryStorageOps<DB, Self> + Send + Sync;
|
||||
|
||||
/// Associated query group struct.
|
||||
type Group: plumbing::QueryGroup<
|
||||
DB,
|
||||
GroupStorage = Self::GroupStorage,
|
||||
GroupKey = Self::GroupKey,
|
||||
>;
|
||||
|
||||
/// Generated struct that contains storage for all queries in a group.
|
||||
type GroupStorage;
|
||||
|
||||
/// Type that identifies a particular query within the group + its key.
|
||||
type GroupKey;
|
||||
|
||||
/// Extract storage for this query from the storage for its group.
|
||||
fn query_storage(group_storage: &Self::GroupStorage) -> &Self::Storage;
|
||||
|
||||
/// Create group key for this query.
|
||||
fn group_key(key: Self::Key) -> Self::GroupKey;
|
||||
}
|
||||
```
|
||||
|
||||
## Converting to/from the context of the full database generically
|
||||
|
||||
Putting all the previous plumbing traits together, this means
|
||||
that given:
|
||||
|
||||
- a database `DB` that implements `HasQueryGroup<G>`;
|
||||
- a group struct `G` that implements `QueryGroup<DB>`; and,
|
||||
- a query struct `Q` that implements `Query<DB, Group = G>`
|
||||
|
||||
we can (generically) get the storage for the individual query
|
||||
`Q` out from the database `db` via a two-step process:
|
||||
|
||||
```rust
|
||||
let group_storage = HasQueryGroup::group_storage(db);
|
||||
let query_storage = Query::query_storage(group_storage);
|
||||
```
|
||||
|
||||
Similarly, we can convert from the key to an individual query
|
||||
up to the "database key" in a two-step process:
|
||||
|
||||
```rust
|
||||
let group_key = Query::group_key(key);
|
||||
let db_key = HasQueryGroup::database_key(group_key);
|
||||
```
|
||||
|
||||
## Lowering query groups
|
||||
|
||||
The role of the `#[salsa::query_group(MyGroupStorage)] trait MyGroup {
|
||||
.. }` macro is primarily to generate the group storage struct and the
|
||||
impl of `QueryGroup`. That involves generating the following things:
|
||||
|
||||
- the query trait `MyGroup` itself, but with:
|
||||
- `salsa::foo` attributes stripped
|
||||
- `#[salsa::input]` methods expanded to include setters:
|
||||
- `fn set_my_input(&mut self, key: K1, value__: V1);`
|
||||
- `fn set_constant_my_input(&mut self, key: K1, value__: V1);`
|
||||
- the query group storage struct `MyGroupStorage`
|
||||
- We also generate an impl of `QueryGroup<DB>` for `MyGroupStorage`,
|
||||
linking to the internal storage struct and group key enum
|
||||
- the individual query types
|
||||
- Ideally, we would use Rust hygiene to hide these structs, but as
|
||||
that is not currently possible they are given names based on the
|
||||
queries, but converted to camel-case (e.g., `MyInputQuery` and `MyQueryQuery`).
|
||||
- They implement the `salsa::Query` trait.
|
||||
- the internal group storage struct
|
||||
- Ideally, we would use Rust hygiene to hide this struct, but as
|
||||
that is not currently possible it is entitled
|
||||
`MyGroupGroupStorage<DB>`. Note that it is generic with respect to
|
||||
the database `DB`. This is because the actual query storage
|
||||
sometimes requires storing database keys and hence we need to
|
||||
know the final database type.
|
||||
- It contains one field per query with a link to the storage information
|
||||
for that query:
|
||||
- `my_query: <MyQueryQuery as salsa::plumbing::Query<DB>>::Storage`
|
||||
- (the `MyQueryQuery` type is also generated, see the "individual query types" above)
|
||||
- The internal group storage struct offers a public, inherent method
|
||||
`for_each_query`:
|
||||
- `fn for_each_query(db: &DB, op: &mut dyn FnMut(...))`
|
||||
- this is invoked by the code generated by `#[salsa::database]` when implementing the
|
||||
`for_each_query` method of the `plumbing::DatabaseOps` trait
|
||||
- the group key
|
||||
- Again, ideally we would use hygiene to hide the name of this enum,
|
||||
but since we cannot, it is entitled `MyGroupGroupKey`
|
||||
- It is an enum which contains one variant per query with the value being the key:
|
||||
- `my_query(<MyQueryQuery as salsa::plumbing::Query<DB>>::Key)`
|
||||
- The group key enum offers a public, inherent method `maybe_changed_since`:
|
||||
- `fn maybe_changed_since<DB>(db: &DB, db_descriptor: &DB::DatabaseKey, revision: Revision)`
|
||||
- it is invoked when implementing `maybe_changed_since` for the database key
|
||||
|
||||
## Lowering database storage
|
||||
|
||||
The `#[salsa::database(MyGroupStorage)]` attribute macro creates the links to the query groups.
|
||||
It generates the following things:
|
||||
|
||||
- impl of `HasQueryGroup<MyGroupStorage>` for `MyDatabase`
|
||||
- Naturally, there is one such impl for each query group.
|
||||
- the database key enum
|
||||
- Ideally, we would use Rust hygiene to hide this enum, but currently
|
||||
it is called `__SalsaDatabaseKey`.
|
||||
- The database key is an enum with one variant per query group:
|
||||
- `MyGroupStorage(<MyGroupStorage as QueryGroup<MyDatabase>>::GroupKey)`
|
||||
- the database storage struct
|
||||
- Ideally, we would use Rust hygiene to hide this struct, but currently
|
||||
it is called `__SalsaDatabaseStorage`.
|
||||
- The database storage struct contains one field per query group, storing
|
||||
its internal storage:
|
||||
- `my_group_storage: <MyGroupStorage as QueryGroup<MyDatabase>>::GroupStorage`
|
||||
- impl of `plumbing::DatabaseStorageTypes` for `MyDatabase`
|
||||
- This is a plumbing trait that links to the database storage / database key types.
|
||||
- The `salsa::Runtime` uses it to determine what data to include. The query types
|
||||
use it to determine a database-key.
|
||||
- impl of `plumbing::DatabaseOps` for `MyDatabase`
|
||||
- This contains a `for_each_query` method, which is implemented by invoking, in turn,
|
||||
the inherent methods defined on each query group storage struct.
|
||||
- impl of `plumbing::DatabaseKey` for the database key enum
|
||||
- This contains a method `maybe_changed_since`. We implement this by
|
||||
matching to get a particular group key, and then invoking the
|
||||
inherent method on the group key enum.
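The dispatch from the database key down to the group keys can be sketched as follows. This is an illustration only: the two groups (`Parser`, `Typeck`), the `u64` revision type and the hard-coded "last changed" revisions are all invented, and the real `maybe_changed_since` also takes the database as an argument.

```rust
// Hypothetical revision counter; salsa's real `Revision` type differs.
type Revision = u64;

// Group keys, as `query_group` might generate them for two groups.
#[derive(Debug, Clone, PartialEq)]
enum ParserGroupKey {
    Parse(String),
}
#[derive(Debug, Clone, PartialEq)]
enum TypeckGroupKey {
    TypeOf(u32),
}

impl ParserGroupKey {
    // Inherent method: pretend every parse result last changed in revision 3.
    fn maybe_changed_since(&self, revision: Revision) -> bool {
        match self {
            ParserGroupKey::Parse(_) => revision < 3,
        }
    }
}

impl TypeckGroupKey {
    // Pretend every type-check result last changed in revision 7.
    fn maybe_changed_since(&self, revision: Revision) -> bool {
        match self {
            TypeckGroupKey::TypeOf(_) => revision < 7,
        }
    }
}

// The database key: one variant per query group.
#[derive(Debug, Clone, PartialEq)]
enum DatabaseKey {
    Parser(ParserGroupKey),
    Typeck(TypeckGroupKey),
}

impl DatabaseKey {
    // Match to find the group, then defer to that group's inherent method.
    fn maybe_changed_since(&self, revision: Revision) -> bool {
        match self {
            DatabaseKey::Parser(key) => key.maybe_changed_since(revision),
            DatabaseKey::Typeck(key) => key.maybe_changed_since(revision),
        }
    }
}
```

The outer `match` is the first discriminant check and the inner one is the second, which is the cost discussed under "Size of a database key".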
|
||||
|
||||
# Alternatives
|
||||
|
||||
This proposal results from a fair amount of iteration. Compared to the
|
||||
status quo, there is one primary downside. We also explain a few things here that
|
||||
may not be obvious.
|
||||
|
||||
## Why include a group storage struct?
|
||||
|
||||
You might wonder why we need the `MyGroupStorage` struct at all. It is a touch of boilerplate,
|
||||
but there are several advantages to it:
|
||||
|
||||
- You can't attach associated types to the trait itself. This is because the "type version"
|
||||
of the trait (`dyn MyGroup`) may not be available, since not all traits are dyn-capable.
|
||||
- We try to keep to the principle that "any type that might be named
|
||||
externally from the macro is given its name by the user". In this
|
||||
case, the `#[salsa::database]` attribute needs to name group storage
|
||||
structs.
|
||||
- In earlier versions, we tried to auto-generate these names, but
|
||||
this failed because sometimes users would want to `pub use` the
|
||||
query traits and hide their original paths.
|
||||
- (One exception to this principle today are the per-query structs.)
|
||||
- We expect that we can use the `MyGroupStorage` to achieve more
|
||||
encapsulation in the future. While the struct must be public and
|
||||
named from the database, the *trait* (and query key/value types)
|
||||
actually do not have to be.
|
||||
|
||||
## Downside: Size of a database key
|
||||
|
||||
Database keys now wind up with two discriminants: one to identify the
|
||||
group, and one to identify the query. That's a bit sad. This could be
|
||||
overcome by using unsafe code: the idea would be that a group/database
|
||||
key would be stored as the pair of an integer and a `union`. Each
|
||||
group within a given database would be assigned a range of integer
|
||||
values, and the unions would store the actual key values. We leave
|
||||
such a change for future work.
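The range-assignment idea can be illustrated without unsafe code if we assume, purely for the sake of the sketch, that every query key is a `u32`. The group names, the range bases and the `FlatKey` type below are all invented; a real implementation would need a `union` to hold heterogeneous key types.

```rust
// Nested form from this RFC: the outer tag picks the group, the inner tag
// picks the query within it.
enum GroupAKey {
    Q0(u32),
    Q1(u32),
}
enum GroupBKey {
    Q0(u32),
    Q1(u32),
    Q2(u32),
}
enum NestedKey {
    A(GroupAKey),
    B(GroupBKey),
}

// Assumed range assignment: group A owns indices 0..2, group B owns 2..5.
const GROUP_A_BASE: u32 = 0;
const GROUP_B_BASE: u32 = 2;

// Flattened form: a single integer identifies both group and query.
// All keys are `u32` here, so no `union` is needed in this sketch.
#[derive(Debug, PartialEq)]
struct FlatKey {
    query_index: u32,
    payload: u32,
}

fn flatten(key: NestedKey) -> FlatKey {
    let (query_index, payload) = match key {
        NestedKey::A(GroupAKey::Q0(k)) => (GROUP_A_BASE, k),
        NestedKey::A(GroupAKey::Q1(k)) => (GROUP_A_BASE + 1, k),
        NestedKey::B(GroupBKey::Q0(k)) => (GROUP_B_BASE, k),
        NestedKey::B(GroupBKey::Q1(k)) => (GROUP_B_BASE + 1, k),
        NestedKey::B(GroupBKey::Q2(k)) => (GROUP_B_BASE + 2, k),
    };
    FlatKey { query_index, payload }
}
```

With this layout, `query_index` alone determines which group and which query a key belongs to, so only one tag has to be stored and checked.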
|
||||
|
||||
# Future possibilities
|
||||
|
||||
Here are some ideas we might want to do later.
|
||||
|
||||
## No generics
|
||||
|
||||
We leave generic parameters on the query group trait etc for future work.
|
||||
|
||||
## Public / private
|
||||
|
||||
We'd like the ability to make more details from the query groups
|
||||
private. This will require some tinkering.
|
||||
|
||||
## Inline query definitions
|
||||
|
||||
Instead of defining queries in separate functions, it might be nice to
|
||||
have the option of defining query methods in the trait itself:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group(MyGroupStorage)]
|
||||
trait MyGroup {
|
||||
#[salsa::input]
|
||||
fn my_input(&self, key: K1) -> V1;
|
||||
|
||||
fn my_query(&self, key: K2) -> V2 {
|
||||
// define my-query right here!
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
It's a bit tricky to figure out how to handle this, so that is left
|
||||
for future work. Also, it would mean that the method body itself is
|
||||
inside of a macro (the procedural macro) which can make IDE
|
||||
integration harder.
|
||||
|
||||
## Non-query functions
|
||||
|
||||
It might be nice to be able to include functions in the trait that are
|
||||
*not* queries, but rather helpers that compose queries. This should be
|
||||
pretty easy, just need a suitable `#[salsa]` attribute.
|
||||
|
272
book/src/rfcs/RFC0002-Intern-Queries.md
Normal file
|
@ -0,0 +1,272 @@
|
|||
# Summary
|
||||
|
||||
- We introduce `#[salsa::interned]` queries which convert a `Key` type
|
||||
into a numeric index of type `Value`, where `Value` is either the
|
||||
type `InternId` (defined by salsa) or some newtype thereof.
|
||||
- Each interned query `foo` also produces an inverse `lookup_foo`
|
||||
method that converts back from the `Value` to the `Key` that was
|
||||
interned.
|
||||
- The `InternId` type (defined by salsa) is basically a newtype'd integer,
|
||||
but it internally uses `NonZeroU32` to enable space-saving optimizations
|
||||
in memory layout.
|
||||
- The `Value` types can be any type that implements the
|
||||
`salsa::InternKey` trait, also introduced by this RFC. This trait
|
||||
has two methods, `from_intern_id` and `as_intern_id`.
|
||||
- The interning is integrated into the GC and tracked like any other
|
||||
query, which means that interned values can be garbage-collected,
|
||||
and any computation that was dependent on them will be collected.
|
||||
|
||||
# Motivation
|
||||
|
||||
## The need for interning
|
||||
|
||||
Many salsa applications wind up needing the ability to construct
|
||||
"interned keys". Frequently this pattern emerges because we wish to
|
||||
construct identifiers for things in the input. These identifiers
|
||||
generally have a "tree-like shape". For example, in a compiler, there
|
||||
may be some set of input files -- these are enumerated in the inputs
|
||||
and serve as the "base" for a path that leads to items in the user's
|
||||
input. But within an input file, there are additional structures, such
|
||||
as `struct` or `impl` declarations, and these structures may contain
|
||||
further structures within them (such as fields or methods). This gives
|
||||
rise to a path like so that can be used to identify a given item:
|
||||
|
||||
```
|
||||
PathData = <file-name>
|
||||
| PathData / <identifier>
|
||||
```
|
||||
|
||||
These paths *could* be represented in the compiler with an `Arc`, but
|
||||
because they are omnipresent, it is convenient to intern them instead
|
||||
and use an integer. Integers are `Copy` types, which is convenient,
|
||||
and they are also small (32 bits typically suffices in practice).
|
||||
|
||||
## Why interning is difficult today: garbage collection
|
||||
|
||||
Unfortunately, integrating interning into salsa at present presents
|
||||
some hard choices, particularly with a long-lived application. You can
|
||||
easily add an interning table into the database, but unless you do
|
||||
something clever, **it will simply grow and grow forever**. But as the
|
||||
user edits their programs, some paths that used to exist will no
|
||||
longer be relevant -- for example, a given file or impl may be
|
||||
removed, invalidating all those paths that were based on it.
|
||||
|
||||
Due to the nature of salsa's recomputation model, it is not easy to
|
||||
detect when paths that used to exist in a prior revision are no longer
|
||||
relevant in the next revision. **This is because salsa never
|
||||
explicitly computes "diffs" of this kind between revisions -- it just
|
||||
finds subcomputations that might have gone differently and re-executes
|
||||
them.** Therefore, if the code that created the paths (e.g., that
|
||||
processed the result of the parser) is part of a salsa query, it will
|
||||
simply not re-create the invalidated paths -- there is no explicit
|
||||
"deletion" point.
|
||||
|
||||
In fact, the same is true of all of salsa's memoized query values. We
|
||||
may find that in a new revision, some memoized query values are no
|
||||
longer relevant. For example, in revision R1, perhaps we computed
|
||||
`foo(22)` and `foo(44)`, but in the new input, we now only need to
|
||||
compute `foo(22)`. The `foo(44)` value is still memoized, we just
|
||||
never asked for its value. **This is why salsa includes a garbage
|
||||
collector, which can be used to clean up these memoized values that are
|
||||
no longer relevant.**
|
||||
|
||||
But using a garbage collection strategy with a hand-rolled interning
|
||||
scheme is not easy. You *could* trace through all the values in
|
||||
salsa's memoization tables to implement a kind of mark-and-sweep
|
||||
scheme, but that would require salsa to add such a mechanism. It
|
||||
might also be quite a lot of tracing! The current salsa GC mechanism has no
|
||||
need to walk through the values themselves in a memoization table, it only
|
||||
examines the keys and the metadata (unless we are freeing a value, of course).
|
||||
|
||||
## How this RFC changes the situation
|
||||
|
||||
This RFC presents an alternative. The idea is to move the interning
|
||||
into salsa itself by creating special "interning
|
||||
queries". Dependencies on these queries are tracked like any other
|
||||
query and hence they integrate naturally with salsa's garbage
|
||||
collection mechanisms.
|
||||
|
||||
# User's guide
|
||||
|
||||
This section covers how interned queries are expected to be used.
|
||||
|
||||
## Declaring an interned query
|
||||
|
||||
You can declare an interned query like so:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group]
|
||||
trait Foo {
|
||||
#[salsa::interned]
|
||||
fn intern_path_data(&self, data: PathData) -> salsa::InternId;
|
||||
}
|
||||
```
|
||||
|
||||
**Query keys.** Like any query, these queries can take any number of keys. If multiple
|
||||
keys are provided, then the interned key is a tuple of each key
|
||||
value. In order to be interned, the keys must implement `Clone`,
|
||||
`Hash` and `Eq`.
|
||||
|
||||
**Return type.** The return type of an interned key may be of any type
|
||||
that implements `salsa::InternKey`: salsa provides an impl for the
|
||||
type `salsa::InternId`, but you can implement it for your own.
|
||||
|
||||
**Inverse query.** For each interning query, we automatically generate
|
||||
a reverse query that will invert the interning step. It is named
|
||||
`lookup_XXX`, where `XXX` is the name of the query. Hence here it
|
||||
would be `fn lookup_intern_path_data(&self, key: salsa::InternId) -> PathData`.
|
||||
|
||||
## The expected use
|
||||
|
||||
Using an interned query is quite straightforward. You simply invoke it
|
||||
with a key, and you will get back an integer, and you can use the
|
||||
generated `lookup` method to convert back to the original value:
|
||||
|
||||
```rust
|
||||
let key = db.intern_path_data(path_data1);
|
||||
let path_data2 = db.lookup_intern_path_data(key);
|
||||
```
|
||||
|
||||
Note that the interned value will be cloned -- so, like all Salsa
|
||||
values, it is best if that is a cheap operation. Interestingly,
|
||||
interning can help to keep recursive, tree-shaped values cheap,
|
||||
because the "pointers" within can be replaced with interned keys.
|
||||
|
||||
## Custom return types
|
||||
|
||||
The return type for an intern query does not have to be an `InternId`. It can
|
||||
be any type that implements the `salsa::InternKey` trait:
|
||||
|
||||
```rust
|
||||
pub trait InternKey {
|
||||
/// Create an instance of the intern-key from an `InternId` value.
|
||||
fn from_intern_id(v: InternId) -> Self;
|
||||
|
||||
/// Extract the `InternId` with which the intern-key was created.
|
||||
fn as_intern_id(&self) -> InternId;
|
||||
}
|
||||
```
|
||||
|
||||
## Recommended practice
|
||||
|
||||
This section shows the recommended practice for using interned keys,
|
||||
building on the `Path` and `PathData` example that we've been working
|
||||
with.
|
||||
|
||||
### Naming Convention
|
||||
|
||||
First, note the recommended naming convention: the *intern key* is
|
||||
`Foo` and the key's associated data `FooData` (in our case, `Path` and
|
||||
`PathData`). The intern key is given the shorter name because it is
|
||||
used far more often. Moreover, other types should never store the full
|
||||
data, but rather should store the interned key.
|
||||
|
||||
### Defining the intern key
|
||||
|
||||
The intern key should always be a newtype struct that implements
|
||||
the `InternKey` trait. So, something like this:
|
||||
|
||||
```rust
|
||||
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct Path(InternId);
|
||||
|
||||
impl salsa::InternKey for Path {
|
||||
fn from_intern_id(v: InternId) -> Self {
|
||||
Path(v)
|
||||
}
|
||||
|
||||
fn as_intern_id(&self) -> InternId {
|
||||
self.0
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Convenient lookup method
|
||||
|
||||
It is often convenient to add a `lookup` method to the newtype key:
|
||||
|
||||
```rust
|
||||
impl Path {
|
||||
// Adding this method is often convenient, since you can then
|
||||
// write `path.lookup(db)` to access the data, which reads a bit better.
|
||||
pub fn lookup(&self, db: &impl MyDatabase) -> PathData {
|
||||
db.lookup_intern_path_data(*self)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Defining the data type
|
||||
|
||||
Recall that our paths were defined by a recursive grammar like so:
|
||||
|
||||
```
|
||||
PathData = <file-name>
|
||||
| PathData / <identifier>
|
||||
```
|
||||
|
||||
This recursion is quite typical of salsa applications. The recommended
|
||||
way to encode it in the `PathData` structure itself is to build on other
|
||||
intern keys, like so:
|
||||
|
||||
```rust
|
||||
#[derive(Clone, Hash, Eq, ..)]
|
||||
enum PathData {
|
||||
Root(String),
|
||||
Child(Path, String),
|
||||
// ^^^^ Note that the recursive reference here
|
||||
// is encoded as a Path.
|
||||
}
|
||||
```
|
||||
|
||||
Note though that the `PathData` type will be cloned whenever the value
|
||||
for an interned key is looked up, and it may also be cloned to store
|
||||
dependency information between queries. So, as an optimization, you
|
||||
might prefer to avoid `String` in favor of `Arc<String>` -- or even
|
||||
intern the strings as well.
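The following self-contained sketch shows why the recursive encoding stays cheap. It does not use salsa: `Path` wraps a plain `u32` rather than an `InternId`, and `PathInterner` is a hypothetical stand-in for the interning query and its generated `lookup` method.

```rust
use std::collections::HashMap;

// Interned key: a small `Copy` handle (a plain u32 here, not salsa's InternId).
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
struct Path(u32);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum PathData {
    Root(String),
    // The parent is stored as an interned key, so cloning a `PathData`
    // never copies the whole chain of ancestors.
    Child(Path, String),
}

// Hypothetical stand-in for the database's interning table.
#[derive(Default)]
struct PathInterner {
    map: HashMap<PathData, Path>,
    data: Vec<PathData>,
}

impl PathInterner {
    // Return the existing key for `data`, or allocate the next index.
    fn intern_path_data(&mut self, data: PathData) -> Path {
        if let Some(&path) = self.map.get(&data) {
            return path;
        }
        let path = Path(self.data.len() as u32);
        self.data.push(data.clone());
        self.map.insert(data, path);
        path
    }

    // Inverse of interning; clones the stored data, as the text describes.
    fn lookup_intern_path_data(&self, path: Path) -> PathData {
        self.data[path.0 as usize].clone()
    }

    // Walk the parent links to render a path such as "lib.rs/Foo/bar".
    fn render(&self, path: Path) -> String {
        match self.lookup_intern_path_data(path) {
            PathData::Root(file) => file,
            PathData::Child(parent, name) => format!("{}/{}", self.render(parent), name),
        }
    }
}
```

Each lookup clones one `PathData` node: a `String` plus a four-byte parent key, regardless of how deep the path is.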
|
||||
|
||||
## Interaction with the garbage collector
|
||||
|
||||
Interned keys can be garbage collected as normal, with one
|
||||
caveat. Even if requested, Salsa will never collect the results
|
||||
generated in the current revision. This is because doing so would permit the
|
||||
same key to be interned twice in the same revision, possibly mapping
|
||||
to distinct intern keys each time.
|
||||
|
||||
Note that if an interned key *is* collected, its index will be
|
||||
re-used. Salsa's dependency tracking system should ensure that
|
||||
anything incorporating the older value is considered dirty, but you
|
||||
may see the same index showing up more than once in the logs.
|
||||
|
||||
# Reference guide
|
||||
|
||||
Interned keys are implemented using a hash-map that maps from the
|
||||
interned data to its index, as well as a vector containing (for each
|
||||
index) various bits of data. In addition to the interned data, we must
|
||||
track the revision in which the value was interned and the revision in
|
||||
which it was last accessed, to help manage the interaction with the
|
||||
GC. Finally, we have to track some sort of free list that tracks the
|
||||
keys that are being re-used. The current implementation never actually
|
||||
shrinks the vectors and maps from their maximum size, but this might
|
||||
be a useful thing to be able to do (this is effectively a memory
|
||||
allocator, so standard allocation strategies could be used here).
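A minimal sketch of such a table is shown below. It is not salsa's implementation: the interned data is fixed to `String`, revisions are plain `u64` values, and the `collect` policy (free anything not accessed since a cutoff, except entries interned in the current revision) is an assumed simplification of the GC interaction described earlier.

```rust
use std::collections::HashMap;

type Revision = u64;

struct Slot {
    // `None` while the index sits on the free list.
    data: Option<String>,
    interned_at: Revision,
    accessed_at: Revision,
}

#[derive(Default)]
struct InternTable {
    map: HashMap<String, u32>, // interned data -> index
    slots: Vec<Slot>,          // index -> data and revisions
    free: Vec<u32>,            // indices available for re-use
}

impl InternTable {
    fn intern(&mut self, data: &str, now: Revision) -> u32 {
        if let Some(&index) = self.map.get(data) {
            self.slots[index as usize].accessed_at = now;
            return index;
        }
        let slot = Slot {
            data: Some(data.to_string()),
            interned_at: now,
            accessed_at: now,
        };
        // Prefer a freed index; otherwise grow the vector.
        let index = match self.free.pop() {
            Some(index) => {
                self.slots[index as usize] = slot;
                index
            }
            None => {
                self.slots.push(slot);
                (self.slots.len() - 1) as u32
            }
        };
        self.map.insert(data.to_string(), index);
        index
    }

    fn lookup(&self, index: u32) -> Option<&str> {
        self.slots[index as usize].data.as_deref()
    }

    // Free every slot not accessed since `cutoff`, but never one that was
    // interned in the current revision `now`.
    fn collect(&mut self, cutoff: Revision, now: Revision) {
        for (index, slot) in self.slots.iter_mut().enumerate() {
            if slot.interned_at == now || slot.accessed_at >= cutoff {
                continue;
            }
            if let Some(data) = slot.data.take() {
                self.map.remove(&data);
                self.free.push(index as u32);
            }
        }
    }
}
```

As in the description above, `slots` never shrinks: a collected index goes onto `free` and is handed out again by a later `intern` call.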
|
||||
|
||||
## InternId
|
||||
|
||||
Presently the `InternId` type is implemented to wrap a `NonZeroU32`:
|
||||
|
||||
```rust
|
||||
pub struct InternId {
|
||||
value: NonZeroU32,
|
||||
}
|
||||
```
|
||||
|
||||
This means that `Option<InternId>` (or `Option<Path>`, continuing our
|
||||
example from before) will only be a single word. To accommodate this,
|
||||
the `InternId` constructors require that the value is less than
|
||||
`InternId::MAX`; the value is deliberately set low (currently to
|
||||
`0xFFFF_FF00`) to allow for more sentinel values in the future (Rust
|
||||
doesn't presently expose the capability of having sentinel values
|
||||
other than zero on stable, but it is possible on nightly).
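The space saving can be checked directly with `std::mem::size_of`. In the sketch below, the `new` and `as_u32` methods and the "store `index + 1`" encoding are assumptions made for illustration; only the `NonZeroU32` field and the `MAX` value come from the text above.

```rust
use std::num::NonZeroU32;

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct InternId {
    value: NonZeroU32,
}

impl InternId {
    /// Upper bound on indices (value taken from the text above).
    pub const MAX: u32 = 0xFFFF_FF00;

    /// Assumed encoding: store `index + 1` so that zero is never used and
    /// remains available as the niche for `Option<InternId>`.
    pub fn new(index: u32) -> Self {
        assert!(index < Self::MAX, "intern index out of range");
        InternId {
            value: NonZeroU32::new(index + 1).unwrap(),
        }
    }

    /// Recover the original zero-based index.
    pub fn as_u32(self) -> u32 {
        self.value.get() - 1
    }
}
```

Because zero is never a valid `NonZeroU32`, the compiler can represent `None` as the all-zero bit pattern, so `Option<InternId>` occupies the same four bytes as `InternId`.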
|
||||
|
||||
# Alternatives and future work
|
||||
|
||||
None at present.
|
121
book/src/rfcs/RFC0003-Query-Dependencies.md
Normal file
|
@ -0,0 +1,121 @@
|
|||
# Summary
|
||||
|
||||
Allow specifying a dependency on a query group without making it a supertrait.
|
||||
|
||||
# Motivation
|
||||
|
||||
Currently, there's only one way to express that queries from group `A` can use
|
||||
another group `B`: namely, `B` can be a super-trait of `A`:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group(AStorage)]
|
||||
trait A: B {
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
This approach works and allows one to express complex dependencies. However,
|
||||
this approach falls down when one wants to make a dependency a private
|
||||
implementation detail: Clients with `db: &impl A` can freely call `B` methods on
|
||||
the `db`.
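The leak can be reproduced with plain Rust traits, without Salsa at all. In this sketch the methods `low_level` and `high_level`, the `Db` type, and the `client` function are hypothetical names chosen for illustration.

```rust
trait B {
    fn low_level(&self) -> u32;
}

// `B` is a supertrait of `A`, mirroring `trait A: B` above.
trait A: B {
    fn high_level(&self) -> u32;
}

struct Db;

impl B for Db {
    fn low_level(&self) -> u32 {
        1
    }
}

impl A for Db {
    fn high_level(&self) -> u32 {
        self.low_level() + 1
    }
}

// The client only asks for `A`, yet it can still reach `B`'s methods,
// because every `A` is also a `B`.
fn client(db: &impl A) -> u32 {
    db.low_level()
}
```

Nothing in the signature of `client` hints that it depends on `B`, which is exactly the problem described above.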
|
||||
|
||||
This is a bad situation from a software engineering point of view: if everything
|
||||
is accessible, it's hard to make a distinction between public API and private
|
||||
implementation details. In the context of salsa the situation is even worse,
|
||||
because it breaks the "firewall" pattern. It's customary to wrap low-level
|
||||
frequently-changing or volatile queries into higher-level queries which produce
|
||||
stable results and contain invalidation. In the current salsa, however, it's
|
||||
very easy to accidentally call a low-level volatile query instead of a wrapper,
|
||||
introducing an undesired dependency.
|
||||
|
||||
# User's guide
|
||||
|
||||
To specify query dependencies, a `requires` attribute should be used:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group(SymbolsDatabaseStorage)]
|
||||
#[salsa::requires(SyntaxDatabase)]
|
||||
#[salsa::requires(EnvDatabase)]
|
||||
pub trait SymbolsDatabase {
|
||||
fn get_symbol_by_name(&self, name: String) -> Symbol;
|
||||
}
|
||||
```
|
||||
|
||||
The argument of `requires` is a path to a trait. The traits from all `requires`
|
||||
attributes are available when implementing the query:
|
||||
|
||||
```rust
|
||||
fn get_symbol_by_name(
|
||||
db: &(impl SymbolsDatabase + SyntaxDatabase + EnvDatabase),
|
||||
name: String,
|
||||
) -> Symbol {
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
However, these traits are **not** available without explicit bounds:
|
||||
|
||||
```rust
|
||||
fn fuzzy_find_symbol(db: &impl SymbolsDatabase, name: String) {
|
||||
// Can't accidentally call methods of the `SyntaxDatabase`
|
||||
}
|
||||
```
|
||||
|
||||
Note that, while the RFC does not propose to add per-query dependencies, a query
|
||||
implementation can voluntarily specify only a subset of the traits from the `requires`
|
||||
attributes:
|
||||
|
||||
```rust
|
||||
fn get_symbol_by_name(
|
||||
// Purposefully don't depend on EnvDatabase
|
||||
db: &(impl SymbolsDatabase + SyntaxDatabase),
|
||||
name: String,
|
||||
) -> Symbol {
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
# Reference guide
|
||||
|
||||
The implementation is straightforward and consists of adding traits from
|
||||
`requires` attributes to various `where` bounds. For example, we would generate
|
||||
the following blanket impl for the example above:
|
||||
|
||||
```rust
|
||||
impl<T> SymbolsDatabase for T
|
||||
where
|
||||
T: SyntaxDatabase + EnvDatabase,
|
||||
T: salsa::plumbing::HasQueryGroup<SymbolsDatabaseStorage>
|
||||
{
|
||||
// ...
|
||||
}
|
||||
```
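The effect of moving the bounds from supertraits into a `where` clause can be tried out with plain traits. The following is a sketch that does not use Salsa: the methods `parse` and `env_name`, the `Symbol` fields, and the `Db` type are hypothetical, and the hand-written blanket impl only stands in for what the macro would generate.

```rust
trait SyntaxDatabase {
    fn parse(&self, text: &str) -> usize;
}

trait EnvDatabase {
    fn env_name(&self) -> String;
}

#[derive(Debug, PartialEq)]
struct Symbol {
    name_len: usize,
}

// No supertraits: the required groups are not part of the public interface.
trait SymbolsDatabase {
    fn get_symbol_by_name(&self, name: String) -> Symbol;
}

// The query function names the bounds it needs explicitly. Here it
// deliberately asks for only a subset of the required traits.
fn get_symbol_by_name(
    db: &(impl SymbolsDatabase + SyntaxDatabase),
    name: String,
) -> Symbol {
    Symbol { name_len: db.parse(&name) }
}

// Stand-in for the generated blanket impl: the required traits appear
// only in the `where` clause.
impl<T> SymbolsDatabase for T
where
    T: SyntaxDatabase + EnvDatabase,
{
    fn get_symbol_by_name(&self, name: String) -> Symbol {
        get_symbol_by_name(self, name)
    }
}

struct Db;

impl SyntaxDatabase for Db {
    fn parse(&self, text: &str) -> usize {
        text.len()
    }
}

impl EnvDatabase for Db {
    fn env_name(&self) -> String {
        "test".to_string()
    }
}

fn fuzzy_find_symbol(db: &impl SymbolsDatabase, name: String) -> Symbol {
    // `db.parse(..)` would not compile here: `SyntaxDatabase` is not implied.
    db.get_symbol_by_name(name)
}
```

Because `SyntaxDatabase` and `EnvDatabase` are only bounds on the blanket impl, code that holds a `&impl SymbolsDatabase` sees just the methods of `SymbolsDatabase`.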
|
||||
|
||||
# Alternatives and future work
|
||||
|
||||
The semantics of `requires` closely resemble those of `where`, so we could imagine a
|
||||
syntax based on magical where clauses:
|
||||
|
||||
```rust
|
||||
#[salsa::query_group(SymbolsDatabaseStorage)]
|
||||
pub trait SymbolsDatabase
|
||||
where ???: SyntaxDatabase + EnvDatabase
|
||||
{
|
||||
fn get_symbol_by_name(&self, name: String) -> Symbol;
|
||||
}
|
||||
```
|
||||
|
||||
However, it's not obvious what should stand for `???`. `Self` won't be ideal,
|
||||
because supertraits are sugar for bounds on `Self`, and we deliberately want
|
||||
different semantics. Perhaps picking a magical identifier like `DB` would work
|
||||
though?
|
||||
|
||||
One potential future development here is per-query-function bounds, but they can
|
||||
already be simulated by voluntarily requiring fewer bounds in the implementation
|
||||
function.
|
||||
|
||||
Another direction for future work is privacy: because traits from `requires`
|
||||
clauses are not part of the public interface, in theory it should be possible to
|
||||
restrict their visibility. In practice, this still hits the public-in-private lint,
|
||||
at least with a trivial implementation.
|
||||
|
20
book/src/rfcs/template.md
Normal file
|
@ -0,0 +1,20 @@
|
|||
# Summary
|
||||
|
||||
Summarize the effects of the RFC in bullet point form.
|
||||
|
||||
# Motivation
|
||||
|
||||
Say something about your goals here.
|
||||
|
||||
# User's guide
|
||||
|
||||
Describe effects on end users here.
|
||||
|
||||
# Reference guide
|
||||
|
||||
Describe implementation details or other things here.
|
||||
|
||||
# Alternatives and future work
|
||||
|
||||
Various downsides, rejected approaches, or other considerations.
|
||||
|