375: Editing pass over the Overview, Tutorial, and Reference sections of the book r=nikomatsakis a=seanchen1991

This PR includes a number of proposed changes, mostly fixing typos, a few additions/rearrangements where I thought it made sense, as well as a couple of questions delineated in brackets, to the book in an effort to polish it up a bit. 

There's also a question around whether the book should standardize around 'Salsa' or 'salsa'. 

Co-authored-by: Sean Chen <seanchen11235@gmail.com>
Co-authored-by: Sean Chen <skypemaster007@gmail.com>
This commit is contained in:
bors[bot] 2022-08-25 10:27:21 +00:00 committed by GitHub
commit 7872a53ef2
12 changed files with 107 additions and 112 deletions

View file

@ -2,8 +2,8 @@
## Video available
To get the most complete introduction to Salsa's inner works, check
out [the "How Salsa Works" video](https://youtu.be/_muY4HjSqVw). If
To get the most complete introduction to Salsa's inner workings, check
out [the "How Salsa Works" video](https://youtu.be/_muY4HjSqVw). If
you'd like a deeper dive, [the "Salsa in more depth"
video](https://www.youtube.com/watch?v=i_IhACacPRY) digs into the
details of the incremental algorithm.
@ -13,21 +13,21 @@ details of the incremental algorithm.
## Key idea
The key idea of `salsa` is that you define your program as a set of
**queries**. Every query is used like function `K -> V` that maps from
**queries**. Every query is used like a function `K -> V` that maps from
some key of type `K` to a value of type `V`. Queries come in two basic
varieties:
- **Inputs**: the base inputs to your system. You can change these
whenever you like.
- **Functions**: pure functions (no side effects) that transform your
inputs into other values. The results of queries is memoized to
inputs into other values. The results of queries are memoized to
avoid recomputing them a lot. When you make changes to the inputs,
we'll figure out (fairly intelligently) when we can re-use these
memoized values and when we have to recompute them.
## How to use Salsa in three easy steps
Using salsa is as easy as 1, 2, 3...
Using Salsa is as easy as 1, 2, 3...
1. Define one or more **query groups** that contain the inputs
and queries you will need. We'll start with one such group, but
@ -48,4 +48,4 @@ things work.
## Digging into the plumbing
Check out the [plumbing](plumbing.md) chapter to see a deeper explanation of the
code that salsa generates and how it connects to the salsa library.
code that Salsa generates and how it connects to the Salsa library.

View file

@ -2,15 +2,15 @@
{{#include caveat.md}}
This page contains a brief overview of the pieces of a salsa program.
This page contains a brief overview of the pieces of a Salsa program.
For a more detailed look, check out the [tutorial](./tutorial.md), which walks through the creation of an entire project end-to-end.
## Goal of Salsa
The goal of salsa is to support efficient **incremental recomputation**.
salsa is used in rust-analyzer, for example, to help it recompile your program quickly as you type.
The goal of Salsa is to support efficient **incremental recomputation**.
Salsa is used in rust-analyzer, for example, to help it recompile your program quickly as you type.
The basic idea of a salsa program is like this:
The basic idea of a Salsa program is like this:
```rust
let mut input = ...;
@ -35,9 +35,9 @@ But this picture still conveys a few important concepts:
## Database
Each time you run your program, salsa remembers the values of each computation in a **database**.
Each time you run your program, Salsa remembers the values of each computation in a **database**.
When the inputs change, it consults this database to look for values that can be reused.
The database is also used to implement interning (making a canonical version of a value that can be copied around and cheaply compared for equality) and other convenient salsa features.
The database is also used to implement interning (making a canonical version of a value that can be copied around and cheaply compared for equality) and other convenient Salsa features.
## Inputs
@ -66,9 +66,9 @@ let file: ProgramFile = ProgramFile::new(
);
```
### Salsa structs are just an integer
### Salsa structs are just integers
The `ProgramFile` struct generates by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id:
The `ProgramFile` struct generated by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id:
```rust
// Generated by the `#[salsa::input]` macro:
@ -129,16 +129,16 @@ fn parse_file(db: &dyn crate::Db, file: ProgramFile) -> Ast {
}
```
When you call a tracked function, salsa will track which inputs it accesses (in this example, `file.contents(db)`).
When you call a tracked function, Salsa will track which inputs it accesses (in this example, `file.contents(db)`).
It will also memoize the return value (the `Ast`, in this case).
If you call a tracked function twice, salsa checks if the inputs have changed; if not, it can return the memoized value.
The algorithm salsa uses to decide when a tracked function needs to be re-executed is called the [red-green algorithm](./reference/algorithm.md), and it's where the name salsa comes from.
If you call a tracked function twice, Salsa checks if the inputs have changed; if not, it can return the memoized value.
The algorithm Salsa uses to decide when a tracked function needs to be re-executed is called the [red-green algorithm](./reference/algorithm.md), and it's where the name Salsa comes from.
Tracked functions have to follow a particular structure:
- They must take a `&`-reference to the database as their first argument.
- Note that because this is an `&`-reference, it is not possible to create or modify inputs during a tracked function!
- They must take a "salsa struct" as the second argument -- in our example, this is an input struct, but there are other kinds of salsa structs we'll describe shortly.
- They must take a "Salsa struct" as the second argument -- in our example, this is an input struct, but there are other kinds of Salsa structs we'll describe shortly.
- They _can_ take additional arguments, but it's faster and better if they don't.
Tracked functions can return any clone-able type. A clone is required since, when the value is cached, the result will be cloned out of the database. Tracked functions can also be annotated with `#[return_ref]` if you would prefer to return a reference into the database instead (if `parse_file` were so annotated, then callers would actually get back an `&Ast`, for example).
@ -196,7 +196,7 @@ struct Item {
Maybe our parser first creates an `Item` with the name `foo` and then later a second `Item` with the name `bar`.
Then the user changes the input to reorder the functions.
Although we are still creating the same number of items, we are now creating them in the reverse order, so the naive algorithm will match up the _old_ `foo` struct with the new `bar` struct.
This will look to salsa as though the `foo` function was renamed to `bar` and the `bar` function was renamed to `foo`.
This will look to Salsa as though the `foo` function was renamed to `bar` and the `bar` function was renamed to `foo`.
We'll still get the right result, but we might do more recomputation than we needed to do if we understood that they were just reordered.
To address this, you can tag fields in a tracked struct as `#[id]`. These fields are then used to "match up" struct instances across executions:
@ -210,7 +210,7 @@ struct Item {
}
```
### Specified the result of tracked functions for particular structs
### Specify the result of tracked functions for particular structs
Sometimes it is useful to define a tracked function but specify its value for some particular struct specially.
For example, maybe the default way to compute the representation for a function is to read the AST, but you also have some built-in functions in your language and you want to hard-code their results.
@ -235,11 +235,11 @@ fn create_builtin_item(db: &dyn crate::Db) -> Item {
}
```
Specifying is only possible for tracked functions that take a single tracked struct as argument (besides the database).
Specifying is only possible for tracked functions that take a single tracked struct as an argument (besides the database).
## Interned structs
The final kind of salsa struct are **interned structs**.
The final kind of Salsa struct is the **interned struct**.
Interned structs are useful for quick equality comparison.
They are commonly used to represent strings or other primitive values.
@ -263,13 +263,13 @@ let w2 = Word::new(db, "bar".to_string());
let w3 = Word::new(db, "foo".to_string());
```
When you create two interned structs with the same field values, you are guaranted to get back the same integer id. So here, we know that `assert_eq!(w1, w3)` is true and `assert_ne!(w1, w2)`.
When you create two interned structs with the same field values, you are guaranteed to get back the same integer id. So here, both `assert_eq!(w1, w3)` and `assert_ne!(w1, w2)` hold.
You can access the fields of an interned struct using a getter, like `word.text(db)`. These getters respect the `#[return_ref]` annotation. Like tracked structs, the fields of interned structs are immutable.
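The "same field values, same integer id" guarantee can be illustrated with a small standalone interner. This is only a sketch using the standard library: the `Interner` type and its methods are hypothetical stand-ins for illustration, not what `#[salsa::interned]` generates.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an interned struct: just a newtyped integer id.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Word(u32);

/// Hypothetical interner; Salsa's real database storage is more involved.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, Word>,
    values: Vec<String>,
}

impl Interner {
    /// Returns the existing id for `text`, or allocates the next free one.
    fn intern(&mut self, text: &str) -> Word {
        if let Some(&word) = self.ids.get(text) {
            return word;
        }
        let word = Word(self.values.len() as u32);
        self.values.push(text.to_string());
        self.ids.insert(text.to_string(), word);
        word
    }

    /// Getter, analogous in spirit to `word.text(db)`.
    fn text(&self, word: Word) -> &str {
        &self.values[word.0 as usize]
    }
}
```

Interning `"foo"` twice through the same `Interner` yields equal `Word` values, so comparing two words is a single integer comparison rather than a string comparison.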
## Accumulators
The final salsa concept are **accumulators**. Accumulators are a way to report errors or other "side channel" information that is separate from the main return value of your function.
The final Salsa concept is the **accumulator**. Accumulators are a way to report errors or other "side channel" information that is separate from the main return value of your function.
To create an accumulator, you declare a type as an _accumulator_:

View file

@ -2,19 +2,19 @@
{{#include ../caveat.md}}
This page covers how data is organized in salsa and how links between salsa items (e.g., dependency tracking) works.
This page covers how data is organized in Salsa and how links between Salsa items (e.g., dependency tracking) work.
## Salsa items and ingredients
A **salsa item** is some item annotated with a salsa annotation that can be included in a jar.
For example, a tracked function is a salsa item:
A **Salsa item** is some item annotated with a Salsa annotation that can be included in a jar.
For example, a tracked function is a Salsa item:
```rust
#[salsa::tracked]
fn foo(db: &dyn Db, input: MyInput) { }
```
...and so is a salsa input...
...and so is a Salsa input...
```rust
#[salsa::input]
@ -28,19 +28,19 @@ struct MyInput { }
struct MyStruct { }
```
Each salsa item needs certain bits of data at runtime to operate.
Each Salsa item needs certain bits of data at runtime to operate.
These bits of data are called **ingredients**.
Most salsa items generate a single ingredient, but sometimes they make more than one.
Most Salsa items generate a single ingredient, but sometimes they make more than one.
For example, a tracked function generates a [`FunctionIngredient`].
A tracked struct however generates several ingredients, one for the struct itself (a [`TrackedStructIngredient`],
A tracked struct, however, generates several ingredients: one for the struct itself (a [`TrackedStructIngredient`]),
and one [`FunctionIngredient`] for each value field.
[`FunctionIngredient`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/function.rs#L42
[`TrackedStructIngredient`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/tracked_struct.rs#L18
### Ingredients define the core logic of salsa
### Ingredients define the core logic of Salsa
Most of the interesting salsa code lives in these ingredients.
Most of the interesting Salsa code lives in these ingredients.
For example, when you create a new tracked struct, the method [`TrackedStruct::new_struct`] is invoked;
it is responsible for determining the tracked struct's id.
Similarly, when you call a tracked function, that is translated into a call to [`TrackedFunction::fetch`],
@ -50,13 +50,6 @@ or whether the function must be executed.
[`TrackedStruct::new_struct`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/tracked_struct.rs#L76
[`TrackedFunction::fetch`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/function/fetch.rs#L15
### Ingredient interfaces are not stable or subject to semver
Interfaces are not meant to be directly used by salsa users.
The salsa macros generate code that invokes the ingredients.
The APIs may change in arbitrary ways across salsa versions,
as the macros are kept in sync.
### The `Ingredient` trait
Each ingredient implements the [`Ingredient<DB>`] trait, which defines generic operations supported by any kind of ingredient.
@ -65,12 +58,12 @@ For example, the method `maybe_changed_after` can be used to check whether some
[`Ingredient<DB>`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/ingredient.rs#L15
[`maybe_changed_after`]: https://github.com/salsa-rs/salsa/blob/becaade31e6ebc58cd0505fc1ee4b8df1f39f7de/components/salsa-2022/src/ingredient.rs#L21-L22
We'll see below that each database `DB` is able to take an `IngredientIndex` and use that to get a `&dyn Ingredient<DB>` for the corresponding ingredient.
This allows the database to perform generic operations on a numbered ingredient without knowing exactly what the type of that ingredient is.
We'll see below that each database `DB` is able to take an `IngredientIndex` and use that to get an `&dyn Ingredient<DB>` for the corresponding ingredient.
This allows the database to perform generic operations on an indexed ingredient without knowing exactly what the type of that ingredient is.
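To make the idea of "generic operations on an indexed ingredient" concrete, here is a minimal sketch with a flat list of trait objects. Everything in it is simplified and hypothetical: the trait drops the `DB` parameter, and `InputIngredient`, `ConstantIngredient`, and the toy `Database` are invented for illustration.

```rust
/// Hypothetical, simplified stand-in for the `Ingredient<DB>` trait.
trait Ingredient {
    /// Reports whether this ingredient's data may have changed after `revision`.
    fn maybe_changed_after(&self, revision: u32) -> bool;
}

/// Position of an ingredient in the database's list.
#[derive(Clone, Copy, Debug)]
struct IngredientIndex(usize);

/// Toy ingredient for an input: remembers the revision in which it last changed.
struct InputIngredient {
    changed_at: u32,
}

impl Ingredient for InputIngredient {
    fn maybe_changed_after(&self, revision: u32) -> bool {
        self.changed_at > revision
    }
}

/// Toy ingredient whose data never changes.
struct ConstantIngredient;

impl Ingredient for ConstantIngredient {
    fn maybe_changed_after(&self, _revision: u32) -> bool {
        false
    }
}

/// Toy database: a flat list of type-erased ingredients.
struct Database {
    ingredients: Vec<Box<dyn Ingredient>>,
}

impl Database {
    /// Stores an ingredient and hands back its index.
    fn push(&mut self, ingredient: Box<dyn Ingredient>) -> IngredientIndex {
        self.ingredients.push(ingredient);
        IngredientIndex(self.ingredients.len() - 1)
    }

    /// Looks up an ingredient without knowing its concrete type.
    fn ingredient(&self, index: IngredientIndex) -> &dyn Ingredient {
        &*self.ingredients[index.0]
    }
}
```

The caller of `ingredient` only needs the index; whether the entry is an input, a function, or something else is hidden behind the trait object.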
### Jars are a collection of ingredients
When you declare a salsa jar, you list out each of the salsa items that are included in that jar:
When you declare a Salsa jar, you list out each of the Salsa items that are included in that jar:
```rust,ignore
#[salsa::jar]
@ -91,15 +84,14 @@ struct Jar(
)
```
The `IngredientsFor` trait is used to define the ingredients needed by some salsa item, such as the tracked function `foo`
or the tracked struct `MyInput`.
Each salsa item defines a type `I`, so that `<I as IngredientsFor>::Ingredient` gives the ingredients needed by `I`.
The `IngredientsFor` trait is used to define the ingredients needed by some Salsa item, such as the tracked function `foo` or the tracked struct `MyInput`.
Each Salsa item defines a type `I` so that `<I as IngredientsFor>::Ingredient` gives the ingredients needed by `I`.
### Database is a tuple of jars
### A database is a tuple of jars
Salsa's database storage ultimately boils down to a tuple of jar structs,
Salsa's database storage ultimately boils down to a tuple of jar structs
where each jar struct (as we just saw) itself contains the ingredients
for the salsa items within that jar.
for the Salsa items within that jar.
The database can thus be thought of as a list of ingredients,
although that list is organized into a 2-level hierarchy.
@ -107,9 +99,9 @@ The reason for this 2-level hierarchy is that it permits separate compilation an
The crate that lists the jars doesn't have to know the contents of the jar to embed the jar struct in the database.
And some of the types that appear in the jar may be private to another struct.
### The HasJars trait and the Jars type
### The `HasJars` trait and the `Jars` type
Each salsa database implements the `HasJars` trait,
Each Salsa database implements the `HasJars` trait,
generated by the `salsa::db` procedural macro.
The `HasJars` trait, among other things, defines a `Jars` associated type that maps to a tuple of the jars in the trait.
@ -167,15 +159,20 @@ We can then do things like ask, "did this input change since revision R?" by
* using the ingredient index to find the route and get a `&dyn Ingredient<DB>`
* and then invoking the `maybe_changed_after` method on that trait object.
### HasJarsDyn
### `HasJarsDyn`
There is one catch in the above setup.
We need the database to be dyn-safe, and we also need to be able to define the database trait and so forth without knowing the final database type to enable separate compilation.
The user's code always interacts with a `dyn crate::Db` value, where `crate::Db` is the trait defined by the jar; the `crate::Db` trait extends `salsa::HasJar` which in turn extends `salsa::Database`.
Ideally, we would have `salsa::Database` extend `salsa::HasJars`, which is the main trait that gives access to the jars data.
But we don't want to do that because `HasJars` defines an associated type `Jars`, and that would mean that every reference to `dyn crate::Db` would have to specify the jars type using something like `dyn crate::Db<Jars = J>`.
This would be unergonomic, but what's worse, it would actually be impossible: the final Jars type combines the jars from multiple crates, and so it is not known to any individual jar crate.
To work around this, `salsa::Database` in fact extends *another* trait, `HasJarsDyn`, that doesn't reveal the `Jars` or ingredient types directly, but just has various methods that can be performed on an ingredient, given its `IngredientIndex`.
Traits like `Ingredient<DB>` require knowing the full `DB` type.
If we had one function ingredient directly invoke a method on `Ingredient<DB>`, that would imply that it has to be fully generic and only instantiated at the final crate, when the full database type is available.
We solve this via the `HasJarsDyn` trait. The `HasJarsDyn` trait exports method that combine the "find ingredient, invoking method" steps into one method:
We solve this via the `HasJarsDyn` trait. The `HasJarsDyn` trait exports a method that combines the "find ingredient, invoking method" steps into one method:
[Perhaps this code snippet should only preview the HasJarsDyn method that is being referred to]
```rust,ignore
{{#include ../../../components/salsa-2022/src/storage.rs:HasJarsDyn}}
```
@ -205,13 +202,13 @@ The implementation of this method is defined by the `#[salsa::db]` macro; it sim
{{#include ../../../components/salsa-2022-macros/src/db.rs:create_jars}}
```
This implementation for `create_jar` is geneated by the `#[salsa::jar]` macro, and simply walks over the representative type for each salsa item and ask *it* to create its ingredients
This implementation for `create_jar` is generated by the `#[salsa::jar]` macro, and simply walks over the representative type for each Salsa item and asks *it* to create its ingredients:
```rust,ignore
{{#include ../../../components/salsa-2022-macros/src/jar.rs:create_jar}}
```
The code to create the ingredients for any particular item is generated by their associated macros (e.g., `#[salsa::tracked]`, `#[salsa::input]`), but it always follows a particular structure.
To create an ingredient, we first invoke `Routes::push` which creates the routes to that ingredient and assigns it an `IngredientIndex`.
We can then invoke (e.g.) `FunctionIngredient::new` to create the structure.
To create an ingredient, we first invoke `Routes::push`, which creates the routes to that ingredient and assigns it an `IngredientIndex`.
We can then invoke a function such as `FunctionIngredient::new` to create the structure.
The *routes* to an ingredient are defined as closures that, given the `DB::Jars`, can find the data for a particular ingredient.
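As a rough illustration of routes as closures, the sketch below stores one boxed closure per ingredient and uses the position in that list as the index. The jar and ingredient types, the `debug_name` method, and this `Routes` struct are hypothetical simplifications; the real `Routes::push` in Salsa has a different signature.

```rust
/// Simplified stand-in for the `Ingredient<DB>` trait (hypothetical method).
trait Ingredient {
    fn debug_name(&self) -> &'static str;
}

/// Hypothetical ingredients, standing in for macro-generated ones.
struct FooIngredient;
struct BarIngredient;

impl Ingredient for FooIngredient {
    fn debug_name(&self) -> &'static str {
        "foo"
    }
}

impl Ingredient for BarIngredient {
    fn debug_name(&self) -> &'static str {
        "bar"
    }
}

/// Each jar is a struct of ingredients; `Jars` is the tuple of all jars.
struct JarA(FooIngredient);
struct JarB(BarIngredient);
type Jars = (JarA, JarB);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct IngredientIndex(usize);

/// A route finds one ingredient's data inside the jars.
type Route = Box<dyn Fn(&Jars) -> &dyn Ingredient>;

#[derive(Default)]
struct Routes {
    routes: Vec<Route>,
}

impl Routes {
    /// Registers a route and assigns the ingredient its index.
    fn push<F>(&mut self, route: F) -> IngredientIndex
    where
        F: Fn(&Jars) -> &dyn Ingredient,
        F: 'static,
    {
        self.routes.push(Box::new(route));
        IngredientIndex(self.routes.len() - 1)
    }

    /// Follows the route for `index` to reach the type-erased ingredient.
    fn ingredient<'j>(&self, jars: &'j Jars, index: IngredientIndex) -> &'j dyn Ingredient {
        (self.routes[index.0])(jars)
    }
}
```

A route such as `|jars| &jars.0.0` is registered once when the ingredient is created; afterwards only the `IngredientIndex` needs to be passed around.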

View file

@ -1,11 +1,11 @@
# The "red-green" algorithm
This page explains the basic salsa incremental algorithm.
The algorithm is called the "red-green" algorithm, which is where the name salsa comes from.
This page explains the basic Salsa incremental algorithm.
The algorithm is called the "red-green" algorithm, which is where the name Salsa comes from.
### Database revisions
The salsa database always tracks a single **revision**. Each time you set an input, the revision is incremented. So we start in revision `R1`, but when a `set` method is called, we will go to `R2`, then `R3`, and so on. For each input, we also track the revision in which it was last changed.
The Salsa database always tracks a single **revision**. Each time you set an input, the revision is incremented. So we start in revision `R1`, but when a `set` method is called, we will go to `R2`, then `R3`, and so on. For each input, we also track the revision in which it was last changed.
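The revision bookkeeping can be sketched in plain Rust. This toy `Database` is an assumption made for illustration (two inputs, one memoized function, hypothetical field and method names); it is not Salsa's implementation, but it follows the rule described here: every `set` bumps the revision, and each input remembers the revision in which it last changed.

```rust
/// Memoized result plus the revision in which it was last known to be valid.
struct Memo {
    value: usize,
    verified_at: u64,
}

/// Hypothetical toy database with two inputs and one memoized function.
struct Database {
    current_revision: u64,
    text: String,
    text_changed_at: u64,
    title: String,
    title_changed_at: u64,
    memo: Option<Memo>,
    /// Counts how often `word_count` really executed.
    executions: u32,
}

impl Database {
    fn new(text: &str) -> Self {
        Database {
            current_revision: 1,
            text: text.to_string(),
            text_changed_at: 1,
            title: String::new(),
            title_changed_at: 1,
            memo: None,
            executions: 0,
        }
    }

    /// Setting any input increments the revision and records the change.
    fn set_text(&mut self, text: &str) {
        self.current_revision += 1;
        self.text = text.to_string();
        self.text_changed_at = self.current_revision;
    }

    fn set_title(&mut self, title: &str) {
        self.current_revision += 1;
        self.title = title.to_string();
        self.title_changed_at = self.current_revision;
    }

    /// Memoized function that reads only the `text` input.
    fn word_count(&mut self) -> usize {
        if let Some(memo) = &mut self.memo {
            // Reuse the memo if `text` has not changed since it was verified.
            if self.text_changed_at <= memo.verified_at {
                memo.verified_at = self.current_revision;
                return memo.value;
            }
        }
        self.executions += 1;
        let value = self.text.split_whitespace().count();
        self.memo = Some(Memo { value, verified_at: self.current_revision });
        value
    }
}
```

In this sketch, calling `set_title` moves the database to a new revision but leaves the `word_count` memo reusable, because the function never read `title`; calling `set_text` forces a re-execution.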
### Basic rule: when inputs change, re-execute!
@ -20,7 +20,7 @@ fn parse_module(db: &dyn Db, module: Module) -> Ast {
Ast::parse_text(module_text)
}
#[salsa::tracked(ref)]
#[salsa::tracked(return_ref)]
fn module_text(db: &dyn Db, module: Module) -> String {
panic!("text for module `{module:?}` not set")
}
@ -65,7 +65,11 @@ If the module text is changed, we saw that we have to re-execute `parse_module`,
## Durability: an optimization
As an optimization, salsa includes the concept of **durability**. When you set the value of a tracked function, you can also set it with a given _durability_:
As an optimization, Salsa includes the concept of **durability**, which is the notion of how often some piece of tracked data changes.
For example, when compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. The current workspace could be marked as _low durability_, since changes to it are happening all the time.
When you set the value of a tracked function, you can also set it with a given _durability_:
```rust
module_text::set_with_durability(
@ -78,4 +82,3 @@ module_text::set_with_durability(
For each durability, we track the revision in which _some input_ with that durability changed. If a tracked function depends (transitively) only on high durability inputs, and you change a low durability input, then we can very easily determine that the tracked function result is still valid, avoiding the need to traverse the input edges one by one.
An example: if compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. The current workspace could be marked as _low durability_.
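The durability shortcut can be sketched as a table with one "last changed" revision per durability level. The two-level `Durability` enum, the `Revisions` struct, and the method names below are assumptions for illustration only; Salsa's actual bookkeeping may differ in detail.

```rust
/// Hypothetical two-level durability scale.
#[derive(Clone, Copy, Debug)]
enum Durability {
    Low = 0,
    High = 1,
}

/// Tracks, per durability, the revision in which some input of that
/// durability last changed.
struct Revisions {
    current: u64,
    last_changed: [u64; 2],
}

impl Revisions {
    fn new() -> Self {
        Revisions { current: 1, last_changed: [1, 1] }
    }

    /// Called whenever an input with the given durability is set.
    fn record_input_change(&mut self, durability: Durability) {
        self.current += 1;
        self.last_changed[durability as usize] = self.current;
    }

    /// Cheap check for a memo whose transitive inputs all have at least
    /// `min_durability` and that was last verified in `verified_at`.
    /// `true` means the memo is still valid without walking its input edges;
    /// `false` means the edges have to be checked one by one.
    fn definitely_unchanged(&self, min_durability: Durability, verified_at: u64) -> bool {
        let start = min_durability as usize;
        self.last_changed[start..]
            .iter()
            .all(|&changed| changed <= verified_at)
    }
}
```

With this table, a memo that depends only on `High` inputs stays valid after any number of `Low` changes, and that is decided by a single comparison.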

View file

@ -1,9 +1,10 @@
# Defining the parser: reporting errors
The last interesting case in the parser is how to handle a parse error.
Because salsa functions are memoized and may not execute, they should not have side-effects,
Because Salsa functions are memoized and may not execute, they should not have side-effects,
so we don't just want to call `eprintln!`.
If we did so, the error would only be reported the first time the function was called.
If we did so, the error would only be reported the first time the function was called, but not
on subsequent calls in the situation where the function simply returns its memoized value.
Salsa defines a mechanism for managing this called an **accumulator**.
In our case, we define an accumulator struct called `Diagnostics` in the `ir` module:
@ -15,7 +16,7 @@ In our case, we define an accumulator struct called `Diagnostics` in the `ir` mo
Accumulator structs are always newtype structs with a single field, in this case of type `Diagnostic`.
Memoized functions can _push_ `Diagnostic` values onto the accumulator.
Later, you can invoke a method to find all the values that were pushed by the memoized functions
or any function that it called
or any functions that they called
(e.g., we could get the set of `Diagnostic` values produced by the `parse_statements` function).
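To see why pushed values need to be stored alongside the memoized result, consider the following standalone sketch. The toy `Database`, the `parse_number` function, and `diagnostics_for` are hypothetical and only mimic the idea; Salsa's accumulator API looks different.

```rust
use std::collections::HashMap;

/// Hypothetical diagnostic value.
#[derive(Clone, Debug, PartialEq)]
struct Diagnostic {
    message: String,
}

/// A memoized result together with the diagnostics pushed while computing it.
struct Memo {
    value: Option<i64>,
    diagnostics: Vec<Diagnostic>,
}

/// Toy database: memoizes `parse_number` per input text.
#[derive(Default)]
struct Database {
    memos: HashMap<String, Memo>,
    /// Counts how often the parsing body really executed.
    executions: u32,
}

impl Database {
    /// Memoized parse; errors are accumulated instead of printed.
    fn parse_number(&mut self, text: &str) -> Option<i64> {
        if let Some(memo) = self.memos.get(text) {
            return memo.value;
        }
        self.executions += 1;
        let mut diagnostics = Vec::new();
        let value = match text.trim().parse::<i64>() {
            Ok(number) => Some(number),
            Err(err) => {
                diagnostics.push(Diagnostic {
                    message: format!("cannot parse `{text}`: {err}"),
                });
                None
            }
        };
        self.memos.insert(text.to_string(), Memo { value, diagnostics });
        value
    }

    /// Returns everything `parse_number(text)` pushed, even if the call
    /// was later answered from the memo.
    fn diagnostics_for(&self, text: &str) -> &[Diagnostic] {
        self.memos
            .get(text)
            .map(|memo| memo.diagnostics.as_slice())
            .unwrap_or(&[])
    }
}
```

Because the diagnostics live next to the memoized value, a second `parse_number("oops")` call does not re-run the parser, yet `diagnostics_for("oops")` still reports the error.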
The `Parser::report_error` method contains an example of pushing a diagnostic:

View file

@ -14,11 +14,10 @@ In `calc`, the database struct is in the [`db`] module, and it looks like this:
```
The `#[salsa::db(...)]` attribute takes a list of all the jars to include.
The struct must have a field named `storage` whose types is `salsa::Storage<Self>`, but it can also contain whatever other fields you want.
The struct must have a field named `storage` whose type is `salsa::Storage<Self>`, but it can also contain whatever other fields you want.
The `storage` struct owns all the data for the jars listed in the `db` attribute.
The `salsa::db` attribute autogenerates a bunch of impls for things like the `salsa::HasJar<crate::Jar>` trait that we saw earlier.
This means that
## Implementing the `salsa::Database` trait
@ -44,7 +43,7 @@ It's not required, but implementing the `Default` trait is often a convenient wa
{{#include ../../../calc-example/calc/src/db.rs:default_impl}}
```
## Implementing the traits for each Jar
## Implementing the traits for each jar
The `Database` struct also needs to implement the [database traits for each jar](./jar.md#database-trait-for-the-jar).
In our case, though, we already wrote that impl as a [blanket impl alongside the jar itself](./jar.md#implementing-the-database-trait-for-the-jar),

View file

@ -7,7 +7,7 @@ Before we can do that, though, we have to address one question: how do we inspec
## The `DebugWithDb` trait
Because an interned type like `Expression` just stores an integer, the traditional `Debug` trait is not very useful.
To properly print a `Expression`, you need to access the salsa database to find out what its value is.
To properly print an `Expression`, you need to access the Salsa database to find out what its value is.
To solve this, `salsa` provides a `DebugWithDb` trait that acts like the regular `Debug`, but takes a database as argument.
For types that implement this trait, you can invoke the `debug` method.
This returns a temporary that implements the ordinary `Debug` trait, allowing you to write something like
@ -18,7 +18,7 @@ eprintln!("Expression = {:?}", expr.debug(db));
and get back the output you expect.
The `DebugWithDb` trait is automatically derived for all `#[input]`, `#[interned]` and `#[tracked]` structs.
The `DebugWithDb` trait is automatically derived for all `#[input]`, `#[interned]`, and `#[tracked]` structs.
## Forwarding to the ordinary `Debug` trait

View file

@ -6,17 +6,17 @@ now we are going to define them for real.
## "Salsa structs"
In addition to regular Rust types, we will make use of various **salsa structs**.
A salsa struct is a struct that has been annotated with one of the salsa annotations:
In addition to regular Rust types, we will make use of various **Salsa structs**.
A Salsa struct is a struct that has been annotated with one of the Salsa annotations:
* [`#[salsa::input]`](#input-structs), which designates the "base inputs" to your computation;
* [`#[salsa::tracked]`](#tracked-structs), which designates intermediate values created during your computation;
* [`#[salsa::interned]`](#interned-structs), which designates small values that are easy to compare for equality.
All salsa structs store the actual values of their fields in the salsa database.
All Salsa structs store the actual values of their fields in the Salsa database.
This permits us to track when the values of those fields change to figure out what work will need to be re-executed.
When you annotate a struct with one of the above salsa attributes, salsa actually generates a bunch of code to link that struct into the database.
When you annotate a struct with one of the above Salsa attributes, Salsa actually generates a bunch of code to link that struct into the database.
This code must be connected to some [jar](./jar.md).
By default, this is `crate::Jar`, but you can specify a different jar with the `jar=` attribute (e.g., `#[salsa::input(jar = MyJar)]`).
You must also list the struct in the jar definition itself, or you will get errors.
@ -24,7 +24,7 @@ You must also list the struct in the jar definition itself, or you will get erro
## Input structs
The first thing we will define is our **input**.
Every salsa program has some basic inputs that drive the rest of the computation.
Every Salsa program has some basic inputs that drive the rest of the computation.
The rest of the program must be some deterministic function of those base inputs,
such that when those inputs change, we can try to efficiently recompute the new result of that function.
@ -38,8 +38,8 @@ In our compiler, we have just one simple input, the `SourceProgram`, which has a
### The data lives in the database
Although they are declared like other Rust structs, salsa structs are implemented quite differently.
The values of their fields are stored in the salsa database, and the struct itself just contains a numeric identifier.
Although they are declared like other Rust structs, Salsa structs are implemented quite differently.
The values of their fields are stored in the Salsa database, and the struct itself just contains a numeric identifier.
This means that the struct instances are `Copy` (no matter what fields they contain).
Creating instances of the struct and accessing fields is done by invoking methods like `new` as well as getters and setters.
@ -81,7 +81,7 @@ In this case, the parser is going to take in the `SourceProgram` struct that we
```
Like with an input, the fields of a tracked struct are also stored in the database.
Unlike an input, those fields are immutable (they cannot be "set"), and salsa compares them across revisions to know when they have changed.
Unlike an input, those fields are immutable (they cannot be "set"), and Salsa compares them across revisions to know when they have changed.
In this case, if parsing the input produced the same `Program` result (e.g., because the only change to the input was some trailing whitespace, perhaps),
then subsequent parts of the computation won't need to re-execute.
(We'll revisit the role of tracked structs in reuse more in future parts of the IR.)
@ -95,17 +95,12 @@ Apart from the fields being immutable, the API for working with a tracked struct
## Representing functions
We will also use a tracked struct to represent each function:
Next we will define a **tracked struct**.
Whereas inputs represent the *start* of a computation, tracked structs represent intermediate values created during your computation.
The `Function` struct is going to be created by the parser to represent each of the functions defined by the user:
```rust
{{#include ../../../calc-example/calc/src/ir.rs:functions}}
```
Like with an input, the fields of a tracked struct are also stored in the database.
Unlike an input, those fields are immutable (they cannot be "set"), and salsa compares them across revisions to know when they have changed.
If we had created some `Function` instance `f`, for example, we might find that the `f.body` field changes
because the user changed the definition of `f`.
This would mean that we have to re-execute those parts of the code that depended on `f.body`
@ -126,7 +121,7 @@ For more details, see the [algorithm](../reference/algorithm.md) page of the ref
## Interned structs
The final kind of salsa struct are *interned structs*.
The final kind of Salsa struct is the *interned struct*.
As with input and tracked structs, the data for an interned struct is stored in the database, and you just pass around a single integer.
Unlike those structs, if you intern the same data twice, you get back the **same integer**.
@ -151,16 +146,16 @@ assert_eq!(f1, f2);
### Interned ids are guaranteed to be consistent within a revision, but not across revisions (but you don't have to care)
Interned ids are guaranteed not to change within a single revision, so you can intern things from all over your program and get back consistent results.
When you change the inputs, however, salsa may opt to clear some of the interned values and choose different integers.
When you change the inputs, however, Salsa may opt to clear some of the interned values and choose different integers.
However, if this happens, it will also be sure to re-execute every function that interned that value, so all of them still see a consistent value,
just a different one than they saw in a previous revision.
In other words, within a salsa computation, you can assume that interning produces a single consistent integer, and you don't have to think about it.
If however you export interned identifiers outside the computation, and then change the inputs, they may not longer be valid or may refer to different values.
In other words, within a Salsa computation, you can assume that interning produces a single consistent integer, and you don't have to think about it.
If, however, you export interned identifiers outside the computation, and then change the inputs, they may no longer be valid or may refer to different values.
### Expressions and statements
We'll won't use any special "salsa structs" for expressions and statements:
We won't use any special "Salsa structs" for expressions and statements:
```rust
{{#include ../../../calc-example/calc/src/ir.rs:statements_and_expressions}}
@ -170,4 +165,4 @@ Since statements and expressions are not tracked, this implies that we are only
whenever anything in a function body changes, we consider the entire function body dirty and re-execute anything that depended on it.
It usually makes sense to draw some kind of "reasonably coarse" boundary like this.
One downside of the way we have set things up: we inlined the position into each of the structs.
One downside of the way we have set things up: we inlined the position into each of the structs [what exactly does this mean?].


@ -1,12 +1,12 @@
# Jars and databases
Before we can define the interesting parts of our salsa program, we have to setup a bit of structure that defines the salsa **database**.
The database is a struct that ultimately stores all of salsa's intermediate state, such as the memoized return values from [tracked functions].
Before we can define the interesting parts of our Salsa program, we have to set up a bit of structure that defines the Salsa **database**.
The database is a struct that ultimately stores all of Salsa's intermediate state, such as the memoized return values from [tracked functions].
[tracked functions]: ../overview.md#tracked-functions
The database itself is defined in terms of intermediate structures, called **jars**[^jar], which themselves contain the data for each function.
This setup allows salsa programs to be divided amongst many crates.
This setup allows Salsa programs to be divided amongst many crates.
Typically, you define one jar struct per crate, and then when you construct the final database, you simply list the jar structs.
This permits the crates to define private functions and other things that are members of the jar struct, but not known directly to the database.
@ -23,7 +23,7 @@ To define a jar struct, you create a tuple struct with the `#[salsa::jar]` annot
```
Although it's not required, it's highly recommended to put the `jar` struct at the root of your crate, so that it can be referred to as `crate::Jar`.
All of the other salsa annotations reference a jar struct, and they all default to the path `crate::Jar`.
All of the other Salsa annotations reference a jar struct, and they all default to the path `crate::Jar`.
If you put the jar somewhere else, you will have to override that default.
## Defining the database trait
@ -44,7 +44,7 @@ where `Jar` is the jar struct. If your jar depends on other jars, you can have m
Typically the `Db` trait has no other members or supertraits, but you are also free to add whatever other things you want in the trait.
When you define your final database, it will implement the trait, and you can then define the implementation of those other things.
This allows you to create a way for your jar to request context or other info from the database that is not moderated through salsa,
This allows you to create a way for your jar to request context or other info from the database that is not moderated through Salsa,
should you need that.
## Implementing the database trait for the jar
@ -62,10 +62,10 @@ and that's what we do here:
## Summary
If the concept of a jar seems a bit abstract to you, don't overthink it. The TL;DR is that when you create a salsa program, you need to do:
If the concept of a jar seems a bit abstract to you, don't overthink it. The TL;DR is that when you create a Salsa program, you need to perform the following steps:
- In each of your crates:
- Define a `#[salsa::jar(db = Db)]` struct, typically at `crate::Jar`, and list each of your various salsa-annotated things inside of it.
- Define a `#[salsa::jar(db = Db)]` struct, typically at `crate::Jar`, and list each of your various Salsa-annotated things inside of it.
- Define a `Db` trait, typically at `crate::Db`, that you will use in memoized functions and elsewhere to refer to the database struct.
- Once, typically in your final crate:
- Define a database `D`, as described in the [next section](./db.md), that will contain a list of each of the jars for each of your crates.


@ -8,7 +8,7 @@ and create the `Statement`, `Function`, and `Expression` structures that [we def
To minimize dependencies, we are going to write a [recursive descent parser][rd].
Another option would be to use a [Rust parsing framework](https://rustrepo.com/catalog/rust-parsing_newest_1).
We won't cover the parsing itself in this tutorial -- you can read the code if you want to see how it works.
We're going to focus only on the salsa-related aspects.
We're going to focus only on the Salsa-related aspects.
[rd]: https://en.wikipedia.org/wiki/Recursive_descent_parser
@ -21,18 +21,18 @@ The starting point for the parser is the `parse_statements` function:
```
This function is annotated as `#[salsa::tracked]`.
That means that, when it is called, salsa will track what inputs it reads as well as what value it returns.
That means that, when it is called, Salsa will track what inputs it reads as well as what value it returns.
The return value is *memoized*,
which means that if you call this function again without changing the inputs,
salsa will just clone the result rather than re-execute it.
Salsa will just clone the result rather than re-execute it.
### Tracked functions are the unit of reuse
Tracked functions are the core part of how salsa enables incremental reuse.
Tracked functions are the core part of how Salsa enables incremental reuse.
The goal of the framework is to avoid re-executing tracked functions and instead to clone their result.
Salsa uses the [red-green algorithm](../reference/algorithm.md) to decide when to re-execute a function.
The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed
or (b) it directly invokes another tracked function, and that function's return value has changed.
The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed,
or (b) it directly invokes another tracked function and that function's return value has changed.
In the case of `parse_statements`, it directly reads `ProgramSource::text`, so if the text changes, then the parser will re-execute.
By choosing which functions to mark as `#[tracked]`, you control how much reuse you get.
@ -45,25 +45,25 @@ and because (b) since strings are just a big blob-o-bytes without any structure,
Some systems do choose to do more granular reparsing, often by doing a "first pass" over the string to give it a bit of structure,
e.g. to identify the functions,
but deferring the parsing of the body of each function until later.
Setting up a scheme like this is relatively easy in salsa, and uses the same principles that we will use later to avoid re-executing the type checker.
Setting up a scheme like this is relatively easy in Salsa and uses the same principles that we will use later to avoid re-executing the type checker.
### Parameters to a tracked function
The **first** parameter to a tracked function is **always** the database, `db: &dyn crate::Db`.
It must be a `dyn` value of whatever database is associated with the jar.
The **second** parameter to a tracked function is **always** some kind of salsa struct.
The **second** parameter to a tracked function is **always** some kind of Salsa struct.
The first parameter to a memoized function is always the database,
which should be a `dyn Trait` value for the database trait associated with the jar
(the default jar is `crate::Jar`).
Tracked functions may take other arguments as well, though our examples here do not.
Functions that take additional arguments are less efficient and flexible.
It's generally better to structure tracked functions as functions of a single salsa struct if possible.
It's generally better to structure tracked functions as functions of a single Salsa struct if possible.
### The `return_ref` annotation
You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`.
You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`.
Ordinarily, when you call a tracked function, the result you get back is cloned out of the database.
The `return_ref` attribute means that a reference into the database is returned instead.
So, when called, `parse_statements` will return an `&Vec<Statement>` rather than cloning the `Vec`.


@ -1,7 +1,7 @@
# Basic structure
Before we do anything with salsa, let's talk about the basic structure of the calc compiler.
Part of salsa's design is that you are able to write programs that feel 'pretty close' to what a natural Rust program looks like.
Before we do anything with Salsa, let's talk about the basic structure of the calc compiler.
Part of Salsa's design is that you are able to write programs that feel 'pretty close' to what a natural Rust program looks like.
## Example program
@ -34,7 +34,7 @@ enum Statement {
Print(Expression),
}
/// Defines `fn <name>(<args>) = <body>`
/// Defines `fn <name>(<args>) = <body>`
struct Function {
name: FunctionId,
args: Vec<VariableId>,
@ -75,7 +75,7 @@ The "checker" has the job of ensuring that the user only references variables th
We're going to write the checker in a "context-less" style,
which is a bit less intuitive but allows for more incremental re-use.
The idea is to compute, for a given expression, which variables it references.
Then there is a function "check" which ensures that those variables are a subset of those that are already defined.
Then there is a function `check` which ensures that those variables are a subset of those that are already defined.
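The two steps described here can be sketched in plain Rust, without any Salsa annotations. The `Expr` type below is a simplified, hypothetical stand-in for the tutorial's expression type, and `referenced_variables` is an assumed helper name; only the idea (compute the referenced variables, then test for a subset) comes from the text.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the tutorial's expression type (hypothetical shape).
enum Expr {
    Number(f64),
    Variable(String),
    Op(Box<Expr>, char, Box<Expr>),
}

/// Step 1: compute which variables an expression references, with no
/// knowledge of the surrounding context.
fn referenced_variables(expr: &Expr) -> BTreeSet<String> {
    match expr {
        Expr::Number(_) => BTreeSet::new(),
        Expr::Variable(name) => BTreeSet::from([name.clone()]),
        Expr::Op(lhs, _, rhs) => {
            let mut vars = referenced_variables(lhs);
            vars.extend(referenced_variables(rhs));
            vars
        }
    }
}

/// Step 2: verify that the referenced variables are a subset of the defined
/// ones, returning the undefined names otherwise.
fn check(expr: &Expr, defined: &BTreeSet<String>) -> Result<(), Vec<String>> {
    let undefined: Vec<String> = referenced_variables(expr)
        .difference(defined)
        .cloned()
        .collect();
    if undefined.is_empty() {
        Ok(())
    } else {
        Err(undefined)
    }
}
```

Because `referenced_variables` depends only on the expression itself, its result stays valid when the set of definitions changes; only the inexpensive subset test in `check` has to run again.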
## Interpreter


@ -4,7 +4,7 @@ There are currently two videos about Salsa available, but they describe an older
- [How Salsa Works](https://youtu.be/_muY4HjSqVw), which gives a
high-level introduction to the key concepts involved and shows how
to use salsa;
to use Salsa;
- [Salsa In More Depth](https://www.youtube.com/watch?v=i_IhACacPRY),
which digs into the incremental algorithm and explains -- at a
high-level -- how Salsa is implemented.