mirror of
https://github.com/salsa-rs/salsa.git
synced 2025-01-22 12:56:33 +00:00
Editing overview, tutorial, and reference sections
parent
6320991c8b
commit
0f7a8c33ae
10 changed files with 43 additions and 40 deletions
|
@ -1,8 +1,8 @@
|
|||
# How Salsa works
|
||||
# How salsa works
|
||||
|
||||
## Video available
|
||||
|
||||
To get the most complete introduction to Salsa's inner works, check
|
||||
To get the most complete introduction to salsa's inner workings, check
|
||||
out [the "How Salsa Works" video](https://youtu.be/_muY4HjSqVw). If
|
||||
you'd like a deeper dive, [the "Salsa in more depth"
|
||||
video](https://www.youtube.com/watch?v=i_IhACacPRY) digs into the
|
||||
|
@ -25,7 +25,7 @@ varieties:
|
|||
we'll figure out (fairly intelligently) when we can re-use these
|
||||
memoized values and when we have to recompute them.
|
||||
|
||||
## How to use Salsa in three easy steps
|
||||
## How to use salsa in three easy steps
|
||||
|
||||
Using salsa is as easy as 1, 2, 3...
|
||||
|
||||
|
@ -48,4 +48,4 @@ things work.
|
|||
## Digging into the plumbing
|
||||
|
||||
Check out the [plumbing](plumbing.md) chapter to see a deeper explanation of the
|
||||
code that salsa generates and how it connects to the salsa library.
|
||||
code that salsa generates and how it connects to the salsa library.
|
||||
|
|
|
@ -1,11 +1,11 @@
|
|||
# Salsa overview
|
||||
# Overview of salsa
|
||||
|
||||
{{#include caveat.md}}
|
||||
|
||||
This page contains a brief overview of the pieces of a salsa program.
|
||||
For a more detailed look, check out the [tutorial](./tutorial.md), which walks through the creation of an entire project end-to-end.
|
||||
|
||||
## Goal of Salsa
|
||||
## Goal of salsa
|
||||
|
||||
The goal of salsa is to support efficient **incremental recomputation**.
|
||||
salsa is used in rust-analyzer, for example, to help it recompile your program quickly as you type.
|
||||
|
@ -28,9 +28,9 @@ Some time later, you modify the input and invoke your program again.
|
|||
In reality, of course, you can have many inputs and "your program" may be many different methods and functions defined on those inputs.
|
||||
But this picture still conveys a few important concepts:
|
||||
|
||||
- Salsa separates out the "incremental computation" (the function `your_program`) from some outer loop that is defining the inputs.
|
||||
- Salsa gives you the tools to define `your_program`.
|
||||
- Salsa assumes that `your_program` is a purely deterministic function of its inputs, or else this whole setup makes no sense.
|
||||
- salsa separates out the "incremental computation" (the function `your_program`) from some outer loop that is defining the inputs.
|
||||
- salsa gives you the tools to define `your_program`.
|
||||
- salsa assumes that `your_program` is a purely deterministic function of its inputs, or else this whole setup makes no sense.
|
||||
- The mutation of inputs always happens outside of `your_program`, as part of this master loop.
|
||||
|
||||
## Database
|
||||
|
@ -41,7 +41,7 @@ The database is also used to implement interning (making a canonical version of
|
|||
|
||||
## Inputs
|
||||
|
||||
Every Salsa program begins with an **input**.
|
||||
Every salsa program begins with an **input**.
|
||||
Inputs are special structs that define the starting point of your program.
|
||||
Everything else in your program is ultimately a deterministic function of these inputs.
|
||||
|
||||
|
@ -66,9 +66,9 @@ let file: ProgramFile = ProgramFile::new(
|
|||
);
|
||||
```
|
||||
|
||||
### Salsa structs are just an integer
|
||||
### salsa structs are just an integer
|
||||
|
||||
The `ProgramFile` struct generates by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id:
|
||||
The `ProgramFile` struct generated by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id:
|
||||
|
||||
```rust
|
||||
// Generated by the `#[salsa::input]` macro:
|
||||
|
@ -210,7 +210,7 @@ struct Item {
|
|||
}
|
||||
```
|
||||
|
||||
### Specified the result of tracked functions for particular structs
|
||||
### Specify the result of tracked functions for particular structs
|
||||
|
||||
Sometimes it is useful to define a tracked function but specify its value for some particular struct specially.
|
||||
For example, maybe the default way to compute the representation for a function is to read the AST, but you also have some built-in functions in your language and you want to hard-code their results.
|
||||
|
@ -235,7 +235,7 @@ fn create_builtin_item(db: &dyn crate::Db) -> Item {
|
|||
}
|
||||
```
|
||||
|
||||
Specifying is only possible for tracked functions that take a single tracked struct as argument (besides the database).
|
||||
Specifying is only possible for tracked functions that take a single tracked struct as an argument (besides the database).
|
||||
|
||||
## Interned structs
|
||||
|
||||
|
@ -263,7 +263,7 @@ let w2 = Word::new(db, "bar".to_string());
|
|||
let w3 = Word::new(db, "foo".to_string());
|
||||
```
|
||||
|
||||
When you create two interned structs with the same field values, you are guaranted to get back the same integer id. So here, we know that `assert_eq!(w1, w3)` is true and `assert_ne!(w1, w2)`.
|
||||
When you create two interned structs with the same field values, you are guaranteed to get back the same integer id. So here, we know that both `assert_eq!(w1, w3)` and `assert_ne!(w1, w2)` hold.
|
||||
|
||||
You can access the fields of an interned struct using a getter, like `word.text(db)`. These getters respect the `#[return_ref]` annotation. Like tracked structs, the fields of interned structs are immutable.
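
The interning guarantee can be pictured without salsa at all. The following is a minimal sketch using only the standard library. `Interner` and `WordId` are hypothetical names invented for this illustration, not part of salsa's API, and salsa's real implementation differs:

```rust
use std::collections::HashMap;

/// Hypothetical newtyped integer id, standing in for an interned struct.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct WordId(u32);

/// Hypothetical interner: maps each distinct string to one stable id.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, WordId>,
    texts: Vec<String>,
}

impl Interner {
    /// Returns the existing id for `text`, or allocates a fresh one.
    pub fn intern(&mut self, text: &str) -> WordId {
        if let Some(&id) = self.ids.get(text) {
            return id;
        }
        let id = WordId(self.texts.len() as u32);
        self.texts.push(text.to_string());
        self.ids.insert(text.to_string(), id);
        id
    }

    /// Looks up the text behind an id, similar in spirit to a field getter.
    pub fn text(&self, id: WordId) -> &str {
        &self.texts[id.0 as usize]
    }
}
```

Interning `"foo"` twice through the same `Interner` yields equal ids, while `"foo"` and `"bar"` yield different ones, which mirrors the `w1`/`w2`/`w3` example.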
|
||||
|
||||
|
|
|
@ -20,7 +20,7 @@ fn parse_module(db: &dyn Db, module: Module) -> Ast {
|
|||
Ast::parse_text(module_text)
|
||||
}
|
||||
|
||||
#[salsa::tracked(ref)]
|
||||
#[salsa::tracked(return_ref)]
|
||||
fn module_text(db: &dyn Db, module: Module) -> String {
|
||||
panic!("text for module `{module:?}` not set")
|
||||
}
|
||||
|
@ -65,7 +65,11 @@ If the module text is changed, we saw that we have to re-execute `parse_module`,
|
|||
|
||||
## Durability: an optimization
|
||||
|
||||
As an optimization, salsa includes the concept of **durability**. When you set the value of a tracked function, you can also set it with a given _durability_:
|
||||
As an optimization, salsa includes the concept of **durability**, which is the notion of how often some piece of tracked data changes.
|
||||
|
||||
For example, when compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. The current workspace could be marked as _low durability_, since changes to it are happening all the time.
|
||||
|
||||
When you set the value of a tracked function, you can also set it with a given _durability_:
|
||||
|
||||
```rust
|
||||
module_text::set_with_durability(
|
||||
|
@ -78,4 +82,3 @@ module_text::set_with_durability(
|
|||
|
||||
For each durability, we track the revision in which _some input_ with that durability changed. If a tracked function depends (transitively) only on high durability inputs, and you change a low durability input, then we can very easily determine that the tracked function result is still valid, avoiding the need to traverse the input edges one by one.
|
||||
|
||||
An example: if compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. The current workspace could be marked as _low durability_.
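
One way to picture the per-durability bookkeeping is sketched below, using only the standard library. The names `Durability`, `Revisions`, `input_changed`, and `still_valid` are hypothetical and chosen for this illustration; this is an assumption about the general shape of the idea, not salsa's actual code:

```rust
/// Hypothetical durability levels, ordered from least to most durable.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Durability {
    Low = 0,
    High = 1,
}

/// Hypothetical bookkeeping: for each durability, the last revision in
/// which some input with that durability (or a higher one) changed.
#[derive(Debug, Default)]
pub struct Revisions {
    current: u64,
    last_changed: [u64; 2],
}

impl Revisions {
    /// The current revision number.
    pub fn current(&self) -> u64 {
        self.current
    }

    /// Records a change to an input of the given durability.
    pub fn input_changed(&mut self, durability: Durability) {
        self.current += 1;
        // A change at this level can affect any memo whose least durable
        // input sits at this level or below, so stamp all of those levels.
        for level in 0..=(durability as usize) {
            self.last_changed[level] = self.current;
        }
    }

    /// Fast path: a memo verified at `verified_at`, whose least durable
    /// input has durability `min_durability`, is certainly still valid if
    /// nothing at that level has been stamped since.
    pub fn still_valid(&self, verified_at: u64, min_durability: Durability) -> bool {
        self.last_changed[min_durability as usize] <= verified_at
    }
}
```

In this sketch, changing a low durability input leaves `still_valid(.., Durability::High)` true, so a result that depends only on high durability inputs is accepted without walking its input edges.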
|
||||
|
|
|
@ -3,9 +3,10 @@
|
|||
The last interesting case in the parser is how to handle a parse error.
|
||||
Because salsa functions are memoized and may not execute, they should not have side-effects,
|
||||
so we don't just want to call `eprintln!`.
|
||||
If we did so, the error would only be reported the first time the function was called.
|
||||
If we did so, the error would only be reported the first time the function was called, but not
|
||||
on subsequent calls in the situation where the function simply returns its memoized value.
|
||||
|
||||
Salsa defines a mechanism for managing this called an **accumulator**.
|
||||
salsa defines a mechanism for managing this called an **accumulator**.
|
||||
In our case, we define an accumulator struct called `Diagnostics` in the `ir` module:
|
||||
|
||||
```rust
|
||||
|
@ -15,7 +16,7 @@ In our case, we define an accumulator struct called `Diagnostics` in the `ir` mo
|
|||
Accumulator structs are always newtype structs with a single field, in this case of type `Diagnostic`.
|
||||
Memoized functions can _push_ `Diagnostic` values onto the accumulator.
|
||||
Later, you can invoke a method to find all the values that were pushed by the memoized functions
|
||||
or any function that it called
|
||||
or any functions that they called
|
||||
(e.g., we could get the set of `Diagnostic` values produced by the `parse_statements` function).
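
To see why an accumulator behaves better than `eprintln!` here, consider the following minimal sketch, which uses only the standard library. `DiagnosticsDb`, `Memo`, and `parse_number` are hypothetical names for this illustration and do not reflect salsa's API. The idea sketched is that diagnostics are stored next to the memoized value, so they remain available even when the function body does not run again:

```rust
use std::collections::HashMap;

/// Hypothetical memo entry: the cached result plus the diagnostics
/// that were pushed while computing it.
struct Memo {
    value: i64,
    diagnostics: Vec<String>,
}

/// Hypothetical stand-in for a database with one memoized function.
#[derive(Default)]
pub struct DiagnosticsDb {
    memos: HashMap<String, Memo>,
    executions: usize,
}

impl DiagnosticsDb {
    /// Memoized "parse": the body runs only on a cache miss.
    pub fn parse_number(&mut self, text: &str) -> i64 {
        if let Some(memo) = self.memos.get(text) {
            return memo.value;
        }
        self.executions += 1;
        let mut diagnostics = Vec::new();
        let value = match text.trim().parse::<i64>() {
            Ok(n) => n,
            Err(_) => {
                // Push onto the accumulator instead of printing.
                diagnostics.push(format!("`{text}` is not a number"));
                0
            }
        };
        self.memos.insert(text.to_string(), Memo { value, diagnostics });
        value
    }

    /// Returns the diagnostics accumulated by `parse_number(text)`,
    /// whether or not that call was served from the cache.
    pub fn diagnostics(&mut self, text: &str) -> Vec<String> {
        self.parse_number(text);
        self.memos[text].diagnostics.clone()
    }

    /// How many times the body of `parse_number` actually ran.
    pub fn executions(&self) -> usize {
        self.executions
    }
}
```

Calling `parse_number("oops")` twice runs the body once, yet `diagnostics("oops")` still reports the error afterwards, which a bare `eprintln!` could not do.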
|
||||
|
||||
The `Parser::report_error` method contains an example of pushing a diagnostic:
|
||||
|
|
|
@ -14,11 +14,11 @@ In `calc`, the database struct is in the [`db`] module, and it looks like this:
|
|||
```
|
||||
|
||||
The `#[salsa::db(...)]` attribute takes a list of all the jars to include.
|
||||
The struct must have a field named `storage` whose types is `salsa::Storage<Self>`, but it can also contain whatever other fields you want.
|
||||
The struct must have a field named `storage` whose type is `salsa::Storage<Self>`, but it can also contain whatever other fields you want.
|
||||
The `storage` struct owns all the data for the jars listed in the `db` attribute.
|
||||
|
||||
The `salsa::db` attribute autogenerates a bunch of impls for things like the `salsa::HasJar<crate::Jar>` trait that we saw earlier.
|
||||
This means that
|
||||
This means that ... [what goes here?]
|
||||
|
||||
## Implementing the `salsa::Database` trait
|
||||
|
||||
|
@ -44,7 +44,7 @@ It's not required, but implementing the `Default` trait is often a convenient wa
|
|||
{{#include ../../../calc-example/calc/src/db.rs:default_impl}}
|
||||
```
|
||||
|
||||
## Implementing the traits for each Jar
|
||||
## Implementing the traits for each jar
|
||||
|
||||
The `Database` struct also needs to implement the [database traits for each jar](./jar.md#database-trait-for-the-jar).
|
||||
In our case, though, we already wrote that impl as a [blanket impl alongside the jar itself](./jar.md#implementing-the-database-trait-for-the-jar),
|
||||
|
|
|
@ -21,6 +21,9 @@ and get back the output you expect.
|
|||
## Implementing the `DebugWithDb` trait
|
||||
|
||||
For now, unfortunately, you have to implement the `DebugWithDb` trait manually, as we do not provide a derive.
|
||||
|
||||
> You can find the tracking issue for addressing this [here][debug_with_db_issue].
|
||||
|
||||
This is tedious but not difficult. Here is an example of implementing the trait for `Expression`:
|
||||
|
||||
```rust
|
||||
|
@ -33,6 +36,7 @@ Some things to note:
|
|||
- The [`Formatter`] methods (e.g., [`debug_tuple`]) can be used to provide consistent output.
|
||||
- When printing the value of a field, use `.field(&a.debug(db))` for fields that are themselves interned or entities, and use `.field(&a)` for fields that just implement the ordinary `Debug` trait.
|
||||
|
||||
[debug_with_db_issue]: https://github.com/salsa-rs/salsa/issues/317
|
||||
[`debug_tuple`]: https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_tuple
|
||||
[`formatter`]: https://doc.rust-lang.org/std/fmt/struct.Formatter.html#
|
||||
|
||||
|
|
|
@ -4,7 +4,7 @@ Before we can define the [parser](./parser.md), we need to define the intermedia
|
|||
In the [basic structure](./structure.md), we defined some "pseudo-Rust" structures like `Statement` and `Expression`;
|
||||
now we are going to define them for real.
|
||||
|
||||
## "Salsa structs"
|
||||
## "salsa structs"
|
||||
|
||||
In addition to regular Rust types, we will make use of various **salsa structs**.
|
||||
A salsa struct is a struct that has been annotated with one of the salsa annotations:
|
||||
|
@ -95,17 +95,12 @@ Apart from the fields being immutable, the API for working with a tracked struct
|
|||
## Representing functions
|
||||
|
||||
We will also use a tracked struct to represent each function:
|
||||
Next we will define a **tracked struct**.
|
||||
Whereas inputs represent the *start* of a computation, tracked structs represent intermediate values created during your computation.
|
||||
|
||||
The `Function` struct is going to be created by the parser to represent each of the functions defined by the user:
|
||||
|
||||
```rust
|
||||
{{#include ../../../calc-example/calc/src/ir.rs:functions}}
|
||||
```
|
||||
|
||||
Like with an input, the fields of a tracked struct are also stored in the database.
|
||||
Unlike an input, those fields are immutable (they cannot be "set"), and salsa compares them across revisions to know when they have changed.
|
||||
If we had created some `Function` instance `f`, for example, we might find that the `f.body` field changes
|
||||
because the user changed the definition of `f`.
|
||||
This would mean that we have to re-execute those parts of the code that depended on `f.body`
|
||||
|
@ -156,11 +151,11 @@ However, if this happens, it will also be sure to re-execute every function that
|
|||
just a different one than they saw in a previous revision.
|
||||
|
||||
In other words, within a salsa computation, you can assume that interning produces a single consistent integer, and you don't have to think about it.
|
||||
If however you export interned identifiers outside the computation, and then change the inputs, they may not longer be valid or may refer to different values.
|
||||
If, however, you export interned identifiers outside the computation, and then change the inputs, they may no longer be valid or may refer to different values.
|
||||
|
||||
### Expressions and statements
|
||||
|
||||
We'll won't use any special "salsa structs" for expressions and statements:
|
||||
We won't use any special "salsa structs" for expressions and statements:
|
||||
|
||||
```rust
|
||||
{{#include ../../../calc-example/calc/src/ir.rs:statements_and_expressions}}
|
||||
|
@ -170,4 +165,4 @@ Since statements and expressions are not tracked, this implies that we are only
|
|||
whenever anything in a function body changes, we consider the entire function body dirty and re-execute anything that depended on it.
|
||||
It usually makes sense to draw some kind of "reasonably coarse" boundary like this.
|
||||
|
||||
One downside of the way we have set things up: we inlined the position into each of the structs.
|
||||
One downside of the way we have set things up: we inlined the position into each of the structs [what exactly does this mean?].
|
||||
|
|
|
@ -62,7 +62,7 @@ and that's what we do here:
|
|||
|
||||
## Summary
|
||||
|
||||
If the concept of a jar seems a bit abstract to you, don't overthink it. The TL;DR is that when you create a salsa program, you need to do:
|
||||
If the concept of a jar seems a bit abstract to you, don't overthink it. The TL;DR is that when you create a salsa program, you need to perform the following steps:
|
||||
|
||||
- In each of your crates:
|
||||
- Define a `#[salsa::jar(db = Db)]` struct, typically at `crate::Jar`, and list each of your various salsa-annotated things inside of it.
|
||||
|
|
|
@ -31,8 +31,8 @@ salsa will just clone the result rather than re-execute it.
|
|||
Tracked functions are the core part of how salsa enables incremental reuse.
|
||||
The goal of the framework is to avoid re-executing tracked functions and instead to clone their result.
|
||||
Salsa uses the [red-green algorithm](../reference/algorithm.md) to decide when to re-execute a function.
|
||||
The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed
|
||||
or (b) it directly invokes another tracked function, and that function's return value has changed.
|
||||
The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed,
|
||||
or (b) it directly invokes another tracked function and that function's return value has changed.
|
||||
In the case of `parse_statements`, it directly reads `ProgramSource::text`, so if the text changes, then the parser will re-execute.
|
||||
|
||||
By choosing which functions to mark as `#[tracked]`, you control how much reuse you get.
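
Rules (a) and (b) can be sketched with a toy two-function pipeline. This uses only the standard library, and every name in it (`PipelineDb`, `word_count`, `is_long`, the `*_runs` counters) is hypothetical. It is a simplified illustration of the idea, not the red-green algorithm as salsa implements it:

```rust
/// Hypothetical database with one input (`text`) and two memoized functions.
#[derive(Default)]
pub struct PipelineDb {
    revision: u64,
    text: String,
    text_changed_at: u64,
    /// Memo for `word_count`: (value, verified at, value last changed at).
    word_count_memo: Option<(usize, u64, u64)>,
    /// Memo for `is_long`: (value, verified at).
    is_long_memo: Option<(bool, u64)>,
    pub word_count_runs: u32,
    pub is_long_runs: u32,
}

impl PipelineDb {
    /// Setting the input starts a new revision.
    pub fn set_text(&mut self, text: &str) {
        self.revision += 1;
        self.text = text.to_string();
        self.text_changed_at = self.revision;
    }

    /// Rule (a): re-execute only if the input read here has changed.
    /// Returns the value and the revision in which that value last changed.
    fn word_count(&mut self) -> (usize, u64) {
        if let Some((value, verified_at, changed_at)) = self.word_count_memo {
            if self.text_changed_at <= verified_at {
                self.word_count_memo = Some((value, self.revision, changed_at));
                return (value, changed_at);
            }
        }
        self.word_count_runs += 1;
        let value = self.text.split_whitespace().count();
        // If the recomputed value equals the old one, keep the old stamp.
        let changed_at = match self.word_count_memo {
            Some((old, _, old_changed_at)) if old == value => old_changed_at,
            _ => self.revision,
        };
        self.word_count_memo = Some((value, self.revision, changed_at));
        (value, changed_at)
    }

    /// Rule (b): re-execute only if the function called here returned a new value.
    pub fn is_long(&mut self) -> bool {
        let (count, count_changed_at) = self.word_count();
        if let Some((value, verified_at)) = self.is_long_memo {
            if count_changed_at <= verified_at {
                self.is_long_memo = Some((value, self.revision));
                return value;
            }
        }
        self.is_long_runs += 1;
        let value = count > 3;
        self.is_long_memo = Some((value, self.revision));
        value
    }
}
```

In this sketch, changing the text from `"a b"` to `"b a"` re-runs `word_count` (its input changed) but not `is_long`, because the word count it depends on came out the same.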
|
||||
|
@ -45,7 +45,7 @@ and because (b) since strings are just a big blob-o-bytes without any structure,
|
|||
Some systems do choose to do more granular reparsing, often by doing a "first pass" over the string to give it a bit of structure,
|
||||
e.g. to identify the functions,
|
||||
but deferring the parsing of the body of each function until later.
|
||||
Setting up a scheme like this is relatively easy in salsa, and uses the same principles that we will use later to avoid re-executing the type checker.
|
||||
Setting up a scheme like this is relatively easy in salsa and uses the same principles that we will use later to avoid re-executing the type checker.
|
||||
|
||||
### Parameters to a tracked function
|
||||
|
||||
|
@ -63,7 +63,7 @@ It's generally better to structure tracked functions as functions of a single sa
|
|||
|
||||
### The `return_ref` annotation
|
||||
|
||||
You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`.
|
||||
You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`. [The function isn't actually tagged with `return_ref`; is the text incorrect or the code example?]
|
||||
Ordinarily, when you call a tracked function, the result you get back is cloned out of the database.
|
||||
The `return_ref` attribute means that a reference into the database is returned instead.
|
||||
So, when called, `parse_statements` will return an `&Vec<Statement>` rather than cloning the `Vec`.
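
The difference between cloning a result and returning a reference to it can be shown with plain Rust. In this minimal sketch, `Cache`, `statements_cloned`, and `statements_ref` are hypothetical names standing in for the database and the two calling conventions; none of them are salsa APIs:

```rust
/// Hypothetical memo table holding one cached `Vec<String>` result.
pub struct Cache {
    statements: Vec<String>,
}

impl Cache {
    pub fn new(statements: Vec<String>) -> Self {
        Cache { statements }
    }

    /// Default-style behaviour: the caller receives its own copy.
    pub fn statements_cloned(&self) -> Vec<String> {
        self.statements.clone()
    }

    /// `return_ref`-style behaviour: the caller borrows the cached value.
    pub fn statements_ref(&self) -> &Vec<String> {
        &self.statements
    }
}
```

The borrowed form avoids copying the vector on every call, at the cost of tying the result's lifetime to the cache. That trade-off is usually worthwhile for large values such as a `Vec`.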
|
||||
|
|
|
@ -34,7 +34,7 @@ enum Statement {
|
|||
Print(Expression),
|
||||
}
|
||||
|
||||
/// Defines `fn <name>(<args>) = <body>`
|
||||
/// Defines `fn <name>(<args>) = <body>`
|
||||
struct Function {
|
||||
name: FunctionId,
|
||||
args: Vec<VariableId>,
|
||||
|
@ -75,7 +75,7 @@ The "checker" has the job of ensuring that the user only references variables th
|
|||
We're going to write the checker in a "context-less" style,
|
||||
which is a bit less intuitive but allows for more incremental re-use.
|
||||
The idea is to compute, for a given expression, which variables it references.
|
||||
Then there is a function "check" which ensures that those variables are a subset of those that are already defined.
|
||||
Then there is a function `check` which ensures that those variables are a subset of those that are already defined.
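
The "context-less" idea can be sketched in a few lines of standard-library Rust. The `Expression` type below is a simplified, hypothetical stand-in rather than the tutorial's real definition, and `referenced_variables` is a name invented for this illustration:

```rust
use std::collections::HashSet;

/// Hypothetical, simplified expression type for this sketch.
pub enum Expression {
    Number(f64),
    Variable(String),
    Op(Box<Expression>, char, Box<Expression>),
}

/// Computes the set of variables an expression references, without
/// needing to know which variables are currently defined.
pub fn referenced_variables(expr: &Expression) -> HashSet<String> {
    match expr {
        Expression::Number(_) => HashSet::new(),
        Expression::Variable(name) => HashSet::from([name.clone()]),
        Expression::Op(lhs, _, rhs) => {
            let mut vars = referenced_variables(lhs);
            vars.extend(referenced_variables(rhs));
            vars
        }
    }
}

/// Returns the referenced variables that are not yet defined, sorted by
/// name; an empty result means the expression passes the check.
pub fn check(expr: &Expression, defined: &HashSet<String>) -> Vec<String> {
    let mut missing: Vec<String> = referenced_variables(expr)
        .difference(defined)
        .cloned()
        .collect();
    missing.sort();
    missing
}
```

Because `referenced_variables` depends only on the expression itself, its result could be reused even when the set of defined variables changes; only the cheap subset test in `check` would need to run again.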
|
||||
|
||||
## Interpreter
|
||||
|
||||
|
|