From 0f7a8c33ae78d2380ba8fcde04284d71997ec6d4 Mon Sep 17 00:00:00 2001 From: Sean Chen Date: Mon, 22 Aug 2022 13:03:03 -0500 Subject: [PATCH] Editing overview, tutorial, and reference sections --- book/src/how_salsa_works.md | 8 ++++---- book/src/overview.md | 22 +++++++++++----------- book/src/reference/algorithm.md | 9 ++++++--- book/src/tutorial/accumulators.md | 7 ++++--- book/src/tutorial/db.md | 6 +++--- book/src/tutorial/debug.md | 4 ++++ book/src/tutorial/ir.md | 13 ++++--------- book/src/tutorial/jar.md | 2 +- book/src/tutorial/parser.md | 8 ++++---- book/src/tutorial/structure.md | 4 ++-- 10 files changed, 43 insertions(+), 40 deletions(-) diff --git a/book/src/how_salsa_works.md b/book/src/how_salsa_works.md index 542ad57d..4b3e8efc 100644 --- a/book/src/how_salsa_works.md +++ b/book/src/how_salsa_works.md @@ -1,8 +1,8 @@ -# How Salsa works +# How salsa works ## Video available -To get the most complete introduction to Salsa's inner works, check +To get the most complete introduction to salsa's inner works, check out [the "How Salsa Works" video](https://youtu.be/_muY4HjSqVw). If you'd like a deeper dive, [the "Salsa in more depth" video](https://www.youtube.com/watch?v=i_IhACacPRY) digs into the @@ -25,7 +25,7 @@ varieties: we'll figure out (fairly intelligently) when we can re-use these memoized values and when we have to recompute them. -## How to use Salsa in three easy steps +## How to use salsa in three easy steps Using salsa is as easy as 1, 2, 3... @@ -48,4 +48,4 @@ things work. ## Digging into the plumbing Check out the [plumbing](plumbing.md) chapter to see a deeper explanation of the -code that salsa generates and how it connects to the salsa library. \ No newline at end of file +code that salsa generates and how it connects to the salsa library. 
diff --git a/book/src/overview.md b/book/src/overview.md index 2a7c58da..d04ebae1 100644 --- a/book/src/overview.md +++ b/book/src/overview.md @@ -1,11 +1,11 @@ -# Salsa overview +# Overview of salsa {{#include caveat.md}} This page contains a brief overview of the pieces of a salsa program. For a more detailed look, check out the [tutorial](./tutorial.md), which walks through the creation of an entire project end-to-end. -## Goal of Salsa +## Goal of salsa The goal of salsa is to support efficient **incremental recomputation**. salsa is used in rust-analyzer, for example, to help it recompile your program quickly as you type. @@ -28,9 +28,9 @@ Some time later, you modify the input and invoke your program again. In reality, of course, you can have many inputs and "your program" may be many different methods and functions defined on those inputs. But this picture still conveys a few important concepts: -- Salsa separates out the "incremental computation" (the function `your_program`) from some outer loop that is defining the inputs. -- Salsa gives you the tools to define `your_program`. -- Salsa assumes that `your_program` is a purely deterministic function of its inputs, or else this whole setup makes no sense. +- salsa separates out the "incremental computation" (the function `your_program`) from some outer loop that is defining the inputs. +- salsa gives you the tools to define `your_program`. +- salsa assumes that `your_program` is a purely deterministic function of its inputs, or else this whole setup makes no sense. - The mutation of inputs always happens outside of `your_program`, as part of this master loop. ## Database @@ -41,7 +41,7 @@ The database is also used to implement interning (making a canonical version of ## Inputs -Every Salsa program begins with an **input**. +Every salsa program begins with an **input**. Inputs are special structs that define the starting point of your program. 
Everything else in your program is ultimately a deterministic function of these inputs. @@ -66,9 +66,9 @@ let file: ProgramFile = ProgramFile::new( ); ``` -### Salsa structs are just an integer +### salsa structs are just an integer -The `ProgramFile` struct generates by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id: +The `ProgramFile` struct generated by the `salsa::input` macro doesn't actually store any data. It's just a newtyped integer id: ```rust // Generated by the `#[salsa::input]` macro: @@ -210,7 +210,7 @@ struct Item { } ``` -### Specified the result of tracked functions for particular structs +### Specify the result of tracked functions for particular structs Sometimes it is useful to define a tracked function but specify its value for some particular struct specially. For example, maybe the default way to compute the representation for a function is to read the AST, but you also have some built-in functions in your language and you want to hard-code their results. @@ -235,7 +235,7 @@ fn create_builtin_item(db: &dyn crate::Db) -> Item { } ``` -Specifying is only possible for tracked functions that take a single tracked struct as argument (besides the database). +Specifying is only possible for tracked functions that take a single tracked struct as an argument (besides the database). ## Interned structs @@ -263,7 +263,7 @@ let w2 = Word::new(db, "bar".to_string()); let w3 = Word::new(db, "foo".to_string()); ``` -When you create two interned structs with the same field values, you are guaranted to get back the same integer id. So here, we know that `assert_eq!(w1, w3)` is true and `assert_ne!(w1, w2)`. +When you create two interned structs with the same field values, you are guaranteed to get back the same integer id. So here, we know that `assert_eq!(w1, w3)` is true and `assert_ne!(w1, w2)`. You can access the fields of an interned struct using a getter, like `word.text(db)`. 
These getters respect the `#[return_ref]` annotation. Like tracked structs, the fields of interned structs are immutable. diff --git a/book/src/reference/algorithm.md b/book/src/reference/algorithm.md index 20b6e746..871037ae 100644 --- a/book/src/reference/algorithm.md +++ b/book/src/reference/algorithm.md @@ -20,7 +20,7 @@ fn parse_module(db: &dyn Db, module: Module) -> Ast { Ast::parse_text(module_text) } -#[salsa::tracked(ref)] +#[salsa::tracked(return_ref)] fn module_text(db: &dyn Db, module: Module) -> String { panic!("text for module `{module:?}` not set") } @@ -65,7 +65,11 @@ If the module text is changed, we saw that we have to re-execute `parse_module`, ## Durability: an optimization -As an optimization, salsa includes the concept of **durability**. When you set the value of a tracked function, you can also set it with a given _durability_: +As an optimization, salsa includes the concept of **durability**, which is the notion of how often some piece of tracked data changes. + +For example, when compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. The current workspace could be marked as _low durability_, since changes to it are happening all the time. + +When you set the value of a tracked function, you can also set it with a given _durability_: ```rust module_text::set_with_durability( @@ -78,4 +82,3 @@ module_text::set_with_durability( For each durability, we track the revision in which _some input_ with that durability changed. If a tracked function depends (transitively) only on high durability inputs, and you change a low durability input, then we can very easily determine that the tracked function result is still valid, avoiding the need to traverse the input edges one by one. -An example: if compiling a Rust program, you might mark the inputs from crates.io as _high durability_ inputs, since they are unlikely to change. 
The current workspace could be marked as _low durability_. diff --git a/book/src/tutorial/accumulators.md b/book/src/tutorial/accumulators.md index 319cecac..7352ab6c 100644 --- a/book/src/tutorial/accumulators.md +++ b/book/src/tutorial/accumulators.md @@ -3,9 +3,10 @@ The last interesting case in the parser is how to handle a parse error. Because salsa functions are memoized and may not execute, they should not have side-effects, so we don't just want to call `eprintln!`. -If we did so, the error would only be reported the first time the function was called. +If we did so, the error would only be reported the first time the function was called, but not +on subsequent calls in the situation where the function simply returns its memoized value. -Salsa defines a mechanism for managing this called an **accumulator**. +salsa defines a mechanism for managing this called an **accumulator**. In our case, we define an accumulator struct called `Diagnostics` in the `ir` module: ```rust @@ -15,7 +16,7 @@ In our case, we define an accumulator struct called `Diagnostics` in the `ir` mo Accumulator structs are always newtype structs with a single field, in this case of type `Diagnostic`. Memoized functions can _push_ `Diagnostic` values onto the accumulator. Later, you can invoke a method to find all the values that were pushed by the memoized functions -or any function that it called +or any functions that they called (e.g., we could get the set of `Diagnostic` values produced by the `parse_statements` function). The `Parser::report_error` method contains an example of pushing a diagnostic: diff --git a/book/src/tutorial/db.md b/book/src/tutorial/db.md index cfb46fa6..359d7e99 100644 --- a/book/src/tutorial/db.md +++ b/book/src/tutorial/db.md @@ -14,11 +14,11 @@ In `calc`, the database struct is in the [`db`] module, and it looks like this: ``` The `#[salsa::db(...)]` attribute takes a list of all the jars to include. 
-The struct must have a field named `storage` whose types is `salsa::Storage`, but it can also contain whatever other fields you want. +The struct must have a field named `storage` whose type is `salsa::Storage`, but it can also contain whatever other fields you want. The `storage` struct owns all the data for the jars listed in the `db` attribute. The `salsa::db` attribute autogenerates a bunch of impls for things like the `salsa::HasJar` trait that we saw earlier. -This means that +This means that ... [what goes here?] ## Implementing the `salsa::Database` trait @@ -44,7 +44,7 @@ It's not required, but implementing the `Default` trait is often a convenient wa {{#include ../../../calc-example/calc/src/db.rs:default_impl}} ``` -## Implementing the traits for each Jar +## Implementing the traits for each jar The `Database` struct also needs to implement the [database traits for each jar](./jar.md#database-trait-for-the-jar). In our case, though, we already wrote that impl as a [blanket impl alongside the jar itself](./jar.md#implementing-the-database-trait-for-the-jar), diff --git a/book/src/tutorial/debug.md b/book/src/tutorial/debug.md index d7302c2b..48053103 100644 --- a/book/src/tutorial/debug.md +++ b/book/src/tutorial/debug.md @@ -21,6 +21,9 @@ and get back the output you expect. ## Implementing the `DebugWithDb` trait For now, unfortunately, you have to implement the `DebugWithDb` trait manually, as we do not provide a derive. + +> You can find the tracking issue for addressing this [here][debug_with_db_issue]. + This is tedious but not difficult. Here is an example of implementing the trait for `Expression`: ```rust @@ -33,6 +36,7 @@ Some things to note: - The [`Formatter`] methods (e.g., [`debug_tuple`]) can be used to provide consistent output. - When printing the value of a field, use `.field(&a.debug(db))` for fields that are themselves interned or entities, and use `.field(&a)` for fields that just implement the ordinary `Debug` trait. 
+[debug_with_db_issue]: https://github.com/salsa-rs/salsa/issues/317 [`debug_tuple`]: https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_tuple [`formatter`]: https://doc.rust-lang.org/std/fmt/struct.Formatter.html# diff --git a/book/src/tutorial/ir.md b/book/src/tutorial/ir.md index 98391a73..1520ab6b 100644 --- a/book/src/tutorial/ir.md +++ b/book/src/tutorial/ir.md @@ -4,7 +4,7 @@ Before we can define the [parser](./parser.md), we need to define the intermedia In the [basic structure](./structure.md), we defined some "pseudo-Rust" structures like `Statement` and `Expression`; now we are going to define them for real. -## "Salsa structs" +## "salsa structs" In addition to regular Rust types, we will make use of various **salsa structs**. A salsa struct is a struct that has been annotated with one of the salsa annotations: @@ -95,17 +95,12 @@ Apart from the fields being immutable, the API for working with a tracked struct ## Representing functions We will also use a tracked struct to represent each function: -Next we will define a **tracked struct**. -Whereas inputs represent the *start* of a computation, tracked structs represent intermediate values created during your computation. - The `Function` struct is going to be created by the parser to represent each of the functions defined by the user: ```rust {{#include ../../../calc-example/calc/src/ir.rs:functions}} ``` -Like with an input, the fields of a tracked struct are also stored in the database. -Unlike an input, those fields are immutable (they cannot be "set"), and salsa compares them across revisions to know when they have changed. If we had created some `Function` instance `f`, for example, we might find that `the f.body` field changes because the user changed the definition of `f`. 
This would mean that we have to re-execute those parts of the code that depended on `f.body` @@ -156,11 +151,11 @@ However, if this happens, it will also be sure to re-execute every function that just a different one than they saw in a previous revision. In other words, within a salsa computation, you can assume that interning produces a single consistent integer, and you don't have to think about it. -If however you export interned identifiers outside the computation, and then change the inputs, they may not longer be valid or may refer to different values. +If, however, you export interned identifiers outside the computation, and then change the inputs, they may no longer be valid or may refer to different values. ### Expressions and statements -We'll won't use any special "salsa structs" for expressions and statements: +We won't use any special "salsa structs" for expressions and statements: ```rust {{#include ../../../calc-example/calc/src/ir.rs:statements_and_expressions}} @@ -170,4 +165,4 @@ Since statements and expressions are not tracked, this implies that we are only whenever anything in a function body changes, we consider the entire function body dirty and re-execute anything that depended on it. It usually makes sense to draw some kind of "reasonably coarse" boundary like this. -One downside of the way we have set things up: we inlined the position into each of the structs. +One downside of the way we have set things up: we inlined the position into each of the structs [what exactly does this mean?]. diff --git a/book/src/tutorial/jar.md b/book/src/tutorial/jar.md index c8cbddb0..ebae1042 100644 --- a/book/src/tutorial/jar.md +++ b/book/src/tutorial/jar.md @@ -62,7 +62,7 @@ and that's what we do here: ## Summary -If the concept of a jar seems a bit abstract to you, don't overthink it. The TL;DR is that when you create a salsa program, you need to do: +If the concept of a jar seems a bit abstract to you, don't overthink it. 
The TL;DR is that when you create a salsa program, you need to perform the following steps: - In each of your crates: - Define a `#[salsa::jar(db = Db)]` struct, typically at `crate::Jar`, and list each of your various salsa-annotated things inside of it. diff --git a/book/src/tutorial/parser.md b/book/src/tutorial/parser.md index 301a6daa..954cc543 100644 --- a/book/src/tutorial/parser.md +++ b/book/src/tutorial/parser.md @@ -31,8 +31,8 @@ salsa will just clone the result rather than re-execute it. Tracked functions are the core part of how salsa enables incremental reuse. The goal of the framework is to avoid re-executing tracked functions and instead to clone their result. Salsa uses the [red-green algorithm](../reference/algorithm.md) to decide when to re-execute a function. -The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed -or (b) it directly invokes another tracked function, and that function's return value has changed. +The short version is that a tracked function is re-executed if either (a) it directly reads an input, and that input has changed, +or (b) it directly invokes another tracked function and that function's return value has changed. In the case of `parse_statements`, it directly reads `ProgramSource::text`, so if the text changes, then the parser will re-execute. By choosing which functions to mark as `#[tracked]`, you control how much reuse you get. @@ -45,7 +45,7 @@ and because (b) since strings are just a big blob-o-bytes without any structure, Some systems do choose to do more granular reparsing, often by doing a "first pass" over the string to give it a bit of structure, e.g. to identify the functions, but deferring the parsing of the body of each function until later. -Setting up a scheme like this is relatively easy in salsa, and uses the same principles that we will use later to avoid re-executing the type checker. 
+Setting up a scheme like this is relatively easy in salsa and uses the same principles that we will use later to avoid re-executing the type checker. ### Parameters to a tracked function @@ -63,7 +63,7 @@ It's generally better to structure tracked functions as functions of a single sa ### The `return_ref` annotation -You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`. +You may have noticed that `parse_statements` is tagged with `#[salsa::tracked(return_ref)]`. [The function isn't actually tagged with `return_ref`; is the text incorrect or the code example?] Ordinarily, when you call a tracked function, the result you get back is cloned out of the database. The `return_ref` attribute means that a reference into the database is returned instead. So, when called, `parse_statements` will return an `&Vec` rather than cloning the `Vec`. diff --git a/book/src/tutorial/structure.md b/book/src/tutorial/structure.md index d067b909..9b5ad003 100644 --- a/book/src/tutorial/structure.md +++ b/book/src/tutorial/structure.md @@ -34,7 +34,7 @@ enum Statement { Print(Expression), } - /// Defines `fn () = ` +/// Defines `fn () = ` struct Function { name: FunctionId, args: Vec, @@ -75,7 +75,7 @@ The "checker" has the job of ensuring that the user only references variables th We're going to write the checker in a "context-less" style, which is a bit less intuitive but allows for more incremental re-use. The idea is to compute, for a given expression, which variables it references. -Then there is a function "check" which ensures that those variables are a subset of those that are already defined. +Then there is a function `check` which ensures that those variables are a subset of those that are already defined. ## Interpreter