Commit graph

21 commits

Author SHA1 Message Date
Gábor Szabó
d958767b13
add repository to Cargo.toml (#358) 2024-05-10 21:14:17 +08:00
Zixuan Chen
0660b1a1be
fix: upgrade wasm-bindgen to fix str free err (#353)
* fix: upgrade wasm-bindgen to fix str free err

* chore: fix ci
2024-05-09 15:22:34 +08:00
Zixuan Chen
4021ad820c
chore: loro-rs v0.5.1 2024-05-06 13:37:55 +08:00
Zixuan Chen
5660bcc4a5
chore: bump loro-rs to v0.5.0 2024-04-29 18:09:07 +08:00
Zixuan Chen
454b4088a6
chore(rs): bump versions of rust crates 2024-04-09 16:23:48 +08:00
Leon Zhao
51890ff8d8
fix: decode iter return result by updating columnar to 0.3.4 (#309) 2024-04-01 17:29:07 +08:00
Zixuan Chen
06e3a5420d
refactor: reduce tracker mem usage by using nonmax id (#282) 2024-02-29 22:55:57 +08:00
Zixuan Chen
f648b353ad
chore: upgrade rust crates 2024-02-16 11:30:56 +08:00
Zixuan Chen
57287fa6d8
chore: add pkg info 2024-02-16 11:25:12 +08:00
Zixuan Chen
0bcc3bd56d
chore: upgrade wasm-bindgen to 0.2.90 (#262) 2024-01-29 22:40:33 +08:00
Zixuan Chen
a6be7d2ea6
refactor: make InternalString an internal struct (#233) 2024-01-02 20:11:22 +08:00
Zixuan Chen
bc27a47531
feat: stabilizing encoding (#219)
This PR implements a new encoding schema that is more extensible and more compact. It is also simpler, adds less to the binary size, and takes less effort to maintain. It is inspired by the [Automerge Encoding Format](https://automerge.org/automerge-binary-format-spec/).

The main motivation is extensibility. When we integrate a new CRDT algorithm, we don’t want to make a breaking change to the encoding or keep multiple versions of the encoding schema in the code, as that would make our WASM size much larger. We need a stable and extensible encoding schema for our v1.0 version.

This PR also exposes the ops that compose the current container state. For example, you can now quickly query which operation inserted a certain character. This behavior is required by the new snapshot encoding, so it’s included in this PR.

# Encoding Schema

## Header

The header has 22 bytes.

- (0-4 bytes) Magic Bytes: The encoding starts with `loro` as magic bytes.
- (4-20 bytes) Checksum: MD5 checksum of the encoded data, covering everything from byte 20 onward (so the encoding method field of the header is included). The checksum is encoded as a 16-byte array. The `magic bytes` and `checksum` fields themselves are excluded when calculating the checksum.
- (20-22 bytes) Encoding Method (2 bytes, big endian): Multiple encoding methods are available for a specific encoding version.
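The layout above can be read back with a few slice operations. The following is a minimal sketch, not the actual Loro decoder: the names `Header`, `HeaderError`, `parse_header`, and `checksummed_region` are hypothetical, and MD5 verification is left out because it needs an external crate.

```rust
/// Total header size: 4 (magic) + 16 (checksum) + 2 (encoding method).
const HEADER_LEN: usize = 22;
const MAGIC: &[u8; 4] = b"loro";

/// Hypothetical parsed form of the 22-byte header.
#[derive(Debug, PartialEq)]
struct Header {
    checksum: [u8; 16],
    encoding_method: u16,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
}

/// Splits `bytes` into the parsed header and the remaining body.
fn parse_header(bytes: &[u8]) -> Result<(Header, &[u8]), HeaderError> {
    if bytes.len() < HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(MAGIC) {
        return Err(HeaderError::BadMagic);
    }
    let mut checksum = [0u8; 16];
    checksum.copy_from_slice(&bytes[4..20]);
    // The encoding method is stored big endian in bytes 20 and 21.
    let encoding_method = u16::from_be_bytes([bytes[20], bytes[21]]);
    Ok((Header { checksum, encoding_method }, &bytes[HEADER_LEN..]))
}

/// The region the checksum is computed over: everything after the
/// magic bytes and the checksum field (assumes `bytes.len() >= 20`).
fn checksummed_region(bytes: &[u8]) -> &[u8] {
    &bytes[20..]
}
```

A real decoder would hash `checksummed_region(bytes)` with MD5 and compare the result against `Header::checksum` before trusting the body.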

## Encode Mode: Updates

This mode encodes only the ops, that is, the history of the document; document state is not included.

Like Automerge's format, we employ columnar encoding for operations and changes.

Previously, operations were ordered by their Operation ID (OpId) before columnar encoding. However, sorting operations by their container first improves compression.
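To illustrate why ordering matters, here is a small sketch under simplified assumptions: the `Op` struct, its fields, and the plain run-length encoder are hypothetical stand-ins, not Loro's actual columnar encoder. When ops from different containers interleave, the container column has many short runs; after sorting by container it collapses into a few long ones.

```rust
/// Hypothetical, simplified op: only the fields needed for the illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Op {
    container: u32, // assumed container index
    counter: u32,   // assumed op counter (stands in for the OpId order)
}

/// Run-length encodes one column as (value, run_length) pairs.
fn rle(column: &[u32]) -> Vec<(u32, usize)> {
    let mut runs: Vec<(u32, usize)> = Vec::new();
    for &v in column {
        if let Some(last) = runs.last_mut() {
            if last.0 == v {
                last.1 += 1;
                continue;
            }
        }
        runs.push((v, 1));
    }
    runs
}

/// Extracts the container column from a list of ops.
fn container_column(ops: &[Op]) -> Vec<u32> {
    ops.iter().map(|op| op.container).collect()
}

/// Stable sort by container, so ops within one container keep their order.
fn sort_by_container(ops: &mut [Op]) {
    ops.sort_by_key(|op| op.container);
}
```

With six ops alternating between two containers, `rle(&container_column(&ops))` yields six runs in counter order and two runs after `sort_by_container`.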

## Encode Mode: Snapshot

This mode simultaneously captures document state and historical data. Upon importing a snapshot into a new document, initialization occurs directly from the snapshot, bypassing the need for CRDT-based recalculations.

Unlike previous snapshot encoding methods, the current binary output in snapshot mode is compatible with the updates mode. This enhances the efficiency of importing snapshots into non-empty documents, where initialization via snapshot is infeasible. 

Additionally, when feasible, we leverage the sequence of operations to construct state snapshots. In CRDTs, deducing the specific ops constituting the current container state is feasible. These ops are tagged in relation to the container, facilitating direct state reconstruction from them. This approach, pioneered by Automerge, significantly improves compression efficiency.
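As a rough illustration of rebuilding state from tagged ops, consider a last-writer-wins map. This is a sketch under assumptions: `SetOp`, `tag_state_ops`, and `state_from_tagged` are hypothetical names, the map semantics are simplified (ties on the Lamport timestamp keep the earlier op, where a real CRDT would break ties by peer ID), and none of it reflects the actual Loro encoding.

```rust
use std::collections::BTreeMap;

/// Hypothetical op on a last-writer-wins map.
#[derive(Debug, Clone, PartialEq)]
struct SetOp {
    key: &'static str,
    value: i64,
    lamport: u32,
}

/// For each op, returns whether it still contributes to the current state.
fn tag_state_ops(ops: &[SetOp]) -> Vec<bool> {
    // Index of the winning op per key.
    let mut winner: BTreeMap<&str, usize> = BTreeMap::new();
    for (i, op) in ops.iter().enumerate() {
        let replace = match winner.get(op.key) {
            Some(&j) => op.lamport > ops[j].lamport,
            None => true,
        };
        if replace {
            winner.insert(op.key, i);
        }
    }
    let mut tags = vec![false; ops.len()];
    for &i in winner.values() {
        tags[i] = true;
    }
    tags
}

/// Rebuilds the map state from the tagged ops alone, without replaying history.
fn state_from_tagged(ops: &[SetOp], tags: &[bool]) -> BTreeMap<&'static str, i64> {
    ops.iter()
        .zip(tags)
        .filter(|(_, t)| **t)
        .map(|(op, _)| (op.key, op.value))
        .collect()
}
```

Because the tagged ops already live in the encoded history, a snapshot in this style only has to store the tags rather than a second copy of the values.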
2024-01-02 17:03:24 +08:00
Zixuan Chen
564dde7703
chore: publish mvp rust api 2023-11-28 21:29:11 +08:00
Zixuan Chen
6ef1e12d71
chore: rm zerovec 2023-11-28 21:01:01 +08:00
leeeon233
c4b753dfd8
chore: add license 2023-11-12 23:23:12 +08:00
Zixuan Chen
a40b5c6e4a
feat: support richtext in wasm & mark text with arbitrary value (#142)
- Support marking text with custom values ([LORO-299], #139)
- Expose richtext in wasm
2023-11-02 14:20:34 +08:00
Zixuan Chen
f208744ec1
feat: encode/decode v2 2023-08-29 15:13:52 +08:00
Zixuan Chen
5ee860b74e
chore: use serde 1 2023-08-04 11:09:32 +08:00
Zixuan Chen
6983a2b00c
refactor: mov loro value to loro_common 2023-07-15 00:47:47 +08:00
Zixuan Chen
fc49b4b3b4
refactor: mov important basic types into loro-common 2023-07-14 16:38:53 +08:00
Zixuan Chen
dde0152912
refactor: prepare for snapshot encoding 2023-07-14 16:05:06 +08:00