Commit graph

10 commits

Author SHA1 Message Date
Zixuan Chen
9ecc0a90b1
refactor!: Add prelim support, making creating sub container easier (#300)
This PR includes a BREAKING CHANGE.

It enables you to create containers before attaching them to the document, making the API more intuitive and straightforward.

A container can be either attached to a document or detached. When it's detached, its history/state does not persist. You can attach a container to a document by inserting it into an attached container. Once a container is attached, its state, along with all of its descendants's states, will be recreated in the document. After attaching, the container and its descendants, will each have their corresponding "attached" version of themselves?

When a detached container x is attached to a document, you can use x.getAttached() to obtain the corresponding attached container.
2024-03-30 11:38:24 +08:00
Zixuan Chen
1233b8df82
refactor: replace debug-log with tracing (#285)
* refactor: replace debug-log with tracing

* chore: fix setDebug in wasm

* chore: rm useless comments
2024-03-04 20:06:24 +08:00
Zixuan Chen
b8eb57f4a5
Refactor ID (#274)
* refactor: add idlp and add lamport info to snapshot enc

* fix: fix warnings

* fix: idlp err due to incorrect merge

* fix: comments

* test: fix fuzz
2024-02-27 23:36:17 +08:00
Zixuan Chen
bd57eb52b1
refactor: replace i32 with i64 (#269) 2024-02-18 17:27:33 +08:00
Leon Zhao
f6cc5da0f1
refactor: Optimizing Encoding Representation for Child Container Creation to Reduce Document Size (#247)
* refactor: encoding container id

* fix: container indexing when merged ops in encoding

* chore: add compress encode size for draw example

* fix: do not need cids in encoding

* chore: change name containerIdx to containerType in encoding
2024-01-22 12:19:09 +08:00
Zixuan Chen
b4701a4de6
refactor: use rich text style config (#246)
* refactor: use rich text style config

* chore: rm log

* feat: support config text style in wasm

* feat: overlapped styles

* chore: add warning style key cannot contain ':'

* test: refine test case for richtext

* test: refine test
2024-01-17 22:55:46 +08:00
Zixuan Chen
5f1353791d
Remove txn abort and reduce mem usage (#240)
* refactor: rm txn.abort and related undo behavior

* perf: simplify richtext state when there is not styles

* perf: reduce text cost when there is no style

* chore: refine logs

* perf: remove cid in states to reduce mem overhead

* refactor: reduce mem overhead by using a compacter mapvalue

* refactor: rm the box inside richtext state
2024-01-08 17:29:11 +08:00
Zixuan Chen
a30abf7af1
chore: upgrade debug-log to fix type issue 2024-01-05 16:07:47 +08:00
Zixuan Chen
aaec64a503
Update debug-log (#236)
* chore: use new version of debug-log

* fix: rm group_end
2024-01-05 12:12:04 +08:00
Zixuan Chen
bc27a47531
feat: stabilizing encoding (#219)
This PR implements a new encode schema that is more extendible and more compact. It’s also simpler and takes less binary size and maintaining effort. It is inspired by the [Automerge Encoding Format](https://automerge.org/automerge-binary-format-spec/).

The main motivation is the extensibility. When we integrate a new CRDT algorithm, we don’t want to make a breaking change to the encoding or keep multiple versions of the encoding schema in the code, as it will make our WASM size much larger. We need a stable and extendible encoding schema for our v1.0 version.

This PR also exposes the ops that compose the current container state. For example, now you can make a query about which operation a certain character quickly. This behavior is required in the new snapshot encoding, so it’s included in this PR.

# Encoding Schema

## Header

The header has 22 bytes.

- (0-4 bytes) Magic Bytes: The encoding starts with `loro` as magic bytes.
- (4-20 bytes) Checksum: MD5 checksum of the encoded data, including the header starting from 20th bytes. The checksum is encoded as a 16-byte array. The `checksum` and `magic bytes` fields are trimmed when calculating the checksum.
- (20-21 bytes) Encoding Method (2 bytes, big endian): Multiple encoding methods are available for a specific encoding version.

## Encode Mode: Updates

In this approach, only ops, specifically their historical record, are encoded, while document states are excluded.

Like Automerge's format, we employ columnar encoding for operations and changes.

Previously, operations were ordered by their Operation ID (OpId) before columnar encoding. However, sorting operations based on their respective containers initially enhance compression potential.

## Encode Mode: Snapshot

This mode simultaneously captures document state and historical data. Upon importing a snapshot into a new document, initialization occurs directly from the snapshot, bypassing the need for CRDT-based recalculations.

Unlike previous snapshot encoding methods, the current binary output in snapshot mode is compatible with the updates mode. This enhances the efficiency of importing snapshots into non-empty documents, where initialization via snapshot is infeasible. 

Additionally, when feasible, we leverage the sequence of operations to construct state snapshots. In CRDTs, deducing the specific ops constituting the current container state is feasible. These ops are tagged in relation to the container, facilitating direct state reconstruction from them. This approach, pioneered by Automerge, significantly improves compression efficiency.
2024-01-02 17:03:24 +08:00