Commit graph

58 commits

Author SHA1 Message Date
Leon Zhao
db2353192a feat: export diff batch to ffi 2025-01-15 15:05:13 +08:00
Zixuan Chen
5688a017d6
Fix warnings (#484)
* fix: warnings

* fix: warnings
2024-09-29 21:15:19 +08:00
Zixuan Chen
bef39ce6b5
Feat: allow editing on detached mode (#473)
* refactor: add detached editing config and prepare the architecture for editing detached doc

* feat: subscribe for peer id change

* fix: undo after checkout & add tests for detached editing

* test: add fuzzer for detached editing

* feat: expose detached editing configure to wasm

* test: add wasm test for detached editing
2024-09-24 11:16:59 +08:00
Zixuan Chen
d460346585
feat: jsonpath experimental support (#466)
* feat: jsonpath experimental support

* fix: add support for negative index unionindex and unionkey

* chore: export lorodoc in loro-js and fix a few tests related to map's entries order

* chore: fix type err
2024-09-19 19:22:39 +08:00
Zixuan Chen
a9d4de7a18
Test: Add fuzzing tests for gc mode correctly & fix several failed cases (#461)
* test: fuzz gc correctly

* fix: lots of gc snapshot issues

* fix: vv to frontiers

* test: add an arbtest for gc fuzzing tests

* test: fix a few test issues

* fix: apply diff of a dangling container that was deleted before trimmed version
2024-09-12 20:07:57 +08:00
Zixuan Chen
21e3ffea45
perf: refine state fast snapshot & fix a few tree event issues (#459)
* perf: refine state fast snapshot

* fix: tree apply diff err

* fix: get child index

* fix: use better tree event
2024-09-11 22:54:14 +08:00
Zixuan Chen
68ffd31ba0
Feat when exporting gc snapshot with short history, don't encode the latest state (#445)
* feat: export gc snapshot without latest state when ops len is small

* test: add a test for time tracker usage

* chore: add gc snapshot bench

* chore: simplify

* test: record timestamp for time tracker
2024-09-04 21:00:17 +08:00
Zixuan Chen
4c325ab87c
Merge branch 'dev' into feat-gc 2024-09-03 21:21:14 +08:00
Zixuan Chen
08d53cae93
feat: subscribe for local updates (#444) 2024-09-03 14:04:43 +08:00
Leon Zhao
c1620fdb37
feat: memkv export import all (#422)
* feat: sstable

* fix: add magic number version

* feat: new mem kv store based sstable

* feat: binary_search

* fix: sstable iter scan

* fix: new mem kv

* feat: add cache for sstable

* fix: encode schema comment

* fix: sstable iter scan

* chore: clean

* fix: export all

* fix: sstable scan bound

* fix: sstable iter scan next==prev

* fix: merge iter next_back

* fix: mem kv export

* chore: clean

* fix: prev to key

* fix: prev find block

* fix: get prev block idx

* refactor: kv store

* fix: checksum when import

* fix: meta first last key

* Revert "fix: meta first last key"

This reverts commit a069c1ed37.

* fix: skip empty iter

* fix: remove key from large block

* chore: comment

* feat: compress block

* fix: remove key in large block

* chore: const

* doc: intro sstable encode

* test: add kv store fuzz

* style: format file

* feat: add fuzz to kv store (#428)

* fix: kv fuzzer

* fix: debug

* bk

* fix: block iter next back

* fix: block prev iter left = next idx

* feat: move kv store a crate

* fix: remove value len from normal block

* doc: sstable format

* test: add more test

* test: add test

* feat: new merge iter

* chore: revert

* fix: rename next back

* fix: rename mem sstable

* fix: rename to mem

* fix: use Bytes as key

* fix: use simple merge iter

* feat: compress option

* fix: remove empty iter

* style: refine some impl details

* fix: large block compress

* feat: use write read for encode

* doc: refine doc

* fix: simplify the first chunk

* feat: import many times

* refactor: refine styles

* test: fuzz merge iter

* fix: rename peek_xxx()

* fix: better sstable iter inner

* fix: use mem kv store

* perf: mem kv store

* perf: export mem kv

* chore: clean

---------

Co-authored-by: Zixuan Chen <remch183@outlook.com>
2024-08-30 11:44:34 +08:00
Zixuan Chen
e084eca580
Merge branch 'main' into dev 2024-08-25 21:12:07 +08:00
Zixuan Chen
1812caea65
Refactor: use kv internally for docstate (#426)
* refactor: use kv in state

* refactor: do not load the state into the inner fxhashmap if not needed

* refactor: calc offset without unsafe code

* style: replace unsafe code
2024-08-24 14:16:06 +08:00
Zixuan Chen
ea5f91f6a6
chore: fix typos 2024-08-19 11:36:59 +08:00
东灯
8f3234a7fe
chore: add test tools (#410) 2024-07-25 19:14:02 +08:00
Zixuan Chen
a276010128
feat: replace states with container store 2024-06-20 13:17:32 +08:00
Zixuan Chen
9c050b8b0b
refactor: fix name err & add counter state fast snapshot 2024-06-18 18:17:46 +08:00
Leon Zhao
fffd49b5fa
Use fractional index to order the children of the tree (#298)
* feat: fractional index

---------

Co-authored-by: Zixuan Chen <remch183@outlook.com>
2024-05-07 14:01:13 +08:00
Zixuan Chen
31a8569840
Movable List (#293)
* bk: add move op content

* bk: add inner_movable_list diff and related stuff

* perf: high perf state

* fix: update old list item cache

* fix: should use id in del

* feat: two kinds of len for movable list state

* bk: add op index to movable list

* bk: make basic handler test pass

* refactor: add move_from to list event

* fix: make all existing tests pass

* bk: list move event hint into event

* bk: convert inner event into user event

Co-authored-by: Leon <leeeon233@gmail.com>

* fix: convert issue when inserting new value

* feat: add op group for movable list

* feat: diff calc

* feat: add mov support to tracker

* fix: when applying diff, state should be force update

* feat: encoded op

* feat: snapshot encode

* fix: pass basic sync

* fix: snapshot encode/decode

* fix: warnings

* feat: expose mov list to loro crate

* test: fuzz movable list

* test: fix fuzz integration

* fix: movable list basic move sync

* fix: movable list events

* fix: movable event err

* fix: register child container on movable list

* fix: should not return child index if the value is already overwritten

* fix: local event err in movable list

* fix: get elem at pos

* refactor: extract mut op that could break invariants

* fix: event err

* fix: child container to elem err

* fix: bringback event issue

* fix: event err

* fix: event emit

* fix: id to cursor iter issue

* chore: fix a few warnings

* fix: warnings

* fix: fix move in tracker

* test: add consistency check

* test: fix tracker

* refactor: simplify event conversion in docstate

* refactor: refine move event

* refactor: simplify the maintain of parent child links

* fix: revive err

* fix: warnings

* fix: it's possible that pos change but cannot find the respective list item

* fix: elem may be dropped after snapshot

* fix: warnings

* fix: richtext time travel issue

* fix: move op used wrong delete id on tracker

* fix: handle events created by concurrent moves correctly

* fix: event hint error, used op index for list event

* fix: move_from flag err

* fix: id to cursor get err

* test: add mov fuzz target

* fix: the pos of inserting new container

* fix: used wrong event hint index

* fix: del event hint

* fix: warnings

* fix: internal diff to event err

* fix: event's move flag error
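This "move" flag does not actually mean that the insertion
is caused by the move op.
Even when the insertion is caused by a move, the flag is not necessarily true.
It is true only if the downstream can actually find the element in the previous version of the array.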

* fix: remove redundant elements from the movable list

The Movable List is currently flawed; an element may not exist on the movable list state, yet there are operations that revive its corresponding list item. In such cases, the diff calculation does not send back the corresponding element state (this occurs when tracing back, which fuzz testing currently does not cover. It might only be exposed by randomly switching to a version and then checking for consistency; otherwise, as long as all elements are in memory, this problem does not arise).

Moreover, there is no need to store elements in the state that do not have a corresponding list item. They will be deleted during the Snapshot, and relying on "them still being in the state" is incorrect behavior. Such adjustments also eliminate the need to maintain the `pending_elements` field.

By allowing the opgroup to record the mapping from pos id to state id, we can ensure that the events sent to the movable list state will include the corresponding state.

* test: make fuzzer stricter

* test: test expectation error

* refactor: rename stable pos to cursor

* tests: chore list bench init

* test: add bench

* bench: add mov & set bench

* feat(wasm): movable list js api

* fix: make movablelist able to attach even if it's already attached & refine the type of subscribe

* fix: remove the loro doc param in .unsub

* refactor: refine ts types and export setContainer api

* chore: fix warnings

* chore: rm debug logs

* perf: reduce mem usage of opgroup

* bench: add list criterion bench

---------

Co-authored-by: Leon <leeeon233@gmail.com>
2024-04-26 12:08:53 +08:00
Zixuan Chen
edb0ef75f6
chore: Update VSCode settings and dependencies (#299) 2024-03-30 06:53:22 +08:00
Leon Zhao
a47cf06712
Refactor fuzzing test (#271)
* feat: new fuzz test

* test: add arbtest

* fix: remove PROPTEST_FACTOR
2024-03-08 16:40:06 +08:00
Zixuan Chen
08847d6639
refactor: rename client_id in idspan to peer (#287)
* refactor: rename client_id in idspan to peer

* fix: type err
2024-03-02 19:10:33 +08:00
Leon Zhao
dcbdd55195
feat: remove deleted set in tree state and optimize api (#259)
Co-authored-by: Zixuan Chen <me@zxch3n.com>
2024-01-30 09:54:54 +08:00
Zixuan Chen
9e57ccbc00
Fix avoid rich text apply diff err when time travel (#256)
* fix: avoid enter invalid richtext state

* fix: only include the style when the doc contains both style start and style end

* fix: iter_range err in richtext state

* fix: richtext state iter range

* fix: iter range err

* fix: iter range

* chore: rm log

* fix: iter range

* fix: get affected range

* fix: return err if given checkout target is invalid
2024-01-21 19:51:27 +08:00
Zixuan Chen
1295ac6d61
(wasm) Extract VersionVector class and fix inconsistent PeerID repr (#249)
* refactor(wasm): extract VersionVector class and fix inconsistent PeerID in wasm

* fix: example type err

* fix: binding err

* fix: peer id repr should be consistent

* test: update tests
2024-01-18 13:28:28 +08:00
Leon Zhao
692c5e3436
feat: group ops (#243) 2024-01-12 16:47:44 +08:00
Zixuan Chen
b8cf4dc4c3
Refine the new encoding schema (#244)
* perf: refine the new encoding schema

* chore: rm auto derived fromprimitive and toprimitive from encode mode
2024-01-11 22:49:18 +08:00
Zixuan Chen
bc27a47531
feat: stabilizing encoding (#219)
This PR implements a new encoding schema that is more extensible and more compact. It is also simpler, adds less to the binary size, and takes less maintenance effort. It is inspired by the [Automerge Encoding Format](https://automerge.org/automerge-binary-format-spec/).

The main motivation is extensibility. When we integrate a new CRDT algorithm, we don’t want to make a breaking change to the encoding or keep multiple versions of the encoding schema in the code, as that would make our WASM size much larger. We need a stable and extensible encoding schema for our v1.0 version.

This PR also exposes the ops that compose the current container state. For example, you can now quickly query which operation a certain character comes from. This behavior is required in the new snapshot encoding, so it’s included in this PR.

# Encoding Schema

## Header

The header has 22 bytes.

- (bytes 0-3) Magic Bytes: The encoding starts with `loro` as magic bytes.
- (bytes 4-19) Checksum: MD5 checksum of the encoded data, covering everything from byte 20 onward (the rest of the header and the body). The checksum is stored as a 16-byte array. The `magic bytes` and `checksum` fields themselves are excluded when calculating the checksum.
- (bytes 20-21) Encoding Method (2 bytes, big endian): Multiple encoding methods are available for a specific encoding version.
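
The header layout above can be sketched as a small parser. This is only an illustration, assuming the magic occupies the first 4 bytes, the checksum the next 16, and the encoding method the last 2. `Header` and `parse_header` are hypothetical names, not Loro's API, and checksum verification is omitted because MD5 is not part of the Rust standard library.

```rust
/// Hypothetical view of the 22-byte header (not Loro's actual API).
#[derive(Debug, PartialEq)]
struct Header {
    /// The 16-byte MD5 checksum field, copied verbatim (not verified here).
    checksum: [u8; 16],
    /// The encoding method, read as a big-endian u16.
    encode_mode: u16,
}

const MAGIC: &[u8; 4] = b"loro";
const HEADER_LEN: usize = 22;

/// Splits `bytes` into the header fields and the remaining body.
fn parse_header(bytes: &[u8]) -> Result<(Header, &[u8]), &'static str> {
    if bytes.len() < HEADER_LEN {
        return Err("input shorter than header");
    }
    if !bytes.starts_with(MAGIC) {
        return Err("bad magic bytes");
    }
    let mut checksum = [0u8; 16];
    checksum.copy_from_slice(&bytes[4..20]);
    let encode_mode = u16::from_be_bytes([bytes[20], bytes[21]]);
    Ok((Header { checksum, encode_mode }, &bytes[HEADER_LEN..]))
}
```

A real decoder would additionally recompute the MD5 over the bytes from offset 20 onward and compare it with `checksum` before trusting the body.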

## Encode Mode: Updates

In this approach, only ops, specifically their historical record, are encoded, while document states are excluded.

Like Automerge's format, we employ columnar encoding for operations and changes.

Previously, operations were ordered by their Operation ID (OpId) before columnar encoding. However, sorting operations by their containers first improves the compression potential.
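
One way to see why grouping by container helps: after sorting, the container column consists of long runs of equal values, which run-length encode into fewer entries. The sketch below is a simplified illustration only. The `Op` struct, its fields, and the plain run-length encoder are assumptions for this example and are not the actual encoding used by this PR.

```rust
/// Hypothetical, simplified op with only the fields needed for this illustration.
#[derive(Clone, Debug, PartialEq)]
struct Op {
    container: u32, // index of the container the op targets
    counter: u32,   // position of the op in its peer's history
}

/// Run-length encodes a column as (value, run length) pairs.
fn rle(column: &[u32]) -> Vec<(u32, usize)> {
    let mut runs: Vec<(u32, usize)> = Vec::new();
    for &v in column {
        if let Some(last) = runs.last_mut() {
            if last.0 == v {
                last.1 += 1;
                continue;
            }
        }
        runs.push((v, 1));
    }
    runs
}

/// Extracts the container column, optionally sorting ops by container first.
fn container_column(ops: &[Op], sort_by_container: bool) -> Vec<u32> {
    let mut ops = ops.to_vec();
    if sort_by_container {
        // Stable sort: ops within one container keep their relative order.
        ops.sort_by_key(|op| op.container);
    }
    ops.iter().map(|op| op.container).collect()
}
```

With ops that alternate between two containers, the unsorted column yields one run per op, while the sorted column collapses into two runs.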

## Encode Mode: Snapshot

This mode simultaneously captures document state and historical data. Upon importing a snapshot into a new document, initialization occurs directly from the snapshot, bypassing the need for CRDT-based recalculations.

Unlike previous snapshot encoding methods, the current binary output in snapshot mode is compatible with the updates mode. This enhances the efficiency of importing snapshots into non-empty documents, where initialization via snapshot is infeasible. 

Additionally, when feasible, we leverage the sequence of operations to construct state snapshots. In CRDTs, deducing the specific ops constituting the current container state is feasible. These ops are tagged in relation to the container, facilitating direct state reconstruction from them. This approach, pioneered by Automerge, significantly improves compression efficiency.
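
As a rough illustration of building state from tagged ops, consider a last-writer-wins map. This is a deliberate simplification and not how this PR implements it: `MapSet`, `state_op_indices`, and `rebuild_state` are hypothetical names, and Lamport timestamps are assumed to be tie-free. For each key, the op with the largest timestamp is the one that constitutes the current state, so tagging those ops is enough to rebuild the state without replaying the whole history.

```rust
use std::collections::BTreeMap;

/// Hypothetical "set" op on a last-writer-wins map (illustration only).
#[derive(Clone, Debug)]
struct MapSet {
    key: &'static str,
    value: i64,
    lamport: u32, // assumed tie-free logical timestamp; larger wins
}

/// Returns the indices of the ops that make up the current map state:
/// for each key, the op with the largest Lamport timestamp.
fn state_op_indices(ops: &[MapSet]) -> Vec<usize> {
    let mut winners: BTreeMap<&str, usize> = BTreeMap::new();
    for (i, op) in ops.iter().enumerate() {
        let replace = match winners.get(op.key) {
            Some(&j) => ops[j].lamport < op.lamport,
            None => true,
        };
        if replace {
            winners.insert(op.key, i);
        }
    }
    let mut indices: Vec<usize> = winners.into_values().collect();
    indices.sort_unstable();
    indices
}

/// Rebuilds the map state directly from the tagged ops.
fn rebuild_state(ops: &[MapSet], tagged: &[usize]) -> BTreeMap<&'static str, i64> {
    tagged.iter().map(|&i| (ops[i].key, ops[i].value)).collect()
}
```

Because the tagged ops are already part of the encoded history, the snapshot only needs to record which ops are tagged instead of storing the values a second time.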
2024-01-02 17:03:24 +08:00
Leon zhao
acafc76aff
Feat: diff calc bring back & tree new event and value (#149)
* feat: new tree state

* fix: emit meta event

* fix: semantic tree event

* fix: diff calc bring_back

* chore: clear comments

* fix: merge

* fix: tree snapshot

* fix: filter empty bring back

* feat: tree add external diff

* fix: imported changes were not mergeable (#147)

* fix: imported changes were not mergeable
now the small encoding size is supported in example

* fix: stupid err in richtext checkout

* fix: rle oplog encode err
- support pending changes
- start counters were wrong

* fix: utf16 query err (#151)

* fix: tree movable node lamport

* fix: merge

* perf: bring back

* doc: add deep value meta doc

* refactor: bring back only when record diff

---------

Co-authored-by: Zixuan Chen <remch183@outlook.com>
2023-11-05 15:53:33 +08:00
Zixuan Chen
7a19b49acb
Add richtext example using Quill (#145)
* feat: richtext example init

* fix: pass richtext event delta consistency check

* chore: debug history
2023-11-03 16:59:27 +08:00
Zixuan Chen
95e6130d93
Fix: richtext event (#138)
Support rich text event. Now it will emit the delta event correctly in the Quill Delta format.
2023-11-01 20:02:05 +08:00
Zixuan Chen
734b832c00
Fix checkout event (#126)
* tests: add checkout err tests

* fix: checkout event err when create child
2023-10-30 14:16:50 +08:00
Zixuan Chen
d942e3d7a2
Feat: Peritext-like rich text support (#123)
* feat: richtext wip

* feat: add insert to style range map wip

* feat: richtext state

* fix: fix style state inserting and style map

* fix: tiny vec merge err

* fix: comment err

* refactor: use new generic-btree & refine impl

* feat: fugue tracker

* feat: tracker

* feat: tracker

* fix: fix a few err in impl

* feat: init richtext content state

* feat: refactor arena

* feat: extract anchor_type info out of style flag

* refactor: state apply op more efficiently
we can now reuse the repr in state and op

* fix: new clippy errors

* refactor: use state chunk as delta item

* refactor: use two op to insert style start and style end

* feat: diff calc

* feat: handler

* fix: tracker checkout err

* fix: pass basic richtext handler tests

* fix: pass handler basic marking tests

* fix: pass all peritext criteria

* feat: snapshot encoding for richtext init

* refactor: replace Text with Richtext

* refactor: rm text code

* fix: richtext checkout err

* refactor: diff of text and map

* refactor: del span

* refactor: event

* fix: fuzz err

* fix: pass all tests

* fix: fuzz err

* fix: list child cache err

* chore: rm debug code

* fix: encode enhanced err

* fix: encode enhanced

* fix: fix several richtext issue

* fix: richtext anchor err

* chore: rm debug code

* fix: richtext fuzz err

* feat: speedup text snapshot decode

* perf: optimize snapshot encoding

* perf: speed up decode & insert

* fix: fugue span merge err

* perf: speedup delete & id cursor map

* fix: fugue merge err

* chore: update utils

* perf: speedup text insert / del

* fix: cursor cache

* perf: reduce conversion by introducing InsertText

* perf: speed up by refined cursor cache

* chore: update gbtree dep

* refactor(wasm): use quill delta format

* chore: fix warnings
2023-10-29 14:02:13 +08:00
leeeon233
f1adc7d15d chore: use columnar 0.3.2 2023-09-11 14:54:54 +08:00
Zixuan Chen
72cc8c6ed5
fix: map version checkout err (#101) 2023-08-04 22:41:02 +08:00
Zixuan Chen
c105ff2220
Feat: checkout to target version & use unicode index by default (#98)
* feat: checkout to frontiers

* feat: record timestamp

* fix: use unicode len by default for text
now "你好" has length of 2 instead of 6

* chore: rm dbg!
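
The length change mentioned in this commit can be reproduced with plain Rust string methods. The helper names below are hypothetical and only contrast the two ways of measuring a string; they are not functions from this repository.

```rust
/// Length in Unicode scalar values (the "unicode len" the commit switches to).
fn unicode_len(s: &str) -> usize {
    s.chars().count()
}

/// Length in UTF-8 bytes.
fn utf8_len(s: &str) -> usize {
    s.len()
}
```

For "你好", `unicode_len` returns 2 while `utf8_len` returns 6, since each of the two characters takes three bytes in UTF-8.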
2023-08-04 10:45:23 +08:00
Zixuan Chen
abec22cd22 fix: text sync issues 2023-07-12 12:30:36 +08:00
Zixuan Chen
508ca4b5c6 refactor: use a new version of txn 2023-07-10 12:06:11 +08:00
Zixuan Chen
abd3e38253 chore: bk 2023-07-02 23:24:17 +08:00
Zixuan Chen
c50294ac22 feat: use text tracker diff 2023-06-29 16:09:42 +08:00
Zixuan Chen
490a54d559 feat: expose from loro crate 2023-03-21 11:09:12 +08:00
leeeon233
9544e27be4 feat: add delta compose 2023-03-01 14:12:05 +08:00
leeeon233
5b6f864479 feat: init nodejs bindgen 2023-02-16 11:22:50 +08:00
Zixuan Chen
9748779f08
Bench: report (#49) 2022-12-27 14:18:46 +08:00
Zixuan Chen
594b60dafb
Perf store cache in parent node (#36)
* refactor: make internal and leaf use same type of cache

* refactor: add cache update

* test : add normalization to arb test

* test: fuzz

* fix: internal insert bug

* fix: missing utf16

* test: fix test sub overflow

* feat: use heapless for binary heap

* refactor: refine warning

* test: reduce test time

* perf: reduce computation when finding pos

* bench: fix ignore parse time in benching

* feat: make it compile in new sig (should be merged)

* fix: type err

* fix: fix type err

* fix: cache when merge & borrow

* refactor: simplify code

* fix: cumulated tree trait bug

* fix: a few fatal bugs (still buggy)

* fix: global tree trait

* refactor: rm an unused fn

* fix: insert at cursor bug

* fix: in cursor insert cache may be invalid

strip the checker there

* chore: remove needless check

* refactor: add inline to methods

* test: remove cfg=mem for mem example

* fix: type err
2022-12-06 16:34:46 +08:00
Zixuan Chen
7a6e50931d chore: replace justfile with deno task 2022-11-21 12:50:15 +08:00
Zixuan Chen
8f7a5a08e0 refactor: fix warning and remove dead codes 2022-11-18 21:16:29 +08:00
Zixuan Chen
1ca3f0e774 refactor: rename feature fuzzing to test_utils 2022-11-14 10:49:42 +08:00
Zixuan Chen
743e2b597b chore: fix all warnings 2022-11-11 22:44:18 +08:00
Zixuan Chen
9d48e5df88 refactor: fix type error 2022-10-31 12:33:44 +08:00
Zixuan Chen
e0a472fd1a feat: basic wasm interface 2022-10-31 12:22:07 +08:00