Co-authored-by: Zixuan Chen <remch183@outlook.com>
JSON Schema for Loro's OpLog
Introduction
Loro supports multiple data structures and introduces many new concepts. Having only binary export formats would make it difficult for developers to understand the underlying processes. Better transparency leads to better developer experience. A human-readable JSON representation enables users to better understand and operate the document and to develop related tools.
To better understand this document, you may first need to understand how Loro stores historical editing data:
Note that, given the intended usage scenario, the JSON Schema supports only backward compatibility, not forward compatibility.
Specification
Root object
The root object contains all Changes, Ops, and critical metadata like start/end versions and schema version.
We also extract the 64-bit integer PeerIDs into a list at the beginning of the document and replace them internally with indices that increment from zero: 0, 1, 2, 3... This significantly reduces the document size and enhances readability.
{
"schema_version": number,
"start_version": Map<string, number>,
"peers": string[],
"changes": Change[]
}
schema_version: the version of the schema that the document is encoded with. It is 1 for the current specification.
start_version: the start Frontiers version of the document, represented as a map from the decimal string representation of a PeerID to a Counter.
peers: the list of peers in the document. All PeerIDs are represented as decimal strings to avoid exceeding JavaScript's number limit.
changes: the list of changes in the document.
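As an illustration of the peer-index compression described above, the following TypeScript sketch models the root object and maps a compressed peer index back to the original PeerID. The names JsonSchemaRoot, resolvePeer, and exampleRoot are hypothetical and chosen only for this example; changes is left untyped for brevity, and the PeerID values are made up.

```typescript
// Hypothetical shape of the root object, following the fields listed above.
interface JsonSchemaRoot {
  schema_version: number;
  start_version: Record<string, number>; // decimal PeerID string -> Counter
  peers: string[]; // 64-bit PeerIDs as decimal strings
  changes: unknown[]; // Change objects, left untyped in this sketch
}

// Map a compressed peer index (0, 1, 2, ...) back to the real PeerID.
// BigInt is used because a 64-bit PeerID can exceed Number.MAX_SAFE_INTEGER.
function resolvePeer(root: JsonSchemaRoot, index: number): bigint {
  const peer = root.peers[index];
  if (peer === undefined) {
    throw new RangeError(`peer index ${index} is out of range`);
  }
  return BigInt(peer);
}

// Made-up example document with two peers and no changes.
const exampleRoot: JsonSchemaRoot = {
  schema_version: 1,
  start_version: {},
  peers: ["14944917281143706156", "7"],
  changes: [],
};
```

Keeping PeerIDs as strings in the JSON and converting them only when needed avoids silent precision loss in JavaScript numbers.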
Changes
Changes are crucial in the OpLog. A REG (Replay Event Graph) is a directed acyclic graph where each node is a Change and each edge is a causal dependency between Changes. The metadata of the Changes helps us reconstruct the graph.
You can also attach a commit message to a Change, as you would with a Git commit.
{
"id": string,
"timestamp": number,
"deps": OpID[],
"lamport": number,
"msg": string,
"ops": Op[]
}
type OpID = `${number}@${PeerID}`;
id: the string representation of the unique ID of each Change, in the form {Counter}@{PeerID}, that is, the Counter and the PeerID joined by the @ character. Here the PeerID is the index of the peer in the peers list of the root object.
timestamp: the Unix timestamp at which the change was committed. Timestamps are not recorded by default.
deps: the list of causal dependencies of this Change; each item is an ID represented as a string.
lamport: the Lamport timestamp of the Change.
msg: the commit message.
ops: all of the Ops in the Change.
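The ID strings used in id and deps can be split back into their parts with a few lines of code. The sketch below assumes that both the counter and the peer index are non-negative decimal integers; parseID and resolveDeps are hypothetical helper names, not part of Loro's API.

```typescript
// Parsed form of an ID string such as "5@0".
interface ParsedID {
  counter: number;
  peerIndex: number; // index into the root object's `peers` array
}

// Split a `{Counter}@{PeerID}` string into its two numeric parts.
// Assumption: both parts are non-negative decimal integers.
function parseID(id: string): ParsedID {
  const match = /^(\d+)@(\d+)$/.exec(id);
  if (match === null) {
    throw new SyntaxError(`malformed ID: ${id}`);
  }
  return { counter: Number(match[1]), peerIndex: Number(match[2]) };
}

// Resolve every dependency of a change to a [counter, real PeerID] pair.
function resolveDeps(deps: string[], peers: string[]): Array<[number, string]> {
  return deps.map((dep): [number, string] => {
    const { counter, peerIndex } = parseID(dep);
    const peer = peers[peerIndex];
    if (peer === undefined) {
      throw new RangeError(`unknown peer index ${peerIndex} in ${dep}`);
    }
    return [counter, peer];
  });
}
```

Resolving deps this way yields the edges of the causal graph in terms of real PeerIDs rather than compressed indices.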
Operations
Operation (abbreviated as Op) is the most complex part of the document. Loro currently supports multiple containers: List, Map, RichText, Movable List, and Movable Tree. Each data structure has several different Ops.
In general, however, each Op is composed of the ContainerID of the container that created it, a counter, and the corresponding content of the Op.
type Op = {
"container": ContainerID,
"counter": number,
"content": OpContent // Its detailed definition is elaborated below, with different types for different Containers.
};
type OpContent = ListOp | TextOp | MapOp | TreeOp | MovableListOp | UnknownOp;
type ContainerID =
| `cid:root-${string}:${ContainerType}`
| `cid:${number}@${PeerID}:${ContainerType}`;
container: the ContainerID of the container that created this Op, represented by a string that starts with cid:.
counter: the counter part of the OpID.
content: the semantic content of the Op; its fields differ depending on the Container.
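The two ContainerID forms can be told apart by inspecting the part between the cid: prefix and the final colon. The following sketch is one possible parser, under the assumption that a container type name never contains a colon (a root name may). ParsedContainerID and parseContainerID are hypothetical names, and the type names List and Map in the usage note are illustrative.

```typescript
type ParsedContainerID =
  | { kind: "root"; name: string; containerType: string }
  | { kind: "normal"; counter: number; peerIndex: number; containerType: string };

function parseContainerID(cid: string): ParsedContainerID {
  const prefix = "cid:";
  // The container type is everything after the last ':'.
  // Assumption: container type names never contain ':' themselves.
  const lastColon = cid.lastIndexOf(":");
  if (!cid.startsWith(prefix) || lastColon < prefix.length) {
    throw new SyntaxError(`not a ContainerID: ${cid}`);
  }
  const body = cid.slice(prefix.length, lastColon);
  const containerType = cid.slice(lastColon + 1);
  if (body.startsWith("root-")) {
    // Root container: cid:root-{Name}:{ContainerType}
    return { kind: "root", name: body.slice("root-".length), containerType };
  }
  // Normal container: cid:{Counter}@{PeerID}:{ContainerType}
  const match = /^(\d+)@(\d+)$/.exec(body);
  if (match === null) {
    throw new SyntaxError(`malformed ContainerID body: ${body}`);
  }
  return {
    kind: "normal",
    counter: Number(match[1]),
    peerIndex: Number(match[2]),
    containerType,
  };
}
```

For example, parseContainerID("cid:root-todo:List") yields a root entry named todo, while parseContainerID("cid:0@1:Map") yields a normal entry with counter 0 and peer index 1.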
The following sections describe the Op content for each container.
List
type ListOp = ListInsertOp | ListDeleteOp;
Insert
type ListInsertOp = {
"type": "insert",
"pos": number,
"value": LoroValue
}
type: insert.
pos: the index of the insert operation.
value: the inserted content, which is a list of LoroValue.
Delete
type ListDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the start index of the deletion.
len: the length of the deleted content.
start_id: the string ID of the first deleted element.
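To make the meaning of pos and len concrete, the sketch below replays list Ops onto a plain array. It is a naive local replay, not Loro's merge algorithm: it assumes the Ops are applied in the order in which one peer observed them, that len is non-negative, and that start_id can be ignored. applyListOp is a hypothetical helper name, and LoroValue is stubbed as unknown.

```typescript
type LoroValue = unknown; // stand-in for the LoroValue type described in this document

type ListInsertOp = { type: "insert"; pos: number; value: LoroValue };
type ListDeleteOp = { type: "delete"; pos: number; len: number; start_id: string };
type ListOp = ListInsertOp | ListDeleteOp;

// Apply one list Op to a plain array, mutating it in place.
function applyListOp(list: LoroValue[], op: ListOp): void {
  switch (op.type) {
    case "insert": {
      // The field description says `value` holds a list of LoroValue,
      // so an array is spliced in element by element.
      const items = Array.isArray(op.value) ? op.value : [op.value];
      list.splice(op.pos, 0, ...items);
      break;
    }
    case "delete":
      // Assumption: `len` is non-negative; `start_id` is not needed here.
      list.splice(op.pos, op.len);
      break;
  }
}
```

Inserting ["a", "b", "c"] at position 0 and then deleting one element at position 1 leaves ["a", "c"]. Concurrent Ops from different peers need the CRDT merge logic and cannot be replayed this way.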
MovableList
type MovableListOp = ListInsertOp | ListDeleteOp | MovableListMoveOp | MovableListSetOp;
Insert
type ListInsertOp = {
"type": "insert",
"pos": number,
"value": LoroValue
}
type: insert.
pos: the index of the insert operation.
value: the inserted content, which is a list of LoroValue.
Delete
type ListDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the start index of the deletion.
len: the length of the deleted content.
start_id: the string ID of the first deleted element.
Move
type MovableListMoveOp = {
"type": "move",
"from": number,
"to": number,
"elem_id": ElemID
}
type ElemID = `L${number}@${PeerID}`
type: move.
from: the index of the element before it is moved.
to: the index the element is moved to, counted after the element has been removed from its original position.
elem_id: the ID (described by lamport@peer) of the moved element.
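The relationship between from and to is easiest to see on a plain array. The sketch below assumes, following the field descriptions above, that to is interpreted against the list with the moved element already taken out. applyMove is a hypothetical helper, and elem_id is ignored because a plain array carries no element IDs.

```typescript
type MovableListMoveOp = { type: "move"; from: number; to: number; elem_id: string };

// Move one element inside a plain array, mutating it in place.
function applyMove<T>(list: T[], op: MovableListMoveOp): void {
  if (op.from < 0 || op.from >= list.length) {
    throw new RangeError(`from index ${op.from} is out of range`);
  }
  // Take the element out first ...
  const removed = list.splice(op.from, 1);
  // ... then insert it at `to`, which refers to the list without that element.
  list.splice(op.to, 0, ...removed);
}
```

Moving index 0 to index 2 in ["a", "b", "c", "d"] first gives ["b", "c", "d"] and then ["b", "c", "a", "d"].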
Set
type MovableListSetOp = {
"type": "set",
"elem_id": ElemID,
"value": LoroValue
}
type ElemID = `L${number}@${PeerID}`
type: set.
elem_id: the ID (described by lamport@peer) of the element whose value is replaced.
value: the new value of the element.
Map
type MapOp = MapInsertOp | MapDeleteOp;
Insert
type MapInsertOp = {
"type": "insert",
"key": string,
"value": LoroValue
}
type: insert.
key: the key of the insertion.
value: the value of the insertion.
Delete
type MapDeleteOp = {
"type": "delete",
"key": string
}
type: delete.
key: the key of the deletion.
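Map Ops translate directly to operations on a key-value store. The following is a minimal sketch that replays them onto a JavaScript Map, assuming the Ops arrive in an order in which the last write should win; applyMapOp is a hypothetical helper and LoroValue is stubbed as unknown.

```typescript
type LoroValue = unknown; // stand-in for the LoroValue type described in this document

type MapInsertOp = { type: "insert"; key: string; value: LoroValue };
type MapDeleteOp = { type: "delete"; key: string };
type MapOp = MapInsertOp | MapDeleteOp;

// Apply one map Op to a plain Map, mutating it in place.
function applyMapOp(state: Map<string, LoroValue>, op: MapOp): void {
  if (op.type === "insert") {
    // An insert on an existing key simply overwrites the previous value.
    state.set(op.key, op.value);
  } else {
    state.delete(op.key);
  }
}
```

Because the type field is a string literal in both variants, TypeScript narrows op to the right variant inside each branch.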
Text
type TextOp = TextInsertOp | TextDeleteOp | TextMarkOp | TextMarkEndOp;
Insert
type TextInsertOp = {
"type": "insert",
"pos": number,
"text": string
}
type: insert.
pos: the index of the insert operation. The position is based on the Unicode code point length.
text: the string of the insertion.
Delete
type TextDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the index of the deletion. The position is based on the Unicode code point length.
len: the length of the deleted text.
start_id: the string ID of the first deleted element.
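Because positions count Unicode code points, they do not always match JavaScript string indices, which count UTF-16 code units. The sketch below shows one way to apply text Ops correctly in TypeScript. It assumes that len is also measured in code points and is non-negative; insertText and deleteText are hypothetical helper names.

```typescript
// Insert `inserted` at a position measured in Unicode code points.
function insertText(text: string, pos: number, inserted: string): string {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  const codePoints = Array.from(text);
  if (pos < 0 || pos > codePoints.length) {
    throw new RangeError(`position ${pos} is out of range`);
  }
  codePoints.splice(pos, 0, inserted);
  return codePoints.join("");
}

// Delete `len` code points starting at code point position `pos`.
// Assumption: `len` is measured in code points and is non-negative.
function deleteText(text: string, pos: number, len: number): string {
  const codePoints = Array.from(text);
  if (pos < 0 || pos + len > codePoints.length) {
    throw new RangeError(`range ${pos}..${pos + len} is out of range`);
  }
  codePoints.splice(pos, len);
  return codePoints.join("");
}
```

For the string "a\u{1F600}b", whose JavaScript length is 4, code point position 2 falls just before "b", so insertText("a\u{1F600}b", 2, "X") places "X" between the emoji and "b" instead of splitting the surrogate pair.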
Mark
type TextMarkOp = {
"type": "mark",
"start": number,
"end": number,
"style_key": string,
"style_value": LoroValue,
"info": number
}
type: mark.
start: the start index of the text to mark. The position is based on the Unicode code point length.
end: the end index of the text to mark. The position is based on the Unicode code point length.
style_key: the key of the style; it is customizable.
style_value: the value of the style; it is customizable.
info: the configuration of the style, which controls whether the style expands when new text is inserted around it.
MarkEnd
type TextMarkEndOp = {
"type": "mark_end"
}
type: mark_end.
Tree
type TreeOp = TreeCreateOp | TreeMoveOp | TreeDeleteOp;
Create
type TreeCreateOp = {
"type": "create",
"target": TreeID,
"parent": TreeID | null,
"fractional_index": string
}
type TreeID = `${number}@${PeerID}`
type: create.
target: the string form of the TreeID of the node being created.
parent: the string form of the parent's TreeID, or null. If it is null, the target node will be a root node.
fractional_index: the fractional index of the target node, as a hex string.
Move
type TreeMoveOp = {
"type": "move",
"target": TreeID,
"parent": TreeID | null,
"fractional_index": string
}
type TreeID = `${number}@${PeerID}`
type: move.
target: the string form of the TreeID of the node being moved.
parent: the string form of the parent's TreeID, or null. If it is null, the target node will be a root node.
fractional_index: the fractional index of the target node, as a hex string.
Delete
type TreeDeleteOp = {
"type": "delete",
"target": TreeID
}
type TreeID = `${number}@${PeerID}`
type: delete.
target: the string form of the TreeID of the node being deleted.
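The three tree Ops can be replayed onto a simple parent map to see how they fit together. The sketch below is a simplification under several assumptions: Ops are applied in one peer's local order, siblings are ordered by comparing the fractional_index hex strings lexicographically, and delete removes only the target entry without touching its descendants. applyTreeOp, childrenOf, and TreeNode are hypothetical names, and the fractional index values in the usage note are made up.

```typescript
type TreeID = string; // `${number}@${PeerID}`
type TreeOp =
  | { type: "create"; target: TreeID; parent: TreeID | null; fractional_index: string }
  | { type: "move"; target: TreeID; parent: TreeID | null; fractional_index: string }
  | { type: "delete"; target: TreeID };

interface TreeNode {
  parent: TreeID | null; // null marks a root node
  fractionalIndex: string; // hex string used to order siblings
}

// Apply one tree Op to a map from node ID to its parent and position.
function applyTreeOp(nodes: Map<TreeID, TreeNode>, op: TreeOp): void {
  if (op.type === "delete") {
    // Simplification: descendants of the deleted node are left untouched.
    nodes.delete(op.target);
    return;
  }
  // create and move both (re)assign the parent and the fractional index.
  nodes.set(op.target, { parent: op.parent, fractionalIndex: op.fractional_index });
}

// List the children of `parent`, ordered by fractional index.
// Assumption: lexicographic order of the hex strings is the sibling order.
function childrenOf(nodes: Map<TreeID, TreeNode>, parent: TreeID | null): TreeID[] {
  return Array.from(nodes.entries())
    .filter((entry) => entry[1].parent === parent)
    .sort((a, b) => {
      const x = a[1].fractionalIndex;
      const y = b[1].fractionalIndex;
      return x < y ? -1 : x > y ? 1 : 0;
    })
    .map((entry) => entry[0]);
}
```

Creating two children of "0@0" with fractional indices "80" and "40" makes childrenOf list the node with "40" first.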
Unknown
To support forward compatibility, we have an unknown type. When an Op for a Container type introduced in a newer version is decoded by an older version, it is treated as an unknown type and kept in a more general form, such as binary data or a string. When a newer version of Loro later decodes this unknown Op, it recognizes the true type and decodes it correctly.
type UnknownOp = {
"type": "unknown",
"prop": number,
"value_type": string,
"value": `${EncodeValue}`
}
type: always unknown.
prop: a property of the encoded Op, given as a number.
value_type: the type of the EncodeValue.
value: the encoded value, a common data type used in encoding, in JSON string format.
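A reader that does not understand an Op can detect the unknown form and carry it along untouched. The following type guard is a minimal sketch based only on the four fields listed above; isUnknownOp is a hypothetical helper, and value is treated as an opaque string.

```typescript
type UnknownOp = {
  type: "unknown";
  prop: number;
  value_type: string;
  value: string; // JSON string form of the encoded value, kept opaque here
};

// Check at runtime whether a decoded Op content has the UnknownOp shape.
function isUnknownOp(content: unknown): content is UnknownOp {
  if (typeof content !== "object" || content === null) {
    return false;
  }
  const c = content as Record<string, unknown>;
  return (
    c.type === "unknown" &&
    typeof c.prop === "number" &&
    typeof c.value_type === "string" &&
    typeof c.value === "string"
  );
}
```

A tool built on this check could skip unknown Ops when rendering while still writing them back unchanged on export.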
Value
This section introduces the two kinds of value in Loro. One is LoroValue, an enum of the data types supported by Loro, such as the values inserted into a List or Map.
The other is EncodedValue, which is used only in the encoding module for unknown types.
LoroValue
These are the data types supported by Loro and their JSON formats:
null: null
Bool: true or false
F64: number (float)
I64: number or bigint (signed)
Binary: Uint8Array
String: string
List: Array<LoroValue>
Map: Map<string, LoroValue>
Container: the ID of the container, written as 🦜:cid:{Counter}@{PeerID}:{ContainerType} or 🦜:cid:root-{Name}:{ContainerType}
Note: Compared with the string format, we add the prefix 🦜: when encoding a ContainerID in the JSON format. This prevents an ordinary string that a user saved in the ContainerID string format from being misinterpreted as a ContainerID when decoding.
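The prefix rule above can be checked with a small helper. The sketch below distinguishes a container reference from an ordinary string value and strips the prefix; CONTAINER_PREFIX and asContainerID are hypothetical names used only for this example.

```typescript
// The prefix added to a ContainerID when it appears as a LoroValue in JSON.
const CONTAINER_PREFIX = "🦜:";

// Return the plain ContainerID if `value` is a prefixed container reference,
// or null if it is any other value (including a bare "cid:..." string).
function asContainerID(value: unknown): string | null {
  if (typeof value === "string" && value.startsWith(CONTAINER_PREFIX + "cid:")) {
    return value.slice(CONTAINER_PREFIX.length);
  }
  return null;
}
```

A user string such as "cid:root-todo:List" without the prefix is returned as null here, so it stays an ordinary string value.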
EncodedValue
EncodedValue is the type Loro uses internally when encoding; users do not need to understand it in detail. It is designed specifically to handle schema mismatches that arise from forward and backward compatibility. In the JSON encoding schema, an EncodedValue is encoded as an object.