| author | Bruce Hill <bruce@bruce-hill.com> | 2024-11-29 19:36:17 -0500 |
|---|---|---|
| committer | Bruce Hill <bruce@bruce-hill.com> | 2024-11-29 19:36:17 -0500 |
| commit | 0d6ef67a014f231b4e24b71bb85ec1e4df5b6319 | |
| tree | 9ee243a7722fb3ef6b08a34bdfeae330bca967c1 /docs | |
| parent | 6d2017d5b811826ac84e8d1df6dba84381cf6d2d | |
Add serialization docs
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/serialization.md | 69 |
1 files changed, 69 insertions, 0 deletions
diff --git a/docs/serialization.md b/docs/serialization.md
new file mode 100644
index 00000000..6f0749ca
--- /dev/null
+++ b/docs/serialization.md
@@ -0,0 +1,69 @@
+# Serialization
+
+Data serialization and deserialization is notoriously difficult to do correctly
+and tedious to implement. In order to make this process easier, Tomo comes with
+built-in support for serialization and deserialization of most built-in types,
+as well as user-defined structs and enums. Serialization is a process that
+takes Tomo values and converts them to bytes, which can be saved in a file or
+sent over a network. Serialized bytes can then be deserialized to retrieve the
+original value.
+
+## Serializing
+
+To serialize data, simply call the method `:serialize()` on any value and it
+will return an array of bytes that encode the value's data:
+
+```tomo
+value := Int64(5)
+>> serialized := value:serialize()
+= [0x0A] : [Byte]
+```
+
+Serialization produces a fairly compact representation of data as a flat array
+of bytes. In this case, a 64-bit integer can be represented in a single byte
+because it's a small number.
+
+## Deserializing
+
+To deserialize data, you must provide its type explicitly. The current syntax
+is a placeholder, but it looks like this:
+
+```tomo
+i := 123
+bytes := i:serialize()
+
+roundtripped := DESERIALIZE(bytes):Int
+>> roundtripped
+= 123 :Int
+```
+
+## Pointers
+
+In the case of pointers, deserialization creates a new heap-allocated region of
+memory for the values. This means that if you serialize a pointer, it will
+store all of the memory contents of that pointer, but not the literal memory
+address of the pointer, which may not be valid memory when deserialization
+occurs. The upshot is that you can easily serialize data structures that rely on
+pointers, but pointers returned from deserialization will point to new memory
+and will not point to the same memory as any pre-existing pointers.
+
+One of the nice things about this process is that it automatically handles
+cyclic data structures correctly, enabling you to serialize cyclic structures
+like circularly linked lists or graphs:
+
+```tomo
+struct Cycle(name:Text, next=NONE:@Cycle)
+
+c := @Cycle("A")
+c.next = @Cycle("B", next=c)
+>> c
+= @Cycle(name="A", next=@Cycle(name="B", next=@~1))
+>> serialized := c:serialize()
+= [0x02, 0x02, 0x41, 0x01, 0x04, 0x02, 0x42, 0x01, 0x02] : [Byte]
+>> roundtrip := DESERIALIZE(serialized):@Cycle
+= @Cycle(name="A", next=@Cycle(name="B", next=@~1)) : @Cycle
+```
+
+The deserialized version of the data correctly preserves the cycle
+(`roundtrip.next.next == roundtrip`). The representation is also very compact:
+only 9 bytes for the whole thing!
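
Reviewer note: the added docs do not specify the byte format, but the two sample outputs are consistent with one plausible scheme: integers and lengths written as zigzag-mapped base-128 varints, an optional field prefixed by a presence byte, and each pointer numbered in the order it is first visited so that a repeated pointer is written as its number alone. The Python sketch below (not Tomo) models that scheme under those assumptions. The names `zigzag_varint`, `read_zigzag_varint`, `serialize_cycle`, and `deserialize_cycle` are hypothetical and are not part of Tomo's API; the actual implementation may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


def zigzag_varint(n: int) -> bytes:
    """Encode a signed int as a zigzag-mapped base-128 varint (assumed format)."""
    z = 2 * n if n >= 0 else -2 * n - 1  # 0, -1, 1, -2, ... -> 0, 1, 2, 3, ...
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def read_zigzag_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    z = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        z |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return (z >> 1) ^ -(z & 1), pos


@dataclass(eq=False)  # identity comparison, so cycles are safe to compare
class Cycle:
    name: str
    next: Optional["Cycle"] = None


def serialize_cycle(node: Cycle, seen: Optional[Dict[int, int]] = None) -> bytes:
    """Write a node once; later visits emit only its pointer number."""
    seen = {} if seen is None else seen
    if id(node) in seen:
        return zigzag_varint(seen[id(node)])  # back-reference, no contents
    seen[id(node)] = len(seen) + 1
    out = bytearray(zigzag_varint(seen[id(node)]))
    text = node.name.encode("utf-8")
    out += zigzag_varint(len(text)) + text
    if node.next is None:
        out.append(0)  # presence byte: no next pointer
    else:
        out.append(1)  # presence byte: next pointer follows
        out += serialize_cycle(node.next, seen)
    return bytes(out)


def deserialize_cycle(data: bytes, pos: int = 0,
                      nodes: Optional[Dict[int, Cycle]] = None) -> Tuple[Cycle, int]:
    """Rebuild nodes in fresh memory, resolving back-references to them."""
    nodes = {} if nodes is None else nodes
    ident, pos = read_zigzag_varint(data, pos)
    if ident in nodes:
        return nodes[ident], pos
    node = Cycle(name="")
    nodes[ident] = node  # register before recursing so the cycle can close
    length, pos = read_zigzag_varint(data, pos)
    node.name = data[pos:pos + length].decode("utf-8")
    pos += length
    has_next = data[pos]
    pos += 1
    if has_next:
        node.next, pos = deserialize_cycle(data, pos, nodes)
    return node, pos


c = Cycle("A")
c.next = Cycle("B", next=c)
encoded = serialize_cycle(c)
roundtrip, _ = deserialize_cycle(encoded)
print(list(zigzag_varint(5)))        # [10], i.e. 0x0A
print(len(encoded))                  # 9
print(roundtrip.next.next is roundtrip)  # True
```

Under these assumptions the sketch reproduces both byte sequences quoted in the docs, and the decoded structure is a new object graph whose cycle is intact, which matches the behaviour the Pointers section describes. If the real format differs, a short note on the encoding in `docs/serialization.md` would help readers who need to interoperate with it.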
