Serialization
Data serialization and deserialization are notoriously difficult to do correctly and tedious to implement. To make this easier, Tomo comes with built-in support for serializing and deserializing most built-in types, as well as user-defined structs and enums. Serialization converts Tomo values to bytes, which can be saved in a file or sent over a network. Serialized bytes can then be deserialized to retrieve the original value.
Serializing
To serialize data, declare a variable with type [Byte] and assign a value of
any serializable type to it.
value := Int64(5)
serialized : [Byte] = value
assert serialized == [0x0A]
Serialization produces a fairly compact representation of data as a flat list of bytes. In this case, a 64-bit integer can be represented in a single byte because it's a small number.
The same process works with more complicated data:
struct Foo(x:Int, y:Text)
foo := Foo(123, "Hello")
serialized : [Byte] = foo
assert serialized == [0x00, 0xf6, 0x01, 0x0a, 0x48, 0x65, 0x6c, 0x6c, 0x6f]
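The byte values in the examples above are consistent with a zigzag mapping followed by a base-128 variable-length integer encoding, which would explain why a small 64-bit integer fits in one byte. The source does not name the encoding, so the Python sketch below is an inferred model of the format, not Tomo's implementation. The function names are hypothetical, and the sketch does not try to account for the leading 0x00 byte in the Foo example.

```python
def zigzag(n: int) -> int:
    """Map signed integers to unsigned ones so small magnitudes stay small."""
    return 2 * n if n >= 0 else -2 * n - 1


def varint(u: int) -> bytes:
    """Encode an unsigned integer 7 bits at a time, lowest bits first."""
    out = bytearray()
    while u >= 0x80:
        out.append((u & 0x7F) | 0x80)  # high bit set: more bytes follow
        u >>= 7
    out.append(u)  # final byte has the high bit clear
    return bytes(out)


def encode_int(n: int) -> bytes:
    """Assumed integer layout: zigzag, then varint."""
    return varint(zigzag(n))


def encode_text(s: str) -> bytes:
    """Assumed text layout: encoded byte length, then the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return encode_int(len(raw)) + raw


print(encode_int(5).hex())         # 0a
print(encode_int(123).hex())       # f601
print(encode_text("Hello").hex())  # 0a48656c6c6f
```

Under this model, 5 maps to 10 (0x0A), 123 maps to 246 and needs two bytes (0xf6, 0x01), and "Hello" is a length byte (0x0a) followed by its five UTF-8 bytes, matching the values shown above.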
Deserializing
To deserialize data, you can assign a list of bytes to a variable with your target type:
value_bytes : [Byte] = [Byte(0x0A)]
value : Int64 = value_bytes
assert value == 5
foo_bytes : [Byte] = [0x00, 0xf6, 0x01, 0x0a, 0x48, 0x65, 0x6c, 0x6c, 0x6f]
foo : Foo = foo_bytes
assert foo == Foo(123, "Hello")
Pointers
When a pointer is deserialized, a new heap-allocated region of memory is created for its value. Serializing a pointer stores the contents of the memory it points to, but not the literal memory address, which may not be valid when deserialization occurs. The upshot is that you can easily serialize data structures that rely on pointers, but pointers returned from deserialization point to new memory, never to the same memory as any pre-existing pointer.
One of the nice things about this process is that it automatically handles cyclic data structures correctly, so you can serialize structures like circularly linked lists or graphs:
struct Cycle(name:Text, next:@Cycle?=none)
c := @Cycle("A")
c.next = @Cycle("B", next=c)
say("$c")
# @Cycle(name="A", next=@Cycle(name="B", next=@~1))
bytes : [Byte] = c
say("$bytes")
# [0x02, 0x02, 0x41, 0x01, 0x04, 0x02, 0x42, 0x01, 0x02]
roundtrip : @Cycle = bytes
say("$roundtrip")
# @Cycle(name="A", next=@Cycle(name="B", next=@~1))
assert roundtrip.next.next == roundtrip
The deserialized version of the data correctly preserves the cycle
(roundtrip.next.next == roundtrip). The representation is also very compact:
only 9 bytes for the whole thing!
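One way a serializer can preserve cycles is to number each pointer the first time it is written and emit only that number on later visits. The Python sketch below is a model built on that assumption, together with an assumed zigzag varint integer encoding and a one-byte present/absent flag for the optional field. It is not taken from Tomo's implementation, and every name in it is hypothetical, but it happens to reproduce the nine bytes shown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False keeps identity-based equality, like pointers
class Cycle:
    name: str
    next: Optional["Cycle"] = None


def encode_int(n: int) -> bytes:
    """Assumed integer layout: zigzag, then base-128 varint."""
    u = 2 * n if n >= 0 else -2 * n - 1
    out = bytearray()
    while u >= 0x80:
        out.append((u & 0x7F) | 0x80)
        u >>= 7
    out.append(u)
    return bytes(out)


def serialize(root: Cycle) -> bytes:
    ids: dict[int, int] = {}  # id(object) -> 1-based pointer number
    out = bytearray()

    def write_pointer(node: Cycle) -> None:
        if id(node) in ids:
            out.extend(encode_int(ids[id(node)]))  # back-reference only
            return
        ids[id(node)] = len(ids) + 1
        out.extend(encode_int(ids[id(node)]))
        raw = node.name.encode("utf-8")
        out.extend(encode_int(len(raw)) + raw)
        out.append(0 if node.next is None else 1)  # optional: absent/present
        if node.next is not None:
            write_pointer(node.next)

    write_pointer(root)
    return bytes(out)


def deserialize(data: bytes) -> Cycle:
    seen: dict[int, Cycle] = {}  # pointer number -> freshly allocated object
    pos = 0

    def read_int() -> int:
        nonlocal pos
        u = shift = 0
        while True:
            b = data[pos]
            pos += 1
            u |= (b & 0x7F) << shift
            shift += 7
            if not b & 0x80:
                break
        return u // 2 if u % 2 == 0 else -(u + 1) // 2

    def read_pointer() -> Cycle:
        nonlocal pos
        ptr_id = read_int()
        if ptr_id in seen:
            return seen[ptr_id]
        node = Cycle(name="")
        seen[ptr_id] = node  # register before reading fields so cycles resolve
        length = read_int()
        node.name = data[pos:pos + length].decode("utf-8")
        pos += length
        has_next = data[pos]
        pos += 1
        if has_next:
            node.next = read_pointer()
        return node

    return read_pointer()


c = Cycle("A")
c.next = Cycle("B", next=c)
data = serialize(c)
roundtrip = deserialize(data)
print(data.hex(" "))  # 02 02 41 01 04 02 42 01 02
```

In this model the final 0x02 is a back-reference to pointer number 1, which is how the cycle survives the round trip. The deserialized objects are new allocations: `roundtrip.next.next is roundtrip` holds, while `roundtrip is c` does not.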
Unserializable Types
Unfortunately, not all types can be easily serialized. In particular, functions
(and closures) cannot be serialized because their data contents cannot be
easily converted to portable byte lists. Type objects themselves (e.g. the
variable Text) also cannot be serialized. All other datatypes can be
serialized.