Domain-Specific Languages

Tomo supports defining different flavors of text that represent specific languages, with type safety guarantees that help prevent code injection. Code injection occurs when untrusted user input is inserted into a string without being properly escaped. Tomo's lang feature addresses this by letting you define custom text types that automatically escape interpolated values and produce type-checking errors if you use one type of string where a different type is required.

lang HTML
    convert(t:Text -> HTML)
        t = t.translate({
            "&" = "&amp;",
            "<" = "&lt;",
            ">" = "&gt;",
            '"' = "&quot;",
            "'" = "&#39;",
        })
        return HTML.from_text(t)

    func paragraph(content:HTML -> HTML)
        return $HTML"<p>$content</p>"

In this example, we're representing HTML as a language and we want to avoid situations where a malicious user might set their username to something like <script>alert('pwned')</script>.

username := Text.read_line("Choose a username: ")
assert username == "<script>alert('pwned')</script>"
page := $HTML"
    <html><body>
    Hello $username! How are you?
    </body></html>
"
say(page.text)

What we don't want to happen is to get a page that looks like:

<html><body>
Hello <script>alert('pwned')</script>! How are you?
</body></html>

Thankfully, Tomo handles automatic escaping and gives you a properly sanitized result:

<html><body>
Hello &lt;script&gt;alert(&#39;pwned&#39;)&lt;/script&gt;! How are you?
</body></html>

This works because the compiler looks in the HTML namespace for a constructor: a function that takes a Text argument and returns an HTML value, such as the convert(t:Text -> HTML) rule defined above. Interpolation only succeeds if such a function exists, and the compiler applies it to each interpolated value before concatenating the result.
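
The escape-then-concatenate behavior described above can be modeled outside Tomo. The following is a minimal Python sketch, not Tomo's implementation: the `HTML` class and its `convert` and `interpolate` methods are hypothetical names chosen for illustration, and the escape table is the one from the `lang HTML` example.

```python
# Minimal Python model of escaping interpolation; all names here are hypothetical.
ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
})


class HTML:
    """Stand-in for a value of the Tomo HTML language type."""

    def __init__(self, raw: str):
        # Plays the role of HTML.from_text: the caller vouches for `raw`.
        self.text = raw

    @classmethod
    def convert(cls, value) -> "HTML":
        # Plays the role of convert(t:Text -> HTML): escape, then wrap.
        if isinstance(value, HTML):
            return value  # already HTML, so it is not escaped again
        return cls(str(value).translate(ESCAPES))

    @classmethod
    def interpolate(cls, *parts) -> "HTML":
        # Literal chunks are passed as HTML; any other value is converted first.
        return cls("".join(cls.convert(part).text for part in parts))


username = "<script>alert('pwned')</script>"
page = HTML.interpolate(HTML("Hello "), username, HTML("! How are you?"))
print(page.text)
# Hello &lt;script&gt;alert(&#39;pwned&#39;)&lt;/script&gt;! How are you?
```

Because `str.translate` maps each character exactly once, the `&` produced by one replacement is never escaped a second time.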

If a function only accepts an HTML argument, you cannot pass it a Text value; you must produce a valid HTML value instead. The same applies when returning a value from a function that returns HTML, or when assigning to a variable that holds HTML values.
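
To illustrate the distinction between the two types, here is a small Python sketch under stated assumptions. Tomo reports the mismatch at compile time, whereas this model can only check at run time; the `HTML` dataclass here is a hypothetical stand-in, and `paragraph` mirrors the function of the same name in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HTML:
    text: str  # markup that is already safe to emit


def paragraph(content: HTML) -> HTML:
    # Run-time stand-in for the type check Tomo performs at compile time.
    if not isinstance(content, HTML):
        raise TypeError(f"expected HTML, got {type(content).__name__}")
    return HTML(f"<p>{content.text}</p>")


try:
    paragraph("plain text")  # a plain string plays the role of Text
    rejected = False
except TypeError:
    rejected = True

accepted = paragraph(HTML("safe &amp; sound"))
print(rejected)       # True
print(accepted.text)  # <p>safe &amp; sound</p>
```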

Languages can also be built around a namespace-based method call API, instead of building a global function API that takes language arguments. For example, instead of building a global function called execute() that takes a ShellScript argument, you could instead build something like this:

lang Sh
    convert(text:Text -> Sh)
        return Sh.from_text("'" ++ text.replace("'", "''") ++ "'")

    func execute(sh:Sh -> Text)
        ...

dir := ask("List which dir? ")
cmd := $Sh@(ls -l @dir)
result := cmd.execute()
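
The same pattern can be sketched in Python to show what quoting an interpolated value buys you. This is an illustration rather than a port: the `Sh` class and its methods are hypothetical, quoting is delegated to the standard library's `shlex.quote` instead of the replacement rule used in the Tomo example, and the body of `execute` is an assumption, since the original leaves it elided.

```python
import shlex
import subprocess


class Sh:
    """Stand-in for a value of the Tomo Sh language type (hypothetical)."""

    def __init__(self, raw: str):
        self.text = raw  # trusted shell source

    @classmethod
    def convert(cls, value) -> "Sh":
        # Untrusted text becomes exactly one shell word.
        if isinstance(value, Sh):
            return value
        return cls(shlex.quote(str(value)))

    @classmethod
    def interpolate(cls, *parts) -> "Sh":
        # Literal chunks are passed as Sh; any other value is quoted first.
        return cls("".join(cls.convert(part).text for part in parts))

    def execute(self) -> str:
        # Assumed behavior: run through the system shell and return stdout.
        done = subprocess.run(self.text, shell=True, capture_output=True,
                              text=True, check=True)
        return done.stdout


directory = "my dir; rm -rf ~"  # hostile input
cmd = Sh.interpolate(Sh("ls -l "), directory)
print(cmd.text)  # ls -l 'my dir; rm -rf ~'
```

Splitting the resulting command with `shlex.split` yields three words, so the hostile input reaches `ls` as a single argument instead of starting a second command.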

Conversions

You can define your own rules for converting between types using the convert keyword. A conversion can be defined inside the language's block, inside another type's block, or at the top level.

lang Sh
    convert(text:Text -> Sh)
        return Sh.from_text("'" ++ text.replace("'", "''") ++ "'")

struct Foo(x,y:Int)
    convert(f:Foo -> Sh)
        return Sh.from_text("$(f.x),$(f.y)")

convert(texts:[Text] -> Sh)
    return $Sh" ".join([Sh(t) for t in texts])
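
One way to picture a set of conversions into a single target type is as a dispatch table keyed by the source type. The Python sketch below uses `functools.singledispatch` for that purpose. It is an analogy only and says nothing about how Tomo resolves conversions; `to_sh`, `Sh`, and `Foo` are hypothetical stand-ins for the declarations above, and text is quoted with `shlex.quote`.

```python
from dataclasses import dataclass
from functools import singledispatch
import shlex


@dataclass(frozen=True)
class Sh:
    text: str


@dataclass(frozen=True)
class Foo:
    x: int
    y: int


@singledispatch
def to_sh(value) -> Sh:
    # No rule registered for this source type.
    raise TypeError(f"no conversion from {type(value).__name__} to Sh")


@to_sh.register(str)
def _(value):
    # Text -> Sh: quote as a single shell word.
    return Sh(shlex.quote(value))


@to_sh.register(Foo)
def _(value):
    # Foo -> Sh: a rule that lives alongside another type.
    return Sh(f"{value.x},{value.y}")


@to_sh.register(list)
def _(value):
    # [Text] -> Sh: convert each item, then join with spaces.
    return Sh(" ".join(to_sh(item).text for item in value))


print(to_sh(["a b", "c"]).text)  # 'a b' c
print(to_sh(Foo(1, 2)).text)     # 1,2
```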