# Domain-Specific Languages

Tomo supports defining different flavors of text that represent specific
languages, with type safety guarantees that help prevent code injection. Code
injection occurs when you insert untrusted user input into a string without
properly escaping the user input. Tomo's `lang` feature addresses this issue by
letting you define custom text types that automatically escape interpolated
values and give type checking errors if you attempt to use one type of string
where a different type of string is needed.

```tomo
lang HTML:
    # Constructor: converts plain Text to HTML by escaping special characters
    func HTML(t:Text -> HTML):
        t = t:replace_all({
            $/&/ = "&amp;",
            $/</ = "&lt;",
            $/>/ = "&gt;",
            $/"/ = "&quot;",
            $/'/ = "&#39;",
        })
        # `t` is already escaped, so wrap it without escaping it again
        return HTML.without_escaping(t)

    func paragraph(content:HTML -> HTML):
        return $HTML"<p>$content</p>"
```

In this example, we're representing HTML as a language and we want to avoid
situations where a malicious user might set their username to something like
`<script>alert('pwned')</script>`.

```
>> username := Text.read_line("Choose a username: ")
= "<script>alert('pwned')</script>"
page := $HTML"
    <html><body>
    Hello $username! How are you?
    </body></html>
"
say(page.text)
```

What we _don't_ want to happen is to get a page that looks like:

```html
<html><body>
Hello <script>alert('pwned')</script>! How are you?
</body></html>
```

Thankfully, Tomo handles automatic escaping and gives you a properly sanitized
result:

```html
<html><body>
Hello &lt;script&gt;alert(&#39;pwned&#39;)&lt;/script&gt;! How are you?
</body></html>
```

This works because the compiler looks in the `HTML` namespace for a function
named `HTML` that takes a `Text` argument and returns an `HTML` value (a
constructor). Interpolation only succeeds if such a function exists, and that
function is applied to the interpolated value before it is concatenated.

If you have a function that only accepts an `HTML` argument, you cannot pass it
a `Text` value; you must produce a valid `HTML` value instead. The same is true
when returning a value from a function that returns `HTML` or when assigning to
a variable that holds `HTML` values.

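As a rough sketch of how these rules play out, the snippet below reuses the
`HTML` language defined earlier. The names `greeting`, `show`, and `title` are
hypothetical and introduced only for illustration, and the commented-out call
marks the kind of usage the type checker is described as rejecting. No compiler
output is shown, since the exact error wording depends on the compiler.

```tomo
# Illustrative sketch; assumes the `lang HTML` definition from earlier in this document.
# `greeting`, `show`, and `title` are hypothetical names.

func greeting(name:Text -> HTML):
    # `name` is Text, so interpolating it goes through the
    # HTML(t:Text -> HTML) constructor before being concatenated.
    return $HTML"Hello $name!"

func show(page:HTML):
    say(page.text)

title := $HTML"<b>Welcome</b>"

# OK: both arguments are HTML values.
show(title)
show(greeting("<script>alert('pwned')</script>"))

# Not OK: a plain Text value where HTML is required is a type error.
# show("<b>Welcome</b>")
```

Because the conversion lives in the constructor, call sites cannot skip
escaping by accident. The only bypass shown in this document is
`HTML.without_escaping`, which has to be written out explicitly.
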
Languages can also be built around a namespace-based method call API, rather
than a global function API that takes language arguments. For example, instead
of building a global function called `execute()` that takes a `ShellScript`
argument, you could build something like this:

```tomo
lang Sh:
    func Sh(text:Text -> Sh):
        return Sh.without_escaping("'" ++ text:replace($/'/, "''") ++ "'")

    func execute(sh:Sh -> Text):
        ...

dir := ask("List which dir? ")
cmd := $Sh@(ls -l @dir)
result := cmd:execute()
```