| author | Bruce Hill <bruce@bruce-hill.com> | 2025-03-05 00:21:30 -0500 |
|---|---|---|
| committer | Bruce Hill <bruce@bruce-hill.com> | 2025-03-05 00:21:30 -0500 |
| commit | 0a3ad8ba914ab42ebbb88a3d955f71d71d581fc1 (patch) | |
| tree | e984b58347627f0417a6961dbb8e83afe4739653 /docs/text.md | |
| parent | 665050940f1562b045efe942686d04b3c3fac381 (diff) | |
Alphabetize and index functions
Diffstat (limited to 'docs/text.md')
| -rw-r--r-- | docs/text.md | 302 |
1 files changed, 169 insertions, 133 deletions
diff --git a/docs/text.md b/docs/text.md
index 960aa07d..67f65f91 100644
--- a/docs/text.md
+++ b/docs/text.md
@@ -270,6 +270,42 @@ pattern documentation](patterns.md) for more details.
 
 ## Text Functions
 
+- [`func as_c_string(text: Text -> CString)`](#`as_c_string)
+- [`func at(text: Text, index: Int -> Text)`](#`at)
+- [`func by_line(text: Text -> func(->Text?))`](#`by_line)
+- [`func by_match(text: Text, pattern: Pattern -> func(->Match?))`](#`by_match)
+- [`func by_split(text: Text, pattern: Pattern = $// -> func(->Text?))`](#`by_split)
+- [`func bytes(text: Text -> [Byte])`](#`bytes)
+- [`func codepoint_names(text: Text -> [Text])`](#`codepoint_names)
+- [`func each(text: Text, pattern: Pattern, fn: func(m: Match), recursive: Bool = yes -> Int?)`](#`each)
+- [`func ends_with(text: Text, suffix: Text -> Bool)`](#`ends_with)
+- [`func find(text: Text, pattern: Pattern, start: Int = 1 -> Int?)`](#`find)
+- [`func find_all(text: Text, pattern: Pattern -> [Match])`](#`find_all)
+- [`func from(text: Text, first: Int -> Text)`](#`from)
+- [`func from_codepoint_names(codepoints: [Int32] -> [Text])`](#`from_bytes)
+- [`func from_c_string(str: CString -> Text)`](#`from_c_string)
+- [`func from_codepoint_names(codepoint_names: [Text] -> [Text])`](#`from_codepoint_names)
+- [`func from_codepoint_names(codepoints: [Int32] -> [Text])`](#`from_codepoints)
+- [`func has(text: Text, pattern: Pattern -> Bool)`](#`has)
+- [`func join(glue: Text, pieces: [Text] -> Text)`](#`join)
+- [`func split(text: Text -> [Text])`](#`lines)
+- [`func lower(text: Text -> Text)`](#`lower)
+- [`func map(text: Text, pattern: Pattern, fn: func(text:Match)->Text -> Text, recursive: Bool = yes)`](#`map)
+- [`func matches(text: Text, pattern: Pattern -> [Text])`](#`matches)
+- [`func quoted(text: Text, color: Bool = no -> Text)`](#`quoted)
+- [`func repeat(text: Text, count:Int -> Text)`](#`repeat)
+- [`func replace(text: Text, pattern: Pattern, replacement: Text, backref: Pattern = $/\/, recursive: Bool = yes -> Text)`](#`replace)
+- [`func replace_all(replacements:{Pattern,Text}, backref: Pattern = $/\/, recursive: Bool = yes -> Text)`](#`replace_all)
+- [`func reversed(text: Text -> Text)`](#`reversed)
+- [`func slice(text: Text, from: Int = 1, to: Int = -1 -> Text)`](#`slice)
+- [`func split(text: Text, pattern: Pattern = "" -> [Text])`](#`split)
+- [`func starts_with(text: Text, prefix: Text -> Bool)`](#`starts_with)
+- [`func title(text: Text -> Text)`](#`title)
+- [`func to(text: Text, last: Int -> Text)`](#`to)
+- [`func trim(text: Text, pattern: Pattern = $/{whitespace/, trim_left: Bool = yes, trim_right: Bool = yes -> Text)`](#`trim)
+- [`func upper(text: Text -> Text)`](#`upper)
+- [`func utf32_codepoints(text: Text -> [Int32])`](#`utf32_codepoints)
+
 ### `as_c_string`
 
 **Description:**
@@ -470,31 +506,6 @@ An array of codepoint names (`[Text]`).
 
 ---
 
-### `utf32_codepoints`
-
-**Description:**
-Returns an array of Unicode code points for UTF32 encoding of the text.
-
-**Signature:**
-```tomo
-func utf32_codepoints(text: Text -> [Int32])
-```
-
-**Parameters:**
-
-- `text`: The text from which to extract Unicode code points.
-
-**Returns:**
-An array of 32-bit integer Unicode code points (`[Int32]`).
-
-**Example:**
-```tomo
->> "Amélie":utf32_codepoints()
-= [65[32], 109[32], 233[32], 108[32], 105[32], 101[32]] : [Int32]
-```
-
----
-
 ### `each`
 
 **Description:**
@@ -552,87 +563,108 @@ func ends_with(text: Text, suffix: Text -> Bool)
 
 ---
 
-### `from_c_string`
+### `find`
 
 **Description:**
-Converts a C-style string to a `Text` value.
+Finds the first occurrence of a [pattern](patterns.md) in the given text (if
+any).
 
 **Signature:**
 ```tomo
-func from_c_string(str: CString -> Text)
+func find(text: Text, pattern: Pattern, start: Int = 1 -> Int?)
 ```
 
 **Parameters:**
 
-- `str`: The C-style string to be converted.
+- `text`: The text to be searched.
+- `pattern`: The [pattern](patterns.md) to search for.
+- `start`: The index to start the search.
 
 **Returns:**
-A `Text` value representing the C-style string.
+`!Match` if the target [pattern](patterns.md) is not found, otherwise a `Match`
+struct containing information about the match.
 
 **Example:**
 ```tomo
->> Text.from_c_string(CString("Hello"))
-= "Hello"
+>> " #one #two #three ":find($/#{id}/, start=-999)
+= none : Match?
+>> " #one #two #three ":find($/#{id}/, start=999)
+= none : Match?
+>> " #one #two #three ":find($/#{id}/)
+= Match(text="#one", index=2, captures=["one"]) : Match?
+>> " #one #two #three ":find("{id}", start=6)
+= Match(text="#two", index=9, captures=["two"]) : Match?
 ```
 
 ---
 
-### `from_codepoint_names`
+### `find_all`
 
 **Description:**
-Returns text that has the given codepoint names (according to the Unicode
-specification) as its codepoints. Note: the text will be normalized, so the
-resulting text's codepoints may not exactly match the input codepoints.
+Finds all occurrences of a [pattern](patterns.md) in the given text.
 
 **Signature:**
 ```tomo
-func from_codepoint_names(codepoint_names: [Text] -> [Text])
+func find_all(text: Text, pattern: Pattern -> [Match])
 ```
 
 **Parameters:**
 
-- `codepoint_names`: The names of each codepoint in the desired text. Names
-  are case-insentive.
+- `text`: The text to be searched.
+- `pattern`: The [pattern](patterns.md) to search for.
 
 **Returns:**
-A new text with the specified codepoints after normalization has been applied.
-Any invalid names are ignored.
+An array of every match of the [pattern](patterns.md) in the given text.
+Note: if `text` or `pattern` is empty, an empty array will be returned.
 
 **Example:**
 ```tomo
->> Text.from_codepoint_names([
-  "LATIN CAPITAL LETTER A WITH RING ABOVE",
-  "LATIN SMALL LETTER K",
-  "LATIN SMALL LETTER E",
-]
-= "Åke"
+>> " #one #two #three ":find_all($/#{alpha}/)
+= [Match(text="#one", index=2, captures=["one"]), Match(text="#two", index=8, captures=["two"]), Match(text="#three", index=13, captures=["three"])]
+
+>> " ":find_all("{alpha}")
+= []
+
+>> " foo(baz(), 1) doop() ":find_all("{id}(?)")
+= [Match(text="foo(baz(), 1)", index=2, captures=["foo", "baz(), 1"]), Match(text="doop()", index=17, captures=["doop", ""])]
+
+>> "":find_all($//)
+= []
+
+>> "Hello":find_all($//)
+= []
 ```
 
 ---
 
-### `from_codepoints`
+### `from`
 
 **Description:**
-Returns text that has been constructed from the given UTF32 codepoints. Note:
-the text will be normalized, so the resulting text's codepoints may not exactly
-match the input codepoints.
+Get a slice of the text, starting at the given position.
 
 **Signature:**
 ```tomo
-func from_codepoint_names(codepoints: [Int32] -> [Text])
+func from(text: Text, first: Int -> Text)
 ```
 
 **Parameters:**
 
-- `codepoints`: The UTF32 codepoints in the desired text.
+- `text`: The text to be sliced.
+- `frist`: The index of the first grapheme cluster to include (1-indexed).
 
 **Returns:**
-A new text with the specified codepoints after normalization has been applied.
+The text from the given grapheme cluster to the end of the text. Note: a
+negative index counts backwards from the end of the text, so `-1` refers to the
+last cluster, `-2` the second-to-last, etc. Slice ranges will be truncated to
+the length of the string.
 
 **Example:**
 ```tomo
->> Text.from_codepoints([197[32], 107[32], 101[32]])
-= "Åke"
+>> "hello":from(2)
+= "ello"
+
+>> "hello":from(-2)
+= "lo"
 ```
 
 ---
@@ -664,108 +696,87 @@ A new text based on the input UTF8 bytes after normalization has been applied.
 
 ---
 
-### `find`
+### `from_c_string`
 
 **Description:**
-Finds the first occurrence of a [pattern](patterns.md) in the given text (if
-any).
+Converts a C-style string to a `Text` value.
 
 **Signature:**
 ```tomo
-func find(text: Text, pattern: Pattern, start: Int = 1 -> Int?)
+func from_c_string(str: CString -> Text)
 ```
 
 **Parameters:**
 
-- `text`: The text to be searched.
-- `pattern`: The [pattern](patterns.md) to search for.
-- `start`: The index to start the search.
+- `str`: The C-style string to be converted.
 
 **Returns:**
-`!Match` if the target [pattern](patterns.md) is not found, otherwise a `Match`
-struct containing information about the match.
+A `Text` value representing the C-style string.
 
 **Example:**
 ```tomo
->> " #one #two #three ":find($/#{id}/, start=-999)
-= none : Match?
->> " #one #two #three ":find($/#{id}/, start=999)
-= none : Match?
->> " #one #two #three ":find($/#{id}/)
-= Match(text="#one", index=2, captures=["one"]) : Match?
->> " #one #two #three ":find("{id}", start=6)
-= Match(text="#two", index=9, captures=["two"]) : Match?
+>> Text.from_c_string(CString("Hello"))
+= "Hello"
 ```
 
 ---
 
-### `find_all`
+### `from_codepoint_names`
 
 **Description:**
-Finds all occurrences of a [pattern](patterns.md) in the given text.
+Returns text that has the given codepoint names (according to the Unicode
+specification) as its codepoints. Note: the text will be normalized, so the
+resulting text's codepoints may not exactly match the input codepoints.
 
 **Signature:**
 ```tomo
-func find_all(text: Text, pattern: Pattern -> [Match])
+func from_codepoint_names(codepoint_names: [Text] -> [Text])
 ```
 
 **Parameters:**
 
-- `text`: The text to be searched.
-- `pattern`: The [pattern](patterns.md) to search for.
+- `codepoint_names`: The names of each codepoint in the desired text. Names
+  are case-insentive.
 
 **Returns:**
-An array of every match of the [pattern](patterns.md) in the given text.
-Note: if `text` or `pattern` is empty, an empty array will be returned.
+A new text with the specified codepoints after normalization has been applied.
+Any invalid names are ignored.
 
 **Example:**
 ```tomo
->> " #one #two #three ":find_all($/#{alpha}/)
-= [Match(text="#one", index=2, captures=["one"]), Match(text="#two", index=8, captures=["two"]), Match(text="#three", index=13, captures=["three"])]
-
->> " ":find_all("{alpha}")
-= []
-
->> " foo(baz(), 1) doop() ":find_all("{id}(?)")
-= [Match(text="foo(baz(), 1)", index=2, captures=["foo", "baz(), 1"]), Match(text="doop()", index=17, captures=["doop", ""])]
-
->> "":find_all($//)
-= []
-
->> "Hello":find_all($//)
-= []
+>> Text.from_codepoint_names([
+  "LATIN CAPITAL LETTER A WITH RING ABOVE",
+  "LATIN SMALL LETTER K",
+  "LATIN SMALL LETTER E",
+]
+= "Åke"
 ```
 
 ---
 
-### `from`
+### `from_codepoints`
 
 **Description:**
-Get a slice of the text, starting at the given position.
+Returns text that has been constructed from the given UTF32 codepoints. Note:
+the text will be normalized, so the resulting text's codepoints may not exactly
+match the input codepoints.
 
 **Signature:**
 ```tomo
-func from(text: Text, first: Int -> Text)
+func from_codepoint_names(codepoints: [Int32] -> [Text])
 ```
 
 **Parameters:**
 
-- `text`: The text to be sliced.
-- `frist`: The index of the first grapheme cluster to include (1-indexed).
+- `codepoints`: The UTF32 codepoints in the desired text.
 
 **Returns:**
-The text from the given grapheme cluster to the end of the text. Note: a
-negative index counts backwards from the end of the text, so `-1` refers to the
-last cluster, `-2` the second-to-last, etc. Slice ranges will be truncated to
-the length of the string.
+A new text with the specified codepoints after normalization has been applied.
 
 **Example:**
 ```tomo
->> "hello":from(2)
-= "ello"
-
->> "hello":from(-2)
-= "lo"
+>> Text.from_codepoints([197[32], 107[32], 101[32]])
+= "Åke"
 ```
 
 ---
@@ -887,67 +898,67 @@ The lowercase version of the text.
 
 ---
 
-### `matches`
+### `map`
 
 **Description:**
-Checks if the `Text` matches target [pattern](patterns.md) and returns an array
-of the matching text captures or a null value if the entire text doesn't match
-the pattern.
+For each occurrence of the given [pattern](patterns.md), replace the text with
+the result of calling the given function on that match.
 
 **Signature:**
 ```tomo
-func matches(text: Text, pattern: Pattern -> [Text])
+func map(text: Text, pattern: Pattern, fn: func(text:Match)->Text -> Text, recursive: Bool = yes)
 ```
 
 **Parameters:**
 
 - `text`: The text to be searched.
 - `pattern`: The [pattern](patterns.md) to search for.
+- `fn`: The function to apply to each match.
+- `recursive`: Whether to recursively map `fn` to each of the captures of the
+  pattern before handing them to `fn`.
 
 **Returns:**
-An array of the matching text captures if the entire text matches the pattern,
-or a null value otherwise.
+The text with the matching parts replaced with the result of applying the given
+function to each.
 
 **Example:**
 ```tomo
->> "hello world":matches($/{id}/)
-= none : [Text]?
-
->> "hello world":matches($/{id} {id}/)
-= ["hello", "world"] : [Text]?
+>> "hello world":map($/world/, func(m:Match): m.text:upper())
+= "hello WORLD"
+>> "Some nums: 1 2 3 4":map($/{int}/, func(m:Match): "$(Int.parse(m.text)! + 10)")
+= "Some nums: 11 12 13 14"
 ```
 
 ---
 
-### `map`
+### `matches`
 
 **Description:**
-For each occurrence of the given [pattern](patterns.md), replace the text with
-the result of calling the given function on that match.
+Checks if the `Text` matches target [pattern](patterns.md) and returns an array
+of the matching text captures or a null value if the entire text doesn't match
+the pattern.
 
 **Signature:**
 ```tomo
-func map(text: Text, pattern: Pattern, fn: func(text:Match)->Text -> Text, recursive: Bool = yes)
+func matches(text: Text, pattern: Pattern -> [Text])
 ```
 
 **Parameters:**
 
 - `text`: The text to be searched.
 - `pattern`: The [pattern](patterns.md) to search for.
-- `fn`: The function to apply to each match.
-- `recursive`: Whether to recursively map `fn` to each of the captures of the
-  pattern before handing them to `fn`.
 
 **Returns:**
-The text with the matching parts replaced with the result of applying the given
-function to each.
+An array of the matching text captures if the entire text matches the pattern,
+or a null value otherwise.
 
 **Example:**
 ```tomo
->> "hello world":map($/world/, func(m:Match): m.text:upper())
-= "hello WORLD"
->> "Some nums: 1 2 3 4":map($/{int}/, func(m:Match): "$(Int.parse(m.text)! + 10)")
-= "Some nums: 11 12 13 14"
+>> "hello world":matches($/{id}/)
+= none : [Text]?
+
+>> "hello world":matches($/{id} {id}/)
+= ["hello", "world"] : [Text]?
 ```
 
 ---
@@ -1355,3 +1366,28 @@ The uppercase version of the text.
 >> "amélie":upper()
 = "AMÉLIE"
 ```
+
+---
+
+### `utf32_codepoints`
+
+**Description:**
+Returns an array of Unicode code points for UTF32 encoding of the text.
+
+**Signature:**
+```tomo
+func utf32_codepoints(text: Text -> [Int32])
+```
+
+**Parameters:**
+
+- `text`: The text from which to extract Unicode code points.
+
+**Returns:**
+An array of 32-bit integer Unicode code points (`[Int32]`).
+
+**Example:**
+```tomo
+>> "Amélie":utf32_codepoints()
+= [65[32], 109[32], 233[32], 108[32], 105[32], 101[32]] : [Int32]
+```
