From 44cd26f2cebd760a53aa4ff1b7779e718a101650 Mon Sep 17 00:00:00 2001 From: Bruce Hill Date: Sun, 6 Apr 2025 22:45:02 -0400 Subject: Rename Array -> List in all code and docs --- docs/arrays.md | 908 --------------------------------------------------------- 1 file changed, 908 deletions(-) delete mode 100644 docs/arrays.md (limited to 'docs/arrays.md') diff --git a/docs/arrays.md b/docs/arrays.md deleted file mode 100644 index ec114442..00000000 --- a/docs/arrays.md +++ /dev/null @@ -1,908 +0,0 @@ -# Arrays - -Tomo supports arrays as a container type that holds a list of elements of any -type in a compact format. Arrays are immutable by default, but use -copy-on-write semantics to efficiently mutate in place when possible. **Arrays -are 1-indexed**, which means the first item in the array has index `1`. - -## Syntax - -Arrays are written using square brackets and a list of comma-separated elements: - -```tomo -nums := [10, 20, 30] -``` - -Each element must have the same type (or be easily promoted to the same type). If -you want to have an empty array, you must specify what type goes inside the array -like this: - -```tomo -empty : [Int] = [] -``` - -For type annotations, an array that holds items with type `T` is written as `[T]`. - -### Array Comprehensions - -Arrays can also use comprehensions, where you specify how to dynamically create -all the elements by iteration instead of manually specifying each: - -```tomo ->> [i*10 for i in (3).to(8)] -= [30, 40, 50, 60, 70, 80] ->> [i*10 for i in (3).to(8) if i != 4] -= [30, 50, 60, 70, 80] -``` - -Comprehensions can be combined with regular items or other comprehensions: - -```tomo ->> [-1, i*10 for i in (3).to(8), i for i in 3] -= [-1, 30, 40, 50, 60, 70, 80, 1, 2, 3] -``` - -## Length - -Array length can be accessed by the `.length` field: - -```tomo ->> [10, 20, 30].length -= 3 -``` - -## Indexing - -Array values are accessed using square bracket indexing. Since arrays are -1-indexed, the index `1` corresponds to the first item in the array. Negative -indices are used to refer to items from the back of the array, so `-1` is the -last item, `-2` is the second-to-last, and so on. - -```tomo -arr := [10, 20, 30, 40] ->> arr[1] -= 10 - ->> arr[2] -= 20 - ->> arr[-1] -= 40 - ->> arr[-2] -= 30 -``` - -If an array index of `0` or any value larger than the length of the array is -used, it will trigger a runtime error that will print what the invalid array -index was, the length of the array, and a stack trace. As a performance -operation, if array bounds checking proves to be a performance hot spot, you -can explicitly disable bounds checking by adding `arr[i; unchecked]` to the -array access. - -## Iteration - -You can iterate over the items in an array like this: - -```tomo -for item in array: - ... - -for i, item in array: - ... -``` - -Array iteration operates over the value of the array when the loop began, so -modifying the array during iteration is safe and will not result in the loop -iterating over any of the new values. - -## Concatenation - -Arrays can be concatenated with the `++` operator, which returns an array that -has the items from one appended to the other. This should not be confused with -the addition operator `+`, which does not work with arrays. - -```tomo ->> [1, 2] ++ [3, 4] -= [1, 2, 3, 4] -``` - -## Implementation Details - -Under the hood, arrays are implemented as a struct that contains a pointer to a -contiguous chunk of memory storing the elements of the array and some other -metadata. Since Tomo has datatypes with different sizes, like `Bool`s which -take one byte and `struct`s which can take up many bytes, it's worth noting -that arrays store the elements compactly and inline, without the need for each -array cell to hold a pointer to where the data actually lives. - -The other metadata stored with an array includes its length as well as the -_stride_ of the array. The stride is not exposed to the user, but it's the gap -in bytes between each element in the array. The reason this is mentioned is -that it is possible to create immutable slices of arrays in constant time by -creating a new struct that points to the appropriate starting place for the -array items and has the appropriate stride. The upshot is that a method like -`array.reversed()` does not actually copy the array, it simply returns a struct -that points to the back of the array with a negative stride. Arrays adhere to -copy-on-write semantics, so we can cheaply create many read-only references to -the same data, and only need to do copying if we plan to modify data. After -doing a modification, future modifications can be done in-place as long as -there is only one reference to that data. - -Internally, we also take advantage of this inside of tables, which compactly -store all of the key/value pairs in a contiguous array and we can return an -immutable slice of that array showing only the keys or only the values by -choosing the right starting point and stride. - -## Copy on Write - -Arrays can be thought of as values that have copy-on-write semantics that use -reference counting to perform efficient in-place mutations instead of copying -as a performance optimization when it wouldn't affect the program's semantics. -Without getting too deep into the details, suffice it to say that when you -create an array, that array can be thought of as a singular "value" in the same -way that `123` is a value. That variable's value will never change unless you -explicitly perform an assignment operation on the variable or call a method on -the variable. - -Because it would be tedious to require users to write all array operations as -pure functions like `array = array.with_value_at_index(value=x, index=i)`, Tomo -provides the familiar imperative syntax for modifying arrays, but keeps the -semantics of the pure functional style. Writing `array[i] = x` is -_semantically_ equivalent to `array = array.with_value_at_index(value=x, -index=i)`, but much more readable and easy to write. Similarly, -`array.insert(x)` is semantically equivalent to `array = -array.with_value_inserted(x)`. We implement these mutating methods as functions -that take a pointer to an array variable, which then either mutate the array's -data in-place (if this is the only thing referencing that data) or construct a -new array and store its value in the memory where the array variable is stored. - -When there is only a single reference to an array value, we can perform these -modifications in-place (arrays typically have a little bit of spare capacity at -the end, so appending usually doesn't trigger a reallocation). When there are -shared references, we must create a copy of the array's data before modifying -it so the other references don't see the effects of the mutation. Here are some -simple examples: - -```tomo -nums := [10, 20, 30, 39] - -// Efficient in-place mutation because data references are not shared: -nums[4] = 40 - -// Constant time operation, but increments the reference count: -tmp := nums ->> tmp -= [10, 20, 30, 40] - -// Now, a mutation will trigger a copy-on-write, -// which resets the reference count to zero: -nums[4] = 999 ->> nums -= [10, 20, 30, 999] - -// Because of the copy-on-write, `tmp` is unchanged: ->> tmp -= [10, 20, 30, 40] - -// Since the reference count has been reset, we can do more -// mutations without triggering another copy-on-write: -nums[4] = -1 ->> nums -= [10, 20, 30, -1] -``` - -Array reference counting is _approximate_, but will only ever err on the side -of correctness at the expense of performance, not the other way around. -Occasionally, unnecessary copying may occur, but you should never experience an -array value changing because of some operation performed on a different array -value. - -## Array Pointers - -Since the normal case of arrays is to treat them like immutable values, what do -we do if we actually want to have a shared reference to an array whose contents -change over time? In that case, we want to use the `@` operator to create a -pointer to a heap-allocated array and pass that pointer around. This is the same -behavior that you get in Python when you create a `list`: - -```tomo -nums := @[10, 20, 30] -tmp := nums - -nums.insert(40) ->> tmp -= @[10, 20, 30, 40] -``` - -Having multiple pointers to the same heap-allocated array does not cause the -array's reference count to increase, because there is only one "value" in play: -the one stored on the heap. It's only when we store the "value" in multiple -places that we need to increment the reference count: - -```tomo -// Increment the reference count, because `value` now has to hold -// whatever data was at the pointer's location at this point in time: -value := nums[] -``` - -The TL;DR is: you can cheaply modify local variables that aren't aliased or -`@`-allocated arrays, but if you assign a local variable array to another -variable or dereference a heap pointer, it may trigger copy-on-write behavior. - -## Array Methods - -- [`func binary_search(arr: [T], by: func(x,y:&T->Int32) = T.compare -> Int)`](#binary_search) -- [`func by(arr: [T], step: Int -> [T])`](#by) -- [`func clear(arr: @[T] -> Void)`](#clear) -- [`func counts(arr: [T] -> {T=Int})`](#counts) -- [`func find(arr: [T], target: T -> Int?)`](#find) -- [`func first(arr: [T], predicate: func(item:&T -> Bool) -> Int)`](#first) -- [`func from(arr: [T], first: Int -> [T])`](#from) -- [`func has(arr: [T] -> Bool)`](#has) -- [`func heap_pop(arr: @[T], by: func(x,y:&T->Int32) = T.compare -> T?)`](#heap_pop) -- [`func heap_push(arr: @[T], item: T, by=T.compare -> Void)`](#heap_push) -- [`func heapify(arr: @[T], by: func(x,y:&T->Int32) = T.compare -> Void)`](#heapify) -- [`func insert(arr: @[T], item: T, at: Int = 0 -> Void)`](#insert) -- [`func insert_all(arr: @[T], items: [T], at: Int = 0 -> Void)`](#insert_all) -- [`func pop(arr: &[T], index: Int = -1 -> T?)`](#pop) -- [`func random(arr: [T], random: func(min,max:Int64->Int64)? = none -> T)`](#random) -- [`func remove_at(arr: @[T], at: Int = -1, count: Int = 1 -> Void)`](#remove_at) -- [`func remove_item(arr: @[T], item: T, max_count: Int = -1 -> Void)`](#remove_item) -- [`func reversed(arr: [T] -> [T])`](#reversed) -- [`func sample(arr: [T], count: Int, weights: [Num]? = ![Num], random: func(->Num)? = none -> [T])`](#sample) -- [`func shuffle(arr: @[T], random: func(min,max:Int64->Int64)? = none -> Void)`](#shuffle) -- [`func shuffled(arr: [T], random: func(min,max:Int64->Int64)? = none -> [T])`](#shuffled) -- [`func slice(arr: [T], from: Int, to: Int -> [T])`](#slice) -- [`func sort(arr: @[T], by=T.compare -> Void)`](#sort) -- [`sorted(arr: [T], by=T.compare -> [T])`](#sorted) -- [`to(arr: [T], last: Int -> [T])`](#to) -- [`unique(arr: [T] -> {T})`](#unique) - -### `binary_search` -Performs a binary search on a sorted array. - -```tomo -func binary_search(arr: [T], by: func(x,y:&T->Int32) = T.compare -> Int) -``` - -- `arr`: The sorted array to search. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -Assuming the input array is sorted according to the given comparison function, -return the index where the given item would be inserted to maintain the sorted -order. That is, if the item is found, return its index, otherwise return the -place where it would be found if it were inserted and the array were sorted. - -**Example:** -```tomo ->> [1, 3, 5, 7, 9].binary_search(5) -= 3 - ->> [1, 3, 5, 7, 9].binary_search(-999) -= 1 - ->> [1, 3, 5, 7, 9].binary_search(999) -= 6 -``` - ---- - -### `by` -Creates a new array with elements spaced by the specified step value. - -```tomo -func by(arr: [T], step: Int -> [T]) -``` - -- `arr`: The original array. -- `step`: The step value for selecting elements. - -**Returns:** -A new array with every `step`-th element from the original array. - -**Example:** -```tomo ->> [1, 2, 3, 4, 5, 6].by(2) -= [1, 3, 5] -``` - ---- - -### `clear` -Clears all elements from the array. - -```tomo -func clear(arr: @[T] -> Void) -``` - -- `arr`: The mutable reference to the array to be cleared. - -**Returns:** -Nothing. - -**Example:** -```tomo ->> my_array.clear() -``` - ---- - -### `counts` -Counts the occurrences of each element in the array. - -```tomo -func counts(arr: [T] -> {T=Int}) -``` - -- `arr`: The array to count elements in. - -**Returns:** -A table mapping each element to its count. - -**Example:** -```tomo ->> [10, 20, 30, 30, 30].counts() -= {10=1, 20=1, 30=3} -``` - ---- - -### `find` -Finds the index of the first occurrence of an element (if any). - -```tomo -func find(arr: [T], target: T -> Int?) -``` - -- `arr`: The array to search through. -- `item`: The item to find in the array. - -**Returns:** -The index of the first occurrence or `!Int` if not found. - -**Example:** -```tomo ->> [10, 20, 30, 40, 50].find(20) -= 2 : Int? - ->> [10, 20, 30, 40, 50].find(9999) -= none : Int? -``` - ---- - -### `first` -Find the index of the first item that matches a predicate function (if any). - -```tomo -func first(arr: [T], predicate: func(item:&T -> Bool) -> Int) -``` - -- `arr`: The array to search through. -- `predicate`: A function that returns `yes` if the item should be returned or - `no` if it should not. - -**Returns:** -Returns the index of the first item where the predicate is true or `!Int` if no -item matches. - -**Example:** -```tomo ->> [4, 5, 6].find(func(i:&Int): i.is_prime()) -= 5 : Int? ->> [4, 6, 8].find(func(i:&Int): i.is_prime()) -= none : Int? -``` - ---- - -### `from` -Returns a slice of the array starting from a specified index. - -```tomo -func from(arr: [T], first: Int -> [T]) -``` - -- `arr`: The original array. -- `first`: The index to start from. - -**Returns:** -A new array starting from the specified index. - -**Example:** -```tomo ->> [10, 20, 30, 40, 50].from(3) -= [30, 40, 50] -``` - ---- - -### `has` -Checks if the array has any elements. - -```tomo -func has(arr: [T] -> Bool) -``` - -- `arr`: The array to check. - -**Returns:** -`yes` if the array has elements, `no` otherwise. - -**Example:** -```tomo ->> [10, 20, 30].has(20) -= yes -``` - ---- - -### `heap_pop` -Removes and returns the top element of a heap or `none` if the array is empty. -By default, this is the *minimum* value in the heap. - -```tomo -func heap_pop(arr: @[T], by: func(x,y:&T->Int32) = T.compare -> T?) -``` - -- `arr`: The mutable reference to the heap. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -The removed top element of the heap or `none` if the array is empty. - -**Example:** -```tomo ->> my_heap := [30, 10, 20] ->> my_heap.heapify() ->> my_heap.heap_pop() -= 10 -``` - ---- - -### `heap_push` -Adds an element to the heap and maintains the heap property. By default, this -is a *minimum* heap. - -```tomo -func heap_push(arr: @[T], item: T, by=T.compare -> Void) -``` - -- `arr`: The mutable reference to the heap. -- `item`: The item to be added. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -Nothing. - -**Example:** -```tomo ->> my_heap.heap_push(10) -``` - ---- - -### `heapify` -Converts an array into a heap. - -```tomo -func heapify(arr: @[T], by: func(x,y:&T->Int32) = T.compare -> Void) -``` - -- `arr`: The mutable reference to the array to be heapified. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -Nothing. - -**Example:** -```tomo ->> my_heap := [30, 10, 20] ->> my_heap.heapify() -``` - ---- - -### `insert` -Inserts an element at a specified position in the array. - -```tomo -func insert(arr: @[T], item: T, at: Int = 0 -> Void) -``` - -- `arr`: The mutable reference to the array. -- `item`: The item to be inserted. -- `at`: The index at which to insert the item (default is `0`). Since indices - are 1-indexed and negative indices mean "starting from the back", an index of - `0` means "after the last item". - -**Returns:** -Nothing. - -**Example:** -```tomo ->> arr := [10, 20] ->> arr.insert(30) ->> arr -= [10, 20, 30] - ->> arr.insert(999, at=2) ->> arr -= [10, 999, 20, 30] -``` - ---- - -### `insert_all` -Inserts an array of items at a specified position in the array. - -```tomo -func insert_all(arr: @[T], items: [T], at: Int = 0 -> Void) -``` - -- `arr`: The mutable reference to the array. -- `items`: The items to be inserted. -- `at`: The index at which to insert the item (default is `0`). Since indices - are 1-indexed and negative indices mean "starting from the back", an index of - `0` means "after the last item". - -**Returns:** -Nothing. - -**Example:** -```tomo -arr := [10, 20] -arr.insert_all([30, 40]) ->> arr -= [10, 20, 30, 40] - -arr.insert_all([99, 100], at=2) ->> arr -= [10, 99, 100, 20, 30, 40] -``` - ---- - -### `pop` -Removes and returns an item from the array. If the given index is present in -the array, the item at that index will be removed and the array will become one -element shorter. - -```tomo -func pop(arr: &[T], index: Int = -1 -> T?) -``` - -- `arr`: The array to remove an item from. -- `index`: The index from which to remove the item (default: the last item). - -**Returns:** -`none` if the array is empty or the given index does not exist in the array, -otherwise the item at the given index. - -**Example:** -```tomo ->> arr := [10, 20, 30, 40] - ->> arr.pop() -= 40 ->> arr -= &[10, 20, 30] - ->> arr.pop(index=2) -= 20 ->> arr -= &[10, 30] -``` - ---- - -### `random` -Selects a random element from the array. - -```tomo -func random(arr: [T], random: func(min,max:Int64->Int64)? = none -> T) -``` - -- `arr`: The array from which to select a random element. -- `random`: If provided, this function will be used to get a random index in the array. Returned - values must be between `min` and `max` (inclusive). (Used for deterministic pseudorandom number - generation) - -**Returns:** -A random element from the array. - -**Example:** -```tomo ->> [10, 20, 30].random() -= 20 -``` - ---- - -### `remove_at` -Removes elements from the array starting at a specified index. - -```tomo -func remove_at(arr: @[T], at: Int = -1, count: Int = 1 -> Void) -``` - -- `arr`: The mutable reference to the array. -- `at`: The index at which to start removing elements (default is `-1`, which means the end of the array). -- `count`: The number of elements to remove (default is `1`). - -**Returns:** -Nothing. - -**Example:** -```tomo -arr := [10, 20, 30, 40, 50] -arr.remove_at(2) ->> arr -= [10, 30, 40, 50] - -arr.remove_at(2, count=2) ->> arr -= [10, 50] -``` - ---- - -### `remove_item` -Removes all occurrences of a specified item from the array. - -```tomo -func remove_item(arr: @[T], item: T, max_count: Int = -1 -> Void) -``` - -- `arr`: The mutable reference to the array. -- `item`: The item to be removed. -- `max_count`: The maximum number of occurrences to remove (default is `-1`, meaning all occurrences). - -**Returns:** -Nothing. - -**Example:** -```tomo -arr := [10, 20, 10, 20, 30] -arr.remove_item(10) ->> arr -= [20, 20, 30] - -arr.remove_item(20, max_count=1) ->> arr -= [20, 30] -``` - ---- - -### `reversed` -Returns a reversed slice of the array. - -```tomo -func reversed(arr: [T] -> [T]) -``` - -- `arr`: The array to be reversed. - -**Returns:** -A slice of the array with elements in reverse order. - -**Example:** -```tomo ->> [10, 20, 30].reversed() -= [30, 20, 10] -``` - ---- - -### `sample` -Selects a sample of elements from the array, optionally with weighted -probabilities. - -```tomo -func sample(arr: [T], count: Int, weights: [Num]? = ![Num], random: func(->Num)? = none -> [T]) -``` - -- `arr`: The array to sample from. -- `count`: The number of elements to sample. -- `weights`: The probability weights for each element in the array. These - values do not need to add up to any particular number, they are relative - weights. If no weights are given, elements will be sampled with uniform - probability. -- `random`: If provided, this function will be used to get random values for - sampling the array. The provided function should return random numbers - between `0.0` (inclusive) and `1.0` (exclusive). (Used for deterministic - pseudorandom number generation) - -**Errors:** -Errors will be raised if any of the following conditions occurs: -- The given array has no elements and `count >= 1` -- `count < 0` (negative count) -- The number of weights provided doesn't match the length of the array. -- Any weight in the weights array is negative, infinite, or `NaN` -- The sum of the given weights is zero (zero probability for every element). - -**Returns:** -A list of sampled elements from the array. - -**Example:** -```tomo ->> [10, 20, 30].sample(2, weights=[90%, 5%, 5%]) -= [10, 10] -``` - ---- - -### `shuffle` -Shuffles the elements of the array in place. - -```tomo -func shuffle(arr: @[T], random: func(min,max:Int64->Int64)? = none -> Void) -``` - -- `arr`: The mutable reference to the array to be shuffled. -- `random`: If provided, this function will be used to get a random index in the array. Returned - values must be between `min` and `max` (inclusive). (Used for deterministic pseudorandom number - generation) - -**Returns:** -Nothing. - -**Example:** -```tomo ->> arr.shuffle() -``` - ---- - -### `shuffled` -Creates a new array with elements shuffled. - -```tomo -func shuffled(arr: [T], random: func(min,max:Int64->Int64)? = none -> [T]) -``` - -- `arr`: The array to be shuffled. -- `random`: If provided, this function will be used to get a random index in the array. Returned - values must be between `min` and `max` (inclusive). (Used for deterministic pseudorandom number - generation) - -**Returns:** -A new array with shuffled elements. - -**Example:** -```tomo ->> [10, 20, 30, 40].shuffled() -= [40, 10, 30, 20] -``` - ---- - -### `slice` -Returns a slice of the array spanning the given indices (inclusive). - -```tomo -func slice(arr: [T], from: Int, to: Int -> [T]) -``` - -- `arr`: The original array. -- `from`: The first index to include. -- `to`: The last index to include. - -**Returns:** -A new array spanning the given indices. Note: negative indices are counted from -the back of the array, so `-1` refers to the last element, `-2` the -second-to-last, and so on. - -**Example:** -```tomo ->> [10, 20, 30, 40, 50].slice(2, 4) -= [20, 30, 40] - ->> [10, 20, 30, 40, 50].slice(-3, -2) -= [30, 40] -``` - ---- - -### `sort` -Sorts the elements of the array in place in ascending order (small to large). - -```tomo -func sort(arr: @[T], by=T.compare -> Void) -``` - -- `arr`: The mutable reference to the array to be sorted. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -Nothing. - -**Example:** -```tomo -arr := [40, 10, -30, 20] -arr.sort() ->> arr -= [-30, 10, 20, 40] - -arr.sort(func(a,b:&Int): a.abs() <> b.abs()) ->> arr -= [10, 20, -30, 40] -``` - ---- - -### `sorted` -Creates a new array with elements sorted. - -```tomo -sorted(arr: [T], by=T.compare -> [T]) -``` - -- `arr`: The array to be sorted. -- `by`: The comparison function used to determine order. If not specified, the - default comparison function for the item type will be used. - -**Returns:** -A new array with sorted elements. - -**Example:** -```tomo ->> [40, 10, -30, 20].sorted() -= [-30, 10, 20, 40] - ->> [40, 10, -30, 20].sorted(func(a,b:&Int): a.abs() <> b.abs()) -= [10, 20, -30, 40] -``` - ---- - -### `to` -Returns a slice of the array from the start of the original array up to a specified index (inclusive). - -```tomo -to(arr: [T], last: Int -> [T]) -``` - -- `arr`: The original array. -- `last`: The index up to which elements should be included. - -**Returns:** -A new array containing elements from the start up to the specified index. - -**Example:** -```tomo ->> [10, 20, 30, 40, 50].to(3) -= [10, 20, 30] - ->> [10, 20, 30, 40, 50].to(-2) -= [10, 20, 30, 40] -``` - ---- - -### `unique` -Returns a Set that contains the unique elements of the array. - -```tomo -unique(arr: [T] -> {T}) -``` - -- `arr`: The array to process. - -**Returns:** -A set containing only unique elements from the array. - -**Example:** -```tomo ->> [10, 20, 10, 10, 30].unique() -= {10, 20, 30} -``` -- cgit v1.2.3