Skip to content
Jonathan's Page

Arbitrary Iteration

Commentscode, tech, tips, tricks, generators5 min read

Lets get to it

How many times do you find that you need to iterate over n things to produce some data or an effect? Python has a syntactic thing for this, but JavaScript does not. (that I know of, at least) Let’s say we have a React app and want to have three blank rows in a To-Do list. No data is backing them yet, so we just need to have rows there. The naive approach would be to do this:

<Todo>
<Item />
<Item />
<Item />
</Todo>

This is fine, but it could be better. Let’s say we want to have a minimum of three but at least one blank after data is loaded. (Contrived, I know. This is just an example!) With this requirement, we cannot just put three empty Item components within the list. We have to compute n = max(1, 3 - data.length) and then render just that many items.

You may decide to use a for-loop. That makes sense. But, we cannot use for-loops in JSX unfortunately. So, if we go the for-loop route, that leaves us creating a list out-of flow and referencing it where we want to place it:

const placeholders = [];
for (const x = 0; x < n; x++) {
placeholders.push(<Item key={x} />);
}
<Todo>
{data}
{placeholders}
</Todo>;

While this is the most "correct" way to do this, it doesn't feel right. The JSX expression, <Item>, is outside the main expression: <Todo>.

Ah! Let’s map!

const placeholders = []; //but where do we get this?
<Todo>
{data}
{placeholders.map((x, i) => (
<Item key={i} />
))}
</Todo>;

With this pattern, while it looks good, we do not use the values in the placeholder array: wasted allocations. (Not ideal) But wait, where do we get an array of the correct length to begin with?

If we use new Array(n), we get an array with the property length set to n, but without any indices. The constructor does not allocate the full array. If you attempt to call map on this array, it will not produce what you intuit. Since map will only call its callback for every entry (index-value pairs) in the array, the array being technically empty is a problem.

If we use Array.from({length: n}), we get a fully allocated array for the range 0 - n, pulling the value at the key n from the input object for the resulting index. If the key does not exist, the resultant will have an entry of n: undefined. In this example, resulting in undefined for each index value... We get the output we want (a mappable array), but with some waste. (a fully allocated array that we will only use to iterate)

But wait, Array.from takes an optional mapper function as the second argument... This solves the example problem. However, I haven't covered the main point yet! 🙃

The solution:

<Todo>
{data}
{Array.from({ length: n }, () => (
<Item />
))}
</Todo>

Let's pretend we can stream elements and the full allocation can be avoided in this example. And now...

Generators

Generators are special functions that return an instance of Generator which conforms to the iterable protocol and the iterator protocol. (I will simply refer to this as an iterator.) The body of the function does not execute when invoked. Instead, the runtime effectively splits the body into chunks delimited by yield expressions. These chunks only run when the next() method is invoked on the iterator.

Here is a basic generator function:

function* range(start, end) {
while (start < end) {
yield start++;
}
}

This function returns an iterator that iterates over the range [start,end). We can consume the iterator directly calling iter.next() (see the iterator protocol):

while (((x = iter.next()), !x.done)) {
// use the current x.value
}

However, because it also implements the iterable protocol, we can use anything that consumes iterable objects such as these:

  • for(x of iter)
  • Array.from(iter)
  • [...iter] ⬅ the spread operator

To build on this, let's look at another generator function. This one calls a factory function for each iteration and "yields" the result to the caller. Notice that we can pass an iterator, or pass a number. Passing a number will result in a call to our first function with range(0, n), replacing the input argument with the resulting iterator; normalizing it so the rest of the function can operate on just the iterator. This allows us to simply call generator(n, fn) or to have a custom range like generator(range(3,10), pagerFactory).

function* generateItems(iter, factory) {
// normalize input
if (typeof iter === 'number') {
iter = range(0, iter);
}
// The main part:
for (const x of iter) {
yield factory(x);
}
}

Let’s put this to use

By combining an array spread on the return of generateItems we get:

<Todo>
{data}
{[...generateItems(n, (x) => <Item />)]}
</Todo>

This gives us the correct number of placeholders, and on top of that, only meaningful allocations are made. This still fully allocates the array, but if we could stream the elements, we could avoid allocating the array.

The various generator functions discussed in this post can be found on this gist.

Stats

I created a contrived example and collected some stats to further exemplify the waste of using Array.from().map() and new Array(n).

In these scripts, we iterate 100,000,000 times creating a volatile object with an x key set to the value of the current iteration... This is overly simple and completely useless. However, it makes my point clear: pre-allocating an array of that length consumes quite a bit more memory and takes considerably longer. Links to these examples can be found below.

node preallocate.js
Time: 19.522s
Used: 1500.27MB ⟸ 1.5GB!
node generator.js
Time: 9.023s
Used: 4.33MB ⟸ 4.3MB! (99.71% less!)

Just for fun, I also made a version to retain everything... it will crash without giving node more memory...

node --max-old-space-size=8192 fully-retain.js
Time: 21.574s
Used: 4392.69MB ⟸ 4.29GB!
Attachments:
generator.jspreallocate.jsfully-retain.js
© 2024 by Jonathan's Page. All rights reserved.
Theme by LekoArts