Switch from heap/stack to just a heap

This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in #1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of owned values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.
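
As a rough illustration of the cloning behavior described above, here is a minimal sketch. It assumes a simplified heap with no stack region or reserved constants, and `cloneRef` is a hypothetical helper name, not necessarily what the generated JS calls it.

```javascript
// Simplified free-list heap (assumption: no stack region, no reserved slots).
const heap = [];
let heap_next = 0;

function addHeapObject(obj) {
  // When the free list is exhausted, grow the array by one slot whose
  // "next" link points just past the new end.
  if (heap_next === heap.length) heap.push(heap.length + 1);
  const idx = heap_next;
  heap_next = heap[idx];
  heap[idx] = obj;
  return idx;
}

function getObject(idx) {
  return heap[idx]; // a plain array access
}

function dropObject(idx) {
  // Return the slot to the head of the free list.
  heap[idx] = heap_next;
  heap_next = idx;
}

// Hypothetical helper: cloning stores the same JS value in a fresh slot
// instead of bumping a reference count.
function cloneRef(idx) {
  return addHeapObject(getObject(idx));
}

const obj = { name: 'example' };
const a = addHeapObject(obj);
const b = cloneRef(a);
dropObject(a); // frees only slot `a`; slot `b` still refers to `obj`
```

Each clone occupies its own slot, so dropping one index never invalidates the other.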

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
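
The layout described above can be sketched roughly as follows. This is a sketch under assumptions rather than the exact generated code: the explicit out-of-slots check and the `popBorrowedObject` name are hypothetical, and the stack slots are filled with `undefined` to mirror the snippet in the guide.

```javascript
const STACK_SIZE = 32;

// Slots 0..31 are the stack region; slots 32..35 hold the reserved constants.
const heap = new Array(STACK_SIZE).fill(undefined);
heap.push(undefined, null, true, false);

let stack_pointer = STACK_SIZE; // grows down toward 0
let heap_next = heap.length;    // head of the free list, 36 initially

function addBorrowedObject(obj) {
  // Assumption: fail loudly once the stack region is exhausted.
  if (stack_pointer === 0) throw new Error('out of stack slots');
  stack_pointer -= 1;
  heap[stack_pointer] = obj;
  return stack_pointer;
}

// Hypothetical name for the cleanup done in the `finally` block.
function popBorrowedObject() {
  heap[stack_pointer++] = undefined;
}

function addHeapObject(obj) {
  // Free slots form a singly-linked list: each free slot stores the index
  // of the next free slot.
  if (heap_next === heap.length) heap.push(heap.length + 1);
  const idx = heap_next;
  heap_next = heap[idx];
  heap[idx] = obj;
  return idx;
}

function dropObject(idx) {
  heap[idx] = heap_next;
  heap_next = idx;
}
```

Borrowed values land at indices below 32 and owned values at indices 36 and above, so both kinds of index can be resolved with the same `heap[idx]` lookup.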
Alex Crichton
2018-11-29 18:15:36 -08:00
parent e746ad5a0a
commit 49d835a7bc
4 changed files with 137 additions and 250 deletions

@ -5,18 +5,21 @@ around JS objects in wasm, but that's not allowed today! While indeed true,
that's where the polyfill comes in.
The question here is how we shoehorn JS objects into a `u32` for wasm to use.
The current strategy for this approach is to maintain two module-local variables
in the generated `foo.js` file: a stack and a heap.
The current strategy for this approach is to maintain a module-local variable
in the generated `foo.js` file: a `heap`.
### Temporary JS objects on the stack
### Temporary JS objects on the "stack"
The stack in `foo.js` is, well, a stack. JS objects are pushed on the top of the
stack, and their index in the stack is the identifier that's passed to wasm. JS
objects are then only removed from the top of the stack as well. This data
structure is mainly useful for efficiently passing a JS object into wasm without
a sort of "heap allocation". The downside of this, however, is that it only
works for when wasm doesn't hold onto a JS object (aka it only gets a
"reference" in Rust parlance).
The first slots in the `heap` in `foo.js` are considered a stack. This stack,
like typical program execution stacks, grows down. JS objects are pushed on the
bottom of the stack, and their index in the stack is the identifier that's passed
to wasm. A stack pointer is maintained to figure out where the next item is
pushed.
JS objects are then only removed from the bottom of the stack as well. Removal
is simply storing null then incrementing a counter. Because of the "stack-y"
nature of this scheme it only works when wasm doesn't hold onto a JS object
(aka it only gets a "reference" in Rust parlance).
Let's take a look at an example.
@ -47,11 +50,14 @@ and what we actually generate looks something like:
// foo.js
import * as wasm from './foo_bg';
const stack = [];
const heap = new Array(32);
heap.push(undefined, null, true, false);
let stack_pointer = 32;
function addBorrowedObject(obj) {
stack.push(obj);
return stack.length - 1;
stack_pointer -= 1;
heap[stack_pointer] = obj;
return stack_pointer;
}
export function foo(arg0) {
@ -59,7 +65,7 @@ export function foo(arg0) {
try {
wasm.foo(idx0);
} finally {
stack.pop();
heap[stack_pointer++] = undefined;
}
}
```
@ -68,13 +74,13 @@ Here we can see a few notable points of action:
* The wasm file was renamed to `foo_bg.wasm`, and we can see how the JS module
generated here is importing from the wasm file.
* Next we can see our `stack` module variable which is used to push/pop items
from the stack.
* Next we can see our `heap` module variable which is used to store all JS values
reference-able from wasm.
* Our exported function `foo` takes an arbitrary argument, `arg0`, which is
converted to an index with the `addBorrowedObject` function. The index
is then passed to wasm so wasm can operate with it.
* Finally, we have a `finally` which frees the stack slot as it's no longer
used, issuing a `pop` for what was pushed at the start of the function.
used, popping the value that was pushed at the start of the function.
It's also helpful to dig into the Rust side of things to see what's going on
there! Let's take a look at the code that `#[wasm_bindgen]` generates in Rust:
@ -104,12 +110,13 @@ And as with the JS, the notable points here are:
in a `JsValue`. There's some trickery here that's not worth going into just
yet, but we'll see in a bit what's happening under the hood.
### Long-lived JS objects in a slab
### Long-lived JS objects
The above strategy is useful when JS objects are only temporarily used in Rust,
for example only during one function call. Sometimes, though, objects may have a
dynamic lifetime or otherwise need to be stored on Rust's heap. To cope with
this there's a second half of management of JS objects, a slab.
this there's a second half of management of JS objects, naturally corresponding
to the other side of the JS `heap` array.
JS Objects passed to wasm that are not references are assumed to have a dynamic
lifetime inside of the wasm module. As a result the strict push/pop of the stack
@ -135,16 +142,16 @@ different. Let's see the generated JS's slab in action:
```js
import * as wasm from './foo_bg'; // imports from wasm file
const slab = [];
let slab_next = 0;
const heap = new Array(32);
heap.push(undefined, null, true, false);
let heap_next = 36;
function addHeapObject(obj) {
if (slab_next === slab.length)
slab.push(slab.length + 1);
const idx = slab_next;
const next = slab[idx];
slab_next = next;
slab[idx] = { obj, cnt: 1 };
if (heap_next === heap.length)
heap.push(heap.length + 1);
const idx = heap_next;
heap_next = heap[idx];
heap[idx] = obj;
return idx;
}
@ -154,24 +161,17 @@ export function foo(arg0) {
}
export function __wbindgen_object_drop_ref(idx) {
let obj = slab[idx];
obj.cnt -= 1;
if (obj.cnt > 0)
return;
// If we hit 0 then free up our space in the slab
slab[idx] = slab_next;
slab_next = idx;
heap[idx] = heap_next;
heap_next = idx;
}
```
Unlike before we're now calling `addHeapObject` on the argument to `foo` rather
than `addBorrowedObject`. This function will use `slab` and `slab_next` as a
than `addBorrowedObject`. This function will use `heap` and `heap_next` as a
slab allocator to acquire a slot to store the object, placing a structure there
once it's found.
Note here that a reference count is used in addition to storing the object.
That's so we can create multiple references to the JS object in Rust without
using `Rc`, but it's overall not too important to worry about here.
once it's found. Note that this is going on the right half of the array, unlike
the stack which resides on the left half. This discipline roughly mirrors the
stack/heap split in normal programs.
Another curious aspect of this generated module is the
`__wbindgen_object_drop_ref` function. This is one that's actually imported from
@ -229,10 +229,9 @@ If you'll recall as well, when we took `&JsValue` above we generated a wrapper
of `ManuallyDrop` around the local binding, and that's because we wanted to
avoid invoking this destructor when the object comes from the stack.
### Indexing both a slab and the stack
### Working with `heap` in reality
You might be thinking at this point that this system may not work! There's
indexes into both the slab and the stack mixed up, but how do we differentiate?
It turns out that the examples above have been simplified a bit, but otherwise
the lowest bit is currently used as an indicator of whether you're a slab or a
stack index.
The above explanations are pretty close to what happens today, but in reality
there are a few differences, especially around handling constant values like
`undefined`, `null`, etc. Be sure to check out the actual generated JS and the
generation code for the full details!
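
As one illustration of the kind of difference constants can introduce, here is a hedged sketch in which dropping an index skips the stack region and the reserved constant slots. The `RESERVED_END` name and the guard itself are assumptions made for illustration; check the real generated JS for what it actually does.

```javascript
const STACK_SIZE = 32;
const heap = new Array(STACK_SIZE).fill(undefined);
heap.push(undefined, null, true, false); // reserved constants at 32..35

// Hypothetical name: first index that may be allocated and freed dynamically.
const RESERVED_END = heap.length;
let heap_next = heap.length;

function addHeapObject(obj) {
  if (heap_next === heap.length) heap.push(heap.length + 1);
  const idx = heap_next;
  heap_next = heap[idx];
  heap[idx] = obj;
  return idx;
}

function dropObject(idx) {
  // Assumption: stack slots and constants are never returned to the free
  // list, so a drop of e.g. the `null` slot is a no-op.
  if (idx < RESERVED_END) return;
  heap[idx] = heap_next;
  heap_next = idx;
}
```

With a guard like this, wasm can hand back the index of `undefined` or `null` as often as it likes without corrupting the free list.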